I was reading the arrows tutorial at https://en.wikibooks.org/wiki/Haskell/Understanding_arrows#Avoiding_leaks and got confused by this part:
solfege :: Parser Char String
solfege = string "Do" <|> string "Re" <|> string "Mi"
Through this last example, we can indicate which issue Swierstra and Duponcheel were trying to tackle. When solfege is run on the string "Fa", we can't detect the parser will fail until all of the three alternatives have failed. If we had more complex parsers in which one of the alternatives might fail only after attempting to consume a lot of the input stream, we would have to descend down the chain of parsers in the very same way. All of the input that can possibly be consumed by later parsers must be retained in memory in case one of them happens to be able to consume it. That can lead to much more space usage than you would naively expect, a situation often called a space leak.
I don't understand why solfege risks causing a space leak. Every alternative in solfege (string "Do", string "Re", and string "Mi") fails on the comparison with 'F', the first character of "Fa". If these alternatives consumed more than one character before failing, then I could accept that solfege causes a space leak, but that is only possible if string's implementation is dumb. It's also hard for me to imagine a monadic parser that fails late after consuming unnecessary tokens, unless it's poorly written on purpose.
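To make my assumption explicit, here is the kind of string I have in mind. This is my own minimal sketch of a backtracking parser, not the wikibook's exact definitions; the point is that string built this way fails at the very first mismatching character:

```haskell
import Control.Applicative (Alternative (..))

-- A minimal backtracking parser sketch (my simplification, not the
-- wikibook's Parser type): a parser consumes a prefix of the token list
-- and either fails or returns a result plus the remaining input.
newtype Parser t a = Parser { runParser :: [t] -> Maybe (a, [t]) }

instance Functor (Parser t) where
  fmap f (Parser p) = Parser $ \ts -> fmap (\(a, rest) -> (f a, rest)) (p ts)

instance Applicative (Parser t) where
  pure a = Parser $ \ts -> Just (a, ts)
  Parser pf <*> Parser pa = Parser $ \ts -> do
    (f, ts')  <- pf ts
    (a, ts'') <- pa ts'
    Just (f a, ts'')

instance Alternative (Parser t) where
  empty = Parser (const Nothing)
  -- Run q on the original input only if p fails.
  Parser p <|> Parser q = Parser $ \ts -> p ts <|> q ts

-- Match one expected token; fail immediately on a mismatch.
token :: Eq t => t -> Parser t t
token t = Parser $ \ts -> case ts of
  (x : rest) | x == t -> Just (x, rest)
  _                   -> Nothing

-- string built from token: it fails as soon as one character differs.
string :: Eq t => [t] -> Parser t [t]
string = traverse token

solfege :: Parser Char String
solfege = string "Do" <|> string "Re" <|> string "Mi"
-- runParser solfege "Fa" == Nothing; every alternative fails at 'F'.
```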
Am I missing something in the article?
The key part that you probably glossed over is this bit:
If we had more complex parsers in which one of the alternatives might fail only after attempting to consume a lot of the input stream, ...
So, string "Do" is not meant to be taken literally; it stands in for a parser that does more work and may have to look arbitrarily far ahead in the input before it can tell whether it should fail.
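For a concrete picture of what "more work" can look like, here is a sketch using Parsec rather than the wikibook's toy Parser (my choice of library; the names digitsThen, leaky, and main are just illustrative). Each alternative has to read an arbitrarily long run of digits before the suffix check can tell it whether to fail:

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- An alternative that can fail arbitrarily late: it must consume the whole
-- run of digits before the suffix check can succeed or fail.
digitsThen :: String -> Parser String
digitsThen suffix = try (many1 digit <* string suffix)

-- On "111...1Re", the first alternative reads every digit and only then
-- fails at the "Do" check; because of `try`, all of that consumed input has
-- to be retained so the second alternative can start over from the beginning.
leaky :: Parser String
leaky = digitsThen "Do" <|> digitsThen "Re"

main :: IO ()
main = print (parse leaky "" (replicate 100000 '1' ++ "Re"))
```

That retained input is exactly the retention the quoted passage is talking about. Parsec's own answer to the problem is that a plain <|> commits to the first alternative once it has consumed input; you only pay the cost of keeping the original input around when you explicitly ask for backtracking with try, which is what the naive monadic parser in the wikibook does on every alternative.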