How can I use parsec to parse all matched input in a string and discard the rest?
Example: I have a simple number parser, and I can find all the numbers if I know what separates them:
num :: Parser Int
num = read <$> many1 digit
parse (num `sepBy` space) "" "111 4 22"
But what if I don't know what is between the numbers?
"I will live to be 111 years <b>old</b> if I work out 4 days a week starting at 22."
many anyChar doesn't work as a separator, because it consumes everything.
So how can I get things that match an arbitrary parser surrounded by things I want to ignore?
EDIT: Note that in the real problem, my parser is more complicated:
optionTag :: Parser Fragment
optionTag = do
    string "<option"
    manyTill anyChar (string "value=")
    n <- many1 digit
    manyTill anyChar (char '>')
    chapterPrefix
    text <- many1 (noneOf "<>")
    return $ Option (read n) text
  where
    chapterPrefix = many digit >> char '.' >> many space
The replace-megaparsec package allows you to split up a string into sections which match your pattern and sections which don't match by using the sepCap parser combinator.
import Data.Void (Void)
import Replace.Megaparsec
import Text.Megaparsec
import Text.Megaparsec.Char
let num :: Parsec Void String Int
    num = read <$> some digitChar
>>> parseTest (sepCap num) "I will live to be 111 years <b>old</b> if I work out 4 days a week starting at 22."
[Left "I will live to be "
,Right 111
,Left " years <b>old</b> if I work out "
,Right 4
,Left " days a week starting at "
,Right 22
,Left "."
]
For an arbitrary parser myParser, it's quite easy:
solution = many (let one = myParser <|> (anyChar >> one) in one)
It might be clearer to write it this way:
solution = many loop
    where 
        loop = myParser <|> (anyChar >> loop)
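Essentially, this defines a recursive parser (called loop) that skips one character at a time until myParser succeeds, i.e. it searches for the next thing that myParser can parse. many then repeats that search until it fails, which happens once no further match is found before EOF. Two caveats: if myParser can fail after consuming input, it has to be wrapped in try so the search can resume from the same position; and because the last search consumes any trailing input before failing, loop itself also needs a try, otherwise many reports a parse error instead of returning the matches found so far.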
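To make the recursive search concrete, here is a minimal sketch applied to the number parser from the question. It assumes Parsec's String-based Parser type and adds try in two places; the names skipTo, allMatches and example are made up for this illustration and are not part of Parsec.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- One or more digits; many1 avoids calling read on an empty string.
num :: Parser Int
num = read <$> many1 digit

-- Hypothetical helper: skip characters one at a time until p matches,
-- then return p's result. try lets p backtrack if it fails after
-- consuming part of the input.
skipTo :: Parser a -> Parser a
skipTo p = try p <|> (anyChar *> skipTo p)

-- Hypothetical helper: collect every match of p. The outer try makes the
-- final, unsuccessful search (which runs into end of input) fail without
-- consuming anything, so many stops and returns the matches so far.
allMatches :: Parser a -> Parser [a]
allMatches p = many (try (skipTo p))

-- Evaluates to Right [111,4,22].
example :: Either ParseError [Int]
example =
  parse (allMatches num) ""
    "I will live to be 111 years <b>old</b> if I work out 4 days a week starting at 22."
```

The same allMatches wrapper should work for a larger parser such as optionTag, provided that parser consumes input whenever it succeeds, since many rejects parsers that succeed on empty input.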
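You can use
 many (noneOf "0123456789")
to consume the non-digit text between the numbers. Note that noneOf takes a list of characters, not a parser, so the shorter
 many $ noneOf digit
does not type-check; many (satisfy (not . isDigit)), with isDigit from Data.Char, is an equivalent spelling.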
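When the text to ignore can be described as a character class, as with the non-digit characters here, the separator idea can be sketched as below. This is only a sketch under that assumption: the names junk and numbers are made up for illustration, and it uses many1 rather than many so that the separator always consumes input.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- One or more digits; many1 avoids calling read on an empty string.
num :: Parser Int
num = read <$> many1 digit

-- Hypothetical name: a non-empty run of characters that are not digits.
junk :: Parser String
junk = many1 (noneOf "0123456789")

-- Optional leading junk, then numbers separated by junk and optionally
-- followed by junk, up to the end of the input.
numbers :: Parser [Int]
numbers = optional junk *> (num `sepEndBy` junk) <* eof
```

With these definitions, parse numbers "" "I will live to be 111 years <b>old</b> if I work out 4 days a week starting at 22." yields Right [111,4,22]. The approach does not carry over to parsers like optionTag, where the ignorable text cannot be recognised one character at a time.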
To find the item in the string, note that the item is either at the start of the string, or you consume one character and look for the item in the now-shorter string. If an attempt to parse the item fails partway through, you'll need to un-consume the characters it used before moving on, so item has to be wrapped in try.
hasItem = prefixItem <* (many anyChar)
prefixItem = (try item) <|> (anyChar >> prefixItem)
item = <parser for your item here>
This code looks for just one occurrence of item in the string. 
(AJFarmar almost has it.)
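A runnable version of the single-occurrence search might look like this. It is a sketch that assumes Parsec's String-based Parser type and uses a number parser as a stand-in for item; firstNumber is a made-up wrapper name.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- Stand-in for "your item": one or more digits read as an Int.
item :: Parser Int
item = read <$> many1 digit

-- The item is either right here, or one character further along.
prefixItem :: Parser Int
prefixItem = try item <|> (anyChar *> prefixItem)

-- Find the first item, then discard the rest of the input.
hasItem :: Parser Int
hasItem = prefixItem <* many anyChar

-- Hypothetical wrapper: Nothing when the input contains no item.
firstNumber :: String -> Maybe Int
firstNumber = either (const Nothing) Just . parse hasItem ""
```

For example, firstNumber "I work out 4 days a week starting at 22." gives Just 4, and firstNumber "no digits here" gives Nothing.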