Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Parsec permutation parser with separators

Tags:

haskell

parsec

I want to parse assembly programs. I have a fixed format for parsing an assembly address: [ register + offset + label ] I implemented parsers for registers, offsets and labels. Now I want to create a parser which parses the whole address.

The combinations I want to accept:

[register]
[offset]
[label]
[register + offset]
[register + label]
[offset + label]
[register + offset + label]

And what I don't want to accept:

[]
[register offset]
[register + ]
...

Of course the simple solution is to have something like:

choice $ try (parseRegister >>= \r -> Address (Just r) Nothing Nothing)
       <|> try ...

But it is ugly and does not scale well with more types of elements. So I'm looking for a cleaner solution.

like image 304
Boldizsár Németh Avatar asked May 09 '26 09:05

Boldizsár Németh


2 Answers

If you reorder your table, you see it’s a series of choices:

[register + offset + label]
[register + offset        ]
[register          + label]
[register                 ]
[           offset + label]
[           offset        ]
[                    label]

The grammar for which might be written:

address      = '[' (register ('+' offset-label)? | offset-label) ']'
offset-label = offset ('+' label)? | label

Which in Applicative style is pretty straightforward, made only slightly noisy by wrapping everything in constructors:

parseAddress :: Parser Address
parseAddress = do
  (register, (offset, label)) <- between (char '[') (char ']') parseRegisterOffsetLabel
  return $ Address register offset label

parseRegisterOffsetLabel :: Parser (Maybe Register, (Maybe Offset, Maybe Label))
parseRegisterOffsetLabel = choice
  [ (,)
    <$> (Just <$> parseRegister)
    <*> option (Nothing, Nothing) (char '+' *> parseOffsetLabel)
  , (,) Nothing <$> parseOffsetLabel
  ]

parseOffsetLabel :: Parser (Maybe Offset, Maybe Label)
parseOffsetLabel = choice
  [ (,)
    <$> (Just <$> parseOffset)
    <*> option Nothing (char '+' *> (Just <$> parseLabel))
  , (,) Nothing . Just <$> parseLabel
  ]

If we add a couple of utility functions:

plus :: Parser a -> Parser a
plus x = char '+' *> x

just :: Parser a -> Parser (Maybe a)
just = fmap Just

We can clean up these implementations a bit:

parseRegisterOffsetLabel = choice
  [ (,)
    <$> just parseRegister
    <*> option (Nothing, Nothing) (plus parseOffsetLabel)
  , (,) Nothing <$> parseOffsetLabel
  ]

parseOffsetLabel = choice
  [ (,)
    <$> just parseOffset
    <*> option Nothing (plus (just parseLabel))
  , (,) Nothing <$> just parseLabel
  ]

Then factor out the repetition, giving us a decent final solution:

parseChain begin def rest = choice
  [ (,) <$> just begin <*> option def (plus rest)
  , (,) Nothing <$> rest
  ]

parseRegisterOffsetLabel = parseChain
  parseRegister (Nothing, Nothing) parseOffsetLabel

parseOffsetLabel = parseChain
  parseOffset Nothing (just parseLabel)

I’ll let you take care of whitespace around + and inside [].

like image 146
Jon Purdy Avatar answered May 11 '26 23:05

Jon Purdy


Something like that:

parsePlus = many1 (char ' ') >> char '+' >> many1 (char ' ')

parseRegisterModified = parsePlus >> parseOffsetLabel

parseOffsetModified = parsePlus >> parseLabel

parseRegister' = do
    Address r _ _ <- parseRegister 
    optionMaybe parseRegisterModified >>=
    return $ maybe 
           (Address r Nothing Nothing) 
           (\Address _ o l -> Address r o l) 

parseOffset' = do
    Address _ o _ <- parseOffset 
    optionMaybe parseOffsetModified >>=
    return $ maybe 
           (Address Nothing o Nothing) 
           (\Address _ _ l -> Address Nothing o l)

parseOffsetLabel = try parseOffset' <|> parseLabel

parseAddress = 
     try parseRegister'
     <|> parseOffset'
     <|> parseLabel
like image 43
wit Avatar answered May 11 '26 22:05

wit