As practice, I'm trying to parse some standard text that is the output of a shell command:
pool: thisPool
state: ONLINE
status: Some supported features are not enabled on the pool. The pool can
still be used, but some features are unavailable.
action: Enable all features using 'zpool upgrade'. Once this is done,
the pool may no longer be accessible by software that does not support
the features. See zpool-features(5) for details.
scan: none requested
config:
NAME STATE READ WRITE CKSUM
homePool ONLINE 0 0 0
mirror-0 ONLINE 0 0 0
ata-WDC_WD5000AZLX-00CL5A0_WD-WCC3F7NUE93C ONLINE 0 0 0
ata-WDC_WD5000AZLX-00CL5A0_WD-WCC3F7RE2A4F ONLINE 0 0 0
cache
ata-KINGSTON_SV300S37A60G_50026B7261025D7E-part3 ONLINE 0 0 0
errors: No known data errors
I want to use a Perl 6 grammar and capture each of the fields in a separate token or regex, so I wrote the following grammar:
grammar zpool {
    regex TOP { \s+ [ <keyword> <collection> ]+ }
    token keyword { "pool: " | "state: " | "status: " | "action: " | "scan: " | "config: " | "errors: " }
    regex collection { [<:!keyword>]* }
}
My idea is that the regex finds a keyword, then begins collecting all the data until the next keyword. However, each time, I just get "pool: " -> all the remaining text.
keyword => 「pool: 」
collection => 「homePool
state: ONLINE
status: Some supported features are not enabled on the pool. The pool can
still be used, but some features are unavailable.
action: Enable all features using 'zpool upgrade'. Once this is done,
the pool may no longer be accessible by software that does not support
the features. See zpool-features(5) for details.
scan: none requested
config:
NAME STATE READ WRITE CKSUM
homePool ONLINE 0 0 0
mirror-0 ONLINE 0 0 0
ata-WDC_WD5000AZLX-00CL5A0_WD-WCC3F7NUE93C ONLINE 0 0 0
ata-WDC_WD5000AZLX-00CL5A0_WD-WCC3F7RE2A4F ONLINE 0 0 0
cache
ata-KINGSTON_SV300S37A60G_50026B7261025D7E-part3 ONLINE 0 0 0
errors: No known data errors
」
I don't know how to get it to stop eating the characters when it finds a keyword and then treat that as another keyword.
Problem 1
You've written <:!keyword> instead of <!keyword>. That's not what you want. You need to delete the :.
The <:foo> syntax in a P6 regex matches a single character with the specified Unicode property, in this case the property :foo which in turn means :foo(True).
And <:!keyword> matches a single character with the Unicode property :keyword(False).
But there is no Unicode property :keyword.
So the negated property test succeeds on every character, and each success consumes one character of input.
So the pattern just munches its way through the rest of the text, as you know.
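To see the <:...> syntax with a property that does exist, here is a small sketch using the standard general category Lu (uppercase letter). The sample string 'aB' is made up for illustration.

```perl6
# <:Lu> matches one character whose general category is Lu (uppercase letter).
my $upper = 'aB' ~~ / <:Lu> /;
say ~$upper;            # B

# <:!Lu> matches one character that is NOT an uppercase letter.
my $not-upper = 'aB' ~~ / <:!Lu> /;
say ~$not-upper;        # a

# Either way, a successful match consumes exactly one character.
say $not-upper.chars;   # 1
```

The negated form is still a character match, not a lookahead, which is why it eats input.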
Problem 2
Once you fix problem 1, a second problem arises.
As explained above, <:!keyword> matches a single character, so it automatically munches some input (one character) each time it matches.
In contrast, <!keyword> is a zero-width assertion: it consumes no input when it succeeds. A quantified group that contains nothing else, such as [<!keyword>]*, never advances through the text, so you have to make sure the pattern that uses it also munches input.
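The difference can be checked directly. In this minimal sketch, the grammar Demo and its tokens kw, peek and run are hypothetical names used only for illustration, and kw is a cut-down stand-in for the keyword token.

```perl6
grammar Demo {
    token kw   { 'pool: ' | 'state: ' }
    # Negative lookahead alone: succeeds where no keyword starts, consumes nothing.
    token peek { <!kw> }
    # Lookahead plus . : consumes one character per successful check.
    token run  { [ <!kw> . ]* }
}

my $peek = Demo.subparse('ONLINE', :rule<peek>);
say $peek.chars;   # 0

my $run = Demo.subparse("ONLINE\nstate: x", :rule<run>);
say $run.chars;    # 7
```

The second match covers "ONLINE" plus the newline and stops right before "state: ", which is the behaviour the collection pattern needs.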
After fixing those two problems you'll get the sort of output you expected. (The next problem you'll see is that the config keyword doesn't work because the : in config: in your input file example isn't followed by a space.)
So, with a few cleanups:
my @keywords = <pool state status action scan config errors>;
say grammar zpool {
    token TOP        { \s+ [ <keyword> <collection> ]* }
    token keyword    { @keywords ': ' }
    token collection { [ <!keyword> . ]* }
}
I've switched all the patterns to token declarations. In general, always use token unless you know you need something else. (regex enables backtracking, which can dramatically slow things down if you're not careful. rule is like token but makes whitespace in the pattern significant.)
I've extracted the keywords into an array. Inside a pattern, @keywords means @keywords[0] | @keywords[1] | ....
I've added a . after <!keyword> in the last pattern (to consume a character's worth of input, to avoid the infinite loop that would otherwise occur given that <!foo> does not consume any input).
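As a rough usage sketch, the grammar can be run against a shortened sample. The $text string here is made up for illustration: it starts with whitespace because TOP begins with \s+, and it leaves out the config section because of the missing space noted above.

```perl6
my @keywords = <pool state status action scan config errors>;

grammar zpool {
    token TOP        { \s+ [ <keyword> <collection> ]* }
    token keyword    { @keywords ': ' }
    token collection { [ <!keyword> . ]* }
}

# Hypothetical, shortened input (assumption: leading whitespace is present).
my $text = "  pool: homePool\n state: ONLINE\nerrors: No known data errors\n";

my $match = zpool.parse($text);

# The quantified group makes <keyword> and <collection> lists of submatches.
for $match<keyword>.list Z $match<collection>.list -> ($k, $c) {
    say $k.Str.trim ~ ' => ' ~ $c.Str.trim;
}
# pool: => homePool
# state: => ONLINE
# errors: => No known data errors
```

Each collection stops where the next keyword begins, so the fields come out as separate captures.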
In case you haven't seen them, note that the available grammar debugging options are your friend.
Hth