
RegEx string splitter

What am I doing wrong here? I am trying to extract from this "list"

ARTICLE 11 - Title AA
ARTICLE 22 Title BB
ARTICLE 33
ARTICLE 44 - Title DD
ARTICLE 55 Title EE

all the article numbers and the titles (if any) for each article. The "-" is optional when title exists.

With this RegEx

(article)(\s*)([^\s]*)((\s*)(-)?(\s*)(.*))

I get only 4 items. Items 33 and 44 are treated as a single article, I suppose because "ARTICLE 33" has no title.

11|Title AA
22|Title BB
33|ARTICLE 44 - Title DD
55|Title EE

Please see the code here: http://jsfiddle.net/Z94wf/

EDIT

What I expect to get is this:

11|Title AA
22|Title BB
33|
44|Title DD
55|Title EE

Thanks

asked Dec 03 '25 21:12 by leoinfo


2 Answers

Your second \s* matches the newline character at the end of the 3rd line. If you change the pattern to explicitly match only spaces and dashes, as follows,

(article)(\s*)([^\s]+)(([ -]*)(.*))

you get the desired result

http://jsfiddle.net/Z94wf/37/
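To make the difference concrete, here is a minimal sketch that runs both patterns against the list from the question. It assumes the global and case-insensitive flags (`gi`), since the pattern is lowercase and the input is uppercase; the `extract` helper and the variable names are hypothetical, introduced only for illustration.

```javascript
// The sample list from the question; "ARTICLE 33" has no title.
const input = [
  "ARTICLE 11 - Title AA",
  "ARTICLE 22 Title BB",
  "ARTICLE 33",
  "ARTICLE 44 - Title DD",
  "ARTICLE 55 Title EE",
].join("\n");

// Hypothetical helper: build "number|title" for every match of `pattern`,
// reading the number and title from the given capture group indexes.
function extract(pattern, numberGroup, titleGroup) {
  return Array.from(
    input.matchAll(pattern),
    (m) => `${m[numberGroup]}|${m[titleGroup]}`
  );
}

// Pattern from the question: the \s* after the number can consume the
// newline that follows "ARTICLE 33", so (.*) captures the next line.
const original = /(article)(\s*)([^\s]*)((\s*)(-)?(\s*)(.*))/gi;

// Adjusted pattern: [ -]* consumes only spaces and dashes, never a newline.
const adjusted = /(article)(\s*)([^\s]+)(([ -]*)(.*))/gi;

console.log(extract(original, 3, 8).join("\n")); // 4 items, 33 swallows 44
console.log(extract(adjusted, 3, 6).join("\n")); // 5 items, 33 has an empty title
```

Because `.` does not match a newline without the `s` flag, the title capture always stops at the end of the line; only the separator part needed to be restricted.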

answered Dec 06 '25 11:12 by cordsen


I can't be sure about all the forms your input can take, but what about something with fewer groups that is a bit more explicit:

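ARTICLE\s+(\d+)[ \t-]*(.*)

This matches the literal ARTICLE, followed by some whitespace, then the number, then an optional run of spaces, tabs, and "-" characters, and finally everything else on the line. Keeping \s out of the separator class prevents it from consuming the newline after an untitled entry such as "ARTICLE 33".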
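As a rough illustration of this shorter approach, the sketch below applies a two-group pattern to the whole list and returns one object per article. It uses `[ \t-]*` for the separator so that it cannot run across a line break, and it assumes the `g` flag so that `matchAll` can be used; `parseArticles` and `articlePattern` are hypothetical names, not part of the original answer.

```javascript
// The sample list from the question, joined into one multi-line string.
const input = [
  "ARTICLE 11 - Title AA",
  "ARTICLE 22 Title BB",
  "ARTICLE 33",
  "ARTICLE 44 - Title DD",
  "ARTICLE 55 Title EE",
].join("\n");

// Group 1 is the article number, group 2 is the (possibly empty) title.
// The separator class holds only space, tab, and dash, so it stays on one line.
const articlePattern = /ARTICLE\s+(\d+)[ \t-]*(.*)/g;

// Hypothetical helper: return { number, title } for each article found.
function parseArticles(text) {
  return Array.from(
    text.matchAll(articlePattern),
    ([, number, title]) => ({ number, title })
  );
}

for (const { number, title } of parseArticles(input)) {
  console.log(`${number}|${title}`);
}
```

With only two capture groups, the match array can be destructured directly, which avoids counting group indexes by hand.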

answered Dec 06 '25 12:12 by Andrew White


