
Regex to strip BBCode

I need a regular expression to strip out any BBCode in a string. I've got the following (and an array with tags):

new RegExp('\\[' + tags[index] + '](.*?)\\[/' + tags[index] + ']');

It picks up [tag]this[/tag] just fine, but fails when using [url=http://google.com]this[/url].

What do I need to change? Thanks a lot.


2 Answers

I came across this thread and found it helpful for getting on the right track. Here is the expression I ended up with after two hours of work (it's my first RegEx!). It is written for JavaScript, and in my tests it handled deeply nested and even incorrectly nested tags:

string = string.replace(/\[\/?(?:b|i|u|url|quote|code|img|color|size)*?.*?\]/img, '');

If string = "[b][color=blue][url=www.google.com]Google[/url][/color][/b]" then the new string will be "Google". Amazing.
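One caveat: because the tag group in the expression above is followed by `*?.*?`, the tag list is effectively optional, so the pattern also removes bracketed text that is not BBCode, such as the `[0]` in `array[0]`. A minimal sketch of a stricter variant follows, assuming you only want to drop the listed tags and keep everything else. The helper name `stripBBCode` and the `knownTags` array are made up for illustration.

```javascript
// Hypothetical helper: removes only the tags named in `tags`, keeping their inner text.
function stripBBCode(input, tags) {
  // Escape regex metacharacters in tag names (e.g. "*" used for list items).
  const names = tags
    .map(t => t.replace(/[.*+?^${}()|[\]\\]/g, '\\$&'))
    .join('|');
  // "[" + optional "/" + tag name + optional "=attribute" + "]"
  const pattern = new RegExp('\\[\\/?(?:' + names + ')(?:=[^\\]]*)?\\]', 'gi');
  return input.replace(pattern, '');
}

const knownTags = ['b', 'i', 'u', 'url', 'quote', 'code', 'img', 'color', 'size'];

console.log(stripBBCode('[b][color=blue][url=www.google.com]Google[/url][/color][/b]', knownTags));
// prints "Google"
console.log(stripBBCode('see array[0] and [b]bold[/b]', knownTags));
// prints "see array[0] and bold"
```

Like the original, this removes opening and closing tags independently, so it does not care whether the tags are nested correctly.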

Hope someone finds that useful. This thread was a top match for 'JavaScript RegEx strip BBCode' in Google ;)

JonusC answered Mar 08 '26 21:03


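You have to allow any character other than ']' after the tag name, up to the closing ']'. Note that in JavaScript the ']' inside the character class must be escaped, because '[^]' on its own is a complete class that matches any character.

new RegExp('\\[' + tags[index] + '[^\\]]*\\](.*?)\\[/' + tags[index] + '\\]');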

If you do not need to restrict the match to specific tags, you could simplify this to the following expression.

\[[^\]]*\]([^\[]*)\[/[^\]]*\]

The drawback is that it will also match [WrongTag]stuff[/WrongTag]. Matching nested tags requires applying the expression multiple times.
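To illustrate what applying the expression multiple times can look like, here is a rough sketch that keeps unwrapping paired tags until the string stops changing. The helper name `unwrapTags` is hypothetical, and the pattern is a slight variation on the one above: it only accepts an optional `=attribute` after the tag name, so that `[b` does not also match `[blah]`.

```javascript
// Hypothetical helper: repeatedly unwraps [tag]...[/tag] pairs for each listed tag.
function unwrapTags(input, tags) {
  let previous;
  let output = input;
  do {
    previous = output;
    for (const tag of tags) {
      // Assumes tag names contain no regex metacharacters.
      const pair = new RegExp(
        '\\[' + tag + '(?:=[^\\]]*)?\\]([\\s\\S]*?)\\[/' + tag + '\\]',
        'gi'
      );
      // Keep only the text between the opening and closing tag.
      output = output.replace(pair, '$1');
    }
  } while (output !== previous); // stop once a full pass changes nothing
  return output;
}

console.log(unwrapTags('[url=http://google.com]this[/url]', ['url']));
// prints "this"
console.log(unwrapTags('[quote][quote]a[/quote]b[/quote]', ['quote']));
// prints "ab"
```

The second call needs two passes: the first pass pairs the outer opening tag with the inner closing tag, and the second pass removes the pair that is left over. Unmatched tags such as a lone `[b]` are left in place.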

Daniel Brückner answered Mar 08 '26 22:03


