
Regular expression to remove empty <span> tags

Tags: html, regex, php

I would like to remove empty span tags like the one below (containing only &nbsp; and whitespace):

<span> &nbsp; &nbsp; &nbsp; </span>

I've tried with this regex, but it needs adjusting:

(<span>(&nbsp;|\s)*</span>)

preg_replace('#<span>(&nbsp;|\s)*</span>#si','<\\1>',$encoded);


2 Answers

Translating Kent Fredric's regexp to PHP:

preg_match_all('#<span[^>]*(?:/>|>(?:\s|&nbsp;)*</span>)#im', $html, $result);

This will match:

  • self-closing spans
  • spans spread over several lines, in any letter case
  • spans with attributes
  • spans containing non-breaking spaces
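
Note that preg_match_all only finds the matches. To actually remove them, as the question asks, the same pattern can be passed to preg_replace with an empty replacement. The following is a minimal sketch, not a definitive implementation: the function name remove_empty_spans is hypothetical, the \b after "span" is an addition (so that other tag names starting with "span" are not matched), and the m flag is dropped because the pattern has no ^ or $ anchors.

```php
<?php
// Hypothetical helper name; not part of the original answer.
function remove_empty_spans(string $html): string
{
    // Either a self-closing <span ... /> or a <span ...> whose content
    // is only whitespace and/or &nbsp; entities, followed by </span>.
    $pattern = '#<span\b[^>]*(?:/>|>(?:\s|&nbsp;)*</span>)#i';

    // preg_replace returns null on a PCRE error; fall back to the input.
    return preg_replace($pattern, '', $html) ?? $html;
}

$encoded = '<p>Hello<span> &nbsp; &nbsp; &nbsp; </span>world <span class="x">kept</span></p>';
echo remove_empty_spans($encoded), "\n";
// prints: <p>Helloworld <span class="x">kept</span></p>
```

Like any [^>]* based pattern, this can misfire on attribute values that contain a literal ">" character.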

Maybe you should think about including spans containing only <br /> as well...
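
One possible way to cover the <br /> case is to add a third alternative to the "empty content" group. This is only a sketch, under the assumption that <br>, <br/> and <br /> should all count as empty content; the \b is again an addition to the original pattern.

```php
<?php
// Assumption: any form of <br> inside the span still counts as "empty".
$pattern = '#<span\b[^>]*(?:/>|>(?:\s|&nbsp;|<br\s*/?>)*</span>)#i';

$html = 'a<span><br /> &nbsp;</span>b<span>text<br></span>c';
$result = preg_replace($pattern, '', $html);

echo $result, "\n";
// prints: ab<span>text<br></span>c
```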

As usual, when it comes to tweaking a regexp, some tools come in handy:

http://regex.larsolavtorvik.com/

answered Jan 24 '26 12:01 by mmm


qr{<span[^>]*(/>|>\s*?</span>)}

This should catch most of them, including XML-style self-closing tags such as <span/>.

But you really shouldn't use regex for HTML processing.
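
For readers who want to follow that advice in PHP, a parser-based version could look roughly like the sketch below. It is an illustration under stated assumptions, not the answerer's method: it assumes the dom extension is enabled and the input is a UTF-8 HTML fragment; the function name remove_empty_spans_dom and the wrapper id "wrap" are hypothetical. Spans that become empty only after a nested empty span is removed are not handled.

```php
<?php
// Hypothetical helper; assumes ext-dom and a UTF-8 HTML fragment as input.
function remove_empty_spans_dom(string $html): string
{
    $doc = new DOMDocument();

    // Silence parser warnings about imperfect markup, then restore the setting.
    $previous = libxml_use_internal_errors(true);
    // The XML prolog hints UTF-8; the wrapper div lets us read the fragment back.
    $doc->loadHTML('<?xml encoding="UTF-8"><div id="wrap">' . $html . '</div>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);

    // Candidate spans: those without child elements. Copy the node list to an
    // array first so that removing nodes does not disturb the iteration.
    foreach (iterator_to_array($xpath->query('//span[not(*)]')) as $span) {
        // &nbsp; is decoded to U+00A0 by the parser; strip it with whitespace.
        $text = preg_replace('/[\s\x{00A0}]+/u', '', $span->textContent);
        if ($text === '') {
            $span->parentNode->removeChild($span);
        }
    }

    // Serialize only the children of the wrapper, not the wrapper itself.
    $wrap = $xpath->query('//div[@id="wrap"]')->item(0);
    $out = '';
    foreach ($wrap->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}

echo remove_empty_spans_dom('<p>Hello<span> &nbsp; </span>world <span class="x">kept</span></p>'), "\n";
```

Compared with the regex, this does not depend on how the tag or its attributes are written, at the cost of re-serializing the markup.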

Note: this answer only addresses the question as it appeared before its formatting errors were corrected.

answered Jan 24 '26 12:01 by Kent Fredric (2 revs)