
Need regular expression to remove /> between two HTML markup tags except img tag

I need some help crafting a regular expression which removes /> between two HTML markup tags.

<!-- The line could look like this -->
<td align=right valign=bottom nowrap><div>January 24, 2013 /></div></td>

<!-- Or this -->
<div>Is this system supported? /></div>

<!-- Even this -->
<span>This is a span tag /></div>

<!-- It could look like any of these but I do not want /> removed -->
<img src="example.com/example.jpg"/></img>
<img src="example.com/example.jpg"/>
<img src="example.com/example.jpg"/></img>
<div id="example"><img src="example.com/example.jpg"/></div>

(Yes, I realize the img tag has no closing tag associated with it. I am dynamically editing a myriad of pages I have not created; it's not my markup.)

Here's the regex I came up with (using Perl):

s|(<.*?>(?!<img).*?)(\s*/>)(?!</img>)(</.*?>)|$1$3|gi;

Is there a better regex that's more efficient or faster?

After the regex is applied to the examples above, the results are:

<!-- The line could look like this -->
<td align=right valign=bottom nowrap><div>January 24, 2013</div></td>

<!-- Or this -->
<div>Is this system supported?</div>

<!-- Even this -->
<span>This is a span tag</div>

<!-- It could look like any of these but I do not want /> removed -->
<img src="example.com/example.jpg"/></img>
<img src="example.com/example.jpg"/>
<img src="example.com/example.jpg"/></img>
<div id="example"><img src="example.com/example.jpg"/></div>
user717236 asked Dec 08 '25 09:12



1 Answer

A shorter solution would be:

s/(<[^>]*>[^<]*)\/>/$1/g

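It captures a tag together with any text that follows it, stopping before the next opening angle bracket (which would start another tag), and then looks for />. If one is found, the substitution keeps the captured group and drops the />. Because [^>]* cannot cross a >, the /> that closes a tag such as <img .../> is consumed as part of the tag itself and is never removed.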
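As a quick check of that behaviour, here is a minimal sketch that runs the substitution over a few of the lines from the question. The helper name strip_stray_close is made up for illustration and is not part of the answer. The sketch assumes no attribute value contains a literal >, since [^>]* would stop at it.

```perl
use strict;
use warnings;

# Hypothetical helper (name is illustrative): apply the substitution to one line.
sub strip_stray_close {
    my ($line) = @_;
    # Group 1: a complete tag plus any following text up to the next '<'.
    # Replacing the whole match with group 1 drops the '/>' that came after it.
    $line =~ s/(<[^>]*>[^<]*)\/>/$1/g;
    return $line;
}

my @samples = (
    '<div>Is this system supported? /></div>',
    '<span>This is a span tag /></div>',
    '<img src="example.com/example.jpg"/>',
    '<div id="example"><img src="example.com/example.jpg"/></div>',
);

print strip_stray_close($_), "\n" for @samples;
```

The two img lines should come back unchanged. In the first two lines the /> is removed, but the space that preceded it stays in place.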

Update: The question was extended to also remove any whitespace before the />. This can be done by making the [^<]* part "lazy" and adding \s* in front of the />, like so:

s/(<[^>]*>[^<]*?)\s*\/>/$1/g
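For readers who want to see the pieces of the updated pattern separately, the sketch below writes the same substitution with the /x modifier and brace delimiters so each part can carry a comment. The helper name strip_stray_close_ws is hypothetical. As before, it assumes attribute values never contain a literal >.

```perl
use strict;
use warnings;

# Hypothetical helper (name is illustrative): the updated pattern, spread out with /x.
sub strip_stray_close_ws {
    my ($line) = @_;
    $line =~ s{
        (             # group 1, kept in the output:
          <[^>]*>     #   a complete tag, from '<' to the first '>'
          [^<]*?      #   following text, as little as possible (lazy)
        )
        \s*           # optional whitespace before the stray '/>' (dropped)
        />            # the stray '/>' itself (dropped)
    }{$1}gx;
    return $line;
}

my @samples = (
    '<td align=right valign=bottom nowrap><div>January 24, 2013 /></div></td>',
    '<div>Is this system supported? /></div>',
    '<img src="example.com/example.jpg"/></img>',
);

print strip_stray_close_ws($_), "\n" for @samples;
```

Because the text part is lazy, \s* gets the chance to claim the whitespace in front of the />, so the first two lines lose both the space and the />, while the img line is left as it was.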

See for yourself on regex101 (link updated).

zb226 answered Dec 09 '25 23:12



