I'm struggling to build a regular expression for an HTML scan. The regex should match any <script> HTML element that contains neither the async nor the defer attribute.
<script defer src="a.js"> should not match
<script src="a.js" defer> should not match
<script src="a.js"> should match
<script async src="a.js"> should not match
<script src="a.js" async> should not match
I've fiddled with a solution for setting this up on single attributes (^.*<script((?!defer).)*$ and ^.*<script((?!async).)*$ respectively) and I also created one that matches if the html element does contain either attribute (<script.+(?=defer|async).+>) but I can't grok it the other way around.
Any ideas?
Your experiments were very close!
Try the following regular expression. It handles your examples as well as a few other use cases that might occur, such as <script test defer src="a.js"> or <script test src>.
<script(?:(?!defer|async).)*?>
Explanation
<script - the < character followed by the word script
(?:(?!defer|async).)*? - a negative lookahead for the word defer or async, followed by a single character, repeated as often as needed but preferring as few repetitions as possible (i.e. match a single character at a position where neither defer nor async begins, and repeat that lazily)
> - the > character
Tests
|--------------------------------|-----------|
| Use case | Matches |
|--------------------------------|-----------|
| <script defer src="a.js"> | NO |
| <script async src="a.js"> | NO |
| <script src="a.js"> | YES |
| <script test src="a.js"> | YES |
| <script test async src="a.js"> | NO |
| <script src="a.js" defer> | NO |
| <script src="a.js" async> | NO |
| <script test src> | YES |
|--------------------------------|-----------|
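To check the table above yourself, here is a minimal sketch in Python using the standard re module. The function name is_blocking_script and the cases list are only illustrative names, not part of the original answer; the pattern itself is copied unchanged from above.

```python
import re

# Pattern from the answer: "<script", then lazily consume characters as long
# as neither "defer" nor "async" begins at the current position, then ">".
SCRIPT_WITHOUT_ASYNC_OR_DEFER = re.compile(r"<script(?:(?!defer|async).)*?>")


def is_blocking_script(html: str) -> bool:
    """Return True if html contains a <script> tag without async or defer.

    Hypothetical helper name, used here only for illustration.
    """
    return SCRIPT_WITHOUT_ASYNC_OR_DEFER.search(html) is not None


# The same use cases as in the table above.
cases = [
    '<script defer src="a.js">',
    '<script async src="a.js">',
    '<script src="a.js">',
    '<script test src="a.js">',
    '<script test async src="a.js">',
    '<script src="a.js" defer>',
    '<script src="a.js" async>',
    '<script test src>',
]

if __name__ == "__main__":
    for tag in cases:
        verdict = "YES" if is_blocking_script(tag) else "NO"
        print(f"{tag:<32} {verdict}")
```

Note that the lookahead rejects the words defer and async anywhere inside the tag, not only as attributes, so a tag such as <script src="async.js"> would also be treated as a non-match. If that matters for your scan, an HTML parser is probably the safer tool.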