Regular expression to remove JavaScript double slash (//) style comments

Tags: c#, .net, regex

I'm trying to remove JavaScript comments with a regular expression in C# and have become stuck. I want to remove every occurrence of double-slash (//) style comments.

My current regex is (?<!:)//[^\r\n]*, which catches all comments and avoids matching the // in http://. However, the negative lookbehind was a lazy shortcut, and of course it came back to bite me in the following test case:

var XSLPath = "//" + Node;
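
To make the failure concrete, here is a minimal sketch that runs the pattern above against a URL and against the test case. The class name, the `Strip` helper, and the `example.com` URL are made up for illustration; only the pattern comes from the question.

```csharp
using System;
using System.Text.RegularExpressions;

static class QuestionRegexDemo
{
    // Pattern quoted in the question: "//" not preceded by ':'.
    const string Pattern = @"(?<!:)//[^\r\n]*";

    // Hypothetical helper, for this illustration only.
    public static string Strip(string line) => Regex.Replace(line, Pattern, "");

    static void Main()
    {
        // The "//" in a URL is protected by the (?<!:) lookbehind.
        Console.WriteLine(Strip("var url = \"http://example.com\";"));
        // prints: var url = "http://example.com";

        // The "//" inside an ordinary string literal is not protected,
        // so everything from the first slash onward is removed.
        Console.WriteLine(Strip("var XSLPath = \"//\" + Node;"));
        // prints: var XSLPath = "
    }
}
```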

So I'm looking for a regular expression that will perform a lookbehind to see if an even number of double quotes (") occurs before the match. I'm not sure if this is possible. Or is there maybe a better way to do this?

Gavin Miller asked Sep 01 '25 06:09


1 Answer

(Updated based on comments)

It looks like this works pretty well:

(?<=".*".*)//.*$|(?<!".*)//.*$

Judging by my test cases in Regex Hero, it matches comments the way I think it should (almost).

For instance, it'll completely ignore this line:

var XSLPath = "//" + Node;

But it's smart enough to match the comment at the end of this line:

var XSLPath = "//"; // stuff to remove
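
The two cases above can be reproduced outside Regex Hero with a small sketch like the following. The class and helper names are hypothetical. Each line is passed in on its own, so `$` needs no extra options; for multi-line input you would probably want `RegexOptions.Multiline`.

```csharp
using System;
using System.Text.RegularExpressions;

static class AnswerRegexDemo
{
    // Pattern from the answer. .NET allows variable-length lookbehind,
    // which this pattern relies on.
    const string Pattern = @"(?<="".*"".*)//.*$|(?<!"".*)//.*$";

    // Hypothetical helper: strips whatever the pattern matches from one line.
    public static string Strip(string line) => Regex.Replace(line, Pattern, "");

    static void Main()
    {
        // Only one quote precedes the "//", so neither alternative matches.
        Console.WriteLine(Strip("var XSLPath = \"//\" + Node;"));
        // prints: var XSLPath = "//" + Node;

        // Two quotes precede the trailing comment, so the first alternative matches.
        // The space before the comment is left behind.
        Console.WriteLine("[" + Strip("var XSLPath = \"//\"; // stuff to remove") + "]");
        // prints: [var XSLPath = "//"; ]
    }
}
```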

However, it's not smart enough to handle three or more quotation marks before the //. I'm not entirely sure how to solve that without hard-coding each case. You need some way to require an even number of quotes before the match.
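
One possible way to express the even-number-of-quotes condition is a lookbehind anchored at the start of the line that consumes quotes in pairs. This is an untested-in-production sketch rather than a definitive fix: the pattern, class name, and helper names below are my own assumptions, and it assumes double-quoted strings that contain no escaped quotes.

```csharp
using System;
using System.Text.RegularExpressions;

static class EvenQuoteDemo
{
    // Pattern from the answer, kept here for comparison.
    const string AnswerPattern = @"(?<="".*"".*)//.*$|(?<!"".*)//.*$";

    // Assumed alternative: from the start of the line up to the "//",
    // allow only complete pairs of double quotes, then non-quote characters.
    const string EvenQuotePattern =
        @"(?<=^(?:[^""\r\n]*""[^""\r\n]*"")*[^""\r\n]*)//[^\r\n]*";

    public static string StripWithAnswerPattern(string line) =>
        Regex.Replace(line, AnswerPattern, "");

    // Multiline makes ^ match at the start of every line, not just the input.
    public static string StripWithEvenQuotes(string source) =>
        Regex.Replace(source, EvenQuotePattern, "", RegexOptions.Multiline);

    static void Main()
    {
        const string line = "var a = \"x\" + \"//\";";

        // Three quotes precede the "//", which satisfies the answer's first lookbehind.
        Console.WriteLine(StripWithAnswerPattern(line));
        // prints: var a = "x" + "

        // An odd number of quotes precedes the "//", so the line is left alone.
        Console.WriteLine(StripWithEvenQuotes(line));
        // prints: var a = "x" + "//";
    }
}
```

This still treats every double quote as a string delimiter, so escaped quotes (\"), single-quoted strings, regex literals, and /* */ block comments would all need extra handling or a real tokenizer.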

Steve Wortham answered Sep 02 '25 19:09