I'm trying to pull out comments in a MATLAB file. In MATLAB, comments are denoted with %, so the sensible thing would be to search for %.*. However, MATLAB also has functions like sprintf and fprintf, which allow something like sprintf('x = %d', 5), and that regex would match %d', 5) as well, which I don't want. Of course, I'd also want to ignore variations such as %s or %f. Is there a way to capture only those segments that match %.* but are not enclosed in ' characters?
To clarify: I'm generally trying to capture comments starting with %, while ignoring any % within string literals. The sprintf call was simply an example of such an occurrence that I want to ignore.
I found this question, which seems related, but no solutions posted there solve my problem.
My final regex:
^(^[^']+|[^']+('.*')+[^']+)?(;|,)\s*%(?<com>.*)|^(\s)*%(?<com2>.*)
regexp('%i am a comment', '^(^[^'']+|[^'']+(''.*'')+[^'']+)?(;|,)\s*%(?<com>.*)|^(\s)*%(?<com2>.*)', 'names')
response:
com2: 'i am a comment'
com: []
regexp('printf () ; %i am a comment after a command','^(^[^'']+|[^'']+(''.*'')+[^'']+)?(;|,)\s*%(?<com>.*)|^(\s)*%(?<com2>.*)', 'names')
response:
com2: []
com: 'i am a comment after a command'
regexp('printf ('' % i m not a comment '') , %i am a comment after a command followed by comma', '^(^[^'']+|[^'']+(''.*'')+[^'']+)?(;|,)\s*%(?<com>.*)|^(\s)*%(?<com2>.*)', 'names')
Response:
com2: []
com: 'i am a comment after a command followed by comma'
This case makes sure that a % inside a string literal isn't caught as a comment:
regexp('printf('' ;%i m not a comment '');', '^(^[^'']+|[^'']+(''.*'')+[^'']+)?(;|,)\s*%(?<com>.*)|^(\s)*%(?<com2>.*)', 'names')
ans =
0x0 struct array with fields:
com2
com
The comments are stored in the named tokens com and com2 of the returned struct.
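As an alternative to a single regular expression, the same idea (skip any % that sits inside a quoted string) can be sketched as a small character scanner. The following is only a rough illustration in Python, not part of the answer above: the function name find_comment is hypothetical, and the rule for telling a transpose quote from a string-opening quote is a simplifying assumption. It handles one line at a time and ignores double-quoted strings, ... continuation comments, and %{ ... %} blocks.

```python
def find_comment(line):
    """Return the text after the first % outside a string literal, or None."""
    in_string = False
    i = 0
    while i < len(line):
        ch = line[i]
        prev = line[i - 1] if i > 0 else ""
        if in_string:
            if ch == "'":
                if i + 1 < len(line) and line[i + 1] == "'":
                    i += 1  # '' inside a string is an escaped quote, skip both
                else:
                    in_string = False  # closing quote
        elif ch == "%":
            return line[i + 1:]
        elif ch == "'":
            # Assumed heuristic: directly after a name, number, closing bracket,
            # '.' or another ', the quote is the transpose operator.
            is_transpose = prev != "" and (prev.isalnum() or prev in "_)]}.'")
            if not is_transpose:
                in_string = True
        i += 1
    return None


print(find_comment("fprintf('%d',n)%do this"))  # comment after a command
print(find_comment("sprintf('x = %d', 5)"))     # no comment on this line
```

A scanner like this trades the compactness of the regex for explicit state, which makes the escaped-quote and transpose cases easier to reason about.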
This doesn't meet the question's requirements, but I thought I'd share it anyway.
If MATLAB is accessible, then you can use the publish function, then pull out the comments with grep.
So for the following function in myfun.m
function [out] = myfun(n)
% Comment
out = ['% Not a ',... this is a comment too
'comment'];
fprintf('%d',n)%do this
%{
Multiline
comment
%}
we run
publish('myfun.m')
which produces the file html/myfun.html. Now with e.g. bash, we can run
grep -oP '<span class="comment">.*?</span>' html/myfun.html
which returns
<span class="comment">% Comment</span>
<span class="comment"> this is a comment too</span>
<span class="comment">%do this</span>
<span class="comment">%}</span>
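This is not quite complete: publish splits the multiline comment across several lines, so the line-based search only matches the closing %}. The generated HTML looks like this: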
<span class="comment">%{
</span><span class="comment"> Multiline
</span><span class="comment"> comment, n>2
</span><span class="comment">%}</span>
Capturing these requires a multiline search; see the related question "How can I search for a multiline pattern in a file?"
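One possible way to do the multiline match is sketched below in Python rather than grep. It assumes the comment spans appear exactly as class="comment" in the published HTML, as in the output above; the function name extract_comments is hypothetical. The DOTALL flag lets . match newlines, so a span that ends on the following line is still captured.

```python
import html
import re

# Lazy match up to the first closing tag; DOTALL lets '.' cross line breaks.
COMMENT_SPAN = re.compile(r'<span class="comment">(.*?)</span>', re.DOTALL)


def extract_comments(html_text):
    """Return the text of every comment span, with HTML entities decoded."""
    return [html.unescape(body).rstrip("\n")
            for body in COMMENT_SPAN.findall(html_text)]


sample = (
    '<span class="comment">%{\n'
    '</span><span class="comment"> Multiline\n'
    '</span><span class="comment"> comment\n'
    '</span><span class="comment">%}</span>'
)
for comment in extract_comments(sample):
    print(comment)
```

To run it on the real file, read html/myfun.html into a string first and pass that string to extract_comments.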