
Different behavior of regexp() on Linux and Windows in MATLAB

I am seeing different behavior from regexp() on Linux and Windows in MATLAB. I am trying to split a string on a separator. Here is a minimal example:

Linux

test_string = '<some_path>/tool/test/unit_test'
seperator = sprintf('%stest%s',filesep,filesep)
regexp(test_string, seperator,'split')

Output:

1×2 cell array
{'<some_path>/tool'}    {'unit_test'}

Windows

test_string = '<some_path>\tool\test\unit_test'
seperator = sprintf('%stest%s',filesep,filesep)
regexp(test_string, seperator,'split')

Output:

1×1 cell array
{'<some_path>\tool\test\unit_test'}

The Linux output is the behavior I want. Could anyone explain what is going on, or point me to resources that do?

asked Oct 20 '25 07:10 by Zeitproblem


1 Answer

The path separators are different in Linux (/) and Windows (\).

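The \ character is a regex metacharacter: it introduces regex escapes such as \d, which matches a digit. On Windows, filesep is \, so the pattern becomes \test\, in which \t is read as a tab character rather than a backslash followed by t, and the pattern never matches the path. To match a literal backslash, escape it by doubling it (\\).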

To escape any special regex metacharacters, you can use regexptranslate(op, str) with op set to 'escape':

seperator = sprintf('%stest%s',regexptranslate('escape',filesep), regexptranslate('escape',filesep))
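
As a minimal sketch of the complete fix, the snippet below rebuilds the question's example so that the same code runs on both platforms. The use of fullfile to construct the test path and the helper variable sep are assumptions made for illustration; they are not part of the original question.

```matlab
% Build the example path with the platform's own file separator
% ('/' on Linux, '\' on Windows). fullfile is used here for illustration.
test_string = fullfile('<some_path>', 'tool', 'test', 'unit_test');

% Escape the file separator so regexp treats it as a literal character.
% On Windows this turns '\' into '\\'.
sep = regexptranslate('escape', filesep);

% Concatenate the escaped separator around the literal text 'test'.
seperator = [sep 'test' sep];

% Split the path at the separator pattern.
parts = regexp(test_string, seperator, 'split')
```

With the separator escaped, the split should return two parts on either platform: the path up to tool, and unit_test.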

The possible values of op are:

Type of Translation: Description

'escape': Translate all special characters in str, such as '$', '.', '?', '[', so that they are treated as literal characters when used in regexp, regexpi, and regexprep. The translation inserts a backslash, or escape, character, '\', before each special character in str.

'wildcard': Translate all wildcard and '.' characters in str so that they are treated as literal wildcard characters and periods when used in regexp, regexpi, and regexprep. The translation replaces all instances of '*' with '.*', all instances of '?' with '.', and all instances of '.' with '\.'.

'flexible': Replace text in str with a regular expression that matches the text. If you specify 'flexible', then also specify a regular expression to use as a replacement: newStr = regexptranslate('flexible',str,expression). The expression input can be a character vector or string scalar. This syntax is equivalent to newStr = regexprep(str,expression,regexptranslate('escape',expression)).
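
To see how the 'escape' and 'wildcard' translations behave, a short sketch along these lines can be run at the MATLAB prompt. The sample path and file names are hypothetical and chosen only for illustration.

```matlab
% 'escape': special characters are made literal; each backslash is doubled.
escaped = regexptranslate('escape', 'C:\temp\test')

% 'wildcard': shell-style wildcards are translated to regex equivalents.
pattern = regexptranslate('wildcard', '*.m')

% The translated pattern can be passed to regexp like any other expression.
% The file names below are made up for this example.
names = {'run_tests.m', 'notes.txt', 'unit_test.m'};
matches = regexp(names, pattern, 'match', 'once')
```

Leaving off the trailing semicolons prints each result, which makes it easy to inspect the translated patterns before using them.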
answered Oct 21 '25 22:10 by Wiktor Stribiżew


