I am seeing different behavior from regexp() on Linux and Windows in MATLAB. I am trying to split a string on a separator. Here is a minimal example:
Linux
test_string = '<some_path>/tool/test/unit_test'
seperator = sprintf('%stest%s',filesep,filesep)
regexp(test_string, seperator,'split')
Output:
1×2 cell array
{'<some_path>/tool'} {'unit_test'}
Windows
test_string = '<some_path>\tool\test\unit_test'
seperator = sprintf('%stest%s',filesep,filesep)
regexp(test_string, seperator,'split')
Output:
1×1 cell array
{'<some_path>\tool\test\unit_test'}
The output of this code snippet on Linux represents the behavior I want. Could anyone explain or point towards resources to understand what is going on?
The path separators are different in Linux (/) and Windows (\).
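The \ character is a special regex metacharacter; it is used to form regex escapes such as \d, which matches a digit. In your Windows separator, \test\, the leading \t is therefore read as the tab escape rather than as a backslash followed by t, so the pattern never matches the path. To match a literal backslash, you must escape it by doubling it (\\).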
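To see the difference, here is a minimal sketch. The input strings are made up for illustration, and the trailing backslash of the separator is left out to keep the patterns simple. It shows that the pattern \test matches a tab followed by est, while \\test matches a literal backslash followed by test:

```matlab
% Hypothetical inputs, hard-coded so the result does not depend on the platform.
with_backslash = 'tool\test';              % literal backslash, then "test"
with_tab       = ['tool' char(9) 'est'];   % tab character (char(9)), then "est"

% '\t' in a pattern is the tab escape, so '\test' means <tab> followed by 'est'.
m1 = regexp(with_backslash, '\test', 'match');   % no match
m2 = regexp(with_tab,       '\test', 'match');   % one match: <tab>est

% Doubling the backslash makes it literal.
m3 = regexp(with_backslash, '\\test', 'match');  % one match: \test
```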
To escape any special regex metacharacters, you can use regexptranslate(op, str) with op set to 'escape':
seperator = sprintf('%stest%s',regexptranslate('escape',filesep), regexptranslate('escape',filesep))
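As a sketch of the complete fix, the whole separator can also be escaped in one call instead of escaping each filesep separately. The path below is a made-up example, and the Windows separator is hard-coded as '\' (rather than taken from filesep) so that the snippet gives the same result on any platform:

```matlab
win_sep     = '\';                              % what filesep returns on Windows
test_string = 'C:\work\tool\test\unit_test';    % hypothetical path

raw_seperator = [win_sep 'test' win_sep];       % '\test\', not safe to use as a pattern
seperator     = regexptranslate('escape', raw_seperator);   % '\\test\\'

parts = regexp(test_string, seperator, 'split');
% parts is {'C:\work\tool', 'unit_test'}
```

In real code, replacing win_sep with filesep should give the two-element split on both Linux and Windows.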
The possible values of op are:
| Type of Translation | Description |
|---|---|
| 'escape' | Translate all special characters in str, such as '$', '.', '?', '[', so that they are treated as literal characters when used in regexp, regexpi, and regexprep. The translation inserts a backslash, or escape, character, '\', before each special character in str. |
| 'wildcard' | Translate all wildcard and '.' characters in str so that they are treated as literal wildcard characters and periods when used in regexp, regexpi, and regexprep. The translation replaces all instances of '*' with '.*', all instances of '?' with '.', and all instances of '.' with '\.'. |
| 'flexible' | Replace text in str with a regular expression that matches the text. If you specify 'flexible', then also specify a regular expression to use as a replacement: newStr = regexptranslate('flexible',str,expression). The expression input can be a character vector or string scalar. This syntax is equivalent to newStr = regexprep(str,expression,regexptranslate('escape',expression)). |
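
For comparison, here is a short sketch of the 'wildcard' option. The file names are made up, and the ^ and $ anchors are an addition here so that the whole name has to match:

```matlab
% Turn a file-style wildcard into a regular expression.
% Per the documented translation: '*' becomes '.*' and '.' becomes '\.'.
pattern  = regexptranslate('wildcard', '*.m');
anchored = ['^' pattern '$'];                  % anchors added for this example

hit  = regexp('unit_test.m', anchored, 'match', 'once');
miss = regexp('unit_testXm', anchored, 'match', 'once');
% hit should be 'unit_test.m'; miss should be empty, because the '.' is now literal
```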