I want to remove the ISO codes and leading underscore from all elements in an array while keeping the file extension. The ISO code always comes before the file extension.
The source array:
var SrcFiles = [
"File_with_nr1_EN.txt",
"File_has_NR_3_ZHHK.txt",
"File_yy_nr_2_DE.pdf"
];
I want it to look like this:
var SrcFiles = [
"File_with_nr1.txt",
"File_has_NR_3.txt",
"File_yy_nr_2.pdf"
];
How do I go about this? Probably with a regex, but how? I found a good regex to only match the file endings, but don't really know how this might help me.
const re = /(?:\.([^.]+))?$/;
Look for a _ followed by one or more characters that aren't a _ ([^_]+), followed by a . and one or more characters that aren't a _, up to the end of the string ($). The extension part, (\.[^_]+), is captured as $1.
var SrcFiles = [
"File_with_nr1_EN.txt",
"File_has_NR_3_ZHHK.txt",
"File_yy_nr_2_DE.pdf"
];
var re = /_[^_]+(\.[^_]+)$/;
console.log(SrcFiles.map(f => f.replace(re, "$1")));
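One thing to keep in mind: this pattern removes the last underscore-separated segment regardless of what it contains, so it assumes every file name really ends in an ISO code. A minimal sketch of that behaviour follows; the second file name is a made-up input that is not in the question.

```javascript
// Regex from the answer above: drops the last "_segment" before the extension.
const re = /_[^_]+(\.[^_]+)$/;

// Hypothetical inputs: the second name carries no ISO code.
const withCode = "File_with_nr1_EN.txt";
const withoutCode = "File_with_nr1.txt";

const strippedWithCode = withCode.replace(re, "$1");
const strippedWithoutCode = withoutCode.replace(re, "$1");

console.log(strippedWithCode);    // "File_with_nr1.txt"
console.log(strippedWithoutCode); // "File_with.txt" ("_nr1" is removed as well)
```

If some names may lack a code, a pattern that only accepts uppercase letters in that segment is probably the safer choice.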
You may capture everything up to the last _, match the _ and one or more uppercase letters, and then capture a dot and the subsequent one or more characters other than a dot up to the end of the string:
/^(.*)_[A-Z]+(\.[^.]+)$/
and replace with $1$2 where $1 is the backreference to Group 1 and $2 refers to the value in Group 2.
The [A-Z]+ can be enhanced to [A-Z]{2,} (since ISO codes usually consist of at least 2 chars) and if a hyphen can appear there, use _[A-Z-]{2,}.
See the JS demo:
var SrcFiles = [
"File_with_nr1_EN.txt",
"File_has_NR_3_ZHHK.txt",
"File_yy_nr_2_DE.pdf"
];
var res = SrcFiles.map(x => x.replace(/^(.*)_[A-Z]+(\.[^.]+)$/, '$1$2'));
// ES5
//var res = SrcFiles.map(function(x) {return x.replace(/^(.*)_[A-Z]+(\.[^.]+)$/, '$1$2'); });
console.log(res);
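The stricter variant mentioned above, _[A-Z-]{2,}, can be sketched in the same way. The helper name stripIsoCode and the last two sample file names are assumptions added for illustration; they are not part of the question.

```javascript
// Hypothetical helper: strips a trailing "_CODE" (2+ uppercase letters or hyphens)
// that sits directly before the file extension.
function stripIsoCode(fileName) {
  return fileName.replace(/^(.*)_[A-Z-]{2,}(\.[^.]+)$/, "$1$2");
}

const samples = [
  "File_with_nr1_EN.txt",
  "File_has_NR_3_ZHHK.txt",
  "File_yy_nr_2_DE.pdf",
  "File_zz_ZH-HK.txt", // hypothetical hyphenated code
  "File_with_nr1.txt"  // hypothetical name without a code
];

const cleaned = samples.map(stripIsoCode);
console.log(cleaned);
```

Names whose last segment is not made of uppercase letters or hyphens, such as the final sample, do not match and are returned unchanged.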