
Extract meta tags using regex

I need to extract meta tags from a string, for which I am using \<meta[\s\S]*?\>. In addition, I want to skip any meta tag that has an ignore (or someIgnore) attribute, for example:

<meta property="position" content="1" someIgnore ignore="metaextract"/>

This is my sample function:


function parseMetas(locals) {
    var str = locals.body, regex = /\<meta[\s\S]*?\>/g;
    if (regex.test(str)) {
        locals.body = str.replace(regex, '');
        locals.meta = str.match(regex).join('\n');
    }
}

Aditya Sharma asked Oct 25 '25 15:10


1 Answer

You could use a negative lookahead, (?!...), so that a <meta tag is rejected when ignore or someIgnore appears after it.

function parseMetas(locals) {
    const str = locals.body;
    // The negative lookahead rejects a <meta followed by ignore or someIgnore.
    const regex = /<meta(?!.*?(ignore|someIgnore))[\s\S]*?\/?>/g;
    if (regex.test(str)) {
        locals.body = str.replace(regex, '');
        locals.meta = str.match(regex).join('\n');
    }
}
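One caveat with the pattern above: without the s flag, . does not match line breaks, so the lookahead inspects the rest of the current line rather than just the current tag. If several tags share a line, a tag without ignore can be rejected because a later tag on that line has it. A minimal sketch of a variant that keeps the lookahead inside the tag by using [^>]* instead of .*? follows. The helper name extractMetas and the sample string are made up for illustration, and the sketch assumes attribute values contain neither > nor the words ignore/someIgnore.

```javascript
// Hypothetical helper: returns every <meta> tag that has neither an
// ignore nor a someIgnore attribute.
function extractMetas(html) {
    // [^>]* cannot cross the closing ">", so the lookahead only sees
    // the attributes of the tag currently being tested.
    const regex = /<meta\b(?![^>]*\s(?:ignore|someIgnore)\b)[^>]*>/g;
    // match() returns null when nothing matches, so fall back to [].
    return html.match(regex) || [];
}

// Two tags per line: the ignored ones sit right next to the wanted ones.
const sample =
    '<meta charset="utf-8"/><meta name="a" ignore="x"/>\n' +
    '<meta name="b" someIgnore /><meta name="c">';

console.log(extractMetas(sample));
// → [ '<meta charset="utf-8"/>', '<meta name="c">' ]
```

Because [^>] also matches line breaks, this variant should handle a tag whose attributes are spread over several lines as well.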

Demo:

let regex = /<meta(?!.*?(ignore|someIgnore))[\s\S]*?\/?>/g;
let input = `
    <meta property="position" content="1" someIgnore ignore="metaextract"/>,
    <meta property="position" content="1" ignore="metaextract"/>,
    <meta property="position" content="1"/>,
    <meta property="position" content="1" someIgnore />,
    <meta name="description" content="type_your_description_here"/>,
    <meta charset="utf-8"/>
`;

// Logs the three tags that have neither ignore nor someIgnore.
console.log(input.match(regex));
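
The function runs the pattern three times (test, replace, match). As a possible simplification, the matches can be collected while replacing, since String.prototype.replace accepts a callback. The sketch below uses the same pattern as the answer; the name parseMetasOnePass and the sample locals object are hypothetical.

```javascript
// Hypothetical single-pass variant of parseMetas.
function parseMetasOnePass(locals) {
    const regex = /<meta(?!.*?(ignore|someIgnore))[\s\S]*?\/?>/g;
    const found = [];
    // The callback receives each matched tag; returning '' removes it.
    locals.body = locals.body.replace(regex, (tag) => {
        found.push(tag);
        return '';
    });
    // Like the original, only set meta when at least one tag matched.
    if (found.length > 0) {
        locals.meta = found.join('\n');
    }
}

// Sample input, one tag per line (an assumption of this pattern).
const locals = {
    body: '<meta charset="utf-8"/>\n' +
          '<meta name="x" ignore="metaextract"/>\n' +
          '<p>Hello</p>',
};

parseMetasOnePass(locals);
console.log(locals.meta); // → <meta charset="utf-8"/>
console.log(locals.body); // the ignored meta tag and the <p> remain
```

This also avoids calling test() on a global regex, which updates lastIndex between calls and can be surprising if the same regex object is reused.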
Sohail Ashraf answered Oct 27 '25 04:10