
How to extract the content attribute of the meta name=generator tag?

I am using the code below to extract the content of the meta 'generator' tag from a web page with Jsoup:

Elements metalinks = doc.select("meta[name=generator]");
boolean metafound=false;

if(metalinks.isEmpty()==false)
{ 
    metatagcontent = metalinks.first().select("content").toString();
    metarequired=metatagcontent;
    metafound=true;
}
else 
{
    metarequired="NOT_FOUND";
    metafound=false;
}

The problem is that for a page that does contain the meta generator tag, no value is shown when I output the variable 'metarequired'. For a page that does not have the tag, the value 'NOT_FOUND' is shown correctly. What am I doing wrong here?

Arvind asked Feb 02 '26 05:02



1 Answer

From your code,

metalinks.first().select("content").toString();

This is not correct. select() takes a CSS selector, so select("content") looks for a nested <content> element:

<meta ...>
    <content ... /> <!-- This one, which of course doesn't exist. -->
</meta>

while you actually want to get the attribute

<meta ... content="..." />

You need to use attr("content") instead of select("content").

metatagcontent = metalinks.first().attr("content");

See also:

  • Jsoup cookbook - Selector syntax
  • Jsoup Selector API documentation
  • W3 CSS3 selector specification

Unrelated to the concrete problem, you don't need to compare against a boolean literal in the if condition. isEmpty() already returns a boolean:

if (!metalinks.isEmpty())
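
Putting both points together, a minimal self-contained sketch could look like the following. It assumes Jsoup is on the classpath. The class name GeneratorMeta, the helper extractGenerator, and the sample HTML (including the "ExampleCMS 1.0" value) are made up for illustration and are not from the question.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.select.Elements;

// Hypothetical demo class; the names here are illustrative only.
class GeneratorMeta {

    // Returns the content attribute of the first <meta name="generator"> tag,
    // or the given fallback when the page has no such tag.
    static String extractGenerator(String html, String fallback) {
        Document doc = Jsoup.parse(html);
        Elements metalinks = doc.select("meta[name=generator]");
        if (!metalinks.isEmpty()) {
            // attr() reads an attribute of the element itself.
            return metalinks.first().attr("content");
        }
        return fallback;
    }

    // Shows why the original code printed nothing: select("content") searches
    // for <content> elements, and a <meta> tag has none.
    static boolean selectFindsNothing(String html) {
        Document doc = Jsoup.parse(html);
        Elements metalinks = doc.select("meta[name=generator]");
        return !metalinks.isEmpty() && metalinks.first().select("content").isEmpty();
    }

    public static void main(String[] args) {
        // Made-up sample pages.
        String withTag = "<html><head><meta name=\"generator\" content=\"ExampleCMS 1.0\"></head><body></body></html>";
        String withoutTag = "<html><head><title>No generator</title></head><body></body></html>";

        System.out.println(extractGenerator(withTag, "NOT_FOUND"));    // ExampleCMS 1.0
        System.out.println(extractGenerator(withoutTag, "NOT_FOUND")); // NOT_FOUND
        System.out.println(selectFindsNothing(withTag));               // true
    }
}
```

Returning the fallback from one helper also makes the separate metafound flag unnecessary, since the caller can compare the result against "NOT_FOUND" if it needs to know.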
BalusC answered Feb 03 '26 20:02



