
How to extract a specific HTML element from curl output

When I curl a page:

curl http://test.com

I get the following result:

<html>
<body>
<div>
  <dl>
    <dd> 10 times </dd>
  </dl>
</div>
</body>
</html>

My desired result is simply: 10 times

Is there a good way to achieve this?

If anyone has suggestions, please let me know.

Thanks

Heisenberg asked Aug 31 '25 17:08



1 Answer

If you are unable to use an HTML parser for whatever reason, then for the simple HTML example given you could use:

 curl http://test.com | sed -rn 's@(^.*<dd>)(.*)(</dd>)@\2@p'

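Pipe the output of the curl command into sed. The -n option suppresses sed's default printing, and -r (or -E) enables extended regular expressions. The expression splits a matching line into three captured groups (everything up to and including <dd>, the text between the tags, and the closing </dd>) and replaces the line with the second group only; the p flag then prints the result. This relies on the opening and closing tags being on the same line, and it keeps any whitespace around the text.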
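
As a minimal sketch of the same approach, the following variant runs the substitution against the sample HTML from the question and also trims the whitespace around the text. The shell variables html and result are names chosen for illustration only; html stands in for the output of curl http://test.com. It uses -E rather than -r, which both GNU and BSD sed should accept.

```shell
# Sample HTML from the question, standing in for the curl output (assumption).
html='<html>
<body>
<div>
  <dl>
    <dd> 10 times </dd>
  </dl>
</div>
</body>
</html>'

# -n suppresses automatic printing; -E enables extended regular expressions.
# The single capture group holds the text between <dd> and </dd>,
# with leading and trailing whitespace left outside the group.
result=$(printf '%s\n' "$html" |
  sed -nE 's@^.*<dd>[[:space:]]*(.*[^[:space:]])[[:space:]]*</dd>.*$@\1@p')

echo "$result"
```

With the sample markup above this should print 10 times without the surrounding spaces. Like the original command, it only works when <dd> and </dd> sit on the same line.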

Raman Sailopal answered Sep 02 '25 10:09
