
How can I invoke an external command to transform JSON using jq?

Tags: jq

I have the following JSON input:

{
  "url": "https://www.example.com", 
  "html": "<html>...</html>"
}

How can I use jq to extract all JavaScript <script> tags from the html field using pup?

For example, I can extract all the scripts I want from a single HTML document with a pipeline:

cat example.json | jq -r .html | pup 'script[type="text/javascript"] text{}'

I would like to put all of the extracted scripts into a new JSON object:

{
  "url": "https://www.example.com",
  "scripts": [
    "<script>...",
    "<script>..."
  ]
}

If I try using:

jq -c '{url: .url, scripts: [.html | pup "script[type=text/javascript] text{}"]}'

it will not work because pup is an external command and not part of jq.

How can I achieve this?

alturkovic asked Nov 02 '25 09:11



2 Answers

It's not possible directly from jq (jq cannot call external programs from within a jq program). But if your input only contains a single object with those two properties, the following should work in POSIX shells:

{
  jq '{url}' input.json;
  jq -r '.html' input.json | pup ... | jq -Rs '{scripts: .}';
} | jq -s 'add'
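To see how the pieces fit together without pup, here is a minimal sketch that only needs jq. The function name emit_parts and the script text are made up for illustration; they stand in for the two commands inside the braces.

```shell
# Hypothetical stand-in for the two commands inside the braces above.
emit_parts() {
  # First object: what jq '{url}' would print for the sample input.
  printf '%s\n' '{"url":"https://www.example.com"}'
  # -R reads raw text instead of JSON; -s slurps all of it into one string.
  printf 'alert(1);\nalert(2);\n' | jq -Rs '{scripts: .}'
}

# -s collects both objects into an array; `add` merges them with `+`.
emit_parts | jq -c -s 'add'
```

This prints {"url":"https://www.example.com","scripts":"alert(1);\nalert(2);\n"}. Note that scripts is a single string, and that jq -Rs keeps the trailing newline of whatever it reads.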

It's also possible to pass the extracted scripts to jq with --arg, which still invokes jq twice and reads your input twice:

jq --arg scripts "$(jq -r '.html' input.json | pup ... )" \
'{url, $scripts}' input.json

Sample output:

{
  "url": "https://www.example.com",
  "scripts": "...."
}
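If the input contains several objects rather than one, the same --arg idea can be applied per object by looping in the shell. This is only a sketch under assumptions: pup is on the PATH, the input is a stream of {url, html} objects, and the function name add_scripts_stream is hypothetical.

```shell
# Hypothetical helper: reads a stream of {url, html} objects on stdin and
# prints one compact {url, scripts} object per input object.
add_scripts_stream() {
  # jq -c prints every object on a single line, so `read` gets one object per iteration.
  jq -c '.' | while IFS= read -r obj; do
    # Same extraction as in the question, applied to this object's html only.
    scripts=$(printf '%s\n' "$obj" |
      jq -r '.html' |
      pup 'script[type="text/javascript"] text{}')
    # {url, $scripts} is shorthand for {url: .url, scripts: $scripts}.
    printf '%s\n' "$obj" | jq -c --arg scripts "$scripts" '{url, $scripts}'
  done
}
```

It could be called as add_scripts_stream < input.json. As with the commands above, scripts is still one string per object, and jq now runs twice for every object, so this trades speed for simplicity.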
knittl answered Nov 04 '25 02:11



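Use json{} instead of text{} so that pup emits JSON, which jq can then post-process. In the example below, input reads pup's output as a second input, and the <( ... ) process substitution requires a shell such as bash or zsh: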

jsonfile='example.json'
jq '.scripts = (input | map(.text // empty))' "$jsonfile" <(
  jq -r '.html' "$jsonfile" | pup 'script[type=text/javascript] json{}'
)
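The filter above keeps the original html key next to the new scripts key, because .scripts = ... only adds a field. A possible variant that prints just url and scripts, and avoids process substitution by passing pup's output with --argjson, might look like this. It is a sketch, not a tested drop-in: the function name extract_scripts is hypothetical, pup is assumed to be on the PATH, and the [] fallback for empty pup output is a defensive assumption.

```shell
# Hypothetical helper: prints {url, scripts: [...]} for one JSON file.
# The body is a ( ... ) subshell so the variables do not leak into the caller.
extract_scripts() (
  jsonfile=$1
  # json{} makes pup print the matched nodes as a JSON array of objects.
  nodes=$(jq -r '.html' "$jsonfile" |
    pup 'script[type="text/javascript"] json{}')
  # --argjson parses the value as JSON; fall back to [] if pup printed nothing.
  # `.text // empty` skips nodes without inline text (e.g. scripts with only src).
  jq --argjson nodes "${nodes:-[]}" \
    '{url, scripts: ($nodes | map(.text // empty))}' "$jsonfile"
)
```

Calling extract_scripts example.json would then print a single object whose scripts value is an array with one entry per inline script.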
pmf answered Nov 04 '25 03:11
