
Is there a way to parse XML tags in BigQuery Standard SQL?

I have read that it's a bad idea to parse XML/HTML using regular expressions. The alternative suggestion is to use an XML parser. Does one exist in the BigQuery Standard SQL library?

Jonny Brooks asked Nov 01 '25 16:11


1 Answer

Here is the documentation on how to use JavaScript UDFs in BigQuery, as Elliot mentioned.

https://cloud.google.com/bigquery/docs/reference/standard-sql/user-defined-functions

I imagine the UDF might look something like this:

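CREATE TEMPORARY FUNCTION XML(x STRING)
RETURNS STRING
  LANGUAGE js AS """
  // fromXML is defined by the library file loaded in OPTIONS below.
  var data = fromXML(x);
  // Return the text content of the top-level <title> element.
  return data.title;
"""
OPTIONS(
  library="gs://<BUCKET_NAME>/from-xml.min.js"
);

SELECT XML(a) FROM UNNEST(["<title>Title of Page</title>"]) AS a;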

Here, from-xml.min.js is the minified build of the from-xml JavaScript library, uploaded to a Google Cloud Storage (GCS) bucket that your BigQuery project can read.
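If you run this over real rows, some values will probably be NULL or not contain a title at all, so it may be worth making the UDF body a bit more defensive. Below is a rough sketch of that idea in plain JavaScript. The names extractTitle and stubParse are made up for illustration (they are not part of from-xml or BigQuery), and the parser is passed in as an argument only so the logic can be tried locally without the library.

```javascript
// Hypothetical helper (not part of from-xml or BigQuery): a defensive
// version of the UDF body. The parser is passed in so the logic can be
// tried locally; inside the UDF you would pass the library's fromXML.
function extractTitle(xml, parse) {
  // BigQuery hands SQL NULL to a JavaScript UDF as null.
  if (xml === null || xml === undefined) {
    return null;
  }
  try {
    var data = parse(xml);
    var title = data ? data.title : undefined;
    // Only return plain strings; anything else becomes SQL NULL.
    return typeof title === "string" ? title : null;
  } catch (e) {
    // If the parser throws, return NULL instead of failing the query.
    return null;
  }
}

// Stand-in parser for a local check only; it is NOT a real XML parser.
function stubParse(xml) {
  return { title: "Title of Page" };
}

console.log(extractTitle("<title>Title of Page</title>", stubParse));
console.log(extractTitle(null, stubParse));
```

Inside the UDF you would define the helper at the top of the JavaScript body and end with something like return extractTitle(x, fromXML); assuming the library exposes fromXML as a global, as the example above already relies on.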

Lisa Yin answered Nov 03 '25 08:11


