
Convert a JSON string to a struct column without schema in Spark

Spark: 3.0.0
Scala: 2.12.8

My DataFrame has a column containing a JSON string, and I want to create a new column of StructType from it.

temp_json_string
{"name":"test","id":"12","category":[{"products":["A","B"],"displayName":"test_1","displayLabel":"test1"},{"products":["C"],"displayName":"test_2","displayLabel":"test2"}],"createdAt":"","createdBy":""}
root
 |-- temp_json_string: string (nullable = true)

Formatted JSON:

{
  "name":"test",
  "id":"12",
  "category":[
    {
      "products":[
        "A",
        "B"
      ],
      "displayName":"test_1",
      "displayLabel":"test1"
    },
    {
      "products":[
        "C"
      ],
      "displayName":"test_2",
      "displayLabel":"test2"
    }
  ],
  "createdAt":"",
  "createdBy":""
}

I want to create a new column of type struct, so I tried:

dataFrame
     .withColumn("temp_json_struct", struct(col("temp_json_string")))
     .select("temp_json_struct")

Now, I get the schema as:

root
 |-- temp_json_struct: struct (nullable = false)
 |    |-- temp_json_string: string (nullable = true)

Desired result:

root
 |-- temp_json_struct: struct (nullable = false)
 |    |-- name: string (nullable = true)
 |    |-- category: array (nullable = true)
 |    |    |-- products: array (nullable = true)
 |    |    |-- displayName: string (nullable = true)
 |    |    |-- displayLabel: string (nullable = true)
 |    |-- createdAt: timestamp (nullable = true)
 |    |-- updatedAt: timestamp (nullable = true)
JDev asked Feb 04 '26 00:02



1 Answer

json_str_col is the column that holds the JSON string (the snippet is PySpark). I had multiple files, which is why the first line passes every row's JSON string to spark.read.json to infer a single schema. If you know your schema up front, just replace json_schema with it.

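from pyspark.sql.functions import col, from_json
# Infer the schema by letting Spark read every JSON string in the column
json_schema = spark.read.json(df.rdd.map(lambda row: row.json_str_col)).schema
df = df.withColumn('new_col', from_json(col('json_str_col'), json_schema))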
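
If the schema is known up front, one option is to write it by hand instead of inferring it. The sketch below is only an illustration based on the sample JSON from the question: the names SAMPLE_JSON, JSON_SCHEMA_DDL and add_struct_column are hypothetical, and createdAt and createdBy are assumed to be plain strings because the sample values are empty (casting them to timestamps afterwards is left out).

```python
import json

# Sample record copied from the question.
SAMPLE_JSON = (
    '{"name":"test","id":"12","category":['
    '{"products":["A","B"],"displayName":"test_1","displayLabel":"test1"},'
    '{"products":["C"],"displayName":"test_2","displayLabel":"test2"}],'
    '"createdAt":"","createdBy":""}'
)

# Hand-written schema as a DDL string (an assumption based on the sample above).
JSON_SCHEMA_DDL = (
    "name STRING, id STRING, "
    "category ARRAY<STRUCT<products: ARRAY<STRING>, "
    "displayName: STRING, displayLabel: STRING>>, "
    "createdAt STRING, createdBy STRING"
)


def add_struct_column(df, json_col="temp_json_string", out_col="temp_json_struct"):
    """Return df with out_col holding json_col parsed against JSON_SCHEMA_DDL."""
    # Imported here so the schema constants can be checked without PySpark installed.
    from pyspark.sql.functions import col, from_json

    return df.withColumn(out_col, from_json(col(json_col), JSON_SCHEMA_DDL))


# Rough sanity check: every top-level key of the sample appears in the schema string.
missing = [key for key in json.loads(SAMPLE_JSON) if key not in JSON_SCHEMA_DDL]
print(missing)  # prints []
```

With a DataFrame like the one in the question, calling df = add_struct_column(df) and then df.printSchema() should show temp_json_struct as a struct with these fields. from_json also accepts a StructType in place of the DDL string, so the inferred json_schema can be passed the same way.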
jayrythium answered Feb 05 '26 18:02
