
Reading a Dictionary inside JSON

Here is my JSON:

[{"dict": {"key": "value1"}}, {"dict": {"key": "value2"}}]

Here is my parse code:

val mdf = sparkSession.read.option("multiLine","true").json("multi2.json")
mdf.show(false)

This outputs:

+--------+
|dict    |
+--------+
|[value1]|
|[value2]|
+--------+

I want to see the name-value pairs, that is, both the keys and the values.

How do I do this?

Thanks

More Than Five asked Oct 16 '25 21:10


1 Answer

If you want to expand the nested fields into top-level columns, select dict.* (the option is documented as multiLine, but option keys are case-insensitive, so multiline works as well):

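// In spark-shell, `spark` and spark.implicits._ (needed for the $"..." syntax) are already in scope.
val df = spark.read.option("multiline", "true").json("multi2.json")
df.select($"dict.*").show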

// +------+
// |   key|
// +------+
// |value1|
// |value2|
// +------+
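
To see why show printed [value1] without the key name in the question, it helps to look at the inferred schema: Spark reads dict as a struct, and the key lives in the schema rather than in the data. Here is a minimal sketch of that check. The object name InspectSchema and the load helper are made up for illustration, and it builds the two records in memory (one JSON object per string) instead of reading multi2.json, assuming Spark 2.2 or later on the classpath.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object InspectSchema {
  // Hypothetical helper: builds the question's two records in memory
  // instead of reading multi2.json, so the sketch needs no input file.
  def load(spark: SparkSession): DataFrame = {
    import spark.implicits._ // provides .toDS() on local collections
    val records = Seq(
      """{"dict": {"key": "value1"}}""",
      """{"dict": {"key": "value2"}}"""
    ).toDS()
    spark.read.json(records) // schema is inferred from the data
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("inspect-schema")
      .getOrCreate()

    val df = load(spark)
    df.printSchema()             // dict shows up as a struct with a single string field named key
    df.select("dict.key").show() // pull one nested field by its dotted path
    spark.stop()
  }
}
```

Selecting "dict.key" is the single-field version of dict.*, which is handy when the struct has many fields and you only need one.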

If you want to treat it as a dictionary, provide an explicit schema that declares dict as a MapType, so Spark does not infer a struct:

import org.apache.spark.sql.types._

val schema = StructType(Seq(
  StructField("dict", MapType(StringType, StringType))
))

val dfm = spark.read
  .schema(schema)
  .option("multiline", "true")
  .json("multi2.json")

dfm.show
// +------------------+
// |              dict|
// +------------------+
// |Map(key -> value1)|
// |Map(key -> value2)|
// +------------------+

And if you want one key-value pair per row, explode the map column:

import org.apache.spark.sql.functions._

dfm.select(explode(col("dict"))).show
// +---+------+
// |key| value|
// +---+------+
// |key|value1|
// |key|value2|
// +---+------+
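
The default column names after exploding a map are key and value, which is confusing here because the data also contains a key called "key". One way around that is to pass aliases to the exploded column. This is only a sketch: ExplodePairs, load and pairs are names I made up, the data is built in memory instead of read from multi2.json, and it assumes Spark 2.2 or later.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, explode}
import org.apache.spark.sql.types.{MapType, StringType, StructField, StructType}

object ExplodePairs {
  // Hypothetical helper: in-memory stand-in for multi2.json, read with a map schema.
  def load(spark: SparkSession): DataFrame = {
    import spark.implicits._
    val schema = StructType(Seq(
      StructField("dict", MapType(StringType, StringType))
    ))
    val records = Seq(
      """{"dict": {"key": "value1"}}""",
      """{"dict": {"key": "value2"}}"""
    ).toDS()
    spark.read.schema(schema).json(records)
  }

  // One row per map entry, with the two generated columns renamed to name and value.
  def pairs(dfm: DataFrame): DataFrame =
    dfm.select(explode(col("dict")).as(Seq("name", "value")))

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("explode-pairs")
      .getOrCreate()

    pairs(load(spark)).show()
    spark.stop()
  }
}
```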
Alper t. Turker answered Oct 18 '25 14:10