
Convert string column to json and parse in pyspark

My dataframe looks like this:

|ID|Notes|
---------------
|1|'{"Country":"USA","Count":"1000"}'|
|2|{"Country":"USA","Count":"1000"}|

ID : int
Notes : string

When I use from_json to parse the Notes column, it returns all null values. I need help parsing the Notes column into separate columns in PySpark.

Asked by KM Kavia, Oct 27 '25 08:10


1 Answer

When you use from_json(), make sure every value in the column is a valid JSON object stored as a string. In your sample data, the Notes value for ID=1 is not valid JSON: the object is wrapped in an extra pair of single quotes, so from_json() cannot parse it and returns null for that row. For values that are well formed, the following call parses Notes into a map column (from_json comes from pyspark.sql.functions; MapType and StringType come from pyspark.sql.types):

df = df.withColumn("Notes",from_json(df.Notes,MapType(StringType(),StringType())))

The fix is to change the input data so that every value in the Notes column has the same format: a JSON object as a string and nothing more. The inconsistent quoting is the root cause of the issue. The correct format looks like this:

| ID | Notes |
---------------
| 1 | {"Country":"USA","Count":"1000"} |
| 2 | {"Country":"USA","Count":"1000"} |
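If the data cannot be fixed at its source, one option is to strip the wrapping quotes in Spark before parsing. The sketch below assumes the only defect is a single pair of single quotes around the whole value. The names WRAPPING_QUOTES, strip_wrapping_quotes and normalize_notes are hypothetical helpers introduced for illustration, not part of PySpark.

```python
import re

# Matches a value wrapped in one pair of single quotes and captures the inside.
# This simple pattern is valid in both Python's re and the Java regex engine
# that Spark uses.
WRAPPING_QUOTES = r"^'(.*)'$"


def strip_wrapping_quotes(value):
    """Plain-Python equivalent, handy for trying the pattern on sample strings."""
    return re.sub(WRAPPING_QUOTES, r"\1", value)


def normalize_notes(df, column="Notes"):
    """Return df with the wrapping single quotes removed from `column`."""
    # Imported here so the pattern above can be tried without a Spark install.
    from pyspark.sql.functions import regexp_replace

    # "$1" is Java-regex syntax for the first capture group.
    # Values without wrapping quotes do not match and are left unchanged.
    return df.withColumn(column, regexp_replace(column, WRAPPING_QUOTES, "$1"))
```

After `df = normalize_notes(df)`, both sample rows should hold the same string, so from_json() or json_tuple() can treat them alike.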

To turn the values in Notes into separate columns in PySpark, you can use json_tuple() instead of from_json(). It extracts the named fields from a JSON string column and returns each one as a new column.

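from pyspark.sql.functions import col, json_tuple

# Extract the Country and Count fields, then name the resulting columns
df = df.select(col("id"),json_tuple(col("Notes"),"Country","Count")) \
    .toDF("id","Country","Count")
df.show()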


NOTE: json_tuple() also returns null when a value is not valid JSON, so make sure each value in the column is a JSON object as a string, without additional quotes.
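If you would rather stay with from_json(), a similar result can be sketched by parsing into a struct and selecting its fields. This assumes the Notes values are already clean and contain only the Country and Count keys. NOTES_SCHEMA and notes_to_columns are hypothetical names introduced for illustration.

```python
# DDL-formatted schema string; from_json() accepts this in place of a StructType.
# Backticks keep the field names from being read as SQL keywords.
NOTES_SCHEMA = "`Country` STRING, `Count` STRING"


def notes_to_columns(df):
    """Parse the Notes JSON string into a struct and expose its fields as columns."""
    # Imported here so the schema string can be inspected without a Spark install.
    from pyspark.sql.functions import col, from_json

    parsed = df.withColumn("parsed", from_json(col("Notes"), NOTES_SCHEMA))
    # Dotted names select individual fields of the struct column.
    return parsed.select("ID", "parsed.Country", "parsed.Count")
```

Calling `notes_to_columns(df).show()` should then display ID, Country and Count as separate columns. Rows whose Notes value is not valid JSON would still show nulls, as with json_tuple().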

Answered by Saideep Arikontham, Oct 28 '25 22:10


