List of struct's field names in Spark dataframe

I have a dataframe with the following schema:

root
 |-- _id: long (nullable = true)
 |-- student_info: struct (nullable = true)
 |    |-- firstname: string (nullable = true)
 |    |-- lastname: string (nullable = true)
 |    |-- major: string (nullable = true)
 |    |-- honour_roll: boolean (nullable = true)
 |-- school_name: string (nullable = true)

How can I get a list of only the field names under "student_info", i.e. ["firstname", "lastname", "major", "honour_roll"]?

Pari asked Sep 08 '25 00:09



1 Answer

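All of the following return the list of the struct's field names. The first and the last two expand the struct into top-level columns with select; the second and third read the names straight from the schema. The .columns approach looks cleanest.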

df.select("student_info.*").columns
df.schema["student_info"].dataType.names
df.schema["student_info"].dataType.fieldNames()
df.select("student_info.*").schema.names
df.select("student_info.*").schema.fieldNames()
ZygD answered Sep 10 '25 08:09
