 

Create a map column in Apache Spark from other columns

I searched this quite a bit but cannot find anything that I can adapt to my situation. I have a dataframe like so:

+-----------------+---------------+
|             keys|         values|
+-----------------+---------------+
|[one, two, three]|[101, 202, 303]|
+-----------------+---------------+

`keys` holds an array of strings; `values` holds an array of integers.

I want to create a new column that contains a map of keys to values like so:

+-----------------+---------------+--------------------------------------+
|             keys|         values|                                   map|
+-----------------+---------------+--------------------------------------+
|[one, two, three]|[101, 202, 303]|Map(one -> 101, two -> 202, three -> 303)|
+-----------------+---------------+--------------------------------------+

I've been looking at this question, but not sure it can be used as a starting point for my situation: Spark DataFrame columns transform to Map type and List of Map Type

I need this in Scala please.

Thanks!

asked Oct 20 '25 22:10 by olegmeister

1 Answer

As of Spark 2.4 there is a built-in function for this, `def map_from_arrays(keys: Column, values: Column): Column`, in `org.apache.spark.sql.functions`.
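A minimal sketch of how that could look with the question's dataframe (a local `SparkSession` and the column names `keys`/`values` from the question are assumed):

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.map_from_arrays

// Local session for illustration only; in a real job the session
// is usually provided by the environment.
val spark = SparkSession.builder()
  .appName("map-from-arrays-example")
  .master("local[*]")
  .getOrCreate()
import spark.implicits._

// Sample data matching the dataframe in the question
val df = Seq(
  (Seq("one", "two", "three"), Seq(101, 202, 303))
).toDF("keys", "values")

// Zip the two array columns element-wise into a single MapType column
val withMap = df.withColumn("map", map_from_arrays($"keys", $"values"))
withMap.show(false)
```

Note that `map_from_arrays` requires both arrays to have the same length per row; a mismatch raises a runtime error.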

answered Oct 24 '25 00:10 by Antalagor

