I'm trying to read data from Cassandra and write it to a specific Redis database index, say Redis DB 5.
I need to write all of the data into Redis DB index 5 in hash format.
val spark = SparkSession.builder()
  .appName("redis-df")
  .master("local[*]")
  .config("spark.redis.host", "localhost")
  .config("spark.redis.port", "6379")
  .config("spark.redis.db", 5)
  .config("spark.cassandra.connection.host", "localhost")
  .getOrCreate()
import spark.implicits._

val someDF = Seq(
  (8, "bat"),
  (64, "mouse"),
  (-27, "horse")
).toDF("number", "word")

someDF.write
  .format("org.apache.spark.sql.redis")
  .option("keys.pattern", "*")
  // .option("table", "person") // Is it mandatory?
  .save()
Can I save data into Redis without a table name? I just want to save all the data into Redis DB index 5 without a table name. Is that possible? I have gone through the spark-redis connector documentation and don't see any example covering this. Doc link: https://github.com/RedisLabs/spark-redis/blob/master/doc/dataframe.md#writing
I'm currently using this version of the spark-redis connector:
<dependency>
<groupId>com.redislabs</groupId>
<artifactId>spark-redis_2.11</artifactId>
<version>2.5.0</version>
</dependency>
Has anyone faced this issue? Is there any workaround?
The error I get if I do not set the table name in the options:
FAILED
java.lang.IllegalArgumentException: Option 'table' is not set.
at org.apache.spark.sql.redis.RedisSourceRelation$$anonfun$tableName$1.apply(RedisSourceRelation.scala:208)
at org.apache.spark.sql.redis.RedisSourceRelation$$anonfun$tableName$1.apply(RedisSourceRelation.scala:208)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.sql.redis.RedisSourceRelation.tableName(RedisSourceRelation.scala:208)
at org.apache.spark.sql.redis.RedisSourceRelation.saveSchema(RedisSourceRelation.scala:245)
at org.apache.spark.sql.redis.RedisSourceRelation.insert(RedisSourceRelation.scala:121)
at org.apache.spark.sql.redis.DefaultSource.createRelation(DefaultSource.scala:30)
at org.apache.spark.sql.execution.datasources.SaveIntoDataSourceCommand.run(SaveIntoDataSourceCommand.scala:45)
at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70)
at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68)
The table option is mandatory. The idea is that you specify a table name so that the DataFrame can later be read back from Redis by providing that same table name.
In your case, another option is to convert the DataFrame to a key/value RDD and use `sc.toRedisKV(rdd)`.
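For example, a write with the table option set looks like this (using "person" as an illustrative table name; each row is stored as a Redis hash under a key of the form person:&lt;id&gt;):

```scala
import org.apache.spark.sql.SaveMode

// "person" is just an example table name; it becomes the key prefix
// and is what you pass back to spark.read to load the data again
someDF.write
  .format("org.apache.spark.sql.redis")
  .option("table", "person")
  .mode(SaveMode.Overwrite)
  .save()
```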
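A minimal sketch of that approach, assuming the "number" column is usable as the Redis key (that choice is an assumption for illustration, not something the connector requires):

```scala
import com.redislabs.provider.redis._ // adds toRedisKV and friends to SparkContext

// Convert each DataFrame row to a (key, value) string pair;
// here "number" is the key and "word" is the value
val kvRDD = someDF.rdd.map { row =>
  (row.getAs[Int]("number").toString, row.getAs[String]("word"))
}

// Writes plain key/value strings into the Redis DB configured
// via spark.redis.db; no table option is needed for this path
spark.sparkContext.toRedisKV(kvRDD)
```

Note that `toRedisKV` stores plain string values, not hashes; if you specifically need hash format, spark-redis also exposes `sc.toRedisHASH(kvRDD, hashName)`, which writes all pairs into a single hash under the given name.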