What causes this Serialization error in Apache Spark 1.4.0 when calling:
sc.parallelize(strList, 4)
This exception is thrown:
com.fasterxml.jackson.databind.JsonMappingException: 
Could not find creator property with name 'id' (in class org.apache.spark.rdd.RDDOperationScope)
Thrown from addBeanProps in Jackson: com.fasterxml.jackson.databind.deser.BeanDeserializerFactory#addBeanProps
The data being parallelized is a Seq[String], and the number of partitions doesn't seem to matter (tried 1, 2, 4).
There is no serialization stack trace of the kind you normally get when the worker closure cannot be serialized.
What is another way to track this down?
@Interfector is correct. I ran into this issue as well. Here is a snippet from my sbt file, including the 'dependencyOverrides' section that fixed it.
libraryDependencies ++= Seq(
  "com.amazonaws" % "amazon-kinesis-client" % "1.4.0",
  "org.apache.spark" %% "spark-core" % "1.4.0",
  "org.apache.spark" %% "spark-streaming" % "1.4.0",
  "org.apache.spark" %% "spark-streaming-kinesis-asl" % "1.4.0",
  "com.amazonaws" % "aws-java-sdk" % "1.10.2"
)
dependencyOverrides ++= Set(
  "com.fasterxml.jackson.core" % "jackson-databind" % "2.4.4"
)
I suspect this is caused by the classpath providing a different version of Jackson than the one Spark expects (2.4.4, if I'm not mistaken). You will need to adjust your classpath so that the correct Jackson version is picked up first by Spark.
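One way to check which Jackson jars actually end up on the classpath is to ask the JVM where it loaded the classes from. The following is only a rough sketch: `JacksonLocator` and `locationOf` are hypothetical names made up for this example, and it assumes the jar file names carry the version number, which is usually but not always the case.

```scala
object JacksonLocator {
  // Hypothetical helper: returns the jar or directory a class was loaded from,
  // or None if the class is missing or the JVM does not report a location.
  def locationOf(className: String): Option[String] =
    try {
      // 'false' means: load the class without running its static initializers.
      val cls = Class.forName(className, false, getClass.getClassLoader)
      for {
        source   <- Option(cls.getProtectionDomain.getCodeSource)
        location <- Option(source.getLocation)
      } yield location.toString
    } catch {
      case _: ClassNotFoundException => None
    }

  def main(args: Array[String]): Unit = {
    // Classes from the Jackson modules that are typically present alongside Spark.
    val names = Seq(
      "com.fasterxml.jackson.databind.ObjectMapper",
      "com.fasterxml.jackson.core.JsonFactory",
      "com.fasterxml.jackson.module.scala.DefaultScalaModule"
    )
    names.foreach { name =>
      val where = locationOf(name).getOrElse("not found or location unknown")
      println(name + " -> " + where)
    }
  }
}
```

Run it with the same classpath as the Spark job (for example from the same sbt project). If the printed jars show mixed Jackson versions, that would be consistent with the version conflict described above.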