I'm trying to run a spark sql test against a hive table using the Spark Java API. The problem I am having is with kerberos. Whenever I attempt to run the program I get this error message:
Caused by: org.apache.spark.sql.AnalysisException: java.lang.RuntimeException: org.apache.hadoop.security.AccessControlException: SIMPLE authentication is not enabled. Available:[TOKEN, KERBEROS];
at org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:106)
at org.apache.spark.sql.hive.HiveExternalCatalog.databaseExists(HiveExternalCatalog.scala:194)
at org.apache.spark.sql.internal.SharedState.externalCatalog$lzycompute(SharedState.scala:114)
at org.apache.spark.sql.internal.SharedState.externalCatalog(SharedState.scala:102)
at org.apache.spark.sql.hive.HiveSessionStateBuilder.externalCatalog(HiveSessionStateBuilder.scala:39)
at org.apache.spark.sql.hive.HiveSessionStateBuilder.catalog$lzycompute(HiveSessionStateBuilder.scala:54)
at org.apache.spark.sql.hive.HiveSessionStateBuilder.catalog(HiveSessionStateBuilder.scala:52)
at org.apache.spark.sql.hive.HiveSessionStateBuilder$$anon$1.<init>(HiveSessionStateBuilder.scala:69)
at org.apache.spark.sql.hive.HiveSessionStateBuilder.analyzer(HiveSessionStateBuilder.scala:69)
at org.apache.spark.sql.internal.BaseSessionStateBuilder$$anonfun$build$2.apply(BaseSessionStateBuilder.scala:293)
at org.apache.spark.sql.internal.BaseSessionStateBuilder$$anonfun$build$2.apply(BaseSessionStateBuilder.scala:293)
at org.apache.spark.sql.internal.SessionState.analyzer$lzycompute(SessionState.scala:79)
at org.apache.spark.sql.internal.SessionState.analyzer(SessionState.scala:79)
at org.apache.spark.sql.execution.QueryExecution.analyzed$lzycompute(QueryExecution.scala:57)
at org.apache.spark.sql.execution.QueryExecution.analyzed(QueryExecution.scala:55)
at org.apache.spark.sql.execution.QueryExecution.assertAnalyzed(QueryExecution.scala:47)
at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:74)
at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:638)
at tester.SparkSample.lambda$0(SparkSample.java:62)
... 5 more
on this line of code:
ss.sql("select count(*) from entps_pma.baraccount").show();
Now when I run the code, I log into kerberos just fine and get this message:
18/05/01 11:21:03 INFO security.UserGroupInformation: Login successful for user <kerberos user> using keytab file /root/hdfs.keytab
I even connect to the Hive Metastore:
18/05/01 11:21:06 INFO hive.metastore: Trying to connect to metastore with URI thrift://<hiveserver>:9083
18/05/01 11:21:06 INFO hive.metastore: Connected to metastore.
But right after that I get the error. Appreciate any direction here. Here is my code:
public static void runSample(String fullPrincipal) throws IOException {
System.setProperty("hive.metastore.sasl.enabled", "true");
System.setProperty("hive.security.authorization.enabled", "true");
System.setProperty("hive.metastore.kerberos.principal", fullPrincipal);
System.setProperty("hive.metastore.execute.setugi", "true");
System.setProperty("hadoop.security.authentication", "kerberos");
Configuration conf = setSecurity(fullPrincipal);
loginUser = UserGroupInformation.getLoginUser();
loginUser.doAs((PrivilegedAction<Void>) () -> {
SparkConf sparkConf = new SparkConf().setMaster("local");
sparkConf.set("spark.sql.warehouse.dir", "hdfs:///user/hive/warehouse");
sparkConf.set("hive.metastore.uris", "thrift://<hive server>:9083");
sparkConf.set("hadoop.security.authentication", "kerberos");
sparkConf.set("hadoop.rpc.protection", "privacy");
sparkConf.set("spark.driver.extraClassPath",
"/opt/cloudera/parcels/CDH/jars/*.jar:/opt/cloudera/parcels/CDH/lib/hive/conf:/opt/cloudera/parcels/CDH/lib/hive/lib/*.jar");
sparkConf.set("spark.executor.extraClassPath",
"/opt/cloudera/parcels/CDH/jars/*.jar:/opt/cloudera/parcels/CDH/lib/hive/conf:/opt/cloudera/parcels/CDH/lib/hive/lib/*.jar");
sparkConf.set("spark.eventLog.enabled", "false");
SparkSession ss = SparkSession
.builder()
.enableHiveSupport()
.config(sparkConf)
.appName("Jim Test Spark App")
.getOrCreate();
ss.sparkContext()
.hadoopConfiguration()
.addResource(conf);
ss.sql("select count(*) from entps_pma.baraccount").show();
return null;
});
}
I guess you are running Spark on YARN. You need to specify spark.yarn.principal and spark.yarn.keytab parameters. Please check running Spark on YARN documentation
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With