
Unable to insert into Hive partitioned table from Spark

I created an external partitioned table in Hive. The streaming query logs show numInputRows, which means the query is running and receiving data. But when I connect to Hive using beeline and run select * or count(*), the result is always empty.

import org.apache.spark.sql.{Dataset, SparkSession}
import org.apache.spark.sql.streaming.DataStreamWriter

def hiveOrcSetWriter[T](event_stream: Dataset[T])(implicit spark: SparkSession): DataStreamWriter[T] = {

    // _table_loc and _table_checkpoint come from the enclosing scope
    val hiveOrcSetWriter: DataStreamWriter[T] = event_stream
      .writeStream
      .partitionBy("year", "month", "day")   // matches the Hive table's partition columns
      .format("orc")
      .outputMode("append")
      .option("compression", "zlib")
      .option("path", _table_loc)            // the external table's location
      .option("checkpointLocation", _table_checkpoint)

    hiveOrcSetWriter
  }
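
For reference, the returned writer is then started along these lines (events is a stand-in for my actual input Dataset):

    val query = hiveOrcSetWriter(events).start()
    query.awaitTermination()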

What could be the issue? I'm unable to figure it out.

asked Jan 30 '26 by Sam

1 Answer

msck repair table tablename

This tells Hive to go and check the table's location and add partitions for any new directories it finds there.

Add this step to your Spark process so the new partitions are registered in the metastore and the data becomes queryable from Hive.
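
For example, here is a sketch of one way to wire this into a structured streaming job using foreachBatch (Spark 2.4+): each micro-batch is written with the batch ORC writer and the repair runs right after it, so new partition directories are registered as soon as they appear. mydb.events, _table_loc and _table_checkpoint are placeholders for your own table name, table path and checkpoint location:

    import org.apache.spark.sql.{Dataset, SparkSession}

    def startWithRepair[T](event_stream: Dataset[T],
                           _table_loc: String,
                           _table_checkpoint: String)(implicit spark: SparkSession) =
      event_stream.writeStream
        .option("checkpointLocation", _table_checkpoint)
        .foreachBatch { (batch: Dataset[T], _: Long) =>
          // write this micro-batch as ORC files under the table location
          batch.write
            .mode("append")
            .partitionBy("year", "month", "day")
            .option("compression", "zlib")
            .orc(_table_loc)
          // register any partition directories the batch just created
          spark.sql("MSCK REPAIR TABLE mydb.events")
        }
        .start()

Note that foreachBatch replaces the streaming ORC sink, so the write happens inside the batch function. A simpler alternative is to keep your writer as-is and run spark.sql("MSCK REPAIR TABLE mydb.events") whenever new partition directories show up (e.g. once per day, since your partitions are date-based).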

answered Feb 02 '26 by loneStar