
Pass parameters to the jar when using SparkLauncher

I am trying to create an executable jar that uses SparkLauncher to run another jar containing a data transformation task (that second jar creates the Spark session).

I need to pass Java parameters (some Java arrays) to the jar that the launcher executes.

import org.apache.spark.launcher.SparkLauncher

object launcher {
  // How do I pass parameters to spark_job_with_spark_session.jar?
  @throws[Exception]
  def main(args: Array[String]): Unit = {
    val handle = new SparkLauncher()
      .setAppResource("spark_job_with_spark_session.jar")
      .setVerbose(true)
      .setMaster("local[*]")
      .setConf(SparkLauncher.DRIVER_MEMORY, "4g")
      .launch()
  }
}

How can I do that?

asked Oct 21 '25 by Oleg Yarin


2 Answers

"need to pass java parameters (some java arrays)"

It is equivalent to executing spark-submit, so you cannot pass Java objects directly. Use app args

addAppArgs(String... args)

to pass application arguments, and parse them in your app.
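
For example, an array can be flattened into a single delimited string on the launcher side and parsed back inside the launched jar. This is only a minimal sketch: the ids array, the SparkJob object, and the comma-separated encoding are illustrative assumptions; only the jar name comes from the question.

import org.apache.spark.launcher.SparkLauncher

object Launcher {
  def main(args: Array[String]): Unit = {
    // addAppArgs only accepts strings, so flatten the array into one delimited string
    val ids: Array[Int] = Array(1, 2, 3)

    val process = new SparkLauncher()
      .setAppResource("spark_job_with_spark_session.jar")
      .setVerbose(true)
      .setMaster("local[*]")
      .setConf(SparkLauncher.DRIVER_MEMORY, "4g")
      .addAppArgs(ids.mkString(","))   // becomes args(0) of the launched jar's main
      .launch()                        // launch() returns a java.lang.Process

    process.waitFor()
  }
}

// Hypothetical main object inside spark_job_with_spark_session.jar
object SparkJob {
  def main(args: Array[String]): Unit = {
    // Recover the array from the delimited string passed by the launcher
    val ids: Array[Int] =
      args.headOption.map(_.split(',').map(_.toInt)).getOrElse(Array.empty[Int])
    // ... create the SparkSession and run the transformation using `ids`
  }
}

Any encoding that round-trips through strings (JSON, a file path, etc.) works the same way; the only constraint is that the launcher and the launched jar agree on the format.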

answered Oct 23 '25 by Alper t. Turker


package com.meow.woof.meow_spark_launcher.app;

import com.meow.woof.meow_spark_launcher.common.TaskListener;
import org.apache.spark.launcher.SparkAppHandle;
import org.apache.spark.launcher.SparkLauncher;

/**
 *
 * @author hahattpro
 */
public class ExampleSparkLauncherApp {

    public static void main(String[] args) throws Exception {
        SparkAppHandle handle = new SparkLauncher()
                .setAppResource("/home/cpu11453/workplace/experiment/SparkPlayground/target/scala-2.11/SparkPlayground-assembly-0.1.jar")
                .setMainClass("me.thaithien.playground.ConvertToCsv")
                .setMaster("spark://cpu11453:7077")
                .setConf(SparkLauncher.DRIVER_MEMORY, "3G")
                // application arguments forwarded to ConvertToCsv's main(args)
                .addAppArgs("--input", "/data/download_hdfs/data1/2019_08_13/00/", "--output", "/data/download_hdfs/data1/2019_08_13/00_csv_output/")
                .startApplication(new TaskListener());

        handle.addListener(new SparkAppHandle.Listener() {
            @Override
            public void stateChanged(SparkAppHandle handle) {
                System.out.println("state changed: " + handle.getState());
            }

            @Override
            public void infoChanged(SparkAppHandle handle) {
                System.out.println("info changed: " + handle.getState());
            }
        });

        System.out.println(handle.getState().toString());

        while (!handle.getState().isFinal()) {
            // wait until the job finishes
            Thread.sleep(1000L);
        }
    }
}

The code above is a working example.
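
The TaskListener passed to startApplication above comes from the answerer's own package and is not shown here; it only needs to implement SparkAppHandle.Listener. A minimal sketch of what such a listener might look like (an assumption, written in Scala):

import org.apache.spark.launcher.SparkAppHandle

// Hypothetical stand-in for the TaskListener used above: any class that
// implements SparkAppHandle.Listener can be passed to startApplication(...)
class TaskListener extends SparkAppHandle.Listener {
  override def stateChanged(handle: SparkAppHandle): Unit =
    println(s"state changed: ${handle.getState}")

  override def infoChanged(handle: SparkAppHandle): Unit =
    println(s"info changed, app id: ${handle.getAppId}")  // getAppId may be null until the app registers
}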

answered Oct 23 '25 by Haha TTpro


