
Spark: What is the Use of Creating New Spark Sessions?

Tags:

apache-spark

We can create a new Spark session by calling spark.newSession in spark-shell. What is the use of these new Spark session instances?

asked by user1888243

1 Answer

The two most common use cases are the following (a standalone sketch combining both follows the two transcripts):

  • Keeping sessions with minor differences in configuration.

    Welcome to
          ____              __
         / __/__  ___ _____/ /__
        _\ \/ _ \/ _ `/ __/  '_/
       /___/ .__/\_,_/_/ /_/\_\   version 2.2.0
          /_/
    
    Using Scala version 2.11.8 (OpenJDK 64-Bit Server VM, Java 1.8.0_141)
    Type in expressions to have them evaluated.
    Type :help for more information.
    
    scala> spark.range(100).groupBy("id").count.rdd.getNumPartitions
    res0: Int = 200
    
    scala> 
    
    scala> val newSpark = spark.newSession
    newSpark: org.apache.spark.sql.SparkSession = org.apache.spark.sql.SparkSession@618a9cb7
    
    scala> newSpark.conf.set("spark.sql.shuffle.partitions", 99)
    
    scala> newSpark.range(100).groupBy("id").count.rdd.getNumPartitions
    res2: Int = 99
    
    scala> spark.range(100).groupBy("id").count.rdd.getNumPartitions  // No effect on initial session
    res3: Int = 200
    
  • Separating temporary namespaces:

    Welcome to
          ____              __
         / __/__  ___ _____/ /__
        _\ \/ _ \/ _ `/ __/  '_/
       /___/ .__/\_,_/_/ /_/\_\   version 2.2.0
          /_/
    
    Using Scala version 2.11.8 (OpenJDK 64-Bit Server VM, Java 1.8.0_141)
    Type in expressions to have them evaluated.
    Type :help for more information.
    
    scala> spark.range(1).createTempView("foo")
    
    scala> 
    
    scala> spark.catalog.tableExists("foo")
    res1: Boolean = true
    
    scala> 
    
    scala> val newSpark = spark.newSession
    newSpark: org.apache.spark.sql.SparkSession = org.apache.spark.sql.SparkSession@73418044
    
    scala> newSpark.catalog.tableExists("foo")
    res2: Boolean = false
    
    scala> newSpark.range(100).createTempView("foo")  // No exception
    
    scala> spark.table("foo").count // No effect on initial session
    res4: Long = 1     
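
For completeness, here is a minimal standalone sketch of the same two points outside spark-shell (not from the original answer; the object name, app name, and local master are just illustrative assumptions): spark.newSession shares the underlying SparkContext, but keeps its own SQL configuration and its own temporary views.

    import org.apache.spark.sql.SparkSession

    object NewSessionSketch {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .master("local[*]")           // local master, for the sketch only
          .appName("new-session-demo")  // hypothetical app name
          .getOrCreate()

        val newSpark = spark.newSession()

        // Both sessions share one SparkContext (and therefore cached data).
        assert(spark.sparkContext eq newSpark.sparkContext)

        // SQL configuration is per-session: this does not touch `spark`.
        newSpark.conf.set("spark.sql.shuffle.partitions", "99")
        println(spark.conf.get("spark.sql.shuffle.partitions"))    // default (typically 200)
        println(newSpark.conf.get("spark.sql.shuffle.partitions")) // 99

        // Temporary views are per-session as well.
        spark.range(1).createTempView("foo")
        println(spark.catalog.tableExists("foo"))    // true
        println(newSpark.catalog.tableExists("foo")) // false

        spark.stop()
      }
    }

Because the SparkContext is shared, cluster resources and cached data are common to every session; only SQL state (configuration, temporary views, registered functions) is isolated per session, which is what both transcripts above demonstrate.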
    
answered by Alper t. Turker