Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Postgres JAR with EMR and Jupyter Notebooks

I am trying to launch an EMR cluster that includes the Postgres driver JAR file so I can load data from Postgres and analyze it with PySpark. I have the the JAR I would like to include stored in S3. I have tried the following things:

1 - entering the following configuration:

[
  {
    "Classification": "presto-connector-postgresql",
    "Properties": {
      "connection-url": "jdbc:postgresql://example.net:5432/database",
      "connection-user": "MYUSER",
      "connection-password": "MYPASS"
    },
    "Configurations": []
  }
]

2 - Adding the JAR as a custom step (selecting the JAR from S3)

3 - Adding the JAR as a custom bootstrap action (selecting the JAR from S3)

None of these work, I can't figure out how to use the connector in Step 1 within Jupyter, and the custom step/bootstrap action both fail when I launch the cluster. How can I launch an EMR cluster with the Postgres drivers installed so I can query my data in Jupyter?

EDIT:

I used the following bootstrap script to copy the JAR to my master/worker nodes:

#!/bin/bash
aws s3 cp s3://BUCKETNAME/postgresql-42.2.8.jar /mnt1/myfolder

But still get the following error:

An error was encountered:
An error occurred while calling o90.load.
: java.lang.ClassNotFoundException: org.postgresql.Driver

With the following code:

df = spark.read \
    .format("jdbc") \
    .option("url", "jdbcURL") \
    .option("user", "user") \
    .option("password", "password") \
    .option("driver", "org.postgresql.Driver") \
    .option("query", "select * from slm_files limit 100") \
    .load()

df.count()
like image 366
DBA108642 Avatar asked Oct 20 '25 14:10

DBA108642


1 Answers

Using this code in the first cell of my Jupyter notebook solved it for me:

%%configure -f
{ "conf":{
          "spark.jars": "s3://JAR-LOCATION/postgresql-42.2.8.jar"
         }
}

like image 154
DBA108642 Avatar answered Oct 23 '25 03:10

DBA108642



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!