
Redshift Not Connecting to Host via Python Script

I currently have a .csv file in an S3 bucket that I'd like to append to a table in a Redshift database using a Python script. I have a separate file parser and upload to S3 that work just fine.

The code I use to connect to the cluster and run the COPY is below. I get the following error message:

OperationalError: (psycopg2.OperationalError) could not connect to server: Connection timed out (0x0000274C/10060) Is the server running on host "redshift_cluster_name.unique_here.region.redshift.amazonaws.com" (18.221.51.45) and accepting TCP/IP connections on port 5439?

I can confirm the following:

  • Port is 5439

  • Not encrypted

  • Cluster name/DB name/username/password are all correct

  • Publicly accessible set to "Yes"

What should I be fixing to make sure I can connect my file in S3 to Redshift? Thank you all for any help you can provide.

I have also looked around on Stack Overflow and Server Fault, but the existing questions seem to cover MySQL-to-Redshift instead, and the solutions I tried (like the Server Fault CIDR fix) did not work.


import sqlalchemy as sa
from sqlalchemy.orm import sessionmaker

DATABASE = "db"
USER = "user"
PASSWORD = "password"
HOST = "redshift_cluster_name.unique_here.region.redshift.amazonaws.com"
PORT = "5439"
SCHEMA = "public"
S3_FULL_PATH = 's3://bucket/file.csv'
ARN_CREDENTIALS = 'arn:aws:iam::aws_id:role/myRedshiftRole'
REGION = 'region'

############ CONNECTING AND CREATING SESSIONS ############
connection_string = f"redshift+psycopg2://{USER}:{PASSWORD}@{HOST}:{PORT}/{DATABASE}"
engine = sa.create_engine(connection_string)
session = sessionmaker()
session.configure(bind=engine)
s = session()
SetPath = f"SET search_path TO {SCHEMA}"
s.execute(SetPath)
###########################################################

############ RUNNING COPY ############
# Note: the f prefix must be attached to the string literal, and
# ARN_CREDENTIALS must be defined (not commented out) for this to run.
copy_command = f'''
copy category from '{S3_FULL_PATH}'
credentials 'aws_iam_role={ARN_CREDENTIALS}'
delimiter ',' region '{REGION}';
'''
s.execute(copy_command)
s.commit()
######################################

############ CLOSE SESSION ############
s.close()
#######################################
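As a side note, one way to keep the COPY statement manageable is to assemble it in a small helper function. This is only a sketch; the table name, S3 path, IAM role ARN, and region below are the same placeholder values used in the question:

```python
def build_copy_command(table, s3_path, iam_role, region):
    """Assemble a Redshift COPY statement for a CSV file in S3.

    All arguments are placeholders -- substitute your own table name,
    S3 path, IAM role ARN, and region.
    """
    return (
        f"copy {table} from '{s3_path}' "
        f"credentials 'aws_iam_role={iam_role}' "
        f"delimiter ',' region '{region}';"
    )

# Example with the placeholder values from the question:
sql = build_copy_command(
    "category",
    "s3://bucket/file.csv",
    "arn:aws:iam::aws_id:role/myRedshiftRole",
    "region",
)
```

Building the string in one place makes it easier to log the exact SQL being sent when debugging.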
TJ15 asked Jan 25 '26 03:01


1 Answer

Connecting via a Python program requires the same network connectivity as connecting from a SQL client.

I created a new cluster so I could document the process for you.

Here are the steps I took:

  • Created a VPC with a CIDR of 10.0.0.0/16. I didn't strictly need a new VPC, but it avoids any interference from prior configurations.
  • Created a Subnet in the VPC with a CIDR of 10.0.0.0/24.
  • Created an Internet Gateway and attached it to the VPC.
  • Edited the default Route Table to send 0.0.0.0/0 traffic to the Internet Gateway. (I'm only creating a public subnet, so I don't need a route table for a private subnet.)
  • Created a Redshift Cluster Subnet Group containing the single subnet I created.
  • Launched a 1-node Redshift cluster into the Cluster Subnet Group, with Publicly accessible = Yes and the default Security Group.
  • Went back to the VPC console and edited the default Security Group, adding an Inbound rule for Redshift from Anywhere.
  • Waited for the cluster to become ready.
  • Used DbVisualizer to log in to the database. Success!

The above steps made a publicly-available Redshift cluster and I connected to it from my computer on the Internet.
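If the cluster is configured as above but a connection still times out, a quick way to tell a network problem apart from a credentials problem is a plain TCP reachability check. This is a sketch; the endpoint shown in the comment is the placeholder hostname from the question:

```python
import socket

def can_reach(host, port, timeout=5):
    """Return True if a TCP connection to host:port succeeds within timeout.

    A timeout here usually points to a security group, route table, or
    "publicly accessible" setting blocking traffic -- not a bad
    username or password, which would fail after the TCP connection.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Placeholder endpoint -- substitute your cluster's actual endpoint:
# can_reach("redshift_cluster_name.unique_here.region.redshift.amazonaws.com", 5439)
```

If this returns False from your machine, no amount of fixing the Python COPY code will help until the networking is corrected.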

John Rotenstein answered Jan 27 '26 15:01


