Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Athena query works in console but not with boto3 client in sagemaker (convert csv into table)

I am trying to convert a csv file from s3 into a table in Athena. When I run the query on Athena console it works but when I run it on Sagemaker Jupyter notebook with boto3 client it returns:

"**InvalidRequestException**: An error occurred (InvalidRequestException) when calling the StartQueryExecution operation: line 1:8: no viable alternative at input 'CREATE EXTERNAL'"

Here is my code

def run_query(query):
    client = boto3.client('athena')
    response = client.start_query_execution(
        QueryString=query,
        ResultConfiguration={
            'OutputLocation': 's3://path/to/s3output',
            }
        )
    print('Execution ID: ' + response['QueryExecutionId'])
    return response

createTable = \
"""CREATE EXTERNAL TABLE TestTable (
    ID string,
    CustomerId string, 
    Ip string,
    MessageFilename string

)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
WITH SERDEPROPERTIES (
   'separatorChar' = ',',
   'quoteChar' = '\"',
   'escapeChar' = '\\'
 )
STORED AS TEXTFILE
LOCATION 's3://bucket_name/results/csv/'
TBLPROPERTIES ("skip.header.line.count"="1")"""

response = run_query(createTable, s3_output)
print(response)

I have run queries through boto3 client in json format (so, using ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe') which have worked well but somehow this doesn't. I have tried changing names, syntax, quotes but that doesn't seem to work.

Any suggestion would be very appreciated, Thank you!

like image 911
user3462191 Avatar asked Nov 25 '25 11:11

user3462191


1 Answers

Thanks for sharing complete example. The issue is with the escaping in SERDEPROPERTIES. After modifying createTable as below it works

createTable = \
"""CREATE EXTERNAL TABLE testtable (
    `id` string,
    `customerid` string, 
    `ip` string,
    `messagefilename` string
)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
WITH SERDEPROPERTIES (
  'separatorChar' = ',', 
  'quoteChar' = '\\\"', 
  'escapeChar' = '\\\\' )
STORED AS TEXTFILE
LOCATION 's3://bucket_name/results/csv/'
TBLPROPERTIES ("skip.header.line.count"="1");"""
like image 173
raj Avatar answered Nov 27 '25 01:11

raj



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!