Can you please help me read a tar.gz file with an AWS Glue crawler? I have a tar.gz file in S3 that contains a couple of files with different schemas, and when I run a crawler, I don't see the schemas in the Data Catalog. Should I use a custom classifier? The AWS Glue FAQ says that gzip is supported through classifiers, but gzip is not listed in the classifier list in the Glue classifier section.
According to the official AWS docs on the Glue crawler built-in classifiers, gzip-compressed files should be supported transparently, with no custom classifier needed.
https://docs.aws.amazon.com/glue/latest/dg/add-classifier.html
A CSV file compressed with gzip is handled by the built-in classifiers.
However, I would suggest contacting AWS Support if it does not work as described for you.
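One thing that may be worth checking before opening a support case: as far as I understand the built-in classifier docs, the gzip support is for a single gzip-compressed file, and tar archives are not in the list of supported compressed formats. A tar.gz is a tar archive wrapped in gzip, so the crawler may decompress it and then find a tar stream it cannot classify. A possible workaround, which is my own assumption and not something stated in the linked page, is to unpack the archive and store each file as its own .gz in a separate folder, so that each schema can become its own table. A minimal sketch in Python (the function name `split_tar_gz` and the folder layout are hypothetical, chosen only for illustration):

```python
import gzip
import shutil
import tarfile
from pathlib import Path


def split_tar_gz(archive_path, out_dir):
    """Unpack a .tar.gz and re-compress every regular file as its own .gz.

    Each file is written to <out_dir>/<file stem>/<file name>.gz, so files
    with different schemas end up in different folders. This layout is an
    assumption for illustration, not a documented Glue requirement.
    Returns the list of paths that were written.
    """
    out_dir = Path(out_dir)
    written = []
    with tarfile.open(archive_path, mode="r:gz") as archive:
        for member in archive:
            if not member.isfile():
                continue  # skip directories, links, and other special entries
            # Keep only the file name; two members with the same name in
            # different tar directories would overwrite each other here.
            name = Path(member.name).name
            target = out_dir / Path(name).stem / (name + ".gz")
            target.parent.mkdir(parents=True, exist_ok=True)
            source = archive.extractfile(member)
            with source, gzip.open(target, "wb") as dest:
                shutil.copyfileobj(source, dest)  # stream, do not load into memory
            written.append(target)
    return written
```

After running something like `split_tar_gz("data.tar.gz", "out")`, the `out` folder could be uploaded to S3 and the crawler pointed at that prefix instead of at the original archive. Whether the crawler then creates one table per folder depends on your crawler settings, so treat this as a starting point to test, not a confirmed fix.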