
AWS Glue Crawler: want separate table for folder in s3

My S3 file structure is:

├── bucket
│   ├── customer_1
│   │   ├── year=2016
│   │   ├── year=2017
│   │   │   ├── month=11
│   │   │   │   ├── sometype-2017-11-01.parquet
│   │   │   │   ├── sometype-2017-11-02.parquet
│   │   │   │   ├── ...
│   │   │   ├── month=12
│   │   │   │   ├── sometype-2017-12-01.parquet
│   │   │   │   ├── sometype-2017-12-02.parquet
│   │   │   │   ├── ...
│   │   ├── year=2018
│   │   │   ├── month=01
│   │   │   │   ├── sometype-2018-01-01.parquet
│   │   │   │   ├── sometype-2018-01-02.parquet
│   │   │   │   ├── ...
│   ├── customer_2
│   │   ├── year=2017
│   │   │   ├── month=11
│   │   │   │   ├── moretype-2017-11-01.parquet
│   │   │   │   ├── moretype-2017-11-02.parquet
│   │   │   │   ├── ...
│   │   ├── year=...

I want to create a separate table for customer_1 and for customer_2 with an AWS Glue crawler. It works if I specify the paths s3://bucket/customer_1 and s3://bucket/customer_2 explicitly.

I've tried s3://bucket/customer_* and s3://bucket/*, but neither works: no table is created in the Glue catalog.

Jay asked Oct 19 '25 21:10



2 Answers

I faced this issue myself recently. AWS Glue crawlers have an option called "Grouping behaviour for S3 data". If the checkbox is not selected, the crawler will try to combine schemas. By selecting the checkbox and setting a table level, you can ensure that multiple, separate tables are created.

The table level should be the depth, counted from the root of the bucket, at which you want separate tables.

In your case the depth would be 2: the bucket itself counts as level 1, so customer_1 and customer_2 sit at level 2.
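
If you create the crawler from code rather than the console, the same table level can be passed in the crawler's `Configuration` JSON. The following is a minimal sketch using boto3, not a tested deployment: the crawler name, role ARN and database name are placeholders you would supply, and `s3://bucket/` stands in for the bucket from the question.

```python
import json


def build_crawler_configuration(table_level: int) -> str:
    """Return the crawler Configuration JSON that sets the table level."""
    if table_level < 1:
        raise ValueError("table level must be a positive integer")
    config = {
        "Version": 1.0,
        "Grouping": {"TableLevelConfiguration": table_level},
    }
    return json.dumps(config)


def create_customer_crawler(name: str, role_arn: str, database: str) -> None:
    """Create a crawler rooted at the bucket (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the helper above works without boto3

    glue = boto3.client("glue")
    glue.create_crawler(
        Name=name,          # placeholder: your crawler name
        Role=role_arn,      # placeholder: an IAM role Glue can assume
        DatabaseName=database,
        Targets={"S3Targets": [{"Path": "s3://bucket/"}]},
        # Level 2 = one table per folder directly under the bucket.
        Configuration=build_crawler_configuration(table_level=2),
    )
```

`build_crawler_configuration(2)` only builds the JSON string, so you can inspect it before calling `create_customer_crawler`, which actually talks to AWS.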


Sandeep Singh answered Oct 22 '25 10:10



When pointed at the parent folder, Glue's natural tendency is to add similar schemas to the same table if they match by more than 70% (assuming that, in your case, customer_1 and customer_2 have the same schema). Keeping them in individual folders might then just create partitions of that one table, named after the folders.
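
To make the difference concrete, here is a small illustrative sketch in plain Python. It is not Glue's actual algorithm; `split_key` is a hypothetical helper that only shows how the same object key divides into a table location and partitions depending on the level at which the table is rooted. The `partition_N` name for folders without a `key=value` form is an assumption modelled on the names Glue typically generates.

```python
def split_key(bucket, key, table_level=2):
    """Split an object key into (table location, partitions) for a table level."""
    folders = key.split("/")[:-1]  # drop the file name
    depth = table_level - 1        # the bucket itself counts as level 1
    table_location = "/".join(["s3://" + bucket] + folders[:depth])
    partitions = {}
    for index, folder in enumerate(folders[depth:]):
        if "=" in folder:
            name, value = folder.split("=", 1)
        else:
            # Assumption: unnamed folders get a positional partition name.
            name, value = f"partition_{index}", folder
        partitions[name] = value
    return table_location, partitions


key = "customer_1/year=2017/month=11/sometype-2017-11-01.parquet"
print(split_key("bucket", key, table_level=1))  # table at the bucket root
print(split_key("bucket", key, table_level=2))  # table per customer folder
```

With the table rooted at the bucket, the customer folder ends up as just another partition value; rooted one level deeper, each customer folder becomes its own table location.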

Kishore Bharathy answered Oct 22 '25 10:10
