I'm trying to change the existing Hive external table delimiter from comma , to ctrl+A character by using Hive ALTER TABLE statement
ALTER TABLE table_name SET SERDEPROPERTIES ('field.delim' = '\u0001');
After the DDL I could see changes
show create table table_name
But when I select from Hive, the values are all NULL (underlying files in HDFS are changed to have ctrl+A delimiter).
I have repaired the table also by using msck.
Only way to see the data is dropping and re-creating the external table, can anyone please help me to understand the reason.
Table Syntax :
CREATE EXTERNAL TABLE `table_name`( col1, col2, col3) PARTITIONED BY ( `ing_year` int, `ing_month` int, `ing_day` int) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' LINES TERMINATED BY '\n' STORED AS INPUTFORMAT 'org.apache.hadoop.mapred.TextInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat' LOCATION 'hdfs://location/'
An ALTER TABLE command on a partitioned table changes the default settings for future partitions.
But it will not apply to existing partitions, unless that specific command supports the CASCADE option -- but that's not the case for SET SERDEPROPERTIES; compare with column management for instance
So you must ALTER each and every existing partition with this kind of command
ALTER TABLE table_name PARTITION (ing_year=2016,ing_month=8,ing_day=31)
SET SERDEPROPERTIES ('field.delim' = '\u0001');
So now it's time for you to run a SHOW PARTITIONS, apply a couple of RegEx on the output to generate the list of commands, run these commands, and be happy ever after.
Side note: I can tell you it was REALLY painful to rename a column before the CASCADE stuff was finally implemented...
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With