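Problem description:
In practice, the structure of a data table often needs to change, for example by adding a new field.
The statement below adds the column col1 successfully. However, if the table tb already has old partitions (for example, dt=20190101), col1 in those old partitions reads as NULL and cannot be updated; even running insert overwrite on such a partition has no effect.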
alter table tb add columns(col1 string);
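The problem can be reproduced with a sketch like the one below. Only tb, col1, and dt=20190101 come from the description above; the data column id, the STRING type of dt, the sample values, and the use of SELECT without FROM (Hive 0.13 and later) are assumptions for illustration. The exact behavior may also vary with file format and Hive version.

```sql
-- Hypothetical table: the column `id` and the type of `dt` are assumptions.
CREATE TABLE tb (id INT) PARTITIONED BY (dt STRING);

-- Create an old partition before the schema change.
INSERT OVERWRITE TABLE tb PARTITION (dt = '20190101') SELECT 1;

-- Add the column without CASCADE: only the table-level metadata changes.
ALTER TABLE tb ADD COLUMNS (col1 STRING);

-- Rewrite the old partition, this time supplying a value for col1.
INSERT OVERWRITE TABLE tb PARTITION (dt = '20190101') SELECT 1, 'x';

-- Per the description above, col1 is expected to read back as NULL here,
-- because the old partition's own metadata still lacks the column.
SELECT id, col1 FROM tb WHERE dt = '20190101';
```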
Solution:
The fix is simple: add the cascade keyword when adding col1. For example:
alter table tb add columns(col1 string) cascade;
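An easy way to remember this: cascade means the change is propagated downward. It changes not only the table structure (metadata) used by new partitions, but also the table structure recorded for the old partitions.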
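If col1 was already added without cascade, repeating the same ADD COLUMNS statement with cascade is likely to be rejected, because col1 already exists in the table-level metadata. A minimal sketch of one possible way to check and repair a single old partition follows. It assumes that dt is a string partition column and relies on the PARTITION clause from the syntax in the appendix (Hive 0.14.0 and later); it is not the only possible approach.

```sql
-- Compare the table-level schema with the schema recorded for the old partition.
DESCRIBE tb;
DESCRIBE tb PARTITION (dt = '20190101');

-- If col1 is missing from the partition's column list, add it at the
-- partition level; the table-level metadata is left unchanged.
ALTER TABLE tb PARTITION (dt = '20190101') ADD COLUMNS (col1 STRING);
```

After the partition metadata includes col1, the partition can be rewritten with insert overwrite so that the column is actually populated.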
Appendix: official documentation
ADD COLUMNS lets you add new columns to the end of the existing columns but before the partition columns. This is supported for Avro backed tables as well, for Hive 0.14 and later.
REPLACE COLUMNS removes all existing columns and adds the new set of columns. This can be done only for tables with a native SerDe (DynamicSerDe, MetadataTypedColumnsetSerDe, LazySimpleSerDe and ColumnarSerDe). Refer to Hive SerDe for more information. REPLACE COLUMNS can also be used to drop columns. For example, "ALTER TABLE test_change REPLACE COLUMNS (a int, b int);" will remove column 'c' from test_change's schema.
The PARTITION clause is available in Hive 0.14.0 and later; see Upgrading Pre-Hive 0.13.0 Decimal Columns for usage.
The CASCADE|RESTRICT clause is available in Hive 1.1.0. ALTER TABLE ADD|REPLACE COLUMNS with CASCADE command changes the columns of a table's metadata, and cascades the same change to all the partition metadata. RESTRICT is the default, limiting column changes only to table metadata.
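To illustrate the difference between the two keywords, the sketch below applies them to the test_change example quoted above. It assumes that test_change is a partitioned table with columns a, b, and c, and that it uses a native SerDe, as REPLACE COLUMNS requires. The two statements are alternatives, not steps to run in sequence.

```sql
-- RESTRICT (the default): column c is removed from the table-level metadata only;
-- the metadata of existing partitions is left as it was.
ALTER TABLE test_change REPLACE COLUMNS (a int, b int) RESTRICT;

-- CASCADE: the same change is also applied to the metadata of every partition.
ALTER TABLE test_change REPLACE COLUMNS (a int, b int) CASCADE;
```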
Add/Replace Columns
```
ALTER TABLE table_name
  [PARTITION partition_spec]                 -- (Note: Hive 0.14.0 and later)
  ADD|REPLACE COLUMNS (col_name data_type [COMMENT col_comment], ...)
  [CASCADE|RESTRICT]                         -- (Note: Hive 1.1.0 and later)
```