Change field termination value in Hive
January 23, 2017How to find directories in HDFS which are older than N days?
January 30, 2017How to use multi character delimiter in a Hive table?
Sometimes your data is slightly complex to delimit the individual columns with a single character like delimiter comma, pipe symbol etc.
Problem
What if you have multi character delimiter like below ?
In the below sample record the delimiter is @#
NYSE@#B7J@#2009-12-16@#8.80@#8.84@#8.77@#8.83@#69700@#8.83
Solution
Well if you are using Hive version 0.14.0 then you are in luck. You could use MultiDelimitSerDe and specify the multi character delimiter for field.delim property.
hive> CREATE EXTERNAL TABLE IF NOT EXISTS stocks_multi ( exch STRING, symbol STRING, ymd STRING, price_open FLOAT, price_high FLOAT, price_low FLOAT, price_close FLOAT, volume INT, price_adj_close FLOAT) ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe' WITH SERDEPROPERTIES ("field.delim"="@#") LOCATION '/user/hirw/stocks-multi';