How to move a Hive table from one database to another?
January 5, 2022How to see the current database user is on with Hive cli or Beeline?
January 19, 2022The difference between static and dynamic partitioning only exists when the partition is being created based on how the partitions are added to the table. Once the partitions are created, the tables won’t have any difference like static and dynamic partitions. All partitions are treated and one and the same.
Static partitioning
Partitioning is referred to as static when you are using LOAD command to create the partition. With the below 2 partitions are created in sales table one with country=”US” and one with country=”UK”. Data under /hdfs/path/US has the data from partition country=”US” and similarly data under /hdfs/path/UK has the data from partition country=”UK”
hive>LOAD DATA INPATH '/hdfs/path/US' INTO TABLE sales PARTITION(country="US") hive>LOAD DATA INPATH '/hdfs/path/UK' INTO TABLE sales PARTITION(country="UK")
Dynamic partitioning
With the below INSERT with a SELECT query we are creating multiple partitions in table sales dynamically. If sales_staging has records from 10 countries then 10 partitions are created in sales tables.
hive> INSERT INTO TABLE sales PARTITION(country) SELECT * from sales_staging;
hive.exec.dynamic.partition control whether to allow dynamic partition or not. The default value is false prior to Hive 0.9.0 and true in Hive 0.9.0 and later.
When to use static partitioning and when to use dynamic partitioning?
- Use static partitioning when data is already physically categorized/grouped/partitioned and ready to be added as a partition to a table.
- Static partitioning will not result in MapReduce job execution since the data is already physically categorized/partitioned.
- Use dynamic partitioning when data is not already physically categorized/grouped/partitioned.
- Dynamic partitioning will result in MapReduce job execution to group the data first and then partitions will be added to the table.