August 4, 2021
It is a very common use case to process data in Spark and save the processed Spark dataframe directly into a Hive table.
There are a couple of ways to achieve this.
Solution 1
Create a HiveContext
import org.apache.spark.sql.hive.HiveContext;
// sc is a JavaSparkContext; sc.sc() returns the underlying SparkContext
HiveContext sqlContext = new HiveContext(sc.sc());
df is the result dataframe you want to write to Hive. The write targets the table sales under the database sample_db. Because it uses SaveMode Overwrite, any existing contents of the table are replaced.
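The write call itself is not shown here, so the following is only a minimal sketch of what it could look like. It assumes Spark 2.x or later, where a Java dataframe is a Dataset<Row> created from a Hive-enabled context. The class name HiveTableWriter and the method name overwriteSales are hypothetical and used for illustration only.

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SaveMode;

// Hypothetical helper class, for illustration only.
public class HiveTableWriter {

    // Writes df to the Hive table sample_db.sales.
    // SaveMode.Overwrite replaces whatever the table currently holds.
    public static void overwriteSales(Dataset<Row> df) {
        df.write()
          .mode(SaveMode.Overwrite)
          .saveAsTable("sample_db.sales");
    }
}
```

Calling HiveTableWriter.overwriteSales(df) would then run the write. If the existing rows should be kept, SaveMode.Append is the alternative mode to consider.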
Solution 2
Register the dataframe df as a temporary view named temp_table in Spark.
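The registration call is not shown, so here is a minimal sketch of one way to do it, assuming Spark 2.x or later (on Spark 1.x the equivalent call is df.registerTempTable("temp_table")). The class name TempViewRegistrar and the method name registerTempTableView are hypothetical.

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;

// Hypothetical helper class, for illustration only.
public class TempViewRegistrar {

    // Makes df queryable from SQL under the name temp_table.
    // An existing temporary view with the same name is replaced.
    public static void registerTempTableView(Dataset<Row> df) {
        df.createOrReplaceTempView("temp_table");
    }
}
```

A temporary view is scoped to the session that created the dataframe, so df should come from the same sqlContext that later runs the SQL statement.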
The statement below creates a table named sales in sample_db by selecting the contents of temp_table. The new table will have the same structure as temp_table.
sqlContext.sql("create table sample_db.sales as select * from temp_table");