Hive architecture

Published by Big Data In Real World at October 22, 2021

HiveServer2

HiveServer2 is an improved implementation of HiveServer1, introduced in Hive 0.11. It is responsible for the following functions:

Thrift service to support concurrent client connections and sessions
Support for common ODBC and JDBC drivers
Authentication support via Kerberos, LDAP and other pluggable implementations
Authorization
Query optimization and execution

HiveServer2 is a container for the Hive execution engine. For each client connection, it creates a new execution context that serves Hive SQL requests from the client.

Compiler and Execution Engine

When a client executes a Hive query, the query is sent to the compiler. Hive optimizes the query, creates a query plan and an execution plan, and finally executes that plan against the data in HDFS.
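One way to see the plan that the compiler produces is Hive's EXPLAIN statement, which returns the plan instead of running the query. The sketch below only builds the statement text; the helper name explain_statement and the sales table are hypothetical, and the resulting string would still need to be sent to Hive through a client such as Beeline.

```python
def explain_statement(query: str, extended: bool = False) -> str:
    """Wrap a HiveQL query in EXPLAIN so Hive returns its plan instead of running it."""
    # Drop surrounding whitespace and any trailing semicolon so one statement remains.
    body = query.strip().rstrip(";").strip()
    if not body:
        raise ValueError("query must not be empty")
    # EXPLAIN EXTENDED asks Hive for additional detail about the plan.
    keyword = "EXPLAIN EXTENDED" if extended else "EXPLAIN"
    return f"{keyword} {body}"


# 'sales' is a hypothetical table name used only for illustration.
print(explain_statement("SELECT COUNT(*) FROM sales;"))
```

Running the generated statement against a real table shows the stages Hive intends to execute, which is a convenient way to check what the optimizer did with a query.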

Metastore database

The metastore database is not part of HiveServer2 (and it is not shown in the picture). Every Hive installation needs an RDBMS to host it, such as Derby (suitable for development environments only), Oracle or MySQL.

Hive stores the metadata of the tables and databases that it manages in the metastore database. Note that this database does not hold the actual data; the data resides in HDFS.
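The split between metadata and data can be illustrated with a toy model. The sketch below uses an in-memory SQLite database as a stand-in for the metastore RDBMS. The two-table schema, the namenode address and the sales table are simplified assumptions for illustration and do not reproduce the real metastore schema.

```python
import sqlite3

# In-memory SQLite stands in for the metastore RDBMS (Derby, MySQL, Oracle).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dbs (db_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE tbls (
        tbl_id   INTEGER PRIMARY KEY,
        db_id    INTEGER REFERENCES dbs(db_id),
        tbl_name TEXT,
        location TEXT   -- where the data files live, not the data itself
    );
""")

# Register one hypothetical table; only its description is stored here.
conn.execute("INSERT INTO dbs VALUES (1, 'default')")
conn.execute(
    "INSERT INTO tbls VALUES (1, 1, 'sales', "
    "'hdfs://namenode:8020/user/hive/warehouse/sales')"
)


def table_location(db_name, tbl_name):
    """Return the storage location recorded for a table, or None if unknown."""
    row = conn.execute(
        "SELECT t.location FROM tbls t JOIN dbs d ON d.db_id = t.db_id "
        "WHERE d.name = ? AND t.tbl_name = ?",
        (db_name, tbl_name),
    ).fetchone()
    return row[0] if row else None


print(table_location("default", "sales"))
```

The lookup returns only a path. The rows of the table itself would be read from the files under that HDFS location.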

Metastore

In an embedded or local setup, the Metastore service runs inside HiveServer2 and communicates with the configured metastore database to look up the metadata of the tables and databases that Hive manages. It can also be deployed as a separate, remote service.
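The metastore database connection is normally configured in hive-site.xml through the javax.jdo.option.* properties. As a minimal sketch, the snippet below reads those properties from a sample configuration; the host name, database name, user name and the helper metastore_settings are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

# Sample hive-site.xml content; host, database and user are hypothetical values.
SAMPLE_HIVE_SITE = """<configuration>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://metastore-db.example.com:3306/hive_metastore</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>hive</value>
  </property>
</configuration>"""


def metastore_settings(xml_text):
    """Collect the javax.jdo.option.* properties from hive-site.xml text."""
    root = ET.fromstring(xml_text)
    settings = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        if name and name.startswith("javax.jdo.option."):
            settings[name] = prop.findtext("value")
    return settings


print(metastore_settings(SAMPLE_HIVE_SITE))
```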

Hive clients

Hive CLI has been deprecated in favor of Beeline. Beeline connects to HiveServer2 and acts as a client through which users run queries and see results.

Hive also supports other clients that connect to HiveServer2 over ODBC and JDBC.
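Both Beeline and other JDBC clients reach HiveServer2 through a jdbc:hive2:// URL. The sketch below assembles such a URL and a Beeline command line using the -u (URL), -n (user name) and -e (query) options. The host, user and helper function names are hypothetical, and port 10000 is assumed as the HiveServer2 default.

```python
import shlex


def hs2_jdbc_url(host, port=10000, database="default"):
    """Build a JDBC URL for HiveServer2."""
    return f"jdbc:hive2://{host}:{port}/{database}"


def beeline_command(host, user, query, port=10000, database="default"):
    """Return the argument list for running a single query through Beeline."""
    return [
        "beeline",
        "-u", hs2_jdbc_url(host, port, database),  # connection URL
        "-n", user,                                # user name
        "-e", query,                               # query to execute
    ]


# Hypothetical host and user; the command is printed here, not executed.
cmd = beeline_command("hs2.example.com", "analyst", "SHOW TABLES")
print(shlex.join(cmd))
```

Keeping the command as a list makes it safe to hand to subprocess.run on a machine where Beeline is installed, without worrying about shell quoting.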

Big Data In Real World

We are a group of Big Data engineers who are passionate about Big Data and related technologies. We have designed, developed, deployed and maintained Big Data applications ranging from batch to real-time streaming platforms. We have seen a wide range of real-world big data problems and implemented some innovative and complex (or simple, depending on how you look at it) solutions.
