How to select top N rows in Hive? - Big Data In Real World



Simple problem with a simple solution.

Solution

Order the records with ORDER BY first, then apply the LIMIT clause to keep only the first N rows. For example, to get the 20 highest-paid employees:

SELECT * FROM employee ORDER BY salary DESC LIMIT 20;
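
If several employees share the same salary at the cutoff, which of them make it into the top N is not guaranteed to be the same on every run. One way to make the result repeatable is to add a tie-breaking column to the ORDER BY. This is a minimal sketch: the emp_id and name columns are assumptions for illustration, since only salary appears in the query above.

```sql
-- Assumed columns: emp_id and name are hypothetical; only salary is from the original query.
SELECT emp_id, name, salary
FROM employee
ORDER BY salary DESC, emp_id ASC  -- emp_id breaks ties between equal salaries
LIMIT 20;
```

Any column (or combination of columns) that is unique per row would work as the tie-breaker.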

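Keep in mind that ORDER BY produces a global ordering. To guarantee it, Hive sends all rows through a single reducer for the final sort, which makes it an expensive operation on large tables.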
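
To see how Hive plans the sort before running it on a large table, you can prefix the query with EXPLAIN. This is only a sketch; the plan output differs by Hive version and execution engine, so no sample output is shown here.

```sql
-- Prints the execution plan without running the query.
EXPLAIN
SELECT * FROM employee ORDER BY salary DESC LIMIT 20;
```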

Check out this post on the differences between ORDER BY and SORT BY in Hive.

Big Data In Real World
We are a group of Big Data engineers who are passionate about Big Data and related technologies. We have designed, developed, deployed and maintained Big Data applications ranging from batch to real-time streaming platforms. We have seen a wide range of real-world big data problems and implemented some innovative and complex (or simple, depending on how you look at it) solutions.
