This is a very useful trick when you have a big Hive script as part of your production jobs and you want to check the consistency, integrity, or quality of the data before moving on to the next series of steps.
assert_true
assert_true is a conditional function: when the expression evaluates to false, it throws an exception stating that the assertion failed.
select assert_true(1<0);
Error: java.io.IOException: org.apache.hadoop.hive.ql.metadata.HiveException: ASSERT_TRUE(): assertion failed. (state=,code=0)
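For contrast, here is a minimal sketch of the passing case. It relies on the documented behavior that assert_true returns NULL when the condition holds, so the query completes without an error.

```sql
-- The condition is true, so no exception is thrown.
-- The query returns a single row containing NULL.
select assert_true(1>0);
```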
Usage
Let’s say your Hive script just loaded the sales table. You want to make sure the sales table is properly loaded before proceeding.
--Instructions to load sales table here
select assert_true(count(*)>1000) from sales;
--Don’t execute the scripts below unless sales has more than 1000 records
If the sales table has 1000 or fewer records, Hive will throw an exception stating that the assertion failed.
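The same pattern can cover other data quality checks besides a row count. The sketch below is only an illustration: the column names sale_id and amount are hypothetical and do not come from the sales table described above, so adjust them to your own schema. It also assumes the script is run with the client's default stop-on-error behavior (for example, hive -f without hive.cli.errors.ignore enabled), so that a failed assertion prevents the later statements from running.

```sql
-- Hypothetical columns: sale_id and amount are assumed for illustration only.

-- Fail if any row is missing its key.
select assert_true(count(*) = 0) from sales where sale_id is null;

-- Fail if sale_id is not unique across the table.
select assert_true(count(*) = count(distinct sale_id)) from sales;

-- Fail if any row has a negative amount.
select assert_true(count(*) = 0) from sales where amount < 0;

-- Downstream steps go here; they are reached only if every check above passed.
```

Each check is written as a count compared against an expected value, so the query always produces exactly one row for assert_true to evaluate, even when no rows match the filter.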