How to get a count of the number of documents in an Elasticsearch Index?
February 1, 2021How to specify join hints with Spark 3.0?
February 5, 2021In Hadoop you will find Hadoop specific types for basic types. For eg. you will find Text for String. IntWritable instead of Integer. For all primitive types you will see a type which implements Writable interface.
Fast and compact
Serialization and Deserialization are at the heart of Hadoop implementation. Data is distributed in Hadoop and data is transferred over the network multiple times during shuffle mainly. Data is serialized when the data is transferred over the network and deserialized when data is processed on the other side.
Plain Java serialization is heavy and slow.
To overcome the inefficiencies with Java serialization Hadoop’s creator Doug Cutting implemented Writable objects that came up with IO classes to replace Java primitive types which can perform serialization that is fast and compact.