I am learning the Hadoop MapReduce framework, and I am struggling to understand why we can't use Java primitive data types in MapReduce.
Java serialization writes class metadata into the stream along with the object's data: a class descriptor that includes the class name and its serialVersionUID (a hash-like version identifier). This is why you do not need to specify the class name in order to read the object back. It also adds overhead, both in size and in read time, because the reader cannot assume in advance which class each object belongs to.
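In Hadoop serialization, keys and values implement the Writable interface (for example, IntWritable or Text instead of int or String), and we specify the class when retrieving the data. There is therefore no need for a prefix, since the framework already knows what it is retrieving; this is why the job is configured with an InputFormat and its key/value classes. The resulting format is more compact and faster to read and write, which also improves performance during RPCs.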
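The size difference can be seen with a small sketch that uses only the JDK, no Hadoop dependency. The class name `SerializationSizeDemo` and its methods are hypothetical names chosen for illustration. The "compact" path writes only the field data through a `DataOutputStream`, which is similar in spirit to what a Writable such as IntWritable does, but it is not Hadoop's actual code.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;

// Hypothetical demo class: compares standard Java serialization with a
// Writable-style encoding that stores only the value itself.
class SerializationSizeDemo {

    // Standard Java serialization: the stream carries a header and a class
    // descriptor in addition to the 4 bytes of the int value.
    static byte[] javaSerialize(int value) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(Integer.valueOf(value));
        }
        return buffer.toByteArray();
    }

    // Writable-style encoding (an assumption modeled on the idea, not on
    // Hadoop's source): only the field data is written, with no class information.
    static byte[] compactSerialize(int value) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeInt(value);
        }
        return buffer.toByteArray();
    }

    // The reader must already know it is reading an int, because the bytes
    // themselves do not say so.
    static int compactDeserialize(byte[] bytes) throws IOException {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            return in.readInt();
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] viaJava = javaSerialize(42);
        byte[] viaCompact = compactSerialize(42);
        System.out.println("Java serialization bytes: " + viaJava.length);
        System.out.println("Compact encoding bytes:   " + viaCompact.length);
        System.out.println("Round trip value:         " + compactDeserialize(viaCompact));
    }
}
```

The compact form is exactly the four bytes of the int, while the Java-serialized form is considerably larger because of the class metadata. The trade-off is that the compact bytes are meaningless unless the reader is told which type to expect, which is the role the declared key/value classes play in a Hadoop job.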