
# Dissecting MapReduce Program (Part 1)

From the previous post, we now have a very good idea of the phases involved in MapReduce and a solid conceptual understanding of what a Mapper, Reducer, Combiner, etc. are. Now to the fun part: let's go ahead and write a MapReduce program in Java to calculate the maximum closing price by stock symbol from a stocks dataset.

Before you go on reading this post, please note that it is from our free introductory course on Hadoop, the Hadoop Starter Kit. Click here to enroll in the Hadoop Starter Kit; it is 100% free. You will also get free access to our 3-node Hadoop cluster hosted on Amazon Web Services (AWS), also free!

Here is the game plan: we are going to write 3 programs, a Mapper, a Reducer and a Driver program. We know what a Mapper and a Reducer are, but what is a Driver program? Let's start with that. A driver program brings together all the information needed to submit a MapReduce job. We will see each piece in detail in this post.

```java
package com.hirw.maxcloseprice;

/**
 * MaxClosePrice.java
 * www.hadoopinrealworld.com
 * This is a driver program to calculate Max Close Price from stock dataset using MapReduce
 */

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.FloatWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

public class MaxClosePrice {

  public static void main(String[] args) throws Exception {
    if (args.length != 2) {
      System.err.println("Usage: MaxClosePrice <input path> <output path>");
      System.exit(-1);
    }

    // Define MapReduce job
    Job job = new Job();
    job.setJarByClass(MaxClosePrice.class);
    job.setJobName("MaxClosePrice");

    // Set input and output locations
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));

    // Set input and output formats
    job.setInputFormatClass(TextInputFormat.class);
    job.setOutputFormatClass(TextOutputFormat.class);

    // Set Mapper and Reducer classes
    job.setMapperClass(MaxClosePriceMapper.class);
    job.setReducerClass(MaxClosePriceReducer.class);

    // Combiner (optional)
    job.setCombinerClass(MaxClosePriceReducer.class);

    // Output types
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(FloatWritable.class);

    // Submit job
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
```

The Job object represents a MapReduce job. We instantiate a new object and give the job a name. When we run this job on a Hadoop cluster, we package the code into a jar, and Hadoop distributes that jar across all the nodes in the cluster. The setJarByClass method takes a class, which Hadoop uses to locate the jar file. The next few lines set the input path where the input dataset can be found and the output path where the reducers should write their output. You also have to specify the format of your input and output datasets.

Let's talk about InputFormat first. InputFormat is responsible for three main tasks:

First, it validates the inputs, meaning it makes sure the dataset actually exists at the location you specified.

Next, it splits the input file(s) into logical InputSplits, each of which is then assigned to an individual Mapper.

Finally, and this is important, it provides a RecordReader implementation to extract input records from the logical InputSplit for processing by the Mapper.
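
Here is what that contract looks like in code. This is a paraphrased sketch of the abstract org.apache.hadoop.mapreduce.InputFormat class; concrete formats such as TextInputFormat implement these two methods.

```java
import java.io.IOException;
import java.util.List;
import org.apache.hadoop.mapreduce.InputSplit;
import org.apache.hadoop.mapreduce.JobContext;
import org.apache.hadoop.mapreduce.RecordReader;
import org.apache.hadoop.mapreduce.TaskAttemptContext;

// Sketch of the InputFormat contract in the new MapReduce API.
public abstract class InputFormat<K, V> {

  // Validates the input locations and splits the input into
  // logical InputSplits, one per Mapper.
  public abstract List<InputSplit> getSplits(JobContext context)
      throws IOException, InterruptedException;

  // Provides a RecordReader that extracts (key, value) records
  // from a split for the Mapper.
  public abstract RecordReader<K, V> createRecordReader(
      InputSplit split, TaskAttemptContext context)
      throws IOException, InterruptedException;
}
```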

\"InputFormat\"<\/a><\/p>\n

In our case, the stocks dataset is in text format and each line in the dataset is a record, so we will use TextInputFormat. Hadoop provides several other InputFormats, each designed for a specific purpose. For example, if your dataset has binary key-value pairs, you can use SequenceFileInputFormat. There are several important file formats in use, like Avro, SequenceFile, RCFile, etc. In fact, due to their importance, we have a separate chapter in our Hadoop Developer In Real World course dedicated to file formats. Furthermore, we will also look at implementing a custom file format in the course.
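
For instance, if the stocks dataset were stored as a SequenceFile of binary key-value pairs, only one line in the driver above would need to change. This is a hypothetical variation, not part of this post's job:

```java
// Hypothetical variation of the driver above: binary key-value input.
// Replace the TextInputFormat import and setInputFormatClass line with:
import org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFormat;

job.setInputFormatClass(SequenceFileInputFormat.class);
```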

Similar to InputFormat, OutputFormat validates the output specification and provides a RecordWriter implementation used to write out the output files of the job. Hadoop comes with several OutputFormat implementations; in fact, for every InputFormat you can find a corresponding OutputFormat, and you can also write custom implementations of OutputFormat.
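
As a small illustration of how OutputFormats are configured: TextOutputFormat, which our driver uses, writes each record as the key, a tab, then the value. If you wanted a different separator, one way (assuming the Hadoop 2.x property name) is to set it on the job's configuration before submission:

```java
// Assumption: Hadoop 2.x property name for the TextOutputFormat separator.
// Default output is "key<TAB>value"; this would make it "key,value".
job.getConfiguration().set("mapreduce.output.textoutputformat.separator", ",");
```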

\"OutputFormat\"<\/a><\/p>\n

Next, set the Mapper and Reducer classes for this MapReduce job. Our Mapper is MaxClosePriceMapper and our Reducer is MaxClosePriceReducer. Also set the output key and value types for both the Mapper and the Reducer. Here the key is of type Text and the value is of type FloatWritable.

These types look new, don't they? Yes, they are. Hadoop has Writable wrappers for all the major Java primitive types. For example, the Writable implementation for int is IntWritable, for float it is FloatWritable, for boolean it is BooleanWritable, and for String it is Text. But why new datatypes when we already have well-defined datatypes in Java?
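
Before answering that, here is a minimal, self-contained sketch of these wrappers in action; the values are made up for illustration:

```java
import org.apache.hadoop.io.BooleanWritable;
import org.apache.hadoop.io.FloatWritable;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;

public class WritableExamples {
  public static void main(String[] args) {
    IntWritable count = new IntWritable(42);         // wraps int
    FloatWritable price = new FloatWritable(98.5f);  // wraps float
    BooleanWritable flag = new BooleanWritable(true);
    Text symbol = new Text("ABCD");                  // Hadoop's String wrapper

    // Writables are mutable: set() replaces the value in place,
    // get() unwraps the Java primitive.
    price.set(99.1f);
    System.out.println(symbol + " -> " + price.get()); // prints: ABCD -> 99.1
  }
}
```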

\"Writable<\/p>\n

Writables are used whenever there is a need to transfer data between tasks: when data is passed in and out of the Mapper, and when data is passed in and out of the Reducer. As you know, Hadoop is a distributed computing framework, which means Mappers and Reducers run on many different nodes, so a lot of data gets transferred between nodes. Whenever data is sent over the network between nodes, the objects must be turned into a byte stream, a process known as serialization.

As you can imagine, Hadoop is designed to process millions and billions of records, so a lot of data is transferred over the network, and serialization therefore has to be fast, compact and effective. The authors of Hadoop felt that Java's out-of-the-box serialization was not effective enough in terms of speed and size. Here are a couple of reasons why.

Java serialization writes the class name of each object being serialized to the byte stream, so that the object's type is known and the object can be deserialized from the stream. Every subsequent instance of the class carries a reference back to the first occurrence of the class name, and these references cause two problems. First, they take up space, so the stream is not compact. Second, the reference handles make it hard to sort records in a serialized stream, since only the first record carries the class name and must be treated specially.

\"Writable<\/a><\/p>\n

So Writables were introduced to make serialization fast and compact. How? By simply not writing the class name to the stream. Then how would you know the type during deserialization? The assumption is that the client always knows the type, and this is usually true.
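
A quick sketch makes the size difference visible. FloatWritable.write() emits just the four raw bytes of the float, while Java's ObjectOutputStream also writes class metadata into the stream (the exact Java-serialization byte count varies, so it is printed rather than hard-coded):

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import org.apache.hadoop.io.FloatWritable;

public class SerializationSize {
  public static void main(String[] args) throws IOException {
    // Java serialization: writes class metadata plus the value.
    ByteArrayOutputStream javaBytes = new ByteArrayOutputStream();
    try (ObjectOutputStream oos = new ObjectOutputStream(javaBytes)) {
      oos.writeObject(Float.valueOf(98.5f));
    }

    // Writable serialization: writes only the 4 raw bytes of the float.
    ByteArrayOutputStream writableBytes = new ByteArrayOutputStream();
    new FloatWritable(98.5f).write(new DataOutputStream(writableBytes));

    System.out.println("Java serialization: " + javaBytes.size() + " bytes");
    System.out.println("Writable: " + writableBytes.size() + " bytes"); // 4
  }
}
```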

There is one more benefit of using Writables over regular Java types. With standard Java types, when objects are reconstructed from a byte stream during deserialization, a new instance has to be created for each object. With Writables, the same object can be reused, which improves processing efficiency and speed.
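
Here is a sketch of that reuse pattern: a single FloatWritable instance is refilled with readFields() for every record instead of allocating a new object each time, similar in spirit to how Hadoop reuses objects when feeding values to a Reducer:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import org.apache.hadoop.io.FloatWritable;

public class WritableReuse {
  public static void main(String[] args) throws IOException {
    // Serialize a few values into one byte stream.
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    DataOutputStream out = new DataOutputStream(bytes);
    for (float f : new float[] {10.5f, 20.5f, 30.5f}) {
      new FloatWritable(f).write(out);
    }

    // Deserialize them all into a single reused instance.
    DataInputStream in =
        new DataInputStream(new ByteArrayInputStream(bytes.toByteArray()));
    FloatWritable reused = new FloatWritable();
    for (int i = 0; i < 3; i++) {
      reused.readFields(in); // overwrites the value in place, no new object
      System.out.println(reused.get());
    }
  }
}
```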

In the Hadoop Developer In Real World course we look into how to write custom Writables when we solve the common-friends problem seen on social sites like Facebook. It's an interesting project!
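
To give a flavor of what's involved (this is not the course's implementation, just a minimal hypothetical example), a custom Writable only has to implement write() and readFields() symmetrically:

```java
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import org.apache.hadoop.io.Writable;

// Hypothetical custom Writable holding a stock symbol and a close price.
public class StockPriceWritable implements Writable {
  private String symbol;
  private float closePrice;

  public StockPriceWritable() {} // required: Hadoop instantiates via reflection

  @Override
  public void write(DataOutput out) throws IOException {
    out.writeUTF(symbol);
    out.writeFloat(closePrice);
  }

  @Override
  public void readFields(DataInput in) throws IOException {
    symbol = in.readUTF();       // fields must be read back in the
    closePrice = in.readFloat(); // same order they were written
  }
}
```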

The waitForCompletion method submits the job and waits for it to finish. Set its boolean argument to true so that you can see the progress of the job in the console. Let's pause the post here. In the next post we will look at the Mapper and Reducer programs in detail, and we will also execute the MapReduce program on our Hadoop cluster.
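
As a preview of running it, launching a packaged job from the command line typically looks like this; the jar name and HDFS paths below are placeholders, not from this post:

```
# Placeholders: substitute your jar name and HDFS input/output paths.
hadoop jar maxcloseprice.jar com.hirw.maxcloseprice.MaxClosePrice \
    /user/hirw/input/stocks /user/hirw/output/maxcloseprice
```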
