{"id":2240,"date":"2023-04-13T06:00:00","date_gmt":"2023-04-13T11:00:00","guid":{"rendered":"https:\/\/www.bigdatainrealworld.com\/?p=2240"},"modified":"2023-03-26T07:28:58","modified_gmt":"2023-03-26T12:28:58","slug":"what-is-the-difference-between-map-and-mapvalues-functions-in-spark","status":"publish","type":"post","link":"https:\/\/www.bigdatainrealworld.com\/what-is-the-difference-between-map-and-mapvalues-functions-in-spark\/","title":{"rendered":"What is the difference between map and mapValues functions in Spark?"},"content":{"rendered":"\n

In this post we will look at the differences between map and mapValues functions and when it is appropriate to use either one.<\/p>\n\n\n\n

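We have a small made-up dataset of days and their temperatures in Fahrenheit. Let\u2019s use both map() and mapValues() to convert the temperatures to Celsius. The examples multiply by 0.55, a rounded approximation of the exact conversion factor 5\/9.<\/p>\n\n\n\n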

map()<\/h2>\n\n\n\n

<\/p>\n\n\n\n

Both map and mapValues are transformations: they are evaluated lazily and return a new RDD.<\/p>\n\n\n\n

With map(), we have access to both the key and the value (x._1 and x._2), so we can transform either one. For example, we could change the key, the day name, to all uppercase.<\/p>\n\n\n\n

Returns RDD[(String, Double)]; calling collect on it yields Array[(String, Double)]<\/p>\n\n\n\n

val rdd = sc.parallelize(Seq((\"Sunday\", 50), (\"Monday\", 60), (\"Tuesday\", 65), (\"Wednesday\", 70), (\"Thursday\", 85), (\"Friday\", 25), (\"Saturday\", 15)))\r\n\r\nrdd.map { x =>\r\n  val ctemp = (x._2 - 32)*.55\r\n  (x._1, ctemp)\r\n}.collect\r\n\r\nres1: Array[(String, Double)] = Array((Sunday,9.9), (Monday,15.400000000000002), (Tuesday,18.150000000000002), (Wednesday,20.900000000000002), (Thursday,29.150000000000002), (Friday,-3.8500000000000005), (Saturday,-9.350000000000001))\r\n<\/pre>\n\n\n\n

mapValues()<\/h2>\n\n\n\n

<\/p>\n\n\n\n

Like map(), mapValues() is a transformation; it is available only on RDDs of key-value pairs.<\/p>\n\n\n\n

With mapValues(), unlike map(), we do not have access to the key, only to the value. This means we can transform the value but not the key.<\/p>\n\n\n\n

Just like map(), returns RDD[(String, Double)]; collect turns it into Array[(String, Double)]<\/p>\n\n\n\n

mapValues() differs from map() when we use custom partitioners. If we applied custom partitioning to our RDD (e.g. using partitionBy), map would “forget” that partitioner, because the keys might have changed and Spark cannot assume the old partitioning still holds; the resulting RDD has no partitioner. mapValues, however, preserves any partitioner set on the RDD, because it never has access to the keys and so cannot change them.<\/p>\n\n\n\n
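The partitioner behaviour described above can be checked directly. The following is a minimal sketch, assuming it is run in spark-shell where sc is already defined. The temps dataset (keyed by a day number rather than a day name), the variable names and the choice of 4 partitions are all illustrative assumptions, not part of the original example.

```scala
// Run in spark-shell, where sc (the SparkContext) is already defined.
import org.apache.spark.HashPartitioner

// Hypothetical data: (day number, temperature in Fahrenheit).
val temps = sc.parallelize(Seq((1, 50), (2, 60), (3, 65), (4, 70)))

// Apply a custom partitioner; 4 partitions is an arbitrary choice.
val partitioned = temps.partitionBy(new HashPartitioner(4))

// mapValues cannot touch the keys, so the partitioner is kept.
val keptByMapValues =
  partitioned.mapValues(f => (f - 32) * .55).partitioner.isDefined

// map could have changed the keys, so Spark drops the partitioner,
// even though this particular function leaves the keys alone.
val keptByMap =
  partitioned.map { case (day, f) => (day, (f - 32) * .55) }.partitioner.isDefined

println(keptByMapValues) // true
println(keptByMap)       // false
```

Keeping the partitioner matters when a later step such as reduceByKey or join can reuse it and avoid a shuffle, which is the usual reason to prefer mapValues when only the values change.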

rdd.mapValues { x =>\r\n  (x - 32)*.55\r\n}.collect\r\n\r\nres4: Array[(String, Double)] = Array((Sunday,9.9), (Monday,15.400000000000002), (Tuesday,18.150000000000002), (Wednesday,20.900000000000002), (Thursday,29.150000000000002), (Friday,-3.8500000000000005), (Saturday,-9.350000000000001))\r\n<\/pre>\n","protected":false},"excerpt":{"rendered":"

In this post we will look at the differences between map and mapValues functions and when it is appropriate to use either one. We have a [\u2026]<\/span><\/p>\n","protected":false},"author":3,"featured_media":0,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[13],"tags":[],"class_list":["post-2240","post","type-post","status-publish","format-standard","hentry","category-spark"],"yoast_head":"\nWhat is the difference between map and mapValues functions in Spark? - Big Data In Real World<\/title>\n<meta name=\"description\" content=\"In this post we will look at the differences between map and mapValues functions in Spark and when it is appropriate to use either one.\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.bigdatainrealworld.com\/what-is-the-difference-between-map-and-mapvalues-functions-in-spark\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"What is the difference between map and mapValues functions in Spark? 
- Big Data In Real World\" \/>\n<meta property=\"og:description\" content=\"In this post we will look at the differences between map and mapValues functions in Spark and when it is appropriate to use either one.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.bigdatainrealworld.com\/what-is-the-difference-between-map-and-mapvalues-functions-in-spark\/\" \/>\n<meta property=\"og:site_name\" content=\"Big Data In Real World\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/bigdatainrealworld\" \/>\n<meta property=\"article:published_time\" content=\"2023-04-13T11:00:00+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2023-03-26T12:28:58+00:00\" \/>\n<meta name=\"author\" content=\"Big Data In Real World\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Big Data In Real World\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"2 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\/\/www.bigdatainrealworld.com\/what-is-the-difference-between-map-and-mapvalues-functions-in-spark\/#article\",\"isPartOf\":{\"@id\":\"https:\/\/www.bigdatainrealworld.com\/what-is-the-difference-between-map-and-mapvalues-functions-in-spark\/\"},\"author\":{\"name\":\"Big Data In Real World\",\"@id\":\"https:\/\/www.bigdatainrealworld.com\/#\/schema\/person\/24cab2292ef49c73053440c86515ef67\"},\"headline\":\"What is the difference between map and mapValues functions in Spark?\",\"datePublished\":\"2023-04-13T11:00:00+00:00\",\"dateModified\":\"2023-03-26T12:28:58+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\/\/www.bigdatainrealworld.com\/what-is-the-difference-between-map-and-mapvalues-functions-in-spark\/\"},\"wordCount\":228,\"publisher\":{\"@id\":\"https:\/\/www.bigdatainrealworld.com\/#organization\"},\"articleSection\":[\"Spark\"],\"inLanguage\":\"en-US\"},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/www.bigdatainrealworld.com\/what-is-the-difference-between-map-and-mapvalues-functions-in-spark\/\",\"url\":\"https:\/\/www.bigdatainrealworld.com\/what-is-the-difference-between-map-and-mapvalues-functions-in-spark\/\",\"name\":\"What is the difference between map and mapValues functions in Spark? 
- Big Data In Real World\",\"isPartOf\":{\"@id\":\"https:\/\/www.bigdatainrealworld.com\/#website\"},\"datePublished\":\"2023-04-13T11:00:00+00:00\",\"dateModified\":\"2023-03-26T12:28:58+00:00\",\"description\":\"In this post we will look at the differences between map and mapValues functions in Spark and when it is appropriate to use either one.\",\"breadcrumb\":{\"@id\":\"https:\/\/www.bigdatainrealworld.com\/what-is-the-difference-between-map-and-mapvalues-functions-in-spark\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/www.bigdatainrealworld.com\/what-is-the-difference-between-map-and-mapvalues-functions-in-spark\/\"]}]},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/www.bigdatainrealworld.com\/what-is-the-difference-between-map-and-mapvalues-functions-in-spark\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/www.bigdatainrealworld.com\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"What is the difference between map and mapValues functions in Spark?\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/www.bigdatainrealworld.com\/#website\",\"url\":\"https:\/\/www.bigdatainrealworld.com\/\",\"name\":\"Big Data In Real World\",\"description\":\"Learn Big Data from experts!\",\"publisher\":{\"@id\":\"https:\/\/www.bigdatainrealworld.com\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/www.bigdatainrealworld.com\/?s={search_term_string}\"},\"query-input\":\"required name=search_term_string\"}],\"inLanguage\":\"en-US\"},{\"@type\":\"Organization\",\"@id\":\"https:\/\/www.bigdatainrealworld.com\/#organization\",\"name\":\"Big Data In Real 
World\",\"url\":\"https:\/\/www.bigdatainrealworld.com\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.bigdatainrealworld.com\/#\/schema\/logo\/image\/\",\"url\":\"https:\/\/www.bigdatainrealworld.com\/wp-content\/uploads\/2023\/02\/black.png\",\"contentUrl\":\"https:\/\/www.bigdatainrealworld.com\/wp-content\/uploads\/2023\/02\/black.png\",\"width\":500,\"height\":500,\"caption\":\"Big Data In Real World\"},\"image\":{\"@id\":\"https:\/\/www.bigdatainrealworld.com\/#\/schema\/logo\/image\/\"},\"sameAs\":[\"https:\/\/www.facebook.com\/bigdatainrealworld\"]},{\"@type\":\"Person\",\"@id\":\"https:\/\/www.bigdatainrealworld.com\/#\/schema\/person\/24cab2292ef49c73053440c86515ef67\",\"name\":\"Big Data In Real World\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.bigdatainrealworld.com\/#\/schema\/person\/image\/\",\"url\":\"https:\/\/secure.gravatar.com\/avatar\/d332bc24fe9b3182f0a22135f163ac4e?s=96&d=retro&r=g\",\"contentUrl\":\"https:\/\/secure.gravatar.com\/avatar\/d332bc24fe9b3182f0a22135f163ac4e?s=96&d=retro&r=g\",\"caption\":\"Big Data In Real World\"},\"description\":\"We are a group of Big Data engineers who are passionate about Big Data and related Big Data technologies. We have designed, developed, deployed and maintained Big Data applications ranging from batch to real time streaming big data platforms. We have seen a wide range of real world big data problems, implemented some innovative and complex (or simple, depending on how you look at it) solutions.\",\"sameAs\":[\"https:\/\/www.bigdatainrealworld.com\/\"],\"url\":\"https:\/\/www.bigdatainrealworld.com\/author\/bigdatainrealworld\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"What is the difference between map and mapValues functions in Spark? 
- Big Data In Real World","description":"In this post we will look at the differences between map and mapValues functions in Spark and when it is appropriate to use either one.","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.bigdatainrealworld.com\/what-is-the-difference-between-map-and-mapvalues-functions-in-spark\/","og_locale":"en_US","og_type":"article","og_title":"What is the difference between map and mapValues functions in Spark? - Big Data In Real World","og_description":"In this post we will look at the differences between map and mapValues functions in Spark and when it is appropriate to use either one.","og_url":"https:\/\/www.bigdatainrealworld.com\/what-is-the-difference-between-map-and-mapvalues-functions-in-spark\/","og_site_name":"Big Data In Real World","article_publisher":"https:\/\/www.facebook.com\/bigdatainrealworld","article_published_time":"2023-04-13T11:00:00+00:00","article_modified_time":"2023-03-26T12:28:58+00:00","author":"Big Data In Real World","twitter_card":"summary_large_image","twitter_misc":{"Written by":"Big Data In Real World","Est. 
reading time":"2 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/www.bigdatainrealworld.com\/what-is-the-difference-between-map-and-mapvalues-functions-in-spark\/#article","isPartOf":{"@id":"https:\/\/www.bigdatainrealworld.com\/what-is-the-difference-between-map-and-mapvalues-functions-in-spark\/"},"author":{"name":"Big Data In Real World","@id":"https:\/\/www.bigdatainrealworld.com\/#\/schema\/person\/24cab2292ef49c73053440c86515ef67"},"headline":"What is the difference between map and mapValues functions in Spark?","datePublished":"2023-04-13T11:00:00+00:00","dateModified":"2023-03-26T12:28:58+00:00","mainEntityOfPage":{"@id":"https:\/\/www.bigdatainrealworld.com\/what-is-the-difference-between-map-and-mapvalues-functions-in-spark\/"},"wordCount":228,"publisher":{"@id":"https:\/\/www.bigdatainrealworld.com\/#organization"},"articleSection":["Spark"],"inLanguage":"en-US"},{"@type":"WebPage","@id":"https:\/\/www.bigdatainrealworld.com\/what-is-the-difference-between-map-and-mapvalues-functions-in-spark\/","url":"https:\/\/www.bigdatainrealworld.com\/what-is-the-difference-between-map-and-mapvalues-functions-in-spark\/","name":"What is the difference between map and mapValues functions in Spark? 
- Big Data In Real World","isPartOf":{"@id":"https:\/\/www.bigdatainrealworld.com\/#website"},"datePublished":"2023-04-13T11:00:00+00:00","dateModified":"2023-03-26T12:28:58+00:00","description":"In this post we will look at the differences between map and mapValues functions in Spark and when it is appropriate to use either one.","breadcrumb":{"@id":"https:\/\/www.bigdatainrealworld.com\/what-is-the-difference-between-map-and-mapvalues-functions-in-spark\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.bigdatainrealworld.com\/what-is-the-difference-between-map-and-mapvalues-functions-in-spark\/"]}]},{"@type":"BreadcrumbList","@id":"https:\/\/www.bigdatainrealworld.com\/what-is-the-difference-between-map-and-mapvalues-functions-in-spark\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/www.bigdatainrealworld.com\/"},{"@type":"ListItem","position":2,"name":"What is the difference between map and mapValues functions in Spark?"}]},{"@type":"WebSite","@id":"https:\/\/www.bigdatainrealworld.com\/#website","url":"https:\/\/www.bigdatainrealworld.com\/","name":"Big Data In Real World","description":"Learn Big Data from experts!","publisher":{"@id":"https:\/\/www.bigdatainrealworld.com\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/www.bigdatainrealworld.com\/?s={search_term_string}"},"query-input":"required name=search_term_string"}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/www.bigdatainrealworld.com\/#organization","name":"Big Data In Real 
World","url":"https:\/\/www.bigdatainrealworld.com\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.bigdatainrealworld.com\/#\/schema\/logo\/image\/","url":"https:\/\/www.bigdatainrealworld.com\/wp-content\/uploads\/2023\/02\/black.png","contentUrl":"https:\/\/www.bigdatainrealworld.com\/wp-content\/uploads\/2023\/02\/black.png","width":500,"height":500,"caption":"Big Data In Real World"},"image":{"@id":"https:\/\/www.bigdatainrealworld.com\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/www.facebook.com\/bigdatainrealworld"]},{"@type":"Person","@id":"https:\/\/www.bigdatainrealworld.com\/#\/schema\/person\/24cab2292ef49c73053440c86515ef67","name":"Big Data In Real World","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.bigdatainrealworld.com\/#\/schema\/person\/image\/","url":"https:\/\/secure.gravatar.com\/avatar\/d332bc24fe9b3182f0a22135f163ac4e?s=96&d=retro&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/d332bc24fe9b3182f0a22135f163ac4e?s=96&d=retro&r=g","caption":"Big Data In Real World"},"description":"We are a group of Big Data engineers who are passionate about Big Data and related Big Data technologies. We have designed, developed, deployed and maintained Big Data applications ranging from batch to real time streaming big data platforms. 
We have seen a wide range of real world big data problems, implemented some innovative and complex (or simple, depending on how you look at it) solutions.","sameAs":["https:\/\/www.bigdatainrealworld.com\/"],"url":"https:\/\/www.bigdatainrealworld.com\/author\/bigdatainrealworld\/"}]}},"_links":{"self":[{"href":"https:\/\/www.bigdatainrealworld.com\/wp-json\/wp\/v2\/posts\/2240","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.bigdatainrealworld.com\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.bigdatainrealworld.com\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.bigdatainrealworld.com\/wp-json\/wp\/v2\/users\/3"}],"replies":[{"embeddable":true,"href":"https:\/\/www.bigdatainrealworld.com\/wp-json\/wp\/v2\/comments?post=2240"}],"version-history":[{"count":1,"href":"https:\/\/www.bigdatainrealworld.com\/wp-json\/wp\/v2\/posts\/2240\/revisions"}],"predecessor-version":[{"id":2241,"href":"https:\/\/www.bigdatainrealworld.com\/wp-json\/wp\/v2\/posts\/2240\/revisions\/2241"}],"wp:attachment":[{"href":"https:\/\/www.bigdatainrealworld.com\/wp-json\/wp\/v2\/media?parent=2240"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.bigdatainrealworld.com\/wp-json\/wp\/v2\/categories?post=2240"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.bigdatainrealworld.com\/wp-json\/wp\/v2\/tags?post=2240"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}