Real World End To End Project using Spark, Elasticsearch, Kibana, REST and Angular - Big Data In Real World


Students have been asking us for an end-to-end real-world project that uses Spark, so last night we added a new chapter to the Spark Developer In Real World course. It demonstrates an end-to-end project that uses Spark together with other tools from the Big Data ecosystem.

Our intention with this project

Real-world projects usually involve more tools and frameworks than Spark alone. With a good end-to-end example, you will be able to visualize, and possibly implement, one of the use cases at work, or solve a problem using Spark in combination with other tools in the ecosystem. Having seen an end-to-end project, you can also explain in your next interview how big data projects are implemented using Spark.

Enroll in Spark Developer In Real World course

What is the project about?

Our goal with this project is to build a “mini Stack Overflow” website using Stack Overflow’s post (Q&A) dataset. Users can go to a web page and type in the text they want to search for, and the website will bring back the relevant questions and answers by searching the data stored in Elasticsearch. We will write a Spark job to load the data into Elasticsearch.
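
To make the search step a little more concrete, here is a minimal sketch of the kind of request body the website could send to Elasticsearch. It is not taken from the course. The field names (`title`, `body`, `answers`), the assumption that `answers` is mapped as a `nested` field, and the helper `build_search_body` are all hypothetical and chosen only for illustration.

```python
import json


def build_search_body(text, size=10):
    """Build an Elasticsearch search body for a free-text Q&A search.

    Assumptions (not from the course): each document is a question with
    `title` and `body` fields, and its answers live in an `answers` array
    that is mapped as type `nested`.
    """
    return {
        "size": size,
        "query": {
            "bool": {
                # A hit on either the question or one of its answers is enough.
                "should": [
                    {
                        "multi_match": {
                            "query": text,
                            # Weight title matches higher than body matches.
                            "fields": ["title^2", "body"],
                        }
                    },
                    {
                        "nested": {
                            "path": "answers",
                            "query": {"match": {"answers.body": text}},
                        }
                    },
                ],
                "minimum_should_match": 1,
            }
        },
    }


if __name__ == "__main__":
    # Print the JSON that would be POSTed to a search endpoint.
    print(json.dumps(build_search_body("spark dataframe join"), indent=2))
```

The `bool`/`should` combination is one reasonable choice here because it lets a question surface when the search text appears only in an answer. A real deployment might tune the boosts or add filters.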

Implementation steps

  1. Load the Stack Overflow post dataset (27.3 GB) into Elasticsearch by writing a Spark job that transforms the XML data into Q&A documents in JSON format. Each question will hold all of its answers in a nested array.
  2. Visualize the Stack Overflow Q&A documents stored in Elasticsearch using Kibana.
  3. Write a REST service using the Java Spring framework and Spring Boot to expose the data stored in Elasticsearch.
  4. Write an Angular web application that consumes the REST service, letting users search and view the data stored in Elasticsearch.
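
The transformation in step 1 can be sketched in a few lines. This is plain Python rather than a Spark job, and it is only meant to show the shape of the output documents. It assumes the posts file consists of `<row>` elements whose `PostTypeId` attribute is `1` for questions and `2` for answers, with answers pointing at their question through `ParentId`. The sample rows, the output field names, and the function `build_qa_documents` are made up for illustration.

```python
import json
import xml.etree.ElementTree as ET
from collections import defaultdict

# Tiny made-up sample in the assumed row-per-post layout.
SAMPLE_XML = """<posts>
  <row Id="1" PostTypeId="1" Title="How do I join two DataFrames?" Body="question text" Score="5" />
  <row Id="2" PostTypeId="2" ParentId="1" Body="Use join()" Score="3" />
  <row Id="3" PostTypeId="2" ParentId="1" Body="Try a broadcast join" Score="1" />
  <row Id="4" PostTypeId="1" Title="What is an RDD?" Body="another question" Score="2" />
</posts>"""


def build_qa_documents(xml_text):
    """Turn post rows into one document per question with nested answers."""
    root = ET.fromstring(xml_text)
    questions = {}
    answers_by_parent = defaultdict(list)

    for row in root.iter("row"):
        post_type = row.get("PostTypeId")
        if post_type == "1":  # question
            questions[int(row.get("Id"))] = {
                "id": int(row.get("Id")),
                "title": row.get("Title"),
                "body": row.get("Body"),
                "score": int(row.get("Score", "0")),
            }
        elif post_type == "2":  # answer, linked to its question by ParentId
            answers_by_parent[int(row.get("ParentId"))].append({
                "id": int(row.get("Id")),
                "body": row.get("Body"),
                "score": int(row.get("Score", "0")),
            })

    # Attach each question's answers as a nested array.
    docs = []
    for question_id, question in questions.items():
        question["answers"] = answers_by_parent.get(question_id, [])
        docs.append(question)
    return docs


if __name__ == "__main__":
    for doc in build_qa_documents(SAMPLE_XML):
        print(json.dumps(doc))
```

On a 27.3 GB dataset the same grouping would be expressed as distributed Spark operations instead of in-memory dictionaries, but the resulting JSON documents would have the same question-with-nested-answers shape.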

As you can see, we are not using Spark alone to solve the problem in our project. In real-world projects, there are usually multiple tools involved in solving a problem, and Spark is usually one link in a long chain. So one thing we were keen on showing students is how Spark is used along with other tools in the big data ecosystem to solve a specific problem.

Here is a quick sneak peek at the end result

Big Data In Real World
We are a group of Big Data engineers who are passionate about Big Data and related technologies. We have designed, developed, deployed and maintained Big Data applications ranging from batch to real-time streaming big data platforms. We have seen a wide range of real-world big data problems and implemented some innovative and complex (or simple, depending on how you look at it) solutions.
