Course Overview:
This course covers the concepts of the Big Data Hadoop framework and Hadoop ecosystem tools; distributed storage and processing; the basics of functional programming; Apache Hive; Apache Spark; and NoSQL databases. The course curriculum is aligned with the CCA Spark and Hadoop Developer Exam (CCA175).
Course Highlights:
- 22 hours of online self-paced learning
- 52 hours of instructor-led training
- 4 industry-based course-end projects
- Interactive learning with integrated labs
- Curriculum aligned to Cloudera CCA175 certification exam
Course Delivery Method:
Online Bootcamp: Online, self-paced, video-based learning and live virtual classrooms conducted by the industry’s leading big data coach. This course includes Simplilearn’s integrated lab platform.
Prerequisites:
Learners interested in taking this Big Data Hadoop and Spark Developer course should have a basic understanding of core Java and SQL.
Skills Covered:
- Data processing
- Functional programming
- Apache Spark
- Parallel processing
- Spark RDD optimization techniques
- Spark SQL
Learning Path
- Lesson 1: Course Introduction
- Lesson 2: Introduction to Big Data and Hadoop
- Lesson 3: Hadoop Architecture, Distributed Storage (HDFS) and YARN
- Lesson 4: Data Ingestion into Big Data Systems and ETL
- Lesson 5: Distributed Processing - MapReduce Framework and Pig
- Lesson 6: Apache Hive
- Lesson 7: NoSQL Databases - HBase
- Lesson 8: Basics of Functional Programming and Scala
- Lesson 9: Apache Spark - Next-Generation Big Data Framework
- Lesson 10: Spark Core - Processing RDD
- Lesson 11: Spark SQL - Processing DataFrames
- Lesson 12: Spark MLlib - Modelling Big Data with Spark
- Lesson 13: Stream Processing Frameworks and Spark Streaming
- Lesson 14: Spark GraphX
Who Will Benefit:
This Big Data Hadoop and Spark Developer course is best suited for the following professionals:
- Analytics, BI, IT, data management, and project management professionals
- Architects, software developers, and professionals working on mainframe and testing
- Aspiring data scientists and graduates looking to begin a career in big data analytics
Key Learning Outcomes:
- Learn how to navigate the Hadoop ecosystem and understand how to optimize its use
- Ingest data using Sqoop, Flume, and Kafka
- Implement partitioning, bucketing, and indexing in Hive
- Work with RDDs in Apache Spark
- Process real-time streaming data
- Perform DataFrame operations in Spark using SQL queries
- Implement user-defined functions (UDF) and user-defined aggregate functions (UDAF) in Spark
Accreditations:
This Big Data Hadoop and Spark Developer course is aligned with the CCA Spark and Hadoop Developer Exam (CCA175).
Certification Criteria:
- Completion of at least 85 percent of the online self-paced learning or attendance at one live virtual classroom
- A score of at least 75 percent in the course-end assessment
- Successful evaluation in at least one project
Career Outlook
Expected Growth (2019 – 2029)*
- 15%
Annual Average US Salary*
- $92,000 - $140,000
Demanding Fields
- Information Technology
- Finance
- Retail
- Real Estate
- Engineering
- Hospitality Management
- Business Consulting
Job Opportunities for Professionals
- IT Developers
- Analytics Managers
- Information Architects
- Analytics professionals
- Experienced professionals
- Beginners or recent graduates with a bachelor’s or master’s degree
*Salary and job outlook information comes from the US Bureau of Labor Statistics and Projections Central. Employment outcomes are not guaranteed.