About this Course
3.9
2,688 ratings
660 reviews
This course is for novice programmers or business people who would like to understand the core tools used to wrangle and analyze big data. With no prior experience, you will have the opportunity to walk through hands-on examples with the Hadoop and Spark frameworks, two of the most common in the industry. You will be comfortable explaining the specific components and basic processes of the Hadoop architecture, software stack, and execution environment. The assignments guide you through how data scientists apply important concepts and techniques, such as MapReduce, to solve fundamental problems in big data. You'll feel empowered to have conversations about big data and the data analysis process....
100% online courses
Start instantly and learn on your own schedule.

Flexible deadlines
Reset deadlines in accordance with your schedule.

Suggested: 5 weeks of study, 1-2 hours/week
Approx. 19 hours to complete

English
Subtitles: English

Skills you will gain

Python Programming, Apache Hadoop, MapReduce, Apache Spark

Syllabus - What you will learn from this course

Section 1
2 hours to complete

Hadoop Basics

Welcome to the first module of the Big Data Platform course. This first module will provide insight into the Big Data hype and its technologies, opportunities, and challenges. We will take a deeper look into the Hadoop stack and the tools and technologies associated with Big Data solutions. ...
7 videos (Total 53 min), 4 readings, 1 quiz

7 videos
  • The Apache Framework: Basic Modules (3m)
  • Hadoop Distributed File System (HDFS) (5m)
  • The Hadoop "Zoo" (5m)
  • Hadoop Ecosystem Major Components (11m)
  • Exploring the Cloudera VM: Hands-On Part 1 (16m)
  • Exploring the Cloudera VM: Hands-On Part 2 (6m)

4 readings
  • Apache Hadoop Ecosystem (10m)
  • Lesson 1 Slides (PDF) (10m)
  • Hardware & Software Requirements (10m)
  • Lesson 2 Slides - Cloudera VM Tour (10m)

1 practice exercise
  • Basic Hadoop Stack (20m)

Section 2
3 hours to complete

Introduction to the Hadoop Stack

In this module we will take a detailed look at the Hadoop stack, ranging from the basic HDFS components to application execution frameworks, languages, and services....
10 videos (Total 70 min), 6 readings, 3 quizzes

10 videos
  • The Hadoop Distributed File System (HDFS) and HDFS2 (8m)
  • MapReduce Framework and YARN (8m)
  • The Hadoop Execution Environment (4m)
  • YARN, Tez, and Spark (11m)
  • Hadoop Resource Scheduling (6m)
  • Hadoop-Based Applications (3m)
  • Introduction to Apache Pig (7m)
  • Introduction to Apache HIVE (7m)
  • Introduction to Apache HBASE (7m)

6 readings
  • Hadoop Basics - Lesson 1 Slides (10m)
  • Lesson 2: Hadoop Execution Environment - Slides (10m)
  • Lesson 3: Hadoop-Based Applications Overview - All Slides (10m)
  • Command list for Applications Slides (10m)
  • Tips to handle service connection errors (10m)
  • References for Applications (10m)

3 practice exercises
  • Overview of Hadoop Stack (10m)
  • Hadoop Execution Environment (14m)
  • Hadoop Applications (12m)

Section 3
2 hours to complete

Introduction to Hadoop Distributed File System (HDFS)

In this module we will take a detailed look at the Hadoop Distributed File System (HDFS). We will cover the main design goals of HDFS, the read/write process, the main configuration parameters that can be tuned to control HDFS performance and robustness, and the different ways you can access data on HDFS....
9 videos (Total 58 min), 5 readings, 3 quizzes

9 videos
  • The HDFS Performance Envelope (5m)
  • Read/Write Processes in HDFS (4m)
  • HDFS Tuning Parameters (6m)
  • HDFS Performance and Robustness (9m)
  • Overview of HDFS Access, APIs, and Applications (5m)
  • HDFS Commands (8m)
  • Native Java API for HDFS (4m)
  • REST API for HDFS (8m)

5 readings
  • Lesson 1: Introduction to HDFS - Slides (10m)
  • HDFS references (10m)
  • Lesson 2: HDFS Performance and Tuning - Slides (10m)
  • HDFS Access, APIs (10m)
  • Lesson 3: HDFS Access, APIs, Applications - Slides (10m)

3 practice exercises
  • HDFS Architecture (12m)
  • HDFS performance, tuning, and robustness (10m)
  • Accessing HDFS (12m)

Section 4
7 hours to complete

Introduction to Map/Reduce

This module will introduce Map/Reduce concepts and practice. You will learn the big idea behind Map/Reduce and how to design, implement, and execute tasks in the Map/Reduce framework. You will also learn the trade-offs in Map/Reduce and how they motivate other tools....
9 videos (Total 27 min), 3 readings, 3 quizzes

9 videos
  • The Map/Reduce Framework (2m)
  • A MapReduce Example: Wordcount in detail (4m)
  • MapReduce: Intro to Examples and Principles (2m)
  • MapReduce Example: Trending Wordcount (1m)
  • MapReduce Example: Joining Data (4m)
  • MapReduce Example: Vector Multiplication (2m)
  • Computational Costs of Vector Multiplication (3m)
  • MapReduce Summary (2m)

3 readings
  • Lesson 1: Introduction to MapReduce - Slides (10m)
  • A note on debugging map/reduce programs. (10m)
  • Lesson 2: MapReduce Examples and Principles - Slides (10m)

1 practice exercise
  • Lesson 1 Review (14m)
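
To give a flavor of the wordcount example this module covers, here is a minimal single-process sketch in Python. It is not taken from the course materials and does not use Hadoop itself; the function names (map_phase, shuffle, reduce_phase, wordcount) and the sample input are hypothetical, chosen only to show the map, shuffle, and reduce steps.

```python
from collections import defaultdict


def map_phase(line):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key (the word)."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(word, counts):
    """Reduce step: sum the counts collected for a single word."""
    return word, sum(counts)


def wordcount(lines):
    """Run map, shuffle, and reduce in sequence on an iterable of lines."""
    pairs = (pair for line in lines for pair in map_phase(line))
    grouped = shuffle(pairs)
    return dict(reduce_phase(word, counts) for word, counts in grouped.items())


if __name__ == "__main__":
    # Hypothetical sample input, for illustration only.
    sample = ["big data big ideas", "data moves to compute"]
    print(wordcount(sample))
```

In a real cluster the map and reduce functions would run in parallel on different nodes and the framework would perform the shuffle; this sketch only mirrors the data flow on one machine.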
83% got a tangible career benefit from this course

Top Reviews

By GM, Feb 1st 2016

I'm forced to give 5 stars. I don't want to have a certification on a poor quality course (another coursera mistake). This material needs tremendous amount of work to get finished and revised.

By GC, Oct 25th 2015

Super hands-on introduction to key Hadoop components, such as Spark, MapReduce, Hive, Pig, HBase, HDFS, YARN, Sqoop and Flume.

I can't wait for the next course in the specialization.

Instructors

Natasha Balac

Director, Predictive Analytics Center of Excellence (PACE)
San Diego Supercomputer Center

Paul Rodriguez

Research Programmer
San Diego Supercomputer Center (SDSC)

Andrea Zonca

HPC Applications Specialist
San Diego Supercomputer Center (SDSC)

About University of California San Diego

UC San Diego is an academic powerhouse and economic engine, recognized as one of the top 10 public universities by U.S. News and World Report. Innovation is central to who we are and what we do. Here, students learn that knowledge isn't just acquired in the classroom—life is their laboratory....

Frequently Asked Questions

  • Once you enroll for a Certificate, you’ll have access to all videos, quizzes, and programming assignments (if applicable). Peer review assignments can only be submitted and reviewed once your session has begun. If you choose to explore the course without purchasing, you may not be able to access certain assignments.

  • When you purchase a Certificate you get access to all course materials, including graded assignments. Upon completing the course, your electronic Certificate will be added to your Accomplishments page - from there, you can print your Certificate or add it to your LinkedIn profile. If you only want to read and view the course content, you can audit the course for free.

More questions? Visit the Learner Help Center.