About this Course
4.5
1,023 ratings
196 reviews
Course 4 of 6 in the Big Data Specialization

100% online

Start instantly and learn on your own schedule.
Flexible deadlines

Reset deadlines in accordance with your schedule.
Hours to complete

Approx. 16 hours to complete

Suggested: 5 Weeks, 3 - 5 hours per week...
Available languages

English

Subtitles: English

Skills you will gain

Machine Learning Concepts, KNIME, Machine Learning, Apache Spark

Syllabus - What you will learn from this course

Week 1
Hours to complete
24 minutes to complete

Welcome

2 videos (Total 14 min)
Video: 2 videos
Summary of Big Data Integration and Processing (10m)
Hours to complete
3 hours to complete

Introduction to Machine Learning with Big Data

7 videos (Total 45 min), 7 readings, 1 quiz
Video: 7 videos
Categories Of Machine Learning Techniques (7m)
Machine Learning Process (3m)
Goals and Activities in the Machine Learning Process (10m)
CRISP-DM (5m)
Scaling Up Machine Learning Algorithms (5m)
Tools Used in this Course (5m)
Reading: 7 readings
Slides: Machine Learning Overview and Applications (25m)
Downloading, Installing and Using KNIME
Downloading and Installing the Cloudera VM Instructions (Windows) (10m)
Downloading and Installing the Cloudera VM Instructions (Mac) (10m)
Instructions for Downloading Hands On Datasets (10m)
Instructions for Starting Jupyter (10m)
PDFs of Readings for Week 1 Hands-On (10m)
Quiz: 1 practice exercise
Machine Learning Overview (20m)
Week 2
Hours to complete
3 hours to complete

Data Exploration

6 videos (Total 39 min), 5 readings, 2 quizzes
Video: 6 videos
Data Exploration (4m)
Data Exploration through Summary Statistics (7m)
Data Exploration through Plots (8m)
Exploring Data with KNIME Plots (9m)
Data Exploration in Spark (5m)
Reading: 5 readings
Slides: Data Exploration Overview and Terminology (10m)
Description of Daily Weather Dataset (10m)
Exploring Data with KNIME Plots (40m)
Data Exploration in Spark (10m)
PDFs of Activities for Data Exploration Hands-On Readings (10m)
Quiz: 2 practice exercises
Data Exploration (20m)
Data Exploration in KNIME and Spark Quiz (20m)
Hours to complete
3 hours to complete

Data Preparation

8 videos (Total 42 min), 4 readings, 2 quizzes
Video: 8 videos
Data Quality (4m)
Addressing Data Quality Issues (4m)
Feature Selection (5m)
Feature Transformation (5m)
Dimensionality Reduction (7m)
Handling Missing Values in KNIME (5m)
Handling Missing Values in Spark (5m)
Reading: 4 readings
Slides: Data Preparation for Machine Learning (30m)
Handling Missing Values in KNIME (20m)
Handling Missing Values in Spark (10m)
PDFs for Data Preparation Hands-On Readings (10m)
Quiz: 2 practice exercises
Data Preparation (25m)
Handling Missing Values in KNIME and Spark Quiz (20m)
Week 3
Hours to complete
4 hours to complete

Classification

8 videos (Total 60 min), 7 readings, 2 quizzes
Video: 8 videos
Building and Applying a Classification Model (5m)
Classification Algorithms (2m)
k-Nearest Neighbors (4m)
Decision Trees (13m)
Naïve Bayes (14m)
Classification using Decision Tree in KNIME (8m)
Classification in Spark (6m)
Reading: 7 readings
Slides: What is Classification? (10m)
Slides: Classification Algorithms (10m)
Classification using Decision Tree in KNIME (45m)
Interpreting a Decision Tree in KNIME (20m)
Instructions for Changing the Number of Cloudera VM CPUs (10m)
Classification in Spark (45m)
PDFs for Classification Hands-On Readings (10m)
Quiz: 2 practice exercises
Classification (20m)
Classification in KNIME and Spark Quiz (16m)
Week 4
Hours to complete
3 hours to complete

Evaluation of Machine Learning Models

7 videos (Total 42 min), 7 readings, 2 quizzes
Video: 7 videos
Overfitting in Decision Trees (3m)
Using a Validation Set (9m)
Metrics to Evaluate Model Performance (10m)
Confusion Matrix (7m)
Evaluation of Decision Tree in KNIME (3m)
Evaluation of Decision Tree in Spark (2m)
Reading: 7 readings
Slides: Overfitting: What is it and how would you prevent it? (10m)
Slides: Model evaluation metrics and methods (10m)
Evaluation of Decision Tree in KNIME (30m)
Completed KNIME Workflows (10m)
Evaluation of Decision Tree in Spark (20m)
Comparing Classification Results for KNIME and Spark (10m)
PDFs for Evaluation of Machine Learning Models Hands-On Readings (10m)
Quiz: 2 practice exercises
Model Evaluation (20m)
Model Evaluation in KNIME and Spark Quiz (16m)
4.5
196 Reviews
Career direction: 60% started a new career after completing these courses
Career benefit: 47% got a tangible career benefit from this course
Career promotion: 25% got a pay increase or promotion

Top Reviews

By PR, Jul 19th 2018

Excellent course, I learned a lot about machine learning with big data, but most importantly I feel ready to take it into more complex level although I realized there is lots to learn.

By RC, Sep 1st 2018

Amazing training on ML for people starting their first experiences with the topic. Practical and easy to understand examples that can be further extended by the student.

Instructors

Mai Nguyen

Lead for Data Analytics
San Diego Supercomputer Center

Ilkay Altintas

Chief Data Science Officer
San Diego Supercomputer Center

About University of California San Diego

UC San Diego is an academic powerhouse and economic engine, recognized as one of the top 10 public universities by U.S. News and World Report. Innovation is central to who we are and what we do. Here, students learn that knowledge isn't just acquired in the classroom—life is their laboratory....

About the Big Data Specialization

Drive better business decisions with an overview of how big data is organized, analyzed, and interpreted. Apply your insights to real-world problems and questions.

Do you need to understand big data and how it will impact your business? This Specialization is for you. You will gain an understanding of what insights big data can provide through hands-on experience with the tools and systems used by big data scientists and engineers. Previous programming experience is not required! You will be guided through the basics of using Hadoop with MapReduce, Spark, Pig and Hive. By following along with provided code, you will experience how one can perform predictive modeling and leverage graph analytics to model problems. This specialization will prepare you to ask the right questions about data, communicate effectively with data scientists, and do basic exploration of large, complex datasets. In the final Capstone Project, developed in partnership with data software company Splunk, you’ll apply the skills you learned to do basic analyses of big data....

Frequently Asked Questions

  • Once you enroll for a Certificate, you’ll have access to all videos, quizzes, and programming assignments (if applicable). Peer review assignments can only be submitted and reviewed once your session has begun. If you choose to explore the course without purchasing, you may not be able to access certain assignments.

  • When you enroll in the course, you get access to all of the courses in the Specialization, and you earn a certificate when you complete the work. Your electronic Certificate will be added to your Accomplishments page - from there, you can print your Certificate or add it to your LinkedIn profile. If you only want to read and view the course content, you can audit the course for free.

More questions? Visit the Learner Help Center.