About this Course

100% online

Start instantly and learn on your own schedule.

Flexible deadlines

Reset deadlines in accordance with your schedule.

Beginner Level

Approx. 19 hours to complete

Suggested: 8 hours/week...

English

Subtitles: English

What you will learn

  • Use different tools to browse existing databases and tables in big data systems

  • Use different tools to explore files in distributed big data filesystems and cloud storage

  • Create and manage big data databases and tables using Apache Hive and Apache Impala

  • Describe and choose among different data types and file formats for big data systems

Skills you will gain

Data Management, Distributed File Systems, Cloud Storage, Big Data, SQL

Syllabus - What you will learn from this course

Week 1
3 hours to complete

Orientation to Data in Clusters and Cloud Storage

7 videos (Total 56 min), 3 readings, 1 quiz

7 videos
  • Browsing Tables with Hue (7 min)
  • Browsing Tables with SQL Utility Statements (6 min)
  • Browsing HDFS with the Hue File Browser (13 min)
  • Browsing HDFS from the Command Line (9 min)
  • Understanding S3 and Other Cloud Storage Platforms (6 min)
  • Browsing S3 Buckets from the Command Line (8 min)

3 readings
  • Review and Preparation (30 min)
  • Instructions for Downloading and Installing the Exercise Environment (30 min)
  • Troubleshooting the VM (5 min)

1 practice exercise
  • Week 1 Graded Quiz (30 min)

Week 2
5 hours to complete

Defining Databases, Tables, and Columns

7 videos (Total 33 min), 12 readings, 2 quizzes

7 videos
  • Introduction to the CREATE TABLE Statement (5 min)
  • Using Different Schemas on the Same Data (12 min)
  • Specifying TBLPROPERTIES (2 min)
  • Examining, Modifying, and Removing Tables (1 min)
  • Hive and Impala Interoperability (2 min)
  • Impala Metadata Refresh (3 min)

12 readings
  • Creating Databases and Tables with Hue (30 min)
  • Creating Databases and Tables with SQL (15 min)
  • Permissions to Create Databases and Tables (5 min)
  • The ROW FORMAT Clause (25 min)
  • The STORED AS Clause (15 min)
  • The LOCATION Clause (20 min)
  • CREATE TABLE Shortcuts (10 min)
  • Using Hive SerDes (15 min)
  • Working with Unstructured and Semi-Structured Data (15 min)
  • Examining Table Structure (10 min)
  • Dropping Databases and Tables (5 min)
  • Modifying Existing Tables (35 min)

2 practice exercises
  • Week 2 Practice Quiz (20 min)
  • Week 2 Graded Quiz (30 min)

Week 3
3 hours to complete

Data Types and File Types

5 videos (Total 14 min), 12 readings, 2 quizzes

5 videos
  • Overview of Data Types (1 min)
  • Choosing the Right Data Types (4 min)
  • Overview of File Types (3 min)
  • Choosing the Right File Types (3 min)

12 readings
  • Integer Data Types (5 min)
  • Decimal Data Types (10 min)
  • Character String Data Types (10 min)
  • Other Data Types (5 min)
  • Examining Data Types (10 min)
  • Out-of-Range Values (5 min)
  • Text Files (5 min)
  • Avro Files (5 min)
  • Parquet Files (5 min)
  • ORC Files (5 min)
  • Other File Types (5 min)
  • Creating Tables with Avro and Parquet Files (20 min)

2 practice exercises
  • Week 3 Practice Quiz (20 min)
  • Week 3 Graded Quiz (30 min)

Week 4
5 hours to complete

Managing Datasets in Clusters and Cloud Storage

8 videos (Total 48 min), 13 readings, 3 quizzes

8 videos
  • Refresh Impala's Metadata Cache after Loading Data (2 min)
  • Loading Files into HDFS with Hue's Table Browser (10 min)
  • Loading Files into HDFS with Hue's File Browser (6 min)
  • Loading Files into HDFS from the Command Line (8 min)
  • Loading Files into S3 from the Command Line (10 min)
  • Using Hive and Impala to Load Data into Tables (3 min)
  • Conclusion (2 min)

13 readings
  • More about HDFS Shell Commands (10 min)
  • Chaining and Scripting with HDFS Commands (5 min)
  • HDFS Permissions (5 min)
  • Other Ways to Load Files into S3 (5 min)
  • S3 Permissions (10 min)
  • Missing Values (15 min)
  • Character Sets (5 min)
  • Using Sqoop to Import Data (15 min)
  • More Sqoop Import Options (5 min)
  • Using Sqoop to Export Data (5 min)
  • SQL LOAD DATA Statements (10 min)
  • SQL INSERT Statements (10 min)
  • SQL INSERT ... SELECT and CTAS Statements (15 min)

2 practice exercises
  • Week 4 Practice Quiz (20 min)
  • Week 4 Graded Quiz (30 min)

Instructors

Ian Cook
Senior Curriculum Developer, Cloudera

Glynn Durham
Senior Instructor, Cloudera

About Cloudera

At Cloudera, we believe that data can make what is impossible today, possible tomorrow. We empower people to transform complex data into clear and actionable insights. Cloudera delivers an enterprise data cloud for any data, anywhere, from the Edge to AI. Powered by the relentless innovation of the open source community, Cloudera advances digital transformation for the world’s largest enterprises. ...

About the Modern Big Data Analysis with SQL Specialization

This Specialization teaches the essential skills for working with large-scale data using SQL. Maybe you are new to SQL and you want to learn the basics. Or maybe you already have some experience using SQL to query smaller-scale data with relational databases. Either way, if you are interested in gaining the skills necessary to query big data with modern distributed SQL engines, this Specialization is for you.

Most courses that teach SQL focus on traditional relational databases, but today, more and more of the data that’s being generated is too big to be stored there, and it’s growing too quickly to be efficiently stored in commercial data warehouses. Instead, it’s increasingly stored in distributed clusters and cloud storage. These data stores are cost-efficient and infinitely scalable.

To query these huge datasets in clusters and cloud storage, you need a newer breed of SQL engine: distributed query engines, like Hive, Impala, Presto, and Drill. These are open source SQL engines capable of querying enormous datasets. This Specialization focuses on Hive and Impala, the most widely deployed of these query engines.

This Specialization is designed to provide excellent preparation for the Cloudera Certified Associate (CCA) Data Analyst certification exam. You can earn this certification credential by taking a hands-on practical exam using the same SQL engines that this Specialization teaches: Hive and Impala....

Frequently Asked Questions

  • Once you enroll for a Certificate, you’ll have access to all videos, quizzes, and programming assignments (if applicable). Peer review assignments can only be submitted and reviewed once your session has begun. If you choose to explore the course without purchasing, you may not be able to access certain assignments.

  • When you enroll in the course, you get access to all of the courses in the Specialization, and you earn a certificate when you complete the work. Your electronic Certificate will be added to your Accomplishments page - from there, you can print your Certificate or add it to your LinkedIn profile. If you only want to read and view the course content, you can audit the course for free.

  • System requirements:

      • Windows, macOS, or Linux operating system (iPads and Android tablets will not work)
      • 64-bit operating system (32-bit operating systems will not work)
      • 8 GB RAM or more
      • 25 GB free disk space or more
      • Intel VT-x or AMD-V virtualization support enabled (on Mac computers with Intel processors, this is always enabled; on Windows and Linux computers, you might need to enable it in the BIOS)
      • For Windows XP computers only: You must have an unzip utility such as 7-Zip or WinZip installed (Windows XP’s built-in unzip utility will not work)
