Coursera

Mastering Data Analysis in Excel

Duke University

About this course

Important: The focus of this course is on math - specifically, data-analysis concepts and methods - not on Excel for its own sake. We use Excel to do our calculations, and all math formulas are given as Excel spreadsheets, but we do not attempt to cover Excel Macros, Visual Basic, Pivot Tables, or other intermediate-to-advanced Excel functionality.

This course will prepare you to design and implement realistic predictive models based on data. In the Final Project (module 6) you will assume the role of a business data analyst for a bank, and develop two different predictive models to determine which applicants for credit cards should be accepted and which rejected. Your first model will focus on minimizing default risk, and your second on maximizing bank profits. The two models should demonstrate to you in a practical, hands-on way the idea that your choice of business metric drives your choice of an optimal model.

The second big idea this course seeks to demonstrate is that your data-analysis results cannot and should not aim to eliminate all uncertainty. Your role as a data analyst is to reduce uncertainty for decision-makers by a financially valuable increment, while quantifying how much uncertainty remains. You will learn to calculate and apply to real-world examples the most important uncertainty measures used in business, including classification error rates, entropy of information, and confidence intervals for linear regression.

All the data you need is provided within the course, all assignments are designed to be done in MS Excel, and you will learn enough Excel to complete all assignments. The course will give you enough practice with Excel to become fluent in its most commonly used business functions, and you’ll be ready to learn any other Excel functionality you might need in the future (module 1). The course does not cover Visual Basic or Pivot Tables and you will not need them to complete the assignments.

All advanced concepts are demonstrated in individual Excel spreadsheet templates that you can use to answer relevant questions. You will emerge with substantial vocabulary and practical knowledge of how to apply business data analysis methods based on binary classification (module 2), information theory and entropy measures (module 3), and linear regression (modules 4 and 5), all using no software tools more complex than Excel.


Created by: Duke University

Taught by:
  • Jana Schaich Borg, Assistant Research Professor, Social Science Research Institute
  • Daniel Egger, Executive in Residence and Director, Center for Quantitative Modeling, Pratt School of Engineering, Duke University

Basic Info
Course 2 of 5 in the Excel to MySQL: Analytic Techniques for Business Specialization
Commitment: 6 weeks, 8-10 hours per week
Language: English
How To Pass: Pass all graded assignments to complete the course.
User Ratings: 4.2 stars (average user rating)
Syllabus
WEEK 1
About This Course
2 videos, 3 readings
  1. Video: About This Specialization
  2. Reading: Specialization Overview
  3. Reading: Course Overview
  4. Video: Introduction to Mastering Data Analysis in Excel
  5. Reading: Feedback Survey Information
Excel Essentials for Beginners
In this module, you will explore the essential Excel skills needed to address typical business situations you may encounter in the future. The Excel vocabulary and functions taught throughout this module make it possible for you to understand the additional explanatory Excel spreadsheets that accompany later videos in this course.
8 videos, 2 readings, 1 practice quiz
  1. Reading: Tips for Success
  2. Video: Introduction to Using Excel in this Course
  3. Video: Basic Excel Vocabulary; Intro to Charting
  4. Video: Arithmetic in Excel
  5. Video: Functions on Individual Cells
  6. Video: Functions on a Set of Numbers
  7. Video: Functions on Ordered Pairs of Data
  8. Video: Sorting Data in Excel
  9. Video: Introduction to the Solver Plug-in
  10. Practice Quiz: Excel Essentials Practice
  11. Reading: Feedback Survey
Graded: Excel Essentials
WEEK 2
Binary Classification
Separating collections into two categories, such as “buy this stock, don’t buy that stock” or “target this customer with a special offer, but not that one,” is the ultimate goal of most business data-analysis projects. There is a specialized vocabulary of measures for comparing and optimizing the performance of the algorithms used to classify collections into two groups. You will learn how and why to apply these different metrics, including how to calculate the all-important AUC: the area under the Receiver Operating Characteristic (ROC) Curve.
6 videos, 2 readings, 1 practice quiz
  1. Reading: Tips for Success
  2. Video: Introduction to Binary Classification
  3. Video: Bombers and Seagulls: Confusion Matrix
  4. Video: Costs Determine Optimal Threshold
  5. Video: Calculating Positive and Negative Predictive Values
  6. Video: How to Calculate the Area Under the ROC Curve
  7. Video: Binary Classification with More than One Input Variable
  8. Practice Quiz: Binary Classification (practice)
  9. Reading: Feedback Survey
Graded: Binary Classification (graded)
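
As a rough illustration of the metrics named in this module, the sketch below counts a confusion matrix at one threshold and approximates the area under the ROC curve with the trapezoid rule. It is written in Python rather than Excel, and the function names and sample scores are hypothetical; the course's own spreadsheets may organize the calculation differently.

```python
def confusion_counts(scores, labels, threshold):
    """Return (TP, FP, TN, FN), classifying score >= threshold as positive."""
    tp = fp = tn = fn = 0
    for score, label in zip(scores, labels):
        predicted_positive = score >= threshold
        if predicted_positive and label == 1:
            tp += 1
        elif predicted_positive and label == 0:
            fp += 1
        elif label == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def roc_auc(scores, labels):
    """Trapezoid-rule area under the ROC curve.

    Assumes labels are 0/1 and that both classes are present.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]  # (false positive rate, true positive rate)
    for threshold in sorted(set(scores), reverse=True):
        tp, fp, _, _ = confusion_counts(scores, labels, threshold)
        points.append((fp / negatives, tp / positives))
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Hypothetical model scores and true outcomes (1 = positive class).
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3]
labels = [1, 1, 0, 1, 0, 0]

tp, fp, tn, fn = confusion_counts(scores, labels, threshold=0.6)
ppv = tp / (tp + fp)  # positive predictive value
npv = tn / (tn + fn)  # negative predictive value
print(tp, fp, tn, fn, ppv, npv, roc_auc(scores, labels))
```

With these six sample points the threshold 0.6 yields three true positives and one false positive, and the AUC works out to 8/9, which equals the fraction of positive/negative pairs that the scores rank correctly.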
WEEK 3
Information Measures
In this module, you will learn how to calculate and apply the vitally useful uncertainty metric known as “entropy.” In contrast to the more familiar “probability” that represents the uncertainty that a single outcome will occur, “entropy” quantifies the aggregate uncertainty of all possible outcomes. The entropy measure provides the framework for accountability in data-analytic work. Entropy gives you the power to quantify the uncertainty of future outcomes relevant to your business twice: using the best-available estimates before you begin a project, and then again after you have built a predictive model. The difference between the two measures is the Information Gain contributed by your work.
7 videos, 2 readings, 1 practice quiz
  1. Reading: Tips for Success
  2. Video: Quantifying the Informational Edge
  3. Video: Probability and Entropy
  4. Video: Entropy of a Guessing Game
  5. Video: Dependence and Mutual Information
  6. Practice Quiz: Using the Information Gain Calculator Spreadsheet (practice)
  7. Video: The Monty Hall Problem
  8. Video: Learning from One Coin Toss, Part 1
  9. Video: Learning From One Coin Toss, Part 2
  10. Reading: Feedback Survey
Graded: Information Measures (graded)
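
The entropy and information gain described in this module can be sketched in a few lines. The version below is an assumption about the standard definitions (Shannon entropy in bits, and mutual information computed from a joint probability table); the function names are illustrative and the course's Information Gain Calculator spreadsheet may lay the calculation out differently.

```python
import math


def entropy(probabilities):
    """Shannon entropy in bits; zero-probability outcomes contribute nothing."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


def information_gain(joint):
    """Mutual information I(X;Y) = H(X) + H(Y) - H(X,Y).

    `joint` is a table of joint probabilities (rows = X, columns = Y)
    whose entries sum to 1.
    """
    row_marginals = [sum(row) for row in joint]
    column_marginals = [sum(column) for column in zip(*joint)]
    cells = [p for row in joint for p in row]
    return entropy(row_marginals) + entropy(column_marginals) - entropy(cells)


# A fair coin carries one bit of uncertainty.
print(entropy([0.5, 0.5]))

# A model whose prediction is independent of the outcome gains nothing;
# a model that always matches the outcome removes the full bit.
print(information_gain([[0.25, 0.25], [0.25, 0.25]]))
print(information_gain([[0.5, 0.0], [0.0, 0.5]]))
```

Read this way, the information gain is the entropy of the outcome before the model minus the entropy that remains once the model's prediction is known.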
WEEK 4
Linear Regression
The Linear Correlation measure is a much richer metric for evaluating associations than is commonly realized. You can use it to quantify how much a linear model reduces uncertainty. When used to forecast future outcomes, it can be converted into a “point estimate” plus a “confidence interval,” or converted into an information gain measure. You will develop a fluent knowledge of these concepts and the many valuable uses to which linear regression is put in business data analysis. This module also teaches how to use the Central Limit Theorem (CLT) to solve practical problems. The two topics are closely related because regression and the CLT both make use of a special family of probability distributions called “Gaussians.” You will learn everything you need to know to work with Gaussians in these and other contexts.
11 videos, 2 readings, 2 practice quizzes
  1. Reading: Tips for Success
  2. Video: Introducing the Gaussian
  3. Video: Introduction to Standardization
  4. Video: Standard Normal Probability Distribution in Excel
  5. Video: Calculating Probabilities from Z-scores
  6. Video: Central Limit Theorem
  7. Video: Algebra with Gaussians
  8. Video: Markowitz Portfolio Optimization
  9. Practice Quiz: The Gaussian (practice)
  10. Video: Standardizing x and y Coordinates for Linear Regression
  11. Video: Standardization Simplifies Linear Regression
  12. Video: Modeling Error in Linear Regression
  13. Video: Information Gain from Linear Regression
  14. Practice Quiz: Regression Models and PIG (practice)
  15. Reading: Feedback Survey
Graded: Parametric Models for Regression (graded)
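
To make the link between correlation, standardization, and a forecast concrete, here is a minimal Python sketch. It assumes population standard deviations, Gaussian errors, and an interval that ignores uncertainty in the fitted parameters, so it is a simplification rather than the course's exact method; all function names are hypothetical.

```python
import math
import statistics


def standardize(values):
    """Convert values to z-scores (assumes the values are not all equal)."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return [(v - mean) / sd for v in values]


def correlation(xs, ys):
    """Pearson correlation: the mean product of paired z-scores."""
    zx, zy = standardize(xs), standardize(ys)
    return sum(a * b for a, b in zip(zx, zy)) / len(xs)


def standard_normal_cdf(z):
    """P(Z <= z) for a standard Gaussian, comparable to Excel's NORM.S.DIST."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))


def predict(xs, ys, x_new, z_critical=1.96):
    """Return (point estimate, low, high) for y at x_new.

    In standardized units the regression line is z_y = r * z_x, and the
    residual standard deviation is sqrt(1 - r^2). z_critical = 1.96 gives
    an approximate 95% interval under the Gaussian-error assumption.
    """
    r = correlation(xs, ys)
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    sd_x, sd_y = statistics.pstdev(xs), statistics.pstdev(ys)
    point = mean_y + r * (x_new - mean_x) / sd_x * sd_y
    # max() guards against a tiny negative value from rounding when |r| is 1.
    residual_sd = sd_y * math.sqrt(max(0.0, 1 - r * r))
    return point, point - z_critical * residual_sd, point + z_critical * residual_sd


print(correlation([1, 2, 3, 4], [1, 3, 2, 4]))
print(predict([1, 2, 3, 4], [1, 3, 2, 4], 5))
```

The closer the correlation is to 1 or -1, the smaller the residual standard deviation and the narrower the interval around the point estimate.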
WEEK 5
Additional Skills for Model Building
This module gives you additional valuable concepts and skills related to building high-quality models. As you know, a “model” is a description of a process applied to available data (inputs) that produces an estimate of a future and as yet unknown outcome as output. Very often, models for outputs take the form of a probability distribution. This module covers how to estimate probability distributions from data (a “probability histogram”), and how to describe and generate the most useful probability distributions used by data scientists. It also covers in detail how to develop a binary classification model with parameters optimized to maximize the AUC, and how to apply linear regression models when your input consists of multiple types of data for each event. The module concludes with an explanation of “over-fitting,” which is the main reason that apparently good predictive models often fail in real-life business settings, and with some tips for how you can avoid over-fitting in your own predictive model for the final project, and in real life.
4 videos, 2 readings
  1. Video: Describing Histograms and Probability Distributions Functions
  2. Video: Some Important and Frequently Encountered PDFs
  3. Reading: AUC Calculator Explanation and Spreadsheet
  4. Video: Linear Regression with More than One Input Variable
  5. Video: Understanding Why Over-fitting Happens
  6. Reading: Feedback Survey
Graded: Probability, AUC, and Excel Linest Function
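
The “probability histogram” mentioned in this module can be illustrated with a short Python sketch: count the observations in each bin and divide by the total. The function name, bin edges, and sample data are hypothetical, and the binning convention (left-closed bins, with the last bin also including its right edge) is an assumption rather than something the course specifies.

```python
def probability_histogram(values, bin_edges):
    """Estimate the probability of each bin from observed data.

    Bins are [edge_i, edge_{i+1}); the last bin also includes its right
    edge. Values outside the edges are ignored.
    """
    counts = [0] * (len(bin_edges) - 1)
    last = len(counts) - 1
    for value in values:
        for i in range(len(counts)):
            low, high = bin_edges[i], bin_edges[i + 1]
            if low <= value < high or (i == last and value == high):
                counts[i] += 1
                break
    total = sum(counts)
    if total == 0:
        raise ValueError("no values fall inside the bin edges")
    return [count / total for count in counts]


# Hypothetical observations sorted into three bins: [0, 2), [2, 4), [4, 8].
print(probability_histogram([1, 2, 2, 3, 5, 8], [0, 2, 4, 8]))
```

Because the counts are divided by their total, the bin probabilities always sum to 1, which is what allows the histogram to stand in for a probability distribution.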
WEEK 6
Final Course Project
The final course project is a comprehensive assessment covering all of the course material, and consists of four quizzes and a peer review assignment. For quiz one and quiz two, there are learning points that explain components of the quiz. These learning points will unlock only after you complete the quiz with a passing grade. Before you start, please read through the final project instructions. Based on past students' experience, the final project, including all the quizzes and the peer assessment, takes 10-12 hours.
2 videos, 4 readings
  1. Reading: Final Project Information
  2. Video: Final Project Information: Part 1
  3. Reading: Summary of Learning Points for Final Project: Quiz 1
  4. Video: Final Project Information: Part 2
  5. Reading: Summary of Learning Points for Final Project: Quiz 2
  6. Reading: Feedback Survey
Graded: Part 1: Building your Own Binary Classification Model
Graded: Part 2: Should the Bank Buy Third-Party Credit Information?
Graded: Part 3: Comparing the Information Gain of Alternative Data and Models
Graded: Part 4: Modeling Profitability Instead of Default
Graded: Part 5: Modeling Credit Card Default Risk and Customer Profitability
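
The idea that the choice of business metric drives the choice of model can be shown with a toy example. The Python sketch below scores every candidate cut-off on a risk score two ways, by number of classification errors and by profit, and reports the best cut-off under each. The applicants, the profit per good customer, and the loss per default are all invented for illustration and are not the final project's data.

```python
def evaluate(scores, defaulted, threshold, profit_good, loss_bad):
    """Accept applicants with risk score below threshold; return (errors, profit)."""
    errors = 0
    profit = 0.0
    for score, did_default in zip(scores, defaulted):
        accepted = score < threshold
        if accepted and did_default:
            errors += 1            # accepted an applicant who defaulted
            profit -= loss_bad
        elif accepted:
            profit += profit_good  # accepted an applicant who repaid
        elif not did_default:
            errors += 1            # turned away an applicant who would have repaid
    return errors, profit


def pick_thresholds(scores, defaulted, profit_good, loss_bad):
    """Return (threshold with fewest errors, threshold with highest profit)."""
    candidates = sorted(set(scores)) + [max(scores) + 1.0]  # last one accepts everyone
    results = {
        t: evaluate(scores, defaulted, t, profit_good, loss_bad) for t in candidates
    }
    fewest_errors = min(candidates, key=lambda t: results[t][0])
    most_profit = max(candidates, key=lambda t: results[t][1])
    return fewest_errors, most_profit


# Hypothetical applicants: predicted risk score and whether they defaulted.
scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7]
defaulted = [0, 0, 1, 0, 0, 1, 1]

print(pick_thresholds(scores, defaulted, profit_good=100, loss_bad=500))
```

With these made-up numbers, a cut-off of 0.6 makes the fewest classification errors, while a stricter cut-off of 0.3 earns the most profit because one default costs as much as five good customers earn.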

FAQs
How It Works

Coursework
Each course is like an interactive textbook, featuring pre-recorded videos, quizzes and projects.

Help from Your Peers
Connect with thousands of other learners and debate ideas, discuss course material, and get help mastering concepts.

Certificates
Earn official recognition for your work, and share your success with friends, colleagues, and employers.

Creators
Duke University
Duke University has about 13,000 undergraduate and graduate students and a world-class faculty helping to expand the frontiers of knowledge. The university has a strong commitment to applying knowledge in service to society, both near its North Carolina campus and around the world.
Pricing
Purchase Course
  • Access to course materials: Available
  • Access to graded materials: Available
  • Receive a final grade: Available
  • Earn a shareable Course Certificate: Available

Ratings and Reviews
Rated 4.2 out of 5 (2,458 ratings)
Junior Ludger

This course is worth taking by anyone who needs to have a better understanding of how the most efficient and best decisions in the market place are taken using Data management simply in Excel.

CR

Really enjoyed this course - thank you Mr Egger !!

RS

I learned a ton, but the course was TOUGH! You really have to hunker down, pay close attention, and take notes...sometimes even re-watch lectures more than once. I am glad I took the course. The only reason I didn't give it 5 stars is that the lectures can be tough to follow, and the quizzes and assignments have missing or misplaced information that make it difficult to get to the right answer. It takes some serious patience, reading of the supplemental material, and hints from others on the discussion forums.

CD

hope there can be clear answers for each assignment in the future



You May Also Like
  • Operations Analytics (University of Pennsylvania)
  • Business Metrics for Data-Driven Companies (Duke University)
  • Customer Analytics (University of Pennsylvania)
  • Accounting Analytics (University of Pennsylvania)
  • Data Visualization and Communication with Tableau (Duke University)
© 2018 Coursera Inc. All rights reserved.