Hello. I am standing outside the new Electrical and Computer Engineering Building
at the University of Illinois.
This building opened in the fall of 2014 and
provides space for both high-tech learning and collaboration.
The building has approximately 230,000 square feet of space
devoted to classrooms,
offices, and collaborative laboratories.
This building is designed to be one of the largest net-zero energy buildings
in the country and has many features that enable this goal,
including solar arrays on the roof,
a terracotta exterior, and sunshades.
This module introduces several fundamental machine learning algorithms.
Thus, it is appropriate to be outside the ECE building,
which is a marvel of modern engineering and where faculty and
students develop new hardware and software approaches to improve machine learning.
These basic algorithms are important to learn for several reasons.
First, learning these basic techniques
will help you strengthen your data analytics intuition,
which will guide you in future analysis projects.
Second, even if a more powerful machine learning technique is ultimately used,
these fundamental algorithms are often quick and easy to run, and they
thus provide useful benchmarks against which
other, more complex techniques can be compared.
Third, more powerful algorithms are often based on these fundamental techniques.
Thus, learning these algorithms is often
the first step in learning more advanced techniques.
This module will begin by exploring
examples of machine learning being used in business and accountancy.
Next, you will learn about several fundamental algorithms,
the first three of which are logistic regression,
decision trees, and support vector machines.
The first of these algorithms,
logistic regression, is, despite its name,
actually an algorithm for performing classification.
This technique employs a linear equation to
predict which of two classes an instance belongs to.
As a result, this technique is popular, since
the linear equation makes the model easy to understand and explain.
The actual prediction is based on the logistic function, which
maps the entire real number line into the range zero to one.
Thus, we have a mapping between the result of
the linear equation and the probability of being in class one or class two.
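To make this concrete, here is a minimal sketch in Python using scikit-learn; the synthetic data and the sample point 0.5 are assumptions for illustration, not values from this lesson.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Illustrative synthetic data (an assumption for this sketch):
# one feature, two well-separated classes.
rng = np.random.default_rng(0)
X = np.concatenate([rng.normal(-2, 1, 50), rng.normal(2, 1, 50)]).reshape(-1, 1)
y = np.array([0] * 50 + [1] * 50)

model = LogisticRegression().fit(X, y)

# The linear equation z = w*x + b is passed through the logistic
# (sigmoid) function 1 / (1 + exp(-z)), which maps the real line
# into (0, 1) and is read as the probability of class one.
z = model.coef_[0][0] * 0.5 + model.intercept_[0]
print(1.0 / (1.0 + np.exp(-z)))            # manual sigmoid
print(model.predict_proba([[0.5]])[0, 1])  # same value from sklearn
```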
This lesson will also introduce different classification metrics,
including the concepts of true positive,
true negative, false positive and false negative.
Many popular performance metrics are based on these four values,
including precision,
recall, sensitivity, and type one and type two errors.
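As a brief sketch of how these four values yield metrics, again in Python with scikit-learn (the label arrays below are made up for illustration):

```python
from sklearn.metrics import confusion_matrix

# Illustrative labels only (an assumption for this sketch).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

# For binary labels, confusion_matrix returns [[TN, FP], [FN, TP]].
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()

precision = tp / (tp + fp)  # of predicted positives, how many were correct
recall = tp / (tp + fn)     # sensitivity: of actual positives, how many found
print(tn, fp, fn, tp, precision, recall)
```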
The second algorithm, the decision tree,
recursively subdivides a data set by splitting along different features, one at a time.
This process starts with a single root node that represents the entire set of data.
As the data are subdivided into two child populations,
new nodes are constructed.
This process continues, constructing
an inverted tree that eventually terminates in a set of leaf nodes.
A leaf node is created either when a predefined stopping criterion is reached,
for example the maximum tree depth, or when all instances in a node are roughly equivalent.
When splitting a node, a feature is chosen that optimally splits the data.
This feature, and the value at which to split,
can be chosen in different ways.
For example, for discrete variables we may want to
maximize the information gain, while for continuous features,
we may want to maximize the reduction in variance.
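For instance, a minimal sketch of computing information gain for one candidate split might look like this; the labels and the split point are hypothetical:

```python
import numpy as np

def entropy(labels):
    """Shannon entropy of a label array, in bits."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return -(p * np.log2(p)).sum()

# Illustrative parent population and one hypothetical candidate split.
parent = np.array([0, 0, 0, 0, 1, 1, 1, 1])
left, right = parent[:6], parent[6:]

# Information gain = parent entropy minus the size-weighted average
# entropy of the two child populations; the split maximizing it wins.
gain = entropy(parent) - (len(left) / len(parent) * entropy(left)
                          + len(right) / len(parent) * entropy(right))
print(gain)
```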
Decision trees are also easily understood and explained,
since the final classification of each new instance can be determined by
tracing the decisions made from the root of
the tree down to the deciding leaf node.
Finally, decision trees are flexible and
can be used for both classification and regression.
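As a usage illustration, scikit-learn's decision tree could be fit and its rules printed as follows; the iris data set and the max_depth stopping criterion are assumptions for the sketch:

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

# max_depth=2 is an assumed stopping criterion, chosen only to keep
# the printed tree small; the iris data set is likewise illustrative.
X, y = load_iris(return_X_y=True)
tree = DecisionTreeClassifier(max_depth=2).fit(X, y)

# export_text prints the root-to-leaf decision rules, which is what
# makes the model easy to understand and explain.
print(export_text(tree))
```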
The third algorithm introduced in this module
is the support vector machine, or SVM.
Like the decision tree,
this algorithm can also be used for classification and regression,
but the SVM does not generate an easily explainable model.
In classification, SVM is called support vector
classification and works by determining hyperplanes that optimally subdivide the data.
For regression, SVM is known as
support vector regression, and
the hyperplanes are constructed to model the feature distribution.
In either case, the construction of these hyperplanes is flexible, allowing,
for example, some misclassifications,
as long as the constructed model generalizes appropriately.
This makes SVMs an accurate algorithm that is easy to deploy on large, complex data.
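A minimal sketch of both variants in scikit-learn might look like this; the data set and parameter values are assumptions, not part of this lesson:

```python
from sklearn.datasets import make_classification
from sklearn.svm import SVC, SVR

# Illustrative data; the parameter values are assumptions for the sketch.
X, y = make_classification(n_samples=100, n_features=4, random_state=0)

# Support vector classification: C controls the soft margin, i.e. how
# many misclassifications the hyperplane construction will tolerate
# in exchange for a model that generalizes better.
clf = SVC(kernel="linear", C=1.0).fit(X, y)
print(clf.score(X, y))

# The regression variant fits hyperplanes to the data instead.
reg = SVR(kernel="linear", C=1.0)
```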
These three algorithms can either be used directly or
as part of a larger framework in production machine learning systems.
In this module, you will learn
these fundamental algorithms and thus be able to
understand real-world machine learning applications.
I hope you're excited to learn all about them. Good luck.