Machine learning is the foundation for today’s insights on customers, products,
costs and revenues. It learns from the data provided to its algorithms.
Some of the most common examples of machine learning are Netflix’s algorithms,
which suggest movies based on those you have watched in the past, and Amazon’s
algorithms, which recommend products based on what other customers have bought
before.
The choice of algorithm or model can broadly be decided by the following
questions:
· How much data do you have, and is it continuous?
· Is it a classification or a regression problem?
· Are the variables predefined (labeled), unlabeled or a mix?
· Are the data classes skewed?
· What is the goal: to predict or to rank?
· Should the result be easy or hard to interpret?
Here are the most used algorithms for various business problems:
Decision Trees: Decision tree output is very easy to understand, even for
people from a non-analytical background. No statistical knowledge is required
to read and interpret a tree. Decision trees are among the fastest ways to
identify the most significant variables and the relationships between two or
more variables, and they are excellent tools for helping you choose between
several courses of action. The most popular decision tree algorithms include
CART, CHAID and C4.5.
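To make the idea of "most significant variable" concrete, here is a minimal sketch of how a tree might pick its first split. It uses Gini impurity, the criterion CART uses for classification. The loan records, the feature names `has_job` and `owns_home`, and the helper names are all invented for illustration.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels (0.0 means perfectly pure)."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((n / total) ** 2 for n in counts.values())

def split_impurity(rows, labels, feature):
    """Weighted Gini impurity after splitting on a categorical feature."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[feature], []).append(label)
    total = len(labels)
    return sum(len(group) / total * gini(group) for group in groups.values())

def best_feature(rows, labels):
    """Feature whose split leaves the purest groups."""
    return min(rows[0].keys(), key=lambda f: split_impurity(rows, labels, f))

# Hypothetical loan records, invented for illustration only.
rows = [
    {"has_job": "yes", "owns_home": "yes"},
    {"has_job": "yes", "owns_home": "no"},
    {"has_job": "no", "owns_home": "yes"},
    {"has_job": "no", "owns_home": "no"},
    {"has_job": "yes", "owns_home": "no"},
    {"has_job": "no", "owns_home": "yes"},
]
labels = ["repaid", "repaid", "default", "default", "repaid", "default"]

print(best_feature(rows, labels))  # the feature a tree would split on first
```

In this toy data `has_job` separates the two outcomes perfectly, so it is chosen as the first split. A full tree would repeat the same search inside each resulting group.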
In general, decision trees can be used in real-world applications such as:
· Investment decisions
· Identifying bank loan defaulters
· Build vs. buy decisions
· Company merger decisions
· Sales lead qualification
Logistic Regression: Logistic regression is a powerful statistical way of
modeling a binary outcome with one or more explanatory variables. It measures
the relationship between the categorical dependent variable and one or more
independent variables by estimating probabilities using a logistic function,
which is the cumulative distribution function of the logistic distribution.
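As a rough illustration of how the logistic function turns a weighted sum of variables into a probability, consider the sketch below. The coefficients and the two churn features (`support_calls`, `months_as_customer`) are made up. In practice they would be estimated from data.

```python
import math

def logistic(z):
    """Logistic (sigmoid) function: maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_probability(features, weights, bias):
    """Probability of the positive class for one observation."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return logistic(z)

# Hypothetical coefficients for a churn model, invented for illustration.
weights = [0.8, -0.05]  # [support_calls, months_as_customer]
bias = -1.0

# A customer with 3 support calls and 12 months of tenure.
p = predict_probability([3, 12], weights, bias)
print(round(p, 2))  # estimated probability of churn
```

Here the weighted sum is 0.8, which the logistic function maps to a churn probability of roughly 0.69.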
In general, logistic regression can be used in real-world applications such as:
· Predicting customer churn
· Measuring the effectiveness of marketing campaigns
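Support Vector Machines: Support Vector Machine (SVM) is a supervised machine
learning technique that is widely used in pattern recognition and
classification problems. In its basic form it separates data with exactly two
classes; problems with more classes are usually handled by combining several
two-class SVMs.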
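The two-class case can be sketched with a linear SVM trained by sub-gradient descent on the hinge loss. This is a simplified teaching version, not how production libraries solve the problem. The data points are invented and assumed to be already scaled to roughly [-1, 1], and the names `train_linear_svm` and `predict` are hypothetical.

```python
def train_linear_svm(points, labels, epochs=200, learning_rate=0.1, reg=0.01):
    """Fit weights w and bias b by sub-gradient descent on the regularised
    hinge loss. Labels must be +1 or -1."""
    w = [0.0] * len(points[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(points, labels):
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            # Regularisation shrinks the weights a little on every step.
            w = [wi - learning_rate * reg * wi for wi in w]
            if margin < 1:
                # The point is misclassified or inside the margin:
                # move the boundary away from it.
                w = [wi + learning_rate * y * xi for wi, xi in zip(w, x)]
                b += learning_rate * y
    return w, b

def predict(w, b, x):
    """Class (+1 or -1) according to the side of the boundary x falls on."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else -1

# Hypothetical, already-scaled measurements for two classes.
points = [(-1.0, -1.0), (-0.8, -0.6), (-0.6, -1.0),
          (1.0, 1.0), (0.8, 0.6), (0.6, 1.0)]
labels = [-1, -1, -1, 1, 1, 1]

w, b = train_linear_svm(points, labels)
print([predict(w, b, x) for x in points])
```

The trained boundary puts the three negative points on one side and the three positive points on the other. Real data is rarely this cleanly separable, which is where the margin and regularisation settings start to matter.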
In general, SVM can be used in real-world applications such as:
· Detecting people with common diseases such as diabetes
· Handwritten character recognition
· Text categorization, such as sorting news articles by topic
· Stock market price prediction
Naive Bayes: It is a classification technique based on Bayes’ theorem. It is
very easy to build and particularly useful for very large data sets. Despite
its simplicity, Naive Bayes can sometimes match or outperform much more
sophisticated classification methods. It is also a good choice when CPU and
memory resources are a limiting factor.
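Part of the reason Naive Bayes is cheap to build is that training amounts to counting. The sketch below is a small multinomial Naive Bayes text classifier with add-one smoothing. The four training messages and the labels `spam` and `ham` are invented for illustration.

```python
import math
from collections import Counter, defaultdict

def train_naive_bayes(documents):
    """documents: list of (text, label). Returns the counts the classifier needs."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for text, label in documents:
        class_counts[label] += 1
        for word in text.lower().split():
            word_counts[label][word] += 1
            vocabulary.add(word)
    return class_counts, word_counts, vocabulary

def classify(text, model):
    """Return the label with the highest posterior score for the text."""
    class_counts, word_counts, vocabulary = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, doc_count in class_counts.items():
        # log P(class) + sum of log P(word | class), with add-one smoothing.
        score = math.log(doc_count / total_docs)
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            if word not in vocabulary:
                continue  # ignore words never seen in training
            count = word_counts[label][word] + 1
            score += math.log(count / (total_words + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical training messages, invented for illustration.
training = [
    ("win a free prize now", "spam"),
    ("free money claim your prize", "spam"),
    ("meeting agenda for monday", "ham"),
    ("please review the project report", "ham"),
]
model = train_naive_bayes(training)

print(classify("claim your free prize", model))
print(classify("project meeting on monday", model))
```

The first message is scored as spam and the second as ham. The "naive" part is the assumption that words occur independently of each other given the class, which is what lets the per-word probabilities simply be multiplied (added, in log space).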
In general, Naive Bayes can be used in real-world applications such as:
· Sentiment analysis and text classification
· Recommendation systems like those of Netflix and Amazon
· Marking an email as spam or not spam
· Face recognition, as on Facebook
Apriori: This algorithm generates association rules from a given data set. An
association rule states that if an item A occurs, then item B also occurs with
a certain probability.
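The "certain probability" of a rule is usually called its confidence, and Apriori only considers itemsets that appear often enough (their support). Below is a compact sketch of the level-wise search on five invented shopping baskets. The function names and the 0.6 support threshold are assumptions chosen for the example.

```python
from itertools import combinations

def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def frequent_itemsets(transactions, min_support):
    """Level-wise Apriori search: grow itemsets one item at a time,
    keeping only those that meet the minimum support."""
    items = sorted({item for t in transactions for item in t})
    current = [frozenset([i]) for i in items
               if support([i], transactions) >= min_support]
    frequent = {}
    while current:
        for itemset in current:
            frequent[itemset] = support(itemset, transactions)
        # Candidate (k+1)-itemsets are unions of frequent k-itemsets.
        size = len(current[0]) + 1
        candidates = {a | b for a, b in combinations(current, 2)
                      if len(a | b) == size}
        # Apriori pruning: every k-subset of a candidate must be frequent.
        current = [
            c for c in candidates
            if all(frozenset(s) in frequent for s in combinations(c, size - 1))
            and support(c, transactions) >= min_support
        ]
    return frequent

def confidence(antecedent, consequent, transactions):
    """Estimated probability of the consequent given the antecedent."""
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support(antecedent, transactions)

# Hypothetical shopping baskets, invented for illustration.
transactions = [
    {"bread", "milk"},
    {"bread", "diapers", "beer"},
    {"milk", "diapers", "beer"},
    {"bread", "milk", "diapers"},
    {"bread", "milk", "beer"},
]

frequent = frequent_itemsets(transactions, min_support=0.6)
print(sorted(sorted(s) for s in frequent))
print(confidence(["bread"], ["milk"], transactions))
```

With these baskets, the only frequent pair is bread and milk, and the rule "bread implies milk" holds in three of the four baskets that contain bread, a confidence of about 0.75.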
In general, Apriori can be used in real-world applications such as:
· Market basket analysis, as on Amazon: products purchased together
· Autocomplete functionality, as on Google, suggesting words that often occur
together
· Identifying drugs and their effects on patients
Random Forest: A random forest is an ensemble of decision trees. It can solve
both regression and classification problems with large data sets. It also
helps identify the most significant variables from thousands of input
variables.
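The ensemble idea can be sketched in a heavily simplified form: each "tree" below is a single-split stump trained on a bootstrap sample and one randomly chosen feature, and the forest predicts by majority vote. Real random forests grow much deeper trees and pick a random subset of features at every split. The data, labels and function names here are invented for illustration.

```python
import random
from collections import Counter

def majority(labels):
    """Most common label in a list."""
    return Counter(labels).most_common(1)[0][0]

def gini(labels):
    """Gini impurity of a non-empty list of labels."""
    total = len(labels)
    return 1.0 - sum((n / total) ** 2 for n in Counter(labels).values())

def fit_stump(rows, labels, feature):
    """One-split decision tree on a single numeric feature."""
    values = sorted({row[feature] for row in rows})
    fallback = majority(labels)
    best = (float("inf"), values[0], fallback, fallback)
    for low, high in zip(values, values[1:]):
        threshold = (low + high) / 2
        left = [l for r, l in zip(rows, labels) if r[feature] <= threshold]
        right = [l for r, l in zip(rows, labels) if r[feature] > threshold]
        impurity = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if impurity < best[0]:
            best = (impurity, threshold, majority(left), majority(right))
    _, threshold, left_label, right_label = best
    return feature, threshold, left_label, right_label

def fit_forest(rows, labels, n_trees=25, seed=0):
    """Train n_trees stumps, each on its own bootstrap sample."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        # Bootstrap sample: draw rows with replacement.
        picks = [rng.randrange(len(rows)) for _ in rows]
        sample_rows = [rows[i] for i in picks]
        sample_labels = [labels[i] for i in picks]
        # Simplification: each tree sees one randomly chosen feature.
        feature = rng.randrange(len(rows[0]))
        forest.append(fit_stump(sample_rows, sample_labels, feature))
    return forest

def predict(forest, row):
    """Majority vote over all trees in the forest."""
    votes = [left if row[f] <= t else right for f, t, left, right in forest]
    return majority(votes)

# Hypothetical sensor readings for machine parts, invented for illustration.
rows = [(1, 2), (2, 1), (2, 3), (3, 2), (7, 8), (8, 7), (8, 9), (9, 8)]
labels = ["low", "low", "low", "low", "high", "high", "high", "high"]

forest = fit_forest(rows, labels)
print(predict(forest, (1, 1)), predict(forest, (9, 9)))
```

Because each tree sees a slightly different sample and feature, individual trees can be wrong while the vote as a whole stays stable, which is the main appeal of the ensemble.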
In general, Random Forest can be used in real-world applications such as:
· Predicting which patients are at high risk
· Predicting part failures in manufacturing
· Predicting loan defaulters
The most powerful form of machine learning in use today is called “Deep
Learning”.
In today’s Digital Transformation age,
most businesses will tap into machine learning algorithms for their operational
and customer-facing functions.