  • Instructor
    Rahul Kumar
  • Category
    Python
  • Course Fees
    Quotation on request

Supervised and Unsupervised Learning using Python 4 Days

DATA SCIENCE USING PYTHON

GOOGLE'S SELF-DRIVING CARS AND ROBOTS GET A LOT OF PRESS, BUT THE COMPANY'S REAL FUTURE IS IN MACHINE LEARNING, THE TECHNOLOGY THAT ENABLES COMPUTERS TO GET SMARTER AND MORE PERSONAL. – ERIC SCHMIDT (GOOGLE CHAIRMAN)

This course is intended to give a holistic understanding of statistical and machine learning and their application using Python. The workshop will cover:

  • An introduction to business analytics.
  • An introduction to Python for data analysis.
  • An introduction to supervised machine learning algorithms
  • An introduction to unsupervised machine learning algorithms
  • Understanding the core of machine learning – Gradient Descent Algorithm
  • Understanding various sampling strategies and their efficacy in the learning process
  • An introduction to ensemble methods for handling imbalanced data
  • Introduction to text analytics using Python
  • Hands-on practice using Python code on real-life datasets

OBJECTIVE

We are living in an era in which computing has moved from mainframes to personal computers to the cloud. Along the way, we have started generating enormous amounts of data. The manifold increase in computing power has also advanced the application of algorithms that extract insights from this data. In this course, you will learn the nuances of building supervised and unsupervised machine learning models on real-life datasets. We will introduce you to the Python platform and to some of the statistical and machine learning algorithms that come in handy when solving challenging problems.

By the end of the course, you will have a clear understanding of why machine learning algorithms are needed and of the contexts in which to apply them to solve complex business problems.

Who Should Attend

Irrespective of the type of industry (retail, e-commerce, manufacturing, real estate & construction, telecom, hospitality, banking, healthcare, IT, supply chain & logistics, etc.), data forms the crux of decision making. This course is designed to hone the analytical skills and business acumen of mid-level and senior-level corporate professionals who want to understand the nuances of data science, and to help them use machine learning techniques as an efficient way to generate insights for customers, which in turn improves the bottom line of organizations.

Hardware And Software

  • 1. Participants should bring their own laptop (preferably with Windows 7 or higher, or Mac OS, installed).
  • 2. Operating system (any of the following):
    • Mac OS X with XQuartz
    • Windows (version XP or later)
  • 3. A minimum of 8 GB RAM is advisable.

INSTALLATIONS:

  • For Windows, go to https://docs.anaconda.com/anaconda/install/windows.html
  • For MacOS, go to https://docs.anaconda.com/anaconda/install/mac-os
  • For Linux, go to https://docs.anaconda.com/anaconda/install/linux

PRE-REQUISITE & COURSE DELIVERABLE:

  • 1. Participants should have basic programming skills. Participants are expected to spend time with the code set as a home assignment to leverage the classroom training hours to the fullest.
  • 2. A high-speed internet connection will be provided at the training venue.
  • 3. Deliverable: Python code and dataset, plus a soft copy (PDF file) of the content covered.

More about Anaconda can be found at https://docs.anaconda.com. Participants are expected to resolve any software installation issues before the session begins.

Day 1: Understanding Anaconda Framework platform and other useful packages in Python

Session 1–Introduction to Business Analytics

  • What is Business Analytics
  • Why is it needed and how industries are adopting it
  • Different components of analytics
  • Applications of analytics in different domains
  • Statistical learning vs. Machine learning
  • What is Data Science and skills of a data scientist
  • Different types of machine learning algorithms–Supervised, Unsupervised and Reinforcement learning

Session 2 & 3–Introduction to Anaconda and Python

  • Overview of Anaconda framework
  • Python – Variables, objects, loops, conditions, functions
  • Python data structures – lists, tuples, dictionaries, sets
  • Introduction to NumPy – ndarrays, ndarray indexing, ndarray data types and operations, statistical, sorting and set operations
  • Introduction to Pandas – Data ingestion, descriptive statistics, visualization, frequent data operations, merging dataframes, parsing timestamps
  • Introduction to visualization – Matplotlib
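
As a small illustration of the Python basics listed above (lists, tuples, dictionaries, sets, loops, conditions and functions), here is a minimal sketch using only the standard library. The topic names and hours are invented for illustration and are not part of the course material.

```python
# Hypothetical data, invented purely for illustration.
topics = ["regression", "clustering", "regression", "nlp"]  # list: ordered, mutable, allows duplicates
session = ("Day 1", 2)                                       # tuple: ordered, immutable
hours = {"regression": 3, "clustering": 1, "nlp": 2}         # dictionary: key -> value lookup
unique_topics = set(topics)                                  # set: unordered, no duplicates


def total_hours(topic_list, hours_by_topic):
    """Sum the hours of each distinct topic using a loop and a condition."""
    total = 0
    for topic in set(topic_list):
        if topic in hours_by_topic:
            total += hours_by_topic[topic]
    return total


day, session_number = session  # tuple unpacking
print(sorted(unique_topics))
print(total_hours(topics, hours))
```

The same operations carry over to NumPy arrays and Pandas dataframes, which add vectorised arithmetic and labelled columns on top of these built-in types.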

Session 4 & 5–Basics of Statistics

  • Random Variable – Discrete and Continuous
  • Probability density function and cumulative distribution function
  • Distribution Family – Gaussian Distribution, Standard Normal Distribution
  • Population and Sample
  • Central limit theorem
  • Demonstration of Central limit theorem on finance data
  • Degrees of freedom
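
The central limit theorem item above can be sketched with a short simulation. This is an illustrative sketch only: it uses a synthetic right-skewed population rather than the finance data mentioned in the outline, and the function name `sample_means` is a hypothetical helper.

```python
import random
import statistics


def sample_means(population, sample_size, n_samples, seed=0):
    """Draw n_samples samples (with replacement) and return the mean of each."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.choices(population, k=sample_size))
        for _ in range(n_samples)
    ]


# Synthetic, clearly non-Gaussian population (exponential, right-skewed).
rng = random.Random(42)
population = [rng.expovariate(1.0) for _ in range(10_000)]

sample_size = 50
means = sample_means(population, sample_size=sample_size, n_samples=2_000)

pop_mean = statistics.fmean(population)
pop_sd = statistics.pstdev(population)

# The sample means cluster around the population mean, and their spread
# is close to sigma / sqrt(n), as the central limit theorem predicts.
print("population mean      :", round(pop_mean, 3))
print("mean of sample means :", round(statistics.fmean(means), 3))
print("sd of sample means   :", round(statistics.stdev(means), 3))
print("sigma / sqrt(n)      :", round(pop_sd / sample_size ** 0.5, 3))
```

Plotting a histogram of `means` (for example with Matplotlib) would show the roughly bell-shaped distribution even though the population itself is skewed.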

Day 2: Understanding advanced algorithms and its implementation using Python

Session 1–Basics of Statistics (cont.)

  • Hypothesis testing – Z test, t test, test for proportion, analysis of variance (ANOVA)
  • Covariance and Correlation
  • Partial and Semi-partial Correlation
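
To make the covariance and correlation items concrete, the following sketch computes the sample covariance and the Pearson correlation coefficient from their definitions. The `ad_spend` and `sales` figures are made up for illustration.

```python
import math


def covariance(x, y):
    """Sample covariance with the n - 1 denominator."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)


def correlation(x, y):
    """Pearson correlation: covariance scaled by both standard deviations."""
    return covariance(x, y) / math.sqrt(covariance(x, x) * covariance(y, y))


# Hypothetical data, invented for illustration.
ad_spend = [1.0, 2.0, 3.0, 4.0, 5.0]
sales = [2.1, 3.9, 6.2, 7.8, 10.1]

print("covariance :", round(covariance(ad_spend, sales), 3))
print("correlation:", round(correlation(ad_spend, sales), 3))
```

Covariance depends on the units of both variables, whereas correlation is unit-free and always lies between -1 and 1, which makes it easier to compare across pairs of variables.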

Session 2, 3 & 4–Lab 1: Linear Regression

  • Introduction to simple and multiple linear regression
  • Regression diagnostics – R-squared, t-test, F-test, distribution of error terms, heteroscedasticity, identifying and handling multi-collinearity, AIC as a model selection strategy
  • Common task framework for model evaluation – training and test set.
  • Case study using regression techniques and hands-on using Python code for regression
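
As a sketch of the ideas in this lab, the code below fits a simple linear regression by ordinary least squares on a training set and reports R-squared on a held-out test set. The data are invented, the split is a fixed hand-picked one, and the function names are hypothetical; the course labs may well use a library instead.

```python
def fit_simple_linear_regression(x, y):
    """Closed-form least squares estimates for y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def r_squared(x, y, intercept, slope):
    """1 - (residual sum of squares / total sum of squares)."""
    mean_y = sum(y) / len(y)
    ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - mean_y) ** 2 for b in y)
    return 1 - ss_res / ss_tot


# Hypothetical data, roughly following y = 3 + 2x with small noise.
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y = [5.1, 6.9, 9.2, 10.8, 13.1, 15.0, 16.8, 19.1, 21.2, 22.9]

# Hold out every third observation as the test set.
test_idx = {2, 5, 8}
x_train = [v for i, v in enumerate(x) if i not in test_idx]
y_train = [v for i, v in enumerate(y) if i not in test_idx]
x_test = [v for i, v in enumerate(x) if i in test_idx]
y_test = [v for i, v in enumerate(y) if i in test_idx]

intercept, slope = fit_simple_linear_regression(x_train, y_train)
print("intercept, slope:", round(intercept, 3), round(slope, 3))
print("test R-squared  :", round(r_squared(x_test, y_test, intercept, slope), 3))
```

Evaluating on data the model has not seen is the point of the training and test split: an R-squared computed on the training set alone tends to be optimistic.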

Session 5–Lab 2: Logistic Regression

  • Introduction to logistic regression
  • Logistic regression diagnostics: Wald statistic, Hosmer-Lemeshow test, classification matrix, sensitivity, specificity, ROC curve, precision, recall, F1-score
  • Strategy to find the optimal cut-off
  • Bias and variance in the model, Bias vs. variance tradeoff
  • Case study using logistic regression techniques and hands-on using Python code for logistic regression
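
The classification diagnostics listed above can be sketched as follows. Given true labels and predicted probabilities (invented here, standing in for the output of a fitted logistic regression), the code builds the classification matrix at a cut-off and derives sensitivity, specificity, precision and F1-score. Choosing the cut-off by maximising Youden's J (sensitivity + specificity - 1) is shown as one possible strategy; it is an assumption, not necessarily the one taught in the course.

```python
def confusion_counts(y_true, y_prob, cutoff=0.5):
    """Return (tp, fp, tn, fn) when probabilities >= cutoff are classed as 1."""
    tp = fp = tn = fn = 0
    for actual, prob in zip(y_true, y_prob):
        predicted = 1 if prob >= cutoff else 0
        if predicted == 1 and actual == 1:
            tp += 1
        elif predicted == 1 and actual == 0:
            fp += 1
        elif predicted == 0 and actual == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def safe_divide(numerator, denominator):
    return numerator / denominator if denominator else 0.0


def metrics(y_true, y_prob, cutoff=0.5):
    tp, fp, tn, fn = confusion_counts(y_true, y_prob, cutoff)
    sensitivity = safe_divide(tp, tp + fn)  # also called recall
    specificity = safe_divide(tn, tn + fp)
    precision = safe_divide(tp, tp + fp)
    f1 = safe_divide(2 * precision * sensitivity, precision + sensitivity)
    return {"sensitivity": sensitivity, "specificity": specificity,
            "precision": precision, "f1": f1}


def youden_j(y_true, y_prob, cutoff):
    m = metrics(y_true, y_prob, cutoff)
    return m["sensitivity"] + m["specificity"] - 1


# Hypothetical labels and predicted probabilities, invented for illustration.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
y_prob = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1, 0.55, 0.45]

print(confusion_counts(y_true, y_prob, cutoff=0.5))
print(metrics(y_true, y_prob, cutoff=0.5))

candidate_cutoffs = [i / 10 for i in range(1, 10)]
best_cutoff = max(candidate_cutoffs, key=lambda c: youden_j(y_true, y_prob, c))
print("cut-off maximising Youden's J:", best_cutoff)
```

Sweeping the cut-off from 0 to 1 and plotting sensitivity against (1 - specificity) at each value traces out the ROC curve.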

Day 3: Understanding supervised learning and gradient descent algorithm

Session 1 & 2–Lab 2: Logistic Regression (cont.)

  • Introduction to logistic regression
  • Logistic regression diagnostics: Wald statistic, Hosmer-Lemeshow test, classification matrix, sensitivity, specificity, ROC curve, precision, recall, F1-score
  • Strategy to find the optimal cut-off
  • Bias and variance in the model, Bias vs. variance tradeoff
  • Case study using logistic regression techniques and hands-on using Python code for logistic regression

Session 3 & 4–Lab 3: Introduction to Gradient Descent

  • Hypothesis formulation for linear regression
  • Deriving the cost function for linear regression
  • Cost function–Intuition for linear regression with one parameter and two parameters
  • Gradient descent algorithm–application in linear regression
  • Hypothesis formulation for logistic regression
  • Deriving the cost function for logistic regression
  • Cost function–Intuition for logistic regression
  • Gradient descent algorithm–application in logistic regression
  • Hands-on using Python code to implement gradient descent for regression and logistic regression
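
A minimal sketch of batch gradient descent for linear regression with two parameters, assuming the usual hypothesis h(x) = theta0 + theta1 * x and the squared-error cost J = (1 / 2m) * sum((h(x_i) - y_i)^2). The data, learning rate and iteration count are illustrative choices, not values from the course.

```python
def cost(theta0, theta1, x, y):
    """Squared-error cost J(theta0, theta1) = (1 / 2m) * sum of squared errors."""
    m = len(x)
    return sum((theta0 + theta1 * xi - yi) ** 2 for xi, yi in zip(x, y)) / (2 * m)


def gradient_descent(x, y, learning_rate=0.05, n_iterations=2000):
    """Batch gradient descent; both parameters are updated simultaneously."""
    theta0, theta1 = 0.0, 0.0
    m = len(x)
    for _ in range(n_iterations):
        errors = [theta0 + theta1 * xi - yi for xi, yi in zip(x, y)]
        grad0 = sum(errors) / m                                # dJ/dtheta0
        grad1 = sum(e * xi for e, xi in zip(errors, x)) / m    # dJ/dtheta1
        theta0 -= learning_rate * grad0
        theta1 -= learning_rate * grad1
    return theta0, theta1


# Noise-free illustrative data generated from y = 1 + 2x.
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [1.0, 3.0, 5.0, 7.0, 9.0]

theta0, theta1 = gradient_descent(x, y)
print("theta0, theta1:", round(theta0, 4), round(theta1, 4))
print("final cost    :", cost(theta0, theta1, x, y))
```

For logistic regression with the log-loss cost, the update has the same form; only the hypothesis changes to the sigmoid of theta0 + theta1 * x. A learning rate that is too large makes the cost diverge, so it is worth printing the cost every few iterations while experimenting.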

Session 5–Lab 4: Decision Trees

  • Decision tree – Classification and regression trees (CART), Gini Index, Entropy
  • Decision tree – Chi-square automatic interaction detection (CHAID).
  • Case study using decision tree techniques
  • Hands-on using Python code
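
The two impurity measures named above can be computed directly from class proportions. The sketch below defines the Gini index and entropy for a set of labels, plus the weighted impurity of a candidate split; the labels and the helper name `weighted_impurity` are illustrative assumptions.

```python
import math
from collections import Counter


def gini_index(labels):
    """Gini index: 1 - sum of squared class proportions."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def entropy(labels):
    """Entropy in bits: -sum of p * log2(p) over the classes present."""
    n = len(labels)
    return -sum((count / n) * math.log2(count / n)
                for count in Counter(labels).values())


def weighted_impurity(left, right, impurity):
    """Impurity of a binary split, weighted by the size of each child node."""
    n = len(left) + len(right)
    return len(left) / n * impurity(left) + len(right) / n * impurity(right)


# Hypothetical node with a candidate split, invented for illustration.
parent = ["yes", "yes", "yes", "no", "no", "no"]
left, right = ["yes", "yes", "yes", "no"], ["no", "no"]

print("parent Gini   :", round(gini_index(parent), 3))
print("split Gini    :", round(weighted_impurity(left, right, gini_index), 3))
print("parent entropy:", round(entropy(parent), 3))
print("split entropy :", round(weighted_impurity(left, right, entropy), 3))
```

A tree-growing algorithm such as CART evaluates many candidate splits this way and keeps the one with the largest reduction in impurity relative to the parent node.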

Day 4: Understanding unsupervised learning, ensemble methods and text analytics

Session 1–Lab 5: Clustering and Segmentation

  • Supervised and Unsupervised learning
  • Clustering–Hierarchical, K means
  • Clustering diagnostic–Dendrogram, Calinski and Harabasz index, Silhouette width
  • Case study using hierarchical and K–means clustering techniques
  • Hands-on using Python code for hierarchical and K–means clustering
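
As a rough sketch of how K-means works, the code below alternates between assigning each point to its nearest centroid and moving each centroid to the mean of its assigned points. It is a plain-Python illustration on invented two-dimensional points with random initialisation, not a substitute for a library implementation.

```python
import random


def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def nearest_centroid(point, centroids):
    """Index of the centroid closest to the point."""
    return min(range(len(centroids)),
               key=lambda i: squared_distance(point, centroids[i]))


def k_means(points, k, n_iterations=20, seed=0):
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initialise with k distinct data points
    for _ in range(n_iterations):
        # Assignment step: group points by their nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest_centroid(p, centroids)].append(p)
        # Update step: move each centroid to the mean of its cluster.
        new_centroids = []
        for i, cluster in enumerate(clusters):
            if cluster:
                new_centroids.append(
                    tuple(sum(coord) / len(cluster) for coord in zip(*cluster)))
            else:
                new_centroids.append(centroids[i])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # assignments have stabilised
        centroids = new_centroids
    labels = [nearest_centroid(p, centroids) for p in points]
    return centroids, labels


# Hypothetical points forming two well-separated groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5),
          (8.0, 8.0), (8.5, 9.0), (9.0, 8.5)]

centroids, labels = k_means(points, k=2)
print("centroids:", centroids)
print("labels   :", labels)
```

The cluster numbering is arbitrary and depends on the initialisation, so only the grouping of points is meaningful. Diagnostics such as silhouette width help judge whether the chosen number of clusters fits the data.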

Session 2 & 3–Lab 6: Other Machine Learning Models (Ensemble Methods)

  • What is Machine learning
  • Different sampling strategies–Bootstrapping, Up–Sample, Down–Sample, Synthetic Sample, Cross–Validation Data
  • Introduction to Bagging–Random Forest
  • Other Bagging algorithms
  • Introduction to Boosting– Adaptive boosting
  • Other Boosting algorithms
  • Case study of an imbalanced dataset and application of sampling strategies & ensemble methods
  • Hands-on using Python code on an imbalanced dataset
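
Three of the sampling strategies named above (bootstrapping, up-sampling and down-sampling) can be sketched with the standard library alone. The dataset is a made-up list of (id, label) rows with an 18-to-2 class imbalance, and the function names are hypothetical.

```python
import random
from collections import Counter


def group_by_label(rows, label_of):
    groups = {}
    for row in rows:
        groups.setdefault(label_of(row), []).append(row)
    return groups


def bootstrap_sample(rows, rng):
    """Sample with replacement, same size as the original data."""
    return rng.choices(rows, k=len(rows))


def up_sample(rows, label_of, rng):
    """Replicate minority-class rows (with replacement) up to the majority size."""
    groups = group_by_label(rows, label_of)
    target = max(len(g) for g in groups.values())
    balanced = []
    for group in groups.values():
        balanced.extend(group)
        balanced.extend(rng.choices(group, k=target - len(group)))
    return balanced


def down_sample(rows, label_of, rng):
    """Keep a random subset of each class, sized to the smallest class."""
    groups = group_by_label(rows, label_of)
    target = min(len(g) for g in groups.values())
    balanced = []
    for group in groups.values():
        balanced.extend(rng.sample(group, target))
    return balanced


def get_label(row):
    return row[1]


# Hypothetical imbalanced data: 18 rows of class 0 and 2 rows of class 1.
rows = [(i, 0) for i in range(18)] + [(i, 1) for i in range(18, 20)]
rng = random.Random(0)

print("original :", Counter(get_label(r) for r in rows))
print("bootstrap:", Counter(get_label(r) for r in bootstrap_sample(rows, rng)))
print("up       :", Counter(get_label(r) for r in up_sample(rows, get_label, rng)))
print("down     :", Counter(get_label(r) for r in down_sample(rows, get_label, rng)))
```

Bagging methods such as random forest train each model on a different bootstrap sample. Re-sampling should be applied to the training set only, so that the test set still reflects the original class balance.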

Session 4 & 5–Lab 7: Introduction to Natural Language Processing

  • Reading in an unstructured dataset
  • Data pre-processing – Stop words, stemming, lemmatization
  • Tokenization and bag of words
  • Plotting word frequency
  • Sentiment analysis using Python nltk
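
The pre-processing steps above (tokenization, stop-word removal and bag of words) can be sketched without any external package. The stop-word list here is a tiny hand-written stand-in and the review texts are invented; in the lab, a library such as nltk would supply fuller stop-word lists as well as stemming and lemmatization.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list; real lists are much longer.
STOP_WORDS = {"the", "is", "a", "and", "was", "but", "of", "this", "it", "were"}


def tokenize(text):
    """Lower-case the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


def bag_of_words(text, stop_words=STOP_WORDS):
    """Count each remaining token after stop-word removal."""
    return Counter(token for token in tokenize(text) if token not in stop_words)


# Hypothetical review texts, invented for illustration.
reviews = [
    "The course was great and the labs were great",
    "The pace was slow but the content is useful",
]

for review in reviews:
    print(bag_of_words(review).most_common(3))

# Word frequencies across the whole corpus, ready for plotting.
corpus_counts = sum((bag_of_words(r) for r in reviews), Counter())
print(corpus_counts.most_common(5))
```

A bag of words discards word order, so each document becomes a vector of counts that a classifier can use for tasks such as sentiment analysis.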

Day 1: Understanding Anaconda Framework platform and other useful packages in Python

This day will primarily cover an introduction to business analytics and an introduction to Anaconda and Python.

# | Topic | Session | From | To
1 | Introduction to Business Analytics | 1 | 9:00 AM | 10:15 AM
2 | Introduction to Anaconda and Python platform | 2 | 10:30 AM | 11:15 AM
3 | Introduction to Anaconda and Python platform (cont.) | 3 | 12:00 PM | 1:15 PM
4 | Basics of Statistics | 4 | 2:15 PM | 3:30 PM
5 | Basics of Statistics (cont.) | 5 | 3:45 PM | 5:00 PM

Day 2: Understanding advanced algorithms and its implementation using Python

This day is primarily devoted to concept building on supervised learning and hands-on practice using Python code for the same.

# | Topic | Session | From | To
1 | Basics of Statistics (cont.) | 1 | 9:00 AM | 10:15 AM
2 | Lab 1: Multiple Linear Regression | 2 | 10:30 AM | 11:15 AM
3 | Lab 1: Multiple Linear Regression (cont.) | 3 | 12:00 PM | 1:15 PM
4 | Lab 1: Multiple Linear Regression (cont.) | 4 | 2:15 PM | 3:30 PM
5 | Lab 2: Logistic Regression | 5 | 3:45 PM | 5:00 PM

Day 3: Understanding supervised learning and gradient descent algorithm

This day will cover concept building on supervised learning (logistic regression, gradient descent and decision trees) and hands-on practice using Python code.

# | Topic | Session | From | To
1 | Lab 2: Logistic Regression (cont.) | 1 | 9:00 AM | 10:15 AM
2 | Lab 2: Logistic Regression (cont.) | 2 | 10:30 AM | 11:15 AM
3 | Lab 3: Introduction to Gradient Descent | 3 | 12:00 PM | 1:15 PM
4 | Lab 3: Introduction to Gradient Descent (cont.) | 4 | 2:15 PM | 3:30 PM
5 | Lab 4: Decision Trees | 5 | 3:45 PM | 5:00 PM

Day 4: Understanding unsupervised learning, ensemble methods and text analytics

This day will cover concept building on unsupervised learning, sampling strategies, ensemble methods and text analytics.

# | Topic | Session | From | To
1 | Lab 5: Clustering and Segmentation | 1 | 9:00 AM | 10:15 AM
2 | Lab 6: Other Machine Learning Models | 2 | 10:30 AM | 11:15 AM
3 | Lab 6: Other Machine Learning Models (cont.) | 3 | 12:00 PM | 1:15 PM
4 | Lab 7: Introduction to NLP | 4 | 2:15 PM | 3:30 PM
5 | Lab 7: Introduction to NLP (cont.) | 5 | 3:45 PM | 5:00 PM

Course Reviews

Average Rating: 4.6

5 Stars: 24
4 Stars: 5
3 Stars: 2
2 Stars: 0
1 Star: 0
