
Instructor
Rahul Kumar

Category
Python

Course Fees
Quotation on request Rs.
Supervised and Unsupervised Learning using Python 4 Days
DATA SCIENCE USING PYTHON
GOOGLE'S SELF-DRIVING CARS AND ROBOTS GET A LOT OF PRESS, BUT THE COMPANY'S REAL FUTURE IS IN MACHINE LEARNING, THE TECHNOLOGY THAT ENABLES COMPUTERS TO GET SMARTER AND MORE PERSONAL. – ERIC SCHMIDT (GOOGLE CHAIRMAN)
This course is intended to give a holistic understanding of statistical and machine learning and their application using Python. The workshop will cover:
 An introduction to business analytics
 An introduction to Python for data analysis
 An introduction to supervised machine learning algorithms
 An introduction to unsupervised machine learning algorithms
 Understanding the core of machine learning – the gradient descent algorithm
 Understanding various sampling strategies and their efficacy in the learning process
 An introduction to ensemble methods for handling imbalanced data
 An introduction to text analytics using Python
 Hands-on practice with Python code on real-life datasets
OBJECTIVE
We are living in an era in which computing has moved from mainframes to personal computers to the cloud. Along the way, we started generating enormous amounts of data. The manifold increase in computing power has also advanced the application of algorithms that can extract insights from the huge amounts of data being generated. In this course, you will learn the nuances of building supervised and unsupervised machine learning models on real-life datasets. We'll introduce you to the Python platform and some of the statistical and machine learning algorithms that will come in handy when solving challenging problems.
At the end of the course, you will have a clear understanding of why machine learning algorithms are needed and of the context in which to apply them to solve complex business problems.
Who Should Attend
Irrespective of the type of industry (retail, e-commerce, manufacturing, real estate & construction, telecom, hospitality, banking, healthcare, IT, supply chain & logistics, etc.), data forms the crux of decision making. This course is designed to hone the analytical skills and business acumen of mid-level and senior-level corporate professionals who want to understand the nuances of data science, and to help them use machine learning techniques as an efficient way to generate insights for customers, which in turn improves the bottom line of their organizations.
Hardware And Software
 1. Participants should bring their own laptop (preferably with Windows 7 or higher, or Mac OS, installed).
 2. Operating System (any of the following):
 Mac OS X with XQuartz
 Windows (Version XP or later) is required.
 3. Minimum 8 GB RAM on the system is advisable.
INSTALLATIONS:
 • For Windows, go to https://docs.anaconda.com/anaconda/install/windows.html
 • For MacOS, go to https://docs.anaconda.com/anaconda/install/macos
 • For Linux, go to https://docs.anaconda.com/anaconda/install/linux
PREREQUISITE & COURSE DELIVERABLE:
 1. Participants should have basic programming skills. Participants are expected to spend time with the code set as a home assignment to leverage the classroom training hours to the fullest.
 2. High speed internet connection will be provided at the training venue.
 3. Deliverable: Python code and dataset. Soft copy of the content being covered (PDF file)
More about Anaconda can be found at https://docs.anaconda.com. Participants are expected to resolve any software installation issues before the session begins.
Day 1: Understanding Anaconda Framework platform and other useful packages in Python
Session 1 – Introduction to Business Analytics
 What is Business Analytics
 Why is it needed and how industries are adopting it
 Different components of analytics
 Applications of analytics in different domains
 Statistical learning vs. Machine learning
 What is Data Science and skills of a data scientist
 Different types of machine learning algorithms – supervised, unsupervised and reinforcement learning
Session 2 & 3 – Introduction to Anaconda and Python
 Overview of Anaconda framework
 Python – Variables, objects, loops, conditions, functions
 Python data structures – lists, tuples, dictionaries, sets
 Introduction to NumPy – ndarrays, ndarray indexing, ndarray data types and operations, statistical, sorting and set operations
 Introduction to Pandas – Data ingestion, descriptive statistics, visualization, frequent data operations, merging dataframes, parsing timestamps
 Introduction to visualization – Matplotlib
Session 4 & 5 – Basics of Statistics
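As a taste of the Python data structures item in this session, here is a minimal sketch of lists, tuples, sets and dictionaries. The topic names and the helper `most_common` are made up for illustration and are not taken from the course material.

```python
# Hypothetical enrolment data, invented for this illustration only.
topics = ["regression", "clustering", "regression", "nlp"]  # list: ordered, mutable, allows duplicates
session = ("Day 1", 2)                                       # tuple: ordered, immutable
unique_topics = set(topics)                                  # set: unordered, duplicates removed

# dict: map each topic to the number of times it appears.
counts = {}
for topic in topics:
    counts[topic] = counts.get(topic, 0) + 1


def most_common(count_by_topic):
    """Return the key with the largest count."""
    return max(count_by_topic, key=count_by_topic.get)


print(sorted(unique_topics))
print(counts)
print(most_common(counts))
```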
 Random variable – discrete and continuous
 Probability density function and cumulative distribution function
 Distribution family – Gaussian distribution, standard normal distribution
 Population and Sample
 Central limit theorem
 Demonstration of Central limit theorem on finance data
 Degrees of freedom
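The central limit theorem item above can be explored with a small simulation. This is a sketch only: the course demonstrates the theorem on finance data, whereas this example assumes a synthetic, skewed Exponential(1) population so that it runs without any dataset. The function name `sample_means` is hypothetical.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable


def sample_means(sample_size, n_samples=2000):
    """Draw n_samples samples from an Exponential(1) population
    and return the mean of each sample."""
    return [
        statistics.fmean(random.expovariate(1.0) for _ in range(sample_size))
        for _ in range(n_samples)
    ]


small = sample_means(sample_size=5)
large = sample_means(sample_size=50)

# Exponential(1) has mean 1 and standard deviation 1, so the theorem predicts
# sample means centred on 1 with a spread of roughly 1 / sqrt(sample_size).
print("n=5 :", statistics.fmean(small), statistics.stdev(small))
print("n=50:", statistics.fmean(large), statistics.stdev(large))
```

Plotting a histogram of `small` and `large` (for example with Matplotlib) shows the distribution of the means becoming narrower and more bell-shaped as the sample size grows.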
Day 2: Understanding advanced algorithms and its implementation using Python
Session 1 – Basics of Statistics (cont.)
 Hypothesis testing – Z-test, t-test, test for proportion, analysis of variance (ANOVA)
 Covariance and correlation
 Partial and semi-partial correlation
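To make the hypothesis-testing item concrete, here is a minimal sketch of a two-sided one-sample Z-test using only the standard library. The returns data and the assumed known standard deviation `sigma=0.3` are invented for illustration.

```python
from math import sqrt
from statistics import NormalDist, fmean


def one_sample_z_test(sample, mu0, sigma):
    """Two-sided Z-test of H0: population mean == mu0, with known sigma.
    Returns the z statistic and the p-value."""
    z = (fmean(sample) - mu0) / (sigma / sqrt(len(sample)))
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Made-up daily returns in percent; not real market data.
returns = [0.4, -0.1, 0.3, 0.5, 0.2, 0.1, 0.6, -0.2, 0.3, 0.4]
z, p = one_sample_z_test(returns, mu0=0.0, sigma=0.3)
print(round(z, 3), round(p, 4))
```

A small p-value suggests that the sample mean is unlikely under the null hypothesis. When the population standard deviation is unknown, a t-test is the usual replacement.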
Session 2, 3 & 4 – Lab 1: Linear Regression
 Introduction to simple and multiple linear regression
 Regression diagnostics – R-squared, t-test, F-test, distribution of error terms, heteroscedasticity, identifying and handling multicollinearity, AIC model selection strategy
 Common task framework for model evaluation – training and test sets
 Case study using regression techniques and hands-on practice using Python code for regression
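The lab items above (fitting a line, R-squared, and a training/test split) can be sketched from scratch as follows. This is an illustration under assumptions: the data are made up, the helper names `fit_simple_ols` and `r_squared` are hypothetical, and the lab itself may well use a library instead.

```python
from statistics import fmean


def fit_simple_ols(x, y):
    """Least-squares intercept and slope for y = intercept + slope * x."""
    x_bar, y_bar = fmean(x), fmean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def r_squared(x, y, intercept, slope):
    """Share of the variance in y explained by the fitted line."""
    y_bar = fmean(y)
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


# Made-up data; the last two points are held out as a test set.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.0, 13.8, 16.2]
x_train, y_train, x_test, y_test = x[:6], y[:6], x[6:], y[6:]

intercept, slope = fit_simple_ols(x_train, y_train)
print("slope:", slope, "intercept:", intercept)
print("training R-squared:", r_squared(x_train, y_train, intercept, slope))

# Errors on data the model has not seen.
test_errors = [yi - (intercept + slope * xi) for xi, yi in zip(x_test, y_test)]
print("test errors:", test_errors)
```

In practice the split is usually random rather than the last few rows; a fixed split is used here only to keep the example deterministic.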
Session 5 – Lab 2: Logistic Regression
 Introduction to logistic regression
 Logistic regression diagnostics: Wald statistic, Hosmer-Lemeshow test, classification matrix, sensitivity, specificity, ROC curve, precision, recall, F1-score
 Strategy to find the optimal cutoff
 Bias and variance in the model, bias vs. variance trade-off
 Case study using logistic regression techniques and hands-on practice using Python code for logistic regression
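The classification diagnostics listed above all derive from the four cells of the classification matrix. The sketch below computes them from predicted probabilities and picks a cutoff by maximising Youden's J (sensitivity + specificity - 1). Youden's J is only one possible strategy and is an assumption here, since the outline does not say which one the lab uses. The labels and probabilities are invented.

```python
def confusion_counts(y_true, y_prob, cutoff=0.5):
    """Return (tp, fp, tn, fn) after thresholding probabilities at cutoff."""
    tp = fp = tn = fn = 0
    for actual, prob in zip(y_true, y_prob):
        predicted = 1 if prob >= cutoff else 0
        if predicted and actual:
            tp += 1
        elif predicted and not actual:
            fp += 1
        elif actual:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def diagnostics(tp, fp, tn, fn):
    """Sensitivity (recall), specificity, precision and F1-score."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp) if tp + fp else 0.0
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return {"sensitivity": sensitivity, "specificity": specificity,
            "precision": precision, "f1": f1}


def best_cutoff(y_true, y_prob, cutoffs):
    """Cutoff that maximises Youden's J = sensitivity + specificity - 1."""
    def youden(cutoff):
        d = diagnostics(*confusion_counts(y_true, y_prob, cutoff))
        return d["sensitivity"] + d["specificity"] - 1
    return max(cutoffs, key=youden)


# Made-up labels and model probabilities.
y_true = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
y_prob = [0.9, 0.8, 0.7, 0.55, 0.52, 0.6, 0.3, 0.2, 0.1, 0.4]

print(diagnostics(*confusion_counts(y_true, y_prob, cutoff=0.5)))
print(best_cutoff(y_true, y_prob, cutoffs=[0.3, 0.5, 0.7]))
```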
Day 3: Understanding supervised learning and gradient descent algorithm
Session 1 & 2 – Lab 2: Logistic Regression (cont.)
 Introduction to logistic regression
 Logistic regression diagnostics: Wald statistic, Hosmer-Lemeshow test, classification matrix, sensitivity, specificity, ROC curve, precision, recall, F1-score
 Strategy to find the optimal cutoff
 Bias and variance in the model, bias vs. variance trade-off
 Case study using logistic regression techniques and hands-on practice using Python code for logistic regression
Session 3 & 4 – Lab 3: Introduction to Gradient Descent
 Hypothesis formulation for linear regression
 Deriving the cost function for linear regression
 Cost function – intuition for linear regression with one parameter and two parameters
 Gradient descent algorithm – application in linear regression
 Hypothesis formulation for logistic regression
 Deriving the cost function for logistic regression
 Cost function – intuition for logistic regression
 Gradient descent algorithm – application in logistic regression
 Hands-on practice using Python code to implement gradient descent for linear regression and logistic regression
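The linear-regression part of this lab can be sketched as batch gradient descent on the squared-error cost J = (1 / 2m) * sum((prediction - y)^2). The learning rate, iteration count and data below are assumptions chosen so that this small example converges; they are not values prescribed by the course.

```python
def gradient_descent(x, y, learning_rate=0.05, n_iters=2000):
    """Fit y ~ theta0 + theta1 * x by batch gradient descent on the
    cost J = (1 / 2m) * sum((prediction - y)^2)."""
    theta0 = theta1 = 0.0
    m = len(x)
    for _ in range(n_iters):
        # Prediction error for every training example under the current parameters.
        errors = [theta0 + theta1 * xi - yi for xi, yi in zip(x, y)]
        # Partial derivatives of J with respect to theta0 and theta1.
        grad0 = sum(errors) / m
        grad1 = sum(e * xi for e, xi in zip(errors, x)) / m
        # Step against the gradient, updating both parameters together.
        theta0 -= learning_rate * grad0
        theta1 -= learning_rate * grad1
    return theta0, theta1


# Noise-free made-up data that lies exactly on y = 1 + 2x.
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [1.0, 3.0, 5.0, 7.0, 9.0]
theta0, theta1 = gradient_descent(x, y)
print(theta0, theta1)
```

For logistic regression the loop keeps the same shape: the prediction becomes the sigmoid of `theta0 + theta1 * xi` and the cost becomes the log-loss, which leads to gradient expressions of the same form.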
Session 5 – Lab 4: Decision Trees
 Decision tree – Classification and regression trees (CART), Gini index, entropy
 Decision tree – Chi-square automatic interaction detection (CHAID)
 Case study using decision tree techniques
 Hands-on practice using Python code
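The Gini index and entropy named above are impurity measures that a tree uses to compare candidate splits. A minimal sketch follows, with a made-up "yes"/"no" target; the helper `weighted_impurity` is a hypothetical name.

```python
from collections import Counter
from math import log2


def gini(labels):
    """Gini impurity: 1 - sum of squared class proportions."""
    n = len(labels)
    return 1 - sum((count / n) ** 2 for count in Counter(labels).values())


def entropy(labels):
    """Shannon entropy in bits: -sum(p * log2(p)) over the classes present."""
    n = len(labels)
    return -sum((count / n) * log2(count / n) for count in Counter(labels).values())


def weighted_impurity(left, right, impurity=gini):
    """Impurity of a binary split, weighted by the size of each child node."""
    n = len(left) + len(right)
    return len(left) / n * impurity(left) + len(right) / n * impurity(right)


# Made-up parent node and one candidate split.
parent = ["yes"] * 5 + ["no"] * 5
left = ["yes"] * 4 + ["no"]
right = ["yes"] + ["no"] * 4

print("parent Gini:", gini(parent))
print("split Gini :", weighted_impurity(left, right))
print("split entropy:", weighted_impurity(left, right, impurity=entropy))
```

A split is attractive when its weighted impurity is well below the parent's impurity; a CART-style tree searches over candidate splits for the largest such reduction.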
Day 4: Understanding unsupervised learning, ensemble methods and text analytics
Session 1 – Lab 5: Clustering and Segmentation
 Supervised and unsupervised learning
 Clustering – hierarchical, K-means
 Clustering diagnostics – dendrogram, Calinski-Harabasz index, silhouette width
 Case study using hierarchical clustering and K-means clustering techniques
 Hands-on practice using Python code for hierarchical and K-means clustering
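The K-means part of this lab alternates two steps: assign each point to its nearest centroid, then move each centroid to the mean of its points. The sketch below does this on one-dimensional, made-up spending figures with hand-picked starting centroids, which is a simplification; real data is usually multi-dimensional and initialisation is typically randomised.

```python
from statistics import fmean


def k_means_1d(points, centroids, n_iters=20):
    """Run n_iters rounds of K-means on 1-D data from the given starting centroids.
    Returns the final centroids and the clusters from the last assignment step."""
    centroids = list(centroids)
    clusters = [[] for _ in centroids]
    for _ in range(n_iters):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        centroids = [fmean(c) if c else centroids[i] for i, c in enumerate(clusters)]
    return centroids, clusters


# Made-up monthly spend for nine customers.
spend = [10, 12, 11, 50, 52, 49, 90, 95, 92]
centroids, clusters = k_means_1d(spend, centroids=[10, 50, 90])
print(centroids)
print(clusters)
```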
Session 2 & 3 – Lab 6: Other Machine Learning Models (Ensemble Methods)
 What is machine learning
 Different sampling strategies – Bootstrapping, Up-Sample, Down-Sample, Synthetic Sample, Cross-Validation Data
 Introduction to Bagging – Random Forest
 Other Bagging algorithms
 Introduction to Boosting – Adaptive boosting
 Other Boosting algorithms
 Case study of an imbalanced dataset and application of sampling strategies & ensemble methods
 Hands-on practice using Python code on an imbalanced dataset
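Two of the sampling strategies listed above, bootstrapping and up-sampling, can be sketched with the standard library alone. The transaction rows and the function names are hypothetical; the lab's own data and tooling may differ.

```python
import random

random.seed(42)  # fixed seed so the run is repeatable


def bootstrap_sample(rows):
    """Draw len(rows) rows with replacement, as bagging does for each base model."""
    return random.choices(rows, k=len(rows))


def up_sample_minority(rows, label_of):
    """Resample the smaller classes with replacement until every class
    has as many rows as the largest class."""
    by_label = {}
    for row in rows:
        by_label.setdefault(label_of(row), []).append(row)
    target = max(len(group) for group in by_label.values())
    balanced = []
    for group in by_label.values():
        balanced.extend(group)
        balanced.extend(random.choices(group, k=target - len(group)))
    return balanced


# Made-up imbalanced data: 18 normal transactions (label 0), 2 fraudulent (label 1).
rows = [("txn%d" % i, 0) for i in range(18)] + [("fraud0", 1), ("fraud1", 1)]

boot = bootstrap_sample(rows)
balanced = up_sample_minority(rows, label_of=lambda row: row[1])
print(len(boot), len(balanced))
print(sum(1 for _, label in balanced if label == 1), "fraud rows after up-sampling")
```

Resampling is normally applied to the training set only, so that the test set still reflects the original class balance.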
Session 4 & 5 – Lab 7: Introduction to Natural Language Processing
 Reading in unstructured dataset
 Data preprocessing – stop words, stemming, lemmatization
 
 Tokenization and bag of words
 Plotting word frequency
 Sentiment analysis using Python nltk
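The tokenization, stop-word and bag-of-words steps above can be sketched without any external package. The lab itself uses nltk; this example deliberately avoids it so that it runs without downloads, and the tiny stop-word list and the review sentences are assumptions made for illustration.

```python
import re
from collections import Counter

# A tiny hand-picked stop-word list; real lists are much longer.
STOP_WORDS = {"the", "a", "is", "and", "was", "were", "but", "it", "to"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def bag_of_words(text):
    """Count the tokens that remain after dropping stop words."""
    return Counter(token for token in tokenize(text) if token not in STOP_WORDS)


# Made-up review sentences.
reviews = [
    "The course was great and the labs were great",
    "The pace was slow but the labs were useful",
]

bags = [bag_of_words(review) for review in reviews]
vocabulary = sorted(set().union(*bags))
# One row per review, one column per vocabulary word.
vectors = [[bag[word] for word in vocabulary] for bag in bags]

print(vocabulary)
print(vectors)
print(bags[0].most_common(2))
```

The word counts in `bags` are also what a word-frequency plot would display.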
Day 1: Understanding Anaconda Framework platform and other useful packages in Python
This day primarily covers an introduction to business analytics and an introduction to Anaconda and Python.
#  Topic  Session  From  To
1  Introduction to Business Analytics  1  9:00 AM  10:15 AM
2  Introduction to Anaconda and Python platform  2  10:30 AM  11:15 AM
3  Introduction to Anaconda and Python platform (cont.)  3  12:00 PM  1:15 PM
4  Basics of Statistics  4  2:15 PM  3:30 PM
5  Basics of Statistics (cont.)  5  3:45 PM  5:00 PM
Day 2: Understanding advanced algorithms and its implementation using Python
This day is primarily devoted to concept building on supervised learning and hands-on practice using Python code for the same.
#  Topic  Session  From  To
1  Basics of Statistics (cont.)  1  9:00 AM  10:15 AM
2  Lab 1: Multiple Linear Regression  2  10:30 AM  11:15 AM
3  Lab 1: Multiple Linear Regression (cont.)  3  12:00 PM  1:15 PM
4  Lab 1: Multiple Linear Regression (cont.)  4  2:15 PM  3:30 PM
5  Lab 2: Logistic Regression  5  3:45 PM  5:00 PM
Day 3: Understanding supervised learning and gradient descent algorithm
This day covers concept building on supervised learning (logistic regression and decision trees) and the gradient descent algorithm, with hands-on practice using Python code.
#  Topic  Session  From  To
1  Lab 2: Logistic Regression (cont.)  1  9:00 AM  10:15 AM
2  Lab 2: Logistic Regression (cont.)  2  10:30 AM  11:15 AM
3  Lab 3: Introduction to Gradient Descent  3  12:00 PM  1:15 PM
4  Lab 3: Introduction to Gradient Descent (cont.)  4  2:15 PM  3:30 PM
5  Lab 4: Decision Trees  5  3:45 PM  5:00 PM
Day 4: Understanding unsupervised learning, ensemble methods and text analytics
This day covers concept building on unsupervised learning, sampling strategies, ensemble methods and text analytics.
#  Topic  Session  From  To
1  Lab 5: Clustering and Segmentation  1  9:00 AM  10:15 AM
2  Lab 6: Other Machine Learning Models  2  10:30 AM  11:15 AM
3  Lab 6: Other Machine Learning Models (cont.)  3  12:00 PM  1:15 PM
4  Lab 7: Introduction to NLP  4  2:15 PM  3:30 PM
5  Lab 7: Introduction to NLP (cont.)  5  3:45 PM  5:00 PM