Overview
This two-day workshop introduces students to data exploration and machine learning techniques. Students will learn about the data science workflow and practice exploring and visualising data using Python and built-in libraries. Students will also explore the differences between supervised and unsupervised learning techniques and practice creating predictive regression models.
Note: This is a two-day workshop. The first session will be on Saturday, 7th April.
Takeaways
After this workshop, you will be able to:
- Collect data from a variety of sources (e.g., Excel, web scraping, APIs)
- Explore large data sets
- Clean and "munge" the data to prepare it for analysis
- Apply machine learning algorithms to gain insight from the data
- Visualize the results of your analysis
- Build your own library and Python scripts
Schedule
Day 2 - Diving into machine learning - April 14th - 10AM - 5PM
Module 3: Supervised vs. unsupervised learning (2.5 hours)
- Review of machine learning algorithms
- Classification, linear regression and logistic regression
- Random forests, clustering
- Decision trees
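As a small taste of the supervised learning covered in this module, the sketch below fits a straight line to labelled examples by least squares and then predicts an unseen value. It is a minimal illustration in plain Python, not the workshop's own material: the function names `fit_line` and `predict` and the toy data are assumptions made up for this example.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept (simple linear regression)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both unnormalised)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(x, slope, intercept):
    """Predict y for a new x using the fitted line."""
    return slope * x + intercept


# Toy training data (hypothetical): the labels follow y = 2x + 1 exactly
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(xs, ys)
print(slope, intercept)               # fitted parameters
print(predict(5.0, slope, intercept)) # prediction for an unseen input
```

The model is "supervised" because it learns from inputs paired with known answers; an unsupervised method such as clustering would receive only the inputs.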
Module 4: Model Evaluation (2.5 hours)
- Feature Engineering and Model Selection
- Model Evaluation Metrics - Accuracy, RMSE, ROC, AUC, Confusion Matrix, Precision, Recall, F1 Score
- Overfitting and Bias-Variance trade-off
- Cross Validation
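Several of the evaluation metrics listed above can be computed directly from counts of correct and incorrect predictions. The following is a minimal sketch in plain Python, assuming binary labels encoded as 0 and 1; the helper names (`confusion_counts`, `classification_metrics`, `rmse`) and the toy labels are hypothetical and chosen only for illustration.

```python
import math


def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels where 1 is the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 score from the confusion matrix."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def rmse(y_true, y_pred):
    """Root mean squared error, used for regression models."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


# Toy labels (hypothetical) for a binary classifier
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = classification_metrics(y_true, y_pred)
print(metrics)

# Toy regression targets and predictions (hypothetical)
error = rmse([3.0, 5.0, 7.0], [2.0, 5.0, 9.0])
print(error)
```

Accuracy, precision, recall and F1 apply to classification, while RMSE applies to regression, which is why the two are computed from different inputs here.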
Prereqs & Preparation
Beginner/intermediate. This workshop is for analysts, product managers, mathematicians, business managers, or anyone else who wants to learn about machine learning. A background in computer science, programming, and/or statistics is preferred but not required. You are, however, expected to be somewhat familiar with command-line tools and with writing simple programs. We recommend that you take the “Python for Beginners” workshop before attending.
About the Instructor
Anthony Ta - Data Scientist, GO-JEK
Anthony is a National University of Singapore graduate with a strong interest in finding meaningful information in data and applying it to help improve many aspects of life. Having previously worked in neuroscience, he is now a Data Scientist at GO-JEK, Indonesia's first unicorn and currently the fastest-growing start-up in South Asia. In his free time, you can find him wandering around with a camera, capturing scenery and portraits.