Overview
We are bringing the best data scientists in Singapore together to discuss and debate their best practices for various kinds of data science tasks, open-mic style! A GitHub markdown document will be started and updated live to record these best practices across data science, and it will be shared publicly. We focus on specific techniques, types of data, and implementation, not fluff or general talk. All data scientists are welcome, beginners included, but we won't have time to explain basic concepts since we are covering a lot in this one session.
Here's the flow of the evening:
- Thought process in data visualisation and understanding, useful plots for specific types of data
- Cross-validation strategies
- Data manipulation/preprocessing methods and feature engineering/learning for various types of data: natural language, sound, image
- Tips on hyperparameter tuning
- Tradeoffs in model selection for various types of data: natural language, sound, image, time series
We will take http://blog.hackerearth.com/winning-tips-machine-learning-competitions-k... as a starting point to build upon.