Introduction

This course is intended to be split into three sessions:

  • Get familiar with the working framework
    • Install Jupyter Notebooks
    • Write interactive data analyses with Python code, Markdown and inline visualizations
  • Understand the process of training a machine learning model
    • Model fitting
    • Train and test sets split
    • Hyperparameter optimization
    • Evaluation
  • Work on field-specific examples

The course can be delivered in person or online. Either way, it involves a strong component of self-led learning.

This course is aimed at Python developers who want to learn how to carry out useful data analysis tasks. Over the years, Python has become a very popular tool for analysing data, and it is now supported by many libraries for machine learning, data querying, neural networks and exploratory analysis.

In this course we will use scikit-learn for machine learning to learn about whatever data may come across your desk. We will see in detail how to build and evaluate machine learning models to make data-driven decisions.

For this course we will be using a free tool called Jupyter Notebooks, which provides you with a local editor and Python terminal in your web browser. Setup instructions can be found here.

Intended learning outcomes

By the end of this course, you will:

  • Know how to use Jupyter Notebooks.
  • Be familiar with scikit-learn, pandas and seaborn.
  • Understand the process of training a machine learning model:
    • Select suitable models
    • Split data into training and test sets
    • Fit models to data
    • Tune hyperparameters using grid search
    • Evaluate model performance
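
To give a feel for how the steps listed above fit together, here is a minimal sketch using scikit-learn. The dataset (the built-in iris data), the model (a k-nearest-neighbours classifier) and the candidate hyperparameter values are illustrative assumptions chosen for brevity, not necessarily what the sessions will use.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Load a small built-in dataset (an illustrative choice, not course material).
X, y = load_iris(return_X_y=True)

# Split data into training and test sets (25% held out for testing).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Select a model and the hyperparameter values to try (assumed values).
param_grid = {"n_neighbors": [1, 3, 5, 7]}
search = GridSearchCV(KNeighborsClassifier(), param_grid, cv=5)

# Fit: grid search cross-validates every candidate on the training set only,
# then refits the best candidate on the whole training set.
search.fit(X_train, y_train)

# Evaluate the refitted best model on the held-out test set.
test_accuracy = search.score(X_test, y_test)
print("Best hyperparameters:", search.best_params_)
print("Test accuracy:", round(test_accuracy, 3))
```

Keeping the test set out of the grid search means the final accuracy is measured on data that played no part in choosing the hyperparameters.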

Let’s embark on this exciting journey into the world of data analysis with Python!