Machine Learning Strategy (Part 2)

This is the second blog post about machine learning strategy. It is about human-level performance, bias and variance (tradeoff) and how to improve your algorithm iteratively.

Human-Level Performance

• Human-Level Performance is the performance that can be achieved by human, by e.g. looking at the data.
• An algorithm should at least reach human-level performance; If not, there is potential for improvement
• Improvement can be achieved by e.g.
• Generating more training data (e.g. using humans)
• A manual analysis of the observations on which the algorithm performs bad
• An analysis of the bias-variance tradeoff (see later)
• The data could be structured in different ways for humans and computers; whereas humans can read in the information of a picture with the eyes, for computers this information has to be transformed to numbers.
• The Bayes Error is defined as the best performance that could be achieved by an algorithm for the task; It should theoretically always be below the human-level performance

Bias and Variance

There is a tradeoff between bias and variance, while learning machine learning algorithms. I will differentiate here between avoidable bias and variance:

• Avoidable bias: Difference in performance between training data and Bayes error (=human-level performance)
• Variance: Difference in performance between training data and development data.

Possible improvements:

Notes:

• Avoidable bias should be small, but not too small (problem of overfitting)
• Depending on performance of human, training and dev error either bias or variance reduction should be targeted
• If algorithmic dev set performance is better than human-level performance there is no clear indicator if bias or variance should be reduced

Examples:

Error analysis

Following things can be done in the error analysis:

• Get observations which cause problems (missclassification, big deviation)
• Which is the cause for this?
• Create table with percentage error of the cause compared to whole error
• Obtain the main causes which are responsible for the worse performance

Example for image classification with cats:

Concentration on problems that cause the lion part of the worse performance

• Can the problems be solved by adjusting the algorithm? (new features, other hyperparameters, other algorithms)
• Can the algorithm be improved by removing certain observations from the dataset? (e.g. outliers, wrongly classified observations, etc.)

• Set up train/dev/test data set
• Build the first algorithm quickly (on train)
• Execute a bias-variance and error analysis to detect the main problems (on train/dev)
• Make improvements in the data set/algorithm and analyse the resulting algorithm again (on train/dev)
• Do the final evaluation on test

→ „Build your first system quickly, then iterate to improve it“

→ In most cases data scientists build too complex procedures/algorithms than too simple ones, so think about the complexity of the machine learning system that you have built.

Next blog post

In the next blog post I will talk about what to do with different distribution in the train/dev/test set, how to learn with multiple tasks and advantages and disadvantages of end-to-end learning.

This blog post is partly based on information that is contained in a tutorial about deep learning on coursera.org that I took recently. Hence, a lot of credit for this post goes to Andrew Ng that held this tutorial.

Written on August 13, 2021