Recent posts

Shapley Values

12 minute read

This is another blog post in the series on model explainability. Here I will provide a brief description of Shapley values in the context of explaining outpu...

Local Interpretable Model-Agnostic Explanations

17 minute read

Estimating permutation feature importances and plotting relationships between explanatory variables and model outputs by means of partial dependence plots ar...

Permutation Feature Importance

9 minute read

Investigating feature importances for a developed model is a very important step in achieving the goal of interpretable machine learning. This not only allow...

Partial Dependence Plots

11 minute read

The topic of model interpretability has gained a lot of attention recently with the rapid development of highly complex machine learning algorithms for deali...

Twenty-Sided Die Game

6 minute read

In this blog post, we discuss a mock quantitative interview question by Jane Street. Suppose you are invited to play the following game: at the s...

Introduction to Kalman Filter

12 minute read

In this post, we cover the theory behind a discrete-time Kalman filter. The Kalman filter is an algorithm that allows us to get more precise information about ...

Gradient Ascent Algorithm

6 minute read

According to Wikipedia, gradient descent (ascent) is a first-order iterative optimization algorithm for finding a local minimum (maximum) of a differentiable...

Reducing Errors in Data Analysis

2 minute read

Recently, a model vetter pointed out a mistake that I made when developing one of the models. The mistake was not of a methodological type, rather i...

Training, Validation and Test Datasets

2 minute read

According to different sources, it is advisable that the data that is used to build a model be split into 3 datasets: training, validation and test. This is ...

SMOTE Algorithm

7 minute read

This short blog post addresses the problem of imbalanced datasets. An imbalanced dataset is a dataset where the classes are not approximately equal...

Genetic Algorithms. Introduction.

6 minute read

This is an introductory post about genetic algorithms (GAs), which are a suite of methods for solving optimization problems. GAs form a subset of more genera...

Abalone. Part 2

9 minute read

This is the second part of the project about constructing a predictive model for the abalone dataset. In this post, we are going to fit 3 different regressi...

Abalone. Part 1

10 minute read

This is the first in a series of posts about constructing a predictive model of the age of abalone based on physical measurements, where abalone refers to a g...