Machine-Learning

Building a Private MLOps Platform on AWS

1 September 2025·3 mins

Machine-Learning Mlops Aws Terraform Mlflow

How I deployed MLflow as an authenticated experiment-tracking server on AWS and integrated it into a reusable ML toolkit.

Using Learning Curves to Know Whether More Data Will Help

10 March 2025·4 mins

Machine-Learning Sample-Size Genomics Methodology

Learning curves won’t tell you exactly how many samples to collect, but they will tell you whether collecting more is worth it at all. In domains where each sample costs real money, that’s the question that actually matters.

Building a Reusable ML Toolkit for Genomic Models

1 March 2025·2 mins

Machine-Learning Mlops Python Scikit-Learn Mlflow

A Python package with a YAML-driven pipeline builder and a prediction CLI.

When Random Features Work Just as Well

20 January 2025·3 mins

Machine-Learning Feature-Selection Genomics Dimensionality

On the counterintuitive finding that randomly selecting features from high-dimensional genomic data often matches the performance of careful feature engineering, and why that makes mathematical sense.

Scaling ML Training for Epigenetic Age Prediction

15 November 2024·3 mins

Machine-Learning Aws Sagemaker Mlops Scikit-Learn

How parallelising hyperparameter tuning on SageMaker turned a single-instance grid search into a 100x faster training workflow.
