About me

I'm a data scientist whose work sits at the intersection of machine learning, data infrastructure, and human behavior. Over the years, I've built systems that turn noisy data into clarity and action, including a behavioral segmentation platform for a 20M+ member loyalty program that surfaced unrealized commercial opportunities at scale.
Outside of core industry work, I'm drawn to research questions at the edge of NLP, social computing, and mental health, particularly how language in online spaces signals distress and where AI's understanding of emotionally sensitive context breaks down.
I care about the engineering as much as the insight it produces: rigorous, thoughtful data work that speaks truth about human experience rather than just optimizing a funnel.

When I'm offline, you'll usually find me on a run, on the tennis court, or enjoying some off-screen time at the beach.

See my work

Download CV

Core Competencies

Machine Learning & NLP

Fraud detection, BERT-based NLP, clustering systems, recommender engines, and entity resolution: the kind of work that sits at the hard end of applied ML.

Experimentation & Strategy

A/B testing, multi-armed bandits, sample sizing, KPI design, and promotional optimization.

Data Infrastructure

Python (scikit-learn, pandas), BigQuery, PostgreSQL, REST APIs, ETL automation, Amazon QuickSight.

Featured Projects

View all projects

Segmentation

Customer Lifecycle Segmentation

Customers weren't being treated differently based on how they actually behaved, so I built a segmentation system that changed that. Using RFM clustering on large-scale transaction logs, I identified a previously unnoticed 76% drop-off in a high-value segment and built predictive valuation models that put a $5.5M number on the addressable opportunity.

Python, RFM Clustering, BigQuery, Predictive Modeling

View on GitHub

Data Identity

Entity Resolution & Deduplication Pipeline

When the same person shows up as five different records, your analytics lie to you. I built an automated entity linkage system that uses custom blocking schemas and string distance metrics to collapse ~9M records into clean, verified identity clusters, achieving 98% accuracy while keeping compute costs manageable.

Python, Entity Resolution, Record Linkage, PostgreSQL

View on GitHub

AI Engineering

AI Code Review Assistant

An interactive browser tool that reviews code snippets using the Claude API. It scores readability, structure, and maintainability, then surfaces three specific improvements, one positive note, and a critical flag if a real bug is detected. Built as part of the Careem Forward Deployment Engineer AI challenge.

Claude API, Prompt Engineering, JavaScript, Python, SQL

View Live Demo
