All Projects

A complete look at my project work across machine learning, NLP, data infrastructure, and analytics.

NDA-compliant where noted
Segmentation

Customer Lifecycle Segmentation

Customers weren't being treated differently based on how they actually behaved, so I built a segmentation system that changed that. Using RFM clustering on large-scale transaction logs, I identified a 76% drop-off in a high-value segment that had never been spotted, and built predictive valuation models that put a $5.5M number on the addressable opportunity.

Python, RFM Clustering, BigQuery, Predictive Modeling
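As an illustration of the RFM idea only (not the project's code, which is not shown here), the sketch below aggregates transactions into recency, frequency, and monetary values and buckets each into terciles. The sample data, the function names `rfm_table` and `tercile_scores`, and the use of simple rank-based scoring instead of a clustering algorithm are all assumptions made for the example.

```python
from collections import defaultdict
from datetime import date


def rfm_table(transactions, as_of):
    """Aggregate (customer_id, purchase_date, amount) rows into R, F, M values."""
    last_seen = {}
    frequency = defaultdict(int)
    monetary = defaultdict(float)
    for customer_id, purchase_date, amount in transactions:
        if customer_id not in last_seen or purchase_date > last_seen[customer_id]:
            last_seen[customer_id] = purchase_date
        frequency[customer_id] += 1
        monetary[customer_id] += amount
    return {
        cid: {
            "recency_days": (as_of - last_seen[cid]).days,
            "frequency": frequency[cid],
            "monetary": monetary[cid],
        }
        for cid in last_seen
    }


def tercile_scores(values, higher_is_better=True):
    """Map each key to a 1-3 score by rank (3 = best third)."""
    ordered = sorted(values, key=values.get, reverse=not higher_is_better)
    n = len(ordered)
    return {key: 1 + (3 * rank) // n for rank, key in enumerate(ordered)}


# Hypothetical transaction log: (customer_id, purchase_date, amount).
transactions = [
    ("a", date(2024, 1, 5), 120.0),
    ("a", date(2024, 3, 1), 80.0),
    ("b", date(2023, 11, 20), 40.0),
    ("c", date(2024, 2, 10), 300.0),
    ("c", date(2024, 2, 25), 150.0),
    ("c", date(2024, 3, 10), 50.0),
]

table = rfm_table(transactions, as_of=date(2024, 3, 31))

# Fewer days since the last purchase is better, so recency is scored in reverse.
r = tercile_scores({c: v["recency_days"] for c, v in table.items()}, higher_is_better=False)
f = tercile_scores({c: v["frequency"] for c, v in table.items()})
m = tercile_scores({c: v["monetary"] for c, v in table.items()})

# Concatenate the three scores into a segment label such as "333".
segments = {c: f"{r[c]}{f[c]}{m[c]}" for c in table}
```

On real transaction logs the three values would typically be standardized and fed to a clustering step; the tercile labels here just make the segment logic easy to read.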
Attribution

Campaign Analytics & Association Rules

Built a post-campaign analysis framework to understand why some promotions worked and others didn't. Used Apriori-based association rule mining to surface cross-department purchase patterns, confirming an 11–32% performance range across campaigns and a 26% feature attachment rate that informed the next round of targeting strategy.

Python, Market Basket Analysis, Attribution Modeling
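To make the association-rule step concrete, here is a minimal sketch restricted to item pairs, which corresponds to the first two passes of Apriori (prune infrequent items, then count pairs among the survivors). The department names, the baskets, the `min_support` value, and the function name `pair_rules` are hypothetical and not taken from the project.

```python
from collections import Counter
from itertools import combinations


def pair_rules(baskets, min_support=0.4):
    """Return lhs -> rhs rules for frequent item pairs with support, confidence, lift."""
    n = len(baskets)
    item_counts = Counter(item for basket in baskets for item in set(basket))
    # Apriori pruning: a pair can only be frequent if both of its items are.
    frequent_items = {i for i, c in item_counts.items() if c / n >= min_support}

    pair_counts = Counter()
    for basket in baskets:
        kept = sorted(set(basket) & frequent_items)
        pair_counts.update(combinations(kept, 2))

    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_counts[lhs]
            lift = confidence / (item_counts[rhs] / n)
            rules.append(
                {"lhs": lhs, "rhs": rhs, "support": support,
                 "confidence": confidence, "lift": lift}
            )
    return sorted(rules, key=lambda rule: rule["lift"], reverse=True)


# Hypothetical baskets, one list of departments per transaction.
baskets = [
    ["bakery", "dairy"],
    ["bakery", "dairy", "produce"],
    ["dairy", "produce"],
    ["bakery", "dairy"],
    ["produce"],
]

rules = pair_rules(baskets, min_support=0.4)
by_pair = {(rule["lhs"], rule["rhs"]): rule for rule in rules}
```

A lift above 1 (as for bakery and dairy here) means the two departments appear together more often than independence would predict; a lift below 1 means less often.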
Pipeline Engineering

YouTube Safety Index Pipeline

Built a real-time content monitoring pipeline to flag unsafe YouTube content at scale. The system ingests network traffic in under 250 ms, parses content signals with custom Python ETL scripts, and routes flagged records into an Elasticsearch index for fast downstream querying. Think of it as a safety filter that never sleeps.

Live gateway traffic stream
REST API, Python (ETL), Elasticsearch, BERT Model, SSL Bumping
Business Intelligence
NDA

Retail Operations Dashboard

A retail chain running 14 branches had no single view of what was happening across the business. I designed an executive analytics layer in AWS QuickSight, built on reusable SQL KPI models, that surfaced live AOV, cancellation rates, and fulfillment ratios in one place. Leadership could finally see the whole picture.

Live QuickSight dashboard — orders & sales overview (anonymized)
AWS QuickSight, SQL, Retail Analytics
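A reusable KPI model can be as simple as a SQL view that every dashboard query reads from. The sketch below uses an in-memory SQLite database as a stand-in for the real warehouse; the `orders` table, its columns, the status values, and the exact KPI definitions (for example, AOV computed over non-cancelled orders) are assumptions for illustration, not the dashboard's actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical order data; the real schema is under NDA and not shown.
    CREATE TABLE orders (
        order_id INTEGER PRIMARY KEY,
        branch   TEXT,
        status   TEXT,
        total    REAL
    );
    INSERT INTO orders (branch, status, total) VALUES
        ('north', 'fulfilled', 40.0),
        ('north', 'fulfilled', 60.0),
        ('north', 'cancelled', 25.0),
        ('north', 'pending',   30.0),
        ('south', 'fulfilled', 80.0),
        ('south', 'cancelled', 20.0);

    -- One view holds the KPI definitions so every chart uses the same logic.
    CREATE VIEW branch_kpis AS
    SELECT
        branch,
        AVG(CASE WHEN status != 'cancelled' THEN total END)          AS aov,
        AVG(CASE WHEN status = 'cancelled' THEN 1.0 ELSE 0.0 END)    AS cancellation_rate,
        AVG(CASE WHEN status = 'fulfilled' THEN 1.0 ELSE 0.0 END)    AS fulfillment_ratio
    FROM orders
    GROUP BY branch;
    """
)

rows = conn.execute("SELECT * FROM branch_kpis ORDER BY branch").fetchall()
kpis = {branch: (aov, cancel, fulfil) for branch, aov, cancel, fulfil in rows}
conn.close()
```

Keeping the definitions in one view means a change to how, say, cancellations are counted propagates to every visual that reads from it.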
Data Identity

Entity Resolution & Deduplication Pipeline

When the same person shows up as five different records, your analytics lie to you. I built an automated entity linkage system using custom blocking schemas and string distance metrics to collapse ~9M records into clean, verified identity clusters, achieving 98% accuracy while keeping compute costs manageable.

Python, Entity Resolution, Record Linkage, PostgreSQL
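The combination of blocking and string distance can be sketched in a few lines. Blocking limits pairwise comparisons to records that share a cheap key, which is what keeps compute manageable at millions of rows. Everything specific below is an assumption for the example: the blocking key (postcode plus first letter of the name), the `difflib.SequenceMatcher` ratio as the similarity measure, the 0.85 threshold, and the sample records.

```python
from collections import defaultdict
from difflib import SequenceMatcher
from itertools import combinations


def blocking_key(record):
    # Hypothetical blocking schema: postcode plus first letter of the name.
    return (record["postcode"], record["name"].strip().lower()[:1])


def similarity(a, b):
    """Case-insensitive similarity in [0, 1] based on matching subsequences."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def resolve(records, threshold=0.85):
    """Group record indices into identity clusters using blocking + union-find."""
    parent = list(range(len(records)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    blocks = defaultdict(list)
    for idx, record in enumerate(records):
        blocks[blocking_key(record)].append(idx)

    # Only compare records inside the same block.
    for members in blocks.values():
        for i, j in combinations(members, 2):
            if similarity(records[i]["name"], records[j]["name"]) >= threshold:
                parent[find(j)] = find(i)

    clusters = defaultdict(list)
    for idx in range(len(records)):
        clusters[find(idx)].append(idx)
    return sorted(clusters.values())


# Hypothetical records.
records = [
    {"name": "Jonathan Smith", "postcode": "10001"},
    {"name": "Jonathon Smith", "postcode": "10001"},
    {"name": "Maria Garcia", "postcode": "10001"},
    {"name": "Jane Smyth", "postcode": "10001"},
    {"name": "Jonathan Smith", "postcode": "94105"},
]

clusters = resolve(records)
```

Note that the last record is never compared with the first two because it falls in a different block. That is the trade-off of blocking: fewer comparisons, at the risk of missing matches whose key fields disagree.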
Supply Chain
NDA

Demand Forecasting Engine

Seasonal spikes in retail are predictable — until they're not. I engineered an end-to-end forecasting pipeline using LightGBM and structural time-series models to predict daily product demand across regional distribution hubs. The model integrated historical transaction data with live promotional signals to stay accurate even during demand outliers.

Time-Series, LightGBM, Feature Engineering
Personalization
NDA

Purchase Propensity & Cross-Sell Engine

Designed a real-time personalization framework that scores a customer's likelihood to buy — and what to recommend next — at the moment of checkout. The system uses collaborative filtering on historical basket data and live clickstream signals to surface relevant cross-sells. Deployed at checkout where it matters most.

Python, SQL / Warehouse, Collaborative Filtering, Propensity Scoring
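For the cross-sell half of this, one common form of collaborative filtering on basket data is item-to-item cosine similarity over co-occurrence counts. The sketch below shows that idea only; it is not the deployed system, it leaves out propensity scoring and clickstream signals entirely, and the products, baskets, and function names are hypothetical.

```python
from collections import defaultdict
from itertools import combinations
from math import sqrt


def item_similarity(baskets):
    """Cosine similarity between items based on how often they share a basket."""
    item_counts = defaultdict(int)
    co_counts = defaultdict(int)
    for basket in baskets:
        items = sorted(set(basket))
        for item in items:
            item_counts[item] += 1
        for a, b in combinations(items, 2):
            co_counts[(a, b)] += 1

    sims = defaultdict(dict)
    for (a, b), co in co_counts.items():
        score = co / sqrt(item_counts[a] * item_counts[b])
        sims[a][b] = score
        sims[b][a] = score
    return sims


def recommend(cart, sims, top_n=2):
    """Rank items not already in the cart by summed similarity to cart items."""
    scores = defaultdict(float)
    for item in cart:
        for other, score in sims.get(item, {}).items():
            if other not in cart:
                scores[other] += score
    # Sort by score (descending), breaking ties alphabetically for stable output.
    return sorted(scores, key=lambda k: (-scores[k], k))[:top_n]


# Hypothetical historical baskets.
baskets = [
    ["phone", "case", "charger"],
    ["phone", "case"],
    ["phone", "charger"],
    ["laptop", "mouse"],
    ["laptop", "mouse", "charger"],
]

sims = item_similarity(baskets)
phone_recs = recommend(["phone"], sims)
laptop_recs = recommend(["laptop"], sims, top_n=1)
```

Because the similarity table can be precomputed offline, the checkout-time work reduces to a lookup and a small sort, which is what makes this style of recommender practical in a real-time setting.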
AI Engineering

AI Code Review Assistant

An interactive browser tool that reviews code snippets using the Claude API. It scores readability, structure, and maintainability, then surfaces three specific improvements, one positive note, and a critical flag if a real bug is detected. Built as part of the Careem Forward Deployment Engineer AI challenge.

Claude API, Prompt Engineering, JavaScript, Python, SQL