Case Study  ·  Power BI  ·  Healthcare Analytics

Medicaid Provider
Billing Anomaly
Detection

End-to-end analytics engineering — from six raw public data sources through a Python ML scoring pipeline into a purpose-built Power BI semantic model, with a fully custom visual layer on top. Built collaboratively with Claude (Anthropic) via Model Context Protocol. 617,503 scored providers. $1.09 trillion in Medicaid program spend.

Power BI · DAX · Power Query / M · Python · Scikit-learn (IF/LOF/CBLOF) · Pandas / NumPy · HTML / CSS · Azure Maps · Claude via MCP
$1.09T
Total Spend Analyzed
617.5K
Providers Scored
$362.2B
Flagged Spend
2.2x
OIG Validation Lift
01

What This Demonstrates

Start with a data-first framework. Let the statistical patterns define the story. Embed AI across every layer of the workflow. That's how one practitioner — with no prior healthcare background — built a production-grade, end-to-end analytics platform analyzing $1.09T in Medicaid claims data from a cold start.

What it proves: the ability to architect a complex multi-source semantic model, engineer features and integrate ML outputs into a BI layer, and build a custom visual design system from scratch — all while designing for a real end-user workflow, not just for screenshots.

02

The Challenge

Medicaid program integrity is a needle-in-a-haystack problem at massive scale. With 617,503 scored providers across 7 years of claims history, identifying which providers warrant closer review requires more than simple rules or threshold filters — the signal-to-noise ratio demands a statistical approach.

The goal was to build a system that could score every provider by anomaly risk, surface the specific behavioral patterns driving those scores, and present findings in a way that a non-technical reviewer could act on immediately — without needing a data science background to interpret the output.

Before any of that was possible, six heterogeneous public datasets that were never designed to work together had to be assembled into a single coherent analytical layer. That data engineering problem was the real challenge.

●  Key Finding
The top 30,991 flagged providers — just 5.0% of all scored providers — account for $362.2 billion, or 33.1% of total program spend. Surfacing this concentration is the central analytical output of the platform.
03

Data Engineering

Six public datasets — seven years of CMS Medicaid claims files, the NPI provider registry, HCPCS procedure codes, ZIP demographics, and geographic reference tables — each built for separate administrative purposes with different identifiers. Joining them into a single analytical layer required building the key bridges manually and resolving provider identity conflicts across address and name variants before any analysis could begin.
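A minimal sketch of the bridging problem with pandas — the column names (`npi`, `zip`, `county_fips`) and the tiny inline extracts are illustrative, not the actual source schemas. The point is the left-join pattern: unmatched providers survive the join and are routed to manual identity resolution rather than silently dropped.

```python
import pandas as pd

# Hypothetical mini-extracts of three of the six sources (schemas illustrative).
claims = pd.DataFrame({
    "npi": ["1001", "1002", "1003"],
    "hcpcs": ["99213", "J3490", "99214"],
    "paid": [120.0, 950.0, 180.0],
})
registry = pd.DataFrame({
    "npi": ["1001", "1002"],
    "specialty": ["Cardiology", "Internal Medicine"],
    "zip": ["30301", "10001"],
})
zip_xwalk = pd.DataFrame({
    "zip": ["30301", "10001"],
    "county_fips": ["13121", "36061"],
})

# Left-join so unmatched NPIs survive for manual identity resolution.
bridged = (
    claims
    .merge(registry, on="npi", how="left", indicator="npi_match")
    .merge(zip_xwalk, on="zip", how="left")
)
unresolved = bridged[bridged["npi_match"] == "left_only"]
print(len(bridged), len(unresolved))
```

With real CMS and NPPES files the same pattern applies per release year, with the unresolved set feeding the name/address conflict-resolution step.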

■  Source Data — 6 Systems
CMS Medicaid (7 annual releases), NPPES NPI Provider Registry, HCPCS procedure code reference files, Census ACS ZIP demographics, HUD/USPS ZIP crosswalk, and state/county reference tables.
■  Python Pipeline
Merges and normalizes seven annual CMS release files, resolves provider identity conflicts, and computes the behavioral features that drive risk scoring — growth rates, billing volatility, concentration ratios, gap indicators, and spend patterns — across each provider's full claims history.
■  Semantic Model
A purpose-built analytical layer spanning provider billing facts, longitudinal year-over-year records, ML score outputs, geographic risk layers, ZIP demographics, procedure code references, and AI-enriched provider profiles — structured for sub-second query performance across the full provider population.
■  Address-Level Risk Profiling
A separate profiling layer analyzes every shared billing address — flagging clusters of providers with coordinated entry patterns, sequential NPI registration, billing Gini concentration, and flag rates benchmarked against national, state, specialty, and ZIP averages. Surfaces organized billing patterns that individual provider scoring alone would miss.
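The billing Gini mentioned above can be sketched as follows — the exact formula and thresholds used by the platform aren't specified in this write-up, so this is a standard Gini coefficient over per-provider spend at one address, shown here with made-up numbers.

```python
import numpy as np

def billing_gini(spend):
    """Gini coefficient of billing spend across providers at one address.
    0 = spend spread evenly; near 1 = one provider dominates the address."""
    x = np.sort(np.asarray(spend, dtype=float))
    n = x.size
    i = np.arange(1, n + 1)
    return float((2 * (i * x).sum()) / (n * x.sum()) - (n + 1) / n)

even = billing_gini([100, 100, 100, 100])   # spend spread evenly across providers
skewed = billing_gini([5, 5, 5, 985])       # one provider bills ~98% of the address
print(round(even, 3), round(skewed, 3))
```

A high Gini at an address with many registered providers is one of the coordinated-entry signals the profiling layer benchmarks against national and ZIP averages.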
04

ML Scoring Pipeline

A three-component unsupervised ensemble scores all 617,503 providers without training labels. Components run independently and combine via rank fusion — no circular dependency, no ground truth required. OIG exclusion data is used only as blind post-hoc validation.

Component 1 — Behavioral Anomaly (40% weight). Isolation Forest + Local Outlier Factor run on 20 billing and Z-score features per provider. All features normalized within specialty peer groups — a cardiologist is only compared to other cardiologists. Contamination rate 0.02. Flags providers statistically extreme relative to their specialty peers.
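A minimal sketch of Component 1 with scikit-learn, using a synthetic 200-provider peer group and a planted outlier. The contamination rate (0.02) and the within-specialty z-scoring come from the description above; the 50/50 blend of the two detectors and the feature names are assumptions for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import IsolationForest
from sklearn.neighbors import LocalOutlierFactor

rng = np.random.default_rng(0)
# Hypothetical features for 200 providers in one specialty peer group.
df = pd.DataFrame({
    "paid_per_claim": rng.normal(150, 20, 200),
    "claims_per_patient": rng.normal(4, 1, 200),
})
df.loc[0, ["paid_per_claim", "claims_per_patient"]] = [900, 25]  # planted outlier

def score_peer_group(g):
    # Z-score within the specialty so providers are only compared to peers.
    X = (g - g.mean()) / g.std()
    iso = IsolationForest(contamination=0.02, random_state=0).fit(X)
    lof = LocalOutlierFactor(contamination=0.02)
    lof.fit(X)
    # Negate so higher = more anomalous for both detectors, then average.
    return 0.5 * (-iso.score_samples(X)) + 0.5 * (-lof.negative_outlier_factor_)

scores = pd.Series(score_peer_group(df), index=df.index)
print(scores.idxmax())  # the planted outlier ranks highest
```

In the full pipeline this runs per specialty group over 20 features, so a cardiologist's volatility is never scored against a dermatologist's baseline.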
Component 2 — Billing Exposure (25% weight). Severity-weighted spend concentration in 10 high-abuse billing categories, amplified by specialty mismatch multiplier. Critical categories apply a 5.5x multiplier when billed outside the provider's specialty. Detects vehicle-style billing patterns regardless of behavioral anomaly score.
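A sketch of the exposure logic in Component 2. The category names and severity weights below are invented placeholders — only the 5.5x out-of-specialty multiplier for critical categories is stated above.

```python
# Hypothetical severity weights for high-abuse billing categories
# (values illustrative, not the platform's actual weights).
SEVERITY = {"dme": 3.0, "genetic_testing": 4.5, "telehealth_em": 2.0}
CRITICAL = {"genetic_testing"}
MISMATCH_MULTIPLIER = 5.5  # critical category billed outside specialty

def exposure_score(spend_by_category, total_spend, specialty_categories):
    """Severity-weighted share of spend in high-abuse categories, amplified
    when a critical category is billed outside the provider's specialty."""
    score = 0.0
    for cat, spend in spend_by_category.items():
        weight = SEVERITY.get(cat, 0.0)
        if cat in CRITICAL and cat not in specialty_categories:
            weight *= MISMATCH_MULTIPLIER
        score += weight * (spend / total_spend)
    return score

# A primary-care provider billing heavy genetic testing scores far higher
# than a genetics lab with the identical billing mix.
print(exposure_score({"genetic_testing": 40_000}, 100_000, set()))
print(exposure_score({"genetic_testing": 40_000}, 100_000, {"genetic_testing"}))
```

Because the score depends only on spend mix, it fires on concentrated high-risk billing even when the behavioral component sees nothing statistically extreme.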
Component 3 — Geographic Ring (35% weight). High-risk specialty density within 5 miles, normalized to urban/suburban/rural local baseline using BallTree spatial indexing. CBLOF geographic clustering adds a secondary anomaly signal. Catches billing mills even when providers bill legitimately within their specialty.
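The BallTree density step of Component 3 can be sketched as below (the CBLOF clustering signal and the urban/suburban/rural normalization are omitted). Coordinates are made up: three providers clustered in one city, one isolated.

```python
import numpy as np
from sklearn.neighbors import BallTree

EARTH_RADIUS_MI = 3958.8
FIVE_MILES_RAD = 5.0 / EARTH_RADIUS_MI  # haversine works in radians

# Hypothetical (lat, lon) for four high-risk-specialty providers:
# three clustered within a couple of miles, one isolated.
coords = np.radians([
    [29.76, -95.36], [29.77, -95.37], [29.75, -95.35],  # cluster
    [46.87, -113.99],                                    # isolated
])
tree = BallTree(coords, metric="haversine")

# Count of high-risk peers within 5 miles of each provider (minus self).
density = tree.query_radius(coords, r=FIVE_MILES_RAD, count_only=True) - 1
print(density)  # → [2 2 2 0]
```

The raw counts are then normalized to the local urban/suburban/rural baseline, so a dense downtown medical district isn't penalized for being dense.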
05

Dashboard Architecture

The platform is structured as six purpose-built pages, each answering a distinct analytical question at a different level of granularity — from $1.09T program overview down to a single provider's behavioral fingerprint.

Page 01
Medicaid Provider Spending
Top-level program overview. KPI cards, risk tier concentration analysis, year-over-year trends, top states and procedure codes by elevated spend. Azure map with provider density overlay.
Page 02
Medicaid Anomaly Explorer
Risk tier deep-dive with spend concentration breakdowns. Unsupervised ensemble methodology documentation, OIG validation results (2.2x above random baseline), and full model performance disclosure.
Page 03
Provider Anomaly Map
Geographic anomaly view. Bubble map sized by anomaly probability across all 617K scored providers. Filter by state, tier, and specialty. Surfaces geographic clustering patterns in billing behavior.
Page 04
Provider Explorer
Filterable table of all 617K providers. Sort by anomaly score, total spend, paid per claim, or years active. Filter by state, specialty, tier, and OIG status. Right-click any row to drill through to Provider Details.
Page 05
Provider Details (Drillthrough)
Individual provider investigation page. KPI benchmarks, anomaly score panel, behavioral signal breakdown, top HCPCS procedure analysis, spend trend, peer comparison matrix, and Azure Maps location panel. Fully dynamic across all 617K providers.
Page 06
About This Project
Project purpose, data sources, key assumptions, methodology limitations, tech stack, and full research disclaimer. Designed to give any reviewer complete context before acting on the analysis.
06

Technical Highlights

DAX measure layer — dynamic HTML rendering, benchmark deviation calculations against national and state averages, multi-year trend logic, ML score interpretation, and conditional signal thresholds. No measure is a simple aggregation — every calculation is contextual, filter-aware, and drives a visual output directly.
Custom HTML visual design system — the entire front end is built in HTML/CSS rendered inside the HTML Content visual. A consistent design token system replaces every native Power BI visual on the page. KPI cards, panel headers, signal indicator rows, verdict bars, benchmark comparisons — all hand-built in DAX measures that return complete HTML documents.
Fully dynamic drillthrough provider card — the Provider Detail page regenerates completely for any of the 617,503 scored NPIs. Signal indicators color red/orange/green against computed statistical thresholds. A verdict bar calculates concern level from the live signal count. HCPCS breakdown, spend trend, benchmark KPIs, and an Azure Maps location panel all update simultaneously from a single drillthrough click.
Dual-mode geographic visualization — Azure Maps integration with a provider-level bubble map sized by anomaly probability and a ZIP-level choropleth risk layer. Both rendering modes are driven by computed map tables in the semantic model, not raw coordinate columns, enabling filter-context awareness across both views.
Measure architecture organized by function — measures are structured across named display folders covering KPI cards, chart tooltips, drillthrough panels, map layers, geographic views, ML score outputs, methodology documentation, and header visuals. Each folder is its own self-contained system — the tooltip layer alone spans KPI tooltips, chart tooltips, map tooltips, and ZIP-level context panels, each rendering a complete HTML document on hover.
07

Tech Stack

Power BI Desktop
DAX
Power Query / M
Python
Scikit-learn (IF/LOF/CBLOF)
Pandas / NumPy
HTML / CSS
Azure Maps