The MedScrub Story
A simple necessity that turned into something with legs of its own
How It Started
MedScrub was born out of frustration. While building CLaiR, the AI assistant that helps patients understand their medical records at Clarity Health Project, I (Clint) kept hitting the same wall: we couldn't use modern AI tools with patient data.
Every time we wanted to leverage ChatGPT, Claude, or any LLM to help translate and explain medical records, HIPAA compliance stopped us cold. We had amazing AI capabilities at our fingertips, but couldn't touch them with real healthcare data.
So we built a solution for ourselves — a way to strip out PHI, use AI safely, and restore the context afterward. What started as an internal tool to solve our own problem quickly became something bigger when other healthcare teams started asking for the same capability.
The Reality We're Facing
Let's be honest: healthcare professionals are already using AI. They're copying patient notes into ChatGPT, hoping no one notices. They're taking screenshots and uploading them to Claude. They want to provide better care, and AI helps them do that.
But this shadow IT practice is dangerous. It violates HIPAA, risks patient privacy, and creates liability for healthcare organizations. We can pretend it's not happening, or we can provide a safe, compliant way to do what's already being done.
MedScrub isn't about enabling new behavior — it's about making existing behavior safe.
What We Believe
Privacy First
Patient privacy isn't negotiable. Every decision we make starts with protecting PHI.
Developer Experience
5-minute deployment, not 5-month integration. We respect developers' time.
Clinical Accuracy
96% F1 score on unstructured text, 100% on FHIR data. Precision matters in healthcare.
Zero Trust
PHI stays on your infrastructure. We never see, store, or process patient data.
Open Approach
No vendor lock-in. Use any AI model. Your data, your choice.
Patient Outcomes
Technology should improve healthcare, not complicate it. Patient care comes first.
Built on Proven Methodologies
Ensemble Detection Algorithm
We achieve a 96% F1 score by combining multiple proven detection approaches in an ensemble pipeline. This isn't just regex pattern matching; it's a multi-layered system that catches edge cases other tools miss.
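To make the ensemble idea concrete, here is a minimal sketch of one way several detectors can be combined: each detector proposes character spans, and overlapping proposals are merged into a single span. This is an illustration under stated assumptions, not MedScrub's actual pipeline. The detector functions, the `MRN` format, and the merge rule (the first, longest span keeps its label) are all hypothetical.

```python
import re
from typing import Callable, List, Tuple

Span = Tuple[int, int, str]  # (start, end, label)


def detect_ssn(text: str) -> List[Span]:
    # US Social Security number written as 123-45-6789.
    return [(m.start(), m.end(), "SSN")
            for m in re.finditer(r"\b\d{3}-\d{2}-\d{4}\b", text)]


def detect_mrn(text: str) -> List[Span]:
    # Hypothetical record-number format: "MRN" followed by 6 to 10 digits.
    return [(m.start(), m.end(), "MRN")
            for m in re.finditer(r"\bMRN[:\s]*\d{6,10}\b", text)]


def detect_long_numbers(text: str) -> List[Span]:
    # Catch-all: any run of 6 or more digits is treated as a possible identifier.
    return [(m.start(), m.end(), "ID")
            for m in re.finditer(r"\b\d{6,}\b", text)]


def merge_spans(spans: List[Span]) -> List[Span]:
    # Sort by start position, longest first, then fold overlapping spans together.
    merged: List[Span] = []
    for start, end, label in sorted(spans, key=lambda s: (s[0], -(s[1] - s[0]))):
        if merged and start < merged[-1][1]:
            prev_start, prev_end, prev_label = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end), prev_label)
        else:
            merged.append((start, end, label))
    return merged


def ensemble_detect(text: str,
                    detectors: List[Callable[[str], List[Span]]]) -> List[Span]:
    # Union of every detector's proposals, with overlaps resolved by merge_spans.
    proposals = [span for detect in detectors for span in detect(text)]
    return merge_spans(proposals)


text = "Patient MRN: 12345678, SSN 123-45-6789."
found = ensemble_detect(text, [detect_ssn, detect_mrn, detect_long_numbers])
found_text = [(text[start:end], label) for start, end, label in found]
```

In this sketch the catch-all detector also flags `12345678`, but because that span overlaps the more specific `MRN` match, the two are merged and the `MRN` label is kept. A production system would likely weigh detectors by confidence rather than by span length.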
HIPAA Safe Harbor Compliance
Every detection method ensures comprehensive coverage of all 18 HIPAA Safe Harbor identifiers. We don't just catch the obvious ones — our system finds complex medical record numbers, institution-specific formats, and edge cases buried in clinical narratives.
Reversible Tokenization
Context-preserving tokens maintain clinical meaning while protecting privacy. "John Smith, 45-year-old" becomes "[NAME1], [AGE1]-year-old", so the AI still sees that a named patient and an age are being described without seeing the actual values, and the originals can be restored afterward. This approach, inspired by academic research, outperforms simple redaction.
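The round trip described above can be sketched in a few lines. This is a toy illustration, not MedScrub's implementation: the `ReversibleTokenizer` class, its method names, and the two regex patterns (a hard-coded name and an age followed by "-year-old") are assumptions made only for this example. Real detection would come from the detection pipeline rather than fixed patterns.

```python
import re


class ReversibleTokenizer:
    """Toy reversible tokenizer: swaps matched spans for numbered tokens."""

    def __init__(self, patterns):
        # patterns: mapping of token label -> compiled regex (illustrative only)
        self.patterns = patterns
        self.token_to_value = {}  # "[NAME1]" -> "John Smith"
        self.value_to_token = {}  # ("NAME", "John Smith") -> "[NAME1]"
        self.counters = {}        # label -> last index handed out

    def _token_for(self, label, value):
        # Reuse the same token for a repeated value so mentions stay consistent.
        key = (label, value)
        if key not in self.value_to_token:
            self.counters[label] = self.counters.get(label, 0) + 1
            token = f"[{label}{self.counters[label]}]"
            self.value_to_token[key] = token
            self.token_to_value[token] = value
        return self.value_to_token[key]

    def scrub(self, text):
        # Replace every match of every pattern with its token.
        for label, pattern in self.patterns.items():
            text = pattern.sub(
                lambda m, label=label: self._token_for(label, m.group(0)), text
            )
        return text

    def restore(self, text):
        # Put the original values back wherever their tokens appear.
        for token, value in self.token_to_value.items():
            text = text.replace(token, value)
        return text


# Hypothetical patterns for the example sentence only.
PATTERNS = {
    "NAME": re.compile(r"\bJohn Smith\b"),
    "AGE": re.compile(r"\b\d{1,3}(?=-year-old)"),
}

tokenizer = ReversibleTokenizer(PATTERNS)
scrubbed = tokenizer.scrub("John Smith, 45-year-old")
restored = tokenizer.restore(scrubbed)
```

Here `scrubbed` is what would be sent to the model, and `restore` can also be applied to the model's reply, since any tokens it echoes back map to the same original values. The token map never needs to leave the machine that ran `scrub`.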
Zero-Trust Architecture
The Docker container runs entirely on your infrastructure. PHI processing happens locally with no external calls. Deploy in 5 minutes, not 5 months, because we built it for ourselves first.
About the Founder
Hi, I'm Clint.
What started as a tool to solve our own HIPAA problems at Clarity Health Project has grown into something much bigger. Now healthcare teams everywhere can safely use any AI model they want — ChatGPT, Claude, Gemini, or their own custom models — without risking patient privacy or HIPAA violations.