The MedScrub Story

A simple necessity that took on a life of its own

How It Started

MedScrub was born out of frustration. While building CLaiR — the AI assistant that helps patients understand their medical records at Clarity Health Project — I (Clint) kept hitting the same wall: we couldn't use modern AI tools with patient data.

Every time we wanted to leverage ChatGPT, Claude, or any LLM to help translate and explain medical records, HIPAA compliance stopped us cold. We had amazing AI capabilities at our fingertips, but couldn't touch them with real healthcare data.

So we built a solution for ourselves — a way to strip out PHI, use AI safely, and restore the context afterward. What started as an internal tool to solve our own problem quickly became something bigger when other healthcare teams started asking for the same capability.

The Reality We're Facing

Let's be honest: healthcare professionals are already using AI. They're copying patient notes into ChatGPT, hoping no one notices. They're taking screenshots and uploading them to Claude. They want to provide better care, and AI helps them do that.

But this shadow IT practice is dangerous. It violates HIPAA, risks patient privacy, and creates liability for healthcare organizations. We can pretend it's not happening, or we can provide a safe, compliant way to do what's already being done.

MedScrub isn't about enabling new behavior — it's about making existing behavior safe.

What We Believe

Privacy First

Patient privacy isn't negotiable. Every decision we make starts with protecting PHI.

Developer Experience

5-minute deployment, not 5-month integration. We respect developers' time.

Clinical Accuracy

A 96% F1 score on unstructured text, 100% on FHIR data. Precision matters in healthcare.

Zero Trust

PHI stays on your infrastructure. We never see, store, or process patient data.

Open Approach

No vendor lock-in. Use any AI model. Your data, your choice.

Patient Outcomes

Technology should improve healthcare, not complicate it. Patient care comes first.

Built on Proven Methodologies

Ensemble Detection Algorithm

We achieve a 96% F1 score by combining multiple proven approaches in a sophisticated ensemble pipeline. This isn't just regex pattern matching — it's a multi-layered system that catches edge cases other tools miss.

• Regex Patterns
High-precision deterministic matching
• ML/NLP Models
spaCy, BioBERT, ClinicalBERT
• UCSF Philter Enhanced
Medical-specific entity patterns
• Probabilistic Methods
Bloom filters, fuzzy matching
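The layered approach above can be sketched as a simple ensemble: each detector proposes candidate PHI spans, and the pipeline takes their union so a miss by one layer can be caught by another. This is an illustrative sketch, not MedScrub's implementation — the detector functions, patterns, and entity labels are hypothetical stand-ins for the real layers.

```python
import re

def regex_detector(text):
    # High-precision deterministic patterns (illustrative subset)
    patterns = {
        "SSN": r"\b\d{3}-\d{2}-\d{4}\b",
        "PHONE": r"\b\d{3}-\d{3}-\d{4}\b",
    }
    spans = []
    for label, pat in patterns.items():
        for m in re.finditer(pat, text):
            spans.append((m.start(), m.end(), label))
    return spans

def dictionary_detector(text):
    # Stand-in for the ML/NLP or Philter-style layers, which would
    # use trained models rather than a hard-coded name list
    names = ["John Smith"]
    spans = []
    for name in names:
        idx = text.find(name)
        if idx != -1:
            spans.append((idx, idx + len(name), "NAME"))
    return spans

def ensemble_detect(text, detectors):
    # Union of every detector's spans, deduplicated and sorted by position
    spans = set()
    for detect in detectors:
        spans.update(detect(text))
    return sorted(spans)

text = "John Smith, SSN 123-45-6789, called 555-867-5309."
print(ensemble_detect(text, [regex_detector, dictionary_detector]))
```

Taking the union biases the ensemble toward recall, which is the right trade-off for de-identification: a false positive costs a little context, while a false negative leaks PHI.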

HIPAA Safe Harbor Compliance

Every detection method ensures comprehensive coverage of all 18 HIPAA Safe Harbor identifiers. We don't just catch the obvious ones — our system finds complex medical record numbers, institution-specific formats, and edge cases buried in clinical narratives.
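One way to keep that coverage honest is to treat the 18 Safe Harbor categories as a checklist and verify that every category is claimed by at least one detection layer. The category list below comes from 45 CFR 164.514(b)(2); the routing of categories to layers is a hypothetical example, not MedScrub's actual configuration.

```python
# The 18 HIPAA Safe Harbor identifier categories (45 CFR 164.514(b)(2))
SAFE_HARBOR_IDENTIFIERS = [
    "names", "geographic subdivisions smaller than a state",
    "dates related to an individual", "telephone numbers", "fax numbers",
    "email addresses", "social security numbers", "medical record numbers",
    "health plan beneficiary numbers", "account numbers",
    "certificate/license numbers", "vehicle identifiers", "device identifiers",
    "web URLs", "IP addresses", "biometric identifiers",
    "full-face photographs", "any other unique identifying number or code",
]

def coverage_report(detector_map):
    """Return the Safe Harbor categories not claimed by any detector."""
    covered = set()
    for identifiers in detector_map.values():
        covered.update(identifiers)
    return [i for i in SAFE_HARBOR_IDENTIFIERS if i not in covered]

# Hypothetical routing of categories to ensemble layers
detector_map = {
    "regex": ["telephone numbers", "fax numbers", "email addresses",
              "social security numbers", "web URLs", "IP addresses"],
    "ml_nlp": ["names", "geographic subdivisions smaller than a state",
               "dates related to an individual", "biometric identifiers",
               "full-face photographs"],
    "philter": ["medical record numbers", "health plan beneficiary numbers",
                "account numbers", "certificate/license numbers",
                "vehicle identifiers", "device identifiers",
                "any other unique identifying number or code"],
}
print(coverage_report(detector_map))  # an empty list means every category is claimed
```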

Reversible Tokenization

Context-preserving tokens maintain clinical meaning while protecting privacy. "John Smith, 45-year-old" becomes "[NAME1], [AGE1]-year-old" — the AI understands it's a middle-aged patient without seeing the PHI. This approach, inspired by academic research, outperforms simple redaction.
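The round trip can be sketched in a few lines: scrub detected spans into numbered, type-preserving tokens, keep the token map locally, and swap the originals back into the AI's response. This is a minimal sketch assuming PHI spans arrive as `(start, end, label)` tuples from the detection step; the function names are illustrative, not MedScrub's API.

```python
def tokenize(text, phi_spans):
    """Replace detected PHI spans with numbered, type-preserving tokens.

    phi_spans: list of (start, end, label) tuples from the detection step.
    Returns the scrubbed text plus the token map needed to reverse it.
    """
    counters = {}
    token_map = {}
    out = []
    last = 0
    for start, end, label in sorted(phi_spans):
        counters[label] = counters.get(label, 0) + 1
        token = f"[{label}{counters[label]}]"
        token_map[token] = text[start:end]
        out.append(text[last:start])
        out.append(token)
        last = end
    out.append(text[last:])
    return "".join(out), token_map

def detokenize(text, token_map):
    """Restore the original PHI once the AI response comes back."""
    for token, original in token_map.items():
        text = text.replace(token, original)
    return text

note = "John Smith, 45-year-old"
spans = [(0, 10, "NAME"), (12, 14, "AGE")]  # from the detection step
scrubbed, token_map = tokenize(note, spans)
print(scrubbed)  # [NAME1], [AGE1]-year-old
```

Because the token map never leaves your infrastructure, the LLM only ever sees `[NAME1]` and `[AGE1]` — enough context to reason about a middle-aged patient, nothing more.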

Zero-Trust Architecture

Docker container runs entirely on your infrastructure. PHI processing happens locally with no external calls. Deploy in 5 minutes, not 5 months — because we built it for ourselves first.

About the Founder

Hi, I'm Clint.

What started as a tool to solve our own HIPAA problems at Clarity Health Project has grown into something much bigger. Now healthcare teams everywhere can safely use any AI model they want — ChatGPT, Claude, Gemini, or their own custom models — without risking patient privacy or HIPAA violations.

Try MedScrub Today

Stop the risky copy-paste into ChatGPT. Start using AI safely.