The Core HIPAA Question for AI Billing
HIPAA does not prohibit the use of AI for healthcare administration. What HIPAA prohibits is the use or disclosure of Protected Health Information (PHI) without proper authorization or safeguards. The question is not “can we use AI?” — it is “does the AI receive PHI, and if so, under what safeguards?”
There are two architecturally different ways to use AI in medical billing:
- Architecture A (PHI reaches the AI): The AI model receives raw clinical notes or claims data that includes patient names, dates of birth, diagnoses, and other identifiers. In this case, the AI vendor is a Business Associate, a BAA is required, and the vendor must implement the full HIPAA Security Rule safeguards.
- Architecture B (PHI de-identified before the AI): PHI is stripped of all 18 Safe Harbor identifiers before transmission to the AI. The AI receives a clinical description — “a 47-year-old male with Type 2 diabetes and hypertension presented for a follow-up visit” — not “John Smith, DOB 03/15/1978, MRN 00012345.” De-identified data is not PHI under HIPAA, so the AI vendor is not a Business Associate and no BAA is required with them.
Architecture B is the correct approach for responsible AI medical billing. It provides the AI with everything it needs to perform accurate coding and analysis — the clinical facts — without transmitting any patient identity information.
The 18 Safe Harbor Identifiers
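HIPAA Safe Harbor de-identification (45 CFR §164.514(b)(2)) requires removing 18 specific types of identifiers of the individual and of the individual's relatives, employers, and household members. The covered entity must also have no actual knowledge that the remaining information could be used, alone or in combination, to identify the individual. Data that meets both conditions is no longer PHI, so HIPAA's restrictions on use and disclosure no longer apply to it.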
The 18 identifiers are:
- Names (first, last, initials)
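- Geographic subdivisions smaller than a state (street address, city, county, ZIP code). The first three digits of a ZIP code may be kept only when that three-digit area contains more than 20,000 people; otherwise they are replaced with 000.
- All elements of dates (except year) directly related to an individual: birth date, admission date, discharge date, date of death. Ages over 89, and any date elements (including year) that reveal such an age, must be aggregated into a single category of 90 or older.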
- Telephone numbers
- Fax numbers
- Email addresses
- Social Security numbers
- Medical record numbers
- Health plan beneficiary numbers
- Account numbers
- Certificate and license numbers
- Vehicle identifiers and serial numbers (including VINs)
- Device identifiers and serial numbers
- URLs
- IP addresses
- Biometric identifiers (fingerprints, voiceprints)
- Full-face photographs and comparable images
- Any other unique identifying number, characteristic, or code
Note what is NOT on this list: diagnosis codes (ICD-10), procedure codes (CPT), age (as long as not over 89), specialty, payer name, place of service, and clinical descriptions. These are exactly the data points an AI coding engine needs — and they can be transmitted freely after Safe Harbor de-identification.
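The generalization rules for ZIP codes, ages, and dates can be sketched in a few lines of Python. This is a minimal illustration, not a complete de-identifier: the function names are hypothetical, and the `RESTRICTED_ZIP3` set holds illustrative placeholder values only. A real system would load the current list of three-digit ZIP prefixes covering 20,000 or fewer people from Census data.

```python
from datetime import date

# ASSUMPTION: illustrative placeholder values only. A production system must
# load the current list of 3-digit ZIP prefixes that cover 20,000 or fewer
# people from Census data.
RESTRICTED_ZIP3 = {"036", "059", "102"}


def generalize_zip(zip_code: str) -> str:
    """Keep the first three digits, or return "000" for low-population prefixes."""
    prefix = zip_code.strip()[:3]
    return "000" if prefix in RESTRICTED_ZIP3 else prefix


def generalize_age(age: int) -> str:
    """Report every age over 89 as the single category "90+"."""
    return "90+" if age > 89 else str(age)


def generalize_date(d: date) -> str:
    """Reduce a date tied to the individual to its year."""
    return str(d.year)


print(generalize_zip("10001"), generalize_age(47), generalize_date(date(2024, 3, 15)))
```

One caveat the sketch does not handle: for a patient over 89, the year of birth itself reveals the age and would also need to be withheld.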
How HIPAA-Compliant AI Coding Works in Practice
Step 1: PHI De-identification at the Source System
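When a provider completes a session note in the EHR or billing system, the AI billing platform's integration layer extracts the clinical content and immediately strips all 18 Safe Harbor identifiers. Patient names become tokens (internal references that allow the system to match records without exposing the name). To stay within Safe Harbor, a token must not be derived from patient information, and the mapping back to the patient must not be disclosed (45 CFR §164.514(c)). Dates are generalized to the year, which is what Safe Harbor allows; shifting dates while keeping day and month precision is generally treated as an Expert Determination technique rather than a Safe Harbor one. The result is a de-identified clinical note that contains everything the AI needs: presenting complaint, assessment, plan, procedures performed.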
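A minimal sketch of the tokenization idea in Step 1, assuming an in-memory store for illustration. The `TokenVault` class and its method names are hypothetical; a real platform would back this with an encrypted, access-controlled database that never leaves the billing system. Tokens are generated randomly rather than hashed from patient data, so they carry no information about the patient.

```python
import secrets


class TokenVault:
    """Hypothetical in-memory stand-in for a secured token store."""

    def __init__(self):
        self._patient_to_token = {}
        self._token_to_patient = {}

    def tokenize(self, patient_id: str) -> str:
        """Return a stable random token for this patient, creating one if needed."""
        if patient_id not in self._patient_to_token:
            # Random token: not derived from any patient information.
            token = "tok_" + secrets.token_hex(16)
            self._patient_to_token[patient_id] = token
            self._token_to_patient[token] = patient_id
        return self._patient_to_token[patient_id]

    def resolve(self, token: str) -> str:
        """Map a token back to the patient; only the billing platform can do this."""
        return self._token_to_patient[token]


vault = TokenVault()
token = vault.tokenize("MRN-00012345")
print(token.startswith("tok_"), vault.resolve(token))
```

Only the token travels with the de-identified note. The mapping stays inside the vault, which is what lets Step 3 restore the claim later.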
Step 2: AI Coding on De-identified Data
The de-identified note is passed to the AI coding engine. The AI reads the clinical content and assigns ICD-10 diagnosis codes and CPT procedure codes based on the documented encounter. It also identifies appropriate modifiers, checks for bundling issues, and flags potential documentation gaps that would affect coding accuracy.
Because the AI never sees the patient's name, DOB, or other identifiers, this step occurs entirely outside of HIPAA's PHI restrictions. The AI provider is not a Business Associate.
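Because everything in this step depends on the note really being free of identifiers, one reasonable safeguard is a last-line check on the outgoing text before it is sent to the AI. The sketch below is an assumption about how such a check might look, not a description of any particular product. The patterns and function names are hypothetical, and regular expressions cannot detect names or free-text identifiers, so this complements the Step 1 de-identification rather than replacing it.

```python
import re
from typing import List

# ASSUMPTION: a small, illustrative pattern set. It catches obvious formats
# only and cannot detect patient names or other free-text identifiers.
RESIDUAL_PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "phone": re.compile(r"\b\d{3}[-. ]\d{3}[-. ]\d{4}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    "full_date": re.compile(r"\b\d{1,2}/\d{1,2}/\d{2,4}\b"),
}


def find_residual_identifiers(text: str) -> List[str]:
    """Return the sorted names of all patterns that match the text."""
    return sorted(name for name, pattern in RESIDUAL_PATTERNS.items() if pattern.search(text))


def assert_safe_to_send(text: str) -> str:
    """Return the text unchanged, or raise if any identifier pattern matches."""
    hits = find_residual_identifiers(text)
    if hits:
        raise ValueError(f"Possible identifiers found: {hits}")
    return text


note = "A 47-year-old male with Type 2 diabetes and hypertension presented for a follow-up visit."
print(find_residual_identifiers(note))
print(find_residual_identifiers("John Smith, DOB 03/15/1978, call 555-123-4567"))
```

Failing closed (raising instead of silently redacting) keeps a de-identification bug from turning into a disclosure.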
Step 3: Human Review Before Submission
The AI's suggested codes are returned to the billing platform and matched back to the original claim (using the internal token) with PHI restored. The billing team sees the complete claim (patient name, codes, and AI suggestions) and reviews it before submission. This human review step is both a compliance best practice and a quality control mechanism: any AI coding errors are caught here, before the claim reaches the payer.
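The review gate can be enforced in code as well as in process, so that an unreviewed claim cannot be submitted at all. The sketch below shows one way to model that; `ClaimDraft`, `ReviewStatus`, and `submit` are hypothetical names, and the codes are sample values for the diabetes and hypertension follow-up used earlier.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class ReviewStatus(Enum):
    PENDING = "pending"
    APPROVED = "approved"


@dataclass
class ClaimDraft:
    token: str                      # internal token from the de-identification step
    suggested_codes: List[str]      # codes proposed by the AI
    status: ReviewStatus = ReviewStatus.PENDING
    reviewer_id: Optional[str] = None

    def approve(self, reviewer_id: str, final_codes: Optional[List[str]] = None) -> None:
        """Record the human reviewer and any corrections to the AI's codes."""
        if final_codes is not None:
            self.suggested_codes = final_codes
        self.reviewer_id = reviewer_id
        self.status = ReviewStatus.APPROVED


def submit(claim: ClaimDraft) -> str:
    """Refuse to submit any claim that a human has not approved."""
    if claim.status is not ReviewStatus.APPROVED:
        raise PermissionError("Claim must be reviewed by a human before submission")
    return f"submitted:{claim.token}"


draft = ClaimDraft(token="tok_abc", suggested_codes=["E11.9", "I10", "99213"])
draft.approve(reviewer_id="biller-7")
print(submit(draft))
```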
Step 4: Claim Submission Through Encrypted Channels
Approved claims are submitted to clearinghouses via HIPAA-compliant EDI transactions (837P for professional claims). All data in transit uses TLS 1.3 encryption. All data at rest uses AES-256 encryption.
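On the client side, a TLS 1.3 floor can be set with Python's standard `ssl` module. This is a sketch of the transport setting only, assuming a Python client; the helper name `make_tls13_context` is hypothetical, and encryption at rest is a separate concern handled by the storage layer.

```python
import ssl


def make_tls13_context() -> ssl.SSLContext:
    """Build a client context that verifies certificates and refuses anything below TLS 1.3."""
    context = ssl.create_default_context()  # certificate and hostname checks on by default
    context.minimum_version = ssl.TLSVersion.TLSv1_3
    return context


context = make_tls13_context()
print(context.minimum_version)
```

The context can then be passed to any standard-library client that accepts one, for example `http.client.HTTPSConnection(host, context=context)`.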
BAA Requirements for AI Medical Billing Platforms
Even when the AI component uses de-identified data, your medical billing platform vendor still receives and processes PHI at other stages — patient registration, eligibility verification, ERA posting. This makes the billing platform vendor a Business Associate, and a BAA is required.
A compliant BAA for an AI medical billing platform must address:
- PHI handling scope: What PHI the vendor receives, stores, and processes, and for what purposes
- Security safeguards: Encryption standards, access controls, employee training requirements
- AI-specific disclosure: Explicit statement of which data is de-identified before AI processing, and confirmation that PHI is never transmitted to AI model providers
- Breach notification: Obligation to notify the covered entity without unreasonable delay, and no later than 60 calendar days after discovering a breach (45 CFR §164.410, part of the Breach Notification Rule at 45 CFR §§164.400–414)
- Subcontractor BAAs: Obligation to enter BAAs with all subcontractors who receive PHI
- Data return/destruction: What happens to PHI if the contract terminates
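The 60-day breach notification limit is plain calendar arithmetic, which makes it easy to track automatically. A small sketch, with a hypothetical function name; it computes the outer limit only, and a BAA may set a shorter window.

```python
from datetime import date, timedelta

# Outer limit in calendar days; a BAA may contractually require less.
MAX_NOTIFICATION_DAYS = 60


def latest_notification_date(discovered_on: date) -> date:
    """Return the last calendar day on which the covered entity may be notified."""
    return discovered_on + timedelta(days=MAX_NOTIFICATION_DAYS)


print(latest_notification_date(date(2025, 1, 10)))  # 2025-03-11
```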
What to Look for in an AI Billing Vendor's HIPAA Compliance
When evaluating an AI medical billing platform, ask these specific questions:
- Does PHI reach your AI model provider? If yes, do you have a BAA with them?
- What de-identification method do you use? Is it Safe Harbor or Expert Determination (statistical de-identification)?
- What encryption standards do you use at rest and in transit?
- What is your breach notification timeline and process?
- What are the terms of your BAA, and can I see a sample before signing?
- Are you SOC 2 audited? What is your infrastructure provider and their compliance posture?
How Datricx Handles HIPAA Compliance
Datricx uses Architecture B exclusively. PHI is de-identified under HIPAA Safe Harbor (45 CFR §164.514(b)) — all 18 identifiers removed — before any data reaches Google Gemini, the AI model used for claims coding and denial analysis. Gemini never receives a patient name, date of birth, SSN, or any other direct identifier.
All PHI stored in Datricx is encrypted with AES-256 at rest via Supabase (SOC 2 Type II audited infrastructure). All data in transit uses TLS 1.3. Access is logged with user ID, timestamp, and action.
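For readers building their own tooling, an access log entry with those three fields might look like the sketch below. This is an illustrative assumption, not Datricx's actual implementation; the `AccessLogEntry` class, the `log_access` function, and the JSON layout are all hypothetical.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class AccessLogEntry:
    user_id: str
    timestamp: str  # ISO 8601, UTC
    action: str


def log_access(user_id: str, action: str, now: Optional[datetime] = None) -> str:
    """Serialize one access event as a JSON line; `now` is injectable for testing."""
    moment = now or datetime.now(timezone.utc)
    entry = AccessLogEntry(user_id=user_id, timestamp=moment.isoformat(), action=action)
    return json.dumps(asdict(entry), sort_keys=True)


print(log_access("biller-7", "view_claim"))
```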
Datricx provides a signed Business Associate Agreement to all Growth and Enterprise customers before any PHI is processed. The BAA covers all of the elements described above and is available for legal review before signing.