PreventiveLink

How We Build the Preventive $0 Code List

As of the February 6, 2026 full ingest run

Problem We Are Solving

  • Some preventive services should cost members $0, depending on plan rules.
  • There is no single API or file with every preventive CPT/HCPCS code.
  • We need a list that is accurate, explainable, and easy to audit.

Source Strategy

We pull several trusted datasets, normalize them, filter out codes that are not safely preventive (for example, major treatment procedures), and publish one master table that adjudication reads.

Pipeline stages:

  • Source pull: fetch raw inputs
  • Snapshot/checksum: audit raw bytes
  • Parse/normalize: CPT/HCPCS focus
  • Reference tables: traceable staging
  • Master list: conservative publish
  • Adjudication tiers: conditions + rules

Sources: VSAC (NLM), CDC CVX -> CPT, CMS PFS, USPSTF
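
The snapshot/checksum stage could look roughly like the sketch below. It assumes snapshots are plain files named by source and SHA-256 digest; the function names `snapshot_raw` and `verify_snapshot` and the record fields are hypothetical and not taken from the PreventiveLink code.

```python
import hashlib
from pathlib import Path


def snapshot_raw(raw: bytes, source: str, snapshot_dir: Path) -> dict:
    """Store raw bytes unmodified and return an audit record (hypothetical shape)."""
    digest = hashlib.sha256(raw).hexdigest()
    snapshot_dir.mkdir(parents=True, exist_ok=True)
    # Name the file by source and digest so identical pulls map to the same file.
    path = snapshot_dir / f"{source}_{digest[:16]}.bin"
    path.write_bytes(raw)
    return {"source": source, "sha256": digest, "size": len(raw), "path": str(path)}


def verify_snapshot(record: dict) -> bool:
    """Re-hash the stored file and compare it with the recorded checksum."""
    stored = Path(record["path"]).read_bytes()
    return hashlib.sha256(stored).hexdigest() == record["sha256"]
```

Hashing the bytes before any parsing means an auditor can confirm later that a staged row came from exactly the input that was fetched.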

Current Run Outputs

  • ref_value_sets_current: 669 rows
  • ref_preventive_codes_current: 16,761 rows
  • ref_vaccines_cpt_current: 168 rows
  • ref_rules_current: 108 rows
  • ref_master_preventive_services_current: 231 rows

Source 1: VSAC (NLM)

  • What it is: A large catalog of value sets used in CMS quality measures.
  • Example data: OID, value set name, measure ID, QDM category, and concepts.
  • Why used: It gives structured groups for screenings and preventive services.
  • How used: Pull BPS + SVS XML, then keep only CPT/HCPCS we can adjudicate.
  • Output: ref_value_sets_current plus VSAC rows in ref_preventive_codes_current.

VSAC Example Record

{
  "oid": "2.16.840.1.113883.3.464.1003.108.12.1020",
  "name": "Colonoscopy",
  "measure_id": "CMS130",
  "metadata_json": {
    "bps_qdm_category": "Procedure"
  }
}

ref_preventive_codes_current sample row:
("44388", "CPT", "SCREENING_COLORECTAL",
 {"oid":"2.16.840.1.113883.3.464.1003.108.12.1020"})

VSAC Safety Firewall

  • We do not treat every VSAC value set as preventive.
  • Auto-categorization uses:
    • Allowed QDM categories
    • Name allow/block patterns
    • Keyword mapping to internal categories (IMM_ADMIN, SCREENING_*)
  • Blocked examples include major treatment procedures like colectomy and mastectomy.
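
The three checks above can be sketched as a single function. This is an illustration under assumptions: the allowed QDM categories, the regular expressions, and the `SCREENING_BREAST` category are hypothetical placeholders, and only `IMM_ADMIN`, `SCREENING_COLORECTAL`, and the colectomy/mastectomy block examples come from this document.

```python
import re

# Assumption: illustrative allowlist of QDM categories, not the production set.
ALLOWED_QDM = {"Procedure", "Intervention", "Immunization"}

# Names matching any block pattern are never treated as preventive.
BLOCK_PATTERNS = [re.compile(p, re.I) for p in (r"colectomy", r"mastectomy")]

# Assumption: illustrative keyword -> internal category mapping.
KEYWORD_CATEGORIES = [
    (re.compile(r"colonoscopy", re.I), "SCREENING_COLORECTAL"),
    (re.compile(r"mammogra", re.I), "SCREENING_BREAST"),  # hypothetical category
    (re.compile(r"immuniz|vaccin", re.I), "IMM_ADMIN"),
]


def categorize(name: str, qdm_category: str):
    """Return an internal category, or None when the value set is rejected."""
    if qdm_category not in ALLOWED_QDM:
        return None
    if any(p.search(name) for p in BLOCK_PATTERNS):
        return None
    for pattern, category in KEYWORD_CATEGORIES:
        if pattern.search(name):
            return category
    # No keyword matched: stay conservative and do not categorize.
    return None
```

Returning `None` by default keeps the filter biased toward exclusion: a value set reaches adjudication only when it passes every check and matches a known keyword.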

Source 2: CDC CVX -> CPT Crosswalk

  • What it is: CDC mapping from vaccine IDs (CVX) to billable CPT codes.
  • Example data: (cvx=207, cpt=91301, non_travel_flag=true).
  • Why used: It is the cleanest source for vaccine code mapping.
  • How used: Parse CDC file, apply non-travel filter, store results.
  • Output: ref_vaccines_cpt_current and immunization-product rows in the master list.
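
A minimal sketch of the non-travel filter, assuming parsed rows have the three fields shown in the example data. The `VaccineMapping` type and `keep_non_travel` function are hypothetical names, and the yellow fever row in the usage note is only an illustrative travel vaccine.

```python
from typing import Iterable, List, NamedTuple


class VaccineMapping(NamedTuple):
    cvx: int
    cpt: str
    non_travel_flag: bool


def keep_non_travel(rows: Iterable[VaccineMapping]) -> List[VaccineMapping]:
    """Drop travel-only vaccines and duplicate (cvx, cpt) pairs, keeping input order."""
    seen = set()
    kept = []
    for row in rows:
        if not row.non_travel_flag:
            continue  # travel-only vaccine: not published
        key = (row.cvx, row.cpt)
        if key in seen:
            continue  # the same mapping can appear more than once in a crosswalk
        seen.add(key)
        kept.append(row)
    return kept
```

For example, passing `VaccineMapping(207, "91301", True)` and `VaccineMapping(37, "90717", False)` (yellow fever, flagged here as travel-only) keeps only the first row.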

Source 3: CMS Physician Fee Schedule

  • What it is: Annual CMS list of billable CPT/HCPCS lines.
  • Example data: G-codes and other HCPCS/CPT entries from PFS CSV.
  • Why used: Helps with Medicare preventive coverage and code completeness.
  • How used: Load full PFS into reference data, then publish only approved preventive allowlist rows.
  • Output: ref_preventive_codes_current rows with source cms_pfs; the master list contains the MEDICARE_PREVENTIVE rows.
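
The "load everything, publish only the allowlist" step might be sketched like this. The allowlist contents and the `hcpcs` field name are assumptions for illustration; G0438 and G0439 (Medicare annual wellness visits) stand in for whatever the approved list actually contains.

```python
# Assumption: a tiny illustrative subset, not the approved production allowlist.
MEDICARE_PREVENTIVE_ALLOWLIST = frozenset({"G0438", "G0439"})


def publishable_pfs_rows(pfs_rows, allowlist=MEDICARE_PREVENTIVE_ALLOWLIST):
    """Return only allowlisted PFS rows, tagged with the MEDICARE_PREVENTIVE category."""
    return [
        {**row, "category": "MEDICARE_PREVENTIVE", "source": "cms_pfs"}
        for row in pfs_rows
        if row["hcpcs"] in allowlist  # "hcpcs" is a hypothetical column name
    ]
```

Every PFS line still lands in the reference table; the allowlist only controls what reaches the master list.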

Source 4: USPSTF Recommendations

  • What it is: Public recommendation pages with grades (A/B/C/D/I).
  • Example data: recommendation slug, title, grade set, source URL.
  • Why used: Gives policy context for why a screening exists.
  • How used: Scrape and normalize into a rules table.
  • Output: ref_rules_current (context/rules, not a direct CPT feed).

How Sources Merge to Final Output

  • Inputs: VSAC preventive CPT/HCPCS + CDC vaccine CPT + CMS Medicare allowlist.
  • We merge those into ref_master_preventive_services_current.
  • Each row stores metadata so decisions are traceable:
    • categories
    • sources
    • programs
    • condition_flags
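
One way to picture the merge is a dictionary keyed by code and code system, with each metadata field collected as a set. This is a sketch, not the actual merge logic: the input row shape (`category`, `source`, `program`, `condition_flags`) and the rule that the first category seen becomes `category_primary` are assumptions.

```python
def merge_sources(rows):
    """Merge per-source rows into one master row per (proc_code, proc_code_system)."""
    master = {}
    for row in rows:
        key = (row["proc_code"], row["proc_code_system"])
        entry = master.setdefault(key, {
            "proc_code": row["proc_code"],
            "proc_code_system": row["proc_code_system"],
            # Assumption: the first category seen becomes the primary one.
            "category_primary": row["category"],
            "metadata_json": {
                "categories": set(),
                "sources": set(),
                "programs": set(),
                "condition_flags": set(),
            },
        })
        meta = entry["metadata_json"]
        meta["categories"].add(row["category"])
        meta["sources"].add(row["source"])
        meta["programs"].add(row["program"])
        meta["condition_flags"].update(row["condition_flags"])
    # Convert sets to sorted lists so the output is JSON-serializable and stable.
    for entry in master.values():
        meta = entry["metadata_json"]
        for field in meta:
            meta[field] = sorted(meta[field])
    return list(master.values())
```

Because every contributing source is recorded on the row, a reviewer can tell from the master list alone which feed introduced a code.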

Master List Example Row

{
  "proc_code": "3044F",
  "proc_code_system": "CPT",
  "category_primary": "SCREENING_OTHER",
  "metadata_json": {
    "categories": ["SCREENING_OTHER"],
    "programs": ["ACA"],
    "sources": ["vsac_value_sets"],
    "condition_flags": [
      "ACA_COMMERCIAL_ONLY",
      "DX_Z_CODE_OR_MOD33",
      "IN_NETWORK_REQUIRED",
      "MOD33_CAN_OVERRIDE"
    ]
  }
}

Conditions / Stipulations

  • ACA_COMMERCIAL_ONLY: the row applies only to ACA commercial plans.
  • IN_NETWORK_REQUIRED: $0 preventive cost sharing generally applies only to in-network care.
  • DX_Z_CODE_OR_MOD33: the code counts as preventive only with a preventive (Z-code) diagnosis or modifier 33 on the claim.
  • MOD33_CAN_OVERRIDE: modifier 33 alone can mark the service as preventive.
  • MEDICARE_POLICY: row follows Medicare-specific preventive policy handling.
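
To show how these flags might be consumed, here is a sketch of a checker that lists which conditions a claim fails. The claim fields (`plan_type`, `in_network`, `diagnosis_codes`, `modifiers`) and the `"ACA_COMMERCIAL"` plan value are hypothetical, and `MOD33_CAN_OVERRIDE` and `MEDICARE_POLICY` are deliberately left unevaluated because their exact semantics are not specified here.

```python
def unmet_conditions(condition_flags, claim):
    """Return the condition flags that this claim does not satisfy."""
    has_mod33 = "33" in claim.get("modifiers", [])
    has_z_dx = any(
        dx.upper().startswith("Z") for dx in claim.get("diagnosis_codes", [])
    )
    unmet = []
    for flag in condition_flags:
        if flag == "ACA_COMMERCIAL_ONLY":
            ok = claim.get("plan_type") == "ACA_COMMERCIAL"  # hypothetical value
        elif flag == "IN_NETWORK_REQUIRED":
            ok = bool(claim.get("in_network", False))
        elif flag == "DX_Z_CODE_OR_MOD33":
            ok = has_z_dx or has_mod33
        else:
            ok = True  # flags not modeled in this sketch are not evaluated
        if not ok:
            unmet.append(flag)
    return unmet
```

An empty result means the modeled conditions are met; it does not by itself decide the claim, since payer context still applies.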

Operational Commands

cd PreventiveLink
$env:NMLS_API_KEY_FILE = (Resolve-Path .\.secrets\nmls_api_key.txt).Path
.\tools\dev.ps1 ingest-annual -DbUrl "sqlite+pysqlite:///./.tmp/preventivelink.db"
python -m preventivelink_ingest report --db-url "sqlite+pysqlite:///./.tmp/preventivelink.db"
.\tools\dump-master-list.ps1 -DbPath .\.tmp\preventivelink.db -Format json -OutPath .tmp\master_list.json
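
After a run, row counts can be spot-checked against the figures under Current Run Outputs. The sketch below assumes the SQLite database exposes the reference tables under the names listed there; `table_row_counts` is a hypothetical helper, not part of the `preventivelink_ingest` CLI.

```python
import sqlite3

# Table names as listed under Current Run Outputs.
EXPECTED_TABLES = (
    "ref_value_sets_current",
    "ref_preventive_codes_current",
    "ref_vaccines_cpt_current",
    "ref_rules_current",
    "ref_master_preventive_services_current",
)


def table_row_counts(conn: sqlite3.Connection) -> dict:
    """Return {table_name: row_count} for each expected reference table."""
    counts = {}
    for table in EXPECTED_TABLES:
        # Table names come from the fixed tuple above, never from user input.
        row = conn.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()
        counts[table] = row[0]
    return counts
```

Usage would be along the lines of `table_row_counts(sqlite3.connect(".tmp/preventivelink.db"))`, comparing the result with the published counts for the same run.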

Limitations (Explicit)

  • No single source gives a full preventive CPT list, so we combine sources.
  • Some VSAC sets are mostly LOINC/SNOMED/RxNorm and do not add CPT/HCPCS rows.
  • We intentionally bias toward safety to avoid false positives.
  • Final claim decisions still need diagnosis, modifiers, network, and payer context.

Roadmap

  • Expand CDC non-travel vaccine coverage.
  • Add more vetted VSAC screening categories.
  • Keep analytics categories separate from adjudication categories.
  • Add payer-program logic for Medicare Advantage and Medicaid managed care.

Final Output

Deliverable for claims: ref_master_preventive_services_current

Deliverable for auditability: source snapshots + reference tables + row-level condition flags.