Grouping.FinalBuildGames.com

Feature Implementation and Decision Review

Prepared from repository report dated February 11, 2026

Scope, Method, and Output

Scope

  • Catalog runtime and operational platform features.
  • Map each capability to implementation locations in code.
  • Review architecture choices with targeted external references.

Decision Lens

  • Keep: current decision remains strong.
  • Improve: approach works but has a better variant.
  • Replace: redesign is justified.
Output emphasizes delivery risk, maintainability, and retrieval quality impact rather than net-new feature ideation.

Feature Inventory

Runtime surfaces, services, and pipelines

Runtime Capabilities (UI + API)

User Interfaces

  • Analyst Harness query workflow.
  • Principal Code Browser and bundle inspector.

Core Retrieval APIs

  • /stack, /bundle, /principals.
  • /cpt/search and /cpt/{code}.
  • Health checks and pool lifecycle hooks.

Problem Workflows

  • /problem/codes ranking + intent guardrails.
  • /problem/packages assembly path.
  • Deterministic package and authoring tool endpoints.
  • /vector/search evidence retrieval.
Source: `grouping-feature-implementation-and-decision-review-2026-02-11.md` section 1.1

Supporting Services and Lifecycle

PubMedBERT Embedder

  • Domain embeddings (768-d) for retrieval and ingestion.
  • /embed endpoint with batched generation path.

Principal Description Worker

  • LLM-first description generation with retry wrapper.
  • Persists human-readable principal metadata.
API startup/shutdown event handlers currently perform pool initialization and teardown; they are a candidate for refactoring to FastAPI's lifespan handler.
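
The lifespan pattern mentioned above can be sketched with the standard library alone. This is a minimal illustration, not the repository's code: `FakePool` and `AppState` are hypothetical stand-ins for the real pool and for `app.state`. In FastAPI the same async context manager would take the application object as its argument and be passed as `FastAPI(lifespan=lifespan)`.

```python
from __future__ import annotations

import asyncio
from contextlib import asynccontextmanager
from typing import Optional


class FakePool:
    """Hypothetical stand-in for the real pool object."""

    def __init__(self) -> None:
        self.is_open = False

    async def open(self) -> None:
        self.is_open = True

    async def close(self) -> None:
        self.is_open = False


class AppState:
    """Hypothetical stand-in for the `app.state` namespace."""

    def __init__(self) -> None:
        self.pool: Optional[FakePool] = None


@asynccontextmanager
async def lifespan(state: AppState):
    # Startup half: everything before `yield` runs once, before serving.
    pool = FakePool()
    await pool.open()
    state.pool = pool
    try:
        yield
    finally:
        # Shutdown half: runs once on exit, even if serving raised.
        await pool.close()
        state.pool = None


async def demo() -> list[bool]:
    """Record whether the pool is open inside the context and cleared after."""
    state = AppState()
    observed = []
    async with lifespan(state):
        observed.append(state.pool is not None and state.pool.is_open)
    observed.append(state.pool is None)
    return observed
```

Keeping setup and teardown in one function makes the pairing explicit, which separate startup and shutdown handlers do not enforce.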

Ingestion, Indexing, and Quality Gates

Source + Reference Loads

  • Checksum-verified acquisition.
  • CMS RVU, AHRQ CCS, and NCCI loaders.
  • UMLS seeding into affliction catalog.
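
Checksum-verified acquisition can be sketched as follows. This assumes SHA-256 digests published alongside each source file; the report does not state which hash the loaders use, and the function names are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in fixed-size chunks so large downloads are not read into memory at once."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path: Path, expected_sha256: str) -> None:
    """Raise ValueError if the file's digest differs from the expected one."""
    actual = sha256_of(path)
    if actual != expected_sha256.strip().lower():
        raise ValueError(
            f"checksum mismatch for {path.name}: "
            f"expected {expected_sha256}, got {actual}"
        )
```

A loader would call `verify_download` before parsing, so a truncated or altered file fails early instead of seeding bad reference data.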

Mapping + Semantic Layer

  • Deterministic affliction-to-CPT builder.
  • Qdrant indexing for afflictions and templates.
  • Coverage and query-quality reports.

Evidence Pipelines

  • PMC condition harvesting and embedding.
  • ECRI Stagehand browser pipeline.
  • Reindex/repair/report tools for Qdrant evidence state.

Evaluation + Probes

  • Problem ranking and user-batch evaluators.
  • API smoke, bench, and vector probe scripts.

Decision Review Snapshot

  • Keep: 4
  • Improve: 14
  • Replace: 1
  • High Priority: 6

Most Important Changes

  • Replace: schema migration lifecycle with Alembic.
  • Improve: API modularity, typed settings, auth, rate limits.
  • Improve: ranking calibration and embedding-model bakeoff.

Strong Existing Bets

  • Keep: deterministic compose endpoints for anti-hallucination safety.
  • Keep: relational + vector split (Postgres + Qdrant).
  • Keep: Stagehand browser path for JS-gated evidence sources.

High-Priority Changes (Execution Focus)

Architecture Hardening

  • Split monolithic FastAPI module into routers/services/repositories.
  • Centralize typed config with pydantic-settings.
  • Adopt Alembic for reviewable DDL lifecycle.
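
The idea behind centralized typed config can be sketched with a standard-library dataclass. This is a stand-in, not pydantic-settings itself, which would replace the hand-written parsing and validation below. The field names and the `GROUPING_` environment prefix are assumptions for illustration.

```python
from __future__ import annotations

import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    """Single typed source of configuration (hypothetical fields)."""

    database_url: str
    qdrant_url: str
    embed_batch_size: int = 32


REQUIRED = ("GROUPING_DATABASE_URL", "GROUPING_QDRANT_URL")


def load_settings(env: Mapping[str, str] | None = None) -> Settings:
    """Build Settings from the environment, failing fast on missing keys."""
    source = os.environ if env is None else env
    missing = [key for key in REQUIRED if key not in source]
    if missing:
        raise RuntimeError(f"missing required settings: {', '.join(missing)}")
    return Settings(
        database_url=source["GROUPING_DATABASE_URL"],
        qdrant_url=source["GROUPING_QDRANT_URL"],
        # int() raises ValueError on a non-numeric value, so bad config fails at startup.
        embed_batch_size=int(source.get("GROUPING_EMBED_BATCH_SIZE", "32")),
    )
```

Loading once at startup and passing the frozen object around avoids scattered `os.environ` reads across modules.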

Security + Reliability

  • Introduce scoped API authN/Z for sensitive surfaces.
  • Add abuse controls at the gateway and application layers (e.g., NGINX limit_req).
  • Standardize error contracts and observability.
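
One way to standardize the error contract is the RFC 9457 problem-details format. The helper below is a framework-independent sketch: the function name and the extension member used in the usage note are hypothetical, while the `type`, `title`, `status`, `detail`, and `instance` members and the `application/problem+json` media type come from the RFC.

```python
from __future__ import annotations

import json
from typing import Any, Optional

PROBLEM_JSON = "application/problem+json"


def problem_response(
    status: int,
    title: str,
    detail: Optional[str] = None,
    type_uri: str = "about:blank",
    instance: Optional[str] = None,
    **extensions: Any,
) -> tuple[int, dict[str, str], str]:
    """Return (status, headers, body) for an RFC 9457 problem-details error."""
    body: dict[str, Any] = {"type": type_uri, "title": title, "status": status}
    if detail is not None:
        body["detail"] = detail
    if instance is not None:
        body["instance"] = instance
    # Extension members are allowed, but must not overwrite the standard ones.
    for key, value in extensions.items():
        body.setdefault(key, value)
    return status, {"Content-Type": PROBLEM_JSON}, json.dumps(body)
```

For example, a throttled request could return `problem_response(429, "Too Many Requests", detail="Rate limit exceeded.", retry_after_seconds=30)`, giving clients one shape to parse for every failure.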

Retrieval Quality

  • Calibrate lexical/hybrid/semantic blend weights via offline eval.
  • Run canary rollouts for ranking behavior changes.
  • Benchmark PubMedBERT baseline vs MedCPT path.
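
Calibrating blend weights via offline evaluation can be sketched as a grid search that maximizes mean reciprocal rank (MRR). The metric, the grid resolution, and the toy evaluation set are all assumptions for illustration; the report does not specify how the evaluators score rankings.

```python
from __future__ import annotations

from itertools import product
from typing import Iterator

Weights = tuple[float, float, float]  # (lexical, hybrid, semantic)
Scores = dict[str, tuple[float, float, float]]  # candidate -> component scores


def blended_rank(scores: Scores, weights: Weights) -> list[str]:
    """Order candidates by the weighted sum of their component scores."""
    def blend(item: tuple[str, tuple[float, float, float]]) -> float:
        return sum(w * s for w, s in zip(weights, item[1]))
    return [cand for cand, _ in sorted(scores.items(), key=blend, reverse=True)]


def mean_reciprocal_rank(eval_set: list[tuple[str, Scores]], weights: Weights) -> float:
    """Average of 1/rank of the labeled relevant candidate across queries."""
    total = 0.0
    for relevant, scores in eval_set:
        total += 1.0 / (blended_rank(scores, weights).index(relevant) + 1)
    return total / len(eval_set)


def weight_grid(steps: int = 4) -> Iterator[Weights]:
    """Yield all non-negative weight triples on a 1/steps grid that sum to 1."""
    for a, b in product(range(steps + 1), repeat=2):
        if a + b <= steps:
            yield (a / steps, b / steps, (steps - a - b) / steps)


def calibrate(eval_set: list[tuple[str, Scores]], steps: int = 4) -> Weights:
    """Pick the grid point with the highest MRR on the evaluation set."""
    return max(weight_grid(steps), key=lambda w: mean_reciprocal_rank(eval_set, w))


# Hypothetical labeled queries: (relevant candidate, per-candidate scores).
TOY_EVAL = [
    ("A", {"A": (0.2, 0.5, 0.9), "B": (0.8, 0.5, 0.1)}),
    ("C", {"C": (0.9, 0.6, 0.3), "D": (0.1, 0.4, 0.7)}),
]
best_weights = calibrate(TOY_EVAL)
```

Selecting weights on one labeled set and reporting the metric on a separate held-out set keeps the calibration from overfitting to the evaluator's own queries.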

Medium/Low Priorities and Risk Notes

Medium Priority

  • Migrate startup/shutdown handlers to FastAPI lifespan model.
  • Introduce OpenTelemetry traces and key service metrics.
  • Tune Qdrant payload indexes and quantization policy.
  • Add ingestion orchestration metadata, retries, and provenance.

Low Priority

  • Keep frontend lightweight; add typed API contract checks.

Risk If Delayed

  • Growing blast radius for each API change.
  • Harder post-incident forensics without richer telemetry.

Rollout and Evidence

Low-risk sequencing with source-backed rationale

Suggested Rollout Sequence

  1. Observe first: add traces and key metrics without changing behavior.
  2. Control config + schema: introduce typed settings and Alembic scaffolding.
  3. Protect surfaces: enable auth and rate limits in low-risk rollout mode.
  4. Refactor safely: modularize API internals behind stable routes.
  5. Tune relevance: calibrate ranking weights with evaluator outputs.
  6. Decide model path: run PubMedBERT vs MedCPT bakeoff on held-out sets.
This sequence intentionally avoids coupling platform hardening with retrieval model changes.

Evidence Base and Bottom Line

Source Buckets

  • FastAPI design/lifecycle/security references (S1, S2, S6).
  • Operational hardening standards: OWASP API Top 10, RFC 9457, NGINX limits (S5, S7, S8).
  • Retrieval stack references: Qdrant docs, pgvector, PubMedBERT, MedCPT, NCBI guidance (S11-S17).

Bottom Line

The platform already has a practical core: deterministic package composition, strong operational tooling, and a clean relational/vector split.

Primary gains now come from architecture hardening and measurable retrieval tuning, not broad feature expansion.

Report basis: `reports/grouping-feature-implementation-and-decision-review-2026-02-11.md`