Projects
Everything here was built to solve a real problem in a regulated environment. The repos are public. Judge the work for yourself.
Featured Case Study
A traceable, citation-grounded retrieval architecture tested against USCIS policy documents. The point is not immigration. The point is what trustworthy AI infrastructure looks like under pressure.
A RAG pipeline that ingests PDF documents, chunks them along their legal hierarchy, stores the embeddings in pgvector, and retrieves the relevant context to generate grounded answers.
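To make "chunking along the legal hierarchy" concrete, here is a minimal sketch of the idea, not the code from the repo. It assumes headings of the form "Volume 1", "Part A", "Chapter 2", "Section 3"; that pattern, the `Chunk` type, and `chunk_by_hierarchy` are all hypothetical names chosen for illustration. Each chunk keeps the heading trail it sits under, which is what later makes a citation possible.

```python
import re
from dataclasses import dataclass

# Hypothetical heading pattern; a real corpus will need its own rules.
HEADING = re.compile(r"^(Volume|Part|Chapter|Section)\s+(\d+|[A-Z])\b")
LEVELS = {"Volume": 0, "Part": 1, "Chapter": 2, "Section": 3}


@dataclass
class Chunk:
    path: tuple[str, ...]  # heading trail, e.g. ("Volume 1", "Part A")
    text: str


def chunk_by_hierarchy(document: str) -> list[Chunk]:
    """Split text at headings and tag every chunk with its heading trail."""
    chunks: list[Chunk] = []
    trail: list[tuple[int, str]] = []  # (level, label) of the open headings
    body: list[str] = []

    def flush() -> None:
        # Emit the text collected so far under the current heading trail.
        text = " ".join(part for part in body if part)
        if text:
            chunks.append(Chunk(tuple(label for _, label in trail), text))
        body.clear()

    for raw_line in document.splitlines():
        line = raw_line.strip()
        match = HEADING.match(line)
        if match:
            flush()
            level = LEVELS[match.group(1)]
            # A new heading closes every open heading at the same or a deeper level.
            while trail and trail[-1][0] >= level:
                trail.pop()
            trail.append((level, f"{match.group(1)} {match.group(2)}"))
        else:
            body.append(line)
    flush()
    return chunks
```

In a full pipeline each chunk's text would be embedded and stored alongside its `path`, so an answer can point back to the exact volume, part, and chapter it came from.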
Source citation on every answer, deterministic model settings, retrieval choices designed to reduce redundancy, and architecture decisions made for repeatability.
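One common way to reduce redundancy at retrieval time is maximal marginal relevance (MMR). The page does not say which technique the project uses, so treat the sketch below as an assumption: it shows the general idea in plain Python, with hypothetical names (`mmr_select`, `lambda_weight`) and toy two-dimensional vectors standing in for real embeddings.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def mmr_select(
    query: list[float],
    candidates: dict[str, list[float]],
    k: int = 3,
    lambda_weight: float = 0.5,
) -> list[str]:
    """Pick up to k candidate ids, trading relevance against redundancy.

    lambda_weight = 1.0 ranks by relevance only; lower values penalize
    candidates that resemble something already selected.
    """
    remaining = dict(candidates)
    selected: list[str] = []

    def score(candidate_id: str) -> float:
        vector = remaining[candidate_id]
        relevance = cosine(query, vector)
        # Similarity to the closest chunk that was already picked.
        redundancy = max(
            (cosine(vector, candidates[chosen]) for chosen in selected),
            default=0.0,
        )
        return lambda_weight * relevance - (1.0 - lambda_weight) * redundancy

    while remaining and len(selected) < k:
        best = max(remaining, key=score)
        selected.append(best)
        del remaining[best]
    return selected
```

With two near-identical chunks and one distinct chunk as candidates, a relevance-only ranking returns both near-duplicates, while a balanced `lambda_weight` returns one of them plus the distinct chunk.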
If the system can hold up against dense policy manuals, it can be adapted for compliance documents, internal knowledge bases, policy libraries, and regulatory corpora.
Compliance manuals, policy libraries, SOPs, legal corpora, internal guidance, regulatory filings. If your team needs fast, accurate, cited answers from a document set, this is the architecture.
Discuss Your Use Case
More Work
A batch-scale ETL pipeline for ingesting, classifying, and governing AI-processed data. Airflow orchestration, warehouse-centric design, immutable audit trails, and cost visibility built into the workflow.
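The page does not describe how the audit trail is made immutable. One generic approach is an append-only log in which each entry stores the hash of the previous one, so any later edit breaks the chain. The sketch below illustrates that idea only; `append_entry`, `verify_chain`, and the event fields are hypothetical, and a hash chain makes tampering detectable rather than impossible.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _entry_hash(prev_hash: str, event: dict) -> str:
    """Hash the previous hash together with a canonical JSON form of the event."""
    payload = json.dumps(event, sort_keys=True)
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append_entry(log: list[dict], event: dict) -> None:
    """Append an event, chaining it to the hash of the entry before it."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    log.append(
        {"event": event, "prev_hash": prev_hash, "hash": _entry_hash(prev_hash, event)}
    )


def verify_chain(log: list[dict]) -> bool:
    """Recompute every hash; any edited or reordered entry makes this False."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev_hash"] != prev_hash:
            return False
        if entry["hash"] != _entry_hash(prev_hash, entry["event"]):
            return False
        prev_hash = entry["hash"]
    return True
```

In a warehouse-centric design the same effect is often achieved with insert-only tables and restricted permissions; the hash chain adds a check that can be run independently of those controls.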
Workflow-driven data quality monitoring with checks for schema drift, null thresholds, freshness windows, and row-count anomalies. Quality gates without the overhead of a full observability platform.
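The four checks named above can each be expressed as a small function that returns a pass or fail result, with a gate that blocks the workflow if any check fails. This is a minimal sketch under assumptions, not the project's code: rows are modeled as a list of dicts, and the function names, `CheckResult` type, and thresholds are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class CheckResult:
    name: str
    passed: bool
    detail: str


def check_schema_drift(rows: list[dict], expected_columns: set[str]) -> CheckResult:
    """Fail if any expected column is missing or an unexpected one appears."""
    seen = set().union(*(row.keys() for row in rows))
    missing = expected_columns - seen
    unexpected = seen - expected_columns
    detail = f"missing={sorted(missing)} unexpected={sorted(unexpected)}"
    return CheckResult("schema_drift", not missing and not unexpected, detail)


def check_null_threshold(rows: list[dict], column: str, max_null_ratio: float) -> CheckResult:
    """Fail if the share of null values in a column exceeds the threshold."""
    if not rows:
        return CheckResult("null_threshold", False, "no rows to inspect")
    nulls = sum(1 for row in rows if row.get(column) is None)
    ratio = nulls / len(rows)
    return CheckResult("null_threshold", ratio <= max_null_ratio, f"{column}: {ratio:.0%} null")


def check_freshness(latest_loaded_at: datetime, max_age: timedelta, now: datetime) -> CheckResult:
    """Fail if the newest record is older than the allowed window."""
    age = now - latest_loaded_at
    return CheckResult("freshness", age <= max_age, f"age={age}")


def check_row_count(current: int, previous: int, tolerance: float) -> CheckResult:
    """Fail if the row count moved more than `tolerance` relative to the last run."""
    change = abs(current - previous) / max(previous, 1)
    return CheckResult("row_count", change <= tolerance, f"change={change:.0%}")


def quality_gate(results: list[CheckResult]) -> bool:
    """The gate opens only when every check passed."""
    return all(result.passed for result in results)
```

A workflow task would run the checks after each load and stop downstream tasks when `quality_gate` returns `False`, logging each `detail` string for triage.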
If your team has a corpus that should be searchable, citable, and AI-ready, this is where the conversation starts.
Get in Touch