Hassan Raza
2025 · Lead engineer · ~1M chunks indexed, <500ms time-to-first-token

Retrieval-Augmented Knowledge Assistant

A production RAG system over product documentation and support data.

Built an assistant that answers product and support questions grounded in an internal knowledge base. Chunking is semantic and follows document structure; retrieval combines pgvector ANN search with BM25 lexical matching, followed by a reranker; the LLM layer streams answers with citations and a safety rail.
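The write-up does not say how the dense and lexical result lists are merged before reranking. One common option is reciprocal rank fusion (RRF), sketched below as an illustration only. The `Hit` shape, the `reciprocalRankFusion` name, and the constant `k = 60` are assumptions, not details of this project.

```typescript
// Hypothetical shape of a retrieval hit; the real schema is not given in the source.
interface Hit {
  chunkId: string;
}

// Reciprocal rank fusion: each list contributes 1 / (k + rank) per chunk,
// with rank starting at 1. Chunks found by both retrievers accumulate score.
function reciprocalRankFusion(rankings: Hit[][], k = 60): string[] {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((hit, index) => {
      const previous = scores.get(hit.chunkId) ?? 0;
      scores.set(hit.chunkId, previous + 1 / (k + index + 1));
    });
  }
  // Highest fused score first.
  return Array.from(scores.entries())
    .sort((a, b) => b[1] - a[1])
    .map(([chunkId]) => chunkId);
}

// Illustrative inputs: one list from the vector index, one from BM25.
const dense: Hit[] = [{ chunkId: "a" }, { chunkId: "b" }, { chunkId: "c" }];
const lexical: Hit[] = [{ chunkId: "b" }, { chunkId: "d" }, { chunkId: "a" }];
const fused = reciprocalRankFusion([dense, lexical]);
```

RRF uses only rank positions, so it avoids having to normalize cosine distances against BM25 scores. The fused list would then be handed to the reranker.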

Highlights

  • Hybrid retrieval (dense + lexical) with a learned reranker
  • Citations on every answer, with a provenance UI
  • Evals in CI against a golden set; regressions block deploys
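A golden-set gate of the kind described in the last highlight might look like the following minimal sketch. The `GoldenCase` shape, the citation-coverage pass criterion, and the `passRate` and `gate` names are all hypothetical; the project's actual metrics and thresholds are not stated here.

```typescript
// Hypothetical golden-set case: a question plus the chunk IDs a correct answer must cite.
interface GoldenCase {
  question: string;
  expectedCitations: string[];
}

// Hypothetical stand-in for the assistant; a real one would be async and call the LLM.
type AnswerFn = (question: string) => { citations: string[] };

// Fraction of cases whose answer cites every expected chunk.
function passRate(cases: GoldenCase[], answer: AnswerFn): number {
  if (cases.length === 0) return 0;
  const passed = cases.filter((c) => {
    const cited = new Set(answer(c.question).citations);
    return c.expectedCitations.every((id) => cited.has(id));
  }).length;
  return passed / cases.length;
}

// True when the current score has not dropped below baseline minus tolerance.
function gate(current: number, baseline: number, tolerance = 0): boolean {
  return current >= baseline - tolerance;
}

// Illustrative run with a stub that only knows one answer.
const golden: GoldenCase[] = [
  { question: "How do I reset my password?", expectedCitations: ["doc-1"] },
  { question: "What is the refund window?", expectedCitations: ["doc-2"] },
];
const stub: AnswerFn = (q) =>
  q.includes("password") ? { citations: ["doc-1"] } : { citations: [] };
const rate = passRate(golden, stub);
```

In a CI job, a `false` result from `gate` would typically be turned into a non-zero exit code so the deploy step does not run.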

Stack

Next.js, Vercel AI SDK, Gemini, pgvector, PostgreSQL, TypeScript