Senior engineer · backend systems · AI infrastructure

I design systems that scale.

Backend systems, cloud architecture, and AI-powered platforms. Ask me anything about my work.


50k events/sec peak throughput

99.9% production SLOs, monitored with burn-rate alerts

28% drop in support volume via a production RAG system

years shipping distributed backends at scale

Value proposition

What I Do

I help teams build production-grade systems that are fast, reliable, and easy to evolve.

Design and build scalable backend systems

Architect cloud-native applications

Help startups ship MVPs fast

Optimize performance and infrastructure cost

Build AI-powered applications using LLMs

Cloud infrastructure

A modern backend, visualized.

An interactive topology of the systems I design: edge, API, events, data, workers, and storage — with the trade-offs that hold them together.


Click any node to see its role and the trade-offs that come with it. The dashed paths show event and request flow.

Explore the full visualization

LLM systems

Where the intelligence actually comes from.

Most people see the LLM as one box. In production it's a pipeline. Here's what actually happens between a question and an answer.


The stages are tokenize, retrieve, compose, generate, and stream, and each has its own trade-offs.
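As a rough illustration of the five stages named above, here is a minimal sketch in Python. Everything in it is a stand-in chosen for the example: the `DOCS` list, the token-overlap retriever, and the `generate` stub (which returns a canned string where a real system would call a model) are hypothetical and are not taken from the production system described on this page.

```python
from typing import Iterator

# Hypothetical in-memory corpus; a real system would query an index instead.
DOCS = [
    "Kafka partitions preserve ordering per key.",
    "Burn-rate alerts compare error rate to the SLO budget.",
]

def tokenize(text: str) -> list[str]:
    """Toy tokenizer: lowercase, whitespace split, trim basic punctuation."""
    return [t.strip(".,?!").lower() for t in text.split()]

def retrieve(query_tokens: list[str], docs: list[str], k: int = 1) -> list[str]:
    """Rank documents by how many tokens they share with the query."""
    overlap = lambda d: len(set(query_tokens) & set(tokenize(d)))
    return sorted(docs, key=overlap, reverse=True)[:k]

def compose(question: str, context: list[str]) -> str:
    """Build the prompt: retrieved context first, then the question."""
    joined = "\n".join(f"- {c}" for c in context)
    return f"Context:\n{joined}\n\nQuestion: {question}"

def generate(prompt: str) -> str:
    """Stub for the model call: echoes the first context line of the prompt."""
    first_context_line = prompt.splitlines()[1]
    return f"Based on the docs: {first_context_line[2:]}"

def stream(answer: str) -> Iterator[str]:
    """Yield the answer word by word, as a streaming endpoint would."""
    for word in answer.split():
        yield word + " "

def answer(question: str) -> Iterator[str]:
    tokens = tokenize(question)
    context = retrieve(tokens, DOCS)
    prompt = compose(question, context)
    yield from stream(generate(prompt))
```

Keeping each stage as its own function makes it possible to measure and swap them independently, which is where the per-stage trade-offs show up.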

Deep dive into the full pipeline

Case studies

Problems, constraints, and the trade-offs I picked.

Each write-up follows the same shape so you can compare: problem, constraints, architecture, trade-offs, outcome.

All case studies

Consulting

Work With Me

I help companies turn ideas into scalable, production-ready systems.

MVP development (fast and scalable)
System design and architecture
Backend optimization and scaling
AI integration and LLM applications

Book a Strategy Call

Selected projects

Things I've built and shipped.

2025

Event-Driven Commerce Platform

A Kafka + NestJS backbone for a high-throughput commerce product.

NestJS · Kafka · PostgreSQL · Redis · Elasticsearch

50k events/sec peak, sub-second p99 end-to-end

2025

Retrieval-Augmented Knowledge Assistant

A production RAG system over product documentation and support data.

Next.js · Vercel AI SDK · Gemini · pgvector · PostgreSQL

~1M chunks indexed, <500ms time-to-first-token

System design

Deep dives for the curious.

Essays about distributed systems and AI infrastructure, focused on decisions and the trade-offs behind them.

Event-Driven Backbones with Kafka

When to reach for Kafka, what the outbox pattern really buys you, partitioning for ordering, and the things that break in year two.

Kafka · Event-Driven · Distributed Systems

11 min
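To make "partitioning for ordering" concrete, here is a minimal sketch of key-based routing. It assumes a hash-mod partitioner: `zlib.crc32` is a stand-in picked for this example (Kafka's default Java partitioner hashes keys with murmur2), and `partition_for` and `route` are hypothetical helper names, not part of any Kafka client.

```python
import zlib
from collections import defaultdict

def partition_for(key: str, num_partitions: int) -> int:
    """Map a key to a partition. The same key always maps to the same partition."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions

def route(events: list[tuple[str, str]], num_partitions: int) -> dict[int, list[tuple[str, str]]]:
    """Append each (key, payload) event to its partition in arrival order."""
    partitions: dict[int, list[tuple[str, str]]] = defaultdict(list)
    for key, payload in events:
        partitions[partition_for(key, num_partitions)].append((key, payload))
    return partitions

events = [
    ("order-1", "created"),
    ("order-2", "created"),
    ("order-1", "paid"),
    ("order-2", "cancelled"),
    ("order-1", "shipped"),
]
partitions = route(events, num_partitions=4)
```

Ordering is only guaranteed within a partition, so the choice of key decides which events stay ordered relative to each other. Changing the partition count remaps keys, which is one reason the count is hard to change later.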

Designing a RAG Pipeline for Production

Chunking, hybrid retrieval, reranking, grounding with citations, and the evals that separate demo-ware from production.

LLM · RAG · AI Systems

13 min
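One common way to do hybrid retrieval is to run a keyword search and a vector search separately, then merge the two ranked lists with reciprocal rank fusion (RRF). The sketch below assumes that approach; the essay may use a different fusion method. The document ids and the `k=60` constant are illustrative choices for this example.

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse ranked lists: each document scores sum(1 / (k + rank)) over the lists it appears in."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by id so the output is deterministic.
    return sorted(scores, key=lambda d: (-scores[d], d))

keyword_hits = ["a", "b", "c"]  # hypothetical keyword (e.g. BM25) results, best first
vector_hits = ["b", "c", "d"]   # hypothetical embedding-similarity results, best first
fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
```

RRF uses only ranks, not raw scores, so it does not require keyword and vector scores to be on the same scale. A reranker would then typically be applied to the top of the fused list.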

Multi-Tenant Isolation Patterns

Pooled vs siloed vs cells: picking an isolation model that matches your blast-radius budget, not the hype cycle.

Multi-tenant · Architecture · SaaS

9 min

Observability That People Actually Use

SLOs, burn-rate alerts, and why your dashboard graveyard is a product problem, not a tooling one.

Observability · SRE

8 min
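For readers new to burn-rate alerts, the arithmetic is short: burn rate is the observed error ratio divided by the error budget (1 minus the SLO target). The sketch below assumes a 99.9% target and a two-window check with a 14.4 threshold, a commonly cited paging value (sustained for one hour, it consumes 2% of a 30-day budget). The function names and window choices are hypothetical, not the author's alerting configuration.

```python
def burn_rate(error_ratio: float, slo_target: float) -> float:
    """How many times faster than 'exactly on budget' errors are being spent."""
    error_budget = 1.0 - slo_target
    return error_ratio / error_budget

def should_page(long_window_error_ratio: float,
                short_window_error_ratio: float,
                slo_target: float = 0.999,
                threshold: float = 14.4) -> bool:
    """Page only if both windows burn fast: the long window shows the burn is
    significant, the short window shows it is still happening."""
    return (burn_rate(long_window_error_ratio, slo_target) >= threshold
            and burn_rate(short_window_error_ratio, slo_target) >= threshold)

# 2% of requests failing against a 99.9% SLO burns budget about 20x too fast.
paging = should_page(long_window_error_ratio=0.02, short_window_error_ratio=0.02)
# The incident has ended: the long window is still elevated, the short one is not.
recovered = should_page(long_window_error_ratio=0.02, short_window_error_ratio=0.0005)
```

Requiring both windows keeps the alert from firing long after a short spike has ended.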

Ask anything

Have a system design question?

Try the AI assistant — it explains architecture trade-offs, answers hiring questions, and points you to the right next step.

Open the assistant

Contact

Let’s build something that scales.

If you’re hiring or planning a new backend/AI initiative, I can help with architecture, delivery, and execution.

hassanrazamohammadtufail@gmail.com · LinkedIn · GitHub · Download CV

Book a Call

I design systems that scale.