Finished my internship @ Chelsio Communications

This summer I interned at Chelsio Communications, a semiconductor company based in Sunnyvale that specializes in high-performance Ethernet networking and storage solutions. My project: build a production-ready internal chatbot powered by an LLM and a private knowledge base, entirely on-premise with no external APIs.

Every component had to stay local. Sending proprietary data to an outside server wasn’t an option, which pushed me toward building a fully self-contained system from the ground up.

I was given a serious machine to work with: 128 cores, 256 GB RAM, 1.1 TB of disk, and an RTX 5090. The hardware constraint of staying local didn’t sting much when the local hardware is that good.

I chose Retrieval Augmented Generation (RAG) over fine-tuning after weighing the tradeoffs. Fine-tuning would have required retraining model weights, curating a high-quality dataset, and burning weeks of compute time. RAG let me combine a pre-trained LLM’s reasoning ability with a semantic search layer grounded in the company’s actual documents, and it was achievable within the internship timeline.

The hardest part turned out to be a problem I didn’t anticipate: document parsing. Most of Chelsio’s knowledge base lives in PDFs full of complex tables. Traditional parsers fell apart on these, producing poor embeddings and, consequently, poor retrieval. After a lot of trial and error, I came across ColPali, a method that embeds each PDF page directly as an image and performs semantic search over image embeddings, bypassing the parse-chunk-index pipeline entirely. It was the breakthrough the project needed.

Stack

Layer	Tech
Retrieval	ColPali (vision RAG)
Vector DB	Local, on-prem, Chroma DB
Inference	RTX 5090 local
Backend	REST API + RAG pipeline
Containers	Docker

Frontend

Once the RAG pipeline was working, I had to ship it as an actual usable application, which meant building a full-stack product. The backend was entirely mine: API server, database setup, RAG orchestration. For the frontend, I evaluated a few off-the-shelf chat UIs rather than building one from scratch.

UI	Verdict
Gradio	Fine for quick PoCs, not production-ready in terms of polish.
ChatbotUI	Decent but had auth and user management friction.
Open WebUI ✓	Clean, extensible, easy to wire up to custom backends.

Open WebUI connected cleanly to the API endpoints exposed by my backend and gave the tool a professional, usable interface without me having to build a chat UI from scratch. The backend was significantly more complex than the frontend integration, and that’s where I spent most of my time.

Overall, this was the most meaningful technical project I’ve worked on. It has also deepened my interest in AI systems substantially. I want to spend more time learning about LLMs in depth rather than treating them as a black box.

Chelsio Chat Architecture