Stage 06 — Capstone — Build and Deploy a Real AI Application
From Zero to a Live AI App · Capstone Project · ⏱ 12–20 hours
Learning Objectives
By the end of this stage you will have:
- Built a complete, production-grade AI application from scratch
- Implemented streaming, RAG, and tool use in an integrated system
- Deployed a live AI app to the web (Railway, Render, or similar)
- Written evaluation tests to measure output quality
- Added authentication, rate limiting, and cost controls
- Created documentation and a portfolio-ready README
The Capstone Project: TechNodeX Study Assistant
You will build a full-stack AI study assistant that:
- Answers questions about TechNodeX course content (RAG over course materials)
- Uses the Claude API with streaming responses
- Supports multi-turn conversation
- Includes a simple web interface
- Deploys to a public URL
- Has proper error handling, logging, and basic auth
Part 1: Architecture Design
Before writing code, design the system.
User Browser
↕ HTTPS
FastAPI Backend
├── POST /chat (streaming SSE endpoint)
├── GET /health
└── DELETE /session/{session_id} (clear conversation)
↕
Claude API (Anthropic)
↕
ChromaDB (local vector store)
↑
Document Ingester
↑
Course Markdown Files (GitHub)
Tech Stack
| Component | Choice | Why |
|---|---|---|
| API Framework | FastAPI | Async, streaming support, auto-docs |
| LLM | Claude Sonnet | Speed + quality balance |
| Vector Store | ChromaDB | Simple, local, no extra infra |
| Frontend | Vanilla JS + HTML | No build step, easy to deploy |
| Deployment | Railway | Free tier, GitHub auto-deploy |
| Embeddings | all-MiniLM-L6-v2 | Free, local, no API key needed |
Part 2: Project Setup
Directory Structure
study-assistant/
├── app/
│   ├── __init__.py
│   ├── main.py          # FastAPI app
│   ├── rag.py           # RAG system
│   ├── agent.py         # Claude integration
│   └── config.py        # Settings
├── data/
│   └── courses/         # Markdown course files
├── static/
│   ├── index.html       # Frontend
│   └── style.css
├── tests/
│   └── test_rag.py
├── requirements.txt
├── Dockerfile
├── railway.json
└── README.md
requirements.txt
fastapi==0.111.0
uvicorn[standard]==0.30.1
anthropic==0.28.0
chromadb==0.5.0
sentence-transformers==3.0.0
python-dotenv==1.0.1
httpx==0.27.0
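pydantic==2.7.4
pydantic-settings>=2.0,<3  # provides BaseSettings, imported by app/config.py
pytest>=8.0  # needed to run tests/test_rag.py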
Part 3: The Backend
config.py
from functools import lru_cache

from pydantic_settings import BaseSettings, SettingsConfigDict


class Settings(BaseSettings):
    # Values come from environment variables, falling back to a local .env file
    model_config = SettingsConfigDict(env_file=".env")

    anthropic_api_key: str
    claude_model: str = "claude-sonnet-4-5"
    max_tokens: int = 2048
    chroma_db_path: str = "./chroma_data"
    collection_name: str = "technodex_courses"
    max_context_chunks: int = 5
    api_key: str = "changeme"  # Simple auth key; override it outside local development


@lru_cache()
def get_settings() -> Settings:
    return Settings()
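Because `api_key` defaults to `"changeme"`, a deployment that forgets to override it accepts a publicly known key. A minimal sketch of a startup check follows. The helper name `find_config_problems` is hypothetical and not part of the original project, and the values in the example call are placeholders.

```python
# Hypothetical helper (not in the original project): flag risky settings at startup.
DEFAULT_API_KEY = "changeme"  # the placeholder default from config.py


def find_config_problems(api_key: str, anthropic_api_key: str) -> list[str]:
    """Return human-readable problems; an empty list means the settings look usable."""
    problems = []
    if api_key == DEFAULT_API_KEY:
        problems.append("api_key is still the default placeholder")
    if not anthropic_api_key.strip():
        problems.append("anthropic_api_key is empty")
    return problems


# Example values only; in the app these would come from get_settings().
for problem in find_config_problems("changeme", "placeholder-key"):
    print(f"CONFIG WARNING: {problem}")
```

Whether to log a warning or refuse to start is a design choice. A warning is the gentler option here because the sample frontend also hard-codes the same placeholder key.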
rag.py
import hashlib
import re
from pathlib import Path

import chromadb
from chromadb.utils import embedding_functions

from .config import get_settings


class CourseRAG:
    """RAG system over TechNodeX course content."""

    def __init__(self):
        settings = get_settings()
        self.db = chromadb.PersistentClient(path=settings.chroma_db_path)
        self.embedding_fn = embedding_functions.SentenceTransformerEmbeddingFunction(
            model_name="all-MiniLM-L6-v2"
        )
        self.collection = self.db.get_or_create_collection(
            name=settings.collection_name,
            embedding_function=self.embedding_fn,
            metadata={"hnsw:space": "cosine"},
        )

    def ingest_markdown_file(self, filepath: str, course: str, stage: int) -> int:
        """Ingest a course markdown file and return the number of sections stored."""
        path = Path(filepath)
        with open(path, "r", encoding="utf-8") as f:
            text = f.read()
        # Split at every level 1-3 heading so each chunk keeps its own heading
        sections = re.split(r'\n(?=#{1,3} )', text)
        docs, ids, metadatas = [], [], []
        for i, section in enumerate(sections):
            # Skip fragments too short to be useful context
            if len(section.strip()) < 50:
                continue
            # Deterministic ID, so re-ingesting a file updates rather than duplicates
            doc_id = hashlib.md5(f"{filepath}:{i}".encode()).hexdigest()[:12]
            docs.append(section.strip())
            ids.append(doc_id)
            metadatas.append({
                "course": course,
                "stage": stage,
                "filename": path.name,
                "section_index": i,
            })
        if docs:
            self.collection.upsert(documents=docs, ids=ids, metadatas=metadatas)
        return len(docs)

    def search(self, query: str, n_results: int = 5, course: str | None = None) -> list[dict]:
        """Search for relevant course content."""
        kwargs = {"query_texts": [query], "n_results": n_results}
        if course:
            kwargs["where"] = {"course": course}
        results = self.collection.query(**kwargs)
        if not results["documents"][0]:
            return []
        return [
            {
                "text": doc,
                "course": meta.get("course", ""),
                "stage": meta.get("stage", 0),
                "filename": meta.get("filename", ""),
                # The collection uses cosine distance, so 1 - distance is the similarity
                "relevance": 1 - dist,
            }
            for doc, meta, dist in zip(
                results["documents"][0],
                results["metadatas"][0],
                results["distances"][0],
            )
        ]

    def get_context(self, query: str, n_results: int = 5) -> tuple[str, list[dict]]:
        """Get formatted context string and source list."""
        chunks = self.search(query, n_results=n_results)
        if not chunks:
            return "", []
        context_parts = []
        for i, chunk in enumerate(chunks, 1):
            context_parts.append(
                f"[Source {i} - {chunk['course'].title()}, Stage {chunk['stage']}]\n{chunk['text']}"
            )
        return "\n\n---\n\n".join(context_parts), chunks
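Nothing in the listing above calls `ingest_markdown_file`, so the collection stays empty until the course files are loaded. One possible ingestion script is sketched below. The directory layout `data/courses/<course>/stage-<NN>-<title>.md` is an assumption made for illustration (the project structure only specifies `data/courses/`), and the helper names `discover_course_files` and `ingest_all` are hypothetical.

```python
# Hypothetical ingest helpers, e.g. saved as ingest.py in the project root.
# Assumed layout (not specified by the original): data/courses/<course>/stage-<NN>-<title>.md
import re
from pathlib import Path

STAGE_PATTERN = re.compile(r"stage-(\d+)")


def discover_course_files(root: str) -> list[tuple[str, str, int]]:
    """Return (filepath, course, stage) for every markdown file one level under root."""
    found = []
    for path in sorted(Path(root).glob("*/*.md")):
        match = STAGE_PATTERN.search(path.stem)
        stage = int(match.group(1)) if match else 0  # 0 when the name has no stage number
        found.append((str(path), path.parent.name, stage))
    return found


def ingest_all(rag, root: str = "data/courses") -> int:
    """Ingest every discovered file; rag is any object with ingest_markdown_file()."""
    total = 0
    for filepath, course, stage in discover_course_files(root):
        count = rag.ingest_markdown_file(filepath, course=course, stage=stage)
        print(f"{filepath}: {count} sections")
        total += count
    return total


# Usage from the project root (requires the app package and its dependencies):
#   from app.rag import CourseRAG
#   ingest_all(CourseRAG())
```

Because `ingest_markdown_file` derives its IDs from the file path and section index and calls `upsert`, running the script again should update existing sections rather than duplicate them.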
main.py
from fastapi import FastAPI, HTTPException, Depends, Header
from fastapi.middleware.cors import CORSMiddleware
from fastapi.staticfiles import StaticFiles
from fastapi.responses import StreamingResponse, HTMLResponse
from pydantic import BaseModel
from typing import AsyncIterator
import anthropic
import json

from .config import get_settings
from .rag import CourseRAG

app = FastAPI(title="TechNodeX Study Assistant")
settings = get_settings()
rag = CourseRAG()
# Async client: the synchronous client would block the event loop while streaming
claude = anthropic.AsyncAnthropic(api_key=settings.anthropic_api_key)

# Conversation store (in production: use Redis)
conversations: dict[str, list[dict]] = {}

app.add_middleware(
    CORSMiddleware,
    allow_origins=["*"],
    allow_methods=["*"],
    allow_headers=["*"],
)
app.mount("/static", StaticFiles(directory="static"), name="static")


class ChatRequest(BaseModel):
    session_id: str
    message: str
    course_filter: str | None = None


def verify_api_key(x_api_key: str | None = Header(default=None)) -> None:
    if x_api_key != settings.api_key:
        raise HTTPException(status_code=401, detail="Invalid API key")


SYSTEM_PROMPT = """You are a knowledgeable study assistant for TechNodeX, an online technical learning platform.
Your role is to help students understand course material, answer technical questions, and guide their learning journey.
When context from the knowledge base is provided, base your answers on that context and cite sources.
If a question is outside the provided context, use your general knowledge but note this clearly.
Guidelines:
- Be encouraging and educational
- Provide concrete examples when explaining concepts
- Suggest related course stages when relevant
- If a student seems stuck, ask clarifying questions
- Keep answers focused: students are learning, not reading essays"""


async def stream_response(
    session_id: str,
    user_message: str,
    context: str,
    history: list[dict],
) -> AsyncIterator[str]:
    """Stream Claude's response as SSE events."""
    # Retrieved context is attached to the outgoing message only, not to stored history
    context_note = f"\n\nRelevant course material:\n{context}" if context else ""
    full_message = f"{user_message}{context_note}"
    messages = history + [{"role": "user", "content": full_message}]
    full_response = ""
    async with claude.messages.stream(
        model=settings.claude_model,
        max_tokens=settings.max_tokens,
        system=SYSTEM_PROMPT,
        messages=messages,
    ) as stream:
        async for text in stream.text_stream:
            full_response += text
            yield f"data: {json.dumps({'text': text})}\n\n"
    # Save the completed exchange to history
    history.append({"role": "user", "content": user_message})
    history.append({"role": "assistant", "content": full_response})
    conversations[session_id] = history
    yield f"data: {json.dumps({'done': True})}\n\n"


@app.post("/chat")
async def chat(request: ChatRequest, _: None = Depends(verify_api_key)):
    """Stream a chat response with RAG context."""
    history = conversations.get(request.session_id, [])
    context, sources = rag.get_context(
        request.message, n_results=settings.max_context_chunks
    )
    return StreamingResponse(
        stream_response(request.session_id, request.message, context, history),
        media_type="text/event-stream",
    )


@app.delete("/session/{session_id}")
async def reset_session(session_id: str, _: None = Depends(verify_api_key)):
    """Clear conversation history for a session."""
    conversations.pop(session_id, None)
    return {"status": "cleared"}


@app.get("/health")
async def health():
    """Health check endpoint."""
    doc_count = rag.collection.count()
    return {"status": "ok", "documents_indexed": doc_count}


@app.get("/")
async def root():
    """Serve the frontend."""
    with open("static/index.html", encoding="utf-8") as f:
        return HTMLResponse(f.read())
Part 4: The Frontend
static/index.html
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>TechNodeX Study Assistant</title>
<style>
*, *::before, *::after { box-sizing: border-box; margin: 0; padding: 0; }
:root { --bg: #0d0d0d; --surface: #141414; --border: #222; --red: #e63946; --text: #e8e8e8; --muted: #888; }
body { font-family: 'Segoe UI', system-ui, sans-serif; background: var(--bg); color: var(--text); height: 100vh; display: flex; flex-direction: column; }
.header { background: #000; border-bottom: 1px solid var(--border); padding: 1rem 1.5rem; display: flex; align-items: center; gap: 1rem; }
.header h1 { font-size: 1.1rem; font-weight: 700; }
.header h1 span { color: var(--red); }
.status-dot { width: 8px; height: 8px; background: #22c55e; border-radius: 50%; }
.messages { flex: 1; overflow-y: auto; padding: 1.5rem; display: flex; flex-direction: column; gap: 1rem; }
.message { max-width: 800px; }
.message.user { align-self: flex-end; background: rgba(230,57,70,0.15); border: 1px solid rgba(230,57,70,0.3); border-radius: 12px 12px 0 12px; padding: 0.75rem 1rem; }
.message.assistant { align-self: flex-start; background: var(--surface); border: 1px solid var(--border); border-radius: 12px 12px 12px 0; padding: 0.75rem 1rem; }
.message pre { background: #1a1a2e; border: 1px solid var(--border); border-radius: 6px; padding: 0.75rem; overflow-x: auto; margin: 0.5rem 0; }
.message code { font-family: 'Fira Code', monospace; font-size: 0.85em; }
.message p { margin-bottom: 0.5rem; }
.message p:last-child { margin-bottom: 0; }
.input-area { border-top: 1px solid var(--border); padding: 1rem 1.5rem; display: flex; gap: 0.75rem; }
#message-input { flex: 1; background: var(--surface); border: 1px solid var(--border); border-radius: 8px; padding: 0.75rem 1rem; color: var(--text); font-size: 0.95rem; resize: none; min-height: 50px; }
#message-input:focus { outline: none; border-color: var(--red); }
#send-btn { background: var(--red); color: white; border: none; border-radius: 8px; padding: 0.75rem 1.5rem; cursor: pointer; font-weight: 600; }
#send-btn:hover { background: #c1121f; }
#send-btn:disabled { opacity: 0.5; cursor: not-allowed; }
.typing-indicator { display: flex; gap: 4px; align-items: center; padding: 4px 0; }
.typing-indicator span { width: 6px; height: 6px; background: var(--muted); border-radius: 50%; animation: bounce 1s infinite; }
.typing-indicator span:nth-child(2) { animation-delay: 0.2s; }
.typing-indicator span:nth-child(3) { animation-delay: 0.4s; }
@keyframes bounce { 0%, 100% { transform: translateY(0); } 50% { transform: translateY(-5px); } }
</style>
</head>
<body>
<div class="header">
<div class="status-dot"></div>
<h1>Tech<span>Node</span>X Study Assistant</h1>
</div>
<div class="messages" id="messages">
<div class="message assistant">
<p>Hi! I'm your TechNodeX study assistant. Ask me anything about the course material — Python, ethical hacking, AI, Security+, or Kali Linux.</p>
</div>
</div>
<div class="input-area">
<textarea id="message-input" placeholder="Ask a question about the course material..." rows="2"></textarea>
<button id="send-btn">Send</button>
</div>
<script>
const API_KEY = 'changeme'; // In production, get from auth flow
const sessionId = Math.random().toString(36).slice(2, 11); // slice replaces the deprecated substr
const messagesEl = document.getElementById('messages');
const inputEl = document.getElementById('message-input');
const sendBtn = document.getElementById('send-btn');
function addMessage(role, content) {
const div = document.createElement('div');
div.className = `message ${role}`;
if (role === 'assistant') {
// Basic markdown rendering
content = content
.replace(/```(\w+)?\n([\s\S]*?)```/g, '<pre><code>$2</code></pre>')
.replace(/`([^`]+)`/g, '<code>$1</code>')
.replace(/\*\*(.*?)\*\*/g, '<strong>$1</strong>')
.replace(/\n\n/g, '</p><p>')
.replace(/\n/g, '<br>');
div.innerHTML = `<p>${content}</p>`;
} else {
div.textContent = content;
}
messagesEl.appendChild(div);
messagesEl.scrollTop = messagesEl.scrollHeight;
return div;
}
async function sendMessage() {
const message = inputEl.value.trim();
if (!message) return;
inputEl.value = '';
sendBtn.disabled = true;
addMessage('user', message);
// Add typing indicator
const indicator = document.createElement('div');
indicator.className = 'message assistant';
indicator.innerHTML = '<div class="typing-indicator"><span></span><span></span><span></span></div>';
messagesEl.appendChild(indicator);
messagesEl.scrollTop = messagesEl.scrollHeight;
try {
const response = await fetch('/chat', {
method: 'POST',
headers: { 'Content-Type': 'application/json', 'X-API-Key': API_KEY },
body: JSON.stringify({ session_id: sessionId, message })
});
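if (!response.ok) throw new Error(`Server returned ${response.status}`);
const reader = response.body.getReader();
const decoder = new TextDecoder();
let assistantDiv = null;
let fullText = '';
let buffer = '';
let finished = false;
while (!finished) {
const { done, value } = await reader.read();
if (done) break;
// A chunk can end mid-line, so buffer it and parse only complete lines
buffer += decoder.decode(value, { stream: true });
const lines = buffer.split('\n');
buffer = lines.pop();
for (const line of lines) {
if (!line.startsWith('data: ')) continue;
const data = JSON.parse(line.slice(6));
if (data.done) { finished = true; break; }
if (data.text) {
fullText += data.text;
if (!assistantDiv) {
indicator.remove();
assistantDiv = addMessage('assistant', '');
}
// Escape HTML first, then apply basic markdown to the accumulated text
const rendered = fullText
.replace(/&/g, '&amp;').replace(/</g, '&lt;').replace(/>/g, '&gt;')
.replace(/```(\w+)?\n([\s\S]*?)```/g, '<pre><code>$2</code></pre>')
.replace(/`([^`]+)`/g, '<code>$1</code>')
.replace(/\*\*(.*?)\*\*/g, '<strong>$1</strong>')
.replace(/\n\n/g, '</p><p>')
.replace(/\n/g, '<br>');
assistantDiv.innerHTML = `<p>${rendered}</p>`;
messagesEl.scrollTop = messagesEl.scrollHeight;
}
}
}
// Remove the typing indicator if the stream ended without any text
if (!assistantDiv) indicator.remove();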
} catch (error) {
indicator.remove();
addMessage('assistant', `Error: ${error.message}. Please try again.`);
}
sendBtn.disabled = false;
inputEl.focus();
}
sendBtn.addEventListener('click', sendMessage);
inputEl.addEventListener('keydown', e => {
if (e.key === 'Enter' && !e.shiftKey) {
e.preventDefault();
sendMessage();
}
});
</script>
</body>
</html>
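The `/chat` endpoint emits Server-Sent Events as `data: {...}` lines separated by blank lines, and a network chunk can end in the middle of a line. The sketch below shows one way to parse such a stream from Python, which can be handy for testing the endpoint without a browser. `iter_sse_text` is a hypothetical helper, and the sample chunks are made up to show an event split across two reads.

```python
# Hypothetical helper for reading the /chat stream from Python.
import json
from typing import Iterable, Iterator


def iter_sse_text(chunks: Iterable[str]) -> Iterator[str]:
    """Yield the 'text' payloads from a stream of raw SSE chunks.

    Chunks may split an event anywhere, so incomplete lines are buffered
    until their newline arrives.
    """
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        *lines, buffer = buffer.split("\n")  # the last piece may be an incomplete line
        for line in lines:
            if not line.startswith("data: "):
                continue
            event = json.loads(line[len("data: "):])
            if event.get("done"):
                return
            if "text" in event:
                yield event["text"]


# Simulated stream in which the first event is split across two chunks
sample = ['data: {"te', 'xt": "Hel"}\n\ndata: {"text": "lo"}\n\n', 'data: {"done": true}\n\n']
print("".join(iter_sse_text(sample)))  # prints: Hello
```

With `httpx` (already in requirements.txt), the chunks could come from `response.iter_text()` inside an `httpx.stream("POST", ...)` context, sending the same JSON body and `X-API-Key` header that the frontend uses.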
Part 5: Deployment
Dockerfile
FROM python:3.11-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
# Pre-download the embedding model
RUN python -c "from sentence_transformers import SentenceTransformer; SentenceTransformer('all-MiniLM-L6-v2')"
EXPOSE 8000
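# Use the platform-provided PORT when one is set (shell form so the variable expands), otherwise 8000
CMD ["sh", "-c", "uvicorn app.main:app --host 0.0.0.0 --port ${PORT:-8000}"]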
Deploy to Railway
- Push your code to GitHub
- Go to railway.app and connect your repo
- Add environment variable: ANTHROPIC_API_KEY=sk-ant-...
- Railway auto-builds and deploys from your Dockerfile
- Get a public URL like https://your-app.railway.app
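Once the deploy finishes, it is worth confirming that the live service has documents indexed, since the ChromaDB store is a local directory and a fresh container may start with an empty one. A small smoke-check sketch using only the standard library follows. The function name `check_health` is hypothetical, and the URL in the usage comment is the placeholder from the step above.

```python
# Hypothetical post-deploy smoke check (standard library only).
import json
from typing import Callable
from urllib.request import urlopen


def _http_get(url: str) -> str:
    """Fetch a URL and return the response body as text."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8")


def check_health(base_url: str, fetch: Callable[[str], str] = _http_get) -> tuple[bool, str]:
    """Return (ok, message) based on the app's /health response."""
    payload = json.loads(fetch(base_url.rstrip("/") + "/health"))
    if payload.get("status") != "ok":
        return False, f"unexpected status: {payload.get('status')!r}"
    count = payload.get("documents_indexed", 0)
    if count == 0:
        return False, "service is up but no documents are indexed"
    return True, f"ok, {count} documents indexed"


# Usage against a deployment (placeholder URL):
#   ok, message = check_health("https://your-app.railway.app")
```

The `fetch` parameter exists so the check can be exercised with a stub in tests, without any network access.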
Part 6: Evaluation
# tests/test_rag.py
import pytest

from app.rag import CourseRAG

TEST_QUESTIONS = [
    ("What is a token in the context of LLMs?", "token"),
    ("How do I install Kali Linux?", "kali"),
    ("What is SQL injection?", "sql"),
    ("Explain the difference between symmetric and asymmetric encryption", "encryption"),
    ("What is the purpose of a system prompt?", "system prompt"),
]


@pytest.fixture(scope="module")
def rag():
    # Module scope so the embedding model loads once per test file
    return CourseRAG()


def test_retrieval_recall(rag):
    """Test that the expected keyword appears in the top-5 results for at least 80% of questions."""
    hits = 0
    for question, expected_keyword in TEST_QUESTIONS:
        results = rag.search(question, n_results=5)
        assert len(results) > 0, f"No results for: {question}"
        # Check keyword appears in any of the top-5 results
        found = any(
            expected_keyword.lower() in r["text"].lower()
            for r in results
        )
        if found:
            hits += 1
        else:
            print(f"MISS: '{question}' - '{expected_keyword}' not in top 5")
    recall = hits / len(TEST_QUESTIONS)
    assert recall >= 0.8, f"Retrieval recall {recall:.0%} is below the 80% target"


def test_course_filter(rag):
    """Test that course filtering works correctly."""
    results = rag.search("how to use Python", n_results=3, course="python-security")
    for r in results:
        assert r["course"] == "python-security", f"Expected python-security, got {r['course']}"
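The retrieval tests above need a populated ChromaDB collection and the embedding model. The heading-based chunking rule can be checked in isolation, with no database at all. The sketch below copies the split pattern and the 50-character minimum from `ingest_markdown_file` into a hypothetical standalone function, `split_sections`, and the sample text is made up.

```python
# Sketch: the split rule from ingest_markdown_file, extracted for isolated testing.
import re

MIN_SECTION_CHARS = 50  # sections shorter than this are skipped, as in rag.py


def split_sections(text: str) -> list[str]:
    """Split markdown at lines starting with #, ##, or ### and drop tiny sections."""
    sections = re.split(r"\n(?=#{1,3} )", text)
    return [s.strip() for s in sections if len(s.strip()) >= MIN_SECTION_CHARS]


def test_split_sections_keeps_heading_with_body():
    text = (
        "# Tokens\n" + "A token is a chunk of text the model reads. " * 2 + "\n"
        "## Tiny\nshort\n"
        "## Context windows\n" + "The context window limits how many tokens fit. " * 2
    )
    sections = split_sections(text)
    # The short "Tiny" section is dropped; the other two keep their headings
    assert len(sections) == 2
    assert sections[0].startswith("# Tokens")
    assert sections[1].startswith("## Context windows")
```

If this approach is adopted, `ingest_markdown_file` could call the same function so the test and the ingester cannot drift apart.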
Checkpoint Assessment (Final)
- Your streaming endpoint is working locally but fails in production. What are the two most common causes?
- A user reports the chatbot is making up information about course topics that aren't in the knowledge base. How do you fix this?
- You're being rate-limited by Claude. What three changes would reduce your API calls?
- Describe how you would add user authentication to this system (not just an API key).
- Your Docker image is 8GB because sentence-transformers downloaded a large model. How would you fix this in production?
- What metrics would you track in production to know if the system is performing well?
Graduation Checklist
You've completed the AI Fundamentals course. Before marking this stage done, verify:
- [ ] App runs locally: uvicorn app.main:app --reload
- [ ] Health endpoint returns correct document count
- [ ] Chat endpoint streams responses correctly
- [ ] RAG retrieves relevant content for course-related questions
- [ ] Conversation history works across turns
- [ ] App deploys successfully to Railway/Render/Fly.io
- [ ] Live URL accessible from browser
- [ ] Evaluation tests pass (≥ 80% retrieval recall)
- [ ] README documents setup, architecture, and usage
Portfolio presentation: Record a 2-minute screen capture of your deployed app answering 3 different questions. This is your portfolio artifact for this course.