Stage 06 — Capstone — Build and Deploy a Real AI Application

From Zero to a Live AI App  ·  Capstone Project  ·  ⏱ 12–20 hours

Learning Objectives

By the end of this stage you will have:

  • Built a complete, production-grade AI application from scratch
  • Implemented streaming, RAG, and tool use in an integrated system
  • Deployed a live AI app to the web (Railway, Render, or similar)
  • Written evaluation tests to measure output quality
  • Added authentication, rate limiting, and cost controls
  • Created documentation and a portfolio-ready README

The Capstone Project: TechNodeX Study Assistant

You will build a full-stack AI study assistant that:

  • Answers questions about TechNodeX course content (RAG over course materials)
  • Uses the Claude API with streaming responses
  • Supports multi-turn conversation
  • Includes a simple web interface
  • Deploys to a public URL
  • Has proper error handling, logging, and basic auth

Part 1: Architecture Design

Before writing code, design the system.

User Browser
    ↕ HTTPS
FastAPI Backend
    ├── /chat (streaming SSE endpoint)
    ├── /health
    └── /session/{session_id} (DELETE, clear conversation)
        ↕
    Claude API (Anthropic)
        ↕
    ChromaDB (local vector store)
        ↑
    Document Ingester
        ↑
    Course Markdown Files (GitHub)

Tech Stack

Component       Choice              Why
API Framework   FastAPI             Async, streaming support, auto-docs
LLM             Claude Sonnet       Speed + quality balance
Vector Store    ChromaDB            Simple, local, no extra infra
Frontend        Vanilla JS + HTML   No build step, easy to deploy
Deployment      Railway             Free tier, GitHub auto-deploy
Embeddings      all-MiniLM-L6-v2    Free, local, no API key needed

Part 2: Project Setup

Directory Structure

study-assistant/
├── app/
│   ├── __init__.py
│   ├── main.py              # FastAPI app
│   ├── rag.py               # RAG system
│   ├── agent.py             # Claude integration
│   └── config.py            # Settings
├── data/
│   └── courses/             # Markdown course files
├── static/
│   ├── index.html           # Frontend
│   └── style.css
├── tests/
│   └── test_rag.py
├── requirements.txt
├── Dockerfile
├── railway.json
└── README.md

requirements.txt

fastapi==0.111.0
uvicorn[standard]==0.30.1
anthropic==0.28.0
chromadb==0.5.0
sentence-transformers==3.0.0
python-dotenv==1.0.1
httpx==0.27.0
pydantic==2.7.4
pydantic-settings>=2.0,<3
pytest>=8.0

Part 3: The Backend

config.py

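from functools import lru_cache

from pydantic_settings import BaseSettings, SettingsConfigDict


class Settings(BaseSettings):
    """App settings, loaded from environment variables or a local .env file."""

    # Field names map to env vars case-insensitively (ANTHROPIC_API_KEY, API_KEY, ...)
    model_config = SettingsConfigDict(env_file=".env")

    anthropic_api_key: str  # Required; startup fails if it is missing
    claude_model: str = "claude-sonnet-4-5"
    max_tokens: int = 2048
    chroma_db_path: str = "./chroma_data"
    collection_name: str = "technodex_courses"
    max_context_chunks: int = 5
    api_key: str = "changeme"  # Simple auth key; override it outside local development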


@lru_cache()
def get_settings() -> Settings:
    return Settings()

rag.py

import chromadb
from chromadb.utils import embedding_functions
from pathlib import Path
import hashlib
import re
from .config import get_settings


class CourseRAG:
    """RAG system over TechNodeX course content."""
    
    def __init__(self):
        settings = get_settings()
        self.db = chromadb.PersistentClient(path=settings.chroma_db_path)
        self.embedding_fn = embedding_functions.SentenceTransformerEmbeddingFunction(
            model_name="all-MiniLM-L6-v2"
        )
        self.collection = self.db.get_or_create_collection(
            name=settings.collection_name,
            embedding_function=self.embedding_fn,
            metadata={"hnsw:space": "cosine"}
        )
    
    def ingest_markdown_file(self, filepath: str, course: str, stage: int) -> int:
        """Ingest a course markdown file."""
        path = Path(filepath)
        with open(path, "r", encoding="utf-8") as f:
            text = f.read()
        
        # Split by heading
        sections = re.split(r'\n(?=#{1,3} )', text)
        
        docs, ids, metadatas = [], [], []
        for i, section in enumerate(sections):
            if len(section.strip()) < 50:
                continue
            doc_id = hashlib.md5(f"{filepath}:{i}".encode()).hexdigest()[:12]
            docs.append(section.strip())
            ids.append(doc_id)
            metadatas.append({
                "course": course,
                "stage": stage,
                "filename": path.name,
                "section_index": i
            })
        
        if docs:
            self.collection.upsert(documents=docs, ids=ids, metadatas=metadatas)
        
        return len(docs)
    
    def search(self, query: str, n_results: int = 5, course: str | None = None) -> list[dict]:
        """Search for relevant course content."""
        kwargs = {"query_texts": [query], "n_results": n_results}
        if course:
            kwargs["where"] = {"course": course}
        
        results = self.collection.query(**kwargs)
        
        if not results["documents"][0]:
            return []
        
        return [
            {
                "text": doc,
                "course": meta.get("course", ""),
                "stage": meta.get("stage", 0),
                "filename": meta.get("filename", ""),
                "relevance": 1 - dist
            }
            for doc, meta, dist in zip(
                results["documents"][0],
                results["metadatas"][0],
                results["distances"][0]
            )
        ]
    
    def get_context(self, query: str, n_results: int = 5) -> tuple[str, list[dict]]:
        """Get formatted context string and source list."""
        chunks = self.search(query, n_results=n_results)
        
        if not chunks:
            return "", []
        
        context_parts = []
        for i, chunk in enumerate(chunks, 1):
            context_parts.append(
                f"[Source {i} — {chunk['course'].title()}, Stage {chunk['stage']}]\n{chunk['text']}"
            )
        
        return "\n\n---\n\n".join(context_parts), chunks
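
The heading split in ingest_markdown_file relies on a regex lookahead: the text is cut before each level 1 to 3 heading, so every heading stays attached to the section it introduces. The following is a standalone sketch of that same logic that you can run without ChromaDB. The function name chunk_markdown and the sample text are made up for illustration.

```python
import hashlib
import re


def chunk_markdown(text: str, source: str, min_chars: int = 50) -> list[dict]:
    """Split markdown on level 1-3 headings and give each chunk a stable ID."""
    # The lookahead (?=...) splits *before* each heading, so the heading line
    # stays at the top of its own section.
    sections = re.split(r"\n(?=#{1,3} )", text)
    chunks = []
    for index, section in enumerate(sections):
        body = section.strip()
        if len(body) < min_chars:
            continue  # skip fragments too short to be useful context
        # Same ID scheme as ingest_markdown_file: hash of "source:index".
        chunk_id = hashlib.md5(f"{source}:{index}".encode()).hexdigest()[:12]
        chunks.append({"id": chunk_id, "index": index, "text": body})
    return chunks


sample = (
    "# Tokens\nA token is a unit of text that the model reads and writes.\n"
    "## Tiny\nToo short.\n"
    "### Context windows\nThe context window limits how many tokens fit in one request.\n"
    "#### Deep heading\nLevel-4 headings do not start a new chunk."
)
for chunk in chunk_markdown(sample, source="stage-01.md"):
    print(chunk["index"], chunk["text"].splitlines()[0])
```

The short "Tiny" section is dropped, and the level-4 heading stays inside the "Context windows" chunk. Because each ID is derived from the file path and the section's position, re-ingesting an unchanged file overwrites the same entries, while inserting a new section near the top of a file shifts the IDs of every section after it.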

main.py

from fastapi import FastAPI, HTTPException, Depends, Header
from fastapi.middleware.cors import CORSMiddleware
from fastapi.staticfiles import StaticFiles
from fastapi.responses import StreamingResponse, HTMLResponse
from pydantic import BaseModel
from typing import AsyncIterator
import anthropic
import json
import logging
import secrets
from .config import get_settings
from .rag import CourseRAG

logger = logging.getLogger(__name__)

app = FastAPI(title="TechNodeX Study Assistant")
settings = get_settings()
rag = CourseRAG()
# Async client: the streaming endpoint runs on the event loop, so a blocking
# client would stall every other request while one response streams.
claude = anthropic.AsyncAnthropic(api_key=settings.anthropic_api_key)

# Conversation store (in production: use Redis)
conversations: dict[str, list[dict]] = {}

# Wide-open CORS keeps local development simple; restrict allow_origins
# to your own domain before going live.
app.add_middleware(
    CORSMiddleware,
    allow_origins=["*"],
    allow_methods=["*"],
    allow_headers=["*"]
)

app.mount("/static", StaticFiles(directory="static"), name="static")


class ChatRequest(BaseModel):
    session_id: str
    message: str
    course_filter: str | None = None


def verify_api_key(x_api_key: str | None = Header(default=None)) -> None:
    # compare_digest avoids leaking key contents through response timing
    if x_api_key is None or not secrets.compare_digest(
        x_api_key.encode(), settings.api_key.encode()
    ):
        raise HTTPException(status_code=401, detail="Invalid API key")


SYSTEM_PROMPT = """You are a knowledgeable study assistant for TechNodeX, an online technical learning platform.

Your role is to help students understand course material, answer technical questions, and guide their learning journey.

When context from the knowledge base is provided, base your answers on that context and cite sources.
If a question is outside the provided context, use your general knowledge but note this clearly.

Guidelines:
- Be encouraging and educational
- Provide concrete examples when explaining concepts
- Suggest related course stages when relevant
- If a student seems stuck, ask clarifying questions
- Keep answers focused: students are learning, not reading essays"""


async def stream_response(
    session_id: str,
    user_message: str,
    context: str,
    history: list[dict]
) -> AsyncIterator[str]:
    """Stream Claude's response as SSE events."""

    # Retrieved context is sent with this turn only; the stored history keeps
    # the bare question so old context is not re-sent on every request.
    context_note = f"\n\nRelevant course material:\n{context}" if context else ""
    full_message = f"{user_message}{context_note}"

    messages = history + [{"role": "user", "content": full_message}]

    full_response = ""

    try:
        async with claude.messages.stream(
            model=settings.claude_model,
            max_tokens=settings.max_tokens,
            system=SYSTEM_PROMPT,
            messages=messages
        ) as stream:
            async for text in stream.text_stream:
                full_response += text
                yield f"data: {json.dumps({'text': text})}\n\n"
    except anthropic.APIError as exc:
        # Tell the client what happened; the failed turn is not saved
        logger.error("Claude API error for session %s: %s", session_id, exc)
        yield f"data: {json.dumps({'error': 'Upstream model error'})}\n\n"
        yield f"data: {json.dumps({'done': True})}\n\n"
        return

    # Save both sides of the turn to history
    history.append({"role": "user", "content": user_message})
    history.append({"role": "assistant", "content": full_response})
    conversations[session_id] = history

    yield f"data: {json.dumps({'done': True})}\n\n"


@app.post("/chat")
async def chat(request: ChatRequest, _: None = Depends(verify_api_key)):
    """Stream a chat response with RAG context."""

    history = conversations.get(request.session_id, [])
    # Note: request.course_filter is accepted but not applied here;
    # CourseRAG.search() takes a course argument if you want to wire it in.
    context, _sources = rag.get_context(
        request.message, n_results=settings.max_context_chunks
    )

    return StreamingResponse(
        stream_response(request.session_id, request.message, context, history),
        media_type="text/event-stream"
    )


@app.delete("/session/{session_id}")
async def reset_session(session_id: str, _: None = Depends(verify_api_key)):
    """Clear conversation history for a session."""
    conversations.pop(session_id, None)
    return {"status": "cleared"}


@app.get("/health")
async def health():
    """Health check endpoint."""
    doc_count = rag.collection.count()
    return {"status": "ok", "documents_indexed": doc_count}


@app.get("/")
async def root():
    """Serve the frontend."""
    with open("static/index.html") as f:
        return HTMLResponse(f.read())

Part 4: The Frontend

static/index.html

<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="UTF-8">
  <meta name="viewport" content="width=device-width, initial-scale=1.0">
  <title>TechNodeX Study Assistant</title>
  <style>
    *, *::before, *::after { box-sizing: border-box; margin: 0; padding: 0; }
    :root { --bg: #0d0d0d; --surface: #141414; --border: #222; --red: #e63946; --text: #e8e8e8; --muted: #888; }
    body { font-family: 'Segoe UI', system-ui, sans-serif; background: var(--bg); color: var(--text); height: 100vh; display: flex; flex-direction: column; }
    
    .header { background: #000; border-bottom: 1px solid var(--border); padding: 1rem 1.5rem; display: flex; align-items: center; gap: 1rem; }
    .header h1 { font-size: 1.1rem; font-weight: 700; }
    .header h1 span { color: var(--red); }
    .status-dot { width: 8px; height: 8px; background: #22c55e; border-radius: 50%; }
    
    .messages { flex: 1; overflow-y: auto; padding: 1.5rem; display: flex; flex-direction: column; gap: 1rem; }
    .message { max-width: 800px; }
    .message.user { align-self: flex-end; background: rgba(230,57,70,0.15); border: 1px solid rgba(230,57,70,0.3); border-radius: 12px 12px 0 12px; padding: 0.75rem 1rem; }
    .message.assistant { align-self: flex-start; background: var(--surface); border: 1px solid var(--border); border-radius: 12px 12px 12px 0; padding: 0.75rem 1rem; }
    .message pre { background: #1a1a2e; border: 1px solid var(--border); border-radius: 6px; padding: 0.75rem; overflow-x: auto; margin: 0.5rem 0; }
    .message code { font-family: 'Fira Code', monospace; font-size: 0.85em; }
    .message p { margin-bottom: 0.5rem; }
    .message p:last-child { margin-bottom: 0; }
    
    .input-area { border-top: 1px solid var(--border); padding: 1rem 1.5rem; display: flex; gap: 0.75rem; }
    #message-input { flex: 1; background: var(--surface); border: 1px solid var(--border); border-radius: 8px; padding: 0.75rem 1rem; color: var(--text); font-size: 0.95rem; resize: none; min-height: 50px; }
    #message-input:focus { outline: none; border-color: var(--red); }
    #send-btn { background: var(--red); color: white; border: none; border-radius: 8px; padding: 0.75rem 1.5rem; cursor: pointer; font-weight: 600; }
    #send-btn:hover { background: #c1121f; }
    #send-btn:disabled { opacity: 0.5; cursor: not-allowed; }
    
    .typing-indicator { display: flex; gap: 4px; align-items: center; padding: 4px 0; }
    .typing-indicator span { width: 6px; height: 6px; background: var(--muted); border-radius: 50%; animation: bounce 1s infinite; }
    .typing-indicator span:nth-child(2) { animation-delay: 0.2s; }
    .typing-indicator span:nth-child(3) { animation-delay: 0.4s; }
    @keyframes bounce { 0%, 100% { transform: translateY(0); } 50% { transform: translateY(-5px); } }
  </style>
</head>
<body>
<div class="header">
  <div class="status-dot"></div>
  <h1>Tech<span>Node</span>X Study Assistant</h1>
</div>

<div class="messages" id="messages">
  <div class="message assistant">
    <p>Hi! I'm your TechNodeX study assistant. Ask me anything about the course material — Python, ethical hacking, AI, Security+, or Kali Linux.</p>
  </div>
</div>

<div class="input-area">
  <textarea id="message-input" placeholder="Ask a question about the course material..." rows="2"></textarea>
  <button id="send-btn">Send</button>
</div>

<script>
  const API_KEY = 'changeme'; // In production, get from auth flow
  const sessionId = Math.random().toString(36).slice(2, 11);
  
  const messagesEl = document.getElementById('messages');
  const inputEl = document.getElementById('message-input');
  const sendBtn = document.getElementById('send-btn');
  
  // Escape HTML first so model output can never inject markup into the page
  function escapeHtml(text) {
    return text
      .replace(/&/g, '&amp;')
      .replace(/</g, '&lt;')
      .replace(/>/g, '&gt;');
  }
  
  // Basic markdown rendering: code blocks, inline code, bold, paragraphs
  function renderMarkdown(text) {
    const html = escapeHtml(text)
      .replace(/```(\w+)?\n([\s\S]*?)```/g, '<pre><code>$2</code></pre>')
      .replace(/`([^`]+)`/g, '<code>$1</code>')
      .replace(/\*\*(.*?)\*\*/g, '<strong>$1</strong>')
      .replace(/\n\n/g, '</p><p>')
      .replace(/\n/g, '<br>');
    return `<p>${html}</p>`;
  }
  
  function addMessage(role, content) {
    const div = document.createElement('div');
    div.className = `message ${role}`;
    
    if (role === 'assistant') {
      div.innerHTML = renderMarkdown(content);
    } else {
      div.textContent = content;
    }
    
    messagesEl.appendChild(div);
    messagesEl.scrollTop = messagesEl.scrollHeight;
    return div;
  }
  
  async function sendMessage() {
    const message = inputEl.value.trim();
    if (!message) return;
    
    inputEl.value = '';
    sendBtn.disabled = true;
    
    addMessage('user', message);
    
    // Add typing indicator
    const indicator = document.createElement('div');
    indicator.className = 'message assistant';
    indicator.innerHTML = '<div class="typing-indicator"><span></span><span></span><span></span></div>';
    messagesEl.appendChild(indicator);
    messagesEl.scrollTop = messagesEl.scrollHeight;
    
    try {
      const response = await fetch('/chat', {
        method: 'POST',
        headers: { 'Content-Type': 'application/json', 'X-API-Key': API_KEY },
        body: JSON.stringify({ session_id: sessionId, message })
      });
      if (!response.ok) throw new Error(`Server returned ${response.status}`);
      
      const reader = response.body.getReader();
      const decoder = new TextDecoder();
      
      let assistantDiv = null;
      let fullText = '';
      let buffer = '';
      let finished = false;
      
      while (!finished) {
        const { done, value } = await reader.read();
        if (done) break;
        
        // Network chunks can end mid-event, so buffer until a full line arrives
        buffer += decoder.decode(value, { stream: true });
        const lines = buffer.split('\n');
        buffer = lines.pop(); // keep the incomplete last line for the next read
        
        for (const line of lines) {
          if (!line.startsWith('data: ')) continue;
          const data = JSON.parse(line.slice(6));
          
          if (data.error) throw new Error(data.error);
          if (data.done) { finished = true; break; }
          if (data.text) {
            fullText += data.text;
            if (!assistantDiv) {
              indicator.remove();
              assistantDiv = addMessage('assistant', '');
            }
            // Re-render the accumulated text
            assistantDiv.innerHTML = renderMarkdown(fullText);
            messagesEl.scrollTop = messagesEl.scrollHeight;
          }
        }
      }
    } catch (error) {
      indicator.remove();
      addMessage('assistant', `Error: ${error.message}. Please try again.`);
    } finally {
      // Runs on success and failure, so the UI never stays locked
      indicator.remove();
      sendBtn.disabled = false;
      inputEl.focus();
    }
  }
  
  sendBtn.addEventListener('click', sendMessage);
  inputEl.addEventListener('keydown', e => {
    if (e.key === 'Enter' && !e.shiftKey) {
      e.preventDefault();
      sendMessage();
    }
  });
</script>
</body>
</html>
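
The /chat endpoint sends each piece of text as an SSE event of the form "data: {json}" followed by a blank line. Network chunks do not line up with event boundaries, so any client has to buffer partial events before parsing them. Below is a sketch of that buffering in Python, which can be handy for smoke-testing the endpoint without a browser. The class name SSEBuffer is hypothetical, and the parser only handles the data lines this backend emits, not the full SSE specification.

```python
import json


class SSEBuffer:
    """Reassemble `data: {...}` events from arbitrarily split byte chunks."""

    def __init__(self) -> None:
        self._pending = b""

    def feed(self, chunk: bytes) -> list[dict]:
        """Add raw bytes and return every event completed so far."""
        self._pending += chunk
        events = []
        # Events end with a blank line; anything after the last blank line is
        # an incomplete event and stays buffered for the next call.
        *complete, self._pending = self._pending.split(b"\n\n")
        for frame in complete:
            for line in frame.decode("utf-8").split("\n"):
                if line.startswith("data: "):
                    events.append(json.loads(line[len("data: "):]))
        return events


buffer = SSEBuffer()
text = ""
# Simulated network chunks that cut the first event in half.
simulated = [
    b'data: {"text": "Hel',
    b'lo"}\n\ndata: {"text": " world"}\n\n',
    b'data: {"done": true}\n\n',
]
for chunk in simulated:
    for event in buffer.feed(chunk):
        if event.get("done"):
            break
        text += event.get("text", "")
print(text)
```

Splitting on raw bytes and decoding only complete events also avoids errors when a multi-byte UTF-8 character is cut across two chunks.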

Part 5: Deployment

Dockerfile

FROM python:3.11-slim

WORKDIR /app

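COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Pre-download the embedding model before copying the source, so this layer
# stays cached when only application code changes
RUN python -c "from sentence_transformers import SentenceTransformer; SentenceTransformer('all-MiniLM-L6-v2')"

COPY . .

EXPOSE 8000

# Hosts such as Railway inject PORT at runtime; fall back to 8000 locally
CMD ["sh", "-c", "uvicorn app.main:app --host 0.0.0.0 --port ${PORT:-8000}"]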

Deploy to Railway

  1. Push your code to GitHub
  2. Go to railway.app, connect your repo
  3. Add environment variables: ANTHROPIC_API_KEY=sk-ant-... and API_KEY set to a long random value (otherwise the app accepts the default, changeme)
  4. Railway auto-builds and deploys from your Dockerfile
  5. Get a public URL like https://your-app.railway.app
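
Nothing in the Dockerfile or the steps above populates the vector store, so a fresh deployment reports zero indexed documents until the course files are ingested. The following is a sketch of a discovery helper that walks the course directory and hands each file to an ingest function. The layout data/courses/<course>/stage-NN*.md is an assumption (the original only specifies data/courses/), and the name ingest_all is hypothetical.

```python
import re
from pathlib import Path
from typing import Callable

# Assumed layout, not specified in the original: data/courses/<course>/stage-NN*.md
STAGE_PATTERN = re.compile(r"stage-(\d+)")


def ingest_all(root: str, ingest: Callable[[str, str, int], int]) -> int:
    """Call ingest(filepath, course, stage) for every markdown file under root."""
    total = 0
    for path in sorted(Path(root).glob("*/*.md")):
        match = STAGE_PATTERN.search(path.stem)
        if match is None:
            continue  # skip files that do not follow the assumed naming
        course = path.parent.name  # the directory name doubles as the course label
        stage = int(match.group(1))
        total += ingest(str(path), course, stage)
    return total
```

With the project's own class this could be run as ingest_all("data/courses", CourseRAG().ingest_markdown_file), either during the image build or once at startup. Because ingestion uses upsert with IDs derived from the file path and section index, re-running it over unchanged files does not create duplicates.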

Part 6: Evaluation

# tests/test_rag.py
import pytest
from app.rag import CourseRAG

TEST_QUESTIONS = [
    ("What is a token in the context of LLMs?", "token"),
    ("How do I install Kali Linux?", "kali"),
    ("What is SQL injection?", "sql"),
    ("Explain the difference between symmetric and asymmetric encryption", "encryption"),
    ("What is the purpose of a system prompt?", "system prompt"),
]

@pytest.fixture(scope="module")
def rag():
    # Module scope: load the embedding model once for all tests
    return CourseRAG()

def test_retrieval_recall(rag):
    """Test that the expected keyword appears in the top-5 results for at least 80% of questions."""
    hits = 0
    for question, expected_keyword in TEST_QUESTIONS:
        results = rag.search(question, n_results=5)
        assert len(results) > 0, f"No results for: {question}"
        
        # Check keyword appears in any of the top-5 results
        found = any(
            expected_keyword.lower() in r["text"].lower()
            for r in results
        )
        
        if found:
            hits += 1
        else:
            print(f"MISS: '{question}': '{expected_keyword}' not in top 5")
    
    recall = hits / len(TEST_QUESTIONS)
    assert recall >= 0.8, f"Retrieval recall {recall:.0%} is below the 80% target"

def test_course_filter(rag):
    """Test that course filtering works correctly."""
    results = rag.search("how to use Python", n_results=3, course="python-security")
    for r in results:
        assert r["course"] == "python-security", f"Expected python-security, got {r['course']}"
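
The tests above measure retrieval only, while the objectives also call for measuring output quality. One cheap, model-free check is whether an answer cites sources that were actually supplied. The sketch below assumes the model echoes the "Source N" labels that get_context puts into the prompt, which it is asked to do but is not guaranteed to do. The function name check_citations and the sample answer are made up for illustration.

```python
import re


def check_citations(answer: str, n_sources: int) -> dict:
    """Report which 'Source N' references in an answer point at real chunks."""
    cited = {int(n) for n in re.findall(r"Source\s+(\d+)", answer)}
    # Sources are numbered from 1 in the prompt, so anything outside
    # 1..n_sources refers to a chunk that was never provided.
    valid = {n for n in cited if 1 <= n <= n_sources}
    return {
        "cited": sorted(cited),
        "invalid": sorted(cited - valid),
        "has_citation": bool(valid),
    }


report = check_citations(
    "A token is a chunk of text (Source 1). See also Source 4.", n_sources=3
)
print(report)
```

Here the answer cites Source 4 although only three chunks were supplied, so 4 lands in the invalid list. A check like this catches fabricated citations but says nothing about whether the cited chunk supports the claim, so treat it as a first filter, not a full evaluation.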

Checkpoint Assessment (Final)

  1. Your streaming endpoint is working locally but fails in production. What are the two most common causes?
  2. A user reports the chatbot is making up information about course topics that aren't in the knowledge base. How do you fix this?
  3. You're being rate-limited by Claude. What three changes would reduce your API calls?
  4. Describe how you would add user authentication to this system (not just an API key).
  5. Your Docker image is 8GB because sentence-transformers downloaded a large model. How would you fix this in production?
  6. What metrics would you track in production to know if the system is performing well?

Graduation Checklist

You've completed the AI Fundamentals course. Before marking this stage done, verify:

  • [ ] App runs locally: uvicorn app.main:app --reload
  • [ ] Health endpoint returns correct document count
  • [ ] Chat endpoint streams responses correctly
  • [ ] RAG retrieves relevant content for course-related questions
  • [ ] Conversation history works across turns
  • [ ] App deploys successfully to Railway/Render/Fly.io
  • [ ] Live URL accessible from browser
  • [ ] Evaluation tests pass (≥ 80% retrieval recall)
  • [ ] README documents setup, architecture, and usage

Portfolio presentation: Record a 2-minute screen capture of your deployed app answering 3 different questions. This is your portfolio artifact for this course.
