Building a RAG Application in Python
Published 6/2026
MP4 | Video: h264, 1920x1080 | Audio: AAC, 44.1 kHz, 2 Ch
Language: English | Duration: 9h 50m | Size: 7 GB

What you'll learn
- Build a complete Retrieval-Augmented Generation pipeline in Python, from document ingestion to streaming chat output
- Run Postgres with the pgvector extension via Docker Compose, including HNSW indexing for fast approximate-nearest-neighbour vector search
- Chunk documents with paragraph-aware splitting and overlap, and explain why each chunking choice affects retrieval quality
- Implement idempotent, atomic document ingestion using SHA-256 content hashes and transactional upserts
- Use the OpenAI SDK to call local Ollama models and OpenAI's hosted API through the same code path
- Implement hybrid retrieval that combines dense vector search with Postgres full-text BM25, fused with Reciprocal Rank Fusion
- Build a query rewriter that turns follow-up questions like "what does it eat?" into standalone search queries that actually retrieve useful chunks
- Build a directory watcher with watchdog, including per-path debouncing so editor saves never trigger reads of half-written files
- Apply the Strategy/Adapter pattern to swap a Postgres backend for Weaviate via a single environment variable, with zero changes to the rest of the code
- Build a streaming chat web UI with FastAPI, Server-Sent Events, and vanilla JavaScript - no React, no build step
- Ingest images using a "describe-then-embed" vision-model pipeline, including format normalization for vision backends
- Render LLM markdown output safely in the browser with marked + DOMPurify, including inline images
- Apply standard software-engineering patterns - Dependency Injection, Factory, Strategy/Adapter, context managers, lazy imports, etc.
- Diagnose RAG failures empirically (cosine scores, full-text ranks, fused output) instead of guessing at prompts

Requirements
Basic Python skills, basic SQL, comfort with the command line and Docker.
No prior LLM or vector-database experience needed.

Description
Build a working Retrieval-Augmented Generation (RAG) application in Python - from an empty directory to a streaming web chat with multi-turn memory, hybrid retrieval, image ingestion, and two interchangeable vector-store backends. No LangChain, no LlamaIndex, no magic. You write every line yourself, and by the end you understand exactly what each one does.

Most RAG tutorials wrap everything in a single high-level library and stop at "it works." This course goes the other way. You'll build the pipeline from scratch - chunking, embeddings, idempotent ingestion, hybrid semantic-plus-lexical retrieval with Reciprocal Rank Fusion, a query rewriter for follow-up questions, server-sent token streaming, a vision-model branch for images - on top of plain Postgres (with pgvector) and a local Ollama server. No API bills while you learn. No black boxes. When you later reach for a framework like LangChain, you'll actually understand what it's doing under the hood.
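For readers who want a feel for the "paragraph-aware splitting and overlap" idea before starting, here is a minimal sketch. It is not the course's code: the function name `chunk_paragraphs`, the `max_chars` budget, and the choice to carry whole paragraphs over as overlap are all assumptions made for illustration.

```python
def chunk_paragraphs(text, max_chars=500, overlap=1):
    """Split text on blank lines and pack whole paragraphs into chunks.

    Hypothetical helper: a chunk is closed once adding the next paragraph
    would push it past max_chars, and the last `overlap` paragraphs are
    repeated at the start of the next chunk so context is not cut off.
    A single paragraph longer than max_chars is kept whole, not split.
    """
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    chunks, current = [], []
    for para in paragraphs:
        candidate = "\n\n".join(current + [para])
        if current and len(candidate) > max_chars:
            chunks.append("\n\n".join(current))
            # Carry the tail of the closed chunk forward as overlap.
            current = current[-overlap:] if overlap else []
        current.append(para)
    if current:
        chunks.append("\n\n".join(current))
    return chunks


sample = "A" * 10 + "\n\n" + "B" * 10 + "\n\n" + "C" * 10
sample_chunks = chunk_paragraphs(sample, max_chars=25, overlap=1)
```

With the sample above, the middle paragraph appears in both chunks. That repetition is the overlap: a query that matches text near a chunk boundary can still retrieve a chunk containing the surrounding context.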
What you'll build, in one project
- Runs entirely locally against Ollama, or transparently against the OpenAI API by changing one environment variable
- Stores embeddings in Postgres + pgvector with HNSW indexing, or in Weaviate - backends swappable via a single config setting
- Hybrid retrieval: dense vector search and Postgres full-text BM25, fused with Reciprocal Rank Fusion - fixing the cases where pure semantic search silently fails on rare terms, names, and identifiers
- A directory watcher that ingests new files automatically, with editor-save debouncing so it never reads a half-written file
- A streaming web chat UI built on FastAPI + Server-Sent Events + vanilla JavaScript - no React, no build step - with multi-turn memory, query rewriting for follow-ups, source citations, and inline image rendering
- Image ingestion through a vision model with a "describe-then-embed" pipeline - multimodal in the same chunks table, no schema change required

Along the way you'll work through real software-design patterns in real code: Dependency Injection, Strategy/Adapter, Factory, lifespans, context managers, thread-safety boundaries, atomic transactions, and defensive coding against external services that quietly don't work the way their docs claim. The course's recurring theme is the payoff of good abstractions: the vector-store interface designed early lets you bolt on a second backend in one file; the same retrieval pipeline serves both the CLI and the web app; the chunk-metadata field that seemed academic early in the course is what makes image support a simple change later on.

You'll finish with a codebase you can extend - add a reranker, try a different embedder, swap the chat model, point it at a corpus of your own docs - and the engineering vocabulary to talk about RAG as production software, not a notebook demo.

Who this course is for
Python developers interested in integrating LLMs into their projects and adding RAG functionality.
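As a rough illustration of the Reciprocal Rank Fusion step mentioned in the listing, the sketch below merges two ranked lists of chunk ids. It is an assumption-laden example rather than the course's implementation: the function name, the sample ids, and the constant `k=60` (a commonly used default) are all chosen here for illustration.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked lists of ids into one list, best first.

    Each id scores 1 / (k + rank) in every list it appears in
    (rank starts at 1), and the per-list scores are summed.
    Ties are broken by id so the output is deterministic.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical results from the two retrievers, best match first.
dense_hits = ["c3", "c1", "c7"]    # vector search
lexical_hits = ["c9", "c3", "c1"]  # full-text search
fused = reciprocal_rank_fusion([dense_hits, lexical_hits])
print(fused)  # ['c3', 'c1', 'c9', 'c7']
```

Only ranks enter the formula, never the raw scores, so cosine similarities and full-text ranks can be combined without putting them on a common scale. Chunks found by both retrievers rise to the top, while a chunk found only by lexical search (such as `c9`, standing in for a rare identifier) still makes the fused list.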