14. Map-Reduce Agents

Mini-Project: Long Document Summarizer

Splits a long document into overlapping chunks, summarizes each chunk in parallel (map), then synthesizes all chunk summaries into a coherent final summary (reduce).
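The splitting step above can be sketched as a simple character-based splitter. This is a minimal illustration, not the project's actual implementation; the function name, `chunk_size`, and `overlap` parameters are assumptions, and a production version would typically split on sentence or paragraph boundaries instead of raw character offsets.

```python
def split_into_chunks(text: str, chunk_size: int = 1000, overlap: int = 200) -> list[str]:
    """Split text into chunks of roughly chunk_size characters,
    with `overlap` characters shared between neighboring chunks
    so context is not lost at chunk boundaries.

    Illustrative sketch only: real splitters usually respect
    sentence or paragraph boundaries.
    """
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks = []
    step = chunk_size - overlap  # how far the window advances each iteration
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last chunk already covers the end of the text
    return chunks
```

The overlap means the tail of each chunk reappears at the head of the next, which helps each mapper see enough surrounding context to summarize its chunk coherently.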

Description

Map-Reduce Agents apply the classic MapReduce paradigm to LLM processing. A large input (long document, multiple files, dataset) is split into chunks, each chunk is processed independently by a "mapper" LLM call (the Map phase), and the results are combined by a "reducer" LLM call (the Reduce phase). This is ideal for processing inputs that exceed context windows or benefit from parallel analysis.
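The two phases described above can be sketched in a few lines. This is a hedged illustration, assuming only a generic `llm` callable (prompt in, text out) standing in for whatever model client the project uses; the function name, prompts, and `max_workers` default are all assumptions, not the project's API.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable

def map_reduce_summarize(
    chunks: list[str],
    llm: Callable[[str], str],  # hypothetical LLM call: prompt in, completion out
    max_workers: int = 4,
) -> str:
    """Sketch of the pattern: map each chunk to a partial summary
    concurrently, then reduce the partial summaries into one answer."""
    # Map phase: each chunk is summarized independently, in parallel.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        summaries = list(pool.map(
            lambda chunk: llm(f"Summarize this passage:\n\n{chunk}"),
            chunks,
        ))
    # Reduce phase: a single call synthesizes all partial summaries.
    joined = "\n\n".join(f"- {s}" for s in summaries)
    return llm(f"Combine these partial summaries into one coherent summary:\n\n{joined}")
```

Because the mapper calls are independent, a thread pool (or async client) turns the map phase's latency from the sum of chunk calls into roughly the latency of the slowest one.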

When to Use

  • Summarizing very long documents that exceed context limits
  • Analyzing multiple documents or data sources in parallel
  • Extracting structured data from large unstructured datasets
  • Any task where "divide and conquer" improves quality or overcomes limits

Benefits

| Benefit     | Description                                              |
|-------------|----------------------------------------------------------|
| Scalability | Handle arbitrarily large inputs by chunking              |
| Parallelism | Map phase runs concurrently for speed                    |
| Context Fit | Each chunk fits within the LLM context window            |
| Quality     | Focused analysis per chunk, then intelligent aggregation |
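The scalability benefit relies on the reduce step itself staying within context limits. One common way to guarantee that, sketched here as an assumption rather than the project's actual design, is a hierarchical reduce: merge summaries in small batches, repeatedly, until one remains. The `llm` callable and `batch_size` parameter are illustrative.

```python
from typing import Callable

def hierarchical_reduce(
    summaries: list[str],
    llm: Callable[[str], str],  # hypothetical LLM call: prompt in, completion out
    batch_size: int = 5,
) -> str:
    """Reduce in rounds of small batches so no single reduce prompt
    has to hold every chunk summary at once."""
    while len(summaries) > 1:
        summaries = [
            llm("Merge these summaries into one:\n\n"
                + "\n\n".join(summaries[i:i + batch_size]))
            for i in range(0, len(summaries), batch_size)
        ]
    return summaries[0]
```

With this shape, input size can grow without bound: each round shrinks the summary count by roughly a factor of `batch_size`, so the number of rounds grows only logarithmically.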

Architecture Diagram

```mermaid
flowchart TD
    A[Large Input] --> B[Splitter]
    B --> C[Chunk 1]
    B --> D[Chunk 2]
    B --> E[Chunk N]
    C --> F[Map: Process Chunk 1]
    D --> G[Map: Process Chunk 2]
    E --> H[Map: Process Chunk N]
    F --> I[Reduce: Aggregate Results]
    G --> I
    H --> I
    I --> J[Final Output]

    style A fill:#4CAF50,color:#fff
    style B fill:#FF9800,color:#fff
    style F fill:#2196F3,color:#fff
    style G fill:#2196F3,color:#fff
    style H fill:#2196F3,color:#fff
    style I fill:#E91E63,color:#fff
    style J fill:#4CAF50,color:#fff
```