PDF Toolbox
Getting Started
  • Introduction
  • Quickstart
  • Use Cases
Key Concepts
  • Overview
  • How It Works
  • Metadata
  • Matter Pages
  • Token Costs
  • Duplicate Detection
  • Annotation Handling
Configuration
  • Overview
  • Processing
  • Diagnostic
  • AI Models
Analysis & Iteration
  • Overview
  • Evaluation
  • Test Cases
  • Known Issues
Project
  • Roadmap
  • Contributing
  • Changelog
  • Key Concepts

    Overview

    Explore the key concepts of the document processing pipeline.

    How It Works

    A deeper dive into the end-to-end pipeline.

    Metadata

    The metadata fields that are extracted, modified, and written.

    Matter Pages

    The sets of pages that contain key information.

    Token Costs

    How we estimate the token cost of each run.

    Duplicate Detection

    Document identification and change tracking through CRC32 hashing.

    Annotation Handling

    Organization of document markups, highlights, and reader notes.
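
As a rough illustration of the CRC32-based identification mentioned under Duplicate Detection, the sketch below fingerprints a document's bytes with Python's standard `zlib.crc32` and compares the result against an in-memory index. The function names `crc32_hex` and `classify`, the status labels, and the name-to-checksum index are assumptions made for this example and are not taken from PDF Toolbox itself.

```python
import zlib


def crc32_hex(data: bytes) -> str:
    """Return the CRC32 checksum of `data` as an 8-character hex string."""
    return f"{zlib.crc32(data) & 0xFFFFFFFF:08x}"


def classify(name: str, data: bytes, index: dict) -> str:
    """Compare a document against a name -> checksum index (hypothetical layout).

    Returns one of:
      "unchanged"  same name, same checksum as last time
      "changed"    same name, different checksum
      "duplicate"  new name, but the checksum is already known under another name
      "new"        neither the name nor the checksum has been seen
    The index is updated in place with the latest checksum for `name`.
    """
    digest = crc32_hex(data)
    previous = index.get(name)
    if previous == digest:
        status = "unchanged"
    elif previous is not None:
        status = "changed"
    elif digest in index.values():
        status = "duplicate"
    else:
        status = "new"
    index[name] = digest
    return status


if __name__ == "__main__":
    seen = {}
    print(classify("report.pdf", b"first version", seen))
    print(classify("copy.pdf", b"first version", seen))
    print(classify("report.pdf", b"second version", seen))
```

CRC32 is fast and compact, but it is only a 32-bit checksum, so unrelated documents can occasionally collide. A pipeline that relies on it would typically treat a match as a strong hint and not as proof of identity.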
