# Matter Pages

> The first and last pages of a document

<div>
  <img src="https://mintcdn.com/cstreams/1iORXuSnPeObgRK8/images/matter.png?fit=max&auto=format&n=1iORXuSnPeObgRK8&q=85&s=7bb817950311ea5ef8e04445ef3d95ee" alt="Diagram showing the front, body and back matter sections of a book" title="Diagram showing the front, body and back matter sections of a book" width="5627" height="1192" data-path="images/matter.png" />

  <div className="caption caption--right">
    Example matter pages from *[Machine Learning System
    Design](https://www.manning.com/books/machine-learning-system-design)*
  </div>
</div>

### What are these pages?

Front matter and back matter are the first and last pages of a document.

These sections hold the key [raw material](/key-concepts/metadata) we'll feed to the LLM
as grounded context for an accurate filename and metadata prediction.

We skip the body matter, the middle chunk of the document, as it rarely has the metadata
we need.
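As a rough illustration of this split, the sketch below picks which page indices count as front and back matter and leaves the body untouched. The function name `matter_page_indices` and its parameters are hypothetical and not part of the tool; clamping the two windows so they never overlap in short documents is also an assumption.

```python
from typing import List, Tuple


def matter_page_indices(
    total_pages: int, front_max: int, back_max: int
) -> Tuple[List[int], List[int]]:
    """Return zero-based (front, back) page indices, skipping the body in between."""
    if total_pages < 0 or front_max < 0 or back_max < 0:
        raise ValueError("page counts must be non-negative")

    # Front window: the first `front_max` pages, or fewer for short documents.
    front_end = min(front_max, total_pages)

    # Back window: the last `back_max` pages, starting no earlier than the
    # end of the front window so that no page is selected twice.
    back_start = max(total_pages - back_max, front_end)

    return list(range(front_end)), list(range(back_start, total_pages))


front, back = matter_page_indices(total_pages=300, front_max=3, back_max=2)
print(front)  # [0, 1, 2]
print(back)   # [298, 299]
```

Pages 3 through 297 in this example are the body matter and are never read.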

### Front Matter Pages

The first pages of a document typically carry publication details and introductory
content. Common pages include:

* **Cover** - Title, author\[s], publisher logo
* **Half-title** - Title and subtitle only
* **Recommended** - Other/similar books by the same author\[s]/publisher
* **Title** - Primary source for title and subtitle
* **Copyright** - Publication year, edition, DOI, LOC, ISBN\[s]
* **Letter from the author\[s]**
* **Acknowledgements**
* **Preface**
* **Table of contents** - Document structure and scope

The number of front matter pages read is configurable with the <a href="/config/processing#param-max-pages" class="param-field-link">MATTER\_CONFIG.front.max\_pages</a> setting.

### Back Matter Pages

The last pages of a document typically carry supplementary information. Common pages include:

* **Bibliography** - References and citations
* **Glossary** - Key terms and definitions
* **Appendices** - Additional material and data
* **Index/End notes** - Subject coverage and annotations
* **Author\[s] bios** - Detailed author\[s] information
* **Back cover** - Marketing copy and additional metadata

The number of back matter pages read is configurable with the <a href="/config/processing#param-max-pages-1" class="param-field-link">MATTER\_CONFIG.back.max\_pages</a> setting.

## Matter Processing Flow

```mermaid theme={null}
flowchart-elk TD
    Start([Start]) --> OpenPDF[Open PDF Document]
    OpenPDF --> ProcessNextPage[Process Next Page]

    ProcessNextPage --> CheckFrontMaxPages{"\
        <span class='mermaid__default-text'>Reached</span><br><a href='/config/processing#param-front' class='mermaid__link mermaid__code'>MATTER_CONFIG.front.max_pages</a><span class='mermaid__default-text'>?</span>
    "}

    CheckFrontMaxPages -->|Yes| FrontMatterComplete
    CheckFrontMaxPages -->|No| ProcessNextPage

    FrontMatterComplete([Front Matter Complete]) --> CheckBackMatterMode{"\
        <span class='mermaid__default-text'>Which</span><br><a href='/config/processing#param-back' class='mermaid__link mermaid__code'>MATTER_CONFIG.back.mode</a><span class='mermaid__default-text'>?</span>
    "}

    CheckBackMatterMode -->|"<span class='mermaid__code'>never</span>"| End
    CheckBackMatterMode -->|"<span class='mermaid__code'>always</span>"| ProcessNextBackPage[Process Next Back Page]

    CheckBackMatterMode -->|"<span class='mermaid__code'>fallback</span>"| CheckMissingFields{"\
        <span class='mermaid__default-text'>Missing any fields</span><br><span class='mermaid__default-text'>set to <span class='mermaid__code'>TRUE</span> in</span><br><a href='/config/processing#param-fields' class='mermaid__link mermaid__code'>MATTER_CONFIG.back.fields</a><span class='mermaid__default-text'>?</span>
    "}

    CheckMissingFields -->|No| End
    CheckMissingFields -->|Yes| ProcessNextBackPage

    ProcessNextBackPage --> CheckBackMaxPages{"\
        <span class='mermaid__default-text'>Reached</span><br><a href='/config/processing#param-max-pages-1' class='mermaid__link mermaid__code'>MATTER_CONFIG.back.max_pages</a><span class='mermaid__default-text'>?</span>
    "}
    CheckBackMaxPages -->|Yes| End
    CheckBackMaxPages -->|No| ProcessNextBackPage

    End([Text extraction complete]) --> LLMPrediction[Continue to LLM prediction]

    %% Apply styles to nodes
    class Start,End terminal
    class LLMPrediction process
```
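
The flow above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the tool's implementation: `extract_matter_text`, `detect_fields` and `find_fields` are hypothetical names, the document is modeled as a list of already extracted page strings, back pages are taken in document order, and how missing fields are detected before the LLM step is not specified on this page, so it is passed in as a callable.

```python
from typing import Callable, Dict, List, Set, Tuple

BACK_MODES = {"never", "always", "fallback"}


def extract_matter_text(
    pages: List[str],
    front_max_pages: int,
    back_max_pages: int,
    back_mode: str,
    back_fields: Dict[str, bool],
    detect_fields: Callable[[List[str]], Set[str]],
) -> Tuple[List[str], List[str]]:
    """Return (front_text, back_text) following the flowchart's decisions."""
    if back_mode not in BACK_MODES:
        raise ValueError(f"unknown back mode: {back_mode!r}")
    if front_max_pages < 0 or back_max_pages < 0:
        raise ValueError("max_pages values must be non-negative")

    # Front matter: read pages until the front limit is reached.
    front_end = min(front_max_pages, len(pages))
    front = pages[:front_end]

    # Back matter mode "never": stop after the front matter.
    if back_mode == "never":
        return front, []

    # Back matter mode "fallback": read back pages only if a field that is
    # switched on in `back_fields` was not found in the front matter.
    if back_mode == "fallback":
        found = detect_fields(front)
        missing = [
            name for name, enabled in back_fields.items()
            if enabled and name not in found
        ]
        if not missing:
            return front, []

    # Mode "always", or "fallback" with missing fields: read up to the back
    # limit, never re-reading pages already covered by the front matter.
    back_start = max(len(pages) - back_max_pages, front_end)
    return front, pages[back_start:]


# Toy document and a toy detector (assumption: a plain substring check).
pages = ["Cover", "Title page", "Copyright", "Chapter 1", "Chapter 2",
         "Index", "Back cover with ISBN"]


def find_fields(texts: List[str]) -> Set[str]:
    return {"isbn"} if any("ISBN" in text for text in texts) else set()


front, back = extract_matter_text(
    pages,
    front_max_pages=3,
    back_max_pages=2,
    back_mode="fallback",
    back_fields={"isbn": True, "year": False},
    detect_fields=find_fields,
)
print(front)  # ['Cover', 'Title page', 'Copyright']
print(back)   # ['Index', 'Back cover with ISBN']
```

In this toy run the ISBN is absent from the first three pages, so `fallback` mode reads the last two pages as well. With `back_mode="never"`, or with `isbn` set to `False`, the back list would be empty.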
