pdf-toolbox

A powerful tool for standardizing the filenames and metadata of large PDF collections.

Before

├── Androids Dream of Electric Sheep__English-242L.pdf
├── Quantum Computing Introduction MITPRESS_2011.pdf
├── Complexity ihn Physics- .pdf
├── GOODFELLOW_AVIAN (books about birds).pdf
├── j.physrep.2024.01.012.pdf
└── 10.1007-978-3-031-04083-2.pdf

After

├── Do Androids Dream of Electric Sheep, (Philip K. Dick), Doubleday, (1968).pdf
├── A Gentle Introduction to Quantum Computing, (Eleanor Rieffel), MIT Press, (2011).pdf
├── More Than the Sum of the Parts, Complexity in Physics and Beyond, (Helmut Satz), Oxford University Press, (2022).pdf
├── Avian Architecture, (Peter Goodfellow), Princeton University Press, 2nd Ed, (2024).pdf
├── Quantum Phase Transitions in Driven Systems, (Smith et al.), Physical Review, (2024).pdf
└── Emergence in Complex Networks, (Lee Johnson), arXiv, (2024).pdf
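
The "After" names appear to follow the pattern `Title, (Author), Publisher, [Edition,] (Year).pdf`. The sketch below shows one way such a name could be assembled from metadata. It is an illustration inferred from the examples above, not the project's actual code: `build_filename`, its parameters, the colon-to-comma rule for subtitles, and the "first author's surname plus et al." rule are all assumptions.

```python
import re

# Characters that are not allowed in filenames on common platforms.
ILLEGAL = re.compile(r'[\\/:*?"<>|]')


def _clean(text):
    """Make a metadata field safe to embed in a filename."""
    text = text.replace(": ", ", ")   # assumed rule: "Title: Subtitle" -> "Title, Subtitle"
    text = ILLEGAL.sub("", text)      # drop any remaining illegal characters
    return re.sub(r"\s+", " ", text).strip()


def build_filename(title, authors, publisher, year, edition=None):
    """Hypothetical helper: compose 'Title, (Author), Publisher, [Edition,] (Year).pdf'."""
    if not authors:
        raise ValueError("at least one author is required")
    if len(authors) == 1:
        author = authors[0]
    else:
        # Assumed rule: surname of the first author followed by "et al."
        author = f"{authors[0].split()[-1]} et al."
    parts = [_clean(title), f"({_clean(author)})", _clean(publisher)]
    if edition:
        parts.append(_clean(edition))
    parts.append(f"({year})")
    return ", ".join(parts) + ".pdf"


print(build_filename("Avian Architecture", ["Peter Goodfellow"],
                     "Princeton University Press", 2024, edition="2nd Ed"))
```

Stripping illegal characters before joining keeps the result valid on Windows as well as on Unix-like systems, which matters for collections that are synced across machines.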

This project is a work in progress, so take a few precautions:

  • Back up your PDFs
  • Run the scripts iteratively on a small subset of your collection before scaling up
  • Monitor your OpenAI API costs: the costs displayed are only estimates
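
The first two precautions can be scripted. The following is a minimal sketch using only the Python standard library. It is independent of pdf-toolbox itself, and the function name `backup_and_sample` and all directory arguments are hypothetical. It copies every PDF to a backup folder, then copies a small random subset to a separate working folder for trial runs.

```python
import random
import shutil
from pathlib import Path


def backup_and_sample(source_dir, backup_dir, sample_dir, sample_size=10, seed=0):
    """Back up all PDFs in source_dir, then copy a random subset to sample_dir.

    Returns the list of sampled file paths inside sample_dir.
    """
    source = Path(source_dir)
    backup = Path(backup_dir)
    sample = Path(sample_dir)
    backup.mkdir(parents=True, exist_ok=True)
    sample.mkdir(parents=True, exist_ok=True)

    # Only top-level PDF files are considered; subfolders are ignored.
    pdfs = sorted(p for p in source.iterdir()
                  if p.is_file() and p.suffix.lower() == ".pdf")

    # copy2 preserves timestamps along with the file contents.
    for pdf in pdfs:
        shutil.copy2(pdf, backup / pdf.name)

    # A fixed seed makes the chosen subset reproducible between runs.
    rng = random.Random(seed)
    chosen = rng.sample(pdfs, min(sample_size, len(pdfs)))
    return [Path(shutil.copy2(pdf, sample / pdf.name)) for pdf in chosen]
```

Pointing the scripts at the sample folder first leaves both the originals and the backup untouched while the results are checked.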

Quickstart

Get up and running quickly