pi-arxivist
Fetch arxiv papers as Markdown (pi extension)
Package details
Install pi-arxivist from npm and Pi will load the resources declared by the package manifest.
$ pi install npm:pi-arxivist
- Package: pi-arxivist
- Version: 0.1.7
- Published: Jun 18, 2026
- Downloads: 687/mo · 687/wk
- Author: lhufo
- License: MIT
- Types: extension
- Size: 62.7 KB
- Dependencies: 2 dependencies · 3 peers
Pi manifest JSON
{
"extensions": [
"./dist/index.js"
]
}
Security note
Pi packages can execute code and influence agent behavior. Review the source before installing third-party packages.
README
pi-arxivist
Fetch arxiv papers as clean Markdown, right inside pi. Zero config, zero system dependencies.
Arxiv provides LaTeX source tarballs for most papers. fetch_arxiv downloads the source, flattens \input/\include references, and converts the result to Markdown via pandoc. No PDF extraction, no garbled math, no lost structure.
Install
pi install npm:pi-arxivist
Usage
fetch_arxiv 1203.6859
fetch_arxiv https://arxiv.org/abs/1203.6859
fetch_arxiv https://arxiv.org/pdf/1203.6859
Accepts bare IDs, abstract URLs, or PDF URLs.
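The three accepted input forms all reduce to the same identifier. The sketch below shows one way that normalization could work; it is not taken from the package source. The function name `normalizeArxivId` and both regular expressions are assumptions made for illustration, and the patterns only cover the common new-style (`1203.6859`, optionally with a `v2` suffix) and old-style (`hep-th/9901001`) identifier shapes.

```typescript
// Hypothetical helper, not part of pi-arxivist's documented API.
// New-style IDs: four digits, a dot, four or five digits, optional version.
const NEW_STYLE_ID = /^\d{4}\.\d{4,5}(?:v\d+)?$/;
// Old-style IDs: archive name, optional subject class, slash, seven digits.
const OLD_STYLE_ID = /^[a-z-]+(?:\.[A-Z]{2})?\/\d{7}(?:v\d+)?$/;
// Abstract or PDF URL on arxiv.org; captures everything after /abs/ or /pdf/,
// dropping an optional ".pdf" suffix, trailing slash, and query or fragment.
const ARXIV_URL =
  /^https?:\/\/(?:www\.)?arxiv\.org\/(?:abs|pdf)\/(.+?)(?:\.pdf)?\/?(?:[?#].*)?$/i;

function normalizeArxivId(input: string): string {
  let candidate = input.trim();

  // If the input is a URL, keep only the identifier part of its path.
  const urlMatch = candidate.match(ARXIV_URL);
  if (urlMatch !== null) {
    candidate = urlMatch[1];
  }

  // Reject anything that does not look like an arXiv identifier.
  if (!NEW_STYLE_ID.test(candidate) && !OLD_STYLE_ID.test(candidate)) {
    throw new Error("Unrecognized arXiv identifier: " + input);
  }
  return candidate;
}
```

Under these assumptions, the three calls in the usage examples above would all resolve to the identifier `1203.6859`, which can then be appended to the e-print download URL.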
What it returns
- paper.md: full paper in the cache directory, math preserved as $...$ / $$...$$
- meta.json: full frontmatter as JSON (title, abstract, authors, etc.)
- preamble.tex: macro definitions that pandoc couldn't process, extracted for inspection
The tool truncates output to fit context limits. Use read on the output path for the full paper.
How it works
- Downloads the source tarball from arxiv.org/e-print/<id>
- Extracts with tar
- Builds a dependency graph from \input/\include references across all .tex files, and selects the root by indegree
- Resolves the graph into a single flat document (circular-reference-safe, \includeonly-aware)
- Converts the full source to Markdown via the official pandoc WASM binary
- Extracts metadata from the pandoc-generated YAML frontmatter
- Extracts unprocessed preamble macros to preamble.tex
No system pandoc or LaTeX distribution needed.
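The graph-building and flattening steps can be illustrated with a small in-memory sketch. This is a simplified illustration under stated assumptions, not the package's actual implementation. The names `TexFiles`, `includePattern`, `resolveName`, `findRoot`, and `flatten` are hypothetical. The tie-break that prefers a file containing `\documentclass` is also an assumption. The regex scan does not skip commented-out lines, and `\includeonly` handling is omitted for brevity.

```typescript
// File name -> LaTeX source; stands in for the extracted tarball contents.
type TexFiles = Record<string, string>;

// Matches \input{name} and \include{name}, but not \includeonly or
// \includegraphics. A fresh regex per call avoids shared global-flag state.
function includePattern(): RegExp {
  return /\\(?:input|include)\{([^}]+)\}/g;
}

// LaTeX allows the ".tex" extension to be omitted in references.
function resolveName(ref: string, files: TexFiles): string | undefined {
  const has = (n: string) => Object.prototype.hasOwnProperty.call(files, n);
  const name = ref.trim();
  if (has(name)) return name;
  return has(name + ".tex") ? name + ".tex" : undefined;
}

// Count incoming references per file; a root is a file nothing else includes.
function findRoot(files: TexFiles): string {
  const names = Object.keys(files);
  const indegree: Record<string, number> = {};
  for (const name of names) indegree[name] = 0;
  for (const name of names) {
    const re = includePattern();
    let m: RegExpExecArray | null;
    while ((m = re.exec(files[name])) !== null) {
      const target = resolveName(m[1], files);
      if (target !== undefined) indegree[target] += 1;
    }
  }
  const candidates = names.filter((n) => indegree[n] === 0);
  if (candidates.length === 0) throw new Error("No file with indegree 0");
  // Assumed tie-break: prefer a candidate that declares \documentclass.
  const withClass = candidates.filter(
    (n) => files[n].indexOf("\\documentclass") !== -1
  );
  return withClass.length > 0 ? withClass[0] : candidates[0];
}

// Inline every resolvable reference. A file already on the current include
// stack expands to nothing, so circular references terminate.
function flatten(name: string, files: TexFiles, stack: string[] = []): string {
  if (stack.indexOf(name) !== -1) return "";
  return files[name].replace(includePattern(), (whole: string, ref: string) => {
    const target = resolveName(ref, files);
    // Unresolvable references are left in place untouched.
    return target === undefined ? whole : flatten(target, files, stack.concat(name));
  });
}

// Example input: main.tex pulls in intro.tex and sections/method.tex.
const sample: TexFiles = {
  "main.tex": "\\documentclass{article}\n\\input{intro}\n\\include{sections/method}\n",
  "intro.tex": "Introduction text.",
  "sections/method.tex": "Method text.",
};
const flatSample = flatten(findRoot(sample), sample);
```

In this sketch `flatSample` holds the single flat document that would be handed to the converter. Passing a per-branch include stack rather than a global visited set lets the same file be included twice from different places while still breaking true cycles.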