pi-arxivist
Fetch arxiv papers as Markdown (pi extension)
Package details
Install pi-arxivist from npm and Pi will load the resources declared by the package manifest.
$ pi install npm:pi-arxivist
- Package: pi-arxivist
- Version: 0.1.3
- Published: Jun 16, 2026
- Downloads: not available
- Author: lhufo
- License: MIT
- Types: extension
- Size: 50.4 KB
- Dependencies: 1 dependency · 2 peers
Pi manifest JSON
{
  "extensions": [
    "./dist/index.js"
  ]
}
Security note
Pi packages can execute code and influence agent behavior. Review the source before installing third-party packages.
README
pi-arxivist
Fetch arxiv papers as clean Markdown, right inside pi. Zero config, zero system dependencies.
Arxiv provides LaTeX source tarballs for most papers. fetch_arxiv downloads the source, flattens \input/\include references, and converts the result to Markdown via pandoc. No PDF extraction, no garbled math, no lost structure.
Install
pi install npm:pi-arxivist
Usage
fetch_arxiv 1203.6859
fetch_arxiv https://arxiv.org/abs/1203.6859
fetch_arxiv https://arxiv.org/pdf/1203.6859
Accepts bare IDs, abstract URLs, or PDF URLs.
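To make the three accepted input forms concrete, here is a minimal sketch of how such inputs could be reduced to a bare ID. The function name `normalizeArxivId` and the exact patterns are assumptions for illustration, not the extension's actual code. Whether `fetch_arxiv` keeps or strips a version suffix such as `v2` is not stated in this README, so the sketch simply keeps it.

```typescript
// Hypothetical helper, not part of pi-arxivist's published API.
// Reduces a bare ID, an abstract URL, or a PDF URL to the bare arXiv ID.
function normalizeArxivId(input: string): string {
  // Drop the URL prefix (abs or pdf) and an optional trailing ".pdf".
  const candidate = input
    .trim()
    .replace(/^https?:\/\/(?:www\.)?arxiv\.org\/(?:abs|pdf)\//, "")
    .replace(/\.pdf$/, "");

  // New-style IDs: 1203.6859 or 2101.00001, optionally followed by v<N>.
  // Old-style IDs: hep-th/9901001 or math.GT/0309136, optionally followed by v<N>.
  const idPattern = /^(?:\d{4}\.\d{4,5}|[a-z-]+(?:\.[A-Z]{2})?\/\d{7})(?:v\d+)?$/;

  if (!idPattern.test(candidate)) {
    throw new Error(`Not a recognizable arXiv reference: ${input}`);
  }
  return candidate;
}
```

With this sketch, all three usage lines above resolve to the same ID, `1203.6859`, and anything that does not look like an arXiv reference is rejected early instead of producing a failed download.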
What it returns
- Title, authors, abstract: extracted from the document metadata (pandoc handles nested braces, \thanks footnotes)
- Body as Markdown: math preserved as $...$ / $$...$$, unknown LaTeX commands passed through as raw TeX
- Output path: full paper at output/paper.md inside the cache directory
- Preamble path: macro definitions extracted to preamble.tex so you can inspect them on demand
The tool truncates output to fit context limits. Use read on the output path for the rest.
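As an illustration of what truncation with a pointer to the full file might look like, here is a small sketch. The function name `truncateForContext`, the character-based limit, and the wording of the notice are all assumptions. The README does not document the real limit or the message the tool emits.

```typescript
// Hypothetical sketch: the actual limit and notice used by pi-arxivist are not documented here.
function truncateForContext(markdown: string, maxChars: number, outputPath: string): string {
  if (markdown.length <= maxChars) {
    return markdown;
  }
  // Prefer cutting at the last paragraph break at or before the limit,
  // so a sentence or equation is less likely to be split in the middle.
  const paragraphBreak = markdown.lastIndexOf("\n\n", maxChars);
  const end = paragraphBreak > 0 ? paragraphBreak : maxChars;
  return markdown.slice(0, end) + `\n\n[Truncated. Full paper at ${outputPath}]`;
}
```

Cutting at a paragraph boundary is a design choice of this sketch. It keeps the returned excerpt readable, and the trailing notice tells the reader which path to pass to read.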
How it works
- Downloads the source tarball from arxiv.org/e-print/<id>
- Extracts with tar
- Finds the main .tex file (heuristic: first file with \documentclass)
- Recursively resolves \input/\include commands into a single flat document
- Splits preamble from body, writes preamble to preamble.tex
- Extracts metadata (title, authors, abstract) via pandoc's JSON AST
- Converts body to Markdown via the official pandoc WASM binary
No system pandoc or LaTeX distribution needed.
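The middle steps of the pipeline (finding the main file, flattening includes, splitting off the preamble) can be sketched as follows. This is a simplified illustration under stated assumptions, not the extension's source: the extracted tarball is modeled as an in-memory object mapping paths to contents, include paths are taken relative to the archive root, and commented-out `\input` lines are not handled. The names `SourceTree`, `findMainFile`, `flatten`, and `splitPreamble` are hypothetical.

```typescript
// Hypothetical in-memory model of an extracted tarball: file path -> file contents.
type SourceTree = { [path: string]: string };

// Heuristic from the list above: the first .tex file that contains \documentclass.
function findMainFile(tree: SourceTree): string | undefined {
  const paths = Object.keys(tree);
  for (let i = 0; i < paths.length; i++) {
    if (/\.tex$/.test(paths[i]) && tree[paths[i]].indexOf("\\documentclass") !== -1) {
      return paths[i];
    }
  }
  return undefined;
}

// Recursively inline \input{...} and \include{...}. The "{" directly after the
// command name means \includegraphics{...} is left untouched.
function flatten(tree: SourceTree, path: string, seen: string[] = []): string {
  if (seen.indexOf(path) !== -1) {
    throw new Error(`Circular include: ${path}`);
  }
  if (!Object.prototype.hasOwnProperty.call(tree, path)) {
    return `% missing file: ${path}\n`;
  }
  const active = seen.concat([path]);
  return tree[path].replace(/\\(?:input|include)\{([^}]+)\}/g, (_match: string, name: string) => {
    const target = name.trim();
    // LaTeX allows the .tex extension to be omitted.
    const resolved = Object.prototype.hasOwnProperty.call(tree, target) ? target : `${target}.tex`;
    return flatten(tree, resolved, active);
  });
}

// Split the flat document at \begin{document}; everything before it is the preamble.
function splitPreamble(flat: string): { preamble: string; body: string } {
  const beginMarker = "\\begin{document}";
  const begin = flat.indexOf(beginMarker);
  if (begin === -1) {
    return { preamble: "", body: flat };
  }
  const bodyStart = begin + beginMarker.length;
  const end = flat.lastIndexOf("\\end{document}");
  return {
    preamble: flat.slice(0, begin),
    body: flat.slice(bodyStart, end >= bodyStart ? end : flat.length),
  };
}
```

Using a function as the replacement argument matters here: included text is inserted literally, so dollar signs in LaTeX math are not misread as replacement patterns. Calling `splitPreamble(flatten(tree, mainPath))` then yields the text that would go to preamble.tex and the body that would be handed to pandoc.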