pi-arxivist

Fetch arxiv papers as Markdown (pi extension)

Install pi-arxivist from npm and Pi will load the resources declared by the package manifest.

$ pi install npm:pi-arxivist

Package: pi-arxivist
Version: 0.1.7
Published: Jun 18, 2026
Downloads: 687/mo · 687/wk
Author: lhufo
License: MIT
Types: extension
Size: 62.7 KB
Dependencies: 2 dependencies · 3 peers

Pi manifest JSON

{
  "extensions": [
    "./dist/index.js"
  ]
}

Security note

Pi packages can execute code and influence agent behavior. Review the source before installing third-party packages.

README

pi-arxivist

Fetch arXiv papers as clean Markdown, right inside pi. Zero config, zero system dependencies.

arXiv provides LaTeX source tarballs for most papers. fetch_arxiv downloads the source, flattens \input/\include references, and converts the result to Markdown via pandoc. No PDF extraction, no garbled math, no lost structure.

Install

pi install npm:pi-arxivist

Usage

fetch_arxiv 1203.6859
fetch_arxiv https://arxiv.org/abs/1203.6859
fetch_arxiv https://arxiv.org/pdf/1203.6859

Accepts bare IDs, abstract URLs, or PDF URLs.
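
As an illustration of how the three input forms can be reduced to one identifier, here is a minimal sketch. The function name `normalizeArxivId` and the exact patterns are assumptions made for this example; they are not taken from the pi-arxivist source, which may handle more cases or handle them differently.

```typescript
// Hypothetical helper, not part of pi-arxivist's documented API.
// Reduces a bare ID, an abstract URL, or a PDF URL to a bare arXiv ID.
function normalizeArxivId(input: string): string | null {
  const trimmed = input.trim();

  // Strip a leading arxiv.org/abs/ or arxiv.org/pdf/ prefix, if present.
  const withoutPrefix = trimmed.replace(
    /^(?:https?:\/\/)?(?:www\.)?arxiv\.org\/(?:abs|pdf)\//i,
    ""
  );

  // Drop any query string or fragment, then a trailing ".pdf".
  const candidate = withoutPrefix.replace(/[?#].*$/, "").replace(/\.pdf$/i, "");

  // New-style IDs such as 1203.6859 (optionally with a version like v2)
  // and old-style IDs such as hep-th/9901001.
  const newStyle = /^\d{4}\.\d{4,5}(?:v\d+)?$/;
  const oldStyle = /^[a-z-]+(?:\.[A-Z]{2})?\/\d{7}(?:v\d+)?$/i;

  return newStyle.test(candidate) || oldStyle.test(candidate) ? candidate : null;
}
```

Returning `null` for unrecognized input lets the caller report a clear error before any network request is made.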

What it returns

paper.md — full paper in the cache directory, math preserved as $...$ / $$...$$
meta.json — full frontmatter as JSON (title, abstract, authors, etc.)
preamble.tex — macro definitions that pandoc couldn't process, extracted for inspection

The tool truncates output to fit context limits. Use read on the output path for the full paper.
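
The truncation behavior can be pictured with a short sketch. The helper name `truncateForContext`, the default limit of 20000 characters, and the wording of the notice are all assumptions for illustration; the real limits and format used by the tool are not stated here.

```typescript
// Hypothetical helper and limit: pi-arxivist's actual truncation rules may differ.
function truncateForContext(
  markdown: string,
  outputPath: string,
  maxChars: number = 20000
): string {
  if (markdown.length <= maxChars) return markdown;

  // Prefer cutting at the last paragraph break before the limit;
  // fall back to a hard cut if there is none.
  const cut = markdown.lastIndexOf("\n\n", maxChars);
  const end = cut > 0 ? cut : maxChars;
  const omitted = markdown.length - end;

  // Point the reader at the full file on disk.
  return `${markdown.slice(0, end)}\n\n[truncated: ${omitted} more characters; full text at ${outputPath}]`;
}
```

Appending the output path to the truncated text is what makes a follow-up read of the full paper possible.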

How it works

1. Downloads the source tarball from arxiv.org/e-print/<id>
2. Extracts with tar
3. Builds a dependency graph from \input/\include references across all .tex files, and selects the root by indegree
4. Resolves the graph into a single flat document (circular-reference-safe, \includeonly-aware)
5. Converts the full source to Markdown via the official pandoc WASM binary
6. Extracts metadata from the pandoc-generated YAML frontmatter
7. Extracts unprocessed preamble macros to preamble.tex
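
The dependency-graph and flattening steps above can be sketched on an in-memory set of files. This is a simplified illustration, not the package's implementation: the names `TexFiles`, `resolveName`, `findRoot`, and `flatten` are hypothetical, the tie-break on `\documentclass` is an assumption, and the sketch ignores `\includeonly` and commented-out lines, which the real tool is described as handling.

```typescript
// Sketch only: every name and resolution rule here is an assumption,
// not pi-arxivist's actual code.
type TexFiles = Map<string, string>; // file path -> file contents

// A fresh regex per call, so the global flag's lastIndex state is never shared.
const includePattern = (): RegExp => /\\(?:input|include)\{([^}]+)\}/g;

// LaTeX allows the .tex extension to be omitted in \input and \include.
function resolveName(name: string, files: TexFiles): string | null {
  const trimmed = name.trim();
  if (files.has(trimmed)) return trimmed;
  if (files.has(`${trimmed}.tex`)) return `${trimmed}.tex`;
  return null;
}

// Count how many times each file is included, then pick a file with indegree 0.
function findRoot(files: TexFiles): string | null {
  const indegree = new Map<string, number>();
  for (const path of files.keys()) indegree.set(path, 0);

  for (const [path, text] of files) {
    const pattern = includePattern();
    let match: RegExpExecArray | null;
    while ((match = pattern.exec(text)) !== null) {
      const target = resolveName(match[1], files);
      if (target !== null && target !== path) {
        indegree.set(target, (indegree.get(target) ?? 0) + 1);
      }
    }
  }

  const roots = Array.from(indegree.keys()).filter((p) => indegree.get(p) === 0);
  // Assumed tie-break: prefer a candidate that declares a \documentclass.
  const withClass = roots.find((p) => (files.get(p) ?? "").includes("\\documentclass"));
  return withClass ?? (roots.length > 0 ? roots[0] : null);
}

// Inline includes recursively. `stack` holds the files currently being
// expanded, so a circular reference is skipped instead of recursing forever.
function flatten(path: string, files: TexFiles, stack: Set<string> = new Set()): string {
  if (stack.has(path)) return `% skipped circular include: ${path}\n`;
  stack.add(path);
  const text = files.get(path) ?? "";
  const result = text.replace(includePattern(), (whole: string, name: string) => {
    const target = resolveName(name, files);
    // Leave unresolved references untouched rather than dropping them.
    return target === null ? whole : flatten(target, files, stack);
  });
  stack.delete(path);
  return result;
}
```

Tracking only the files on the current expansion path, rather than every file ever visited, means a file that is legitimately included twice is still expanded both times, while a true cycle is cut at the point where it closes.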

No system pandoc or LaTeX distribution needed.