pi-webfetch-to-markdown

Fetch web content as clean Markdown for AI consumption. Supports Cloudflare's Markdown for Agents content negotiation with Turndown fallback.

Package details

Install pi-webfetch-to-markdown from npm and Pi will load the resources declared by the package manifest.

$ pi install npm:pi-webfetch-to-markdown
Package: pi-webfetch-to-markdown
Version: 1.0.1
Published: Mar 4, 2026
Downloads: 58/mo · 13/wk
Author: richardanaya
License: MIT
Types: extension
Size: 6.7 KB
Dependencies: 1 dependency · 2 peers
Pi manifest JSON
{
  "extensions": [
    "./index.ts"
  ]
}

Security note

Pi packages can execute code and influence agent behavior. Review the source before installing third-party packages.

README

pi-webfetch-to-markdown

A pi extension that fetches web content and returns it as Markdown, optimized for AI consumption.

The Problem

Modern websites are built for humans, not AI agents. Raw HTML contains navigation bars, scripts, styling, and non-content markup that add noise and increase token usage when processed by AI systems. As Cloudflare notes in its "Markdown for Agents" article, feeding raw HTML to AI is inefficient:

"Feeding raw HTML to an AI is like paying by the word to read packaging instead of the letter inside."

The Solution

This extension implements a two-tier approach:

1. Content Negotiation (Cloudflare's Markdown for Agents)

Following Cloudflare's "Markdown for Agents" convention, the extension first asks the server for Markdown directly by listing text/markdown in the HTTP Accept header. When a site supports this (for example, a Cloudflare zone with Markdown for Agents enabled), the server returns clean, structured Markdown instead of HTML.

Benefits:

  • ~80% reduction in token usage compared to HTML
  • No computation overhead for conversion
  • Preserves the content creator's intent
  • Returns metadata via frontmatter (title, description, etc.)
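
The negotiation step can be sketched as follows. This is a minimal illustration, not the extension's actual source: the names ACCEPT_HEADER, isMarkdownResponse, and fetchPreferringMarkdown are hypothetical, and the sketch assumes a runtime with a global fetch (Node 18+ or a browser).

```typescript
// Sketch only: helper names are hypothetical, not taken from the extension's source.

// Accept header value listed in this README: prefer Markdown, fall back to HTML.
const ACCEPT_HEADER = "text/markdown, text/html";

// True when a Content-Type header names a Markdown body,
// ignoring parameters such as "; charset=utf-8".
function isMarkdownResponse(contentType: string | null): boolean {
  if (contentType === null) return false;
  const [mediaType = ""] = contentType.split(";");
  return mediaType.trim().toLowerCase() === "text/markdown";
}

interface NegotiatedResponse {
  body: string;
  isMarkdown: boolean;
}

// Requests a URL while advertising Markdown support, and reports
// whether the server actually answered with Markdown.
async function fetchPreferringMarkdown(url: string): Promise<NegotiatedResponse> {
  const response = await fetch(url, { headers: { Accept: ACCEPT_HEADER } });
  if (!response.ok) {
    throw new Error(`Request failed with status ${response.status}`);
  }
  const body = await response.text();
  const isMarkdown = isMarkdownResponse(response.headers.get("content-type"));
  return { body, isMarkdown };
}
```

Checking the response's Content-Type, rather than assuming the server honored the request, is what lets a caller decide whether a conversion step is still needed.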

2. Fallback: Turndown HTML-to-Markdown Conversion

When a server doesn't support content negotiation, the extension falls back to Turndown, a robust HTML-to-Markdown converter. In this fallback, the conversion strips:

  • Scripts and styles
  • Navigation elements
  • Non-semantic markup
  • Inline styles and classes

Result: Clean, readable Markdown that AI can process efficiently.
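
A fallback conversion along these lines might look like the sketch below. It uses Turndown's public API (the TurndownService constructor, remove, and turndown), but this README does not say how the extension configures Turndown, so the options and the list of removed tags are assumptions, and htmlToMarkdown is a hypothetical name.

```typescript
import TurndownService from "turndown";

// Sketch only: the options and removed tags are assumptions,
// not the extension's documented configuration.
function htmlToMarkdown(html: string): string {
  const service = new TurndownService({
    headingStyle: "atx", // "# Heading" rather than underlined headings
    codeBlockStyle: "fenced",
  });
  // Drop elements whose content is noise for an AI reader.
  service.remove(["script", "style", "nav"]);
  return service.turndown(html);
}

const sampleHtml =
  '<nav><a href="/">Home</a></nav>' +
  "<h1>Title</h1><script>track()</script>" +
  "<p>Hello <strong>world</strong></p>";

console.log(htmlToMarkdown(sampleHtml));
```

Turndown removes no elements by default, so the explicit remove call is what keeps script, style, and navigation content out of the output in this sketch.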

Installation

Install via pi from npm:

pi install npm:pi-webfetch-to-markdown

Or install directly from GitHub:

pi install git:github.com/richardanaya/pi-webfetch-to-markdown

Pin to a specific version with @version (e.g., pi install npm:pi-webfetch-to-markdown@1.0.0).

Test without installing using pi -e git:github.com/richardanaya/pi-webfetch-to-markdown.

Usage

Once installed in your pi environment, the extension provides the webfetch_to_markdown tool:

// The tool accepts a URL and returns markdown
{
  url: "https://example.com/article"
}

How It Works

  1. Request: Sends an HTTP request with the header Accept: text/markdown, text/html
  2. Negotiation: If the server responds with text/markdown, the body is returned as-is
  3. Conversion: If the server responds with HTML, Turndown converts it to Markdown
  4. Output: Returns clean Markdown along with source metadata
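
Steps 2 to 4 above amount to a single branch on the response's content type. The sketch below shows that decision in isolation, assuming a hypothetical result shape (MarkdownResult) and function name (toMarkdownResult). The HTML converter is passed in as a parameter, so any converter, such as one backed by Turndown, can be supplied.

```typescript
// Sketch only: the result shape and all names here are hypothetical.
type HtmlConverter = (html: string) => string;

interface MarkdownResult {
  markdown: string;
  sourceUrl: string;
  via: "content-negotiation" | "html-conversion";
}

function toMarkdownResult(
  url: string,
  contentType: string | null,
  body: string,
  convertHtml: HtmlConverter,
): MarkdownResult {
  // Compare only the media type, ignoring parameters like "; charset=utf-8".
  const [mediaType = ""] = (contentType ?? "").split(";");
  if (mediaType.trim().toLowerCase() === "text/markdown") {
    // Step 2: the server negotiated Markdown, so pass the body through untouched.
    return { markdown: body, sourceUrl: url, via: "content-negotiation" };
  }
  // Step 3: anything else is treated as HTML and converted.
  return { markdown: convertHtml(body), sourceUrl: url, via: "html-conversion" };
}
```

Recording which path produced the Markdown is one possible form of source metadata. The README does not specify what the extension actually includes.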

Why Markdown for AI?

As outlined in the Cloudflare article:

| Format   | Token count    | Efficiency     |
|----------|----------------|----------------|
| HTML     | ~16,180 tokens | Baseline       |
| Markdown | ~3,150 tokens  | ~80% reduction |

Markdown's explicit structure (headings, lists, links) makes it ideal for AI processing, resulting in better comprehension and reduced context window pressure.
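
The ~80% figure follows directly from the two token counts quoted above. The short calculation below reproduces it; tokenReductionPercent is a hypothetical helper used only for illustration.

```typescript
// Percentage of tokens saved when moving from the HTML count to the Markdown count.
function tokenReductionPercent(htmlTokens: number, markdownTokens: number): number {
  return ((htmlTokens - markdownTokens) / htmlTokens) * 100;
}

// Token counts quoted from the Cloudflare article.
const reduction = tokenReductionPercent(16180, 3150);
console.log(`${reduction.toFixed(1)}% fewer tokens`); // prints "80.5% fewer tokens"
```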

License

MIT