@lydst/pi-webfetch

A pi package that fetches public web pages for AI agents.

Package details

extension

Install @lydst/pi-webfetch from npm and Pi will load the resources declared by the package manifest.

$ pi install npm:@lydst/pi-webfetch
Package: @lydst/pi-webfetch
Version: 0.1.2
Published: Apr 3, 2026
Downloads: 206/mo · 6/wk
Author: lydst
License: unknown
Types: extension
Size: 50 KB
Dependencies: 4 dependencies · 2 peers
Pi manifest JSON
{
  "extensions": [
    "./dist/extensions"
  ]
}

Security note

Pi packages can execute code and influence agent behavior. Review the source before installing third-party packages.

README

pi-webfetch

pi-webfetch is a standalone TypeScript pi package that registers exactly one tool: webfetch.

The tool fetches a single public URL and returns AI-friendly markdown by default, with optional plain text or cleaned HTML output.

Install

Local path

If you are installing from a source checkout, run npm install first so the compiled extension exists under dist/extensions.

pi install /path/to/pi-webfetch

Git

pi install https://github.com/langgengydst/pi-webfetch

npm

pi install npm:@lydst/pi-webfetch

Direct extension loading

After building the package locally:

npm run build
pi --no-extensions -e ./dist/extensions/webfetch.js

Tool

Name

webfetch

Description

Fetch a public web page and return AI-optimized content in markdown, text, or html.

Arguments

Argument              Type                            Default     Notes
url                   string                          required    Fully-qualified http or https URL
format                "markdown" | "text" | "html"    "markdown"  Markdown is the default output
timeout               number                          20000       Milliseconds, min 1000, max 120000
maxChars              number                          25000       Applied after conversion, min 1000, max 200000
mainContentOnly       boolean                         true        Prefer readable article/main content
includeMetadata       boolean                         true        Includes optional source metadata such as title and description
includeLinksSummary   boolean                         false       Adds discovered links to details.links
includeImagesSummary  boolean                         false       Adds discovered images to details.images
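
The documented defaults and ranges can be read as a small normalization step. The following is a minimal sketch, not the package's own code: the names WebfetchArgs, clampOrDefault, and normalizeArgs are hypothetical, and clamping out-of-range numbers (instead of rejecting them) is an assumption.

```typescript
// Hypothetical types mirroring the documented arguments; not exported by the package.
type WebfetchFormat = "markdown" | "text" | "html";

interface WebfetchArgs {
  url: string;
  format?: WebfetchFormat;
  timeout?: number;
  maxChars?: number;
  mainContentOnly?: boolean;
  includeMetadata?: boolean;
  includeLinksSummary?: boolean;
  includeImagesSummary?: boolean;
}

// Clamp a number into [min, max]; use the fallback when it is missing or not finite.
function clampOrDefault(
  value: number | undefined,
  fallback: number,
  min: number,
  max: number,
): number {
  if (value === undefined || !Number.isFinite(value)) return fallback;
  return Math.min(max, Math.max(min, value));
}

// Fill in the documented defaults.
// Assumption: out-of-range values are clamped; the real tool may reject them instead.
function normalizeArgs(args: WebfetchArgs): Required<WebfetchArgs> {
  return {
    url: args.url,
    format: args.format ?? "markdown",
    timeout: clampOrDefault(args.timeout, 20000, 1000, 120000),
    maxChars: clampOrDefault(args.maxChars, 25000, 1000, 200000),
    mainContentOnly: args.mainContentOnly ?? true,
    includeMetadata: args.includeMetadata ?? true,
    includeLinksSummary: args.includeLinksSummary ?? false,
    includeImagesSummary: args.includeImagesSummary ?? false,
  };
}
```

With this sketch, a call that passes only url resolves to markdown output, a 20000 ms timeout, and a 25000-character limit.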

Behavior

  • Accepts only http and https URLs.
  • Uses fetch with AbortController timeout enforcement.
  • Follows redirects and records the final URL in metadata.
  • Accepts text/html, text/plain, and text/markdown.
  • Converts HTML to cleaned markdown by default.
  • Preserves code blocks during markdown conversion.
  • Applies maxChars to the final returned text, including the source header.
  • Truncates after conversion with a visible suffix when maxChars is exceeded.
  • Returns stable error codes such as INVALID_URL, UNSUPPORTED_PROTOCOL, TIMEOUT, HTTP_ERROR, and UNSUPPORTED_CONTENT_TYPE.
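
The behavior listed above could be sketched roughly as follows. This is an illustration under stated assumptions, not the package's implementation: WebfetchSketchError, parsePublicUrl, fetchTextSketch, and truncateWithSuffix are hypothetical names, the "[truncated]" suffix text is invented for the example, and counting the suffix toward maxChars is a guess. Only the error codes, accepted content types, and the use of fetch with AbortController come from the README.

```typescript
// Hypothetical error type carrying one of the documented error codes.
class WebfetchSketchError extends Error {
  readonly code: string;
  constructor(code: string, message: string) {
    super(message);
    this.code = code;
  }
}

// Accept only http and https URLs.
function parsePublicUrl(raw: string): URL {
  let url: URL;
  try {
    url = new URL(raw);
  } catch {
    throw new WebfetchSketchError("INVALID_URL", `Not a valid URL: ${raw}`);
  }
  if (url.protocol !== "http:" && url.protocol !== "https:") {
    throw new WebfetchSketchError("UNSUPPORTED_PROTOCOL", `Unsupported protocol: ${url.protocol}`);
  }
  return url;
}

const ACCEPTED_TYPES = ["text/html", "text/plain", "text/markdown"];

// Fetch with an AbortController-based timeout, following redirects.
async function fetchTextSketch(
  raw: string,
  timeoutMs: number,
): Promise<{ finalUrl: string; contentType: string; body: string }> {
  const url = parsePublicUrl(raw);
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    const response = await fetch(url.href, { signal: controller.signal, redirect: "follow" });
    if (!response.ok) {
      throw new WebfetchSketchError("HTTP_ERROR", `HTTP ${response.status}`);
    }
    // Strip parameters such as "; charset=utf-8" before comparing.
    const [mime = ""] = (response.headers.get("content-type") ?? "").split(";");
    const contentType = mime.trim().toLowerCase();
    if (!ACCEPTED_TYPES.includes(contentType)) {
      throw new WebfetchSketchError("UNSUPPORTED_CONTENT_TYPE", `Unsupported type: ${contentType}`);
    }
    // response.url is the URL after redirects.
    return { finalUrl: response.url, contentType, body: await response.text() };
  } catch (error) {
    if (controller.signal.aborted) {
      throw new WebfetchSketchError("TIMEOUT", `Timed out after ${timeoutMs} ms`);
    }
    throw error;
  } finally {
    clearTimeout(timer);
  }
}

// Truncate the final text to maxChars, ending with a visible suffix.
// Assumption: the suffix counts toward maxChars; the suffix text is made up.
function truncateWithSuffix(
  text: string,
  maxChars: number,
  suffix = "\n\n[truncated]",
): { text: string; truncated: boolean } {
  if (text.length <= maxChars) return { text, truncated: false };
  const keep = Math.max(0, maxChars - suffix.length);
  return { text: text.slice(0, keep) + suffix, truncated: true };
}
```

Clearing the timer in finally keeps a completed request from leaving a pending abort behind, and checking controller.signal.aborted in the catch block is what distinguishes a timeout from other network failures in this sketch.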

Result shape

Successful calls return content like:

{
  "content": [{ "type": "text", "text": "Source: https://example.com/..." }],
  "details": {
    "url": "https://example.com/start",
    "finalUrl": "https://example.com/final",
    "format": "markdown",
    "status": 200,
    "ok": true,
    "contentType": "text/html",
    "fetchedAt": "2026-04-03T10:00:00.000Z",
    "truncated": false,
    "originalLength": 1234,
    "returnedLength": 1234
  }
}

Failures still return a tool result with readable text plus details.errorCode and details.isError.

When the fetched source is already text/plain or text/markdown, details.format reports the representation that was actually returned, which may differ from the requested format.
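
A caller that consumes these results might model them as below. This is a sketch inferred from the example above: the WebfetchDetails and WebfetchResult types and the summarizeResult helper are hypothetical, and which fields are optional on failure is an assumption.

```typescript
// Hypothetical types inferred from the example result; not exported by the package.
interface WebfetchDetails {
  url: string;
  finalUrl?: string;
  format?: "markdown" | "text" | "html";
  status?: number;
  ok?: boolean;
  contentType?: string;
  fetchedAt?: string;
  truncated?: boolean;
  originalLength?: number;
  returnedLength?: number;
  // Present on failures, per the README.
  errorCode?: string;
  isError?: boolean;
}

interface WebfetchResult {
  content: { type: "text"; text: string }[];
  details: WebfetchDetails;
}

// Produce a one-line summary, branching on details.isError.
function summarizeResult(result: WebfetchResult): string {
  const { details } = result;
  if (details.isError) {
    const text = result.content.map((part) => part.text).join("\n");
    return `webfetch failed (${details.errorCode ?? "unknown"}): ${text}`;
  }
  const note = details.truncated ? ` (truncated from ${details.originalLength} chars)` : "";
  return `fetched ${details.finalUrl ?? details.url} as ${details.format}${note}`;
}
```

Because failures still arrive as ordinary tool results, checking details.isError is the only branch needed here; no exception handling is involved on the caller's side.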

Usage examples

  • Fetch https://example.com and summarize it.
  • Use webfetch to get https://example.com/blog/post as markdown.
  • Fetch https://example.com/changelog as text with a 10000ms timeout.
  • Use webfetch on https://example.com/docs with format html and maxChars 12000.
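
For orientation, the last three prompts above would plausibly translate into tool arguments like these. The mapping is an assumption about how an agent fills in the documented parameters; the ExampleCallArgs type and exampleCalls constant are hypothetical.

```typescript
// Hypothetical subset of the documented arguments, enough for these examples.
interface ExampleCallArgs {
  url: string;
  format?: "markdown" | "text" | "html";
  timeout?: number;
  maxChars?: number;
}

// One entry per prompt, in the same order as the list above (starting at the second prompt).
const exampleCalls: ExampleCallArgs[] = [
  { url: "https://example.com/blog/post", format: "markdown" },
  { url: "https://example.com/changelog", format: "text", timeout: 10000 },
  { url: "https://example.com/docs", format: "html", maxChars: 12000 },
];
```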

Development

npm test
npm run build
npm pack --dry-run

The package builds ESM output into dist/ and exposes the compiled extension directory through the pi field of its package manifest:

{
  "pi": {
    "extensions": ["./dist/extensions"]
  }
}

Limitations

  • No browser automation or JavaScript execution; pages are fetched as served, not rendered.
  • No login-protected or session-authenticated pages.
  • No multi-page crawling.
  • No PDF or image OCR.
  • No robots.txt enforcement is guaranteed or claimed in v1.