Doc Scraper (Jina.ai)

Converts web documentation to clean markdown using Jina.ai's API, enabling easy transformation of online docs for conten...

Created: Apr 23, 2025

Doc Scraper MCP Server

[![smithery badge](https://smithery.ai/badge/@askjohngeorge/mcp-doc-scraper)](https://smithery.ai/server/@askjohngeorge/mcp-doc-scraper)
A Model Context Protocol (MCP) server that provides documentation scraping functionality. This server converts web-based documentation into markdown format using jina.ai's conversion service.

Features

  • Scrapes documentation from any web URL
  • Converts HTML documentation to markdown format
  • Saves the converted documentation to a specified output path
  • Integrates with the Model Context Protocol (MCP)
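
The flow these features describe (fetch a page, convert it to markdown, write it to disk) can be sketched as follows. This is a minimal illustration, not the server's actual code. It assumes the conversion goes through Jina's Reader endpoint, which returns markdown for a target URL prefixed with `https://r.jina.ai/`. The function names `build_reader_url`, `save_markdown`, and `scrape_to_file` are hypothetical, and the sketch uses the standard library's `urllib` rather than the project's `aiohttp` dependency so that it stays self-contained.

```python
from pathlib import Path
from urllib.request import Request, urlopen

# Assumption: conversion goes through Jina's Reader endpoint, which
# returns markdown when the target URL is appended to this prefix.
JINA_READER_PREFIX = "https://r.jina.ai/"


def build_reader_url(url: str) -> str:
    """Return the Reader URL for a documentation page (hypothetical helper)."""
    if not url.startswith(("http://", "https://")):
        raise ValueError(f"expected an http(s) URL, got {url!r}")
    return JINA_READER_PREFIX + url


def save_markdown(markdown: str, output_path: str) -> Path:
    """Write markdown to output_path, creating parent directories as needed."""
    path = Path(output_path)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(markdown, encoding="utf-8")
    return path


def scrape_to_file(url: str, output_path: str, timeout: float = 30.0) -> Path:
    """Fetch the converted markdown for url and save it (makes a network call)."""
    request = Request(
        build_reader_url(url),
        headers={"User-Agent": "doc-scraper-sketch"},  # illustrative value
    )
    with urlopen(request, timeout=timeout) as response:
        markdown = response.read().decode("utf-8")
    return save_markdown(markdown, output_path)
```

Keeping URL construction and file writing separate from the network call makes both easy to check without contacting the conversion service.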

Installation

Installing via Smithery

To install Doc Scraper for Claude Desktop automatically via [Smithery](https://smithery.ai/server/@askjohngeorge/mcp-doc-scraper):

To install manually instead:
  1. Clone the repository:
  2. Create and activate a virtual environment:
  3. Install the dependencies:

Usage

The server can be run using Python:

Tool Description

The server provides a single tool:
  • **Name**: `scrape_docs`
  • **Description**: Scrape documentation from a URL and save as markdown
  • **Input Parameters**:
      - `url`: The URL of the documentation to scrape
      - `output_path`: The path where the markdown file should be saved
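
As an illustration of how a client might invoke this tool, the sketch below builds the JSON-RPC `tools/call` request that MCP uses for tool invocation, passing the two parameters as arguments. The helper name `make_scrape_docs_request` and the example URL and output path are hypothetical. In practice an MCP client such as Claude Desktop constructs this message for you.

```python
import json


def make_scrape_docs_request(url: str, output_path: str, request_id: int = 1) -> dict:
    """Build a JSON-RPC 2.0 `tools/call` message for the scrape_docs tool."""
    if not url or not output_path:
        raise ValueError("both url and output_path are required")
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "scrape_docs",
            "arguments": {"url": url, "output_path": output_path},
        },
    }


request = make_scrape_docs_request(
    url="https://example.com/docs/getting-started",  # placeholder URL
    output_path="docs/getting-started.md",  # placeholder path
)
print(json.dumps(request, indent=2))
```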

Project Structure

Dependencies

  • aiohttp
  • mcp
  • pydantic

Development

To set up the development environment:
  1. Install development dependencies:
  2. The server uses the Model Context Protocol, so familiarize yourself with the [MCP documentation](https://modelcontextprotocol.io/).

License

MIT License