jinaai.com
jinaai.com logo

JinaAI

Extracts and processes web content for efficient parsing and analysis of online information

Created byApr 22, 2025

mcp-jinaai-reader


Notice

This repository is no longer maintained.
The functionality of this tool is now available in mcp-omnisearch, which combines multiple MCP tools in one unified package.
Please use mcp-omnisearch instead.

A Model Context Protocol (MCP) server for integrating Jina.ai's Reader API with LLMs. This server provides efficient and comprehensive web content extraction capabilities, optimized for documentation and web content analysis.

Features

  • Advanced web content extraction through Jina.ai Reader API
  • Fast and efficient content retrieval
  • Complete text extraction with preserved structure
  • Clean format optimized for LLMs
  • Support for various content types including documentation
  • Built on the Model Context Protocol

Configuration

This server requires configuration through your MCP client. Here are examples for different environments:

Cline Configuration

Add this to your Cline MCP settings:

Claude Desktop with WSL Configuration

For WSL environments, add this to your Claude Desktop configuration:

Environment Variables

The server requires the following environment variable:
  • JINAAI_API_KEY: Your Jina.ai API key (required)

API

The server implements a single MCP tool with configurable parameters:

read_url

Convert any URL to LLM-friendly text using Jina.ai Reader.
Parameters:
  • url (string, required): URL to process
  • no_cache (boolean, optional): Bypass cache for fresh results. Defaults to false
  • format (string, optional): Response format ("json" or "stream"). Defaults to "json"
  • timeout (number, optional): Maximum time in seconds to wait for webpage load
  • target_selector (string, optional): CSS selector to focus on specific elements
  • wait_for_selector (string, optional): CSS selector to wait for specific elements
  • remove_selector (string, optional): CSS selector to exclude specific elements
  • with_links_summary (boolean, optional): Gather all links at the end of response
  • with_images_summary (boolean, optional): Gather all images at the end of response
  • with_generated_alt (boolean, optional): Add alt text to images lacking captions
  • with_iframe (boolean, optional): Include iframe content in response

Development

Setup

  1. Clone the repository
  1. Install dependencies:
  1. Build the project:
  1. Run in development mode:

Publishing

  1. Update version in package.json
  1. Build the project:
  1. Publish to npm:

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

License

MIT License - see the LICENSE file for details.

Acknowledgments

mcp-jinaai-reader


Notice

This repository is no longer maintained.
The functionality of this tool is now available in mcp-omnisearch, which combines multiple MCP tools in one unified package.
Please use mcp-omnisearch instead.

A Model Context Protocol (MCP) server for integrating Jina.ai's Reader API with LLMs. This server provides efficient and comprehensive web content extraction capabilities, optimized for documentation and web content analysis.

Features

  • Advanced web content extraction through Jina.ai Reader API
  • Fast and efficient content retrieval
  • Complete text extraction with preserved structure
  • Clean format optimized for LLMs
  • Support for various content types including documentation
  • Built on the Model Context Protocol

Configuration

This server requires configuration through your MCP client. Here are examples for different environments:

Cline Configuration

Add this to your Cline MCP settings:

Claude Desktop with WSL Configuration

For WSL environments, add this to your Claude Desktop configuration:

Environment Variables

The server requires the following environment variable:
  • JINAAI_API_KEY: Your Jina.ai API key (required)

API

The server implements a single MCP tool with configurable parameters:

read_url

Convert any URL to LLM-friendly text using Jina.ai Reader.
Parameters:
  • url (string, required): URL to process
  • no_cache (boolean, optional): Bypass cache for fresh results. Defaults to false
  • format (string, optional): Response format ("json" or "stream"). Defaults to "json"
  • timeout (number, optional): Maximum time in seconds to wait for webpage load
  • target_selector (string, optional): CSS selector to focus on specific elements
  • wait_for_selector (string, optional): CSS selector to wait for specific elements
  • remove_selector (string, optional): CSS selector to exclude specific elements
  • with_links_summary (boolean, optional): Gather all links at the end of response
  • with_images_summary (boolean, optional): Gather all images at the end of response
  • with_generated_alt (boolean, optional): Add alt text to images lacking captions
  • with_iframe (boolean, optional): Include iframe content in response

Development

Setup

  1. Clone the repository
  1. Install dependencies:
  1. Build the project:
  1. Run in development mode:

Publishing

  1. Update version in package.json
  1. Build the project:
  1. Publish to npm:

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

License

MIT License - see the LICENSE file for details.

Acknowledgments