A Model Context Protocol server for browser automation using Python scripts. For use with Cline.
<a href="https://glama.ai/mcp/servers/0aqrsbhx3z"><img width="380" height="200" src="https://glama.ai/mcp/servers/0aqrsbhx3z/badge" alt="Browser Use Server MCP server" /></a>
Features
Browser Operations
screenshot: Capture a screenshot of a webpage (full page or viewport)
get_html: Retrieve the HTML content of a webpage
execute_js: Execute JavaScript on a webpage
get_console_logs: Get console logs from a webpage
All operations support custom interaction steps (e.g., clicking elements, scrolling) after page load.
Prerequisites
(Optional but recommended) Install Xvfb for headless browser automation:
Xvfb (X virtual framebuffer) creates a virtual display, so the browser can run without a physical screen, which can help automation avoid being flagged as a bot.
Install Miniconda or Anaconda
Create a Conda environment:
Set up LLM configuration:
The server supports multiple LLM providers. You can use any of the following API keys:
The server automatically uses the first available API key it finds. You can optionally customize the model and base URL for any provider with the MODEL and BASE_URL environment variables.
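The "first available key" rule can be pictured with a minimal Python sketch. The environment variable names and their order below are assumptions for illustration, not the server's actual list; only the first-match behavior comes from this README.

```python
import os
from typing import Mapping, Optional, Tuple

# Hypothetical variable names and order; check the server's own list of supported keys.
CANDIDATE_KEYS = ("OPENAI_API_KEY", "GROQ_API_KEY", "GEMINI_API_KEY")


def first_available_key(env: Mapping[str, str] = os.environ) -> Optional[Tuple[str, str]]:
    """Return (variable name, value) for the first non-empty key, or None."""
    for name in CANDIDATE_KEYS:
        value = env.get(name, "").strip()
        if value:
            return name, value
    return None


if __name__ == "__main__":
    found = first_available_key()
    print(found[0] if found else "no API key configured")
```

Passing the environment as a mapping keeps the lookup easy to test without touching the real process environment.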
Installation
Installing via Smithery
To install Browser Use Server for Claude Desktop automatically via Smithery:
Clone this repository
Install dependencies:
Build the server:
MCP Configuration
Add the following configuration to your Cline MCP settings:
Replace:
YOUR_HOME with your actual home directory name
your_api_key with your actual API keys
Usage
Run the server:
The server communicates over stdio and supports the following operations:
Screenshot
Parameters:
url: The webpage URL (required)
full_page: Whether to capture the full page or just the viewport (optional, default: false)
steps: Comma-separated actions or sentences describing steps to take after page load (optional)
Get HTML
Parameters:
url: The webpage URL (required)
steps: Comma-separated actions or sentences describing steps to take after page load (optional)
Execute JavaScript
Parameters:
url: The webpage URL (required)
script: JavaScript code to execute (required)
steps: Comma-separated actions or sentences describing steps to take after page load (optional)
Get Console Logs
Parameters:
url: The webpage URL (required)
steps: Comma-separated actions or sentences describing steps to take after page load (optional)
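As a rough illustration of the parameters above, the following Python sketch builds an MCP `tools/call` request for each operation and checks the required arguments. The tool names come from the Features list; the `build_call` helper, the request id, and the example URL are hypothetical, and the actual server may validate arguments differently.

```python
import json
from typing import Any, Dict

# Required and optional arguments per tool, as listed in the parameter sections above.
TOOL_PARAMS = {
    "screenshot": {"required": {"url"}, "optional": {"full_page", "steps"}},
    "get_html": {"required": {"url"}, "optional": {"steps"}},
    "execute_js": {"required": {"url", "script"}, "optional": {"steps"}},
    "get_console_logs": {"required": {"url"}, "optional": {"steps"}},
}


def build_call(tool: str, request_id: int = 1, **arguments: Any) -> Dict[str, Any]:
    """Build a JSON-RPC 2.0 `tools/call` request (hypothetical helper)."""
    spec = TOOL_PARAMS[tool]
    given = set(arguments)
    missing = spec["required"] - given
    unknown = given - spec["required"] - spec["optional"]
    if missing or unknown:
        raise ValueError(f"{tool}: missing={sorted(missing)} unknown={sorted(unknown)}")
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }


if __name__ == "__main__":
    request = build_call("screenshot", url="https://example.com", full_page=True)
    print(json.dumps(request, indent=2))
```

In normal use Cline constructs these requests itself; the sketch only shows which arguments each operation expects.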
Example Cline Usage
Here are some example tasks you can accomplish using the browser-use server with Cline:
Modifying Web Page Elements during Development
To change the color of a heading on a page that requires authentication:
This task demonstrates:
Multi-step browser automation using comma-separated steps
Authentication handling
Cookie acceptance
DOM manipulation
CSS styling changes
The server will execute these steps sequentially, handling any required interactions along the way.
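One way to picture the comma-separated `steps` value is as an ordered list of instructions. The sketch below composes and splits such a string. The step wording is hypothetical, and the naive comma split is an assumption for illustration; the server may interpret the steps differently.

```python
from typing import List


def join_steps(steps: List[str]) -> str:
    """Join individual steps into the comma-separated form the `steps` parameter takes."""
    for step in steps:
        if "," in step:
            raise ValueError(f"step contains a comma and would be split: {step!r}")
    return ", ".join(step.strip() for step in steps)


def split_steps(steps: str) -> List[str]:
    """Naive split on commas, preserving order and dropping empty entries."""
    return [part.strip() for part in steps.split(",") if part.strip()]


if __name__ == "__main__":
    # Hypothetical step wording, loosely following the task described above.
    text = join_steps(["accept cookies", "log in with the test account", "open the home page"])
    for number, step in enumerate(split_steps(text), start=1):
        print(number, step)
```

Because a comma separates steps, it is safest to keep commas out of the individual step descriptions.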
Configuration
LLM Configuration
The server supports the following LLM providers, each with a default model:
GLHF: Uses the deepseek-ai/DeepSeek-V3 model
Ollama: Uses the qwen2.5:32b-instruct-q4_K_M model with a 32k context window
Groq: Uses the deepseek-r1-distill-llama-70b model
OpenAI: Uses the gpt-4o-mini model
OpenRouter: Uses the deepseek/deepseek-chat model
GitHub: Uses the gpt-4o-mini model
DeepSeek: Uses the deepseek-chat model
Gemini: Uses the gemini-2.0-flash-exp model
You can override these defaults using environment variables:
MODEL: Set a custom model name for any provider
BASE_URL: Set a custom API endpoint URL (if the provider supports it)
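The override rule can be sketched in a few lines of Python. The default models are copied from the list above; the `resolve_llm_settings` helper and the use of `None` for "no custom endpoint" are assumptions for illustration, not the server's actual code.

```python
import os
from typing import Dict, Mapping, Optional

# Default models as listed above, keyed by provider name.
DEFAULT_MODELS = {
    "GLHF": "deepseek-ai/DeepSeek-V3",
    "Ollama": "qwen2.5:32b-instruct-q4_K_M",
    "Groq": "deepseek-r1-distill-llama-70b",
    "OpenAI": "gpt-4o-mini",
    "OpenRouter": "deepseek/deepseek-chat",
    "GitHub": "gpt-4o-mini",
    "DeepSeek": "deepseek-chat",
    "Gemini": "gemini-2.0-flash-exp",
}


def resolve_llm_settings(
    provider: str, env: Mapping[str, str] = os.environ
) -> Dict[str, Optional[str]]:
    """Apply MODEL and BASE_URL overrides on top of the provider default (hypothetical helper)."""
    return {
        "model": env.get("MODEL") or DEFAULT_MODELS[provider],
        # None stands for "use the provider's usual endpoint".
        "base_url": env.get("BASE_URL") or None,
    }


if __name__ == "__main__":
    print(resolve_llm_settings("Groq"))
```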
Vision Support
The server supports vision capabilities through the USE_VISION environment variable:
Set USE_VISION=true to enable vision capabilities for browser operations
Default is false to optimize performance when vision is not needed
Useful for tasks that require visual understanding of webpage content
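A minimal sketch of reading this flag in Python follows. Only `USE_VISION=true` and the `false` default are documented here; the case-insensitive comparison and the `use_vision` helper name are assumptions.

```python
import os
from typing import Mapping


def use_vision(env: Mapping[str, str] = os.environ) -> bool:
    """Return True only when USE_VISION is 'true' (case-insensitive); defaults to False."""
    return env.get("USE_VISION", "false").strip().lower() == "true"


if __name__ == "__main__":
    print("vision enabled" if use_vision() else "vision disabled")
```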
Xvfb Support
The server automatically detects if Xvfb is installed and:
Uses xvfb-run when available, which can help browser automation avoid bot detection
Falls back to direct execution when Xvfb is not installed
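The detect-and-fall-back behavior might look roughly like the Python sketch below. It uses `shutil.which` to look for `xvfb-run` on the PATH; the `wrap_with_xvfb` helper and the wrapped command are hypothetical, and the server's own detection logic may differ.

```python
import shutil
from typing import Callable, List, Optional


def wrap_with_xvfb(
    command: List[str],
    which: Callable[[str], Optional[str]] = shutil.which,
) -> List[str]:
    """Prefix the command with xvfb-run when it is on PATH; otherwise return it unchanged."""
    if which("xvfb-run"):
        # --auto-servernum lets xvfb-run pick a free X display number.
        return ["xvfb-run", "--auto-servernum", *command]
    return list(command)


if __name__ == "__main__":
    # Hypothetical script name, used only to show the resulting command line.
    print(wrap_with_xvfb(["python", "task.py"]))
```

Injecting the `which` lookup makes both branches testable on machines with or without Xvfb installed.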