Integrates computer vision and automation libraries to enable AI-assisted control of desktop applications through visual...
Created Apr 22, 2025
omniparser-autogui-mcp
This is an MCP server that analyzes the screen with OmniParser and automatically operates the GUI. It has been confirmed to work on Windows.
License notes
This project is released under the MIT license, excluding submodules and subpackages. OmniParser's repository is CC-BY-4.0. Each OmniParser model has a different license (reference).
Installation
Please do the following:
(On platforms other than Windows, use export instead of set.)
(If you want langchain_example.py to work, run uv sync --extra langchain instead.)
Add this to your claude_desktop_config.json:
(Replace D:\\CLONED_PATH\\omniparser-autogui-mcp with the directory you cloned.)
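As a rough illustration of what such an entry might look like, the sketch below builds one in Python and prints it as JSON. The top-level "mcpServers" key with "command", "args", and "env" fields is the usual shape of claude_desktop_config.json. The entry name omniparser_autogui_mcp and the uv --directory ... run omniparser-autogui-mcp invocation are assumptions made for this example, so check them against the repository before copying anything.

```python
import json

# Placeholder path from the note above; replace it with your own clone directory.
CLONED_PATH = r"D:\CLONED_PATH\omniparser-autogui-mcp"


def build_config(cloned_path, extra_env=None):
    """Return a dict shaped like claude_desktop_config.json with one server entry."""
    env = dict(extra_env or {})  # optional variables such as TARGET_WINDOW_NAME
    return {
        "mcpServers": {
            # Assumption: the entry name is arbitrary and chosen for this sketch.
            "omniparser_autogui_mcp": {
                "command": "uv",
                # Assumption: the server is started through uv from the clone directory.
                "args": ["--directory", cloned_path, "run", "omniparser-autogui-mcp"],
                "env": env,
            }
        }
    }


if __name__ == "__main__":
    config = build_config(CLONED_PATH, {"OMNI_PARSER_BACKEND_LOAD": "1"})
    # json.dumps escapes each backslash, which is why the path appears as D:\\... in JSON.
    print(json.dumps(config, indent=2))
```

Generating the file this way avoids escaping the Windows path by hand, since the JSON encoder doubles the backslashes for you.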
env allows for the following additional configurations:
OMNI_PARSER_BACKEND_LOAD: If the server does not work with other clients (such as LibreChat), set this to 1.
TARGET_WINDOW_NAME: To operate on a specific window, set this to the window name. If not specified, the server operates on the entire screen.
OMNI_PARSER_SERVER: To run OmniParser processing on another device, set this to the server's address and port, such as 127.0.0.1:8000. The server can be started with uv run omniparserserver.
SSE_HOST, SSE_PORT: If specified, communication is done via SSE instead of stdio.
SOM_MODEL_PATH, CAPTION_MODEL_NAME, CAPTION_MODEL_PATH, OMNI_PARSER_DEVICE, BOX_TRESHOLD: These configure OmniParser itself. They are usually not necessary.
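To make the effect of these variables concrete, here is a minimal sketch of how a process might interpret them. It is not the server's actual code. The function name resolve_settings, the returned keys, and the rule that either SSE_HOST or SSE_PORT alone switches to SSE are all assumptions made for illustration.

```python
import os


def resolve_settings(env=None):
    """Interpret the variables described above (hypothetical helper, not the real server)."""
    env = os.environ if env is None else env
    settings = {
        "backend_load": env.get("OMNI_PARSER_BACKEND_LOAD") == "1",
        # None stands for "operate on the entire screen".
        "target_window": env.get("TARGET_WINDOW_NAME") or None,
        "omniparser_server": None,  # None stands for "run OmniParser locally"
        "transport": "stdio",
    }

    server = env.get("OMNI_PARSER_SERVER")
    if server:
        # Expect "address:port", such as 127.0.0.1:8000.
        host, sep, port = server.rpartition(":")
        if not sep or not host:
            raise ValueError("OMNI_PARSER_SERVER must look like address:port")
        settings["omniparser_server"] = (host, int(port))

    # Assumption: setting either variable is enough to select SSE.
    if env.get("SSE_HOST") or env.get("SSE_PORT"):
        settings["transport"] = "sse"

    return settings


if __name__ == "__main__":
    print(resolve_settings({"OMNI_PARSER_SERVER": "127.0.0.1:8000"}))
```

Passing a plain dict instead of reading os.environ directly makes it easy to try different combinations without touching your shell.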
Usage Examples
Search for "MCP server" in the on-screen browser.
etc.