Just Prompt - A lightweight MCP server for LLM providers

just-prompt is a Model Context Protocol (MCP) server that provides a unified interface to various Large Language Model (LLM) providers including OpenAI, Anthropic, Google Gemini, Groq, DeepSeek, and Ollama. See how we use the `ceo_and_board` tool to make hard decisions easy with o3 here.

Tools

The following MCP tools are available in the server:

`prompt`: Send a prompt to multiple LLM models

`prompt_from_file`: Send a prompt from a file to multiple LLM models

`prompt_from_file_to_file`: Send a prompt from a file to multiple LLM models and save responses as markdown files

`ceo_and_board`: Send a prompt to multiple 'board member' models and have a 'CEO' model make a decision based on their responses

`list_providers`: List all available LLM providers

`list_models`: List all available models for a specific LLM provider

Provider Prefixes

Every model must be prefixed with the provider name. Use the short name for faster referencing.

o or openai: OpenAI

a or anthropic: Anthropic

g or gemini: Google Gemini

q or groq: Groq

d or deepseek: DeepSeek

l or ollama: Ollama
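The prefix table above can be read as a simple lookup. The following is a minimal sketch, not just-prompt's actual code: `PROVIDER_ALIASES` and `split_provider` are hypothetical names introduced here for illustration. It splits on the first colon only, so any suffix after the model name is left untouched.

```python
# Short and long provider prefixes, taken from the list above.
PROVIDER_ALIASES = {
    "o": "openai", "openai": "openai",
    "a": "anthropic", "anthropic": "anthropic",
    "g": "gemini", "gemini": "gemini",
    "q": "groq", "groq": "groq",
    "d": "deepseek", "deepseek": "deepseek",
    "l": "ollama", "ollama": "ollama",
}


def split_provider(model_string: str) -> tuple[str, str]:
    """Split 'prefix:model' into (full provider name, model name).

    Hypothetical helper for illustration; not just-prompt's own function.
    """
    # Split on the first colon only, so suffixes such as ':4k' stay attached.
    prefix, sep, model = model_string.partition(":")
    if not sep or not model:
        raise ValueError(f"missing provider prefix in {model_string!r}")
    provider = PROVIDER_ALIASES.get(prefix.lower())
    if provider is None:
        raise ValueError(f"unknown provider prefix {prefix!r}")
    return provider, model


print(split_provider("o:gpt-4o"))
print(split_provider("anthropic:claude-3-7-sonnet-20250219:4k"))
```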

Features

Unified API for multiple LLM providers

Support for text prompts from strings or files

Run multiple models in parallel

Automatic model name correction using the first model in the --default-models list

Ability to save responses to files

Easy listing of available providers and models
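To illustrate what running multiple models in parallel can look like, here is a small sketch built on `asyncio.gather`. It is an assumption about the general pattern, not a description of how just-prompt is implemented; `fake_call` and `prompt_all` are hypothetical stand-ins that make no network requests.

```python
import asyncio


async def fake_call(model: str, prompt: str) -> str:
    """Stand-in for a real provider call (hypothetical; no network access)."""
    await asyncio.sleep(0.01)  # simulate request latency
    return f"[{model}] echo: {prompt}"


async def prompt_all(prompt: str, models: list[str]) -> dict[str, str]:
    """Send one prompt to every model concurrently and collect the replies."""
    # gather() returns results in the same order as the awaitables passed in.
    replies = await asyncio.gather(*(fake_call(m, prompt) for m in models))
    return dict(zip(models, replies))


results = asyncio.run(
    prompt_all("ping", ["o:gpt-4o", "a:claude-3-7-sonnet-20250219"])
)
for model, reply in results.items():
    print(model, "->", reply)
```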

Installation

Environment Variables

Create a .env file with your API keys (you can copy the .env.sample file):

Then edit the .env file to add your API keys (or export them in your shell):

Claude Code Installation

In all these examples, replace the directory with the path to the just-prompt directory.

Default models are set to openai:o3:high, openai:o4-mini:high, anthropic:claude-3-7-sonnet-20250219:4k, gemini:gemini-2.5-pro-preview-03-25, and gemini:gemini-2.5-flash-preview-04-17.

If you use Claude Code right out of the repository you can see in the .mcp.json file we set the default models to...

The --default-models parameter sets the models to use when none are explicitly provided to the API endpoints. The first model in the list is also used for model name correction when needed. This can be a list of models separated by commas.
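As a rough sketch of how a comma-separated `--default-models` value might be interpreted, the hypothetical helper below (`parse_default_models` is not taken from the just-prompt source) splits the list and picks out the first entry, which is the one used for model name correction.

```python
def parse_default_models(value: str) -> tuple[list[str], str]:
    """Split a comma-separated --default-models value.

    Returns (all models, first model). Hypothetical helper for illustration.
    """
    # Tolerate stray whitespace and empty entries such as a trailing comma.
    models = [m.strip() for m in value.split(",") if m.strip()]
    if not models:
        raise ValueError("--default-models must name at least one model")
    return models, models[0]


models, correction_model = parse_default_models(
    "openai:o3:high,openai:o4-mini:high,anthropic:claude-3-7-sonnet-20250219:4k"
)
print(models)
print(correction_model)
```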

When the server starts, it automatically checks which API keys are available in your environment and reports which providers you can use. If a key is missing, that provider is listed as unavailable, but the server still starts and can be used with the providers that are available.
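The startup check described above could look roughly like the sketch below. The environment variable names are assumptions made for illustration (consult `.env.sample` for the real ones), and `check_providers` is a hypothetical helper rather than just-prompt's own function.

```python
import os
from typing import Mapping

# ASSUMPTION: illustrative variable names; the real ones are in .env.sample.
PROVIDER_ENV_VARS = {
    "openai": "OPENAI_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
    "gemini": "GEMINI_API_KEY",
    "groq": "GROQ_API_KEY",
    "deepseek": "DEEPSEEK_API_KEY",
    "ollama": "OLLAMA_HOST",
}


def check_providers(env: Mapping[str, str] = os.environ) -> dict[str, bool]:
    """Report, per provider, whether its variable is set and non-empty."""
    return {name: bool(env.get(var)) for name, var in PROVIDER_ENV_VARS.items()}


# A missing key only marks that provider unavailable; nothing raises.
status = check_providers({"OPENAI_API_KEY": "sk-example"})
for provider, available in status.items():
    print(f"{provider}: {'available' if available else 'unavailable'}")
```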

Using `mcp add-json`

Copy this command and paste it into Claude Code, BUT don't run it until you have copied the JSON.

JSON to copy

With a custom default model set to openai:gpt-4o.

With multiple default models:

Using `mcp add` with project scope

`mcp remove`

claude mcp remove just-prompt

Running Tests

Codebase Structure


Context Priming

READ README.md, pyproject.toml, then run git ls-files, and 'eza --git-ignore --tree' to understand the context of the project.

Reasoning Effort with OpenAI o Series

For OpenAI o series reasoning models (o4-mini, o3-mini, o3) you can control how much internal reasoning the model performs before producing a visible answer.

Append one of the following suffixes to the model name (after the provider prefix):

:low - minimal internal reasoning (faster, cheaper)

:medium - balanced (default if omitted)

:high - thorough reasoning (slower, more tokens)

Examples:

openai:o4-mini:low

o:o4-mini:high

When a reasoning suffix is present, just-prompt automatically switches to the OpenAI Responses API (when available) and sets the corresponding reasoning.effort parameter. If the installed OpenAI SDK is older, it gracefully falls back to the Chat Completions endpoint and embeds an internal system instruction to approximate the requested effort level.
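The suffix handling can be sketched as a small parsing step. This is an illustrative assumption about the logic, not the project's code: `split_reasoning_effort` is a hypothetical helper that takes the model name after the provider prefix has been removed and falls back to `medium` when no suffix is given, matching the default stated above.

```python
REASONING_LEVELS = {"low", "medium", "high"}


def split_reasoning_effort(model: str) -> tuple[str, str]:
    """Return (base model, effort) for a model name without its provider prefix.

    Hypothetical helper; effort defaults to 'medium' when no suffix is present.
    """
    base, sep, suffix = model.rpartition(":")
    if sep and suffix.lower() in REASONING_LEVELS:
        return base, suffix.lower()
    # No recognized suffix: keep the name as-is and use the default effort.
    return model, "medium"


for name in ("o4-mini:low", "o4-mini:high", "o3"):
    print(name, "->", split_reasoning_effort(name))
```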

Thinking Tokens with Claude

The Anthropic Claude model claude-3-7-sonnet-20250219 supports extended thinking through thinking tokens. This lets Claude reason more thoroughly before answering.

You can enable thinking tokens by adding a suffix to the model name in this format:

anthropic:claude-3-7-sonnet-20250219:1k - Use 1024 thinking tokens

anthropic:claude-3-7-sonnet-20250219:4k - Use 4096 thinking tokens

anthropic:claude-3-7-sonnet-20250219:8000 - Use 8000 thinking tokens

Notes:

Thinking tokens are only supported for the claude-3-7-sonnet-20250219 model

Valid thinking token budgets range from 1024 to 16000

Values outside this range will be automatically adjusted to be within range

You can specify the budget with k notation (1k, 4k, etc.) or with exact numbers (1024, 4096, etc.)
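The notes above amount to a parse-then-clamp rule. A minimal sketch follows, assuming that `k` means a multiple of 1024 (consistent with 1k = 1024 and 4k = 4096 in the examples); `parse_thinking_budget` is a hypothetical name, not the function used by just-prompt.

```python
MIN_BUDGET, MAX_BUDGET = 1024, 16000  # valid range stated in the notes above


def parse_thinking_budget(suffix: str) -> int:
    """Convert a suffix like '4k' or '8000' to a token budget within range.

    Hypothetical helper; ASSUMES 'k' multiplies by 1024, as in 1k = 1024.
    """
    text = suffix.strip().lower()
    if text.endswith("k"):
        value = int(text[:-1]) * 1024
    else:
        value = int(text)
    # Out-of-range values are adjusted to the nearest bound, not rejected.
    return max(MIN_BUDGET, min(MAX_BUDGET, value))


for s in ("1k", "4k", "8000", "500", "32k"):
    print(s, "->", parse_thinking_budget(s))
```

In this sketch, `500` is raised to 1024 and `32k` is lowered to 16000.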

Thinking Budget with Gemini

The Google Gemini model gemini-2.5-flash-preview-04-17 supports extended thinking through a thinking budget. This lets Gemini reason more thoroughly before providing a response.

You can enable thinking budget by adding a suffix to the model name in this format:

gemini:gemini-2.5-flash-preview-04-17:1k - Use 1024 thinking budget

gemini:gemini-2.5-flash-preview-04-17:4k - Use 4096 thinking budget

gemini:gemini-2.5-flash-preview-04-17:8000 - Use 8000 thinking budget

Notes:

Thinking budget is only supported for the gemini-2.5-flash-preview-04-17 model

Valid thinking budgets range from 0 to 24576

Values outside this range will be automatically adjusted to be within range

You can specify the budget with k notation (1k, 4k, etc.) or with exact numbers (1024, 4096, etc.)

Resources

https://docs.anthropic.com/en/api/models-list?q=list+models

https://github.com/googleapis/python-genai

https://platform.openai.com/docs/api-reference/models/list

https://api-docs.deepseek.com/api/list-models

https://github.com/ollama/ollama-python

https://github.com/openai/openai-python


