LLM Providers (BYOLLM)

Neural Inverse never routes your code through Neural Inverse servers. You bring your own API keys — they are stored locally and used directly from your machine.

Supported Providers

Cloud Providers

Provider	Setup
Anthropic	API key from console.anthropic.com
OpenAI	API key from platform.openai.com
Google Gemini	API key from aistudio.google.com
xAI (Grok)	API key from x.ai
DeepSeek	API key from platform.deepseek.com
Mistral	API key from console.mistral.ai
Groq	API key from console.groq.com
OpenRouter	API key from openrouter.ai
GitHub Models	GitHub PAT with `models:read` scope
Fireworks AI	API key from fireworks.ai
Cerebras	API key from cloud.cerebras.ai
AWS Bedrock	AWS credentials + region (default: `us-east-1`)
Google Vertex AI	GCP project + region (default: `us-west2`)
Microsoft Azure	Azure resource name + API key + API version

Local / Self-Hosted Providers

Provider	Default endpoint	Notes
Ollama	`http://localhost:11434`	Models auto-detected
vLLM	`http://localhost:8000`	Models auto-detected
LM Studio	`http://localhost:1234`	Models auto-detected
LiteLLM	Custom endpoint	OpenAI-compatible proxy
OpenAI-Compatible	Custom endpoint	Any OpenAI-format API, custom headers supported

Default Models

Neural Inverse ships with a curated default model list per provider. You can add any model string not in the list.

Anthropic: claude-opus-4-6, claude-sonnet-4-6, claude-haiku-4-5-20251001, claude-3-7-sonnet-latest, claude-3-5-sonnet-latest, claude-3-5-haiku-latest

OpenAI: gpt-5.4, gpt-5.4-mini, gpt-5.4-nano, gpt-5.1-codex, o3, o4-mini

Google Gemini: gemini-3.1-pro-preview, gemini-3-flash-preview, gemini-2.5-pro, gemini-2.5-flash, gemini-2.5-flash-lite, gemini-2.0-flash

xAI: grok-4.20, grok-4-1-fast-reasoning, grok-3, grok-3-mini

DeepSeek: deepseek-chat, deepseek-reasoner

GitHub Models: openai/gpt-4.1, openai/gpt-4.1-mini, openai/gpt-4.1-nano, openai/o4-mini, openai/o3-mini, deepseek/deepseek-r1, meta/llama-4-scout-17b-16e-instruct, mistralai/mistral-small-2503, xai/grok-3-mini

Fireworks AI: accounts/fireworks/models/llama-v3p3-70b-instruct, accounts/fireworks/models/deepseek-r1, accounts/fireworks/models/qwen3-235b-a22b, accounts/fireworks/models/qwen3-32b, accounts/fireworks/models/gemma-4-31b-it, accounts/fireworks/models/gpt-oss-120b, accounts/fireworks/models/gpt-oss-20b

Cerebras: llama3.1-8b, gpt-oss-120b, qwen-3-235b-a22b-instruct-2507

Ollama / vLLM / LM Studio: Models are auto-detected from your running server — no manual list needed.

Adding a Provider

Open Settings (gear icon or Cmd+, / Ctrl+,).
Go to Neural Inverse > LLM Providers.
Select the provider and enter your API key or endpoint.
Click Verify to test the connection.

For local providers (Ollama, vLLM, LM Studio), start your server first — Neural Inverse polls the endpoint to discover available models automatically.

Selecting a Model per Feature

Neural Inverse lets you assign different models to different features:

Chat — the model used in the sidebar chat panel
Autocomplete — the model used for inline code completions
Ctrl+K — the model used for inline edit (Cmd+K / Ctrl+K)
Apply — the model used when applying suggested changes
Power Mode — the model used by Power Mode agents

Each feature has an independent model selection. You can use a fast local model for autocomplete and a powerful cloud model for Power Mode.

OpenAI-Compatible Endpoints

For any provider that exposes an OpenAI-format API (LiteLLM proxy, custom inference server, etc.):

Select OpenAI Compatible as the provider.
Enter the base endpoint URL.
Enter the API key (leave blank if not required).
Optionally add custom request headers as JSON (e.g. for authentication headers).

Air-Gapped / Offline Setup

For environments with no internet access:

Run Ollama or vLLM on a local server within your network.
Point Neural Inverse at the local endpoint.
All LLM traffic stays within your network boundary.

No Neural Inverse cloud dependency is required after initial installation.

Was this page helpful?

On this page