Run models locally using Ollama for privacy, offline access, and control. Requires initial setup and sufficient hardware. Website: https://ollama.com/

Setup

  1. Install Ollama: Download it from ollama.com and follow the install instructions for your platform
  2. Start Ollama: Run ollama serve in a terminal
  3. Download a model:
    ollama pull qwen2.5-coder:32b
    
  4. Configure the context window and save a named variant (a verification sketch follows these steps):
    ollama run qwen2.5-coder:32b
    /set parameter num_ctx 32768
    /save your_custom_model_name
    
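Before moving on, you can confirm that the pull and /save steps worked by asking Ollama for its local model list over its REST API (GET /api/tags). Below is a minimal sketch using Python's standard library; your_custom_model_name is whatever name you chose above:

    import json
    import urllib.request

    OLLAMA_URL = "http://localhost:11434"   # default Ollama address
    EXPECTED = "your_custom_model_name"     # the name you passed to /save

    # GET /api/tags lists every model available to the local Ollama instance.
    with urllib.request.urlopen(f"{OLLAMA_URL}/api/tags") as resp:
        names = [m["name"] for m in json.load(resp)["models"]]

    print("Installed models:", names)
    # Saved models are usually listed with a ":latest" tag appended.
    if any(n == EXPECTED or n.split(":")[0] == EXPECTED for n in names):
        print(f"'{EXPECTED}' is available; enter it as the model name in CodinIT.")
    else:
        print(f"'{EXPECTED}' not found; re-run the /save step above.")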

Configuration in CodinIT

  1. Click the settings icon (⚙️) in CodinIT
  2. Select “ollama” as the API Provider
  3. Enter the model name you saved in Setup (e.g., your_custom_model_name)
  4. (Optional) Set the base URL if you are not using the default http://localhost:11434 (a quick connectivity check is sketched below)
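
To sanity-check the base URL and the saved model name before relying on them in CodinIT, you can send a single request to Ollama's /api/generate endpoint. A minimal Python sketch, assuming the default address; change OLLAMA_URL to match whatever base URL you configured:

    import json
    import urllib.request

    OLLAMA_URL = "http://localhost:11434"   # match the base URL configured in CodinIT
    MODEL = "your_custom_model_name"        # the saved model name from Setup

    # POST /api/generate runs one non-streaming completion against the model.
    payload = json.dumps({
        "model": MODEL,
        "prompt": "Reply with the single word: ready",
        "stream": False,
    }).encode("utf-8")

    request = urllib.request.Request(
        f"{OLLAMA_URL}/api/generate",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as resp:
        print(json.load(resp)["response"])   # any reply means the endpoint and model work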

Recommended Models

  • qwen2.5-coder:32b - Excellent for coding
  • codellama:34b-code - High quality, large size
  • deepseek-coder:6.7b-base - Effective for coding
  • llama3:8b-instruct-q5_1 - General tasks
See the Ollama model library for the full list.

Dynamic Context Windows

CodinIT automatically calculates an optimal context window based on the model's parameter size (a sketch of this logic appears at the end of this section):
  • 70B+ models: 32k context window (e.g., Llama 70B)
  • 30B+ models: 16k context window
  • 7B+ models: 8k context window
  • Smaller models: 4k context window (default)
Special model families:
  • Llama 70B models: 32k context
  • Llama 405B models: 128k context
Model labels in CodinIT show both parameter size and context window (e.g., “qwen2.5-coder:32b (32B, 16k ctx)”).
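
The sizing rules above boil down to a small lookup. The sketch below only illustrates those rules and is not CodinIT's actual code; it also assumes the "k" figures are the usual power-of-two token counts:

    def context_window(model_name: str, param_billions: float) -> int:
        """Approximate the context window CodinIT assigns, per the rules above."""
        name = model_name.lower()
        # Special Llama families take precedence over the size-based tiers.
        if "llama" in name and param_billions >= 405:
            return 131_072    # 128k
        if "llama" in name and param_billions >= 70:
            return 32_768     # 32k
        # Size-based defaults.
        if param_billions >= 70:
            return 32_768     # 32k
        if param_billions >= 30:
            return 16_384     # 16k
        if param_billions >= 7:
            return 8_192      # 8k
        return 4_096          # 4k default for smaller models

    # Matches the label "qwen2.5-coder:32b (32B, 16k ctx)".
    print(context_window("qwen2.5-coder:32b", 32))   # -> 16384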

Notes

  • Auto-detection: CodinIT automatically detects Ollama running on port 11434 (one possible probe is sketched after these notes)
  • Context window: Dynamically calculated based on model capabilities
  • Resource demands: Large models require significant system resources
  • Offline capability: Works without internet after model download
  • Performance: Responses may be slow on typical consumer hardware, especially with larger models
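
How CodinIT performs this auto-detection is not described here; the sketch below shows one plausible way a client could probe for a local Ollama instance on the default port, combining a quick TCP check with a call to /api/tags:

    import socket
    import urllib.request

    def ollama_detected(host: str = "127.0.0.1", port: int = 11434) -> bool:
        """Return True if something answering like Ollama listens on host:port."""
        try:
            # Cheap TCP check: is anything listening on the port at all?
            with socket.create_connection((host, port), timeout=1):
                pass
            # Confirm the HTTP API responds; /api/tags lists local models.
            with urllib.request.urlopen(f"http://{host}:{port}/api/tags", timeout=2) as resp:
                return resp.status == 200
        except OSError:
            return False

    print("Ollama detected" if ollama_detected() else "No Ollama on port 11434")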