What is a context window?

A context window is the maximum amount of text an AI model can process at once. Think of it as the model’s “working memory” - it determines how much of your conversation and code the model can consider when generating responses.
What are tokens? Tokens are the small chunks of text that AI models read. A token is roughly ¾ of a word, and words are often split into pieces: depending on the tokenizer, “hamburger” might become “ham” + “burger” or “ham” + “bur” + “ger”. When we talk about context window sizes like “128K tokens,” that means the AI can read about 96,000 words at once.
Key point: Larger context windows allow the model to understand more of your codebase at once, but may increase costs and response times.
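
The rules of thumb above can be sketched in a few lines of Python. This is illustrative only; a real tokenizer for your specific model will give different exact counts.

```python
# Rough token/word conversion using the ~0.75 words-per-token heuristic
# from the section above (illustrative; real tokenizers vary by model).

WORDS_PER_TOKEN = 0.75

def words_to_tokens(word_count: int) -> int:
    """Estimate how many tokens a given word count consumes."""
    return round(word_count / WORDS_PER_TOKEN)

def tokens_to_words(token_count: int) -> int:
    """Estimate how many words fit in a given token budget."""
    return round(token_count * WORDS_PER_TOKEN)

print(tokens_to_words(128_000))  # a 128K-token window holds ~96,000 words
```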

Quick reference

| Size | Tokens | Approximate Words | Use Case |
| --- | --- | --- | --- |
| Small | 8K-32K | 6,000-24,000 | Single files, quick fixes |
| Medium | 128K | ~96,000 | Most coding projects |
| Large | 200K | ~150,000 | Complex codebases |
| Extra Large | 400K+ | ~300,000+ | Entire applications |
| Massive | 1M+ | ~750,000+ | Multi-project analysis |

Model context windows

| Model | Context Window | Effective Window* | Notes |
| --- | --- | --- | --- |
| Claude Sonnet 4.5 | 1M tokens | ~500K tokens | Best quality at high context |
| GPT-5 | 400K tokens | ~300K tokens | Three modes affect performance |
| Gemini 2.5 Pro | 1M+ tokens | ~600K tokens | Excellent for documents |
| DeepSeek V3 | 128K tokens | ~100K tokens | Optimal for most tasks |
| Qwen3 Coder | 256K tokens | ~200K tokens | Good balance |
*Effective window is where the model maintains high quality. Beyond this point, the AI may start “forgetting” earlier parts of your conversation.

What counts toward context

  1. Your current conversation - All messages in the chat
  2. File contents - Any files you’ve shared or CodinIT has read
  3. Tool outputs - Results from executed commands
  4. System prompts - CodinIT’s instructions (minimal impact)
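
All four categories draw from the same token budget. A minimal sketch of how they add up, with hypothetical names and illustrative numbers:

```python
# Everything in the context window counts against one shared budget.
# The function name and the default system-prompt size are assumptions
# for illustration, not CodinIT's actual accounting.

def estimate_context_usage(conversation_tokens: int,
                           file_tokens: int,
                           tool_output_tokens: int,
                           system_prompt_tokens: int = 1_000) -> int:
    """Total tokens consumed across the four categories above."""
    return (conversation_tokens + file_tokens
            + tool_output_tokens + system_prompt_tokens)

usage = estimate_context_usage(
    conversation_tokens=12_000,   # chat history
    file_tokens=40_000,           # files shared or read
    tool_output_tokens=3_000,     # command results
)
print(f"{usage:,} tokens used")   # prints "56,000 tokens used"
```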

Optimization strategies

1. Start fresh for new features

/new - Creates a new task with clean context
Benefits:
  • Maximum context available
  • No irrelevant history
  • Better model focus

2. Use @ mentions strategically

Instead of including entire files:
  • @filename.ts - Include only when needed
  • Use search instead of reading large files
  • Reference specific functions rather than whole files

3. Enable auto-compact

CodinIT can automatically summarize long conversations to free up space in the context window:
  • Settings → Features → Auto-compact
  • Preserves important context
  • Reduces token usage
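
Conceptually, auto-compact replaces older messages with a short summary once the conversation grows past a budget, so recent context survives intact. A simplified sketch (the function name and thresholds are hypothetical; CodinIT's actual implementation may differ):

```python
# Illustrative auto-compaction: when the conversation exceeds a budget,
# everything except the most recent messages is collapsed into a summary
# placeholder. In practice the summary would be model-generated.

def compact(messages: list[str], budget_chars: int,
            keep_recent: int = 4) -> list[str]:
    """Summarize older messages when the total size exceeds the budget."""
    if sum(len(m) for m in messages) <= budget_chars:
        return messages  # under budget: nothing to do
    older, recent = messages[:-keep_recent], messages[-keep_recent:]
    summary = f"[summary of {len(older)} earlier messages]"
    return [summary] + recent
```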

Context window warnings

Signs you’re hitting limits

| Warning Sign | What It Means | Solution |
| --- | --- | --- |
| "Context window exceeded" | Hard limit reached | Start new task or enable auto-compact |
| Slower responses | Model struggling with context | Reduce included files |
| Repetitive suggestions | Context fragmentation | Summarize and start fresh |
| Missing recent changes | Context overflow | Use checkpoints to track changes |

Best practices by project size

Small projects (< 50 files)

  • Any model works well
  • Include relevant files freely
  • No special optimization needed

Medium projects (50-500 files)

  • Use 128K+ context models
  • Include only working set of files
  • Clear context between features

Large projects (500+ files)

  • Use 200K+ context models
  • Focus on specific modules
  • Use search instead of reading many files
  • Break work into smaller tasks

Advanced context management

Plan/Act mode optimization

Leverage Plan/Act mode for better context usage:
  • Plan Mode: Use smaller context for discussion and planning
  • Act Mode: Include necessary files when you’re ready to write code
Configuration example:
Plan Mode: DeepSeek V3 (128K) - Lower-cost planning
Act Mode: Claude Sonnet 4.5 (1M) - Maximum context for coding

Context pruning strategies

These are ways CodinIT can reduce the amount of text in your context window:
  1. Temporal pruning: Remove older parts of your conversation that are no longer relevant
  2. Semantic pruning: Keep only the code sections related to your current task
  3. Hierarchical pruning: Keep the big picture but remove fine details
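
Temporal pruning (strategy 1) is the simplest to illustrate: drop the oldest conversation turns until what remains fits the token budget. A sketch with hypothetical turn data:

```python
# Temporal pruning sketch: each turn is (text, token_count), oldest first.
# Turns are discarded from the front until the total fits the budget.
# Data shapes and numbers are illustrative, not CodinIT internals.

def temporal_prune(turns: list[tuple[str, int]],
                   budget: int) -> list[tuple[str, int]]:
    """Drop the oldest turns until total tokens fit within the budget."""
    pruned = list(turns)
    while pruned and sum(t for _, t in pruned) > budget:
        pruned.pop(0)  # remove the oldest turn first
    return pruned

turns = [("early planning", 500), ("code draft", 2000), ("bug fix", 300)]
print(temporal_prune(turns, budget=2500))  # the 500-token turn is dropped
```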

Token counting tips

Rough estimates

  • 1 token ≈ 0.75 words (so 1,000 tokens is about 750 words)
  • 1 token ≈ 4 characters
  • 100 lines of code ≈ 500-1000 tokens
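
These heuristics can be combined into a quick ballpark estimator. This is a sketch only; use your model's real tokenizer when you need exact counts.

```python
# Ballpark token estimate averaging the two rules of thumb above:
# 1 token ~= 4 characters, 1 token ~= 0.75 words. Illustrative only.

def estimate_tokens(text: str) -> int:
    by_chars = len(text) / 4          # character-based estimate
    by_words = len(text.split()) / 0.75  # word-based estimate
    return round((by_chars + by_words) / 2)
```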

File size guidelines

| File Type | Tokens per KB |
| --- | --- |
| Code | ~250-400 |
| JSON | ~300-500 |
| Markdown | ~200-300 |
| Plain text | ~200-250 |

Context window FAQ

Q: Why do responses get worse with very long conversations?

A: Models can lose focus with too much context. The “effective window” is typically 50-75% of the advertised limit (see the model table above).

Q: Should I use the largest context window available?

A: Not always. Larger contexts increase cost and can reduce response quality. Match the context to your task size.

Q: How can I tell how much context I’m using?

A: CodinIT shows token usage in the interface. Watch for the context meter approaching limits.

Q: What happens when I exceed the context limit?

A: CodinIT will either:
  • Automatically compact the conversation (if enabled)
  • Show an error and suggest starting a new task
  • Truncate older messages (with warning)

Recommendations by use case

| Use Case | Recommended Context | Model Suggestion |
| --- | --- | --- |
| Quick fixes | 32K-128K | DeepSeek V3 |
| Feature development | 128K-200K | Qwen3 Coder |
| Large refactoring | 400K+ | Claude Sonnet 4.5 |
| Code review | 200K-400K | GPT-5 |
| Documentation | 128K | Any budget model |