Context Compaction
Work indefinitely on complex tasks. Lurus Code intelligently compresses your context - without losing information.
How Context Compaction Works
An intelligent process that compresses your context without losing important information.
Threshold Reached
When 50% (or 85% in Max Mode) of context is used, compression starts automatically.
Split Messages
The older 70% of messages are summarized; the most recent 30% are kept verbatim.
Create State Snapshot
A structured XML document captures all critical information: goal, files, tasks, constraints.
Continue Seamlessly
The agent continues with compressed context - no interruption, no questions asked.
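The four steps above can be sketched as a single pipeline. This is a minimal illustration, not the real implementation: the thresholds and the 70/30 split come from the text, while the message shape and the `summarize` stand-in (which just wraps the message count instead of calling an LLM) are hypothetical.

```typescript
type Message = { role: "user" | "assistant"; content: string };

const THRESHOLD = 0.5;   // 50% default threshold (85% in Max Mode)
const KEEP_RATIO = 0.3;  // most recent 30% kept verbatim

// Hypothetical summarizer: in the real system an LLM builds the
// <state_snapshot>; here we only wrap the compressed message count.
function summarize(messages: Message[]): Message {
  return {
    role: "assistant",
    content: `<state_snapshot>(${messages.length} messages compressed)</state_snapshot>`,
  };
}

function maybeCompact(messages: Message[], usedTokens: number, limit: number): Message[] {
  if (usedTokens / limit < THRESHOLD) return messages;       // step 1: threshold
  const keep = Math.max(1, Math.floor(messages.length * KEEP_RATIO));
  const older = messages.slice(0, messages.length - keep);   // step 2: split 70/30
  const recent = messages.slice(messages.length - keep);
  return [summarize(older), ...recent];                      // steps 3-4: snapshot, continue
}
```

With 10 messages at 60% context usage, the older 7 are replaced by one snapshot message and the newest 3 survive verbatim.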
Compaction Features
Everything you need to work indefinitely on complex tasks.
Automatic Compaction
Claude Code Style - Seamless & Intelligent
When your context reaches 50% of the limit, Lurus Code automatically compresses the conversation. You continue working seamlessly - the agent retains all important information.
- 50% threshold (default)
- 85% threshold (Max Mode)
- Minimum 6 messages
- Calibration via API feedback
- Fail-safe fallback
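The trigger conditions above can be condensed into one predicate. The numbers (50%, 85%, minimum 6 messages) are taken from the list; the function itself and its signature are illustrative assumptions, not the product's API.

```typescript
// Decide whether automatic compaction should fire (sketch).
function shouldCompact(
  usedTokens: number,
  limit: number,
  messageCount: number,
  maxMode: boolean,
): boolean {
  if (messageCount < 6) return false;         // minimum 6 messages
  const threshold = maxMode ? 0.85 : 0.5;     // Max Mode raises the bar
  return usedTokens / limit >= threshold;
}
```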
State Snapshot
Structured XML Summarization
Compression creates a structured <state_snapshot> with all critical information: goal, constraints, file paths, actions and task status. No context is lost.
- <overall_goal> - Main objective
- <active_constraints> - Rules
- <key_knowledge> - Technical details
- <artifact_trail> - All file paths
- <task_state> - Status of all tasks
- <user_messages> - User instructions
Relevance Scoring
Intelligent Prioritization over FIFO
Instead of simply deleting old messages, Lurus Code scores each message by relevance. Important information is preserved, redundant content is removed.
- User messages: Score 90
- File edits: Score 80
- Errors: Score 70
- Bash commands: Score 40
- Redundant reads: Score 5
- Recency boost: 2x
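The scores above can be read as a simple lookup plus a recency multiplier. A minimal sketch, assuming a flat score table and a boolean recency flag; the message categories and function shape are hypothetical:

```typescript
type Kind = "user" | "file_edit" | "error" | "bash" | "redundant_read";

// Base scores from the list above.
const BASE_SCORE: Record<Kind, number> = {
  user: 90,
  file_edit: 80,
  error: 70,
  bash: 40,
  redundant_read: 5,
};

// Recent messages get a 2x boost, so a recent bash command (80)
// outranks an old error (70) - prioritization, not FIFO.
function relevance(kind: Kind, isRecent: boolean): number {
  return BASE_SCORE[kind] * (isRecent ? 2 : 1);
}
```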
Tool Output Truncation
Intelligently Truncate Large Outputs
Large tool outputs (Grep, Read, Bash) are intelligently truncated. Stale reads for unedited files are reduced to 5 lines - the agent can re-read the file if needed.
- 50k token budget per output
- Stale read detection
- Grep/Glob top-10 truncation
- Edited files stay complete
- Automatic hints
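The stale-read rule above can be sketched as follows. The 5-line cutoff and the "edited files stay complete" rule come from the text; the function name, the hint wording, and the `wasEdited` flag are illustrative assumptions.

```typescript
const STALE_READ_LINES = 5;

// Cut a stale Read output down to 5 lines with a re-read hint;
// outputs for files the agent edited are left untouched.
function truncateStaleRead(output: string, path: string, wasEdited: boolean): string {
  const lines = output.split("\n");
  if (wasEdited || lines.length <= STALE_READ_LINES) return output;
  return [
    ...lines.slice(0, STALE_READ_LINES),
    `[truncated - re-read ${path} if needed]`,  // automatic hint
  ].join("\n");
}
```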
Max Mode
Maximum Context Utilization
For complex tasks: Max Mode delays compaction until 85% of context. More context = better results for large refactorings or code reviews.
- /max slash command
- 85% instead of 50% threshold
- Ideal for large codebases
- Manually toggleable
- Status visible in /status
Context Condensation
Compress Project Instructions
When LURUS.md or project rules become too large, Lurus Code automatically condenses them. All MUST/NEVER/ALWAYS rules are preserved - only redundancy is removed.
- Haiku model (fast & cheap)
- 80k character input limit
- Verbatim rules preservation
- Fallback: Head/tail truncation
- Token budget configurable
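The head/tail fallback mentioned above could look like this. The 80k character limit is from the text; the even head/tail split and the marker string are assumptions for illustration.

```typescript
const INPUT_LIMIT = 80_000;  // 80k character input limit

// Fallback when model-based condensation is unavailable: keep the
// head and tail of the instructions and drop the middle.
function headTailTruncate(text: string, limit: number = INPUT_LIMIT): string {
  if (text.length <= limit) return text;
  const half = Math.floor(limit / 2);
  return text.slice(0, half) + "\n[...condensed...]\n" + text.slice(text.length - half);
}
```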
State Snapshot Format
This is what compressed context looks like - structured, complete, lossless.
<state_snapshot>
<overall_goal>Implement user authentication with OAuth2</overall_goal>
<active_constraints>
- Use TypeScript strict mode
- Follow existing patterns in src/auth/
- No external dependencies without approval
</active_constraints>
<key_knowledge>
- Auth service at src/services/auth.service.ts
- JWT secret in .env.local
- Database: PostgreSQL with Prisma
</key_knowledge>
<artifact_trail>
MODIFIED: src/services/auth.service.ts - Added OAuth2 flow
CREATED: src/types/oauth.types.ts - OAuth response types
READ: src/config/database.ts - Connection settings
</artifact_trail>
<task_state>
1. [DONE] Create OAuth types
2. [IN PROGRESS] Implement token refresh
3. [TODO] Add logout endpoint
</task_state>
</state_snapshot>

# Manual compaction
/compact

# Enable Max Mode (85% threshold)
/max

# Check status
/status
# → Context: 45% (Max Mode: ON)

# Change utility model for summarization
/utilitymodel sonnet

// lurus.config.json
{
  "compactionModel": "haiku",
  "compactInstructions": "Preserve all API endpoints and their parameters",
  "maxContextTokens": 200000
}

Technical Details
For developers who want to know the details.
Compression Algorithm
- 70% older messages → Summarized
- 30% recent messages → Kept verbatim
- Clean boundary detection (user message start)
- Incremental summarization with existing snapshot
- Verification pass for path completeness
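Clean boundary detection, the third point above, can be sketched as walking the 70/30 split point back to the start of a user message so the summary never cuts mid-exchange. The message shape and the walk-back strategy are assumptions for illustration:

```typescript
type Msg = { role: "user" | "assistant" | "tool"; content: string };

// Returns the index where the kept-verbatim tail begins: roughly the
// 70% mark, moved back until the tail starts with a user message.
function splitIndex(messages: Msg[], keepRatio = 0.3): number {
  let idx = Math.floor(messages.length * (1 - keepRatio));
  while (idx > 0 && messages[idx].role !== "user") idx--;
  return idx;
}
```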
Security Features
- Prompt injection protection in summaries
- Adversarial content detection
- Strict XML format enforcement
- No command execution from history
- Sanitized tool outputs
Configuration
- compactionModel in lurus.config.json
- compactInstructions for custom prompts
- /utilitymodel for summarization model
- Max context tokens per model
- Safety buffer: 20k tokens
With vs. Without Compaction
The difference is dramatic - especially for long sessions.
Without Compaction
- Context limit reached after ~30 minutes
- Session must be restarted
- Context is completely lost
- Agent forgets previous work
- Files must be re-read
- High token costs from repetition
With Compaction
- Unlimited session length possible
- Automatic background compression
- All important information preserved
- Agent remembers everything
- No redundant file reads
- 70% token savings
Ready to revolutionize your development?
Thousands of developers already trust Lurus Code. Get started today.