ZeroClaw: Automatic Compaction

Layer 2 study notes, based on the actual source code. Files: src/agent/agent.rs, src/memory/consolidation.rs, src/agent/memory_loader.rs

Two mechanisms

Compaction runs in two independent layers.

Conversation in progress
    │
    ├─ after every turn completes
    │   └─ consolidation.rs ← LLM reads the turn and extracts memories (fire-and-forget)
    │
    └─ inside the turn() loop
        └─ trim_history() ← drops the oldest messages once max_history_messages is exceeded

Layer 1: trim_history() (simple truncation)

Called inside turn() every time a tool call result is pushed onto history.

fn trim_history(&mut self) {
    let max = self.config.max_history_messages; // default: 50

    // split system messages from everything else
    let mut system_messages = Vec::new();
    let mut other_messages = Vec::new();

    for msg in self.history.drain(..) {
        match &msg {
            ConversationMessage::Chat(chat) if chat.role == "system" => {
                system_messages.push(msg); // system messages are always kept
            }
            _ => other_messages.push(msg),
        }
    }

    // drop the excess from the front (oldest first)
    if other_messages.len() > max {
        let drop_count = other_messages.len() - max;
        other_messages.drain(0..drop_count);
    }

    self.history = system_messages;
    self.history.extend(other_messages);
}

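Properties:

The system prompt (tool list, safety guidelines) is never dropped.
The oldest messages are cut first.
No LLM call, so it is immediate.
With a limit of 50, the 51st non-system message pushes out the oldest one.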
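
To check the "51st pushes out the 1st" behavior in isolation, here is a minimal standalone sketch of the same logic. Msg and sample_history are hypothetical stand-ins invented for this note (the real ConversationMessage has more variants), and max is passed as an argument instead of being read from config.

```rust
/// Simplified stand-in for ConversationMessage (hypothetical, not the real type).
#[derive(Debug, Clone, PartialEq)]
struct Msg {
    role: String,
    content: String,
}

/// Keep every system message, then keep only the newest `max` other messages.
fn trim_history(history: &mut Vec<Msg>, max: usize) {
    // partition keeps the relative order inside each group
    let (system, mut others): (Vec<Msg>, Vec<Msg>) =
        history.drain(..).partition(|m| m.role == "system");

    if others.len() > max {
        let drop_count = others.len() - max;
        others.drain(0..drop_count); // oldest first
    }

    // system messages end up at the front, followed by the survivors
    *history = system;
    history.extend(others);
}

/// Test helper: one system prompt followed by `n` user messages ("msg 1".."msg n").
fn sample_history(n: usize) -> Vec<Msg> {
    let mut h = vec![Msg { role: "system".into(), content: "tools + safety".into() }];
    for i in 1..=n {
        h.push(Msg { role: "user".into(), content: format!("msg {i}") });
    }
    h
}
```

Calling trim_history(&mut sample_history(51), 50) leaves the system prompt plus "msg 2" through "msg 51". One side effect worth noting: any system message that sat in the middle of the history gets hoisted to the front, because the two groups are re-concatenated.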

Layer 2: consolidation.rs (LLM memory extraction)

After every turn completes, an LLM reads the turn in the background and extracts two things.

// called fire-and-forget from loop_.rs
tokio::spawn(async move {
    consolidate_turn(provider, model, memory, user_msg, assistant_resp).await
});
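
What fire-and-forget means in practice: the caller gets the reply back immediately, and whatever the background task returns (including errors) is never looked at. The sketch below shows the pattern with std::thread instead of tokio so it runs without dependencies. consolidate, finish_turn and the done channel are all hypothetical names made up for this illustration, not ZeroClaw APIs.

```rust
use std::sync::mpsc::Sender;
use std::thread;

/// Stand-in for the slow, LLM-backed consolidation step (hypothetical).
fn consolidate(user_msg: &str, assistant_resp: &str) -> Result<String, String> {
    if user_msg.is_empty() {
        return Err("nothing to consolidate".to_string());
    }
    Ok(format!("User: {user_msg}\nAssistant: {assistant_resp}"))
}

/// Finish a turn: start consolidation in the background and return the reply right away.
/// `done` exists only so this example can observe the outcome.
fn finish_turn(
    user_msg: String,
    assistant_resp: String,
    done: Sender<Result<String, String>>,
) -> String {
    let reply = assistant_resp.clone();
    // The JoinHandle is never joined, so the thread is detached: nobody waits on it.
    let _detached = thread::spawn(move || {
        let outcome = consolidate(&user_msg, &assistant_resp);
        // If the receiver is gone, the outcome is silently lost.
        // That is the fire-and-forget trade-off.
        let _ = done.send(outcome);
    });
    reply
}
```

The same trade-off should apply to the tokio::spawn call above if the returned handle is dropped: a failed consolidation does not affect the turn, but it also goes unnoticed unless the task logs it itself.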

Extraction prompt

You are a memory consolidation engine. Given a conversation turn, extract:
1. "history_entry": A brief summary (1-2 sentences)
2. "memory_update": NEW facts/preferences/decisions worth remembering long-term.
                    Return null if nothing new was learned.

Respond ONLY with valid JSON:
{"history_entry": "...", "memory_update": "..." or null}

Two-phase storage

pub async fn consolidate_turn(...) {
    let turn_text = format!("User: {}\nAssistant: {}", user_msg, assistant_resp);

    // truncate if over 4000 characters (safely, on a char boundary)
    let truncated = ...;

    // LLM call (temperature 0.1, so output is close to deterministic)
    let raw = provider.chat_with_system(
        Some(CONSOLIDATION_SYSTEM_PROMPT),
        &truncated,
        model,
        0.1,
    ).await?;

    let result: ConsolidationResult = parse_consolidation_response(&raw, &turn_text);

    // Phase 1: store the turn summary in the Daily category
    let history_key = format!("daily_{}_{}", date, uuid);
    memory.store(&history_key, &result.history_entry, Daily, None).await?;

    // Phase 2: store new facts in the Core category (only when present)
    if let Some(update) = result.memory_update {
        let imp = importance::compute_importance(&update, &Core); // importance score
        conflict::check_and_resolve_conflicts(memory, ...).await; // conflict check
        memory.store_with_metadata(&mem_key, &update, Core, None, None, Some(imp)).await?;
    }
}
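
The truncation step is elided above, so here is one way "cut safely on a char boundary" could look. This is a sketch, not the code from consolidation.rs: both function names are hypothetical, and the notes do not say whether the 4000 limit counts bytes or characters, so both readings are shown.

```rust
/// Longest prefix of `text` that is at most `max_bytes` bytes long
/// and ends on a UTF-8 character boundary.
fn truncate_on_char_boundary(text: &str, max_bytes: usize) -> &str {
    if text.len() <= max_bytes {
        return text;
    }
    let mut end = max_bytes;
    // Walk back until `end` is a boundary; index 0 always is, so this terminates.
    while !text.is_char_boundary(end) {
        end -= 1;
    }
    &text[..end]
}

/// Alternative reading: keep at most `max_chars` characters (Unicode scalar values).
fn truncate_chars(text: &str, max_chars: usize) -> &str {
    match text.char_indices().nth(max_chars) {
        // `idx` is the byte offset where the first dropped character starts
        Some((idx, _)) => &text[..idx],
        None => text, // fewer than max_chars characters, nothing to cut
    }
}
```

Slicing with a plain &text[..4000] would panic whenever byte 4000 lands inside a multi-byte character, which is easy to hit with Korean input, so some boundary check like this is needed either way.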

Concrete example

Conversation:
  User: "Memory safety matters in Rust"
  Assistant: "Right. The borrow checker..."

↓ LLM extracts:

history_entry:
  "User discussed Rust memory safety and borrow checker concepts."
  → stored as Daily under daily_2026-03-22_{uuid}

memory_update:
  "User values memory safety in systems programming, prefers Rust."
  → stored as Core under core_{uuid}, with an importance score
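
The two-phase rule (always write a Daily entry, write a Core entry only when memory_update is present) can be sketched against a toy in-memory store. InMemoryStore, persist and the id parameter are hypothetical; the real code goes through the memory backend and also runs importance scoring and conflict resolution, which are left out here. Whether Daily and Core keys share one uuid is an assumption too.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Category {
    Daily,
    Core,
}

/// Shape of the parsed LLM output, following the prompt's JSON fields.
struct ConsolidationResult {
    history_entry: String,
    memory_update: Option<String>, // None when the LLM returned null
}

/// Toy replacement for the memory backend (hypothetical).
#[derive(Default)]
struct InMemoryStore {
    entries: HashMap<String, (Category, String)>,
}

impl InMemoryStore {
    fn store(&mut self, key: &str, content: &str, category: Category) {
        self.entries.insert(key.to_string(), (category, content.to_string()));
    }

    fn count(&self, category: Category) -> usize {
        self.entries.values().filter(|(c, _)| *c == category).count()
    }
}

fn persist(store: &mut InMemoryStore, result: &ConsolidationResult, date: &str, id: &str) {
    // Phase 1: every turn gets a Daily summary
    store.store(&format!("daily_{date}_{id}"), &result.history_entry, Category::Daily);

    // Phase 2: Core is written only when something new was learned
    if let Some(update) = &result.memory_update {
        store.store(&format!("core_{id}"), update, Category::Core);
    }
}
```

So a turn of pure small talk still leaves a Daily trace, but the Core category only grows when the LLM judges that a fact is worth keeping.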

Layer 3: DefaultMemoryLoader (injecting memories into the conversation)

At the start of the next turn, stored memories are prepended to the user message.

async fn load_context(&self, memory, user_message, session_id) {
    let entries = memory.recall(user_message, limit=5, ...).await?;

    let mut context = String::from("[Memory context]\n");
    for entry in entries {
        // filter criteria
        if is_assistant_autosave_key(&entry.key) { continue; } // skip legacy noise
        if should_skip_autosave_content(&entry.content) { continue; } // skip cron messages
        if score < min_relevance_score(0.4) { continue; } // skip low-relevance entries

        writeln!(context, "- {}: {}", entry.key, entry.content);
    }

    context // → prepended to the user message and sent to the LLM
}

The message the LLM ends up receiving:

[Memory context]
- core_abc: User values memory safety in systems programming, prefers Rust.
- daily_2026-03-22_xyz: User discussed Rust memory safety and borrow checker concepts.

[2026-03-22 15:30:00 KST] Is there an assignment due today?
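
A small standalone sketch of the filter-and-format step that produces a message like the one above. MemoryEntry, build_context and inject are hypothetical names, the autosave filters are omitted, and two behaviors are my own assumptions rather than something read from memory_loader.rs: entries without a score are kept, and an empty result yields no "[Memory context]" header at all.

```rust
use std::fmt::Write;

/// Minimal stand-in for a recalled memory entry (hypothetical).
struct MemoryEntry {
    key: String,
    content: String,
    score: Option<f64>, // relevance score from recall, if the backend provides one
}

/// Build the "[Memory context]" prefix from recalled entries.
fn build_context(entries: &[MemoryEntry], min_relevance_score: f64) -> String {
    let mut lines = String::new();
    for entry in entries {
        // Assumption: entries with no score are kept, low scores are skipped.
        if let Some(score) = entry.score {
            if score < min_relevance_score {
                continue;
            }
        }
        // Writing into a String cannot fail, so the result is ignored.
        let _ = writeln!(lines, "- {}: {}", entry.key, entry.content);
    }
    if lines.is_empty() {
        return String::new(); // assumption: no header when nothing survives
    }
    format!("[Memory context]\n{lines}\n")
}

/// Prepend the context to the user message, as the loader's output is used.
fn inject(context: &str, user_message: &str) -> String {
    format!("{context}{user_message}")
}
```

With a threshold of 0.4, an entry scored 0.9 makes it into the prefix and one scored 0.2 does not, so the final string is the header, the surviving "- key: content" lines, a blank line, then the user message.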

Overall flow

turn N
    │
    ├─ trim_history()  ← during the turn, after each tool call result
    │   over max 50? → drop the oldest messages (system preserved)
    │
    └─ tokio::spawn(consolidate_turn())  ← after the turn completes; async, the next turn does not wait
            │
            ├─ LLM reads the turn (temperature 0.1)
            ├─ history_entry → stored in the Daily category
            └─ memory_update → stored in the Core category (only when present)
                                  + importance scoring
                                  + conflict check

turn N+1 starts
    │
    └─ memory_loader.load_context()
            │
            ├─ recall(user_message, limit=5)  ← hybrid FTS + vector search
            └─ prepended to the user message as "[Memory context]\n- key: value\n"

trim_history vs consolidation

| | trim_history | consolidation |
|--|---|---|
| When | after each tool call | after the turn completes (async) |
| How | drops the oldest messages | LLM extracts meaning |
| Cost | none | one LLM call |
| Result | bounded history size | long-term memories stored |
| Purpose | context window management | memory that persists across sessions |

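compact_context flag

[agent]
compact_context = false  # default

When set to true, an extra layer kicks in inside loop_.rs where the LLM compresses the entire history. Unlike trim_history, which just cuts off the oldest messages, the LLM reads the full history and replaces it with a summary that keeps only the essentials.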
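
I have not read that code path yet, so the following is only a guess at the shape of "replace the history with a summary", not what loop_.rs does. Msg and compact_history are hypothetical, the summarizer is a plain closure standing in for the LLM call, and keeping system messages plus storing the summary as an assistant message are both assumptions.

```rust
/// Simplified stand-in for a history message (hypothetical).
#[derive(Debug, Clone, PartialEq)]
struct Msg {
    role: String,
    content: String,
}

/// Replace all non-system messages with a single summary message.
/// `summarize` stands in for the LLM call.
fn compact_history<F>(history: &mut Vec<Msg>, summarize: F)
where
    F: Fn(&str) -> String,
{
    let (system, others): (Vec<Msg>, Vec<Msg>) =
        history.drain(..).partition(|m| m.role == "system");

    *history = system;
    if others.is_empty() {
        return; // nothing to summarize
    }

    // Flatten the conversation into one transcript, one "role: content" line per message.
    let transcript = others
        .iter()
        .map(|m| format!("{}: {}", m.role, m.content))
        .collect::<Vec<_>>()
        .join("\n");

    let summary = summarize(&transcript);
    history.push(Msg {
        role: "assistant".into(),
        content: format!("[Summary of earlier conversation]\n{summary}"),
    });
}
```

Compared with trim_history, the cost profile flips: nothing is lost outright, but every compaction costs an LLM call and the quality depends on the summary.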

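Strategy for my project

A schedule-management agent needs assignment and schedule information to persist across sessions:

[agent]
max_history_messages = 30   # schedule queries are short, so this is enough
compact_context = false      # plain trim is enough

[memory]
auto_save = true             # auto-save the user message on every turn
backend = "sqlite"

Consolidation automatically extracts assignment-related conversation into Core: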

User: "I submitted the algorithm assignment"
→ memory_update: "User submitted algorithm assignment on 2026-03-22"
→ stored in the Core category

Automatically injected in the next conversation:

[Memory context]
- core_abc: User submitted algorithm assignment on 2026-03-22

What do I have to do today?