mksglu's comments | Hacker News

OpenAI published "Harness Engineering" in February 2026. The thesis: engineers don't write code anymore. They design environments, specify intent, and build feedback loops. Agents do the rest. Codex proved the concept. Symphony showed the architecture in Elixir/OTP. I wanted the same thing with Claude Code.

So I built Hatice.



Six days ago I posted a write-up on Hacker News explaining how Context Mode reduces Claude Code's context consumption by 98%. I expected a handful of comments. I got 565 points, 107 comments, and the kind of feedback that makes you stay up until 4 AM shipping features.

What's new:

  - Five platform adapters: Claude Code, Gemini CLI, VS Code Copilot, OpenCode, Codex CLI
  - Session Continuity: survives context compactions, restores state automatically (~3h sessions vs ~30 min)
  - ctx_batch_execute: one call replaces 30+ individual commands
  - Three-layer fuzzy search: Porter stemming, trigram substring, Levenshtein
  - FTS5 deduplication, background mode, SSL cert auto-detection

Breaking changes: the repo was renamed claude-context-mode → context-mode, and all tools are now prefixed with ctx_.

  Install (Claude Code): /install context-mode
  Install (all platforms): npm install -g context-mode

  GitHub: https://github.com/mksglu/context-mode
  Release notes: https://github.com/mksglu/context-mode/releases/tag/v1.0.0

Four adapters are in beta — PRs welcome. The codebase is designed for Agentic Engineering: point your coding agent at it and let it propose fixes.

Thank you to everyone who commented, starred, and opened issues on the original post. You made this release happen.


Thanks, really appreciate hearing that! Glad it's working well for your team.


HN Mod here. Is the date on the post an error? It says Feb 2025 but the project seems new. I initially went to put a date reference on the HN title but then realised it's more likely a mistake on your post.


His post, code, and all the replies here are LLM-authored and don't make any sense. He has no idea why his Claude Code instance wrote Feb 2025 instead of Feb 2026. All his results are placebos or nonsense. I can also start new conversations with only 2% of the context in them, or you can call compact; it will all work better. The post has to be flagged.


Yeah it's basically pre-compaction, you're right. The key difference is nothing gets thrown away. The full output sits in a searchable FTS5 index, so if the model realizes it needs some detail it missed in the summary, it can search for it. It's less "decide what's relevant upfront" and more "give me the summary now, let me come back for specifics later."
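The "summary now, search later" pattern described above can be sketched in a few lines with SQLite's FTS5 extension. This is illustrative only — the function names and schema here are invented for the example and are not context-mode's actual API:

```python
import sqlite3

# Toy sketch: index the full tool output, hand the model only a summary.
db = sqlite3.connect(":memory:")
db.execute("CREATE VIRTUAL TABLE outputs USING fts5(tool, body)")

def record(tool: str, full_output: str, max_chars: int = 200) -> str:
    """Store the full output in the index; return a short slice for the context."""
    db.execute("INSERT INTO outputs VALUES (?, ?)", (tool, full_output))
    truncated = len(full_output) > max_chars
    return full_output[:max_chars] + ("…" if truncated else "")

def search(query: str) -> list[str]:
    """Retrieve the full payload later if the summary missed a detail."""
    rows = db.execute(
        "SELECT body FROM outputs WHERE outputs MATCH ? ORDER BY rank", (query,)
    )
    return [r[0] for r in rows]

summary = record("git_log", "commit abc123 fix: handle SSL cert auto-detection\n" * 50)
print(len(summary))           # only this small slice enters the conversation
print(search("SSL")[0][:30])  # the full payload is still retrievable
```

Nothing is decided upfront: the model gets the cheap summary immediately and pays for specifics only when it actually asks for them.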


That's the theory and it does hold up in practice. When context is 70% raw logs and snapshots, the model starts losing track of the actual task. We haven't run formal benchmarks on answer quality yet, mostly focused on measuring token savings. But anecdotally the biggest win is sessions lasting longer before compaction kicks in, which means the model keeps its full conversation history and makes fewer mistakes from lost context.


> When context is 70% raw logs and snapshots, the model starts losing track of the actual task

Which frontier model will (re)introduce the radical idea of separating data from executable instructions?


That's a fair point and honestly the ideal approach. But in practice most people don't hand-curate their MCP server list per task. They install 5-6 servers and suddenly have 80 tools loaded by default. Context-mode doesn't solve the tool definition bloat, that's the input side problem. It handles the output side, when those tools actually run and dump data back. Even with a focused set of tools, a single Playwright snapshot or git log can burn 50k tokens. That's what gets sandboxed.


It doesn't break the cache. The raw data never enters the conversation history, so there's nothing to invalidate. A short summary goes into context instead of the full payload, and the model can search the full data from a local FTS5 index if it needs specifics later. Cache stays intact because you're just appending smaller messages to the conversation.
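Why append-only messages leave a prefix cache intact can be shown with a toy model of such a cache (hypothetical — real provider caches key on token prefixes, not hashes, but the invariant is the same):

```python
import hashlib

def prefix_key(messages: list[str]) -> str:
    """Toy cache key over a message prefix (stand-in for a provider's prompt cache)."""
    return hashlib.sha256("\x00".join(messages).encode()).hexdigest()

history = ["system: you are a coding agent", "user: run the tests"]
key_before = prefix_key(history)

# Append a short summary instead of the raw payload: the old prefix is
# byte-identical, so any cache entry keyed on it remains valid.
history.append("tool: summary — 3 tests failed (full log indexed for later search)")
assert prefix_key(history[:2]) == key_before
```

Rewriting or deleting earlier messages would change the prefix and invalidate the cache; appending smaller messages never does.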


Nice approach. Same core idea as context-mode but specialized for your build domain. You're using SQLite as a structured knowledge cache over YAML rule files with keyword lookup. Context-mode does something similar but domain-agnostic, using FTS5 with BM25 ranking so any tool output becomes searchable without needing predefined schemas. Cool to see the pattern emerge independently from a completely different use case.
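The schema-free FTS5-with-BM25 idea mentioned above is easy to demo directly in SQLite — any blob of tool output becomes searchable with relevance ranking, no predefined schema needed. The sample rows here are made up for illustration:

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE VIRTUAL TABLE cache USING fts5(content)")

docs = [
    "playwright snapshot: login page rendered, 3 buttons visible",
    "git log: fix login redirect bug in auth middleware",
    "npm audit: 2 moderate vulnerabilities in transitive deps",
]
db.executemany("INSERT INTO cache VALUES (?)", [(d,) for d in docs])

# In SQLite's FTS5, bm25() returns a score where lower means more relevant,
# and ORDER BY rank sorts best matches first using the same function.
rows = db.execute(
    "SELECT content, bm25(cache) FROM cache WHERE cache MATCH ? ORDER BY rank",
    ("login",),
).fetchall()
for content, score in rows:
    print(f"{score:.2f}  {content}")
```

Because the ranking comes from the FTS engine itself, arbitrary tool output can be dumped in as-is, which is what makes the approach domain-agnostic.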


That's true, Claude Code does truncate large outputs now. But 25k tokens is still a lot, especially when you're running multiple tools back to back. Three or four Playwright snapshots or a batch of GitHub issues and you've burned 100k tokens on raw data you only needed a few lines from. Context-mode typically brings that down to 1-2k per call while keeping the full output searchable if you need it later.

