Hacker News | prateekdalal's comments

Hi HN. Over the last 3 days, I shipped a major update to Kakveda (OSS). The goal: unify infrastructure monitoring, observability, Kubernetes visibility, profiling, DB monitoring, and RUM into a single open-source stack.

What's included in v1.0.5:

• Full host metrics (CPU, memory, disk, network, processes, load, temp)

• Docker metrics + diagnostics

• Kubernetes inventory (nodes, pods, deployments, services, configmaps, secrets)

• Realtime service map (dependency graph)

• APM error grouping + exception replay

• Service-level latency/error percentiles

• Continuous profiling + trace correlation

• Database monitoring (slow queries, fingerprints, explain support)

• RUM (web vitals, frontend errors, user activity)

• SLO / error budget diagnostics

• Cross-telemetry correlation

• Unified native agent (infra + APM + K8s)

Most vendors (Datadog, AppDynamics, etc.) provide these features across multiple paid modules. Kakveda provides them in OSS.

Would love feedback from infra/SRE folks on architecture and scalability considerations.

Repo: https://github.com/prateekdevisingh/kakveda


Over the last 48 hours we shipped a major update to our open-source observability + infra + AI runtime stack.

Big changes:

Renamed kakveda-aankh → kakveda-netra

Built as a real host-level agent (CLI + systemd support)

Background mode (--start, --stop, --bg-status)

Full infra metrics:

CPU, memory, disk, network, load

Docker container metrics

temperature, FD usage, diagnostics

Kubernetes inventory:

nodes, pods, deployments, services, configmaps, secrets

kubeconfig auto-detection

Dashboard upgrades:

golden signals

trend charts

thresholds + alert coloring

detailed drilldowns

Custom dashboards:

saved layouts

YAML import/export

Runtime control of host agent from dashboard

Cost comparison docs (Datadog, AppDynamics, Logz.io, LangSmith etc.)

The idea is simple: Infra monitoring + AI/agent governance + observability in one platform.

Still early. Would love feedback from infra engineers and K8s folks. GitHub: https://github.com/prateekdevisingh/kakveda


We’ve released v1.0.2 and v1.0.3 of Kakveda (open source AI governance runtime).

This release shifts integration from manual middleware to an SDK-first model.

Key change: instead of calling /warn and /publish manually, agents now integrate via:

    from kakveda_sdk import KakvedaAgent

    agent = KakvedaAgent()

    agent.execute(
        prompt="delete user records",
        tool_name="db_admin",
        execute_fn=real_function,
    )

The SDK handles:

Pre-flight policy checks

Event publishing

Trace ingestion

Dashboard registration

Heartbeat monitoring

Fail-closed behavior

Circuit breaker logic
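To make the fail-closed and circuit-breaker behavior above concrete, here is a minimal, self-contained sketch of that control flow. This is illustrative only, not the SDK's actual internals: every name here (`governed_execute`, `CircuitBreaker`, `PolicyError`) is hypothetical, and the real implementation may differ.

    # Illustrative sketch of fail-closed gating with a circuit breaker.
    # All names are hypothetical, not the real kakveda_sdk API.

    class PolicyError(Exception):
        """Raised when the governance layer cannot or will not approve an action."""

    class CircuitBreaker:
        """Opens after `threshold` consecutive failures, refusing further calls."""
        def __init__(self, threshold=3):
            self.threshold = threshold
            self.failures = 0

        @property
        def open(self):
            return self.failures >= self.threshold

        def record(self, ok):
            self.failures = 0 if ok else self.failures + 1

    def governed_execute(prompt, tool_name, execute_fn, policy_check, breaker):
        """Run execute_fn only after an explicit policy allow; fail closed otherwise."""
        if breaker.open:
            raise PolicyError("circuit open for %s" % tool_name)
        try:
            allowed = policy_check(prompt, tool_name)  # pre-flight policy check
        except Exception as exc:
            # Fail closed: an unreachable or erroring policy service blocks execution.
            breaker.record(False)
            raise PolicyError("policy check failed: %s" % exc)
        if not allowed:
            breaker.record(False)
            raise PolicyError("denied: %r for %r" % (tool_name, prompt))
        result = execute_fn()  # side effects happen only after an explicit allow
        breaker.record(True)
        return result

The key property is that every ambiguous path (breaker open, policy error, explicit denial) blocks execution; the tool function is never the default.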

Legacy manual helpers were removed to reduce integration friction.

The goal is to treat LLMs as suggestion engines while keeping execution inside a deterministic governance layer.

Would appreciate feedback from folks running multi-agent systems in production.


Over the past year, I’ve noticed something interesting in production AI systems:

Failures don’t just happen — they repeat.

Slightly different prompts. Different agents. Same structural breakdown.

Most tooling today focuses on:

Prompt quality

Observability

Tracing

But very few systems treat failures as structured knowledge that should influence future execution.

What if instead of just logging AI failures, we:

Store them as canonical failure entities

Generate deterministic fingerprints for new executions

Match against prior failures

Gate execution before the mistake repeats
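The four steps above can be sketched in a few lines. This is a toy illustration of the idea, not a production design: the canonicalization (drop parameter values, keep structural shape) and all names here are my own assumptions.

    # Sketch: canonical failure entities + deterministic fingerprints + a gate.
    # The canonicalization scheme and names are illustrative assumptions.
    import hashlib
    import json

    def fingerprint(tool_name, action, params):
        """Deterministic fingerprint of an execution's structural shape.

        Parameter values are dropped, so slightly different prompts that
        produce the same structural action collapse to one fingerprint.
        """
        canonical = json.dumps(
            {"tool": tool_name, "action": action, "param_keys": sorted(params)},
            sort_keys=True,
        )
        return hashlib.sha256(canonical.encode()).hexdigest()

    class FailureMemory:
        """Stores fingerprints of known failures and gates matching executions."""
        def __init__(self):
            self._known = {}

        def record_failure(self, fp, reason):
            self._known[fp] = reason

        def gate(self, fp):
            """Return (allowed, reason); block anything matching a prior failure."""
            if fp in self._known:
                return False, self._known[fp]
            return True, None

With this scheme, a destructive `delete` against one table and the same `delete` against another table share a fingerprint, so recording the first failure gates the second before it repeats.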

This changes the boundary between “AI suggestion” and “system authority.”

Curious how others are thinking about structured failure memory in AI systems — especially once agents start touching real tools.


Thanks for sharing — that article points at the same core tension. Determinism isn’t about rejecting probabilistic systems, it’s about deciding where uncertainty is allowed to live.

What keeps breaking in practice is when probabilistic reasoning leaks into places that expect reproducibility and accountability.


I mostly agree — using LLMs to help author deterministic code is a good fit. The distinction I’m trying to draw is that once that code exists, the determinism has to live outside the model.

LLMs can assist in creating the rules, but they shouldn’t be the place where those rules are enforced or bypassed at runtime.


I doubt ISO-9000 gets “replaced” so much as interpreted more strictly in the presence of LLMs. ISO-9000 isn’t about how work is done — it’s about whether processes are defined, repeatable, auditable, and improvable.

From that lens, LLMs actually create tension rather than an escape hatch. A system whose outputs can’t be reproduced, explained, or bounded makes it harder to demonstrate compliance, not easier. Saying “we use AI” doesn’t satisfy requirements around traceability, corrective action, or process control.

My guess is that ISO-style frameworks will push organizations toward explicitly classifying where LLMs are allowed to operate: as advisory inputs, as drafting aids, or as automation under defined controls — with clear ownership and validation steps around them.

In other words, the pressure probably won’t be to loosen standards, but to reassert them: define where probabilistic components sit, what checks exist before outputs become authoritative, and how failures are detected and corrected. Without that structure, it’s hard to see how certification survives unchanged.


I don’t disagree with the underlying concern. In practice, “probabilistic” often does translate to unreliable when you put these systems in environments that expect reproducibility and accountability.

Where I think the framing matters is in how we respond architecturally. Treating LLMs as “just another unreliable program” is reasonable — but enterprises already have patterns for dealing with unreliable components: isolation, validation, gates, and clear ownership of side effects.

The problem we’re seeing is that LLMs are often dropped past those boundaries — allowed to directly author decisions or actions — which is why the downstream damage you mention (journals, courts, OSS) feels so chaotic.

The “suggestion engine” framing isn’t meant to excuse that behavior; it’s meant to reassert a familiar control model. Suggestions are cheap. Execution and publication are not. Once you draw that line explicitly, you can start asking the same questions enterprises always ask: who approves, what’s logged, and what happens when this is wrong?

Without that separation, I agree — you’re effectively wiring an unreliable component straight into systems that assume trust, and the failure modes shouldn’t surprise anyone.


That sounds very aligned. I like the way you phrased it - deterministic policy that agents cannot bypass is exactly the right boundary, especially once you assume prompt injection and misalignment are not edge cases but normal operating conditions.

On the use case side, what we have been seeing (and discussing internally) isn’t one narrow workflow so much as a recurring pattern across domains: anywhere an LLM starts influencing actions that have irreversible or accountable consequences.

That shows up in security, but also in ops, infra, finance, and internal tooling - places where “suggesting” is fine, but executing without a gate is not. In those environments, the blocker usually isn’t model capability; it is the lack of a deterministic layer that can enforce constraints, log decisions, and give people confidence about why something was allowed or stopped.

Security tends to surface this problem first because the blast radius is obvious, but we are starting to see similar concerns come up once agents touch production systems, money, or compliance-sensitive workflows.

I am curious from your side — are you finding that security teams are more receptive to this model than other parts of the org, or are you still having to convince people that “agent autonomy” needs hard boundaries?


Yeah, you're right: security is ground zero. It's where "the LLM said it's fine" first stops being acceptable.

My worry: the industry is pushing "LLM guarding LLM" as the solution because it's easy to ship. But probabilistic defense like that won't work, and it creates systemic risk.

Would love to hear more about your use-cases. Email in bio if you're up for it.


This resonates a lot, and I think your example actually captures the core failure mode really well.

What your PM asked for isn’t an “agentic pipeline” problem - it’s an organizational knowledge and accountability problem. LLMs are being used as a substitute for missing context, missing ownership, and missing validation paths.

In a system like that (30+ years, COBOL, interdependent routines), the hardest parts are not parsing code — they are understanding why things exist, which constraints were intentional, and which tradeoffs are still valid. None of that lives in the code, and no model can infer it reliably without human anchors.

This is where I have seen LLMs work better as assistive tools rather than autonomous agents: helping summarize, cluster, or surface patterns — but not being expected to produce “the” design document, especially when there is no stakeholder capable of validating it.

Without determinism around inputs, review, and ownership, the output might look impressive but it’s effectively unverifiable. That’s a risky place to be, especially for early-career engineers being asked to carry responsibility without authority.

I don’t think the problem is that LLMs are not powerful enough — it is that they are often being dropped into systems where the surrounding structure (governance, validation, incentives) simply isn’t there.

