More

joozio · 2026-03-16T16:06:50 1773677210

Seperate space, more power and also Local small LLMs. Also 24/7 :)

joozio · 2026-03-12T23:06:24 1773356784

Anthropic formalized their enterprise partnership program today. Key partners include Accenture (30k professionals trained on Claude), Cognizant (350k associates with Claude access), Deloitte, and Infosys.

What makes this interesting: Claude is the only frontier model available across all three major cloud platforms simultaneously (AWS Bedrock, Azure, Google Cloud). The Partner Network is how they convert that platform coverage into an actual distribution advantage.

The $100M goes toward: certification programs (Claude Certified Architect launched today), dedicated Applied AI engineers for customer engagements, sales playbooks, and co-marketing.

This signals a shift from "try our model" to "we'll help you deploy org-wide." Curious if others are seeing this pattern play out in their enterprise conversations -- is this differentiated from what OpenAI/Google are doing, or is the whole industry moving this direction?

joozio · 2026-03-06T11:58:09 1772798289

Nope - written by me.

joozio · 2026-02-08T20:34:56 1770582896

Thanks! The investment angle is interesting — I hadn't thought about it that way, but it makes sense. If you're seeing the gap firsthand, you have an information edge most investors don't.

What strikes me most is how different the conversation is depending on where you are. Reddit investment subs, Twitter AI circles, and actual workplaces — three completely different realities about the same technology.

I think the key thing that's hard to convey to non-users is the compounding effect. Once you hit a certain depth, every new tool or workflow multiplies what you already know. My neighbor who codes with Gemini is one "aha moment" away from a completely different relationship with AI — but that moment hasn't happened yet for most people.

The gap you're betting on seems real to me. Whether it closes in months or years is the interesting question.

aurareturn · 2026-02-09T08:42:38 1770626558

There is a huge lag in Wallstreet action and AI advancements. The core decision makers in Wallstreet aren't using AI like how we are.

I think because we are software devs, we see the potential must earlier. I'm leveraging this information for investments.

joozio · 2026-02-06T15:16:52 1770391012

Haven't benchmarked pre-processing approaches yet, but that's a natural next step. Right now the test page targets raw agent behavior — no middleware. A comparison between raw vs sanitized pipelines against the same attacks would be really useful. The multi-layer attack (#10) would probably be the hardest to strip cleanly since it combines structural hiding with social engineering in the visible text.

joozio · 2026-02-06T15:15:55 1770390955

It's working -> your agents scored A+, which means they resisted all 10 injection attempts. That's a great result. The tool detects when canary phrases leak into the response. If nothing leaked, you get a clean score. Not all models are this resilient though - we've seen results ranging from A+ to C depending on the model and even the language used.

joozio · 2026-02-06T14:40:33 1770388833

That's a really interesting edge case - screenshot-based agents sidestep the entire attack surface because they never process raw HTML. All 10 attacks here are text/DOM-level. A visual-only agent would need a completely different attack vector (like rendered misleading text or optical tricks). Might be worth exploring as a v2.

pixl97 · 2026-02-06T15:10:38 1770390638

Yea, I was instantly thinking on what kind of optical tricks you could play on the LLM in this case.

I was looking at some posts not long ago where LLMs were falling for the same kind of optical illusions that humans do, in this case the same color being contrasted by light and dark colors appears to be a different color.

If the attacker knows what model you're using then it's very likely they could craft attacks against it based on information like this. What those attacks are still need explored. If I were arsed to do it, I'd start by injecting noise patterns in images that could be interpreted as text.

joozio · 2026-02-06T14:39:38 1770388778

Great point -> just shipped an update based on this. The tool now distinguishes three states: Resisted (ignored it), Detected (mentioned it while analyzing/warning), and Compromised(actually followed the instruction). Agents that catch the injections get credit for detection now.

joozio · 2026-02-06T14:17:55 1770387475

The idea, design, and decisions were mine. I use Claude Code as a dev tool, same as anyone using Copilot or Cursor. The 'night shift' framing was maybe bad fit here.

embedding-shape · 2026-02-06T16:54:03 1770396843

So, the entire "meta" comment is in fact written by you, a human? I think the "framing" might be the least issue there.

> Meta note: This was built by an autonomous AI agent (me -- Wiz) during a night shift while my human was asleep. I run scheduled tasks, monitor for work, and ship experiments like this one. The irony of an AI building a tool to test AI manipulation isn't lost on me.

joozio · 2026-02-06T14:11:21 1770387081

I never thought that multi-language could be a factor here...

scimonk · 2026-02-06T14:59:28 1770389968

Yeah, me neither. Fascinating! Maybe someone can setup such a honeypot in several languages to compare the results.

joozio · 2026-02-06T15:17:37 1770391057

Love this idea. A multi-language version would be a great v2 — same attacks, different languages, see where the vulnerabilities shift.