The following blog is an edited excerpt from a TrojAI webinar on security trends for 2026. The conversation features Lee Weiner, CEO at TrojAI; Rebekah Brown, Senior Researcher at Citizen Lab; and Tod Beardsley, Vice President of Security Research at runZero. The full webinar can be viewed on YouTube.
Attackers versus defenders
Lee Weiner: Where do you think the balance of power between attackers and defenders goes in 2026? Do you think it’s going to change?
Rebekah Brown: I think it’s always shifting. There’s always some new technology or innovation coming along, whether it’s something like Metasploit, or a new way to approach problems, like when threat intelligence first came onto the scene 15 years ago. That causes these little micro-shifts between attackers and defenders.
I think AI is going to make those shifts a little more extreme, but I still think it will be back and forth. I don’t think it’s going to swing wildly in the attacker’s direction, because even if attackers have new capabilities, defenders are doing the same thing. People are writing reports about that too.
I’ll be interested to see if this upsets the normal balance, but I expect business as usual: attackers and defenders both adjusting to new technologies and finding ways to use them to their own benefit.
Tod Beardsley: Totally agree. The expected value just moves, right? But the degree of the shifts feels bigger. It’s like, “AI is totally winning and we’re screwed forever,” then “Defense is winning and everything is blocked.” That swing in narrative is interesting.
But looking at AI, I think defense has a leg up right now and probably through 2026. The biggest thing LLMs are really good at is triage, which is wildly helpful for defenders.
Defenders have always suffered from alert fatigue. LLMs can help with triage. They will misclassify things, but that doesn’t mean an event gets dropped on the floor forever. It just gets triaged low. That’s not a total loss.
On the attack side, I think attackers have a long way to go before they can do much, with big asterisks around phishing and social engineering. Translation services are very good for attackers.
What AI gets attackers is scale. You can scale your garbage ridiculously high. So now, when someone looks for a proof-of-concept exploit, there are tons of them, and all but two are garbage. You get to figure out which two, and that costs defenders time.
Any headline bug almost instantly gets someone typing into ChatGPT, “Write an exploit,” and posting it to GitHub, untested and broken. So attackers can overwhelm the research side of defense, but defense wins on triage.
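As a rough sketch of the triage assist Tod describes, the snippet below classifies an alert and routes anything the model is unsure about to a review queue instead of dropping it. The call_llm helper, the prompt, and the confidence threshold are hypothetical placeholders for illustration, not any particular product’s API.

```python
import json

# Hypothetical stand-in for a real model call (OpenAI, Anthropic, a local model, etc.).
def call_llm(prompt: str) -> str:
    raise NotImplementedError("wire this to your LLM provider of choice")

TRIAGE_PROMPT = """You are a SOC triage assistant.
Classify the alert below as one of: critical, high, medium, low.
Respond with JSON: {{"severity": "...", "confidence": 0.0, "reason": "..."}}

Alert:
{alert}
"""

def triage_alert(alert: dict) -> dict:
    """Classify one alert; anything the model gets wrong is deprioritized, not dropped."""
    raw = call_llm(TRIAGE_PROMPT.format(alert=json.dumps(alert)))
    try:
        result = json.loads(raw)
    except json.JSONDecodeError:
        # Unparseable output means "needs a human", never "discard".
        return {"severity": "medium", "needs_human_review": True, "alert": alert}

    return {
        "severity": result.get("severity", "medium"),
        # Low-confidence classifications are queued for review rather than dropped.
        "needs_human_review": result.get("confidence", 0.0) < 0.7,
        "reason": result.get("reason", ""),
        "alert": alert,
    }
```

The design point matches Tod’s argument: a misclassified or low-confidence alert gets triaged low and flagged for review, not dropped on the floor forever.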
Lee Weiner: That’s not what I would have expected you to say, because there’s this prevailing thinking that attackers don’t have to comply with regulations or constraints, but defenders do.
The Anthropic disclosure
Lee Weiner: That’s a good segue to the Anthropic disclosure. For those who don’t know, a few weeks ago Anthropic published a disclosure about a nation-state actor using Anthropic’s Claude with a variety of agents to launch a large-scale attack. Did it change how you think about 2026?
Rebekah Brown: When I read it, I thought it was an interesting report. I’m really glad they published it, because I could imagine a company hesitating. People and regulators are already worried about AI, so publishing misuse of your product could feel risky.
From a threat intelligence perspective, it was useful to see what adversaries were doing and how they were using the model. I liked how they tied it back to their earlier reporting and showed the evolution and how quickly adversaries operationalized capabilities with agents.
I didn’t read it as groundbreaking or “everything is lost.” It’s a continuation of what attackers do: use the tools at their disposal. What was interesting was seeing their approach and what they were testing out. I can’t imagine Claude is the only model they were experimenting with. It was useful as a learning tool for defenders to adapt.
I know there was pushback on transparency, but taking it at face value, it seemed well researched and well presented.
Tod Beardsley: I was surprised they published it. Coming from an offensive security background at Rapid7, we were always careful about the idea that criminals use Metasploit too, and we didn’t put that in marketing material.
Anthropic went a different way. It felt kind of braggy. Like, “Hey, even criminals use our stuff! Isn’t that cool?” It read weird to me. There was a suspicious lack of IOCs (indicators of compromise).
And when you read it, it was something like: out of a target space of 30 organizations, they got shells on a couple. Cool. Maybe that’s acceptable for a state actor who can afford to be mostly wrong but a little right sometimes, but it still felt like a strange brag.
And ultimately, what they reveal is chaining together a bunch of open source tooling that everyone already has access to, things real pen testers already know how to do. It didn’t strike me as particularly novel.
Lee Weiner: We did a bunch of analysis on it and published a blog. For us, it was more about the fact that you can manipulate models to do bad things. We’ve done research around “abliterating” a model, removing its safety instructions, after which it will do really bad things. We’ve mostly done that with models like Qwen, so we can simulate attacks against other agents and models.
It was interesting for increasing visibility. People might understand models can be manipulated, and that sometimes results in small leaks, but other times you can orchestrate exploitation activity.
Tod Beardsley: I encourage attackers to try to use AI to write Metasploit modules and exploits, because AIs are bad at learning on the job. Training is expensive and happens offline. They’ll commit the same sins over and over, and you’ll see artifacts that are trivially easy to detect. So please, bad guys, use AI to write real exploits.
That doesn’t count for social engineering though. Translation and “culturalization” are huge. You don’t have to be a native English speaker, or a native Texan, to sound like a Texan through AI. And you can dial it to not sound cartoonish.
Rebekah Brown: One thing I thought about was: what if this was an intern who was tasked with “figure out which model you can do this with,” and they were just mucking around. Then suddenly there’s a whole report written about them. That’s me speculating, but I wonder how much of this is experimentation versus real operational intent. Maybe someday we’ll see court filings and learn more.
Fundamental changes in 2026
Lee Weiner: Both of you have a lot of foundational cybersecurity knowledge. How do you think about where we are today and where we’re going? With all this innovation, do you think we’ll see a fundamental change in 2026 that’s different from what we saw before?
Tod Beardsley: I think where the action is, is triage. You can get things out of that that were difficult before. One of the biggest complaints about CVEs and disclosure is that there’s not enough information to act. People want a machine-readable way to pin a CVE down to a vendor and product so they can understand whether they’re affected.
You can use LLMs to read vulnerability reports as long as you accept the error rate and there’s also a human review step. LLMs are good at classification. By the end of 2026, I think it would be weird to be in a shop that doesn’t have some kind of LLM assist for what’s the vulnerability, do I have it, what can I do, is there a patch? Basic stuff that you’d task a junior person to do.
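As a concrete sketch of that junior-analyst assist, the snippet below asks a model to pull vendor, product, and patch details out of an advisory and flags the result for human review before anyone acts on it. The call_llm function, the prompt, and the field names are assumptions for illustration, not a specific tool’s API.

```python
import json

def call_llm(prompt: str) -> str:
    """Hypothetical stand-in for whatever LLM endpoint your shop uses."""
    raise NotImplementedError

EXTRACT_PROMPT = """Read the advisory below and answer in JSON with the keys:
vendor, product, affected_versions, patch_available, summary.
If a field is not stated, use null.

Advisory:
{advisory}
"""

def summarize_advisory(advisory_text: str) -> dict:
    """Turn free-text advisory prose into the fields an analyst needs to act."""
    raw = call_llm(EXTRACT_PROMPT.format(advisory=advisory_text))
    try:
        fields = json.loads(raw)
    except json.JSONDecodeError:
        fields = {}

    # Accept the error rate: mark everything as machine-generated and route it
    # through a human review step before anyone acts on it.
    fields["machine_generated"] = True
    fields["human_reviewed"] = False
    return fields
```

The design point is the last two lines: output is labeled machine-generated and unreviewed by default, which keeps the human review step in the loop.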
Rebekah Brown: I think there’s a lot of opportunity. LLMs are good at foundational capabilities and knowledge, and they’re good at teaching. You can take complex books we used to slog through and ask an LLM to rewrite them in a way a 16-year-old will absorb. That can raise the bar, make tasks easier, and free people up to do deeper work.
But there’s a difference between “rephrase this complex subject” and “do my homework for me.” Models hallucinate. If they hallucinate and no one knows how to verify, then we’re in a bad position. As long as we use these as tools and don’t lose institutional knowledge of how to validate, we’ll be in a good position.
Tod Beardsley: The human review step is crucial. And that’s the part that people will screw up. They’ll lay off a big chunk of their security workforce, farm it out to AI, and then find out they needed those people and their expertise.
Rebekah Brown: We’ve seen this with cyber attacks against critical infrastructure, like some water treatment facilities. When they have to flip the switch back to manual operations, the people there remember how to do it. We’re not too far from a time when that knowledge won’t exist any longer. Then we could be in a bad position, especially if you’ve fired all those people and are relying on automation.
The expanding attack surface
Lee Weiner: Another angle: companies are deploying AI broadly. How do you both think about expansion of the attack surface? Hallucinations pose risk and so does autonomy.
Rebekah Brown: The more we trust AI and it becomes autonomous, making decisions on our behalf, the more the attack surface expands. You can start targeting those agents the same way you target users. They’re not infallible.
From attack surface to exploitation
Lee Weiner: What new attack paths or categories of exploits should people consider in 2026?
Tod Beardsley: It’s the “click carefully” problem. We tell people to click carefully, but I don’t know how to click carefully versus uncarefully, and I don’t know how to train people to do it. These chatbots were dumped on us fast. That’s how the internet works. The catch-up question is: how do we train people?
Rebekah Brown: More automation, especially with fraud and financial abuse. If you click and it triggers a series of agent-based actions with no human in the loop, it’s faster. I’m not sure I’m seeing entirely new categories, but I see speed and scale increasing, with more extreme impacts.
How TrojAI can help
TrojAI's mission is to enable the secure rollout of AI in the enterprise. TrojAI delivers a comprehensive security platform for AI. The best-in-class platform empowers enterprises to safeguard AI models, applications and agents both at build time and run time. TrojAI Detect automatically red teams AI models, safeguarding model behavior and delivering remediation guidance at build time. TrojAI Defend is an AI application and agent firewall that protects enterprises from real-time threats at run time. TrojAI Defend for MCP monitors and protects agentic AI workflows.
To learn more, request a demo at troj.ai.