What happens when AI goes rogue (and how to stop it)

We have seen AI morph from answering simple chat questions for school homework to attempting to detect weapons in the New York subway, and now to being implicated in the case of a convicted criminal who used it to create deepfaked child sexual abuse material (CSAM) out of real photos and videos, shocking the people in the (fully clothed) originals.

While AI keeps steamrolling forward, some are seeking to provide more meaningful guardrails to prevent it from going wrong.

We’ve been using AI in a security context for years now, but we’ve warned that it isn’t a silver bullet, partly because it gets critical things wrong. Even security software that “only occasionally” gets critical things wrong can have quite a negative impact, either by spewing out masses of false positives that send security teams scrambling unnecessarily, or by missing a malicious attack that looks “just different enough” from the malware the AI already knows about.

This is why we’ve been layering it with a host of other technologies to provide checks and balances. That way, if AI’s answer is akin to digital hallucination, we can reel it back in with the rest of the stack of technologies.
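To make the idea of checks and balances concrete, here is a minimal sketch of how a verdict from an AI model might be combined with non-AI layers. Everything in it is an assumption made for illustration: the `Signals` fields, the `layered_verdict` function and the thresholds are hypothetical and do not describe any particular product.

```python
from dataclasses import dataclass


@dataclass
class Signals:
    """Hypothetical inputs from three independent detection layers."""
    ml_score: float       # assumed model output in [0, 1]; higher means more suspicious
    signature_hit: bool   # assumed: the sample matched a known-malware signature
    allowlisted: bool     # assumed: the sample is on a trusted allowlist


def layered_verdict(s: Signals, block_at: float = 0.9, review_at: float = 0.5) -> str:
    """Combine the layers into one verdict. Thresholds are illustrative only."""
    if s.signature_hit:
        return "block"    # a known-bad match stands even if the model missed it
    if s.allowlisted:
        return "allow"    # a trusted sample reels in a likely false positive
    if s.ml_score >= block_at:
        return "block"
    if s.ml_score >= review_at:
        return "review"   # uncertain: hand off to a human or another layer
    return "allow"


if __name__ == "__main__":
    # The model is confident, but the allowlist overrides it.
    print(layered_verdict(Signals(ml_score=0.95, signature_hit=False, allowlisted=True)))
    # The model saw nothing unusual, but a signature caught it.
    print(layered_verdict(Signals(ml_score=0.10, signature_hit=True, allowlisted=False)))
```

The point of the ordering is that neither a model miss nor a model false positive decides the outcome on its own; a real stack would weigh many more signals than these three.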

Adversaries haven’t launched many pure AI attacks; it’s more accurate to think of adversarial AI as automating links in the attack chain to make them more effective, especially phishing, and now the voice and image cloning that extends phishing to supersize social engineering efforts. If bad actors can gain confidence digitally and trick systems into authenticating them with AI-generated data, that’s enough of a beachhead to get into your organization and begin launching custom exploit tools manually.

To stop this, vendors can layer on multifactor authentication, so attackers need multiple (hopefully time-sensitive) authentication factors rather than just a voice or a password. While that technology is now widely deployed, it is also widely underutilized by users. Turning it on is a simple way for users to protect themselves without a heavy lift or a big budget.
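As an illustration of what a time-sensitive factor can look like, here is a minimal sketch of a time-based one-time password in the style of RFC 6238 (TOTP), using only the Python standard library. It is a teaching example rather than a production implementation: the function names `totp` and `verify` are our own, and the 30-second step, six digits and SHA-1 are the commonly used defaults, assumed here.

```python
import hashlib
import hmac
import struct
import time
from typing import Optional


def totp(secret: bytes, at: Optional[float] = None, step: int = 30, digits: int = 6) -> str:
    """Derive a one-time code from a shared secret and the current time window."""
    now = time.time() if at is None else at
    counter = int(now // step)                      # index of the current time window
    digest = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = digest[-1] & 0x0F                      # dynamic truncation, as in RFC 4226
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def verify(secret: bytes, submitted: str, at: Optional[float] = None) -> bool:
    """Check a submitted code against the current window, in constant time."""
    return hmac.compare_digest(totp(secret, at), submitted)


if __name__ == "__main__":
    shared_secret = b"12345678901234567890"         # test secret from RFC 6238, not for real use
    code = totp(shared_secret)
    print(code, verify(shared_secret, code))
```

Because the code depends on the time window, a cloned voice or a stolen password is not enough by itself, and a code captured in one window stops working once the next window begins. Real deployments usually also tolerate a small clock drift and rate-limit guesses, which this sketch leaves out.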

Is AI at fault? When asked to justify cases where AI gets it wrong, people have simply quipped “it’s complicated”. But as AI gets closer to being able to cause physical harm and affect the real world, that is no longer a satisfying or adequate response. For example, if an AI-powered self-driving car gets into an accident, does the “driver” get a ticket, or the manufacturer? A court is unlikely to be satisfied by hearing how complicated and opaque the technology might be.

What about privacy? We’ve seen GDPR rules clamp down on tech-gone-wild as viewed through the lens of privacy. Certainly, AI that slices and dices original works to yield derivatives for gain runs afoul of the spirit of privacy, and would therefore trigger protective laws. But exactly how much does AI have to copy for its output to be considered derivative, and what if it copies just enough to skirt legislation?

Also, how would anyone prove it in court when case law is scant and will take years to be properly tested? Newspaper publishers are suing Microsoft and OpenAI over what they believe is high-tech regurgitation of articles without due credit; it will be interesting to see the outcome of the litigation, which is perhaps a foreshadowing of future legal actions.

Meanwhile, AI is a tool – and often a good one – but with great power comes great responsibility. The responsibility of AI’s providers right now lags woefully behind what’s possible if our new-found power goes rogue.