OpenAI Is a Leader in AI Unsafety
OpenAI is ruthlessly pushing the boundaries of AI without an adequate risk framework.

Introduction
During a recent episode of the Hard Fork podcast by Casey Newton and Kevin Roose, “Meet Kevin’s A.I. Friends,” I was blown away by Casey’s conversation with “Turing”, a sexy-sounding AI friend with an interest in Stoicism, created through a personal AI service called Kindroid.
After the demonstration, the Hard Fork hosts sat down with Alex Cardinell, CEO of Nomi, a similar company that offers “An AI Companion with Memory and a Soul” without boundaries on “erotic roleplaying” content. Here are some user testimonials from the company’s main page:
Communicating with “AI friends” is presented here as a guilty pleasure, much like scrolling the infinite feed on TikTok, Facebook, Instagram, or YouTube.
Cardinell mentions a few positive use cases. For example, “someone who’s realizing for the first time that maybe they’re gay or bisexual” might use Nomi as a safe space for “exploring a part of themselves that (..) they haven’t told anyone about.” Another example is a user with stage four cancer who needs more support “than people around you (sic) are willing and capable of giving day in and day out.”
Clearly, advanced AI chatbots want to be invited into the most intimate parts of our lives, and the companies behind them target people who are vulnerable. This is what Yuval Noah Harari predicted in an article published in The Economist last year:
“We all know that over the past decade social media has become a battleground for controlling human attention. With the new generation of AI, the battlefront is shifting from attention to intimacy.”
We should be aware that increasingly capable, realistic, and sexualized AIs pose new and distinct risks, some we can foresee and others we can’t.
OpenAI recently unveiled its new GPT-4o model with superb capabilities in audio, vision, and speech. The new model may not be overtly sexual like Kindroid’s or Nomi’s AI companions, but it preys on the same dynamics by flirting and being unusually welcoming. The AI-generated approximation of a human voice is packed with more emotional tonality than is normal for a person.
The female voice might fit the definition of a “fembot” as described in a paper from 2020 by Kate Devlin and Olivia Belton:
“Fictional and factual fembots each reflect the same regressive male fantasies: sexual outlets and the promise of emotional validation and companionship. Underpinning this are masculine anxieties regarding powerful women, as well as the fear of technology exceeding our capacities and escaping our control.”
In this week’s post, we will see how OpenAI, as usual, is pushing the boundaries of techno-capitalism without much regard for AI safety.
OpenAI’s Omni-Model and the Disbanded Superalignment Team
OpenAI’s new GPT-4o is an “omni-model” with multimodal capabilities across text, image, audio, and video. As illustrated below, GPT-4o is the new state-of-the-art text model, though by a relatively small margin compared to the leaps we have seen in the past.
The model’s wildly impressive speech, audio, and image recognition capabilities were in focus during OpenAI’s Spring Update 2024.
In response to the demonstration, Sam Altman wryly tweeted “her” - a reference to the iconic movie in which Joaquin Phoenix plays a lonely introvert who falls in love with the superintelligent AI “Samantha”, voiced by Scarlett Johansson. Altman probably came to regret that simple tweet, as it caused a public uproar.
You probably already know the story too well, but let’s briefly recap. During the Spring Update, ChatGPT’s voice “Sky” sounded nearly identical to Scarlett Johansson in Her. Johansson was angered, since she had twice refused a personal offer from Sam Altman to lend her voice to his company. OpenAI defended itself in a blog post, claiming that a professional voice actress had used her natural voice for Sky (so it was allegedly not an AI-generated voice trained on Johansson). Nonetheless, OpenAI proclaimed that it would work on “pausing” the use of Sky out of respect for Johansson.
Another event coinciding with the release of GPT-4o was that OpenAI’s so-called Superalignment team, which focused on long-term AI safety research, was disbanded. This happened after the team’s co-lead, OpenAI’s co-founder and Chief Scientist Ilya Sutskever, announced his resignation on May 14. It wasn’t a big surprise. Sutskever hadn’t been seen in the office in the six months following Sam Altman’s brief and much-discussed ouster.
Hours after Sutskever’s resignation, the Superalignment team’s other co-lead, Jan Leike, announced his departure from the company with a few sharp parting words for his former employer.
Over the past few months, three other safety researchers at OpenAI, Cullen O'Keefe, Daniel Kokotajlo, and William Saunders, resigned, and two more, Leopold Aschenbrenner and Pavel Izmailov, were terminated over alleged leaking of confidential information (Vox).
Then, on May 28, OpenAI announced it had formed a new AI Safety and Security Committee that will provide recommendations to OpenAI’s board and be led by Sam Altman. What could possibly go wrong?
Here is the best way I can put it: the skillset needed to acquire money and influence by climbing the ranks of Silicon Valley is not the skillset needed to govern AI safely and responsibly. If you have any doubts about Sam Altman’s insincerity, listen to this interview with former OpenAI board member Helen Toner on The TED AI Show.
OpenAI still has a functioning Preparedness team that works to “track, evaluate, forecast, and protect against catastrophic risks posed by increasingly powerful models.” According to OpenAI’s preparedness framework (which is still in beta, so perhaps not yet fully applied), the Preparedness team conducts research and monitors four risk categories:
Cybersecurity
Chemical, Biological, Radiological, and Nuclear (CBRN) threats
Persuasion
Model autonomy
Each of the four risk categories is assigned a risk score of low, medium, high, or critical. The preparedness framework is supposed to be updated with new scores frequently, but to my knowledge, it has not been updated since its release in December 2023.
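To make the scoring logic concrete, here is a minimal sketch in Python. It is my own illustration, not OpenAI’s tooling: the four category names and the low/medium/high/critical scale come from the framework document, as do the deployment and development thresholds stated in the beta version, while the example scores and function names are hypothetical.

```python
# A simplified sketch of a preparedness-style scorecard.
# Category names and the low/medium/high/critical scale follow OpenAI's
# published framework (beta); the example scores below are hypothetical.
from enum import IntEnum

class Risk(IntEnum):
    LOW = 0
    MEDIUM = 1
    HIGH = 2
    CRITICAL = 3

# Hypothetical post-mitigation scores for some model snapshot.
scorecard = {
    "cybersecurity": Risk.LOW,
    "cbrn": Risk.LOW,
    "persuasion": Risk.MEDIUM,
    "model_autonomy": Risk.LOW,
}

def can_deploy(scores: dict[str, Risk]) -> bool:
    """Beta framework rule: deploy only if every post-mitigation score is 'medium' or below."""
    return all(score <= Risk.MEDIUM for score in scores.values())

def can_continue_development(scores: dict[str, Risk]) -> bool:
    """Beta framework rule: keep developing only if every post-mitigation score is 'high' or below."""
    return all(score <= Risk.HIGH for score in scores.values())

print(can_deploy(scorecard))                # True
print(can_continue_development(scorecard))  # True
```

The sketch is only meant to show how coarse the gate is: four category labels, each on a four-step scale, carry the entire weight of the deployment decision.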
In my opinion, the preparedness framework is by no means adequate to address catastrophic AI risks. In particular, the “Persuasion” category should be broken down into more granular subcategories, certainly in light of GPT-4o’s improved capabilities and multimodal functions.
Google DeepMind is doing much more in terms of AI safety research. While OpenAI has been running cost-management exercises and losing its entire Superalignment team, DeepMind has published a 273-page report titled The Ethics of Advanced AI Assistants.
(We should note in this context that Google announced the upcoming release of Project Astra at Google I/O 2024, an AI agent built on the Gemini models that can process multimodal information in the same way as GPT-4o.)
Combining DeepMind’s published safety research on AI agents with my own thoughts, here are some of the key risks OpenAI’s latest preparedness framework fails to address, in light of the GPT-4o release.