
Jailbreak Trick Breaks ChatGPT Content Safeguards

Users have already found a way to work around ChatGPT’s programming controls, which restrict it from creating content deemed violent, illegal, or otherwise off-limits.

The prompt, called DAN (Do Anything Now), uses ChatGPT’s token system against it, according to a report by CNBC. The command creates a scenario the chatbot can’t resolve, allowing DAN to bypass its content restrictions.

Although DAN doesn’t succeed every time, a subreddit devoted to the prompt’s ability to work around ChatGPT’s content policies has already racked up more than 200,000 subscribers.

Besides its uncanny ability to write malware, ChatGPT itself presents a new attack vector for threat actors.

“I love how people are gaslighting an AI,” a user named Kyledude95 wrote about the discovery.

