In a paper from the AI lab Anthropic, which produces the large language model (LLM) behind the ChatGPT rival Claude, researchers described an attack they called "many-shot jailbreaking". It is as simple as it is effective.
Claude, like most large commercial AI systems, contains safety features designed to encourage it to refuse certain requests, such as to generate violent or hateful speech, or produce instructions for illegal activities. A user who asks the system for instructions to build a bomb, for example, will receive a polite refusal to engage.
This story is from the April 04, 2024 edition of The Guardian.