Who owns the data used to train AI?
PC Pro|September 2023
Elon Musk says he owns it. Twitter's Ts & Cs suggest otherwise. James O'Malley investigates who really owns the data being used to train AI
James O'Malley

For decades, rocket science and brain surgery have been cited as fields of endeavour that present almost unimaginable levels of complexity. Now we might want to add another tricky job to the list: managing Twitter.

Since Elon Musk dropped $44 billion and took control of Twitter at the end of last year, it hasn't gone well. The CEO - who, let's not forget, is heavily invested in both rocket and neural science - has seen the value of the social network plummet. One study found that more than half of Twitter's top 1,000 advertisers have given up on the platform since his takeover.

The stress is starting to show. When Microsoft announced that it would be pulling advertising from the platform, reportedly because it refused to pay hiked API-access fees, Musk responded with a tweeted threat: "They trained illegally using Twitter data. Lawsuit time."

His argument is that AI models such as the ones created by Microsoft and its partner OpenAI, the firm behind ChatGPT, were getting a free ride on Twitter's data. Large language models (LLMs) that power AI tools such as ChatGPT have been "trained" on text taken from across the internet. This could conceivably have included data from Twitter.

Now Musk wants his pound of flesh. But who really owns data once it's out on the internet? Does Musk have any right to lay claim to it? The answer, you'll be shocked to hear, is complicated.

Scrapes of wrath

"There are so many variables that help to answer whether a specific scraping act is legal or illegal," said Denas Grybauskas, head of legal at web intelligence collection firm Oxylabs.

His company specialises in writing scrapers - software and tools that automate the work of downloading the contents of a website or individual web page, then extracting and organising the data. It's the equivalent of saving a web page on your computer, but automated and performed at mass scale.
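To make the idea concrete, here is a minimal sketch of what a scraper does, using only Python's standard library. It is an illustration under stated assumptions, not how Oxylabs or any AI firm actually collects data: the names `download`, `PageExtractor` and `extract` are hypothetical, and a real scraper would also need to respect a site's terms, robots.txt and rate limits.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen


def download(url):
    """Step 1: fetch the raw HTML of one page (hypothetical helper)."""
    with urlopen(url, timeout=10) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


class PageExtractor(HTMLParser):
    """Step 2: pull the title, visible text and links out of the HTML."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.title = ""
        self.links = []
        self.text_parts = []
        self._in_title = False
        self._skip_depth = 0  # non-zero while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag in ("script", "style"):
            self._skip_depth += 1
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Turn relative links into absolute ones.
                self.links.append(urljoin(self.base_url, href))

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        chunk = data.strip()
        if not chunk:
            return
        if self._in_title:
            self.title += chunk
        elif not self._skip_depth:
            self.text_parts.append(chunk)


def extract(html, base_url):
    """Step 3: organise what was found into a simple record."""
    parser = PageExtractor(base_url)
    parser.feed(html)
    parser.close()
    return {
        "title": parser.title,
        "text": " ".join(parser.text_parts),
        "links": parser.links,
    }
```

Calling `extract(download(url), url)` on a single address gives one record; feeding the collected links back in, over and over, is roughly what "mass scale" means here.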
