
It turns out you can train AI models without copyrighted material

AI companies claim their tools couldn’t exist without training on copyrighted material. It turns out they could; it’s just really hard. To prove it, AI researchers trained a new model that’s less powerful but much more ethical. That’s because the LLM’s dataset uses only public domain and openly licensed material.

The paper (via The Washington Post) was a collaboration between 14 different institutions. The authors represent universities like MIT, Carnegie Mellon and the University of Toronto. Nonprofits like Vector Institute and the Allen Institute for AI also contributed.

The group built an 8 TB ethically sourced dataset. Among the data was a set of 130,000 books in the Library of Congress. They then trained a seven-billion-parameter large language model (LLM) on that data. The result? It performed about as well as Meta’s similarly sized Llama 2-7B from 2023. The team didn’t publish benchmarks comparing its results to today’s top models.

Performance comparable to a two-year-old model wasn’t the only downside. The process of putting it all together was also a grind. Much of the data couldn’t be read by machines, so humans had to sift through it. “We use automated tools, but all of our stuff was manually annotated at the end of the day and checked by people,” co-author Stella Biderman told WaPo. “And that’s just really hard.” The legal details added to the difficulty: the team had to determine which license applied to each website they scanned.

So, what do you do with a less powerful LLM that’s much harder to train? If nothing else, it can serve as a counterpoint.

In 2024, OpenAI told a British parliamentary committee that such a model essentially couldn’t exist. The company claimed it would be “impossible to train today’s leading AI models without using copyrighted materials.” Last year, an Anthropic expert witness added, “LLMs would likely not exist if AI firms were required to license the works in their training datasets.”

Of course, this study won’t change the trajectory of AI companies. After all, more work to create less powerful tools doesn’t jibe with their interests. But at least it punctures one of the industry’s common arguments. Don’t be surprised if you hear about this study again in legal cases and regulation arguments.
