Concerns are Growing over Meta's Use of Unpublished Books to Train AI Models

Tech giant accused of using unpublished books to train its AI models without authors' consent.

Mar 31, 2025

Factors of Copyright

Last week, The Atlantic released a tool for searching LibGen, a database that Meta allegedly used to train its artificial intelligence (AI) models. The tool has sparked widespread concern because the database contains numerous unpublished works.

Copyright Issues

One writer, Maris Kreizman, revealed in an article for Literary Hub that her forthcoming essay collection appears in the database. Kreizman stated that the collection is not due for publication until July 1st, yet Meta had already accessed it and used it to train its AI models. This was shocking because the use of a book before its publication is extremely rare in the publishing industry.

Digital copies of unpublished works are generally available on legitimate platforms such as NetGalley and Edelweiss, which have strict terms and conditions governing their use.

More to Come

As artificial intelligence technology advances, more creators worry that their work is being used without authorization. Kreizman's discovery is not an isolated incident; it has sparked a broader conversation about creators' rights and intellectual property, and about how to protect them in the rapidly evolving landscape of AI.

For many authors and creators, the concern goes beyond the use of their unpublished work itself: they see the practice as disrespectful to their creative labor and potentially damaging to their careers. The incident also raises questions about the sources of data Meta uses to train its AI, particularly whether they meet legal and ethical standards.