Almost 200,000 books are being used to train artificial intelligence systems by some of the biggest companies in technology. The problem? No one told the authors.

The system is called Books3, and according to an investigation by The Atlantic, the data set is based on a collection of pirated e-books spanning all genres, from erotic fiction to prose poetry. Books help generative AI systems with learning how to communicate information.

Some AI training text can be pulled from articles that are posted on the internet, but high-quality AI requires high-quality text to absorb language from, according to The Atlantic, which is where books come in. Books3 is already the subject of multiple lawsuits against Meta and other companies using the system to train AI.

Now, thanks to a database published by The Atlantic last week pulling from Books3, authors can see whether their books specifically are being used to train these AI systems. And many are not happy.

“I’m completely gutted and whipsawed. I am outraged and at the same time feel utterly helpless,” wrote Mary H. K. Choi on social media, upon discovering her work was being used. “I’m furious and want to fight but I’m also so tired.”

Choi, whose debut novel “Emergency Contact” appeared in the database, further explained her feelings in an email. The book, which centers on a young Korean-American woman navigating a new relationship, was “deeply personal,” and Choi was initially told her story was “too quiet and niche.” The book later went on to become a New York Times bestseller, and found audiences around the world.

“A book encapsulates infinite choices, boundless permutations and even shortcomings of the author at the time. To think that all this life can be chucked into a vast churning pool to be extruded into a giant algorithmic, generative sausage machine reduces so much so swiftly,” she said. “Not just financially for the authors but it beggars booksellers, librarians, and readers from so many intimacies.”

Min Jin Lee, author of novels “Pachinko” and “Free Food for Millionaires,” expressed similar thoughts on social media, bluntly calling the use of her books “a theft.”

“I spent three decades of my life to write my books,” she said. “The AI large language models did not ‘ingest’ or ‘scrape’ ‘data.’ AI companies stole my work, time, and creativity. They stole my stories. They stole a part of me.”

Nora Roberts, the prolific romance novelist, has 206 books used in the Books3 database, according to The Atlantic. That number is the highest by any living author, and second only to William Shakespeare. She called the database, and its use by tech companies, “all kinds of wrong.”

“We are human beings, we are writers, and we’re being exploited by people who want to use our work, again without permission or compensation, to ‘write’ books, scripts, essays because it’s cheap and easy,” Roberts said in a statement to CNN.

That exploitation of writers didn’t shock author Nik Sharma, whose cookbook “Season” was found in the database.

“I’m horrified but not surprised that I’d be taken advantage of,” he said in a social media post. “Obviously, I wasn’t even asked for permission or received any compensation for the use of my work to train AI.”

AI is inevitable, Sharma said later in an email, hence his lack of surprise. What was most aggravating, he said, is that no one was contacted about usage or payment. After all, education isn’t free in the US, he said; teachers are paid, and textbooks are bought.

“It’s the Wild West right now with AI, and governmental policy on this is in its infancy,” Sharma said. “And consequently, tech companies are taking full advantage while they can. I’m glad it was just one cookbook and not my others.”

Meta, which has used the Books3 database according to The Atlantic, did not respond to a request for comment.

A spokesperson for Bloomberg noted in a statement that the company had “used a number of different data sources,” including Books3, to train its initial BloombergGPT model, an AI model for the financial industry. But, according to the spokesperson, Bloomberg will “not include the Books3 dataset among the data sources used to train future commercial versions of BloombergGPT.”

Not every author is upset about their work being used by AI. James Chappel, whose academic book on the modern Catholic church was used in the database, said on social media that he doesn’t “care at all.”

“I want my book to (be) read!” he wrote. “I want it to educate!”

Chappel did not respond to requests for further comment.

AI, in the hands of large corporations, has morphed into a significant concern for many writers. The Writers Guild of America went on strike this summer in part to demand limits on using AI in writing films and television shows. ChatGPT in particular has been used for everything from writing assignments to legal briefs.

Writers aren’t alone in their concerns. With the popularity of text-to-image AI systems, visual artists were in the same situation last year, discovering their work was being used to train AI without permission. Together, both instances highlight concerns around AI’s increasing reach into all forms of art, where work can sometimes be intensely personal or intimate.

The conversation raised by Books3 comes just as US President Joe Biden announced plans to introduce an executive order on AI this fall, saying that the country will lead “the way toward responsible AI innovation.”

For writers, though, the constant battles surrounding AI and their work can be deflating. For Choi, discovering her book had been used in the midst of the WGA strike, in which AI was a hotly debated subject, was “surreal.”

“I was gutted,” she said over email. “It truly felt as though any gains or traction there was to be made in one arena could be so handily wiped out in another.”

And still, Choi said she knows her book, in the midst of thousands of others, is “insultingly inconsequential,” despite its importance to her.

“I think the part that sucks most profoundly about all of it is that in my more hopeless moments it all feels absolutely inevitable,” she said.

Choi isn’t alone in that feeling of inevitability. Roberts called for unity among writers and audiences alike to combat these issues.

“We who create stories need to unite to fight this abuse of our talent and hard work,” she said. “We need to stand for our work, and each other’s work. I hope readers and viewers stand with us on this vital issue.”