Publisher warning: Google Play Books and AI models

Right now, we are witnessing a gigantic refocusing of publisher and tech interest on artificial intelligence, thanks to the success of new AI platforms like OpenAI and Stable Diffusion. For these platforms to work, they need data to create language models. Lots of data. As a publisher, I fear that Google Play Books may become part of a giant AI modeling scheme, either for Google’s own AI training efforts or for use by third parties.

Google has never been a friend to book publishers. To participate in Google Play Books, publishers have to agree to a minimum 20% content sample being made available in Google Books:

To get words to be a possible match for Google Books searches, we scan the full text of your book. If a user searches a word that appears on a page of your book, your book can be listed in the search results, even if the title doesn’t contain the word. When you make your book fully searchable, it won’t be fully visible. You can control the percentage of your book’s pages that users can browse. …

To protect your content, Google limits the number of pages users can find. You can choose to make 20% to 100% of your book’s content browseable.

This always rubbed me the wrong way. It wasn’t just giving away content for Google’s benefit. I was also sensitive to scrapers and pirates taking these bits and pieces and remixing and republishing them without permission or licensing fees.

So why go along with Google Play? I wanted my company’s ebooks to have another outlet besides Amazon’s Kindle platform. Still, after 2019, I did not upload any new books or new editions to Google Play for the reasons listed above.

Now, as Big Tech races to monetize new AI language models, they’re looking for quality content. Last year, a senior Google executive called AI-generated content “spam.” This means they will be casting about for other sources of original content that can help train their AI models. They already have access to Google Play Books/Google Books, YouTube transcripts, and other sources of original text that content creators and copyright holders have granted rights to.

Deactivating titles on Google Play Books

I do not want my company’s books to be used by Google to train its AIs or the AI models of partner companies unless my company is fairly compensated. For this reason, I have withdrawn 15 out of 17 ebooks that were still available on Google Play Books. The titles include my book Lean Media, which is still available for sale on other platforms.

I will also be monitoring ongoing legal cases relating to GitHub Copilot (an AI code generator) and Stable Diffusion, in which content creators and publishers were not asked for permission before their copyrighted content was used for AI model training.

This is big, people. Watch this space.
