OpenAI is in talks with numerous publishers to license their articles, expanding its efforts to acquire content for training its artificial intelligence models. Tom Rubin, head of intellectual property and content at OpenAI, said that negotiations with several publishers are ongoing and going well.

OpenAI has already signed a major agreement with Axel Springer SE, the parent company of Politico, and an agreement with the Associated Press whose terms were not disclosed. These deals help the company meet its need for fresh and accurate data. This matters because data is now one of the main sources of success in the field of AI.

However, OpenAI faces a major legal challenge from The New York Times, which has sued OpenAI and Microsoft for using its articles without permission. This lawsuit could have serious consequences for OpenAI, potentially leading to huge financial liabilities and a requirement to remove any training data obtained from The New York Times, which is a difficult task. The legal battle also complicates OpenAI's negotiations with other media outlets.

Rubin emphasizes the difference between OpenAI's use of content to train AI and the traditional reproduction or replacement of content by search engines and social networks. The New York Times, by contrast, accuses OpenAI of directly copying its content via ChatGPT, citing examples of the AI generating text that closely resembles Times articles. OpenAI disputes this claim. The New York Times insists that commercial use of its work requires its permission, which it says OpenAI and Microsoft have not received.

In my opinion, the case brought by the Times against OpenAI and Microsoft could be a key event in modern copyright law. Whatever the outcome, the decision will set a precedent. A ruling in favor of the Times would seriously slow the development of AI, because acquiring licenses requires long and difficult negotiations, while a dismissal would undermine the entire current system of copyright protection.

Data is currently the main resource in AI: whoever has the data makes the decisions. That is why OpenAI, Apple and other companies developing artificial intelligence systems are trying to negotiate with the main content owners, the publishers. It is quite possible that we will soon see lawsuits over the rights to images, videos and songs as well, since these modalities are also actively used in AI right now.
