OpenAI wants to license articles from a dozen publishers

OpenAI is in talks with numerous publishers to license their articles, expanding its efforts to acquire content for training its artificial intelligence models. Tom Rubin, head of intellectual property and content at OpenAI, said that negotiations with several publishers are ongoing and going well.

OpenAI has already signed a major agreement with Axel Springer SE, the parent company of Politico, and an agreement with the Associated Press whose terms were not disclosed. These deals give the company access to the fresh and accurate data it needs. This matters because data is now one of the main drivers of success in AI.

However, OpenAI faces a major legal challenge from The New York Times, which has sued OpenAI and Microsoft for using its articles without permission. The lawsuit could have serious consequences for OpenAI, potentially leading to huge financial liabilities and a requirement to remove any training data obtained from The New York Times, which would be a difficult task. The legal battle also complicates OpenAI's negotiations with other media outlets.

Rubin emphasizes the difference between OpenAI's use of content to train AI and the traditional reproduction or replacement of content by search engines and social networks. The New York Times, by contrast, accuses OpenAI of directly copying its content via ChatGPT, citing examples of the AI generating text that closely resembles Times articles; OpenAI disputes this claim. The New York Times insists that commercial use of its work requires permission, which it says OpenAI and Microsoft have not received.

In my opinion, the Times's lawsuit against OpenAI and Microsoft could be a key event in modern copyright law. Whichever way the court rules, the decision will set a precedent. If the claim is upheld, AI development will slow considerably because of long and difficult licensing negotiations; if it is dismissed, the entire current system of copyright protection will be undermined.

Currently, data is the main resource in AI: whoever has the data sets the terms. That is why OpenAI, Apple and other companies developing artificial intelligence systems are trying to negotiate with the main content owners, the publishers. It is quite possible that in the near future we will also see lawsuits over the rights to images, videos and songs, since these modalities are also actively used in AI right now.
