New York Times vs OpenAI-Microsoft is the clash of the decade

February 3, 2024

The legal battle between the New York Times, OpenAI, and Microsoft could prove decisive in defining the boundaries within which companies can train generative artificial intelligence software. The issue goes beyond the parties involved and, precisely because it pits giants of their respective sectors against each other, it also represents the duel between the two fronts that have arisen in the wake of AI’s rapid expansion: the enthusiasts of the new technology on the one hand, and on the other those who want to put a brake on the development of GenAI in order to first understand what limits should be set to regulate the phenomenon.

The stakes are high because, whichever way the judges decide, the verdict could upend the world of information: it could either transform how publishers and journalists plan and carry out their work, or overturn the method by which OpenAI and its competitors train tools to respond quickly and accurately (at least in theory) to user requests.

Side effects of ChatGPT

In the lawsuit filed in federal court in Manhattan, the New York Times does not specify a figure but seeks billions of dollars in compensation for the drop in site visits and lost subscriptions, a decline it links to the results generated by ChatGPT, which draws on the newspaper’s articles to respond to user requests.

This activity is repeated non-stop, and OpenAI and Microsoft (which invested $13 billion in the company led by Sam Altman and implemented GPT-4’s technology in the Bing search engine and the chatbot Copilot) carry it out without paying a fee to the Times, which in turn invests a lot of money to produce the content it publishes on its site.

Hence the copyright infringement complaint, which claims that ChatGPT and Copilot were trained on millions of the newspaper’s articles and now compete with it as a reliable source of information, taking audience away from the New York newspaper itself.

In support of this, the Times adds a long series of examples. One involves Browse With Bing (a feature powered by OpenAI technology), which returned many answers replicating articles from Wirecutter, the New York Times review site. Unlike Google searches, however, in this case Microsoft does not cite Wirecutter and strips out the referral links to e-commerce sites that earn the source a commission on sales and clicks. ‘Less traffic and less revenue’ is the newspaper’s complaint in the lawsuit filed against OpenAI and Microsoft.

The Times highlights another point, which concerns the false information chatbots include in their replies: specifically, replies containing untrue information that the GenAI software attributes to the newspaper. Here too the Times cites several examples and, pointing out that people tend to be satisfied with what the chatbots offer without visiting the sites of the sources that produced the news, argues for the need to ‘protect independent journalism so as not to undermine journalism itself’.

An agreement that doesn’t sink anyone

Having set out the various contested acts, the New York Times holds OpenAI and Microsoft ‘liable for copyright infringement and billions of dollars in statutory and actual damages’. It therefore asks the judges to prevent the two companies from training GenAI models on its content and to force the removal of its texts from the databases used to train ChatGPT and Copilot.

“We are surprised and disappointed by the New York Times lawsuit; we have proceeded constructively and respect the rights of content creators and owners, with whom we are committed to working together to ensure benefits from AI technology through new revenue models,” said Lindsey Held, spokesperson for OpenAI. It should be recalled that the lawsuit came after negotiations between the parties failed, and after the New York Times blocked OpenAI’s web crawler (as did the BBC, CNN and Reuters) to prevent the company from collecting its content to train ChatGPT.

The feeling is that we are facing the first act of a long and decisive duel that will determine the future of journalism and of GenAI models. Two aspects stand out. Stopping the evolution of technology is almost impossible, as alternative ways would be found to keep spreading innovation. At the same time, completely bypassing those who strive to produce quality information cannot be accepted. An agreement is therefore needed.