Copyright lawsuit could create new licensing option for media industry

By Tony Poland, LegalMatters Staff • A lawsuit claiming copyright infringement against ChatGPT creator OpenAI could provide the media industry with a valuable new revenue stream if successful, says Toronto intellectual property lawyer John Simpson.

The joint lawsuit, launched by Canadian news outlets including the Toronto Star, The Globe and Mail, CBC and Canadian Press, was filed late last year in the Ontario Superior Court of Justice.

“There are a number of pending lawsuits relating to copyright actions taking aim at generative artificial intelligence (AI) but this is certainly one to watch,” says Simpson, principal of IP boutique Shift Law Professional Corporation. “ChatGPT doesn’t represent any sort of existential threat to these media operations, at least in an obvious way. However, what is apparent when reading this claim is that the media sees a lost opportunity to license their content to OpenAI for the purposes of training its GPT models.

“To me, that is really the crux of the case,” he tells LegalMattersCanada.ca. “They are arguing there is a huge licensing opportunity that OpenAI is depriving them of because rather than asking for permission to use content for training purposes they are just taking it. On a fundamental level, media organizations are saying they have something of value and should be entitled to charge for that.”

Did not have a valid licence

According to the statement of claim, “OpenAI has capitalized on the commercial success of its GPT models, building an expansive suite of GPT-based products and services, and raising significant capital – all without obtaining a valid licence from any of the News Media Companies.”

“In doing so, OpenAI has been substantially and unjustly enriched to the detriment of the News Media Companies,” the statement adds. “The News Media Companies accordingly seek damages and/or disgorgement to compensate them for the wrongful misappropriation of their Works, as well as permanent injunctive relief to prevent OpenAI from continuing with its unlawful conduct.”

An OpenAI spokesperson told CBC News that its models are trained on publicly available data and that the company is “grounded” in international copyright principles.

In an interview with CBC, media and technology researcher Richard Lachman said companies such as OpenAI maintain “it’s not off-base to use publicly available news articles to train an artificial intelligence system.”

“The argument of the companies is, ‘We’re essentially reading the news that was on a public website. That’s not illegal. A human can read the news,’” Lachman, an associate professor at Toronto Metropolitan University’s RTA School of Media, told CBC. “Of course, [media] companies push back and say, ‘You’re not reading the news, you are scraping information. And that’s against our terms of service.’”

‘Arguably value in the content’

Simpson, who is not involved in the case but comments generally, explains there “is arguably value in the content the media companies provide beyond reporting the news to a human audience.”

“The fact that AI is allegedly taking this information and using it for training purposes arguably proves that worth,” he says.

The media outlets are claiming they deserve compensation because of the appreciable effort that goes into gathering the information being used.

“The Works are the exercise of significant skill and judgment. Each Work represents the product of substantial collective expertise, talent, research, and investment. Each Work is researched, written, and edited by journalists and contributors, who are trained and compensated,” the statement of claim states. “The Works are fact checked for accuracy and fairness, before being readied for publication in accordance with the design and marketing standards of their respective publication.

“The journalistic process also involves necessary support from a wide range of departments, including legal, technology, operations, security, marketing, advertising, and subscriptions, many of which are essential to getting the Works into the hands of the public.”

The plaintiffs allege OpenAI gathered data “using a process called ‘scraping,’ which involves programmatically visiting websites across the entirety of the Internet, locating the desired information, and extracting or copying it in a structured format for further use or analysis.”

Simpson says he is curious to see what happens as the case unfolds.

“It will be interesting to see how the case is pleaded. Different infringing actions could be at issue,” he says. “Much of the discussion concerning AI and copyright has been about how and when copyright is infringed. Is it being infringed at the training stage? Behind the scenes? Or is it being infringed with the output? In this case, certainly, the allegations are that it is in the training stage.”

Simpson says it will be interesting to see how the copyright infringement allegations are proven.

‘It is the right to copy a work’

“That is because copyright is as it sounds,” he says. “It is the right to copy a work. AI would not just have to be reading a website and learning how to do something but, in the process of training its model, it would actually have to be making copies of works to attract liability for copyright infringement.”

If a case can’t be made for copyright infringement, it might be possible to make out a breach of contract claim, Simpson says.

“That is based on the idea that when a person goes to a website they must agree to its terms of use, which, for example, could prohibit scraping content,” he says. “But one threshold question, from a legal perspective, is can chatbots be legally bound by contractual terms? Should the person sending the bots be the one agreeing to those terms?”

The discovery process in this case could be enlightening, Simpson says.

“OpenAI could be required to open the window on much of what it is doing,” he says. “Of course, that information would surely be subject to protective orders meaning the public will not get to see it.”

OpenAI could offer several defences, including a claim that it is not making copies and is therefore not violating Canadian copyright law, Simpson says.

“I imagine they will also say this action doesn’t belong in a Canadian court because the activities alleged are happening in the United States where OpenAI is headquartered,” he says. “While this case might seem straightforward, there could be a whole host of defences available which will make it a challenge for media companies to prove their case.”

Lawsuit will be closely watched

Simpson says he expects the lawsuit will be closely watched by content owners and lawyers alike.

“It is not like ChatGPT is going to replace journalists, so why are the media companies pursuing this? Essentially because it is a lost licensing opportunity,” he says. “The media has a lot riding on this. There are potentially huge revenue-generating opportunities at issue.”

Simpson says “some very seemingly minor details about how these things happen can make the difference in proving infringement and making a case for copyright infringement or for breach of contract.”

“The primary infringement claim is copyright. But there is the option, if you can’t make out a copyright claim, that you can prove breach of contract,” he says. “It will be interesting to see what evidence the court is going to need, provided there is not a settlement prior to trial of course.

“I imagine it will require plenty of expert evidence about what the technology really involves and what goes on behind the scenes,” Simpson adds.