Thursday, April 06, 2023

Judge's Decision on Internet Archive's Controlled Digital Lending

The story is long and complex, so here's about the shortest Q&A summary that I can manage. Remember IANAL (I am not a lawyer), IAAL (I am a librarian). Also, I'm leaving out lots of details here, but I provide links so that you can get to them. This is playing out as a legal question, and the societal issues are barely being considered. I will try to give some thoughts on those soon.

Q: Who sued the Archive?
A: Four publishers: Hachette Book Group, Inc., HarperCollins Publishers LLC, John Wiley & Sons, Inc., and Penguin Random House LLC.

Q: What did they sue about?
A: That the Archive digitized paper books for which the publishers hold the copyright and loaned the digital copies to people.

Q: Are these the only publishers whose works the Archive digitized?
A: Oh, no. There are probably thousands of others.

Q: What are the books that are named in the suit?
A: There are too many to list, but here are a few to give you an idea:

  • Elizabeth Gilbert's Eat, Pray, Love: One Woman's Search for Everything Across Italy, India and Indonesia
  • Malcolm Gladwell's Blink: The Power of Thinking Without Thinking
  • C. S. Lewis's The Lion, the Witch and the Wardrobe
  • J. D. Salinger's The Catcher in the Rye
  • Laura Ingalls Wilder's Little House on the Prairie

There are many minor works as well, and others whose titles you would recognize.
The full list is at:

Q: What was the Archive's legal defense for its actions?
A: The Archive argues that digitization is analogous to the kind of time-shifting that is done through technologies like TiVo; it is a sort of "format shift" and therefore is fair use. It also argues that, as a non-profit library, it is providing a lending service like the one libraries provide with hard copies of books. It calls this process of digitizing and lending "Controlled Digital Lending." In Controlled Digital Lending the library treats the original hard copy and the digital copy as a single "thing" and lends either one or the other but not both at the same time. This is called the "one-to-one principle," and it is designed to mimic the first sale doctrine of US copyright law, which is the basis for the legality of library lending.
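For readers who think better in code, here is a minimal sketch of the one-to-one principle as described above. It is only an illustration under my own assumptions: the `Title` class, its fields, and the method names are hypothetical and are not taken from the Archive's actual lending system.

```python
from dataclasses import dataclass, field


@dataclass
class Title:
    """One title held by a library (hypothetical model for illustration)."""
    name: str
    copies_owned: int  # physical copies the library legally owns
    on_loan: dict = field(default_factory=dict)  # borrower -> "print" or "digital"

    def lend(self, borrower: str, fmt: str) -> bool:
        """Lend one copy in the given format if the one-to-one limit allows it."""
        if fmt not in ("print", "digital"):
            raise ValueError("fmt must be 'print' or 'digital'")
        # Print and digital loans draw from the same pool of owned copies,
        # so total loans can never exceed the number of copies owned.
        if borrower in self.on_loan or len(self.on_loan) >= self.copies_owned:
            return False
        self.on_loan[borrower] = fmt
        return True

    def give_back(self, borrower: str) -> None:
        """Return a loan, freeing the copy for the next reader."""
        self.on_loan.pop(borrower, None)


book = Title("Example Title", copies_owned=1)
print(book.lend("reader_a", "digital"))  # the single owned copy goes out
print(book.lend("reader_b", "print"))    # refused: that copy is already on loan
```

The point of the sketch is that the format does not matter to the count: one owned copy supports one loan at a time, whether the reader gets the paper book or the scan.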

Q: How did the court respond?
A: The judge looked at the four fair use factors in US copyright law and concluded that the Archive's use was not fair. He accepted the publishers' argument that the lending of the books competed with the publishers' own digital and physical sales. He also bought the publishers' argument that the Archive, albeit a non-profit, gained status and therefore donations through the book lending service.

Q: Are there legal arguments to support Controlled Digital Lending?
A: Yes, such arguments have been made. In particular there is the work of Michelle Wu, who wrote "Building a Collaborative Digital Collection: A Necessary Evolution in Libraries." Her initial thesis concerned law libraries and their difficulty in keeping up with the production of legal resources. Later, she was one of a group of legal scholars who developed a more general statement on Controlled Digital Lending. They argue that in this environment of increasing remote access to information, libraries have to be able to move beyond the requirement that users visit a physical space to access materials. And since not all materials have been provided in digital form, libraries need to take on the process of digitization for materials that they hold only in hard copy.

Q: Is this the first time that libraries have digitized materials?
A: No. Libraries have used various technologies, including digitization, to make materials available to disabled users. They also have digitized, faxed, and copied individual journal articles and book sections to satisfy interlibrary loan requests. They rarely have digitized entire books except to preserve rare materials, but those are generally free of copyrights due to their age.

Q: So, did the Archive do something wrong?
A: Possibly. For materials online a copyright holder can issue a "take down" notice, and the recipient is obligated to remove the item from access. The publishers claim that they gave the Archive a list of items to take down, but not all were removed. I haven't seen a statement from the Archive on why that method failed. Then, for about three months at the beginning of the COVID pandemic in 2020, the Archive eliminated the one-to-one rule and allowed unlimited lending. This was done as a service to offset the fact that during that time many physical libraries were closed to their users, but it was not in keeping with the legal principles that had been laid out for Controlled Digital Lending.

Another possible error was the digitization of materials for which the publishers have digital versions (ebooks) on offer. This makes the argument that the Archive was competing with the publishers more convincing. Copyright law also protects "creative" works more strongly than factual works, and these are publishers of fiction as well as popular non-fiction, types of works that one could see as worthy of maximum copyright protection. Materials intended for research and education (academic journal articles, scientific treatises) are more likely to meet the "purpose" requirement of a fair use assessment. It is quite a bit harder to claim fair use copying for "something fun to read," and the publishers in the suit are all major purveyors of popular reading.

Continuing on, most libraries have a limited user base: universities serve current students, staff and faculty; public libraries serve residents in their jurisdiction. The Archive was lending materials globally. The latter is both an argument against the Archive, if you are a publisher, and an argument for the Archive, if you support equal access to information.

Q: Didn't we go through this already with Google Books?
A: Not quite. Google never allowed anyone to read the in-copyright books that it digitized. It stated that its digitization project was to provide searching within the text of books, and users were only displayed snippets, not the whole book. That was deemed to be fair use by the court. Since then, Google Books has mainly been acquiring digital texts provided by publishers, and the amount of visible content is part of the agreement between Google and its book partners.

Q: Could a different implementation of Controlled Digital Lending succeed?
A: Possibly. There are libraries that have partnered with the Archive in this project but were not mentioned in the lawsuit. It is unclear whether they will be able to continue lending their digitized books, and if they can, they may have to find another technical solution to the lending service, which is currently run by the Archive. There is also the possibility that a digitization project that had specific service goals, like the one initially proposed by Wu for law libraries, would be easier to defend. Both the Archive and the earlier digitization project, Google Books, decided that it was expedient to digitize first and ask permission later. They also both digitized indiscriminately, including old and new, academic and popular. Google eventually adopted an "opt-in" model in its publisher relations, although as the search engine of record what it has to offer is a level of visibility that no one else can provide. The other option is to limit access to books in the public domain, which cuts off almost the last century of works.

Q: What's next?
A: There will be appeals by the Archive, but if those do not alter the court's view then the Archive will be required to compensate the publishers for its infringement of their rights. Presumably that compensation will be based on some estimate of how much the publishers were damaged. So far I have not seen any actual figures that would be used to make such a determination.

