Monday, November 03, 2008

Determining Copyright Status

Among the many interesting bits in the Google/AAP agreement is Section E which essentially lays out in detail what steps Google must take to determine if an item is or is not in the public domain. As we know, this is not easy. The agreement states that two people must view the title page of the work (yes, it says "two people") to determine if the item has a copyright notice, and to check the place of publication. To determine if copyright has been renewed, "Google shall search either the United States Copyright Renewal Records or a copy thereof." If a renewal record isn't found, and the work has a copyright date before 1964, then it is presumed to be in the public domain.
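As a rough sketch, the renewal test described above reduces to a two-condition rule. The function name and parameters below are illustrative assumptions, not language from the agreement, and the sketch deliberately leaves out the title-page checks for copyright notice and place of publication.

```python
def presumed_public_domain(copyright_year: int, renewal_found: bool) -> bool:
    """Renewal rule as paraphrased in this post (hypothetical helper).

    Only the renewal test is modeled; the notice and place-of-publication
    checks mentioned in the agreement are out of scope here.
    """
    # A copyright date before 1964 with no renewal record on file
    # leads to a presumption of public domain.
    return copyright_year < 1964 and not renewal_found


if __name__ == "__main__":
    print(presumed_public_domain(1949, renewal_found=False))
    print(presumed_public_domain(1929, renewal_found=True))
```

The point to notice is that the outcome depends entirely on `renewal_found`, so a search that misses an existing renewal record produces a wrong "public domain" answer.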

I decided to try this out, at least the part about checking the renewal. I did my searches in two databases: Stanford's and Rutgers'.

I happen to have a copy of Orwell's 1984 with detailed copyright notices. It lists the first copyright as 1949, by Harcourt, Brace and Jovanovich, Inc. It then says "Copyright renewed 1977 by Sonia Brownell Orwell." It also includes "Copyright 1984 by Virgin Cinema Films Limited" although I must say that I'm not sure why that latter copyright notice is in the book.

A search on '1984' in the Rutgers database yields no hits, but using the author's name I find 37 items, of which one reads:
AUTH: George Orwell, translation: Amelie Audiberti. NM: translation.
TITL: 1984.
ODAT: 1Jul50; DREG: 7Nov77 RREG: R678090. RCLM: AFO-2377. Amelie Audiberti, nee Elisabeth Savane (A)
A search in the Stanford database gets me:
Title    1984 NM: translation
Author George Orwell, translation: Amelie Audiberti
Registration Date 1Jul50
Renewal Date 7Nov77
Registration Number AFO-2377
Renewal Id R678090
Renewing Entity Amelie Audiberti, nee Elisabeth Savane (A)
Both of these seem to be for the same item, and it's a translation of the book 1984. The renewal listed in the book for the English text is not in the databases. The instructions to Google say nothing about taking renewal dates from the book, so this one would appear to be in the public domain by the agreement's criteria.

Picking up another book of the right age, I have Proust's "The Captive" in the Modern Library edition, the "C. K. Scott Moncrieff" translation, with "Copyright, 1929, by Random House, Inc." on the title page.

In Stanford's database I get:

Title    The captive. Translated by C. K. Scott Monorieff
Author PROUST, MARCEL
Registration Date 27Jun29
Renewal Date 7Sep56
Registration Number A9965
Renewal Id R176423
Renewing Entity Random House, Inc. (PWH)

In Rutgers I get:
CLNA: RANDOM HOUSE, INC.
TITL: The captive.
XREF: Proust, Marcel.
Unfortunately, this latter doesn't include a date, so I'm not sure that this record provides sufficient information. Fortunately, the Stanford database gives more information. Unfortunately, the Stanford record gives the title and what we librarians would call the "statement of responsibility" in the same field, and misspells the name of the translator. This may make it more difficult for any automated matching of the records. (I am assuming that Google will be doing automated matching, not hand searching of the database. That may be a mistaken assumption, especially since they have agreed that two humans will view the title page.)

This next (and last) one is an especially interesting case. I have a copy of Rebecca West's "Black Lamb and Grey Falcon: A Journey through Yugoslavia" printed by Penguin Books in 1994. It gives the copyright date as "1940, 1941" and the renewal date as "1968, 1969", both under the name of Rebecca West.

A search on the title in Rutgers' database gets me these three records:

CLNA: WEST, ROBERT.
TITL: Black lamb and grey falcon. (In Atlantic monthly, Feb.-May 1941)
ODAT: 21Jan41 OREG: B482882; 19Feb41 RREG: Rebecca West ; 12Aug68; R441634-441631.

CLNA: WEST, REBECCA.
TITL: Black lamb and grey falcon; a journey through Yugoslavia. Pub. serially in the Atlantic monthly, Dec. 17, 1940-Apr. 17, 1941. NM: additions.
ODAT: 20Oct41; A158501 RREG: Rebecca West ; 10Jan69; R453530.

CLNA: WEST PUB. CO.
TITL: Black lamb and grey falcon. (In The Atlantic monthly, Jan. 1941)
ODAT: 20Dec40; B479489 RREG: Rebecca West ; 2Jan68; R426137.

As you can tell, some part of the book was originally published in the Atlantic Monthly as a serial. From these records it's difficult to tell exactly which issues of the monthly it was included in, and the "Claimants" are all different. In the Stanford database it's a bit clearer. There are five records; four are duplicates for the original articles in the Atlantic Monthly and one more called "Additions." Each of the four duplicate records is like this one:
Title    Black lamb and grey falcon. (In Atlantic monthly, Feb.-May 1941)
Author WEST, REBECCA.
Registration Date 21Jan41, 19Feb41,21Mar41 21Apr41
Renewal Date 12Aug68
Registration Number B482882, B488595, , B492319,, B495868
Renewal Id R441633
Renewing Entity Rebecca West (A)
I suppose that the four renewal records are one for each item in the Atlantic Monthly, but they each have the same information. Only the fifth record, the one for "additions," includes the subtitle that appears on the book. The presence of the article records is puzzling because Stanford claims to have included only records for the renewal of books. In fact, it is easy to find records for articles in the database, so it's probably best to assume that the database covers text in general.

Even for the human searcher, it may be difficult to connect the book and the records because there is nothing in the book itself to indicate that it was previously published in a journal. In fact, the introduction merely mentions that the book itself was first published in two volumes in 1941.

The book was published in two volumes because it is nearly 1200 pages long. The archives of the Atlantic Monthly list the four articles with this same name as containing 24, 24, 26, and 24 pages, respectively. It's rather hard to understand how those articles, as copyrighted, could be the same as a 1200-page book. We are left only with the record that claims to be "Additions" and that has the same subtitle as the book:

Title   Black lamb and grey falcon; a journey through Yugoslavia.
Pub. serially in the Atlantic monthly, Dec. 17, 1940-Apr. 17, 1941.
NM: additions
Author WEST, REBECCA
Registration Date 20Oct41
Renewal Date 10Jan69
Registration Number A158501
Renewal Id R453530
Renewing Entity Rebecca West (A)
Again, the title field contains quite a bit of information beyond the title, and it just isn't crystal clear to me that this record is for the book and not for the articles. If it is for the book, then the idea that 1200 pages were published serially over four journal issues is quite a stretch. Plus, the Monthly archive claims that the dates are Jan, Feb, Apr and May, 1941.

Underlying the statement that, to determine if copyright has been renewed, "Google shall search either the United States Copyright Renewal Records or a copy thereof" is a great deal more complexity than that one sentence implies. It makes me wonder if the negotiators for the AAP are fully aware of how inaccurate the results might be. (An example: the author field in a record for an article by George Orwell reads: "Author George Orwell. U. S. ed. pub. as Shooting an elephant, 26Oct50, A49135".) If they are aware of it, then I must commend them for taking the practical path and allowing Google to make books available based on this evidence. If a copyright holder notifies Google that a book has been determined to be public domain in error, Google is obliged to change the status of the work from public domain to "in copyright," but is not held liable for infringement if the steps for determining public domain were followed and documented as laid out in the agreement.

It will be hard to determine, however, if Google should happen to err on the side of copyright and list as under copyright works that are actually in the public domain. While copyright holders can be expected to make sure that their works are properly protected, works in the public domain have no rights holder to monitor their status, and no one assigned to protect the public interest.

One other caveat, which appears in Section E, is:
Any determination by Google that a work is a Public Domain Book is solely for the purposes of Section 3.2(d)(v) and is not to be relied on or invoked for any other purposes, including determining whether a work is in fact in the public domain under the Copyright Act.
Basically, this means that just because Google determines that a book is in the public domain doesn't mean that's the legal status of the book. It also means that the rest of us can't use the excuse: "But Google says it's in the public domain." I have not heard whether Google will make the documentation of its copyright search available, and it's that documentation that has the real value. It's kind of like algebra: the answer is important, but what really matters is how you got the answer.

[Note: keep an eye on the Open Library and Creative Commons for some work on copyright determination that will be openly accessible.]

1 comment:

David Fulmer said...

Karen, it's important to remember that the online copyright renewal databases are based on lists of copyright renewal records maintained by the US Copyright Office and published in several ways, including a copyright card catalog on the fourth floor of the James Madison Memorial Building of the Library of Congress.

Many of the copyright renewals in the Stanford Copyright Renewal Database, for example, are based on transcriptions of the Catalog of Copyright Entries (CCE), issued by the US Copyright Office. This serial publication included a list of all copyright registrations including renewals received by the Copyright Office. The Online Books Page has scanned copies of many of them.

The renewal of the translation of "The Captive" is on page 1500 of Part 2 of the CCE from 1956. If you look at the scanned image you can see how much the c in Moncrieff looks like an o. I have used the Stanford Database extensively and I have not come across very many typos but there are some.

I'm actually quite optimistic about automating a lot of this matching. To use "The Captive" as an example, you could take the 100 subfield a "Proust, Marcel," for author, the 245 subfield a "The captive," for title and the 260 subfield c (just the digits) "1929" for registration year, plug that into an advanced search in the Stanford Copyright Renewal Database and voila, you have a single, correct match. The spelling mistake you noticed in the Stanford Copyright Renewal Database doesn't make a difference. Now when they've got "Chili growth and development" instead of "Child growth and development" that might be a problem. But like I said, I think these mistakes are rare.
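A minimal sketch of that matching idea follows. It does not call the Stanford database itself; it only shows how a key built from the three MARC subfields could be compared against a renewal record that has already been retrieved. The function names, the dictionary keys, and the assumption that two-digit registration years fall in the 1900s are all illustrative, not taken from any real system.

```python
import re
from typing import Optional


def search_key(marc_100a: str, marc_245a: str, marc_260c: str) -> dict:
    """Build a search key from three MARC subfield values (hypothetical helper)."""
    # Drop the trailing comma that 100$a often carries.
    author = marc_100a.strip().rstrip(", ")
    # Drop trailing ISBD punctuation from the title proper.
    title = marc_245a.strip().rstrip(",./:; ")
    # "Just the digits": take the first four-digit run in 260$c.
    year_match = re.search(r"\d{4}", marc_260c)
    year: Optional[int] = int(year_match.group()) if year_match else None
    return {"author": author, "title": title, "year": year}


def matches(key: dict, record: dict) -> bool:
    """Compare a key with a renewal record given as a plain dict."""
    if key["year"] is None:
        return False
    # Dates look like "27Jun29"; assume the century is 19xx.
    reg_year = 1900 + int(record["registration_date"][-2:])
    return (
        record["author"].casefold() == key["author"].casefold()
        # Prefix match, so text after the title proper is ignored.
        and record["title"].casefold().startswith(key["title"].casefold())
        and reg_year == key["year"]
    )


if __name__ == "__main__":
    key = search_key("Proust, Marcel,", "The captive,", "c1929")
    record = {
        "title": "The captive. Translated by C. K. Scott Monorieff",
        "author": "PROUST, MARCEL",
        "registration_date": "27Jun29",
    }
    print(matches(key, record))
```

Because the title comparison is a prefix match, the misspelled translator name after "The captive." never enters the comparison. A typo inside the title proper, as in the "Chili growth" case, would still defeat it.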

At the University of Michigan library we check both title and author in the Stanford Copyright Renewal Database when we investigate the copyright status of a book, so we probably would have found the correct renewal record for "1984", which is under "Nineteen eighty-four", not "1984".

While Wikipedia is correct in noting that "1984" will not enter the public domain in the United States until 2044 (95 years after the date of publication - see this chart), in some countries "1984" is already in the public domain and the book is easily accessible online. I dare the Association of American Publishers to sue American college students RIAA-style for illegally downloading "1984"! Get caught reading indeed.

You're right about "Black Lamb and Grey Falcon" being interesting; it's even more interesting than you realized.

The original copyright registrations were filed by the Atlantic Monthly and Rebecca West in 1941. Again, the CCE pages showing the renewals are worth looking at.

This page is from the CCE of 1968 showing the renewals of periodicals. You can see that the Atlantic Monthly was very thorough about renewing their copyrights.

v.167 no.1, the January 1941 issue, was copyrighted December 20, 1940 (B479489) and renewed January 8, 1968 (R426493). Rebecca West renewed her article January 2, 1968 (R426137) and her renewal is in the CCE Renewals for Books and Submissions to Periodicals, 1968 Part 1 page 1341.

The other renewal records that you found in the Stanford Database correspond to other renewal IDs for those articles in the Atlantic. In Part 2 of the 1968 CCE (they were issued twice per year), on page 2843, you will find Rebecca West's renewals from August 12, 1968 with IDs R441631 through R441634. The Atlantic Monthly renewed their copyrights and Rebecca West separately renewed the copyright of her contribution to those issues.

You have to consult the 1969 CCE, Part 1 page 1315 for the renewal of the NM: (new matter) additions. Rebecca West appears to have copyrighted this material herself on October 20, 1941 (A158501). The records you identified as duplicates in the Stanford Database are in fact different; look again, the "Renewal Id" number is different for each.
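The Rutgers record quoted in the post writes these four IDs as the compressed range "R441634-441631", which is easy to misread as a single ID. A small sketch of expanding such a field into individual renewal IDs is shown below. The function name is hypothetical, and it assumes both ends of the range are written out in full, in either order.

```python
import re
from typing import List


def expand_renewal_ids(field: str) -> List[str]:
    """Expand 'R441634-441631' into individual renewal IDs (hypothetical helper)."""
    cleaned = field.strip().rstrip(".")
    m = re.fullmatch(r"R(\d+)(?:\s*-\s*R?(\d+))?", cleaned)
    if m is None:
        raise ValueError(f"unrecognized renewal ID field: {field!r}")
    first = int(m.group(1))
    # A single ID has no second number; treat it as a range of one.
    second = int(m.group(2)) if m.group(2) else first
    low, high = sorted((first, second))
    return [f"R{n}" for n in range(low, high + 1)]


if __name__ == "__main__":
    print(expand_renewal_ids("R441634-441631."))
    print(expand_renewal_ids("R453530"))
```

Expanded this way, the single Rutgers entry lines up with the four separate Stanford records, one renewal ID per Atlantic Monthly installment.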

It would be wrong to assume that the Stanford Database covers text in general; it only includes US Class A (book) renewals. That page image from the Online Books Page of the Atlantic Monthly renewals shows that the Atlantic renewed the entirety of many issues of their magazine, yet these renewals aren't in the Stanford Database.

I think you can see how useful the Stanford Database can be once you learn how to use it. It brings together the copyright renewal information from many different issues of the CCE, as well as some later records from the Copyright Office's separate, online, incomplete database.

I think you can also see how much more complicated this can get. Different parts of a single book can have different copyright statuses. Had West neglected to renew the copyright of the additions, then most, but not all, of the two-volume work would be in the public domain. On the other hand, had she neglected to renew the relatively short part of the book that appeared in the Atlantic, that part still would be in copyright because of that magazine's renewals of their issues from 1941.

We recently had a patron interested in acquiring from us a reprint of "Hereditary Genius" by Francis Galton, originally published in the nineteenth century. We discovered that we couldn't make a copy of our digitized copy, which was published in 1962, because the introduction by C. D. Darlington had been copyrighted and renewed. That's just a few pages of a book of more than 400 pages that is in copyright, but we don't have a way of separating out the in-copyright part from the public domain part.
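One way to picture the per-part problem is to record status for each part of a book rather than for the book as a whole. The sketch below is only an illustration under simplifying assumptions: every part is treated as a pre-1964 US publication whose status depends solely on whether some renewal covers it, and the class and function names are made up for this example. The renewal IDs are the ones quoted above. The Darlington renewal ID is not given, so a placeholder string stands in for it.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Part:
    label: str
    # Renewal IDs covering this part, from any claimant.
    renewals: List[str] = field(default_factory=list)

    @property
    def in_copyright(self) -> bool:
        # Any one renewal is enough to keep the part in copyright.
        return bool(self.renewals)


def public_domain_parts(parts: List[Part]) -> List[str]:
    """Labels of the parts with no renewal on record."""
    return [p.label for p in parts if not p.in_copyright]


def wholly_public_domain(parts: List[Part]) -> bool:
    """True only if no part has a renewal on record."""
    return all(not p.in_copyright for p in parts)


if __name__ == "__main__":
    # Hypothetical: West skips her own article renewal (R426137),
    # but the Atlantic's issue renewal (R426493) still applies.
    west = [
        Part("January 1941 Atlantic installment", ["R426493"]),
        Part("additions (new matter)", ["R453530"]),
    ]
    galton = [
        Part("Galton's nineteenth-century text"),
        Part("Darlington introduction", ["renewal ID not given"]),
    ]
    print(public_domain_parts(west), wholly_public_domain(west))
    print(public_domain_parts(galton), wholly_public_domain(galton))
```

In the Galton case only one part is in the public domain list, which mirrors the situation described above: without a way to separate the parts, the whole digitized volume has to be treated as in copyright.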