Tuesday, June 30, 2009

Even paranoids....

I'm not the most diligent of bloggers, by any means, and the contents of this blog are pretty narrow in terms of topics. Mostly I have written about Google books, about RDA and other library metadata developments, and recently about OCLC. Although each post is probably offensive to someone out there, the total number of enemies that I can make is probably quite small -- and compared to some bloggers nearly infinitesimal.

So imagine my surprise this morning when I received a notice from Google saying that my blog had been marked as Spam, and would be removed if I didn't take action. There are two ways that your blog can get the Spam qualification: 1) if it is caught by Google's automatic spam detectors and 2) if someone clicks on the "flag blog" link and reports it as spam.

Given the technical nature of my posts, I find the first possibility highly unlikely. This means that I must consider the latter. I hope it is only coincidence that my latest post (and one that has lingered here as the latest for a bit too long, perhaps) is a critique of OCLC and its record use policy. I would love to be able to say that I know that OCLC would not stoop to this kind of censorship, but unfortunately I have experience to the contrary.

Earlier this year I arrived in Dublin only to be refused admittance to a meeting that OCLC had agreed I could attend (and that I had flown all of the way to Ohio to attend). Then, a few months ago, when OCLC was told that I would be writing an article for InfoToday on their "web-scale service," the journal's editor received numerous phone calls from OCLC's press person voicing OCLC management's "concern" that I had been chosen to write the article. What the editor was supposed to do about that concern wasn't articulated, but she kept me on the story and even resisted their request to review the article before it was published. It was a dramatic couple of days, and I'm very grateful to her for her unwavering defense of freedom of the press.

I admit that it is at least equally likely that some random person with a cosmic grudge decided to click on "this is spam," but you may understand why I'm beginning to be a bit paranoid, and wondering if I don't have real enemies.

Tuesday, June 09, 2009

OCLC Policy - What is the Question?

I have a difficult time understanding the discussion around the OCLC WorldCat Record Use Policy. At least one reason for my confusion is that I have yet to see an explanation of the problem that the policy is attempting to address. The recent talk by Jennifer Younger, on the initial recommendations of the WorldCat record use policy review board, left me with the same uncertainty: If this is the answer, then what is the question?

Many people have commented on Younger's slides, and in particular on the recommendation by the review board that OCLC abandon the November, 2008 policy and begin anew. Younger's talk, however, did not answer one of the key questions that she herself lists:

"Despite the statement on intent in the proposed policy and the use examples in the FAQ, respondents [to the review board's survey] indicated they do not see the problem, and where they do, they do not see how the proposed policy will address the problem." [Younger]

[Quotes in this post are taken from Peter Murray's much appreciated transcription of Younger's talk.]

Younger's brief definition of the charge of the review board is:
"The review board is charged with recommending principles on which a new policy should be based." [Younger]
I strongly suggest that before beginning work on the principles that the review board take as much time as it needs to clarify the nature of the problem that the policy wishes to address. Unless the problem is clear, the policy cannot provide a coherent solution.

That said, what do we know about the problems as addressed by the policy and the review board?

"... a legal document..."


The policy document itself does not have a clear problem statement; it limits itself to stating what the rules might be for use and re-use of WorldCat records. The FAQ goes a bit further toward defining a problem, although as stated it seems to be saying that the problem is a lack of a policy:
"To be successful negotiating and working with prospective partners, many of which are in the private sector, OCLC needs to move beyond the Guidelines to a policy that will be recognized by these organizations and others outside the library-archives-museum space as a legal document. Achieving clarity about the rights and conditions for using and transferring WorldCat data is a precondition to OCLC's sitting down to talk, on its members' behalf, with organizations that otherwise might have little or no interest in promoting the use and visibility of library, archival, and museum collections and services." [FAQ]

In this statement OCLC is arguing for its need to clarify the contract between OCLC and the libraries so that it can engage in revenue-producing deals with other entities. It seems unlikely that OCLC would not have the right or ability to make deals involving WorldCat records. If that is the question, then OCLC just needs an agreement with its membership that the organization can make use of the members' data in this way. But that's not the issue that the policy addressed. The issue isn't OCLC's use of the data but the libraries' use of the data. Without the policy, OCLC and the members are both free to monetize (or not monetize) the bibliographic records that they hold however they wish. The policy, however, specifically requires that all deals with other entities be controlled by OCLC. It also designates OCLC as the sole decision-maker for use of WorldCat records. It shouldn't be surprising that this may not be acceptable to all members.

What's Good for OCLC....

Statements by OCLC always equate the interests of OCLC with the interests of the members, a view that clearly isn't shared by all members. In her talk, Younger brings up this tension between OCLC and its members, which she calls "the gap problem":
"Perhaps then in this context we should not be surprised that 'the gap problem' emerged, that the proposed record use policy was perceived by many as putting OCLC's interests ahead of those of member libraries." [Younger]
Younger's talk places the solution to this problem squarely in the laps of the members, who are seen as the ones who need to change for the gap to be closed:
"Second, the equilibrium has been disrupted. We must revisit the social contract between OCLC and its members. ... But as new generations of members come into our ranks, it becomes more difficult to explain the social contract that is OCLC. Just as in ballroom dancing it takes two people to tango. We need to work together — OCLC and its members — to solve the gap problem as it relates to the past but more importantly to the future. We need to understand our respective roles in reinforcing the values for working within the OCLC collaborative and understand how those values can support working with other partners in the information ecosystem." [Younger]

As presented, the "gap" is that some members do not agree that what benefits OCLC always benefits the entire membership. The solution is to "reinforce the values for working within the OCLC collaborative." The assumption that what is good for OCLC is good for the members is presented without question. The 2008 policy lacked provisions for member input into the decision process relating to particular uses of WorldCat records, nor did it provide a mechanism for members to remedy decisions that they feel do not benefit them. Policy statements that give OCLC the sole decision-making power on record use ("may be withheld by OCLC, without liability, within its sole discretion" D. 3) understandably make some people nervous.

The "gap" is presented as a difference in perception, but never a difference in actual benefits. For example, in the FAQ, OCLC explains that one of the needs is for a policy that ensures a "fair return" to OCLC members. That "fair return" is revenue that goes to OCLC.

"The existing Guidelines, and now the revised Policy seek to support WorldCat’s continued value by ensuring that the use and transfer of WorldCat data outside the OCLC cooperative provides a fair return to OCLC members and benefits libraries, archives, and museums in general." [FAQ]
It shouldn't be surprising that a phrase like "fair return to OCLC members" causes members to ask what it is that they are getting. I can find nothing that explains what revenues have been received by OCLC for WorldCat data, and nothing specifically about how those revenues have benefited the cooperative. This points to what I perceive as one of the causes for the "gap problem" and that is a general lack of transparency about OCLC's business. It is not going to work to simply say to the membership "trust us, we have your interests in mind." The members have every right to ask for proof of that. OCLC, for its part, should be quite happy to show members how uses of WorldCat data have been for their benefit. (Note, OCLC's members council may have this information, but I don't find it in the annual reports, nor in the IRS 990 form, of which the most recent is the one for 2006.)

The Value Proposition

The one thing OCLC cannot rely on is any argument that OCLC must be supported and maintained simply because it is OCLC. Everything hinges on a convincing argument of WorldCat's "value." The value of WorldCat is invoked frequently, but no elaboration on the nature of the value is given:
"We must focus on the value of sustaining WorldCat for the benefit of members and non-members, for OCLC, libraries and other memory institutions, and other partners in the information ecosystem." [Younger]

Threats

The policy implied -- but never stated -- that OCLC was responding at least in part to some perceived threats. Younger uses the term "threat," but without elaborating or giving examples:

"One intent expressed in the proposed policy was to protect the members’ investment in WorldCat and ensure the use of WorldCat records would benefit the membership. We need to identify the major encroachments that threaten WorldCat." [Younger]
There are some hints that threats would be to the size, comprehensiveness and quality of the database. The policy's definition of "reasonable use" stated that reasonable use:

"... would not include any Use of WorldCat Records that:
a. discourages the contribution of bibliographic and holdings data to WorldCat, thus damaging OCLC Members' investment in WorldCat, and/or
b. substantially replicates the function, purpose, and/or size of WorldCat."

Younger expands briefly on the threat question, talking about comprehensiveness of the database, and cash flow:
"Would the proposed uses by an OCLC member, consortia, or other players lead to a less-comprehensive or authoritative WorldCat? Would the proposed uses draw a significant cash flow away from maintaining WorldCat? Would the proposed uses benefit some segment of the library and other memory institution community without materially diminishing the benefit and use of WorldCat by other members?" [Younger]
This last statement at least introduces the possibility that there could be uses of library bibliographic data that do not threaten WorldCat. Defining this line between threat and non-threat seems to be key to decision-making around record use. While possibly not appropriate for the policy itself, it could be defined in some detail in operational documents that are available to members and potential users of WorldCat records.

The Control Issue

There is an implication that allowing bibliographic data to "go wild" on the Internet would weaken OCLC. It seems obvious that the policy is designed to give OCLC control over the use of records in order to obtain revenue from the uses. Throughout Younger's talk she refers to "members" and "partners."
"Members see a future in which WorldCat is available for reasonable use on a non-discriminatory basis to members as well as to other partners." [Younger]
There is never any mention of the general public and no mention of open access. The vision here is clearly a closed system with usage controls in place. This was the problem with the policy as written, and it will continue to be a problem if OCLC (with or without its members) insists on maintaining control. There is no hint here that the OCLC model may need to change, that a walled bibliographic city no longer makes sense. It will be very disappointing if the review board does not challenge this basic assumption, if it does not explore other options for the future.

Conclusion

I hope that OCLC's members will insist on a clarification of the goals of the policy as well as on how those goals will be managed over time. Sticking my neck out, I conclude that:
  • there cannot be a workable policy without a clear problem statement to guide it
  • a library data silo is quite possibly not the best thing for the library community today, and this needs to be addressed
  • the idea that "what is good for OCLC is always good for OCLC's members" is unreasonable; no contract should be accepted that doesn't provide for negotiation between the library members and OCLC regarding uses of the WorldCat records

Tuesday, May 19, 2009

LCSH as linked data: beyond "dash-dash"

The SKOS version of LCSH developed by LC has made some choices in how LCSH would be presented in a linked-data format. One of these choices is that the complex headings (which are the vast majority of them) are treated as a single string:
Italy--History--1492-1559--Fiction

While this might fit appropriately as a SKOS vocabulary, in my opinion it does not work as linked data. I'm going to try to explain why, although it's quite complex. Part of that complexity is that LCSH is itself complex, primarily because there are many exceptions to any pattern that you might care to describe. (For more on this, I suggest Lois Mai Chan's Library of Congress Subject Headings, 4th edition, the chapter on geographic subject headings, pp. 67-89.)

Taking the heading above, as I mentioned in my previous post, the geographic term Italy is not in LCSH even though it can indeed be used as a subject heading. Instead, Italy is defined as a name heading in the LC name authorities file. In that file, and only in the name file, alternate forms of the name are included (altLabels, in SKOS terminology):
451 __ |a Repubblica italiana (1946- )
451 __ |a Italian Republic (1946- )
451 __ |a Wlochy
451 __ |a Regno d’Italia (1861-1946)
451 __ |a Iṭalyah
451 __ |a Italia
451 __ |a Italie
451 __ |a Italien
451 __ |a Italii︠a︡
451 __ |a Kgl. Italienische Regierung
451 __ |a Königliche Italienische Regierung

There are no altLabels in the LCSH entry for Italy--etc. And because the term Italy is buried in an undifferentiated string, there is no linked data way to say that the Italy in Italy--History--1492-1559--Fiction is the same as http://id.loc.gov/authorities/n79021783, which will presumably be the URI for the name.
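To make the problem concrete, here is a minimal sketch of what a consumer of the data would have to do on its own: split the heading string on the "--" delimiter and look up each piece. The lookup table is hypothetical and holds only the presumed URI for Italy mentioned above; nothing like it is supplied with the LCSH file, which is exactly the gap.

```python
# Hypothetical lookup table (an assumption for illustration only):
# heading text -> presumed name authority URI, as quoted in the post.
NAME_AUTHORITY_URIS = {
    "Italy": "http://id.loc.gov/authorities/n79021783",
}


def split_heading(heading):
    """Split a complex heading string on the "--" delimiter."""
    return [part.strip() for part in heading.split("--")]


def link_components(heading):
    """Pair each component with a name authority URI, or None if unknown."""
    return [(part, NAME_AUTHORITY_URIS.get(part)) for part in split_heading(heading)]


components = link_components("Italy--History--1492-1559--Fiction")
for text, uri in components:
    print(text, "->", uri)
```

Only the first component resolves to anything; the rest come back as None, and even the Italy match is a guess based on string equality rather than a link asserted in the data.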

It is assumed in LC authorities that the altLabels for a name term that appears in a subject heading apply to both the name used as a name and the name used as a subject heading. In the card catalog, where the name alone would appear first in the alphabetical browse of the cards, it was only necessary to make references to that "head" of the list, which would, in our case, be Italy alone. This has caused great problems in online catalogs where searching is by keyword, not a linear alphabetical search. Some systems manage to get around this by doing a string compare to the same subfields in name headings and subject headings, and then transferring the altLabel forms to the related subject headings.
$a Shakespeare, William, $d 1564-1616
$a Shakespeare, William, $d 1564-1616 $v Adaptations $v Periodicals
In this case, the $a and $d subfields represent the same authoritative entity. The rules say that they are, and must be, the same authoritative entity. If they don't match exactly then someone has done something wrong. They are both instances of a name identified as "n 78095332", and which will presumably be given the URI http://id.loc.gov/authorities/n78095332. There is no question about that.
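The string-compare workaround that some systems use might look roughly like the sketch below. It is an illustration under assumptions, not any particular system's code: I assume that $v, $x, $y and $z are the subdivision subfields to be ignored when comparing, and the alternate forms in the sample data are placeholders rather than real authority data.

```python
import re

# Assumption: these subfield codes mark subdivisions, not the name itself.
SUBDIVISION_CODES = {"v", "x", "y", "z"}


def parse_subfields(field):
    """Parse '$a foo $d bar' into a list of (code, value) pairs."""
    pairs = re.findall(r"\$(\w)\s*([^$]*)", field)
    return [(code, value.strip()) for code, value in pairs]


def name_portion(field):
    """Return the subfields that identify the name, dropping subdivisions."""
    return tuple(sf for sf in parse_subfields(field) if sf[0] not in SUBDIVISION_CODES)


def inherit_alt_labels(subject_field, name_records):
    """Copy alternate forms from a name record whose subfields match exactly."""
    key = name_portion(subject_field)
    for name_field, alt_labels in name_records.items():
        if name_portion(name_field) == key:
            return list(alt_labels)
    return []


# Placeholder alternate forms, for illustration only.
NAME_RECORDS = {
    "$a Shakespeare, William, $d 1564-1616": ["placeholder variant A", "placeholder variant B"],
}

subject = "$a Shakespeare, William, $d 1564-1616 $v Adaptations $v Periodicals"
print(inherit_alt_labels(subject, NAME_RECORDS))
```

Note how fragile this is: a single stray character in either record breaks the match, which is why an explicit identifier link would be preferable to comparing strings.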

There is also no question that when the name is used in a subject heading it has the full meaning that it is given in the name heading record, including alternate forms of the name and the many notes fields provided by the catalogers that created the authority record. That these don't appear in the LCSH file does not mean that it is not the case: it means only that the LCSH record assumes that the name record exists and provides that information, and that the information is applied to the name in the subject entry through the linear nature of the dictionary catalog.

We mustn't confuse the form with the meaning. That LCSH has a rather arrested form is unfortunate, but it was never intended to be used outside of the context of the full set of authorities that gives full treatment to those things that have "proper names." (cf. Chan, chapter 4)

If we wish for the LC authorities to be used in a linked data environment, then we have to make sure that the linking capabilities are there. Although I agree that each LCSH record has an identifier, and that identifier should be used, I don't agree that what is expressed in the LCSH record is a dumb, undifferentiated string. In this post I have addressed the relation to name headings, but there are other uses of controlled vocabularies within the subject headings that I haven't fully investigated yet.

Wednesday, May 13, 2009

LCSH as linked data: what is an LC Subject Heading?

The Library of Congress Subject Headings have been placed online in SKOS. You can search within the set or download the entire thing in RDF/XML or N-Triples. This is a welcome development.

I must say that I would also welcome some documentation on the decisions that were made, as viewing the actual data has left me with a number of questions. I'm going to begin my comments with a question about scope, and some confusion that it is causing me as I think about how I would want to use this data.

What's an LC Subject Heading?

It appears that the LCSH file that is online represents those authority records whose LC control numbers begin with "sh", as in: sh 00009880. (Numbering 342,684 records.) However, if you do a Subject Authority Headings search in the LC authorities database you will retrieve any authority record that can be used as a subject. This means that you will retrieve personal names, corporate names, and geographic entities that can be used as subjects. (Note, this is probably a large portion of the name authority file.) This is a mixture of records with LCCNs that begin "n" (for name file) and those that begin "sh" (for subject heading file). I'm at a loss to explain/understand what determines whether a heading has an LCCN beginning with "sh" and would love to get an explanation.
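As far as I can tell, the only thing the data itself offers for telling the two files apart is the LCCN prefix. A small sketch of that test follows; the function names are my own, and treating any prefix that starts with "n" as the name file is an assumption based on the description above.

```python
import re


def lccn_prefix(lccn):
    """Return the leading alphabetic prefix of an LC control number, lowercased."""
    match = re.match(r"\s*([A-Za-z]+)", lccn)
    return match.group(1).lower() if match else ""


def source_file(lccn):
    """Guess which authority file a record comes from, using only its prefix."""
    prefix = lccn_prefix(lccn)
    if prefix == "sh":
        return "subject"
    if prefix.startswith("n"):
        return "name"
    return "unknown"


for lccn in ["sh 00009880", "n 78095332", "sh 85069035"]:
    print(lccn, "->", source_file(lccn))
```

This tells you which file a record sits in, but not why it sits there, which is the question I would like answered.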

The result is that a search in the LCSH file on the word "Italy" brings up 3,516 headings, with the word somewhere in the heading. However, the heading "Italy" alone is not included. You do have:
Italy, Central
Italy, Northern
Italy, Southern
and you have:
Italy, Northern--Civilization
Italy, Northern--Civilization--Germanic influences
etc.
But not "Italy."

A search in the name heading database on LC's online authority file yields a name heading entry for "Italy." That database (whose response is in the form of a browse list) has innumerable pages for corporate names under the initial term "Italy":
Italy.
Italy. Ambasciata (India)
Italy. Confederazione fascista degli industriali.
It also includes "Italy, Southern" with its LC control number "sh 85069035".

The upshot is that the LC Subject heading file at http://id.loc.gov is not the same as a subject heading search in the online authorities database. It also isn't always logical which file headings fall into. The heading "Italy. Ambasciata (India)" is in the name heading file as a corporate name, but "Palazzo Dell'Ambasciata di Spagna (Rome, Italy)" is in the subject heading file as a corporate name. There undoubtedly is a set of rules that explains all of this, but it seems to me that a separation of the subject file and the name files creates a split between headings that will not be mirrored in actual use.

This may not matter if the files are combined in the end, and the URI makes it look like all authorities will have ids that directly follow "/authorities/" in the URI. However, although they are both coded as corporate names, the "Palazzo... " record gets the "cool URI" http://id.loc.gov/authorities/sh2002000509#concept. Note the ending in "concept". I don't know what hash ending will be given to entries from the names file, but I do find it odd that corporate names could have two different hash endings, depending on which file they are from. To be frank, especially since the division into different files doesn't seem terribly logical, and since many items in the name file can also be used as concepts, I would prefer that the "#" indicate the type of heading (personal name, corporate name, conference, geographical name, topic) rather than the file that it comes from. That is, the "#" would reflect the MARC tag: 100, 110, 111, 150, 151.
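To show what I have in mind, here is a sketch of a URI builder whose hash ending is driven by the MARC tag rather than by the source file. The fragment names are invented for illustration and are not anything LC has proposed; the base URI pattern is the one quoted above.

```python
# Hypothetical fragment names keyed by MARC authority tag (my own invention).
HEADING_TYPES = {
    "100": "personalName",
    "110": "corporateName",
    "111": "conference",
    "150": "topic",
    "151": "geographicName",
}


def typed_uri(lccn, marc_tag):
    """Build an authority URI whose fragment reflects the heading type."""
    fragment = HEADING_TYPES.get(marc_tag)
    if fragment is None:
        raise ValueError("no heading type defined for MARC tag " + marc_tag)
    identifier = lccn.replace(" ", "")  # URIs carry the LCCN without spaces
    return f"http://id.loc.gov/authorities/{identifier}#{fragment}"


# The "Palazzo..." record is coded as a corporate name, so it would use tag 110.
print(typed_uri("sh2002000509", "110"))
```

Under a scheme like this, a corporate name would get the same hash ending whether its control number begins with "sh" or with "n".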

Sunday, May 10, 2009

Walt Crawford should read the document

In his March, 2009 Cites & Insights, Walt Crawford does a roundup of comments on the Google/AAP settlement, and gets very agitated when reviewing some of my posts. I'm used to that. But agitation tends to cancel out reason, and Walt gets some things wrong that he might have understood better if he had kept a clear head.

In response to my criticism that Google is digitizing without regard to collection building, Walt says:
"I don’t know of any big academic library or public library that’s a single disciplinary collection—or, realistically, a set of well-curated collections. "
I'd like to hear from academic librarians on this one. My understanding was that an academic library is INDEED a set of well-curated collections.

Walt:
"I don’t remember public universities admitting to substantial costs in cooperating with Google."
What's the cost? Dan Greenstein estimated $1-2 per book. Cheap, but still considerable for a library scanning millions of books. The cost is primarily in staff time, shelving and reshelving books. Under this agreement, there is also the cost of meeting the security requirements that are imposed. (That's in Appendix D) These requirements, which are possibly quite reasonable, will have a greater cost than what most libraries do today for digital materials, and will be one of the primary reasons why some libraries do not contract to receive copies of the digitized items. (Note that some of the potential library partners are working hard to collaborate on the Hathi Trust, which does appear to meet the standards of the agreement; others, however, have decided that they will not attempt to store digital copies.)

In a post I argued that, had libraries gone ahead and digitized their own collections (for the purposes of indexing and searching), this probably would have been considered fair use.

Walt:
"Well…this is not a judicial finding. I find it unfortunate that Google didn’t fight the good fight, and I think it will make things much harder for another commercial entity to attempt similar digitization and use—but I don’t see that library use of “their own materials” has changed in any way."
Not of their hard copy materials, but legal minds think that this changes the landscape for digitization and the use of digitized materials, even closing some options that might have been available before.
"The proposed settlement agreement would give Google a monopoly on the largest digital library of books in the world. It and BRR, which will also be a monopoly, will have considerable freedom to set prices and terms and conditions for Book Search’s commercial services.... If asked, the authors of orphan books in major research libraries might well prefer for their books to be available under Creative Commons licenses or put in the public domain so that fellow researchers could have greater access to them. The BRR will have an institutional bias against encouraging this or considering what terms of access most authors of books in the corpus would want." Pam Samuelson
And to my statement:
"The digitization of books by Google is a massive project that will result in the privatization of a public good: the contents of libraries. While the libraries will still be there, Google will have a de facto monopoly on the online version of their contents."
Walt first prefaces it with:
"I take issue with the very first sentence, as I’ve taken issue consistently with the same claim by others with even higher profiles than Coyle (who are even less likely to ever admit they could be mistaken)."
Well, it would have been nice if he had said who they are. But thanks for letting me know that you consider me a "lower profile" person, Walt. He goes on to say:
"Nonsense. Sheer, utter nonsense. The libraries and contents will still be there. OCA will still be there. I’m sorry, but this one just drives me nuts: It’s demonization of the worst kind and an abuse of the language."
Well, I'm not sure how this abuses language, but there is general agreement that Google gets a monopoly... at least on out-of-print books, which are the vast majority of books in libraries. (Not on public domain books, which is what the OCA digitizes, but anyone can digitize public domain books.) So although the libraries and their contents will still be there, and can be used in hard copy as they are today, no one but Google can digitize the in-copyright works without incurring liability. So "monopoly on online version of their contents" is a factual statement, if you understand that public domain is public domain. (Note, this settlement agreement is extremely complex, with some real zingers hidden in its 134 pages. It's not possible to cover it all in a blog post, so anyone who is interested really needs to read the document itself, painful as that process is.)

In terms of preservation and longevity concerns, Walt asks:
"Won’t the fully-participating libraries have digital copies? I can’t think of institutions with better longevity."
To begin with, only fully participating libraries will have digital copies, and we don't yet know how many libraries will choose that option. Other libraries, even those that are only allowing Google to digitize public domain books, do not get to keep copies of the digital files. (Not only that, public domain libraries that have been cooperating with Google have to delete all of their copies of the files that they hold today, as per this agreement. See Appendix B-3.) The only party with copies of all of the files will be Google.

There are statements in the settlement about what happens if Google "fails to meet the Required Library Services Requirement" or simply decides not to continue. I refer you to page 84 of the settlement, and hope that someone can make sense out of it. The way I read it, libraries can then engage a third-party provider, who will receive the files from Google.

The key thing here is that even in the event of the failure of Google, libraries are not allowed to make uses of their own scans, such as those that are permitted to Google by this settlement. The restriction to "computational uses" and some other minor uses stands, even in that eventuality.

When I say:
"Google should be required to carry all digital Books without discrimination and without liability."
Walt replies:
"You mean “all digital books that Google’s scanned”? I suspect Google wouldn’t argue with this."
That is exactly what I mean, and Google does indeed argue with it. As a matter of fact, the settlement only obligates Google to provide access to at least 85% of the books it scans. That "access" refers to the subscription service that will be available to libraries and other institutions. The settlement says:
"Google may, at its discretion, exclude particular Books from one or more Display Uses for editorial or non-editorial reasons." p.36
That's followed by an affirmation of the "value of the principle of freedom of expression," which I must say rings a bit hollow in this context. Google has to notify the Registry if it has excluded a book, and to provide a digital copy of that book to the Registry. The Registry can then seek out a third party to provide services for excluded books. Here, however, is James Grimmelmann's concern on that front:
"The second is that no one besides the Registry might ever find out that Google has chosen to de-list a book. If the Registry doesn’t or can’t engage a replacement for Google, the book would genuinely vanish from this new Library of Alexandria. Perhaps that should happen for some books, but decisions like that shouldn’t be made in secret. When Google choses to exclude a book for editorial reasons, it should be [R13] required to inform the copyright owner and the general public, not just the Registry. "
What might Google exclude? Perhaps very little, but at the ALA panel in Denver in January, 2009, Dan Clancy of Google gave an off-the-cuff remark that, as I recall, had the word "pornography" in it. Given the recent embarrassment of Amazon when it had to face the fact that many of its best sellers are rather salacious in nature, I can imagine Google also developing concern about the visibility of the texts that make us uncomfortable.

There are a lot of legitimate reasons for concern about this proposed settlement. And I don't think that anything that I have said is "nonsense."

Thursday, April 30, 2009

Updates on Google/AAP Settlement

There have been some developments (and some non-developments) in the saga of the settlement between Google and the Authors Guild/Association of American Publishers.

First, the Internet Archive requested to intervene in the suit, saying that it, too, was digitizing books and therefore should be given the same status as Google. Judge Denny Chin, the judge deciding whether the settlement will be granted, denied the Archive's request with no additional information as to his reasoning.

Next, a group of authors and author representatives asked for an extension on the comment/intervention period, saying that they needed more time to study the agreement and how it affects their rights. Judge Chin granted a four-month extension, which now delays the opt-out period until September of 2009. This, of course, means four more months in which Google and the AAP are in limbo waiting to hear if the settlement will pass the court's test. Meanwhile, Google is going forward as if the settlement will be granted, re-negotiating contracts with the participating libraries and digitizing the files at the Copyright Office. (This latter I know from comments by Robert Kasunic of the Copyright Office at a meeting in Berkeley on the settlement.)

There is no question that it is very difficult to understand the implications of this suit. I myself thought about various books and articles that I have written that have been published and could not determine if I would be a member of the class of rights holders that the AAP and the Authors Guild claim to represent. Items of mine originally published on the Net have been reproduced in books and journals, and some items have been translated and republished. While a living popular author may have kept some control over his or her work, there are many millions more of us who have had some truck with the published word but in less clear circumstances. Anyone without an intellectual property lawyer on retainer will be hard-pressed to make a determination of their status.

The third development, and the most important, is that thanks to the efforts of some hard-working people (including the Internet Archive and Consumer Watchdog), the Justice Department has decided to take a look at antitrust implications of the settlement agreement. It's been widely noted that the agreement gives Google a monopoly over the vast number of books that are orphaned. It has also been noted that the settlement contains a "most favored nation" clause -- that is, that it contains wording that would require that Google always be afforded the most favorable terms that anyone else receives. This is another way of preventing competition; hence the antitrust question.

Numerous amicus briefs, originally intended to meet the May 5 deadline before the extension, are appearing. A good place to follow this is on James Grimmelmann's blog, Laboratorium.

Wednesday, April 08, 2009

E-books rise

The IDPF and AAP keep some statistics on ebook sales. Their latest press release says:
Trade eBook sales were $8,800,000 for January, a very significant 173.6% increase over January 2008. Just a reminder these are wholesale revenues reported from 13 participating Trade Publishers.
E-book sales are still a very small portion of overall book sales, but while sales of hard copies are falling, e-books are rising. The Kindle may have had some influence here.

On a slightly different Kindle note, I received an email from "Monica Goldenberg On behalf of the National Federation of the Blind":
I wanted to let you know that tomorrow, on April 7th in New York City, the Reading Rights Coalition, representing millions of disabled people who cannot read print, will protest the threatened removal of the text-to-speech function from e-books for the Amazon Kindle 2 which promised for the first time easy, mainstream access to over 255,000 books. Hundreds of disabled Americans (the blind and people with dyslexia, learning difficulties, spinal cord injuries, seniors losing vision, stroke survivors) will assemble to demand that the Authors Guild reverse its decision.
(Protest covered by CNET.)

Organizations serving and representing people with text disabilities have been very active in the development of e-book standards and technologies because the e-book promises to greatly increase the access that people with disabilities have to print resources. As e-books and e-book readers become mainstream, more books and cheaper devices become available to everyone, including disabled readers, who are otherwise a niche market with no profit possibilities for companies serving them.

The Authors Guild is missing the point here, restricting access rather than allowing e-books a larger audience. And that audience isn't only the legally blind -- it is all of us who are aging, who have tired eyes, who commute by car, or who just want to take a walk and read the newspaper at the same time. The ability to turn on text-to-speech, with its monotone voice and odd pronunciation, is not the main reason to purchase e-books, but it has its moments for all of us, and is essential for many.