I have just posted the preprint of my current column for the Journal of Academic Librarianship, titled "Mass Digitization of Books." It takes about 4-6 months for the columns to be published, and as I read over this one I can see that things have already changed. For example, when I wrote the column, Google was not yet allowing the download of its public domain books.
I should have included one more very important issue in the article, but it hadn't occurred to me at the time: the effect of this mass digitization on our catalogs. The cataloging rules require that the digital copy be represented in the catalog with its own record. This means that a library that undertakes a mass digitization project on its book collection faces doubling the number of book records in its catalog. Leaving aside the issues of user display for now, and assuming that the creation of the records requires very little human intervention, we can still expect a significant cost in storage space (albeit cheap these days), the size of backups, the time to load and index all of those records, and general overhead in the underlying database.
This brings up the issue of creating catalog entries that represent "multiple versions," that is, having a single record that contains the information for all of the different formats in which the book is available -- regular print, e-book version, digitized copy, large print. There are good arguments both for and against, and it's a complex discussion, but I'll just say that I am convinced that we could structure our catalog records in a way that would make this work.
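To make the idea a little more concrete, here is a rough sketch of what a "multiple versions" record might look like as a data structure. This is only an illustration under my own assumptions: it is not MARC, it is not anything the cataloging rules define, and the class names, field names, and placeholder values are all invented for the example. The point is simply that the shared description lives in one place and each format hangs off it as a separate entry.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Version:
    """One format in which the book is available (hypothetical fields)."""
    format: str      # e.g. "print", "e-book", "digitized", "large print"
    identifier: str  # placeholder for whatever locates this version


@dataclass
class CatalogRecord:
    """A single record: shared description plus a list of versions."""
    title: str
    author: str
    versions: List[Version] = field(default_factory=list)

    def add_version(self, format: str, identifier: str) -> None:
        # Adding a digitized copy extends this record instead of
        # creating a second, nearly identical record.
        self.versions.append(Version(format, identifier))

    def formats(self) -> List[str]:
        # The formats available for this book, in the order they were added.
        return [v.format for v in self.versions]


# Illustrative values only; not real catalog data.
record = CatalogRecord(title="Example Title", author="Example Author")
record.add_version("print", "call-number-placeholder")
record.add_version("digitized", "url-placeholder")
print(record.formats())
```

In a structure like this, a mass digitization project adds one version entry per book rather than one new record per book, which is where the savings in loading and indexing would come from. It does nothing to settle the harder questions about which descriptive details differ between versions.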