State & Hill Fall 2013: Catalysts for Change


Focus: Catalysts


‘Hathi,’ the Hindi word for elephant, signified the universities’ aspirations: a large collection with a powerful search engine and a long memory.

Act 1

Ten years ago, Larry Page (BS ’95), co-founder of Google, contacted the University of Michigan to offer a rather unconventional gift to his alma mater: scans of the University’s entire collection of 7 million books, free of charge. Page had just developed a new scanning system that, unlike previous scanners, could produce text-searchable copies at unprecedented scale (millions of books per year rather than thousands) without damaging the books themselves.

Courant, an economist then serving as provost (the chief academic officer of the University), remembers running a standard cost-benefit analysis. “What’s it going to cost?” he asked the University’s head librarian, Bill Gosling. “What’s in it for us?” Gosling explained that, beyond some staff time, the costs would be borne by Page’s company. But the benefits, says Courant, “were at least very useful, and possibly super-useful.”

For starters, digitization would provide a backup copy of the entire library, ensuring the long-term preservation of everything in the collection. “In the print world, preservation’s assured by the fact that there are a lot of copies out there, people toss them on the shelf, and they rot slowly,” says Courant. But what about historic collections now out of print? Or texts written by hand, before the invention of the printing press? Or limited edition print runs? Think of what happened to the priceless library of Timbuktu last January, or the Library of Congress in 1814, or the great Library of Alexandria, or countless other small libraries near and far. Libraries are safe, but hardly invincible, and preservation is a paramount concern.

Digitization would also make it possible to run detailed text searches of all of the library’s collections. In the coming years, a descendant of the University of Michigan’s first African American student-athlete would use the HathiTrust Digital Library to locate news stories about her ancestor.
An undergraduate honors student would use the HathiTrust to search the complete correspondence and writings of President Eisenhower for a thesis on Eisenhower’s attitudes toward nuclear weapons. And the U.S. Patent and Trademark Office would use the HathiTrust to locate copies of patents lost in an 1836 fire. Full-text searches would lead citizens, scholars, and policymakers directly to the material that interested them, enabling new and important discoveries.

Finally, and perhaps most importantly, digitization would allow the libraries to offer free online access to all public domain works (generally works published before 1923, including almost all of the University’s rare historical collections) to anyone with an internet connection. What Wikipedia did for the encyclopedia, digitization could do for the library, but scholars would be able to access the original source documents themselves, not just a summary of their contents. It would be a powerful public service to the world. “That’s what libraries do,” says Courant. “That’s what universities do.”

So after some haggling over the quality of the scans and access to original digital copies of each, the University of Michigan accepted Larry Page’s offer to digitize its collections. And several years later, after stepping down as provost, Courant was appointed dean of libraries, a post that would allow him to continue to work on the Google scanning project.

HathiTrust by the numbers as of November 12, 2013

10,846,727 total volumes
5,699,059 book titles
283,624 serial titles
3,796,354,450 pages
486 terabytes
128 miles of shelf equivalent
8,813 tons of print matter equivalent
3,492,430 volumes (~32% of total) in the public domain
