GOOGLE INDEXING BOOKS OF MAJOR LIBRARIES

The E-Sylum:  Volume 7, Number 51, December 19, 2004, Article 4

  From Forbes magazine:
  "Google just made the Internet significantly bigger 
  -- at least for the worlds of search and book 
  publishing. 

  The Mountain View, Calif., search engine company has 
  reached agreements with Harvard University, The 
  University of Michigan, Stanford University, Oxford 
  University, and The New York Public Library to scan 
  their books and make the digitized contents searchable. 
  Up to 50 million titles are involved, including titles
  held in common by the libraries. 

  The project, which will probably take five or more 
  years to complete, will deliver a database of volumes
  that Google users can search. Users will be able to 
  download entire volumes in the database that are not 
  under copyright protection. Books under copyright 
  will be excerpted at varying lengths, depending on 
  whether Google has agreements with their publishers 
  to carry longer excerpts."

  To read the full article: Full Story

  From the New York Times:
  "It may be only a step on a long road toward the 
  long-predicted global virtual library. But the 
  collaboration of Google and research institutions ...
  is a major stride in an ambitious Internet effort by 
  various parties. The goal is to expand the Web 
  beyond its current valuable, if eclectic, body of 
  material and create a digital card catalog and 
  searchable library for the world's books, scholarly 
  papers and special collections."

  "Within two decades, most of the world's knowledge 
  will be digitized and available, one hopes for free
  reading on the Internet, just as there is free 
  reading in libraries today," said Michael A. Keller, 
  Stanford University's head librarian.

  To read the full article: Full Story

  From the Associated Press:
  "The Michigan and Stanford libraries are the only 
  two so far to agree to submit all their material to 
  Google's scanners.

  The New York library is allowing Google to include a
  small portion of its books no longer covered by 
  copyright while Harvard is confining its participation
  to 40,000 volumes so it can gauge how well the process 
  works. Oxford wants Google to scan all its books 
  originally published before 1901."

  "This is the day the world changes," said John Wilkin, 
  a University of Michigan librarian working with Google. 
  "It will be disruptive because some people will worry 
  that this is the beginning of the end of libraries. 
  But this is something we have to do to revitalize the 
  profession and make it more meaningful."

  To read the full article: Full Story

  From the Boston Globe:
  "Company spokeswoman Susan Wojcicki said the project
  is the fulfillment of a dream for founders Sergey Brin 
  and Larry Page. "This is something the founders wanted 
  to do before they even started Google," she said. "The 
  mission of the company, from the day it started, was 
  to organize the world's information and make it easily 
  accessible."

  But Google also hopes that its book search service will
  give it a major edge over rival search services, including
  an up-and-coming challenge from software titan Microsoft 
  Corp. "Google has constantly over time always been 
  increasing our search index," said Wojcicki. "Having a 
  more comprehensive search engine . . . leads to, we 
  believe, a better product." In turn, that means more 
  visitors to Google's search service, which makes money 
  by selling advertisements."

  To read the full article: Full Story

  How does this commercial effort affect nonprofit efforts
  to digitize some of the same material?  In earlier E-Sylums
  we discussed the "million book" plans.  From the San Jose
  Mercury News: 

  "Libraries from India, China, Egypt, Canada and the 
  Netherlands, for instance, are working with the San 
  Francisco-based non-profit Internet Archive on a plan to
  create a publicly available digital archive of one million 
  books on the Internet.

  "The public domain belongs to the public and should be 
  publicly accessible without running only into commercial 
  interests," said Brewster Kahle, founder and president of
  the Internet Archive. "There's room for both, and I hope 
  that we do not evolve into an either-or situation."

  To read the full article: Full Story

  Bill Rosenblum writes: "My son works for the University 
  of Michigan library as a digital librarian (whatever that
  is) and has been involved in the acquisition of scholarly 
  publications to be put on line.  He told me that he and 
  his colleagues were told of the Google plan about two 
  hours before the press release and were as surprised as 
  most everybody else."

  Dick Johnson adds: "It made news this week. Five major 
  libraries in the U.S. and U.K. agreed to have their books 
  of greatest scholarly interest digitized and placed 
  on Google's website for anyone in the world to access. 
  This continued a plan announced earlier, and reported 
  in E-Sylum last week, that a group of libraries in the 
  U.S., Canada, Netherlands, Egypt and China plan to 
  digitize one million books, with 70,000 available by 
  April 2005. 

  The five major libraries that have agreed to open their 
  stacks are Harvard, University of Michigan, Stanford 
  and the New York Public Library in the U.S. and Oxford 
  University in England. The agreement with each library 
  differs. Harvard's agreement is limited to 40,000 volumes, 
  in contrast to the full collections at Stanford and 
  Michigan; NYPL agreed to "fragile material not under 
  copyright."

  This has come about at the present time because Google 
  became wealthy from its stock offering last summer. It
  is employing its newly gained wealth to stretch its 
  already humongous databank towards a long-predicted 
  global virtual library. The cost is estimated at $10 
  to digitize each book.

  The digitizing task is labor intensive. It requires 
  several people to operate sophisticated scanners whose 
  high-resolution cameras capture one page at a time. At 
  Stanford, Google hopes to scan 50,000 pages a day within 
  a month, doubling this amount with more people and 
  equipment.

  When this story first broke, December 14th, 629 
  newspapers ran the story or commented on it before Google 
  took the story down. One of the best was by George Kerevan 
  editorializing in Scotsman.com. "I can't wait," he wrote, 
  "for Google to get on-line with the Bodleian Library's one
  million books. Yet here's one other thing I learned from
  a physical library space: the daunting scale of human 
  knowledge and our inability to truly comprehend only a
  fraction of it."

  How soon a large number of numismatic works will 
  be digitized, perhaps among those millions of books in 
  five or more libraries, remains to be seen. Existing 
  numismatic libraries, however, still have a major 
  function to perform in gathering bound books and 
  documents for present and future numismatic scholars
  to use."

  Kerevan's comments: Kerevan's comments

  [There are a lot of caveats in Google's ambitious plan;
  for example, Harvard is hedging, wanting proof that the
  process will not damage its holdings.  But it's another
  important step in the march toward digitization.  I
  question the $10/book estimate, for despite all the
  high-tech trappings, the drudgery of scanning and correcting
  text is still a slow process, and time equals money;
  see the following item by Mike Marotta about the effort
  going into making The Electronic Numismatist.  If Google
  uses gentle but efficient book-scanning robots (which I'm
  not sure exist yet), then perhaps the $10/volume estimate
  is correct, but human editors with subject matter knowledge
  are still likely to do a better job of digitization,
  albeit at a higher price.

  Collectively, how many out-of-copyright numismatic works
  are in those libraries?  More importantly for writers
  and researchers, how many tidbits of numismatic knowledge
  are locked in those pages, currently unseen and unknown?
  As more works become accessible through indexing, more and
  more new numismatic information is likely to become 
  available to researchers.  It could indeed be a whole new
  world.  -Editor]
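
  For readers who want to test the $10/book estimate themselves,
  here is a rough back-of-envelope sketch. The $10 cost, the
  50,000 pages-a-day rate, the 40,000 Harvard volumes and the one
  million Bodleian books are the figures quoted above. The 300
  pages per book is purely an assumption for illustration, and
  the function names are made up for this example. The daily rate
  is the Stanford target, so applying it to other collections is
  only a what-if.

```python
# Back-of-envelope arithmetic using figures quoted in the article.
COST_PER_BOOK_USD = 10        # quoted estimate per digitized book
PAGES_PER_DAY = 50_000        # quoted Stanford scanning target
HARVARD_VOLUMES = 40_000      # quoted size of Harvard's trial
BODLEIAN_BOOKS = 1_000_000    # figure cited by George Kerevan

# ASSUMPTION (not from the article): average length of a book.
ASSUMED_PAGES_PER_BOOK = 300


def scanning_cost(books, cost_per_book=COST_PER_BOOK_USD):
    """Total cost in dollars if every book costs the same to digitize."""
    return books * cost_per_book


def scanning_days(books,
                  pages_per_book=ASSUMED_PAGES_PER_BOOK,
                  pages_per_day=PAGES_PER_DAY):
    """Days of scanning needed at a fixed daily page rate."""
    return books * pages_per_book / pages_per_day


def cost_per_page(cost_per_book=COST_PER_BOOK_USD,
                  pages_per_book=ASSUMED_PAGES_PER_BOOK):
    """Budget per page implied by a flat per-book cost."""
    return cost_per_book / pages_per_book


if __name__ == "__main__":
    for label, books in [("Harvard trial", HARVARD_VOLUMES),
                         ("Bodleian", BODLEIAN_BOOKS)]:
        print(f"{label}: ${scanning_cost(books):,} and "
              f"{scanning_days(books):,.0f} scanning days")
    print(f"Implied budget per page: ${cost_per_page():.3f}")
```

  Under that 300-page assumption, $10 a book leaves only a few
  cents per page for scanning and any correction, which is the
  heart of the doubt raised above. Change the assumed page count
  to see how sensitive the result is.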

Wayne Homren, Editor

The Numismatic Bibliomania Society is a non-profit organization 
promoting numismatic literature. See our web site at coinbooks.org.

To submit items for publication in The E-Sylum, write to the Editor 
at this address: whomren@coinlibrary.com

To subscribe go to: https://my.binhost.com/lists/listinfo/esylum
