Originally published Saturday, April 26, 2008 at 12:00 AM
Scanning world's every book means turning many, many pages
The Associated Press
ANN ARBOR, Mich. — In a dimly lit backroom on the second level of the University of Michigan library's book-shelving department, Courtney Mitchel helped a giant desktop machine digest a rare, centuries-old Bible.
Mitchel is among hundreds of librarians from Minnesota to England making digital versions of the most fragile of the books to be included in Google's Book Search, a portal that will eventually lead users to all the estimated 50 million to 100 million books in the world.
The manual scanning — at up to 600 pages a day — is much slower than Google's regular process.
"It's monotonous," the 24-year-old said.
Then she knit her career hopes into the work.
"But it's still something that I'm learning about — how to interact with really old materials and working with digital imaging, which is relevant to art history."
The unusually tight binding on the early 16th-century polyglot Bible made it hard to expose the portions toward the book's middle as Mitchel spread each pair of pages for the scanner.
Google, the Internet's leader in search and advertising, says the process for scanning the majority of the books in Book Search is proprietary. Employees will not discuss it except to say it is much faster than what Mitchel is doing and it's not destructive.
"It took us quite a while to develop it so we do keep that confidential," said a library manager for Book Search, Ben Bunnell, who declined even to say where Google does the scanning.
Cutting costs
Funding from Google allows the 28 libraries it's working with to cut their digitizing costs because they don't have to pay for scanning the books Google wants to include in Book Search.
Through Book Search, users can track down a book on any topic and read a small portion. If the book's not protected by copyright, users can download the whole thing. If it is, or if they just want to read an original, they can use Book Search to find copies to buy or borrow.
More than 1 million rare or fragile books have been digitized through the Google-Michigan partnership since it began in 2004, with an estimated 6 million to go.
Book Search has the support of many publishers, authors and librarians. But some publishers and authors have sued, claiming the service violates their copyrights. Google says Book Search is aboveboard because Web surfers can retrieve only snippets of copyrighted material through the service.
Brewster Kahle, founder and digital librarian of the Internet Archive at the Open Content Alliance, said Google may be trying to "lock up the public domain" by making proprietary copies of works whose copyrights have expired — which includes the vast majority of the world's books.
Kahle said there's a core value in the project, but he questioned whether Google will share the works it digitizes with other search engines.
"We believe there should be many libraries, many publishers, many search engines, many types of users from different points of view," Kahle said.
John Price Wilkin, Michigan's associate university librarian, called Kahle's stance "theoretical."
"Our volumes are entirely open in the sense that people can find them, read them, use them, do all the things that they would do in scholarship or pleasure," Wilkin said.
In the room where Mitchel and colleague Chava Israel, an artist, work, the temperature is always in the 60s.
Each technician has a slightly angled table with a flexible middle that cradles books and holds them still while two overhead cameras photograph the pages. Sometimes the women play music or listen to news online, but they often work in silence, save the clicks of their computers and scanners.
Scanning software
Mitchel glides in a rolling chair back and forth between scanner and computer, computer and scanner, turning page upon page and clicking her mouse to shoot each pair. Once the images reach the computer, the women use the book-scanning software Omniscan from Germany's Zeutschel GmbH to clean them up.
A final click of the mouse sends each digitized book to Google for optical character-recognition processing, which makes the text searchable. Google then returns a copy of the images and data to the library and posts another to the Web.
Israel, 44, who has been scanning books for three years, takes a philosophical view of the project.
"My favorite part is working with older books and being able to preserve a lot of the knowledge and help bring more people access," Israel said. "I turn pages. It's kind of meditative."
Copyright © 2008 The Seattle Times Company