Sunday, November 12, 2006 - Page updated at 12:00 AM

Could the future bring the Internet as your personal adviser?

The New York Times

SAN FRANCISCO - From the billions of documents that form the World Wide Web and the links that weave them together, computer scientists and a growing collection of startup companies are finding new ways to mine human intelligence.

Their goal is to add a layer of meaning to the existing Web that would make it less of a catalog and more of a guide, and provide the foundation for systems that can reason in a human fashion. That level of artificial intelligence, with machines doing the thinking instead of simply following commands, has eluded researchers for more than 50 years.

The effort, referred to as Web 3.0, is in its infancy, and the idea has given rise to skeptics who have called it an unobtainable vision. But the underlying technologies are rapidly gaining adherents, at big companies such as IBM and Google and at small ones. Their projects often center on simple, practical uses, from producing vacation recommendations to predicting the next hit song.

But in the future, more powerful systems could act as personal advisers in areas as diverse as financial planning, with an intelligent system mapping out a retirement plan for a couple, for instance, or educational consulting, with the Web helping a high-school student identify the right college.

The projects aimed at creating Web 3.0 take advantage of increasingly powerful computers that can quickly and completely scour the Web.

"I call it the World Wide Database," said Nova Spivack, founder of a startup whose technology detects relationships between nuggets of information, rather than storing the information itself. "We are going from a Web of connected documents to a Web of connected data."

Seamless connection?

Web 2.0, which describes the ability to seamlessly connect applications (such as word processing) and services (such as photo-sharing) over the Internet, has in recent months become the focus of dot-com-style hype in Silicon Valley. But commercial interest in Web 3.0 (or the "semantic Web," for the idea of adding meaning) is only now emerging.

The classic example of the Web 2.0 era is the "mash-up," connecting a rental-housing Web site with Google Maps to create a more useful service that automatically shows the location of each rental listing.

In contrast, the Holy Grail for developers of the semantic Web is to build a system that can give a reasonable and complete response to a simple question such as: "I'm looking for a warm place to vacation and I have a budget of $3,000. Oh, and I have an 11-year-old child."

Under today's system, such a query can lead to hours of sifting through lists of flights, hotels and car rentals, and the options are often at odds with one another. Under Web 3.0, the same search would ideally call up a complete vacation package planned as meticulously as if it had been assembled by a travel agent.

How such systems will be built and how soon they will begin providing meaningful answers are a matter of vigorous debate among academic researchers and commercial technologists. Some experts are focused on creating a vast structure to supplant the existing Web; others are developing pragmatic tools that extract meaning from the existing Web.

But all agree that if such systems emerge, they will instantly become more commercially valuable than today's search engines, which return thousands or even millions of documents but as a rule do not answer questions directly.

Underscoring the potential of mining human knowledge is an extraordinarily profitable example: the basic technology that made Google possible, "PageRank," systematically exploits human knowledge and decisions about what is significant to order search results. It interprets a link from one page to another as a "vote," but votes cast by pages considered popular are weighted more heavily.

Today, researchers are pushing further. Spivack's company, Radar Networks, for example, is one of several working to exploit the content of social computing sites, which allow users to collaborate in gathering and adding their thoughts to a wide array of content, from travel to movies.

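The link-as-vote idea described above can be sketched in a few lines of code. What follows is a minimal textbook-style illustration, not Google's actual system: the page names, the tiny link graph, the function name `pagerank` and the damping value of 0.85 are all assumptions made for the example. Each page splits its current score evenly among the pages it links to, so a vote from a high-scoring page counts for more than a vote from an obscure one.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Score pages by treating each link as a weighted vote.

    `links` maps a page name to the list of pages it links to.
    The damping factor (0.85 here) is a conventional choice, assumed
    for illustration only.
    """
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}  # start with equal scores

    for _ in range(iterations):
        # Every page keeps a small baseline score regardless of votes.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            # A page with no outgoing links spreads its vote over all pages.
            recipients = targets if targets else pages
            share = damping * rank[page] / len(recipients)
            for target in recipients:
                new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical four-page Web: three pages link to "maps".
links = {
    "rentals": ["maps"],
    "reviews": ["maps", "rentals"],
    "maps": ["rentals"],
    "blog": ["maps"],
}
scores = pagerank(links)
for page, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(f"{page}: {score:.3f}")
```

In this toy graph, "maps" ends up with the highest score because it collects the most votes, and "rentals" ranks second largely because its single strong vote comes from "maps" itself.
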
Radar's technology is based on a next-generation database system that stores associations, such as one person's relationship to another (colleague, friend, brother), rather than specific items such as text or numbers.

One example that hints at the potential of such systems is KnowItAll, a project by a group of University of Washington faculty members and students that has been financed by Google. One sample system created using the technology is Opine, which is designed to extract and aggregate user-posted information from product and review sites.

One demonstration project "understands" concepts such as room temperature, bed comfort and hotel price, and can distinguish between concepts such as "great," "almost great" and "mostly OK" to provide useful direct answers to questions about hotels. Whereas today's travel-recommendation sites force people to weed through long lists of comments and observations left by others, the Web 3.0 system would weigh and rank all of the comments and find, by cognitive deduction, just the right hotel for a particular user.

"The system will know that spotless is better than clean," said Oren Etzioni, a University of Washington artificial-intelligence researcher who is a leader of the project. "There is the growing realization that text on the Web is a tremendous resource."

Researchers and entrepreneurs say that while it is unlikely there will be complete artificial-intelligence systems any time soon, if ever, the Web is producing an expanding cascade of useful intelligence-based systems grounded in commercial efforts to structure and mine the Web. Focused areas such as travel sites and product and restaurant reviews are obvious candidates to build such systems, which will herald the arrival of Web 3.0.

"It's a hot topic, and people haven't realized this spooky thing about how much they are depending on A.I.," said W. Daniel Hillis, a veteran artificial-intelligence researcher who founded Metaweb Technologies in San Francisco last year.

He was referring to the thousands of current examples of limited intelligence, from smart Web cams that watch for intruders to Web-based e-mail programs that recognize dates and locations.

Like Radar Networks, Metaweb is not publicly describing what its service or product will be, though the company's Web site states that Metaweb intends to "build a better infrastructure for the Web."

"It is pretty clear that human knowledge is out there and more exposed to machines than it ever was before," Hillis said.

Project's roots

Radar Networks and Metaweb have their roots in part in technology-development work done originally for the military and intelligence agencies. Early research financed by the National Security Agency, the CIA and the Defense Advanced Research Projects Agency predated a pioneering call for a semantic Web made in 1999 by Tim Berners-Lee, creator of the World Wide Web a decade earlier.

Intelligence agencies were early backers of the idea of using artificial-intelligence techniques to sort through gigabytes of digital information, an idea gaining currency in the coming Web 3.0 era. That led during the 1990s to the emergence of a small but thriving "text-analytics" industry aimed at helping large corporations extract information from databases.

It also helped underwrite the work of Doug Lenat, a computer scientist whose company, Cycorp, of Austin, Texas, sells systems and services to the government and large corporations. For the past 25 years, Lenat has labored on an artificial-intelligence system named Cyc that he claimed would someday be able to answer questions posed in spoken or written language, and to reason.

Cyc originally was built by entering millions of common-sense facts that the computer system would "learn."
But in a lecture given at Google earlier this year, Lenat said Cyc is learning by mining the World Wide Web, a process that is part of how Web 3.0 is being built.

Separately, IBM researchers say they are routinely using a digital snapshot of the 6 billion documents that make up the nonpornographic World Wide Web to do survey research and answer other questions for corporate customers trying to solve problems as diverse as market research and corporate branding.

Daniel Gruhl, a staff scientist at IBM's Almaden Research Center in San Jose, Calif., said the data-mining system, WebFountain, has been used to determine the attitudes of young people on death for an insurance company and was able to choose between the terms "utility computing" and "grid computing" for an IBM branding effort.

"It turned out that only geeks liked the term 'grid computing,'" he said.

IBM has used the system to do market research for television networks on the popularity of shows by mining a popular online community site, he said. Additionally, by mining the "buzz" on college music Web sites, researchers predicted songs that would hit the top of the pop charts two weeks in advance, a capability more impressive than current market-research predictions.

Copyright © 2006 The Seattle Times Company