2 February 2005

Search and you shall find

In 1990, which is almost unimaginably long ago in internet years, the notion that computer scientists might one day create an artificial replacement for human memory was the stuff of science fiction. Literally so: the idea was the premise of Total Recall, the needlessly violent and confusing Arnold Schwarzenegger movie released that year by TriStar Pictures. (The future governor of California was cast, appropriately, some might have argued, as a man who has had part of his brain stolen.)

Real computer science in 1990 was far more modest. At McGill University in Canada, one of its practitioners, a student named Alan Emtage, was busy developing a program that would enable people to find documents on the embryonic computer network known as the internet. He wanted to call it Archives, but the system he was using didn’t allow names that long, so the first-ever search engine had to be called Archie instead.

At least, that’s probably how it happened: most of the information in the paragraph above comes — albeit double-checked — originally from sources reached through internet search engines. Emtage couldn’t possibly have known at the time, but his invention gave birth to an idea that would come to change, at a fundamental level, the way we think. Its most feverish point was reached yesterday, when Bill Gates officially launched MSN Search, Microsoft’s long-awaited rival to Google, the website that has become a sort of outboard brain for millions of people in the information age.

Google and its rivals no longer just point us in the direction of useful websites. They help us navigate through online dictionaries and telephone directories, scholarly journals and image libraries; one Google service launched this month allows clip-by-clip searching of American television news broadcasts. The company has announced plans to digitise a million books from Oxford University’s Bodleian library, along with many millions more from Harvard, Stanford and the New York Public Library.

Enter “time in Beijing” into the AskJeeves.com search engine and it will tell you the actual time in Beijing; put “John-Stevens New-Mexico” into Google and it will give you the phone numbers and home addresses of four people by that name in that state.

“It becomes an extension of my mind, an extension of my taste, my sensibility, my active memory,” says Sherry Turkle, a philosopher at the Massachusetts Institute of Technology in Cambridge. We no longer need to remember, yet nor can we ever truly forget, because everything’s out there, logged and stored. It’s not so much that knowledge is power. Dominating the market in the tools people use to navigate that knowledge, which is what Gates wants to do now, is the true source of power. Own that, and is it too much of an overstatement to say that you own a little piece of people’s brains?

In truth, Microsoft has been incredibly late in coming to the game. A recent survey, conducted before the full launch of MSN Search, showed that the site that 66% of Microsoft employees were using to do their searching was … Google. The California-based company, founded in 1998 by the Stanford students Larry Page and Sergey Brin, revolutionised the concept of searching. Its system relies on the wisdom of crowds — the belief that the more people link to a page, the more authoritative it becomes.

At a stroke, searching was democratised. There was no need for people to sit at desks, categorising pages according to their subject matter; the classification of information on the internet became a self-supporting, organic structure. The power to find things on the internet was in the hands of the people, not the programmers. (MSN Search appears to follow a similar logic.)
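The principle described above, that a page gains authority as more pages link to it, can be illustrated with a deliberately simplified sketch. It only counts inbound links; it is not Google's actual ranking system, which also weighs how important each linking page is. The function name `rank_by_inbound_links` and the page names in `toy_web` are invented for illustration.

```python
from collections import Counter


def rank_by_inbound_links(links):
    """Rank pages by how many distinct other pages link to them.

    `links` maps each page to the list of pages it links to.
    """
    inbound = Counter()
    for source, targets in links.items():
        # Count each source at most once per target and ignore self-links.
        for target in set(targets):
            if target != source:
                inbound[target] += 1
    # Pages that nobody links to still appear, with a count of zero.
    for page in links:
        inbound.setdefault(page, 0)
    # Sort by descending link count, then by name so ties are stable.
    return sorted(inbound.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical toy web: three pages link to "a.example", one to "b.example".
toy_web = {
    "a.example": ["b.example"],
    "b.example": ["a.example"],
    "c.example": ["a.example", "a.example"],
    "d.example": ["a.example", "d.example"],
}
ranking = rank_by_inbound_links(toy_web)
```

Here `ranking` lists "a.example" first, because three different pages point to it. No editor has classified anything: the ordering falls out of the links that page authors chose to make.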

Google also happened upon an ingenious source of revenue for its ever-expanding operations. As web users became increasingly exhausted by louder and louder and more and more annoying advertising, Google introduced low-key advertising on its results pages, matched to the words being searched for. The ads got their value not by bludgeoning users around the head, but by being precisely targeted to their specific interests – an advertiser’s demographic dream. Competition is hotting up: A9, the search facility attached to the website of the bookseller Amazon.com, allows visitors to search the text and footnotes of books — real books, made of paper — before committing to purchasing them.

But do we really want all this new information? “The end result of a perfect search world is that as fast as answers are generated and consumed, new questions come quicker, with the consequence that ignorance expands,” the internet guru Kevin Kelly writes in an email. “What we know we don’t know expands faster than what we know. This has been true for a while and will only continue. Science, in fact, will come to be measured as the expansion of our ignorance, rather than an expansion of our knowledge.”

Turkle, meanwhile, says she has noticed subtle transformations in the ways some of her students think and order their ideas. “There is this sense that the world is out there to be Googled,” she says, “and there is this associative glut. But linking from one thing to another is not the same as having something to say. A structured thought is more than a link.”

Besides, she adds, something seems to be missing. She is organising a forthcoming trip to Bangkok with her daughter, and recently pulled up search results promising 50 000 websites related to her area of interest, where before she would clip cuttings from magazines and keep them in files. “But there’s a superficiality, a sameness. I get 50 000 sites, but the one I cut out from a magazine came from somewhere quirky, or it was something one of my friends had sent me …”

And then there’s the question of accuracy. In fields where tracking down a completely correct fact is the very point of the job, people still regard search engines as very far from direct access points to the truth. David Elias, a professional verifier of answers for television quizzes, says he tends to mistrust the web: “There’s so much rubbish out there.”

“The rule is that you have to have two independent sources saying the same thing,” says Katie Purnell, who researched questions for the quiz show Grand Slam. “We would very rarely go with a question if we could only find the answers by searching the internet. There’s still something about the Encyclopaedia Britannica.” (Although the Encyclopaedia Britannica is, naturally, online.)

But there is no sign of any diminution of our appetite for what aficionados now know simply by the noun “search”. Over the past year, the battle between the search engines has intensified dramatically, and last year AOL and Yahoo launched their own search technologies. Google continues to be the market leader, and now handles around 42% of all internet searches: a commanding, but far from unassailable, position. Enormous sums are involved: the company’s flotation, last autumn, raised more than $1bn on the opening day, and its price now stands at almost $200 a share. It is all a very long way from just a few years ago, when what you put on the web was what mattered, not how you helped people find it.

With the market consolidated among a handful of strong players, the biggest threat to Google and MSN Search will be from non-traditional companies that choose to do something daring: abandon Google’s page-ranking model. Sites with names like Orase and Technorati are offering a new phenomenon, real-time searching, which catches events and web pages as they are put on the internet, instead of sending “spiders” through the web to index them afterwards. Meanwhile, a variety of new tools promise to seamlessly integrate searching for data on the web with searching for data on our own computers. It need not stop there: if internet telephone calls become as ubiquitous as email, why shouldn’t we search through archives of our every conversation? Every new development is a new opportunity to deliver value to the user (and thus, not coincidentally, to advertise to them even more effectively). If search engines really do function as artificial memories, they are memories that are set to become ever larger and ever more rapidly updated.

But will being able to access more information really help us do the things we want to do? Dr Eric Davies, a library scientist at Loughborough University, has seen it all: long rows of card index files, followed by proprietary computer-search systems installed in libraries, and now the internet. “Fundamentally the important point remains the same,” he says. “Before you go anywhere near a keyboard or a catalogue, you have to define the subject you’re after. These are disciplines we need to recover,” he adds, with only the faintest trace of exasperation, “and instil in people who just have the Google mentality.” You can’t fight technology, Sherry Turkle says, but you can influence the way it changes your life. “I can’t ask my students to pretend they don’t live in this century,” she says. “But you have to put the technology in its place.”

Which is the best engine to use?

Simple searching

For a no-frills query, good old Google remains the best. But if you’re looking to organise and refine the results (eg, you’re after Franz Ferdinand the band, but are bombarded with pages eulogising the assassinated archduke), you want a clustering site. These combine results from the major engines and sort them into folders that group similar items together. Clusty (http://clusty.com) is the most user-friendly. Clusty can uncover unexpected results and relationships between items: who’d have thought McFly was not only a band and a character in Back to the Future, but an angling accessory too?

News

To keep up with breaking stories, check out News Now (www.newsnow.co.uk). UK-based, this site casts a fast and wide net and refreshes every five minutes. Although you need to be a paying subscriber to perform your own searches, its “Hot Topic” feeds are excellent (yesterday these keywords included Michael Jackson, Hercules Crash and Guantánamo). The presentation is clear and tidy, with national flags indicating the country of origin of each report.

Audio and video

For multimedia content, visit the Singing Fish. Unlike traditional search engines, it only indexes particular types of media files, including Windows Media, Real, QuickTime, and mp3s. Say you’re a Kurt Vonnegut fan who would like nothing more than to hear him read from Slaughterhouse-Five. Type in his name and up pops a 10-minute excerpt. The search options are configurable, meaning that you can choose to search just for TV or radio, for example, and it is possible to save your settings.

Expert

Teoma (www.teoma.com) uses something called Subject-Specific Popularity, which ranks a site based on the number of same-subject pages that reference it, not just general popularity. This means you get less speculative tosh muddled up with the most relevant results. Teoma’s most interesting feature is its Resource tool, which can find and identify expert resources about a particular subject. These sites feature lists of other authoritative sites and links relating to the search topic. So if you do a search for “East Germany”, as well as getting a standard list of results, you get a page of links on literature from the former GDR created by an American university. – Guardian Unlimited