31 October 2009

Web addresses to get non-Latin characters

The internet is on the verge of its biggest shake-up in 40 years, and the most significant one since it crossed from academia to commerce in 1993. Icann, the organisation that oversees internet domain names, has approved the use of non-Latin characters such as Mandarin, Hindi, Cyrillic and Arabic in web addresses.

That means the huge number of people who use the internet but are not native English speakers will be able to type web addresses in their own language and navigate to the pages. At present they have to add “.com” or “.org” to the end of website names written in their own language, or even write an entire site address in unfamiliar letters. Applications to register the new generation of domains will be accepted from next month, and the first such domains could be running by mid-2010, said Rod Beckstrom, the president of Icann.

“Of the 1.6 billion users today worldwide, more than half use languages that have scripts that are not Latin-based,” Beckstrom said at the opening of Icann’s conference in Seoul, South Korea, this week.

The conference approved the change after more than nine years of work and two years of testing.

Lesley Cowley, chief executive of the UK naming organisation Nominet, said: “There are a further five billion people who are not yet online — most of these people are from nations where their language is not based on the Latin script. Allowing non-Latin based scripts will give this large group of people easier access to the web, helping to bring them online and making the internet more inclusive. This move will undoubtedly bring freedom to a globally connected community.”

Progress towards “Internationalised Domain Names” (IDNs) has been slow because every computer on the internet is actually addressed by a string of numbers; systems called “domain name servers” translate a web address such as “guardian.co.uk” into “77.91.248.30”, uniquely identifying it. The problem is that those systems were built around a limited set of Latin letters, digits and hyphens, so non-Latin “Unicode” characters such as 大 月 人 require an extra layer of software to handle them.
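To make that “extra layer” concrete, here is a minimal Python sketch of how a Unicode name can be turned into a plain ASCII form that existing name servers can store. The article does not name the encoding; the sketch assumes the Punycode-based “xn--” form used for IDNs in practice, and relies on Python's built-in “idna” codec, which implements the older IDNA 2003 rules. The helper names and the “.example” domains are illustrative only.

```python
def to_ascii_hostname(hostname: str) -> str:
    """Convert a Unicode hostname to the ASCII form used on the wire.

    Assumption: Python's built-in "idna" codec (IDNA 2003 rules) is close
    enough for illustration; real registries may apply newer rules.
    """
    return hostname.encode("idna").decode("ascii")


def to_unicode_hostname(ascii_hostname: str) -> str:
    """Convert an ASCII "xn--" hostname back to its Unicode form."""
    return ascii_hostname.encode("ascii").decode("idna")


if __name__ == "__main__":
    # Hypothetical example names; only the labels with non-ASCII
    # characters are rewritten, plain ASCII labels pass through unchanged.
    for name in ("guardian.co.uk", "bücher.example", "大月人.example"):
        ascii_name = to_ascii_hostname(name)
        print(name, "->", ascii_name, "->", to_unicode_hostname(ascii_name))
```

The name server only ever sees the ASCII form; the browser is responsible for converting in both directions, which is why the change needs software support on the user's side as well.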

But experts have warned that adding Unicode to web addresses could increase the potential for scams. Fake websites imitating those of banks could use look-alike characters from other languages in order to fool people. “The incidental difference between BankofAmerica.com from BánkofAmerica.com would be a prime opportunity for cybercriminals to take advantage of the average web user,” Nora Nanayakkara, director of business development at the domain name seller Sedo, told Web User magazine.
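The look-alike risk can be illustrated with a short Python sketch that flags non-ASCII characters in an address and checks whether it matches a trusted name once accents are removed. This is a rough illustration under stated assumptions, not how any browser or registry actually screens names: the function names are hypothetical, and accent stripping will not catch look-alikes from a different script, such as a Cyrillic letter standing in for a Latin one.

```python
import unicodedata


def strip_accents(text: str) -> str:
    """Decompose characters and drop combining marks (accents)."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def non_ascii_characters(hostname: str) -> list[tuple[str, str]]:
    """List each non-ASCII character with its official Unicode name."""
    return [
        (ch, unicodedata.name(ch, "UNKNOWN"))
        for ch in hostname
        if ord(ch) > 127
    ]


def is_accent_lookalike(candidate: str, trusted: str) -> bool:
    """True if the names differ, but only by accents (case is ignored)."""
    cand, trust = candidate.lower(), trusted.lower()
    return cand != trust and strip_accents(cand) == strip_accents(trust)


if __name__ == "__main__":
    fake, real = "BánkofAmerica.com", "BankofAmerica.com"
    print(non_ascii_characters(fake))
    print(is_accent_lookalike(fake, real))
```

A check like this could warn a user before the page loads, but a fuller defence would need a table of visually confusable characters across scripts.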

The measures will mean more people can use the internet with keyboards in their own languages, rather than struggling with unfamiliar Roman letters as used in the west. Thus a Korean user will be able to write a web address almost entirely in Korean script, rather than a few characters in Korean followed by the suffix “.kr”. At present, only the part of the domain name before the suffix can be in non-Latin script.

“It’s more incremental [than previous changes] but it’s the single biggest change in 10 or 15 years,” Beckstrom said. “It’s about making the internet more global and more accessible. One world, one internet.”

One thing that will not be going away from web addresses, though, is the “http://” prefix. Tim Berners-Lee, the inventor of the web, said earlier this month that he wishes he had not made it mandatory in web addresses: “Look at all the paper and trees that could have been saved if people had not had to write or type out those slashes on paper over the years, not to mention the human labour and time spent typing those two keystrokes countless millions of times in browser address boxes.”

Icann has loosened its previously tight ties with the US government this year as it prepares to move to a more international system. – guardian.co.uk