Internationalized Domain Names (IDN)
Introduction

The abbreviation IDN stands for Internationalized Domain Name, also called a
multilingual domain name. Traditional domain names are limited to a character
set consisting of the Latin letters (A-Z, case-insensitive, so a-z is included),
digits (0-9) and the hyphen (-), referred to below as LDH (Letters, Digits,
Hyphen) characters. An IDN is a domain name that contains characters from the
Unicode repertoire, and therefore may contain letters with diacritics, as
required by many European languages, or characters drawn from non-Latin scripts
such as Arabic or Chinese.
Although several approaches to implementing IDNs have been proposed, the only
recognized (standard) mechanism is IDNA (Internationalized Domain Names in
Applications), which was agreed by the IETF’s IDN Working Group. The IETF
announced it as a proposed standard in March 2003. The proposed standard
comprises the following IETF RFCs: RFC 3490, RFC 3491 and RFC 3492.
IDNA is designed to handle Internet domain names that contain characters
outside the LDH set. Its basic idea is that a user application (e.g. a web
browser) converts the non-LDH characters of an IDN into a suitable LDH
representation. This design gives maximum compatibility with the existing DNS,
which only supports domain names made up of LDH characters: deploying IDNA
requires no changes to the current Internet infrastructure and preserves the
robustness of the DNS. There is no need to alter any existing Internet
protocols, DNS servers or the resolvers on users' computers in order to get
IDNs working. In other words, lower-layer protocols do not need to be aware of
IDNs. IDNA operates at the application layer, the top layer of the OSI model, so
introducing IDNs only requires upgrading software that interacts with domain
names, such as web browsers, e-mail and FTP clients, HTML editors etc. In some
cases, it is enough to upgrade the underlying software infrastructure, for
example runtime libraries like libc, virtual machines, etc.
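The conversion of non-LDH labels into LDH ones can be illustrated with a short Python sketch. It is not a complete IDNA implementation: the function names are hypothetical, the domain `bücher.example` is only an illustration, and Unicode NFKC normalization plus lowercasing is used as a simplified stand-in for the Nameprep step of RFC 3491. Each label that is already LDH passes through unchanged; any other label is encoded with Punycode (RFC 3492) and given the `xn--` prefix.

```python
import unicodedata

ACE_PREFIX = "xn--"  # marks a label that carries a Punycode-encoded IDN label


def label_to_ascii(label: str) -> str:
    """Convert one label of a domain name to its LDH form (simplified)."""
    # Assumption: NFKC + lowercasing approximates Nameprep for simple input.
    normalized = unicodedata.normalize("NFKC", label).lower()
    if normalized.isascii():
        # Already an LDH label, nothing to encode.
        return normalized
    # Python's built-in "punycode" codec implements the RFC 3492 encoding.
    return ACE_PREFIX + normalized.encode("punycode").decode("ascii")


def domain_to_ascii(domain: str) -> str:
    """Convert a whole domain name label by label."""
    return ".".join(label_to_ascii(label) for label in domain.split("."))


print(domain_to_ascii("www.example.com"))  # unchanged: www.example.com
print(domain_to_ascii("bücher.example"))   # xn--bcher-kva.example
```

Because the conversion is applied per label and leaves LDH labels untouched, existing domain names are unaffected, which is consistent with the claim that the DNS itself does not need to change.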
With IDNA, using IDN domains is no different from the traditional way of
working with domains. How IDNA works is easily explained by an example
involving a web browser. Users need the newest version of an “IDN aware”
browser, or need to install an IDN plug-in for their current browser. Then all
that is left to do is to open a page of
interest, e.g.

Here, the entered domain name contains diacritics and is represented by a
Unicode string (or a string in some other encoding) within the user’s operating
system. The string is then converted to the corresponding IRA (International
Reference Alphabet, equivalent to ASCII) “Punycode”
representation, which in this case is the following domain name:

The application sends this representation to a resolver in order to obtain the
IP address of the WWW server identified by that domain name. The user does not
even need to be aware of the Unicode to IRA translation, because it is
performed within the web browser.
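The browser-side flow described above could look roughly like the following Python sketch. It relies on Python's built-in `idna` codec, which implements the 2003 IDNA standard (RFC 3490). The helper names and the host name `www.bücher.example` are hypothetical and chosen only for illustration.

```python
import socket


def to_wire_form(hostname: str) -> str:
    """Unicode host name as typed by the user -> ASCII form sent to the resolver."""
    return hostname.encode("idna").decode("ascii")


def to_display_form(ascii_hostname: str) -> str:
    """ASCII form -> Unicode form that can be shown back to the user."""
    return ascii_hostname.encode("ascii").decode("idna")


def resolve(hostname: str, port: int = 80):
    """Look up addresses; the resolver only ever sees the ASCII form."""
    return socket.getaddrinfo(to_wire_form(hostname), port)


typed = "www.bücher.example"   # what the user enters (hypothetical name)
wire = to_wire_form(typed)     # what the application passes to the resolver
print(wire)                    # www.xn--bcher-kva.example
print(to_display_form(wire))   # www.bücher.example
```

`resolve` is defined but not called here, since the example name does not exist in the DNS. The point of the sketch is that the translation happens entirely inside the application, before any lookup is made.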