
Advisory: Internationalized domain names (IDN) can be used for spoofing.

Summary

Opera supports internationalized domain names (IDN), which allow, for example, Russian or Chinese domain names to be written in their native scripts.

However, this also makes it possible to have domain names that look exactly the same as known, legitimate domain names while actually being written in a different script. Such possibilities can be used for fraud.

Problem description

Since 2003, domain names written with characters outside the US-ASCII character set have been supported by the IDNA standard (RFC 3490). IDNA is based on Unicode, which offers one big character set for the whole world. Standards for Unicode within HTML or plain text documents have existed for longer. UTF-8 is now a common encoding for Web pages, which makes ASCII-only domain names stand out, because all non-English letters in them have to be mangled or "romanized".

The promise of Unicode is that a single version of a program shall be able to support all scripts of the world. The scripts can be mixed within the same text, without any escape codes or extra metadata. This is a boon for electronic typography and interoperability.

Before Unicode, the character set and the script used by the program were the character set and script chosen by the user. With a few exceptions (such as Japanese), all characters were visually different, because they had to be distinguishable to the user. And since the script was native to the user, the user would be trained to tell all the characters apart. With Unicode this is no longer true. Programs can now display characters that are not native to the user in between the familiar characters. And within Unicode there are many so-called homographs: different characters that are visually identical. For example, several characters in the Cyrillic alphabet look the same as letters in the Roman alphabet. They even tend to map to the same glyph in the same font; this is by design. But as far as the programs are concerned, they are different characters.
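As a small illustration of the homograph problem, the following Python sketch compares the Roman letter "a" with the Cyrillic letter that usually renders identically. The helper name describe is made up for this example; only the standard unicodedata module is used.

```python
import unicodedata

latin = "a"          # Roman small letter a, U+0061
cyrillic = "\u0430"  # Cyrillic small letter a, U+0430

def describe(ch):
    """Return the code point and official Unicode name of one character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"

print(describe(latin))     # U+0061 LATIN SMALL LETTER A
print(describe(cyrillic))  # U+0430 CYRILLIC SMALL LETTER A

# The two characters typically share a glyph, but to a program they differ.
print(latin == cyrillic)   # False
```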

Internationalized domain names make it possible for several domain names to look exactly the same typographically, since identical rendering of such characters is by design. As far as domain name servers, security protocols and Web browsers are concerned, the domain names are still different from each other. This can be abused to mislead the user. Unlike trivial ASCII-only substitutions such as "paypa1", the deception is totally convincing.
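The difference that servers and protocols see can be made visible by converting the names to their ASCII form. The sketch below uses Python's built-in "idna" codec, which implements RFC 3490. The domain names are chosen purely for illustration, and the variable names are assumptions of this example.

```python
# Two names that usually render identically in a browser's address bar.
real = "paypal.com"         # ASCII only
spoof = "p\u0430ypal.com"   # second letter is Cyrillic U+0430, not Roman "a"

# The "idna" codec applies the RFC 3490 ToASCII conversion to each label.
real_ace = real.encode("idna")
spoof_ace = spoof.encode("idna")

print(real_ace)               # the ASCII name is passed through unchanged
print(spoof_ace)              # the spoofed label becomes an "xn--" form
print(real_ace == spoof_ace)  # False: on the wire these are different names
```

The "xn--" form is what is actually looked up in DNS, so the two names can belong to entirely different owners.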

Opera's response

Opera has added a whitelist of top-level domains that are trusted to enforce a safe policy on domain names. Several top-level registrars have strict rules for domain names. Opera for Windows, Mac and UNIX will check for an updated list of trusted TLDs on a regular basis. For top-level domains that are not on the whitelist, Opera now accepts only Latin-1 characters in domain names. This covers Western European languages without introducing any convincing homographs.
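The policy described above can be sketched roughly as follows. This is not Opera's actual implementation: the function names are invented for this example, and the whitelist holds a single placeholder TLD rather than any entry from the real list.

```python
# Hypothetical whitelist; "example" is a placeholder, not a real entry.
TRUSTED_TLDS = {"example"}

def is_latin1(text):
    """True if every character lies in the Latin-1 range (U+0000 to U+00FF)."""
    return all(ord(ch) <= 0xFF for ch in text)

def accepts_idn(hostname, trusted_tlds=TRUSTED_TLDS):
    """Apply the whitelist rule to a hostname given in Unicode form."""
    tld = hostname.rstrip(".").rsplit(".", 1)[-1].lower()
    if tld in trusted_tlds:
        return True              # the registry is trusted to police homographs
    return is_latin1(hostname)   # otherwise only Latin-1 characters pass

print(accepts_idn("caf\u00e9.com"))        # True: e-acute is in Latin-1
print(accepts_idn("p\u0430ypal.com"))      # False: Cyrillic, TLD not trusted
print(accepts_idn("p\u0430ypal.example"))  # True: TLD is on the whitelist
```

What the browser does with a name that fails the check is not covered by this sketch.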

Top-level domain registrars who have enforced strict domain name policies are encouraged to contact Opera Software to be included in the browser's whitelist, provided that their policies are approved.

