Text and Internationalization support in Opera Presto 2.12
Unicode character set support in Opera Presto
Opera Presto can work with all the characters in the Unicode specification.
- All text communicated to Opera Presto from the network is converted into Unicode.
- In order for Opera Presto to render Unicode characters, the needed glyphs have to be available in the fonts on your system.
This might be a problem for older Windows systems. For information on available fonts, see
Unicode fonts for Windows computers.
- Updated Unicode character data tables from Unicode v5.0.0 to v5.1.0.
- The Uniblocks table now supports ranges outside Unicode plane 0. This is needed for proper font-switching of characters
outside plane 0.
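As an illustration of why a block table must cover ranges beyond plane 0, the following is a minimal sketch of a block lookup. It is not Opera Presto's Uniblocks table: the names `BLOCKS` and `block_of` are hypothetical, and only a handful of real Unicode blocks are listed.

```python
from bisect import bisect_right

# Hypothetical, heavily abbreviated block table: (first, last, block name).
# The ranges are real Unicode blocks; a full table would list every block.
BLOCKS = [
    (0x0000, 0x007F, "Basic Latin"),
    (0x0400, 0x04FF, "Cyrillic"),
    (0x3040, 0x309F, "Hiragana"),
    (0x4E00, 0x9FFF, "CJK Unified Ideographs"),
    (0x10000, 0x1007F, "Linear B Syllabary"),                   # plane 1
    (0x20000, 0x2A6DF, "CJK Unified Ideographs Extension B"),   # plane 2
]
_STARTS = [first for first, _, _ in BLOCKS]


def block_of(ch):
    """Return the block name for a character, or None if it is not in the table."""
    cp = ord(ch)
    # Find the last block whose first code point is <= cp.
    i = bisect_right(_STARTS, cp) - 1
    if i >= 0:
        first, last, name = BLOCKS[i]
        if cp <= last:
            return name
    return None
```

A font switcher could use the returned block name to choose a fallback font when the current font lacks a glyph. Because the table stores full code point ranges rather than 16-bit values, characters such as U+20000 resolve to a block in the same way as plane 0 characters.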
Opera Presto implements the following writing-system-related improvements:
- font-switching: needed in order to display characters that the current font does not include
- line-breaking: needed in order to break lines in scripts written without spaces, such as Chinese, Japanese, and
Korean
- CJK: improved line height and underlining in Chinese, Japanese, and Korean
- KDDI emojis: improved support for KDDI emojis and special characters
- Multistyle: improved default fonts for non-western Web pages
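The line-breaking item above can be illustrated with a deliberately simplified sketch. It assumes that a break is allowed after a space and on either side of a CJK character, and it ignores the punctuation rules that a real implementation must apply. The names `CJK_RANGES`, `is_cjk`, and `break_opportunities` are hypothetical and do not describe Opera Presto's internals.

```python
# Real Unicode block ranges, used here as a rough test for "CJK character".
CJK_RANGES = [
    (0x3040, 0x309F),  # Hiragana
    (0x30A0, 0x30FF),  # Katakana
    (0x4E00, 0x9FFF),  # CJK Unified Ideographs
    (0xAC00, 0xD7AF),  # Hangul Syllables
]


def is_cjk(ch):
    """True if the character falls in one of the ranges above."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in CJK_RANGES)


def break_opportunities(text):
    """Return the indices at which a line may be broken before text[i]."""
    breaks = []
    for i in range(1, len(text)):
        prev, cur = text[i - 1], text[i]
        # Assumption: break after a space, or next to any CJK character.
        if prev == " " or is_cjk(prev) or is_cjk(cur):
            breaks.append(i)
    return breaks
```

With this sketch, Latin text can only break after spaces, while a run of ideographs can break between any two characters.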
Opera Presto relies on the operating system to perform:
- character shaping: contextual glyph selection, ligature forming, character stacking, combining character
support, etc.
Opera Presto includes support for Unicode 5.2 character properties (class, casing, bidirectionality, mirroring, normalization),
updated from 5.0.
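To make the listed character properties concrete, here is a small sketch using Python's standard `unicodedata` module. The helper name `describe` is hypothetical, and Python ships its own version of the Unicode data, which will generally differ from the version in Opera Presto.

```python
import unicodedata


def describe(ch):
    """Collect the character properties named above for one character."""
    return {
        "category": unicodedata.category(ch),        # general category (class)
        "upper": ch.upper(),                         # casing
        "bidi": unicodedata.bidirectional(ch),       # bidirectional class
        "mirrored": bool(unicodedata.mirrored(ch)),  # mirrored in RTL text
        "nfc": unicodedata.normalize("NFC", ch),     # normalization
    }


print(describe("a"))
print(describe("("))
print(describe("\u05d0"))  # HEBREW LETTER ALEF
# Normalization composes "e" + COMBINING ACUTE ACCENT into one code point.
print(unicodedata.normalize("NFC", "e\u0301") == "\u00e9")
```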
Legacy encoding support
Although Opera Presto works with the Unicode character set and its UTF-16 and UTF-8 encodings, most text on
the Internet is encoded in legacy encodings, for instance:
- ISO 8859-1
- Windows-1251
- Shift_JIS (MIME name)
- EUC-KR
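As a sketch of what converting such text to Unicode involves, the example below decodes one short sample in each of the encodings listed above and re-encodes it as UTF-16. It uses Python's built-in codecs, not Opera Presto's converters, and the names `to_utf16` and `SAMPLES` are hypothetical.

```python
def to_utf16(raw, encoding):
    """Decode bytes from a legacy encoding and re-encode them as UTF-16 (little-endian, no BOM)."""
    text = raw.decode(encoding)
    return text.encode("utf-16-le")


# One short sample per encoding (codec names as accepted by Python).
SAMPLES = {
    "iso-8859-1": b"caf\xe9",                     # "café"
    "windows-1251": b"\xcf\xf0\xe8\xe2\xe5\xf2",  # "Привет"
    "shift_jis": b"\x93\xfa\x96\x7b",             # "日本"
    "euc-kr": b"\xc7\xd1\xb1\xdb",                # "한글"
}

for encoding, raw in SAMPLES.items():
    utf16 = to_utf16(raw, encoding)
    print(encoding, raw, "->", utf16.decode("utf-16-le"))
```

The same byte values mean different characters in different encodings, which is why the encoding has to be identified before conversion.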
Opera Presto handles this by detecting the character encoding used and converting the text to UTF-16. The user has three
options for how these pages are handled:
- Auto-detect: in this mode Opera Presto will attempt to detect the encoding used by the page
  - If the transport protocol provides an encoding name, that is used
  - If not, Opera Presto will look at the page for a charset declaration
  - If this is missing, Opera Presto will attempt to auto-detect the encoding, using the domain name to see if the script
    is a CJK script, and if so which one
  - Opera Presto can also auto-detect UTF-8
- Writing script auto-detect: In this mode the user tells Opera Presto that the page is Japanese or Chinese, but that the
encoding is unknown. Opera Presto will then analyze the text in the page to determine which encoding is used.
- Encoding override: In this mode the user selects an encoding. This encoding will be used by Opera Presto, regardless of
what the page and transport protocol claim is the encoding for the page.
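The precedence described above can be sketched roughly as follows. This is not Opera Presto's algorithm. The function `pick_encoding`, the `TLD_GUESSES` table, the 1024-byte scan window, and the `iso-8859-1` fallback are all assumptions made for illustration. The source also does not say where UTF-8 detection sits relative to the domain-name guess, so its position here is a guess. Script-based analysis of the page text is not shown.

```python
import re

# Hypothetical mapping from top-level domain to a likely legacy encoding.
TLD_GUESSES = {"jp": "shift_jis", "kr": "euc-kr", "tw": "big5", "cn": "gbk"}

# Matches charset=... in a meta declaration (a simplification of real HTML parsing).
META_CHARSET = re.compile(rb"charset\s*=\s*[\"']?([A-Za-z0-9_-]+)", re.I)


def pick_encoding(raw, transport_charset=None, host="", override=None,
                  fallback="iso-8859-1"):
    """Choose an encoding name for the bytes of a page."""
    # Encoding override: the user's choice wins over everything else.
    if override:
        return override
    # Auto-detect, step 1: an encoding named by the transport protocol.
    if transport_charset:
        return transport_charset
    # Step 2: a charset declaration near the start of the page.
    match = META_CHARSET.search(raw[:1024])
    if match:
        return match.group(1).decode("ascii").lower()
    # UTF-8 detection (assumed position): non-ASCII bytes that decode cleanly.
    try:
        raw.decode("utf-8")
        if any(byte >= 0x80 for byte in raw):
            return "utf-8"
    except UnicodeDecodeError:
        pass
    # Step 3: guess from the domain name, else use the fallback.
    tld = host.rsplit(".", 1)[-1].lower()
    return TLD_GUESSES.get(tld, fallback)
```

For example, `pick_encoding(b"\x93\xfa", host="example.jp")` returns `"shift_jis"` because nothing else identifies the encoding, while passing `override="koi8-r"` returns `"koi8-r"` regardless of the other arguments.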
Big5-HKSCS support for the HKSCS-2008 encoding standard has been updated.
Support for bidirectional text
Opera Presto supports bidirectional text as described in the Unicode, HTML, and CSS specifications.