m (TOCfloat no longer does anything in Vector 2022) |
m (added a little detail on the level-load crash) |
||
| (2 intermediate revisions by the same user not shown) | |||
| Line 1: | Line 1: | ||
Originally created in English, Oni has been translated into the following [[seven]] languages: French, Italian, Spanish, German, Russian, Japanese and Chinese. | Originally created in English, Oni has been translated into the following [[seven]] languages: French, Italian, Spanish, German, Russian, Japanese and Chinese. An overview of the known language versions can be found on [[OBD:Releases]], but the details of these releases' localized content are found on [[OBD:Localization]]. | ||
Depending on the language version, vanilla Oni uses one of the following five encodings to render text: | Depending on the language version, vanilla Oni uses one of the following five encodings to render text: | ||
*The original US version uses a trimmed-down [[wp: | *The original US version uses a trimmed-down [[wp:Mac OS Roman|Mac OS Roman]] code page that is effectively limited to [[wp:ASCII|US-ASCII]] (96 code points used, 256 available). | ||
*European localizations (UK English, French, Italian, Spanish, German) use a custom version of Mac OS Roman (192 code points used, 256 available). | *European localizations (UK English, French, Italian, Spanish, German) use a custom version of Mac OS Roman (192 code points used, 256 available). | ||
*The Russian localization uses a (nearly) full implementation of the [[wp:Windows-1251|Windows-1251]] (Cyrillic) code page (224 code points used, 256 available). | *The Russian localization uses a (nearly) full implementation of the [[wp:Windows-1251|Windows-1251]] (Cyrillic) code page (224 code points used, 256 available). | ||
*The Chinese localization uses the [[wp: | *The Chinese localization uses the [[wp:Extended Unix Code#EUC-CN|EUC-CN]] implementation of [[wp:GB 2312|GB 2312]] (7,668 code points used, 8,836 available). | ||
*The Japanese localization uses 1,357 code points mostly conforming to the [[wp: | *The Japanese localization uses 1,357 code points mostly conforming to the [[wp:Shift JIS|Shift JIS]] implementation of [[wp:JIS X 0208|JIS X 0208]]. | ||
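For experimenting with text in these encodings outside the game, Python's standard codecs can stand in as approximations. This is only a sketch: the codec names below are Python's, not Oni's, and Oni's actual code pages are trimmed or customized as described in the list above, so a byte sequence that Python accepts is not guaranteed to have a glyph in the game. The <code>CODECS</code> table and <code>encode_sample</code> helper are illustrative names, not part of any Oni tool.

```python
# Standard Python codecs used as stand-ins for Oni's encodings.
# Assumption: these are the full standard code pages; Oni implements
# only a subset (or a customized variant) of each.
CODECS = {
    "US / European": "mac_roman",
    "Russian": "cp1251",
    "Chinese": "gb2312",      # Python's gb2312 codec uses the EUC-CN byte layout
    "Japanese": "shift_jis",
}

def encode_sample(text, version):
    """Encode text with the stand-in codec for the given language version."""
    return text.encode(CODECS[version])

# GB 2312 is laid out as a 94 x 94 grid, which is where the
# "8,836 available" figure comes from.
GB2312_AVAILABLE = 94 * 94

if __name__ == "__main__":
    for version, sample in [("US / European", "é"), ("Russian", "Я"),
                            ("Chinese", "中"), ("Japanese", "あ")]:
        print(version, sample, encode_sample(sample, version).hex())
```

The single-byte code pages yield one byte per character, while the Chinese and Japanese samples yield two bytes each.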
Properties of the fonts that are eventually used to render the text (via the encoding) are briefly described throughout the page. | |||
Properties of the fonts that are eventually used to render the text (via the encoding) are briefly described throughout the page. A more thorough overview of the glyphs can be found on the [[/Fonts|Fonts subpage]] (to be created). | |||
==Encodings== | ==Encodings== | ||
===US English=== | ===US English=== | ||
Below is the code page implemented by [[TSFF]]Tahoma in the US English version of Oni. It is based on [[wp: | Below is the code page implemented by [[TSFF]]Tahoma in the US English version of Oni. It is based on [[wp:Mac OS Roman|Mac OS Roman]] ("MacRoman" for short), but with two differences: | ||
*Of the 223 printable glyphs provided by MacRoman, 42 are not implemented in TSFFTahoma (shown as grey-on-black). | *Of the 223 printable glyphs provided by MacRoman, 42 are not implemented in TSFFTahoma (shown as grey-on-black). | ||
*Control point 0x7F (a typically non-printable "delete" character) has a visible box-like glyph (◻) in this implementation. | *Control point 0x7F (a typically non-printable "delete" character) has a visible box-like glyph (◻) in this implementation. | ||
| Line 341: | Line 341: | ||
;N.B. | ;N.B. | ||
Unlike for other versions of Oni, an invalid code point does not interrupt the interpretation/rendering of a text string by xfhsm_oni.dll and can lead to a wide range of unexpected behavior: at best, a blank or otherwise unintended glyph will be displayed; at worst the rendered text will be garbled (memory corruption most likely), or the game may simply [[Blam | Unlike for other versions of Oni, an invalid code point does not interrupt the interpretation/rendering of a text string by xfhsm_oni.dll and can lead to a wide range of unexpected behavior: at best, a blank or otherwise unintended glyph will be displayed; at worst the rendered text will be garbled (memory corruption most likely), or the game may simply crash with a [[Blam!]] message. | ||
The current understanding is that xfhsm_oni.dll simply turns any two-byte code point QQ WW into the offset [(QQ-A1)*5E + (WW-A1)]*0x20, relative either to the start of the xf_font.dat data (for the 16x16 font) or to the middle of the data (for the small 12x12 font). Depending on the values of QQ and WW, both components of the offset can fall outside the intended 0-93 range, with values as high as 94 and as low as -161. There doesn't seem to be any sanity check, and the only special handling is for QQ=00 (in this case WW is ignored and the string is terminated). | The current understanding is that xfhsm_oni.dll simply turns any two-byte code point QQ WW into the offset [(QQ-A1)*5E + (WW-A1)]*0x20, relative either to the start of the xf_font.dat data (for the 16x16 font) or to the middle of the data (for the small 12x12 font). Depending on the values of QQ and WW, both components of the offset can fall outside the intended 0-93 range, with values as high as 94 and as low as -161. There doesn't seem to be any sanity check, and the only special handling is for QQ=00 (in this case WW is ignored and the string is terminated). | ||
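The offset computation described above can be sketched as follows. This is a model of the current understanding, not disassembled code from xfhsm_oni.dll; the function and constant names are made up for illustration, and the returned offset is relative to whichever base applies (start of xf_font.dat for the 16x16 font, middle of the data for the 12x12 font).

```python
ROW_BASE = 0xA1     # first valid value of either byte in EUC-CN
ROW_LEN = 0x5E      # 94 cells per row
MULTIPLIER = 0x20   # factor applied to the glyph index, per the formula above

def glyph_components(qq, ww):
    """Return the (row, column) pair; both are intended to lie in 0..93."""
    return qq - ROW_BASE, ww - ROW_BASE

def glyph_offset(qq, ww):
    """Offset for the two-byte code point QQ WW, or None for the terminator.

    No range check is performed, mirroring the apparent behavior of the DLL.
    """
    if qq == 0x00:
        return None  # QQ=00 terminates the string; WW is ignored
    row, col = glyph_components(qq, ww)
    return (row * ROW_LEN + col) * MULTIPLIER

def is_valid(qq, ww):
    """The sanity check that the DLL does not seem to perform."""
    row, col = glyph_components(qq, ww)
    return 0 <= row <= 93 and 0 <= col <= 93

if __name__ == "__main__":
    for qq, ww in [(0xA1, 0xA1), (0xFE, 0xFE), (0xA3, 0xA0), (0xFF, 0x41)]:
        print(f"{qq:02X} {ww:02X}", glyph_offset(qq, ww), is_valid(qq, ww))
```

A byte of 0xFF gives a component of 94 and a byte of 0x00 gives -161, which are the extremes mentioned above.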
| Line 1,180: | Line 1,180: | ||
|} | |} | ||
Without a proper sanity check, some illegal code points will clearly result in pixel data being loaded not from a valid glyph region, but from irrelevant memory that belongs either to xfhsm_oni.dll or to the main Oni engine, resulting in garbled text. Memory corruption or segmentation fault (access violation) may occur if similar out-of-bounds pointers are used when rendering glyph textures. Possibly invalid EUC-CN input is what is causing most Chapters of the Chinese Oni version to crash on modern Windows systems, although this has | Without a proper sanity check, some illegal code points will clearly result in pixel data being loaded not from a valid glyph region, but from irrelevant memory that belongs either to xfhsm_oni.dll or to the main Oni engine, resulting in garbled text. Memory corruption or segmentation fault (access violation) may occur if similar out-of-bounds pointers are used when rendering glyph textures. Possibly invalid EUC-CN input is what is causing most Chapters of the Chinese Oni version to crash on modern Windows systems, although this crash is different because it happens without the Blam! dialog appearing; also, it can be avoided by turning down the graphics quality to Superlow. This indicates an issue related to the amount of memory being used, but it's possible the crash is also text-related; the cause has yet to be determined. | ||
====Non-translated US-ASCII==== | ====Non-translated US-ASCII==== | ||
ASCII strings are much more harmful when handled by xfhsm_oni.dll, as compared to the two invalid code points (A3,A0) and (A3,0x89), because pairs of US-ASCII bytes, misinterpreted as EUC-CN code points, end up referencing completely strange memory regions (outside the region occupied by xf_font.dat). Unfortunately, there are a few ASCII strings that xfhsm_oni.dll can come across even during regular gameplay, and many more arise if one allows for modding. | ASCII strings are much more harmful when handled by xfhsm_oni.dll, as compared to the two invalid code points (A3,A0) and (A3,0x89), because pairs of US-ASCII bytes, misinterpreted as EUC-CN code points, end up referencing completely strange memory regions (outside the region occupied by xf_font.dat). Unfortunately, there are a few ASCII strings that xfhsm_oni.dll can come across even during regular gameplay, and many more arise if one allows for modding. | ||
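To illustrate why ASCII is so much worse, the sketch below applies the same offset formula, [(QQ-A1)*5E + (WW-A1)]*0x20, to consecutive pairs of ASCII bytes. It assumes that the DLL consumes a string strictly two bytes at a time and, for simplicity, drops an odd trailing byte; both are assumptions made for the example, and <code>ascii_pair_offsets</code> is a hypothetical helper. The sample string "OK" is arbitrary.

```python
def ascii_pair_offsets(text):
    """Offsets obtained when ASCII bytes are read as two-byte code points.

    Assumption: bytes are consumed in pairs; an odd trailing byte is
    ignored in this sketch.
    """
    data = text.encode("ascii")
    pairs = [(data[i], data[i + 1]) for i in range(0, len(data) - 1, 2)]
    return [((qq - 0xA1) * 0x5E + (ww - 0xA1)) * 0x20 for qq, ww in pairs]

if __name__ == "__main__":
    print(ascii_pair_offsets("OK"))
    # Even the highest ASCII byte pair (0x7F, 0x7F) gives a negative offset.
    print(ascii_pair_offsets("\x7f\x7f"))
```

Since every ASCII byte is below 0xA1, both components are always negative, so the resulting offset always points before the base address that the formula is applied to.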
=====Count on it===== | =====Count on it===== | ||
The following string in SUBTsubtitles has not been translated into Chinese: | The following string in SUBTsubtitles has not been translated into Chinese: | ||