OBD:Text encoding: Difference between revisions

m
→‎US English: wording on '…'
m (→‎US English: trying to prevent confusion over the cent sign for any other Mac users)
m (→‎US English: wording on '…')
Line 131: Line 131:
;Minor notes
*The MacRoman layout was apparently "borrowed" before 1998, when Mac OS 8.5 came out and the [[wp:Currency sign (typography)|international currency sign]] a.k.a. scarab (¤), at 0xDB, was replaced with the euro symbol (€).
*The actual font (see [[/Fonts|HERE]]) has some unusual typographical features, such as a single-stroke Yen/Yuan symbol (Ұ) and a vertical-stroke cent symbol, similar to Unicode's Fullwidth Cent Sign (¢) character as seen in Windows Arial (note to Mac users: don't be confused, as this character will appear with a diagonal stroke on your system like the regular '¢' character).
*The actual font (see [[/Fonts|HERE]]) has some unusual typographical features, such as a single-stroke Yen/Yuan symbol (Ұ) and a vertical-stroke cent symbol similar to Unicode's Fullwidth Cent Sign (¢) character as seen in Windows Arial (note to Mac users: don't be confused, as this character will appear with a diagonal stroke on your system like the regular '¢' character).
;Major notes
*Some of the removed glyphs (most importantly ß, ù and û, but also Ê, Ú and ú) occur in [[wp:Languages of the European Union#Knowledge|common European languages]]. This made the US TSFFTahoma unsuitable for [[wikt:EFIGS|EFIGS]] localizations, requiring the creation of a new version (see below).  
*The US engine actually cannot interpret any code points beyond the US-ASCII range (first 6 rows, white background), notably failing on "…" (see [[#Ellipsis_issue|"Ellipsis issue"]] below). This is because of a provision for Asian encoding systems (EUC-CN and Shift JIS), which use two-byte sequences starting with a high-bit byte.
*The US engine actually cannot interpret any code points beyond the US-ASCII range (first 6 rows, white background), notably failing on 0xC9's "…". This is because of a nominal but unused provision for Asian text encodings. See "[[#Ellipsis_issue|Ellipsis issue]]" below for details.
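
As a rough illustration of the note on 0xC9, the following Python sketch decodes that single byte under three of Python's standard codecs. It does not reproduce the engine's own text handling, which is not shown here. The helper name <code>decode_or_none</code> is hypothetical, and using the <code>gb2312</code> codec as a stand-in for EUC-CN is an assumption made for the example.

```python
# Byte 0xC9 is the ellipsis in MacRoman, but it is not valid US-ASCII,
# and in EUC-CN it is only the first half of a two-byte sequence.
ELLIPSIS_BYTE = b"\xc9"


def decode_or_none(data: bytes, encoding: str):
    """Hypothetical helper: return the decoded text, or None if the
    bytes cannot be decoded in the given encoding."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError:
        return None


# MacRoman treats 0xC9 as a complete one-byte character.
mac_text = decode_or_none(ELLIPSIS_BYTE, "mac_roman")

# US-ASCII only covers 0x00-0x7F, so a high-bit byte is rejected.
ascii_text = decode_or_none(ELLIPSIS_BYTE, "ascii")

# Python's gb2312 codec (used here as EUC-CN) expects a second byte
# after 0xC9, so the lone byte is an incomplete sequence.
euc_text = decode_or_none(ELLIPSIS_BYTE, "gb2312")

print(repr(mac_text), repr(ascii_text), repr(euc_text))
```

Under these assumptions, only the MacRoman decoder yields a character. A decoder that reserves high-bit bytes for two-byte sequences has nothing to pair a lone 0xC9 with.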