OBD:SNDD: Difference between revisions

From OniGalore
(wrapping up the Mac SNDD knowledge; thanks to Ed for the main menu recording)
m (replaced formula GIFs with Math markup; replaced nowiki tags around equals signs with new {{=}} magic word)
 
(28 intermediate revisions by 3 users not shown)
Line 1: Line 1:
{{OBD_File_Header | type=SNDD | prev=QTNA | next=StNA | name=Sound Data | family=Generic | align=center}}
{{OBD_File_Header | type=SNDD | prev=QTNA | next=StNA | name=Sound Data | family=General | align=center}}


:''For metadata instances used to group sounds together, randomize them, adjust their volume or frequency, etc, see [[OSBD]] and its subtypes: [[OSAm]], [[OSIm]] and [[OSGr]].''
:''For metadata instances used to group sounds together, randomize them, adjust their volume or frequency, etc, see [[OSBD]] and its subtypes: [[OSAm]], [[OSIm]] and [[OSGr]].''
SNDD instances are where Oni stores sound data. Sounds can be either mono or stereo waveforms (with sampling frequencies of either 22.05 kHz or 44.1 kHz), and they are typically compressed to save on storage space. Both the PC and Mac versions use a form of [[wp:ADPCM|ADPCM]] compression (Adaptive Differential Pulse-Code Modulation), where 16-bit sound samples are encoded as 4-bit "nibbles" (resulting in a roughly 4:1 compression ratio as compared to uncompressed 16-bit [[wp:PCM|PCM]]).
SNDD instances are where Oni stores sound data. In Vanilla Oni game data, sounds are either mono or stereo waveforms (with sampling frequencies of either 22.05 kHz or 44.1 kHz), compressed to save on storage space. Both the PC and Mac versions use a form of [[wp:Adaptive differential pulse-code modulation|ADPCM]] compression (Adaptive Differential Pulse-Code Modulation), where 16-bit sound samples are encoded as 4-bit "nibbles" (resulting in a roughly 4:1 compression ratio as compared to uncompressed 16-bit [[wp:Pulse-code modulation|PCM]]).
*On PC (both retail and demo), sounds are encoded using Microsoft's ADPCM algorithm described [https://wiki.multimedia.cx/index.php/Microsoft_ADPCM HERE]. [[wp:FFmpeg|FFmpeg]] lists this codec as '''adpcm_ms'''.  
*On PC (both retail and demo), sounds are encoded using Microsoft's ADPCM codec (implemented in [[wp:FFmpeg|FFmpeg]] as '''adpcm_ms'''). See [https://wiki.multimedia.cx/index.php/Microsoft_ADPCM HERE] for a quick description.
*On Mac, sounds are encoded using the IMA4 algorithm described [https://wiki.multimedia.cx/index.php/Apple_QuickTime_IMA_ADPCM HERE]. [[wp:FFmpeg|FFmpeg]] lists this codec as '''adpcm_ima_qt'''.
*On Mac, sounds are encoded using QuickTime's IMA4 codec (implemented in FFmpeg as '''adpcm_ima_qt'''). See [https://wiki.multimedia.cx/index.php/Apple_QuickTime_IMA_ADPCM HERE] for a quick description.
;Key shortcomings of the PC demo and Mac SNDDs as compared to PC retail SNDDs:
*On PS2, sounds are encoded using Sony's VAG codec (a.k.a. Sony PSX ADPCM, or '''adpcm_psx''' in FFmpeg). See [https://www.psdevwiki.com/ps3/Multimedia_Formats_and_Tools#VAG HERE] for a quick description.
*PC demo and Mac SNDDs have a "short" .dat part that specifies only the frame count (animation length in game ticks) and number of channels (mono or stereo). The waveform data is always assumed to be sampled at 22.05 kHz, and compressed into 4-bit ADPCM (either IMA4 or MS ADPCM).
*PC retail supports arbitrary sample rates, which allows for crisper high frequencies: specifically, 44.1 kHz (CD quality) is used for the 46 electric spark sounds '''ap_hit_shld''' and '''zap##'''. PC retail possibly supports uncompressed PCM waveforms as well.
; Key shortcomings of Mac SNDDs as compared to PC SNDDs (both retail and demo):
*At 22.05 kHz, the storage size of Mac SNDDs (IMA4) is about 5% larger than for PC equivalents (MS ADPCM), because of a much smaller block size in the .raw part (the smaller .dat part of Mac SNDDs doesn't help).
*Mac SNDDs have encoding/editing artifacts at the ends of looping segments (music and ambient tracks). PC SNDDs (retail or demo) have no such artifacts and allow nearly seamless playback. See [[OBD:SNDD#Looping_issues|BELOW]] for details.
==Oni storage==
===PC retail===
Below is the .dat file part used in the PC retail version.


[[image:sndd_all.gif]]
As a unique feature of Oni game data, SNDD files have a significantly different structure depending on the engine version. For PC retail (.dat/.raw storage, no .sep files), the SNDD files are larger and include a 50-byte chunk of data that is equivalent to the "fmt " chunk of a WAVE file. For the other two versions (PC demo and Mac, .dat/.raw/.sep storage), this 50-byte block is missing. It turns out that the extra format data allows the PC retail engine to support both MS ADPCM and IMA4, as well as uncompressed PCM, whereas the PC demo and Mac engines only support MS ADPCM and IMA4, respectively. (It has not been confirmed whether the PC retail engine supports other WAVE formats beyond PCM and MS ADPCM, such as Mu-Law or A-Law PCM, IEEE float PCM, etc.) The PS2 engine uses the same short data header as PC demo and Mac, but the waveform is stored as VAG (a.k.a. PSX ADPCM) and resides in a completely separate SOUNDS folder, accessed through an additional layer of indexing beyond the usual .dat/.raw/.sep logic (not unlike PS2 TXMPs, which rely on color palettes stored in additional level#_palette.pal files).


----
For clarity, the simpler and more straightforward SNDDs of PC demo and Mac are documented first, followed by the more complex and versatile PC retail SNDDs.
:(Historically, though, the PC retail implementation is older, and the PC demo and Mac versions were trimmed-down iterations of PC retail.)
The exotic PS2 storage is documented last, after which we list some legacy tips (for manual sound conversion) and known issues/limitations, as well as the current sound capabilities of OniX and OniSplit.


{{Table}}
{{OBDth}}
{{OBDtr| 0x00 | res_id  |FF0000| 01 D7 08 00 | 2263      | 02263-comguy_dth2.aif.SNDD }}
{{OBDtr| 0x04 | lev_id  |FFFF00| 01 00 00 06 | 3        | level 3 }}
{{OBDtr| 0x08 | int32    |FFC8C8| 08 00 00 00 | 8        | flags
:1 - (never used in Vanilla Oni)
:2 - (never used in Vanilla Oni)
:4 - (never used in Vanilla Oni)
:8 - ADPCM compressed }}
{{OBDtr| 0x0C | block[50]|FFC8C8|        |      | [[OBD:SNDD/wav|wav header]] (corresponds to the "fmt " section of a RIFF WAVE file)}}
{{OBDtr| 0x3E | int16    |FFFFC8| 37 00      | 55        | duration in 1/60 seconds (game ticks) }}
{{OBDtr| 0x40 | int32    |C8FFC8| 56 28 00 00 | 10326    | size of the part in the raw file in bytes }}
{{OBDtr| 0x44 | offset  |C8FFFF| 20 10 59 00 | 0x591020  | at this position starts the part in the raw file }}
{{OBDtr| 0x48 | char[24] |FFC8FF| AD DE      | dead      | 24 unused bytes (padding) }}
|}
The .raw data contains the actual audio sample blocks without any other headers (other than ADPCM block headers).


====.raw part (MS ADPCM)====
==Mac and PC demo==
For a detailed overview of the ADPCM algorithm (if interested), see [https://wiki.multimedia.cx/index.php/Microsoft_ADPCM HERE]. For an actual implementation example, see, e.g., [[wp:FFmpeg|FFmpeg]].
The example below was taken from Mac Oni. In PC demo the file would look the same, except for a possibly different res_id (at 0x00) and a smaller raw data size (at 0x10).
:The MS ADPCM .raw data has 512- or 1024-byte blocks (512 bytes for 22.05 kHz mono, 1024 bytes for 22.05 kHz stereo and 44.1 kHz mono)
:Each block consists of a 7- or 14-byte header (7 bytes for mono, 14 bytes for stereo), which includes the block's first two samples.
:The remaining 505, 1010 or 1017 bytes of each block consist of nibbles (half-bytes), with left-right interleaving in the case of stereo.
::(that's 1010 more samples in the case of 22.05 kHz mono or stereo, and 2034 more samples in the case of 44.1 kHz mono)
:Thus the total number of samples per block (including the two in the header) is 1012 for 22.05 kHz (mono or stereo) and 2036 for 44.1 kHz mono.
:The final block in the file can be incomplete (the decoder can infer this from the block size and raw data size).


----
[[Image:sndd_alm.gif]]
===PC demo and Mac===
The Mac version and the PC demo version use a simpler format, with no support for different sample rates (all sounds are sampled at 22050 Hz).
 
[[image:sndd_alm.gif]]


{{Table}}
{{Table}}
{{OBDth}}
{{OBDth}}
{{OBDtr| 0x00 | res_id  |FF0000| 01 D6 08 00 | 2262      | 02262-comguy_dth2.aif.SNDD }}
{{OBDtr| 0x00 | res_id  |FF0000| 01 D6 08 00 | 2262      | SNDDcomguy_dth2.aif, instance number #2262 }}
{{OBDtr| 0x04 | lev_id  |FFFF00| 01 00 00 06 | 3        | level 3 }}
{{OBDtr| 0x04 | lev_id  |FFFF00| 01 00 00 06 | 3        | level 3 }}
{{OBDtr| 0x08 | int32    |FFC8C8| 01 00 00 00 | 1        | flags
{{OBDtr| 0x08 | uint32  |FFC8C8| 01 00 00 00 | 1        | flags
:1 - (unknown; always the same in Vanilla Oni)
:1 - (compressed; always the same in Vanilla Oni)
:2 - stereo (mono if disabled) }}
:2 - stereo (mono if disabled) }}
{{OBDtr| 0x0C | int32    |FFFFC8| 37 00 00 00 | 55        | duration in 1/60 seconds (game ticks) }}
{{OBDtr| 0x0C | uint16  |FFFFC8| 37 00       | 55        | duration in 1/60 seconds (game ticks) }}
{{OBDtr| 0x10 | int32    |C8FFC8| 5E 2A 00 00 | 10846    | size of the part in the raw file in bytes }}
{{OBDtr| 0x0E | uint16  |FFFFC8| 00 00      | 0x0000    | padding (unused in Vanilla Oni) }}
{{OBDtr| 0x10 | uint32  |C8FFC8| 5E 2A 00 00 | 10846    | size of the part in the raw file, in bytes }}
{{OBDtr| 0x14 | offset  |C8FFFF| 00 B1 01 00 | 0x1B100  | at this position starts the part in the raw file }}
{{OBDtr| 0x14 | offset  |C8FFFF| 00 B1 01 00 | 0x1B100  | at this position starts the part in the raw file }}
{{OBDtr| 0x18 | char[8]  |FFC8FF| AD DE      | dead      | 8 unused bytes (padding) }}
{{OBDtr| 0x18 | char[8]  |FFC8FF| AD DE      | dead      | 8 unused bytes (padding; not part of the SNDD file) }}
|}
|}
;Padding
:The 8 bytes at the end are ''not'' part of the SNDD template. They are not loaded by the Oni engine.
:The uint16 at 0x0E ''is'' loaded by the engine, and thus constitutes a potentially useful 16-bit field.
;Duration
:The duration (number of game ticks) is rounded to the ''lower'' value (a.k.a. "floor") in Vanilla Oni data.<ref name="ticks">As an example, on PC (both demo and retail), '''SNDDmus_ot7.aif''' consists of 152360 samples, which at 22.05 kHz corresponds to 6.90975 seconds, or 414.585 ticks; the duration, however, is indicated as 414 ticks and not 415. In other words, "duration" corresponds to the number of ''whole'' game ticks spanned by the sound's playback.</ref> In other words, it indicates the number of ''complete'' game ticks spanned by the sound's playback.
;Compression
:The effect of the "1" flag (or rather of its absence) is different in the PC demo and Mac engines.
:*On Mac, the flag seems to have no effect at all (the .raw data is still passed to the IMA4 decoding algorithm even if the "1" flag is missing).
:*On PC demo, a missing "1" flag causes playback to fail.<ref>If a SNDD misses the "1" flag ("compressed"), the PC demo engine ''does'' identify the stream as already uncompressed (PCM samples) and skips the initialization phase of the decompression routine, but then proceeds with decompression anyway, and immediately stops because of zero input size.</ref> If the sound is part of a looping permutation, the game crashes (because playback keeps failing on the same sound, over and over).
:The bottom line is that, both for PC demo SNDDs and for Mac SNDDs, the "1" flag should always be set.
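The "floor" rule given for the duration field can be checked with integer arithmetic. The helper below is a minimal sketch (the name <code>duration_ticks</code> is hypothetical), assuming 60 game ticks per second as stated in the table and using the sample count of '''SNDDmus_ot7.aif''' quoted in the footnote.

```python
def duration_ticks(n_samples: int, sample_rate: int = 22050,
                   ticks_per_second: int = 60) -> int:
    """Number of complete game ticks spanned by n_samples of audio.

    Integer (floor) division mirrors the rounding observed in Vanilla Oni data.
    """
    return n_samples * ticks_per_second // sample_rate


# 152360 samples at 22.05 kHz span 414.585 ticks, stored as 414.
print(duration_ticks(152360))
```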
The compressed storage of .raw data is described in the following two sections.


The .raw data contains the actual audio sample blocks without any other headers (other than ADPCM block headers).
===IMA4 ADPCM .raw data (Mac)===
====.raw part (MS ADPCM, PC demo)====
''For an overview of the IMA ADPCM algorithm and IMA4 header (if interested), see [https://wiki.multimedia.cx/index.php/Apple_QuickTime_IMA_ADPCM HERE]. For an actual implementation example, see FFmpeg.''
For PC demo the .raw SNDD data is actually the same as for PC retail, but with the same short .dat header as on Mac. The ADPCM block size is 512 bytes for mono, and 1024 for stereo. The sample rate is 22.05 kHz.
:The IMA4 ADPCM stream data (stored in the .raw file) consists of 34-byte blocks (in the case of stereo, there is an even number of such blocks, with Left and Right blocks interleaved).
====.raw part (IMA4 ADPCM, Mac)====
:The first two bytes of each block form a header that sets the initial predictor (upper 9 bits) and step (lower 7 bits) for decoding the block's samples. Typically they are used only for the first block, or for sudden changes of the waveform's value range.
For an overview of the IMA ADPCM algorithm and IMA4 header (if interested), see [https://wiki.multimedia.cx/index.php/Apple_QuickTime_IMA_ADPCM HERE]
:The IMA4 ADPCM data has 34-byte blocks (in the case of stereo, there is an even number of such blocks, because Left and Right blocks are interleaved).
:The first two bytes of each block are used to set the initial predictor (upper 9 bits) and step (lower 7 bits) for decoding the block's samples.
:The other 32 bytes consist of 64 samples stored as nibbles (half-bytes). In the case of stereo, all the nibbles in a block belong to the same channel (either all Left or all Right).
:The other 32 bytes consist of 64 samples stored as nibbles (half-bytes). In the case of stereo, all the nibbles in a block belong to the same channel (either all Left or all Right).
:Unlike for MS ADPCM, incomplete trailing blocks (if any) are not indicated in any way: the final blocks are stored in their entirety, with no way to tell how much of them is actual data.
:All the 34-byte blocks must be stored in their entirety, meaning that the overall sample count of the waveform is a multiple of 64 (this is a major difference from MS ADPCM storage in PC retail and PC demo, where the final block is truncated after the last actual sample).
:For this reason, identical sounds do not have the same sample count on PC (both retail and demo) and on Mac. As an example, here are the stats for the main menu music:
:For this reason, identical sounds do not have the same sample count on PC (both retail and demo) and on Mac. As an example, here are the stats for the main menu music:
{{divhide|Main menu music a.k.a. "Oni Trailer"}}
{{divhide|Main menu music a.k.a. "Oni Trailer"}}
Line 120: Line 98:
{{divhide|end}}
{{divhide|end}}
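The IMA4 block layout described above can be sketched in a few lines of Python. The function names are hypothetical. Two details are assumptions not stated on this page: the 2-byte block header is read as a big-endian value (the usual QuickTime convention), and the predictor is taken as the signed 16-bit header value with its lower 7 bits cleared.

```python
IMA4_BLOCK_SIZE = 34         # 2 header bytes + 32 bytes of nibbles
IMA4_SAMPLES_PER_BLOCK = 64  # two 4-bit samples per data byte


def ima4_block_header(block: bytes) -> tuple[int, int]:
    """Return (initial predictor, step index) from a 34-byte IMA4 block."""
    preamble = int.from_bytes(block[:2], "big")  # assumed big-endian
    step_index = preamble & 0x7F                 # lower 7 bits
    predictor = preamble & 0xFF80                # upper 9 bits, left in place
    if predictor >= 0x8000:                      # sign-extend from 16 bits
        predictor -= 0x10000
    return predictor, step_index


def ima4_sample_count(raw_size: int, n_channels: int) -> int:
    """Samples per channel implied by the size of an IMA4 .raw part."""
    packets, rest = divmod(raw_size, IMA4_BLOCK_SIZE * n_channels)
    if rest:
        raise ValueError("IMA4 data must consist of whole 34-byte blocks")
    return packets * IMA4_SAMPLES_PER_BLOCK


# The Mac example above has a 10846-byte mono .raw part: 319 blocks.
print(ima4_sample_count(10846, 1))
```

Because only whole blocks are stored, the result is always a multiple of 64, which is why the Mac sample counts are slightly larger than the PC ones.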


By looking at the end of the Mac SNDDs (or the exported AIFF files), it can be confirmed that the extraneous samples are actually there, at the end of the last two 34-byte blocks (last Left block and last Right block), with no way to interrupt playback upon reaching these trailing samples - because they are no different from regular samples.
===MS ADPCM .raw data (PC demo)===
''For a detailed overview of the ADPCM algorithm (if interested), see [https://wiki.multimedia.cx/index.php/Microsoft_ADPCM HERE]. For an actual implementation example, see FFmpeg.''
:The MS ADPCM stream data (stored in .raw) consists of 512- or 1024-byte blocks (512 bytes for 22.05 kHz mono, 1024 bytes for 22.05 kHz stereo)
:Each block starts with a 7- or 14-byte header (7 bytes for mono, 14 bytes for stereo), which includes the 16-bit values of the block's first two samples.
:The remaining 505 or 1010 bytes of each block consist of nibbles (half-bytes), each coding for a sample. In the case of stereo, Left and Right nibbles are interleaved.
:Thus the block has room for 1010 samples encoded as nibbles, and the total number of samples per block (including the header) is 1012, be it for mono or stereo 22.05 kHz.
:For space efficiency, the MS ADPCM stored in .raw deviates from the standard, in that the final block is truncated after the last actual sample. In a WAVE file, ADPCM blocks are stored in their entirety, and the actual sample count is specified in a "fact" chunk. Oni has no such chunk; instead, the actual sample count is inferred from the block size and the truncated raw data size.
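The inference of the sample count from a truncated MS ADPCM stream can be sketched as follows. The function name and the handling of a malformed trailing fragment (shorter than a block header) are assumptions made for illustration; the block arithmetic follows the description above.

```python
def ms_adpcm_sample_count(raw_size: int, block_size: int,
                          n_channels: int, bits_per_sample: int = 4) -> int:
    """Samples per channel in an MS ADPCM stream whose last block is truncated."""
    header = 7 * n_channels  # 7 header bytes per channel
    bits_per_frame = bits_per_sample * n_channels
    # Two samples per channel live in the header, the rest are nibbles.
    samples_per_block = (block_size - header) * 8 // bits_per_frame + 2

    whole_blocks, rest = divmod(raw_size, block_size)
    total = whole_blocks * samples_per_block
    if rest >= header:
        # Truncated final block: header samples plus whatever nibbles remain.
        total += (rest - header) * 8 // bits_per_frame + 2
    return total


# 22.05 kHz mono, 10326 bytes of .raw data: 20 whole blocks + 86 bytes.
print(ms_adpcm_sample_count(10326, 512, 1))
```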


Also, from a careful examination of the sound stream that is actually played back by Oni in the main menu, it is clear that all the Oni engines (both Mac and PC) play back all the available data (including the "padding" of the fixed-size IMA4 blocks) before switching to the next segment. The frame count (number of game ticks) is ignored or used only as an indication (e.g., for approximate cueing in [[BSL]]).
==PC retail==
Below is the .dat file part used in the PC retail version.


This uninterrupted playback of fixed-size IMA4 blocks is one of the aspects that impact seamless playback of sound sequences in Mac Oni (music or ambient tracks). See [[OBD:SNDD#Looping_issues|"Looping Issues"]] below.
[[Image:sndd_all.gif]]


;NOTE
{{Table}}
:Musically, the two segments of the main menu music correspond to the same duration (four bars of a 4/4 beat). However, somewhat surprisingly, the two segments don't have the same sample count, or even the same frame count (in game ticks), even when comparing the two sounds on the same platform. That means that, even on PC where the playback is nearly seamless, we are actually hearing musical loops of unequal length, ending 10 milliseconds early or late, and it still sounds OK.
{{OBDth}}
{{OBDtr| 0x00 | res_id  |FF0000| 01 D7 08 00 | 2263      | SNDDcomguy_dth2.aif, instance number #02263 }}
{{OBDtr| 0x04 | lev_id  |FFFF00| 01 00 00 06 | 3        | level 3 }}
{{OBDtr| 0x08 | uint32  |FFC8C8| 08 00 00 00 | 8        | flags
:1 - never used in Vanilla Oni; unknown (no effect?)
:2 - never used in Vanilla Oni; unknown (no effect?)
:4 - never used in Vanilla Oni; enables rudimentary format header and IMA4 decoding (overrules "8")
:8 - always used in Vanilla Oni; enables fully-featured format header and PCM / MS ADPCM decoding }}
{{OBDtr| 0x0C | block[50]|FFC8C8| &nbsp;      | &nbsp;    | format header (MS ADPCM variant here; can also be IMA4, see below)}}
{{OBDtr| 0x3E | uint16  |FFFFC8| 37 00      | 55        | duration in 1/60 seconds (game ticks), rounded to the lower value }}
{{OBDtr| 0x40 | uint32  |C8FFC8| 56 28 00 00 | 10326    | size of the part in the raw file in bytes }}
{{OBDtr| 0x44 | offset  |C8FFFF| 20 10 59 00 | 0x591020  | at this position starts the part in the raw file }}
{{OBDtr| 0x48 | char[24] |FFC8FF| AD DE      | dead      | 24 unused bytes (padding) }}
|}
;Padding
:The 24 bytes at the end are ''not'' part of the SNDD template. They are not loaded by the Oni engine.
;Duration
:Same as for PC demo and Mac, the duration (number of game ticks) is rounded to the ''lower'' value (a.k.a. "floor") in Vanilla Oni data.<ref name="ticks"/> In other words, it indicates the number of ''complete'' game ticks spanned by the sound's playback.
;Compression modes (flags)
:Unlike for PC demo, there is no single compressed/uncompressed flag in PC retail SNDDs, and no stereo/mono flag either.
:The '''channel count''' is specified in the 50-byte header if said header is enabled, otherwise it is inferred from [[OBD:OSBD/OSGr|OSGr]].
:As for '''compression''', PC retail actually has ''three'' primary compression modes, commanded by the flag values. Only the first is used in Vanilla Oni.
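As a rough sketch of how the PC retail .dat part and its flags fit together, the snippet below unpacks the fields from the table above and maps the flag value to one of the three modes. The function names and mode labels are hypothetical, and the behavior for flags other than "8" is based only on the observations reported on this page (it is never exercised by Vanilla Oni data).

```python
import struct


def retail_mode(flags: int) -> str:
    """Map SNDD flags (at 0x08) to a decoding mode label."""
    if flags & 4:
        return "ima4"      # "4" overrules "8"
    if flags & 8:
        return "wave_fmt"  # WAVE-like format header (PCM / MS ADPCM)
    return "raw_pcm"       # neither flag: data apparently used as-is


def parse_retail_sndd(dat: bytes) -> dict:
    """Decode the PC retail SNDD .dat part (little-endian assumed)."""
    res_id, lev_id, flags = struct.unpack_from("<III", dat, 0x00)
    fmt_block = bytes(dat[0x0C:0x3E])  # 50-byte format header
    duration, raw_size, raw_offset = struct.unpack_from("<HII", dat, 0x3E)
    return {
        "res_id": res_id,
        "lev_id": lev_id,
        "flags": flags,
        "mode": retail_mode(flags),
        "fmt_block": fmt_block,
        "duration_ticks": duration,
        "raw_size": raw_size,
        "raw_offset": raw_offset,
    }


# Example values from the table above, with a zeroed 50-byte format block.
example = (bytes.fromhex("01D70800 01000006 08000000") + bytes(50)
           + bytes.fromhex("3700 56280000 20105900"))
info = parse_retail_sndd(example)
print(info["mode"], info["duration_ticks"], info["raw_size"])
```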


----
===WAVE-like format header ("8" flag)===
==Exporting and importing tips==
If the "8" flag of the SNDD (at 0x08) is ON and the "4" flag is OFF (as is always the case in Vanilla Oni), the 50-byte block is interpreted as a standard "fmt " chunk that you find in WAVE files (see [[/wav|HERE]] for details).
To create a wav/aif file one needs to write a file header like below and then write the contents of the raw data part.
===WAV files (from PC retail/demo SNDDs)===
*Write "RIFF"
*add the size of the part in the raw file + 70 bytes
*write "WAVE"
*write "fmt "
*write 50
*write the wav header '''(for PC demo, the wav header is ''not'' present in the .dat part, and has to be deduced)'''
*'''OPTIONAL/RECOMMENDED: compute the number of samples and add a "fact" section announcing it'''
*write "data"
*add the size of the part in the raw file '''OPTIONAL/RECOMMENDED: increase the size if the last sample block is incomplete'''
*add the raw file data '''OPTIONAL/RECOMMENDED: add padding to the last sample block if it is incomplete'''
*save it as a wav file.
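The export steps listed above, including the recommended "fact" section and padding of the last block, can be sketched as follows. The function name <code>build_wav</code> is hypothetical; the 50-byte format block and the sample count are assumed to be supplied by the caller (for PC demo SNDDs the format block is not present in the .dat part and has to be deduced).

```python
import struct


def build_wav(fmt_block: bytes, raw: bytes, n_samples: int,
              block_size: int) -> bytes:
    """Wrap truncated MS ADPCM .raw data into a RIFF/WAVE byte string."""
    # Pad the last block with zero bytes so that the data is whole blocks.
    data = raw + bytes(-len(raw) % block_size)
    chunks = (
        b"fmt " + struct.pack("<I", len(fmt_block)) + fmt_block
        + b"fact" + struct.pack("<II", 4, n_samples)
        + b"data" + struct.pack("<I", len(data)) + data
    )
    body = b"WAVE" + chunks
    # The RIFF size field counts everything after itself (from 0x08 onward).
    return b"RIFF" + struct.pack("<I", len(body)) + body


# 22.05 kHz mono example: 10326 bytes of .raw data, 20400 samples.
wav = build_wav(bytes(50), bytes(10326), 20400, 512)
print(len(wav), struct.unpack_from("<I", wav, 4)[0])
```

With the "fact" section and padding included, the "data" chunk grows to 10752 bytes and the header grows by 12 bytes compared to the minimal layout.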


[[image:sndd_wav.gif]]
[[Image:sndd_hd.gif]]


{{Table}}
{{Table}}
{{OBDth}}
{{OBDth}}
{{OBDtrBK|1=Complete ADPCM wav format header (black outline)}}
{{OBDtr| 0x0C | int16    |FFFFC8| 02 00      | 2    | format ID (2 {{=}} MS ADPCM format)
{{OBDtr| 0x00 | char[4]  |FF0000| 52 49 46 46 | RIFF      | identifier for the "IBM/Microsoft RIFF" standard }}
:<small>'''N.B.''' At the time of writing, only "1" (linear PCM) and "2" (MS ADPCM) are known to work in Oni; in Vanilla Oni, only MS ADPCM is ever used.</small>}}
{{OBDtr| 0x04 | int32    |FFFF00| 9C 28 00 00 | 10396    | size of the file from 0x08 to the end (<nowiki>=</nowiki> size of the .raw part + 70 bytes) }}
{{OBDtr| 0x0E | int16    |C8FFC8| 01 00      | 1    | number of channels (1 {{=}} mono)
{{OBDtr| 0x08 | char[4]  |00FF00| 57 41 56 45 | WAVE      | identifier for the "WAVE" format }}
:<small>'''N.B.''' Both PCM and ADPCM support only mono and stereo sounds, i.e., 1 or 2 channels.</small>}}
{{OBDtr| 0x0C | char[4]  |00FFFF| 66 6D 74 20 | "fmt "   | identifier announcing the following wav format header }}
{{OBDtr| 0x10 | int32    |C8FFFF| 22 56 00 00 | 22050 | sample rate in Hz (samples per second), a.k.a. "sampling frequency" }}
{{OBDtr| 0x10 | int32   |FFC8C8| 32 00 00 00 | 50        | wave format header size }}
{{OBDtr| 0x14 | int32    |FFC8FF| 93 2B 00 00 | 11155 | ADPCM average data rate: <math>\frac{\text{samples per second}*\text{block alignment}}{\text{samples per block}}</math> }}
{{OBDtr| 0x14 | block[50]|FFC8C8| &nbsp;     | &nbsp;   | [[OBD:SNDD/wav|wav header]] }}
:<small>'''N.B.''' For PCM, the data rate is simply ''samples per second*block alignment'', seeing as each sample gets its own block.</small>
{{OBDtr| 0x46 | char[4]  |FFFFC8| 64 61 74 61 | data      | identifier announcing the following wav data }}
:<small>'''N.B.''' For ADPCM, the average data rate is based on whole ADPCM blocks (not accounting for how Oni truncates the .raw data).</small>}}
{{OBDtr| 0x4A | int32    |C8FFC8| 56 28 00 00 | 10326    | size of the following wav data in bytes (<nowiki>=</nowiki> size of the .raw part) }}
{{OBDtr| 0x18 | int16    |FFC800| 00 02      | 512  | block alignment a.k.a. "block size", in bytes
:<small>'''N.B.''' The block size is trivially 2 bytes for PCM mono (one 16-bit sample) and 4 bytes for PCM stereo (Left and Right 16-bit samples).</small>
:<small>'''N.B.''' For ADPCM, Oni's Vanilla data always uses 512 bytes per channel for 22050 Hz waveforms, and 1024 bytes for 44.1 kHz mono (see below).</small>}}
{{OBDtr| 0x1A | int16    |C800C8| 04 00      | 4    | bits per sample (per channel); typically 4 bits for ADPCM, 16 bits for PCM }}
{{OBDtrBK|1=Special extended ADPCM wav format header (black outline); fully ignored if the format ID is 1 }}
{{OBDtr| 0x1C | int16    |C87C64| 20 00      | 32   | size of the extra ADPCM parameters, in bytes; typically always 32 }}
{{OBDtr| 0x1E | int16   |B0C3D4| F4 03      | 1012  | samples per block: <math>\dfrac{(\text{block alignment}-7*\text{number of channels})*8}{\text{bits per sample}*\text{number of channels}}+2</math> }}
{{OBDtr| 0x20 | int16    |E7CEA5| 07 00      | 7    | number of the following coefficient pairs; always 7 in practice }}
|-align=center valign=top
| 0x22 || int16-16 || bgcolor="#FFDDDD" | 00 01 00 00 || 256, 0 || rowspan=7 align=left | The coefficient pairs themselves (always the same in practice).<br><math>\begin{array}{|c|c||c|} \text{coefficient set} & \text{coefficient 1} & \text{coefficient 2} \\
\hline
0 & 256 &    0\\
1 & 512 & -256\\
2 &  0 &    0\\
3 & 192 &  64\\
4 & 240 &    0\\
5 & 460 & -208\\
6 & 392 & -232
\end{array} </math>
|-align=center valign=top
| 0x26 || int16-16 || bgcolor="#FFDDDD" | 00 02 00 FF || 512, -256
|-align=center valign=top
| 0x2A || int16-16 || bgcolor="#FFDDDD" | 00 00 00 00 || 0, 0
|-align=center valign=top
| 0x2E || int16-16 || bgcolor="#FFDDDD" | C0 00 40 00 || 192, 64
|-align=center valign=top
| 0x32 || int16-16 || bgcolor="#FFDDDD" | F0 00 00 00 || 240, 0
|-align=center valign=top
| 0x36 || int16-16 || bgcolor="#FFDDDD" | CC 01 30 FF || 460, -208
|-align=center valign=top
| 0x3A || int16-16 || bgcolor="#FFDDDD" | 88 01 18 FF || 392, -232
|}
;PCM vs ADPCM
:Although Vanilla Oni SNDDs only ever use MS ADPCM waveforms, some mods have successfully used the (bloated!!!) PCM format.
:For PCM, the format ID (at 0x0C) is set to 1, the block size is either 2 or 4, the data rate formula is simplified, and everything between 0x1C and 0x3E is ignored. 
:See [[/wav#PCM_.28with_.22fmt_.22.29|HERE]] for more details on the importing procedure, but keep in mind that PCM waveforms are ''not'' recommended!
;ADPCM coefficients
:The 14 coefficients are de-facto standard, but custom values are formally allowed by the MS ADPCM algorithm.
:Therefore, each MS ADPCM waveform is always accompanied by the coefficient pairs that were used to encode it.
:Thus, even though these numbers are practically always the same, they are required; don't ever mess with them.
;Standard sets of ADPCM parameters
Below are the three types of headers occurring for Vanilla Oni's sounds (MS ADPCM).
----
22.05 kHz mono (used for the vast majority of sounds):
:'''1''' channel; sample rate '''''22050''''' Hz; average data rate '''11155''' B/s (truncated from ~11155.7312253 = 22050*512/1012);
:block alignment '''''512''''' bytes; '''4''' bits per sample; '''''1012''''' samples per block (= 2 + (512 - 7)*8/4/1); standard coefficient table.
{|cellpadding=3 cellspacing=0 style="line-height:13px"
{{HexRow|0x00|
|°°|°°|°°|°°|°°|°°|°°|°°|08|00|00|00|02|00|'''01'''|'''00'''|
}}
{{HexRow|0x10|
|'''''22'''''|'''''56'''''|'''''00'''''|'''''00'''''
|'''93'''|'''2B'''|'''00'''|'''00'''
|'''''00'''''|'''''02'''''
|'''04'''|'''00'''
|20|00
|'''''F4'''''|'''''03'''''
}}
{{HexRow|0x20|
|07|00|00|01|00|00|00|02|00|FF|00|00|00|00|C0|00|
}}
{{HexRow|0x30|
|40|00|F0|00|00|00|CC|01|30|FF|88|01|18|FF|°°|°°|
}}
|}
----
22.05 kHz stereo (used for music and some ambients):
:'''2''' channels; sample rate '''''22050''''' Hz; average data rate '''22311''' B/s (truncated from ~22311.4624506 = 22050*1024/1012);
:block alignment '''''1024''''' bytes; '''4''' bits per sample; '''''1012''''' samples per block (= 2 + (1024 - 2*7)*8/4/2); standard coefficient table.
{|cellpadding=3 cellspacing=0 style="line-height:13px"
{{HexRow|0x00|
|°°|°°|°°|°°|°°|°°|°°|°°|08|00|00|00|02|00|'''02'''|'''00'''|
}}
{{HexRow|0x10|
|'''''22'''''|'''''56'''''|'''''00'''''|'''''00'''''
|'''27'''|'''57'''|'''00'''|'''00'''
|'''''00'''''|'''''04'''''
|'''04'''|'''00'''
|20|00
|'''''F4'''''|'''''03'''''
}}
{{HexRow|0x20|
|07|00|00|01|00|00|00|02|00|FF|00|00|00|00|C0|00|
}}
{{HexRow|0x30|
|40|00|F0|00|00|00|CC|01|30|FF|88|01|18|FF|°°|°°
}}
|}
----
44.1 kHz mono (used for '''ap_hit_shld''' and the 45 '''zap##''' sounds):
:'''1''' channel; sample rate '''''44100''''' Hz; average data rate '''22179''' B/s (truncated from ~22179.9607073 = 44100*1024/2036);
:block alignment '''''1024''''' bytes; '''4''' bits per sample; '''''2036''''' samples per block (= 2 + (1024 - 7)*8/4/1); standard coefficient table.
{|cellpadding=3 cellspacing=0 style="line-height:13px"
{{HexRow|0x00|
|°°|°°|°°|°°|°°|°°|°°|°°|08|00|00|00|02|00|'''01'''|'''00'''|
}}
{{HexRow|0x10|
|'''''44'''''|'''''AC'''''|'''''00'''''|'''''00'''''
|'''A3'''|'''56'''|'''00'''|'''00'''
|'''''00'''''|'''''04'''''
|'''04'''|'''00'''
|20|00
|'''''F4'''''|'''''07'''''
}}
{{HexRow|0x20|
|07|00|00|01|00|00|00|02|00|FF|00|00|00|00|C0|00|
}}
{{HexRow|0x30|
|40|00|F0|00|00|00|CC|01|30|FF|88|01|18|FF|°°|°°|
}}
|}
|}
The above is not 100% consistent with the WAVE storage rules, because it allows for a completely arbitrary "data" size. Microsoft ADPCM data is supposed to be stored as a number of fixed-size blocks (in Oni, each block is either 512 bytes for 22.05 kHz mono, or 1024 bytes for 22.05 kHz stereo and 44.1 kHz mono). Thus, according to the standard, the last block - even if incomplete - must be stored in its entirety, and the "data" size must be a multiple of the block size. In the above example, since the format is 22.05 kHz mono, the "data" size should be increased from 10326 to 10752=21x512, and 426 empty bytes should be added as padding, so that there are 21 complete data blocks.
----
Only the 22.05 kHz sounds (mono and stereo) are played back correctly by the PC retail engine. The forty-six 44.1 kHz sounds are interpreted as 22.05 kHz waveforms, and therefore are played back two times slower/lower than intended.


The standard way to deal with incomplete blocks is to specify not just the data size, but the ''actual number of samples'', by adding a "fact" section to the WAVE header, like this:
For any other combinations of PCM or ADPCM parameters, please refer to the importing routine, [[/wav#MS_ADPCM_2|HERE]].
{{Table}}
 
{{OBDth}}
===IMA ADPCM decoding ("4" flag)===
{{OBDtrBK|1=Complete ADPCM wav format header}}
If the "4" flag of the SNDD (at 0x08) is ON, then (regardless of the "8" flag), the .raw data will be interpreted as IMA ADPCM blocks, and the 50-byte format block will be mostly ignored, as well as the ".raw data size" field at 0x40.
{{OBDtr| 0x00 | char[4]  |FF0000| 52 49 46 46 | RIFF      | identifier for the "IBM/Microsoft RIFF" standard }}
 
{{OBDtr| 0x04 | int32    |FFFF00| 52 2A 00 00 | 10834    | size of the file from 0x08 to the end (<nowiki>=</nowiki> padded size of the wav data + 82 bytes) }}
Here is what the .dat part of the above '''SNDDcomguy_dth2''' would have looked like in IMA4 mode:
{{OBDtr| 0x08 | char[4]  |00FF00| 57 41 56 45 | WAVE      | identifier for the "WAVE" format }}
{|cellpadding=3 cellspacing=0 style="line-height:13px"
{{OBDtr| 0x0C | char[4]  |00FFFF| 66 6D 74 20 | "fmt "    | identifier announcing the following wav format header section }}
{{HexRow|0x00|
{{OBDtr| 0x10 | int32    |FFC8C8| 32 00 00 00 | 50        | wave format header size }}
|01|D7|08|00|01|00|00|06|04|00|00|00|00|00|01|00|
{{OBDtr| 0x14 | block[50]|FFC8C8| &nbsp;      | &nbsp;    | [[OBD:SNDD/wav|wav header]] }}
|FF|FF|FF|FF|FF|FF|FF|FF|00|00|00|00|80|80|00|00|
{{OBDtr| 0x46 | char[4]  |FFFFC8| 66 61 63 74 | fact      | identifier announcing the following "fact" section }}
|00|00|00|00|FF|FF|FF|FF|FF|FF|FF|FF|80|80|FF|FF|
{{OBDtr| 0x4A | int32    |FFFFC8| 04 00 00 00 | 4        | size of the following "fact" section in bytes }}
|00|00|00|00|00|00|00|00|00|00|00|00|80|80|FF|FF|
{{OBDtr| 0x4E | int32    |C8FFC8| B0 4F 00 00 | 20400    | actual number of samples (see below for calculation) }}
|°×°°°°°°°°°°°°°°
{{OBDtr| 0x52 | char[4]  |FFFFC8| 64 61 74 61 | data      | identifier announcing the following wav data }}
}}
{{OBDtr| 0x56 | int32    |C8FFC8| 00 2A 00 00 | 10752    | size of the following wav data in bytes (<nowiki>=</nowiki> size of the .raw part + 426 empty bytes) }}
{{HexRow|0x10|
|00|00|00|00|00|00|00|00|52|01|00|00|00|00|00|00|
|80|80|80|80|80|80|80|80|FF|FF|80|80|80|80|80|80|
|80|80|80|80|80|80|80|80|00|00|80|80|80|80|80|80|
|80|80|80|80|80|80|80|80|FF|FF|80|80|80|80|80|80|
|°°°°°°°°4°°°°°°°
}}
{{HexRow|0x20|
|00|00|00|00|00|00|00|00|00|00|00|00|00|00|00|00|
|80|80|80|80|80|80|80|80|80|80|80|80|80|80|80|80|
|80|80|80|80|80|80|80|80|80|80|80|80|80|80|80|80|
|80|80|80|80|80|80|80|80|80|80|80|80|80|80|80|80|
|°°°°°°°°°°°°°°°°
}}
{{HexRow|0x30|
|00|00|00|00|00|00|00|00|00|00|00|00|00|00|00|00|
|80|80|80|80|80|80|80|80|80|80|80|80|80|80|80|80|
|80|80|80|80|80|80|80|80|80|80|80|80|80|80|80|80|
|80|80|80|80|80|80|80|80|80|80|80|80|80|80|80|80|
|°°°°°°°°°°°°°°°°
}}
{{HexRow|0x40|
|00|00|00|00|20|10|59|00|AD|DE|AD|DE|AD|DE|AD|DE|
|80|80|80|80|FF|FF|FF|FF|FF|FF|FF|FF|FF|FF|FF|FF|
|80|80|80|80|DD|DD|DD|DD|C8|C8|C8|C8|C8|C8|C8|C8|
|80|80|80|80|DD|DD|DD|DD|FF|FF|FF|FF|FF|FF|FF|FF|
|°°°° °Y°°°°°°°°°
}}
{{HexRow|0x50|
|AD|DE|AD|DE|AD|DE|AD|DE|AD|DE|AD|DE|AD|DE|AD|DE|
|FF|FF|FF|FF|FF|FF|FF|FF|FF|FF|FF|FF|FF|FF|FF|FF|
|C8|C8|C8|C8|C8|C8|C8|C8|C8|C8|C8|C8|C8|C8|C8|C8|
|FF|FF|FF|FF|FF|FF|FF|FF|FF|FF|FF|FF|FF|FF|FF|FF|
|°°°°°°°°°°°°°°°°
}}
|}
|}
The actual number of samples is implied from the actual data size (size of the .raw part) and [[OBD:SNDD/wav|wav header]] properties as follows:
:The channel count field in the 50-byte "format" block (at 0x0E) is used in the same way as for a WAVE-like header.
* n_whole_blocks = floor(raw_size/block_size);   '''// EXAMPLE: floor(10326/512) = 20'''
:The "block alignment" field (at 0x18) is used (somewhat counterintuitively) to store the number of IMA4 "packets":
* last_block_size = raw_size - n_whole_blocks*block_size;    '''// EXAMPLE: 10326 - 20*512 = 86'''
:*for a mono IMA4 sound, packets are 34-byte blocks carrying 64 mono samples each (two bytes of header data and 64 "nibbles", or half-bytes);
* last_block_samples = (last_block_size - 7*n_channels)*(8/bits_per_sample/n_channels) + 2;      '''// EXAMPLE: (86 - 7)*(8/4) + 2 = 160'''
:*for a stereo IMA4 sound, packets are ''pairs'' of consecutive 34-byte blocks (the 64 Left samples are stored first, then the 64 Right samples).
* n_samples = n_whole_blocks*samples_per_block + last_block_samples;        '''// EXAMPLE: 20*1012 + 160 = 20400'''
:(The actual duration of SNDDcomguy_dth2 is 0.97941 seconds, which at 22.05 kHz requires 338 64-sample blocks, hence the value 0x152 appearing at 0x18.)
:Everything else in the 50-byte format block is ignored, as well as the ".raw data size" usually found at 0x40 (Oni uses the number of packets instead).
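The MS ADPCM sample-count arithmetic listed above can be sketched in Python as follows. This is only an illustration: the function name and default arguments are hypothetical, the 7-byte per-channel block header and the 4-bit sample size come from the formulas above, and the guard for a missing or too-short final block is an assumption (the formulas do not cover that case).

```python
def ms_adpcm_sample_count(raw_size, n_channels=1, block_size=512, bits_per_sample=4):
    """Per-channel sample count implied by the size of an MS ADPCM .raw part."""
    header = 7 * n_channels  # block header bytes; they already encode 2 samples

    def samples_in(n_bytes):
        # Nibbles after the header, shared between channels, plus the 2 header samples.
        return (n_bytes - header) * 8 // (bits_per_sample * n_channels) + 2

    samples_per_block = samples_in(block_size)
    n_whole_blocks = raw_size // block_size
    last_block_size = raw_size - n_whole_blocks * block_size
    # Assumption: a final block shorter than its own header carries no samples.
    last_block_samples = samples_in(last_block_size) if last_block_size >= header else 0
    return n_whole_blocks * samples_per_block + last_block_samples


print(ms_adpcm_sample_count(10326))  # the worked example above: 20 whole blocks + 86 bytes
```

With the default arguments (mono, 512-byte blocks), a 10326-byte .raw part yields the 20400 samples stored in the "fact" section of the example.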


----
===Raw PCM (no flags)===
===AIF files (from Mac SNDDs)===
If neither the "4" nor the "8" flags are set (at 0x08 in the .dat part of the SNDD), then the .raw data is apparently copied as-is into the playback buffer.
*Write "FORM"
*add the size of the part in the raw file + 50 bytes
*write "AIFC"
*write "COMM"
*add the aif header (after filling in the number of channels, the sample rate (always 22.05 kHz), the bits per sample (always 16) and the number of sample frames/blocks)
*write "SSND"
*add the size of the part in the raw file + 8 bytes
*add 8 zero bytes (custom "offset" and "block size" fields)
*add the raw file data and save it as an aif file.
Note the [[wikipedia:Big Endian|Big Endian]] order
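The steps above can be sketched in Python as follows. This is a hedged illustration, not a reference converter: the function name is hypothetical, and the internal layout of the 26-byte aif header is assumed here to be a 4-byte chunk size followed by the standard AIFF-C "COMM" fields (channel count, frame count, bit depth, 80-bit sample rate, compression type "ima4") with no compression name. Strict AIFF-C readers may also expect an "FVER" chunk, which the steps above do not include.

```python
import struct

def wrap_ima4_as_aifc(raw, n_channels, n_packets):
    """Prepend the 58-byte header described above to IMA4 .raw data."""
    # 22050 Hz as an 80-bit extended float (exponent 0x400D, mantissa 0xAC44 << 48).
    rate_22050 = bytes.fromhex("400DAC44000000000000")
    # Assumed COMM layout: channels, sample frames (packets), bits per sample,
    # sample rate, compression type. All multi-byte fields are Big Endian.
    comm_body = struct.pack(">hIh", n_channels, n_packets, 16) + rate_22050 + b"ima4"
    comm = b"COMM" + struct.pack(">I", len(comm_body)) + comm_body
    # SSND: size = raw size + 8, then zeroed "offset" and "block size" fields.
    ssnd = b"SSND" + struct.pack(">I", len(raw) + 8) + bytes(8) + raw
    return b"FORM" + struct.pack(">I", len(raw) + 50) + b"AIFC" + comm + ssnd


aifc = wrap_ima4_as_aifc(bytes(10846), n_channels=1, n_packets=319)
print(len(aifc), aifc[4:8].hex(), aifc[0x2E:0x32].hex())
```

For a 10846-byte .raw part this reproduces the two size fields shown in the table (0x2A90 at 0x04 and 0x2A66 at 0x2E).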


[[image:sndm_aif.gif]]
In this "raw" playback mode, the sound will not be reproduced correctly unless it's linear PCM (Little Endian) with 16-bit sample depth, 22.05 kHz sampling rate, and a channel count consistent with the SNDD's [[OBD:OSBD/OSGr|OSGr]] (i.e., the metadata used by Oni's engine to set up the sound's playback).


{{Table}}
Here is what the .dat part of the above '''SNDDcomguy_dth2''' would have looked like in raw PCM mode:
{{OBDth}}
{|cellpadding=3 cellspacing=0 style="line-height:13px"
{{OBDtrBK|Complete aif format header (black outline) }}
{{HexRow|0x00|
{{OBDtr| 0x00 | char[4]  |FF0000| 46 4F 52 4D | FORM      | identifier for the "EA IFF 85" standard }}
|01|D7|08|00|01|00|00|06|00|00|00|00|00|00|00|00|
{{OBDtr| 0x04 | int32    |FFFF00| 00 00 2A 90 | 10896    | size of the file from 0x08 to the end (<nowiki>=</nowiki> size of the .raw part + 50 bytes) }}
|FF|FF|FF|FF|FF|FF|FF|FF|00|00|00|00|80|80|80|80|
{{OBDtr| 0x08 | char[4]  |00FF00| 41 49 46 43 | AIFC      | identifier for the "AIFC" format (compressed aif file) }}
|00|00|00|00|FF|FF|FF|FF|FF|FF|FF|FF|80|80|80|80|
{{OBDtr| 0x0C | char[4]  |00FFFF| 43 4F 4D 4D | COMM      | identifier announcing the following aif format header }}
|00|00|00|00|00|00|00|00|00|00|00|00|80|80|80|80|
{{OBDtr| 0x10 | block[26]|FFC8C8| &nbsp;      | &nbsp;    | [[OBD:SNDD/aif|aif header]] }}
|°×°°°°°°°°°°°°°°
{{OBDtr| 0x2A | char[4]  |FFFFC8| 53 53 4E 44 | SSND      | identifier announcing the following aif data }}
}}
{{OBDtr| 0x2E | int32    |C8FFC8| 00 00 2A 66 | 10854    | size of the file from 0x32 to the end (<nowiki>=</nowiki> size of the .raw part + 8 bytes) }}
{{HexRow|0x10|
{{OBDtr| 0x32 | int32    |C8FFFF| 00 00 00 00 | 0        | offset; determines where the first sample in the data starts; use zero }}
|00|00|00|00|00|00|00|00|00|00|00|00|00|00|00|00|
{{OBDtr| 0x36 | int32    |FFC8FF| 00 00 00 00 | 0        | block size; used in conjunction with offset for block-aligning data; use zero }}
|80|80|80|80|80|80|80|80|80|80|80|80|80|80|80|80|
|80|80|80|80|80|80|80|80|80|80|80|80|80|80|80|80|
|80|80|80|80|80|80|80|80|80|80|80|80|80|80|80|80|
|°°°°°°°°°°°°°°°°
}}
{{HexRow|0x20|
|00|00|00|00|00|00|00|00|00|00|00|00|00|00|00|00|
|80|80|80|80|80|80|80|80|80|80|80|80|80|80|80|80|
|80|80|80|80|80|80|80|80|80|80|80|80|80|80|80|80|
|80|80|80|80|80|80|80|80|80|80|80|80|80|80|80|80|
|°°°°°°°°°°°°°°°°
}}
{{HexRow|0x30|
|00|00|00|00|00|00|00|00|00|00|00|00|00|00|00|00|
|80|80|80|80|80|80|80|80|80|80|80|80|80|80|80|80|
|80|80|80|80|80|80|80|80|80|80|80|80|80|80|80|80|
|80|80|80|80|80|80|80|80|80|80|80|80|80|80|80|80|
|°°°°°°°°°°°°°°°°
}}
{{HexRow|0x40|
|00|00|00|00|20|10|59|00|AD|DE|AD|DE|AD|DE|AD|DE|
|80|80|80|80|FF|FF|FF|FF|FF|FF|FF|FF|FF|FF|FF|FF|
|80|80|80|80|DD|DD|DD|DD|C8|C8|C8|C8|C8|C8|C8|C8|
|80|80|80|80|DD|DD|DD|DD|FF|FF|FF|FF|FF|FF|FF|FF|
|°°°° °Y°°°°°°°°°
}}
{{HexRow|0x50|
|AD|DE|AD|DE|AD|DE|AD|DE|AD|DE|AD|DE|AD|DE|AD|DE|
|FF|FF|FF|FF|FF|FF|FF|FF|FF|FF|FF|FF|FF|FF|FF|FF|
|C8|C8|C8|C8|C8|C8|C8|C8|C8|C8|C8|C8|C8|C8|C8|C8|
|FF|FF|FF|FF|FF|FF|FF|FF|FF|FF|FF|FF|FF|FF|FF|FF|
|°°°°°°°°°°°°°°°°
}}
|}
|}
;Effect of "1" and "2" flags
:It is possible that the "1" and "2" flags used to affect playback in raw PCM mode (something about swapping the .raw data to allow both for Little Endian and Big Endian PCM samples), but currently they do not seem to have any effect.




==Looping issues==
==PS2 implementation==
As detailed above, ADPCM data is stored in blocks, but the actual sound data does not necessarily end exactly at the end of a block. This is true both for MS ADPCM (PC retail or demo) and IMA4 ADPCM (Mac), but is especially noticeable for the comparatively large blocks of MS ADPCM, where the padding can be as large as ~1010 samples, i.e., a ~46-millisecond silence in the case of 22.05 kHz (for IMA4, the biggest possible gap is 63 samples, or ~3 milliseconds).
The PS2 implementation has an unorthodox approach to raw data when it comes to SNDDs. Here are the .dat parts of the SNDDs in a PS2 level0_Final.dat file.
===MS ADPCM===
{|cellpadding=3 cellspacing=0 style="line-height:13px"
Although the final block of an MS ADPCM SNDD file (PC retail) is stored in incomplete form (with only the actual samples and no padding), the standard decoding behavior when loading an ADPCM-compressed WAV (e.g., in a non-destructive audio program) is to assume full-sized blocks, with padding up to the end of the last block. Depending on the audio program, this can create a silence or some "bad data" at the end of the imported audio, which can be a problem if one wants to join SNDDs that are supposed to play seamlessly one after another (e.g., a musical or ambient sequence).
{{HexRow|0xD5EA0|
|01|5B|0E|00|01|00|00|00|00|00|00|00|9E|01|'''''00'''''|'''''00'''''|
|FF|FF|FF|FF|FF|FF|FF|FF|FF|FF|FF|FF|FF|FF|FF|FF|
|00|00|00|00|FF|FF|FF|FF|C8|C8|C8|C8|FF|FF|FF|FF|
|00|00|00|00|00|00|00|00|C8|C8|C8|C8|C8|C8|C8|C8|
|°°°°°°°°°°°°°°°°
}}
{{HexRow|0xD5EB0|
|20|F7|00|00|'''''40'''''|'''''30'''''|'''''76'''''|'''''13'''''|AD|DE|AD|DE|AD|DE|AD|DE|
|C8|C8|C8|C8|C8|C8|C8|C8|FF|FF|FF|FF|FF|FF|FF|FF|
|FF|FF|FF|FF|FF|FF|FF|FF|C8|C8|C8|C8|C8|C8|C8|C8|
|C8|C8|C8|C8|FF|FF|FF|FF|FF|FF|FF|FF|FF|FF|FF|FF|
|°°°°°°°°°°°°°°°°
}}
{{HexRow|0xD5EC0|
|01|5C|0E|00|01|00|00|00|00|00|00|00|9E|01|'''''01'''''|'''''00'''''|
|FF|FF|FF|FF|FF|FF|FF|FF|FF|FF|FF|FF|FF|FF|FF|FF|
|00|00|00|00|FF|FF|FF|FF|C8|C8|C8|C8|FF|FF|FF|FF|
|00|00|00|00|00|00|00|00|C8|C8|C8|C8|C8|C8|C8|C8|
|°°°°°°°°°°°°°°°°
}}
{{HexRow|0xD5ED0|
|A0|F6|00|00|'''''C0'''''|'''''37'''''|'''''77'''''|'''''13'''''|AD|DE|AD|DE|AD|DE|AD|DE|
|C8|C8|C8|C8|C8|C8|C8|C8|FF|FF|FF|FF|FF|FF|FF|FF|
|FF|FF|FF|FF|FF|FF|FF|FF|C8|C8|C8|C8|C8|C8|C8|C8|
|C8|C8|C8|C8|FF|FF|FF|FF|FF|FF|FF|FF|FF|FF|FF|FF|
|°°°°°°°°°°°°°°°°
}}
{{HexRow|0xD5EE0|
|01|5D|0E|00|01|00|00|00|00|00|00|00|8F|02|'''''02'''''|'''''00'''''|
|FF|FF|FF|FF|FF|FF|FF|FF|FF|FF|FF|FF|FF|FF|FF|FF|
|00|00|00|00|FF|FF|FF|FF|C8|C8|C8|C8|FF|FF|FF|FF|
|00|00|00|00|00|00|00|00|C8|C8|C8|C8|C8|C8|C8|C8|
|°°°°°°°°°°°°°°°°
}}
{{HexRow|0xD5EF0|
|40|86|01|00|'''''C0'''''|'''''2E'''''|'''''78'''''|'''''13'''''|AD|DE|AD|DE|AD|DE|AD|DE|
|C8|C8|C8|C8|C8|C8|C8|C8|FF|FF|FF|FF|FF|FF|FF|FF|
|FF|FF|FF|FF|FF|FF|FF|FF|C8|C8|C8|C8|C8|C8|C8|C8|
|C8|C8|C8|C8|FF|FF|FF|FF|FF|FF|FF|FF|FF|FF|FF|FF|
|°°°°°°°°°°°°°°°°
}}
{{HexRow|0xD5F00|
|01|5E|0E|00|01|00|00|00|00|00|00|00|91|02|'''''03'''''|'''''00'''''|
|FF|FF|FF|FF|FF|FF|FF|FF|FF|FF|FF|FF|FF|FF|FF|FF|
|00|00|00|00|FF|FF|FF|FF|C8|C8|C8|C8|FF|FF|FF|FF|
|00|00|00|00|00|00|00|00|C8|C8|C8|C8|C8|C8|C8|C8|
|°°°°°°°°°°°°°°°°
}}
{{HexRow|0xD5F10|
|40|87|01|00|'''''40'''''|'''''B5'''''|'''''79'''''|'''''13'''''|AD|DE|AD|DE|AD|DE|AD|DE|
|C8|C8|C8|C8|C8|C8|C8|C8|FF|FF|FF|FF|FF|FF|FF|FF|
|FF|FF|FF|FF|FF|FF|FF|FF|C8|C8|C8|C8|C8|C8|C8|C8|
|C8|C8|C8|C8|FF|FF|FF|FF|FF|FF|FF|FF|FF|FF|FF|FF|
|°°°°°°°°°°°°°°°°
}}
{{HexRow|0xD5F20|
|01|5F|0E|00|01|00|00|00|00|00|00|00|3C|01|'''''04'''''|'''''00'''''|
|FF|FF|FF|FF|FF|FF|FF|FF|FF|FF|FF|FF|FF|FF|FF|FF|
|00|00|00|00|FF|FF|FF|FF|C8|C8|C8|C8|FF|FF|FF|FF|
|00|00|00|00|00|00|00|00|C8|C8|C8|C8|C8|C8|C8|C8|
|°°°°°°°°°°°°°°°°
}}
{{HexRow|0xD5F30|
|60|BC|00|00|'''''00'''''|'''''0A'''''|'''''75'''''|'''''13'''''|AD|DE|AD|DE|AD|DE|AD|DE|
|C8|C8|C8|C8|C8|C8|C8|C8|FF|FF|FF|FF|FF|FF|FF|FF|
|FF|FF|FF|FF|FF|FF|FF|FF|C8|C8|C8|C8|C8|C8|C8|C8|
|C8|C8|C8|C8|FF|FF|FF|FF|FF|FF|FF|FF|FF|FF|FF|FF|
|°°°°°°°°°°°°°°°°
}}
|}
(Yes, the PS2 retail level0_Final.dat only has those five sounds. All the other sounds (weapons, particles, impacts, footsteps, etc) are stored per-chapter, which causes a lot of duplicates but supposedly lightens the memory usage for a given level.)


As a workaround, one can preprocess .wav files with some tools that can handle incomplete MS ADPCM blocks and convert to a less ambiguous format:
The layout is similar to the PC demo and Mac SNDDs described above, and indeed the SNDD template checksum is the same for PC demo, Mac and PS2, implying that the data structure in the .dat is the same. However, there are two major novelties/anomalies (apart from all the music being mono), emphasized above with '''''bold italic'''''. First, the five .raw offsets at the end of each SNDD are obviously not pointers into level0_Final.raw or level0_Final.sep (the .raw's size is 4 MB, the .sep's is 6 MB, and the pointers are in the 311 MB range, possibly pointing into a memory region where the sounds will be stored at runtime). Second, the 2-byte padding field between the duration and the .raw storage size is obviously not blank here; rather, it is an index into a file called SOUNDS\LEVEL0\SOUND.DAT, which looks like this:
*For [http://sox.sourceforge.net/ Sox], padding is disabled by default when joining several files.
0x00: '''''00 00 20 F7 00 00 00 00 00 00''''' 01 00 A0 F6 00 00
*For [https://www.ffmpeg.org/download.html FFmpeg], padding can be disabled as an optional setting.
0x10: 20 F7 00 00 '''''02 00 40 86 01 00 C0 ED 01 00''''' 03 00
So, you'd either join the .wav files in Sox or FFmpeg, or convert them, e.g., to uncompressed PCM, and then import them into a fancy audio tool.
0x20: 40 87 01 00 00 74 03 00 '''''04 00 60 BC 00 00 40 FB'''''
0x30: '''''04 00''''' FE FF
These are 5 blocks of 10 bytes each, followed by the two bytes FE FF which signal the end of the file. For each sound there is a 2-byte index, then a 4-byte data size (including padding), then the offset at which the data is stored in the SOUNDS\LEVEL0\SOUND.SEP file. The .SEP data for each sound consists of 32 blank bytes followed by a large number of VAG packets (16 bytes each), the final VAG packet having a terminating bit set. At the very end of the .SEP file is a terminating pair of bytes, FE FF, same as in the .DAT file. The file SOUNDS\LEVEL0\SOUND.RAW exists in the same folder, but has no data except for the two bytes FE FF.
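The SOUND.DAT layout just described (10-byte entries closed by FE FF) can be read with a short Python sketch. The function name is hypothetical, and the sketch assumes Little Endian fields and that no real sound index starts with the bytes FE FF; the sample bytes are the LEVEL0 dump shown above.

```python
import struct

def parse_ps2_sound_dat(data):
    """Return (index, size, sep_offset) tuples from a PS2 SOUND.DAT file."""
    entries = []
    pos = 0
    while pos + 2 <= len(data):
        if data[pos:pos + 2] == b"\xFE\xFF":  # terminating code
            break
        # 2-byte index, 4-byte data size, 4-byte offset into SOUND.SEP
        entries.append(struct.unpack_from("<HII", data, pos))
        pos += 10
    return entries


level0_dat = bytes.fromhex(
    "0000 20F70000 00000000"
    "0100 A0F60000 20F70000"
    "0200 40860100 C0ED0100"
    "0300 40870100 00740300"
    "0400 60BC0000 40FB0400"
    "FEFF"
)
for index, size, offset in parse_ps2_sound_dat(level0_dat):
    print(index, hex(size), hex(offset))
```

In this dump each offset equals the previous offset plus the previous size, so the five sounds are stored back to back in SOUND.SEP.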


As an actual solution, the .wav file should be made compliant with RIFF WAVE standards, i.e., the last block should be padded to its full size, and a "fact" section should be used to specify the actual number of samples. This is implemented in OniSplit v 0.9.###
For all the levels other than LEVEL0 (i.e., game chapters), some of the sounds are stored in SOUNDS\LEVEL#\SOUND.RAW (the size is about 1 MB for all chapters). RAW storage is indicated by a zero .SEP offset in the corresponding block of the SOUNDS\LEVEL#\SOUND.DAT file (note, however, that the first sound in SOUND.SEP also has a zero offset). As an example, here is a fragment of SOUNDS\LEVEL1\SOUND.DAT featuring the first .RAW-resident sounds.
0x690: '''''A8 00 70 3B 02 00 22 B2 83 00''''' A9 00 50 98 00 00
0x6A0: 92 ED 85 00 '''''AA 00 20 80 00 00 E2 85 86 00''''' AB 00
0x6B0: F0 F2 00 00 02 06 87 00 '''''AC 00 10 2D 00 00 00 00'''''
0x6C0: '''''00 00''''' AD 00 C0 1B 00 00 00 00 00 00 '''''AE 00 40 0B'''''
0x6D0: '''''00 00 00 00 00 00''''' AF 00 F0 11 00 00 00 00 00 00
0x6E0: '''''B0 00 20 24 00 00 F2 F8 87 00''''' B1 00 50 25 00 00
0x6F0: 12 1D 88 00 '''''B2 00 10 58 00 00 62 42 88 00''''' B3 00
0x700: 20 1F 00 00 72 9A 88 00 '''''B4 00 A0 26 00 00 00 00'''''
0x710: '''''00 00''''' B5 00 F0 0E 00 00 00 00 00 00 '''''B6 00 B0 09'''''
0x720: '''''00 00 00 00 00 00''''' B7 00 B0 0B 00 00 00 00 00 00
Here the .SEP offset field is zero for entries 0xAC, 0xAD, 0xAE and 0xAF, then non-zero for the next four entries, and zero again for the following four. The start of the SOUNDS\LEVEL1\SOUND.RAW file looks as follows
0x00: '''''AC 00''''' 10 2D 00 00 '''''00 00 00 00 00 00 00 00 00 00'''''
0x10: '''''00 00 00 00 00 00''''' 00 00 00 00 00 00 00 00 00 00
0x20: 00 00 00 00 00 00 '''''1A 00 24 00 00 01 10 00 02 10'''''
0x30: '''''0F 00 11 1F 1F 1E''''' 1A 00 01 1F 10 0F 11 21 3E 10
0x40: 20 0F 20 41 D2 E3
Here AC 00 is the 2-byte index of the sound (the same as in the SOUNDS\LEVEL1\SOUND.DAT and in the corresponding SNDD in level1_Final.dat), then there is the 4-byte data size (also the same as announced in the .dat and .DAT), followed by the same data as in the .SEP (32 zero bytes, then some 16-byte VAG packets, the last packet having a terminating bit set). At the very end of the .RAW file is a terminating pair of bytes, FE FF.
:'''N.B.''' It appears that SOUND.RAW is loaded in its entirety when a level starts, whereas sound data from a level's SOUND.SEP is (re)loaded on-demand. Accordingly, SOUND.RAW typically contains short recurrent sounds (gunshots, impacts, footsteps, hurt sounds, etc). Permanent storage is not decided based on size alone, though: for example, the rather long SNDDheliflyby2 (6 seconds) is stored in SOUND.RAW, whereas the much shorter SNCCconsole-locked (1.5 seconds long) is stored in SOUND.SEP.
:'''N.B.''' A level's SOUND.DAT and SOUND.SEP always start with the same 5 entries (music segments) as in LEVEL0/SOUND.DAT and LEVEL0/SOUND.SEP, including the terminating code FE FF at 0x5B7A0. Those segments, indexed as 0 through 4 (same as in LEVEL0/SOUND.DAT), do not have a corresponding SNDD in the chapter's level#_Final (i.e., the 16-bit indices of the SNDD instances in level#_Final always start at 5 except for level0_Final, which only has those five SNDDs). It would seem that the sound indices listed in SOUND.DAT need to be unique across all the loaded level files. It would also seem that the duplicated storage of level0 music in all 14 chapters is highly suboptimal, using up 5 MB of disk space, and that the LEVEL0/SOUND.* files are redundant.
:'''N.B.''' Because the contents of LEVEL0/SOUND.SEP are included at the start of each level's LEVEL#/SOUND.SEP, complete with the terminating FE FF code, the data for the following, level-specific sounds is shifted by two bytes, i.e., from then on the starting offsets of each sound and VAG packet look like 0x.......2 rather than 0x.......0 (this is reflected by the offsets in LEVEL#/SOUND.DAT). This terminating code in the middle of the .SEP file probably serves no purpose whatsoever (the reading routines stop whenever they encounter a terminating bit in a VAG packet).


Slight distortions are sometimes observed near the ends of looping SNDDs (music and ambient tracks). These artifacts were likely caused by Bungie's audio tools, and cannot be undone automatically. Barely noticeable, they can be healed by manually editing audio samples near the seams.
==Exporting and importing tips==
*For manual conversion between SNDD (PC retail or demo) and a WAVE file, see [[/wav|HERE]].
*For manual conversion between SNDD (PC retail or Mac) and an AIFC file, see [[/aif|HERE]].
For automatic conversion, please use a sufficiently recent version of [[OniSplit]].


==Known engine issues==
;Custom sample/data rates
:The PC retail engine formally allows for an arbitrary sample rate (and a corresponding data rate) to be specified in the WAVEfmt header, but actually the engine interprets all waveforms as 22.05 kHz.
:*If the WAVEfmt header is enabled and specifies MS ADPCM encoding (format ID 2), then the sample rate and data rate are completely ignored (you can fill those fields with zeroes or garbage, and it will still work).
:*If the WAVEfmt header is enabled and specifies PCM storage (format ID 1), then fully arbitrary (inconsistent) sample rate and data rate will cause playback glitches or interruption, whereas mutually consistent pairs ("data rate" equal to "sample rate"x"block size") are eventually ignored. In other words, you have to specify a valid sample rate and data rate, but the engine will end up using 22.05 kHz anyway.
:*If the IMA4 header is enabled (overriding WAVEfmt), then the header only specifies the channel count and the "number of packets", and the sample/data rates are again completely ignored.
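The consistency rule for PCM headers mentioned above ("data rate" equal to "sample rate" x "block size") can be checked with a small Python sketch. It assumes the standard 16-byte WAVE format layout (format ID, channel count, sample rate, data rate, block size, bits per sample, all Little Endian); the function name is hypothetical, and the check says nothing about what the engine does beyond the behavior described above.

```python
import struct

def wavefmt_rates_consistent(fmt_chunk):
    """True if the data rate equals sample rate x block size in a WAVE format header."""
    (format_id, n_channels, sample_rate,
     data_rate, block_size, bits_per_sample) = struct.unpack_from("<HHIIHH", fmt_chunk, 0)
    return data_rate == sample_rate * block_size


# 16-bit mono PCM at 22.05 kHz: block size 2 bytes, data rate 44100 bytes/s.
good_fmt = struct.pack("<HHIIHH", 1, 1, 22050, 44100, 2, 16)
# Same header with an arbitrary (inconsistent) data rate.
bad_fmt = struct.pack("<HHIIHH", 1, 1, 22050, 12345, 2, 16)
print(wavefmt_rates_consistent(good_fmt), wavefmt_rates_consistent(bad_fmt))
```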


===IMA ADPCM===
;PCM playback
====Padding====
:PCM playback is only known to work in PC retail builds, either with IMA4 and WAVEfmt headers disabled (the stream is simply copied to the output buffer, as Little-Endian 16-bit linear PCM, with the channel count specified at OSGr level) or with the WAVEfmt header enabled (in which case the stream can have a custom bit depth and channel count).
In the case of IMA ADPCM, the padding is actually present in the stored audio, so it is impossible (both for OniSplit and for a third-party converter) to automatically trim it down to just the relevant audio data. In fact, just by looking at the Mac SNDD itself, there is no way to tell how many of the trailing samples need to be cut for a truly seamless transition (for one thing, the trailing samples are not flat zero).
:In the case of PC demo, the PCM playback fails (the engine identifies the stream as already uncompressed and skips the initialization of ACM headers, but then proceeds with decompression anyway and stops because of zero input size). This causes no problems for impulse sounds (other than silence), but results in a crash for looping permutations (the same playback keeps failing over and over).
:On the Mac, the stream is interpreted as IMA4 regardless of the 0x00000001 flag, therefore if you put PCM data in the .raw, it will play back as noise.


As a workaround/solution, the correct sample count of a Mac SNDD can be looked up in a PC counterpart (always available, since we're only talking of music/ambients/sirens, which are neither localized nor sampled at 44.1 kHz), and then used to trim the .aif file in FFmpeg, while converting to .wav (either PCM or ADPCM). However, it's easier (and more reliable) to just grab a PC retail copy of Oni and extract the MS ADPCM sounds.
==Known data issues==
;PC SNDD data
:The MS ADPCM data used by Vanilla Oni on Windows (both retail and demo) is somewhat coarser (lossier) than the IMA4 ADPCM used on Mac. This is because of the much larger block size, and how the same predictor must be used for the whole block: whenever a block spans both high- and low-amplitude samples, the predictor adapts to the higher amplitudes, and the low-amplitude resolution is lost.
:As a minor issue, the lack of an exact sample count (similar to a WAVE's "fact") makes it impossible to specify an odd number of samples for a mono block (because there is no way to tell if the last byte's second nibble counts as data or not). It is also impossible to have only one sample in the last block, be it for mono or stereo (because the block header reads as two samples by default).


;Mac SNDD data
:Unlike Oni's implementation of MS ADPCM, which cuts off the last block after the last actual sample, IMA4 data consists of full 64-sample blocks, with no way to determine the actual sample count other than by comparing with the same sound from PC Oni. (The "padding" of IMA4 sounds is not always zero, and even if it was, truncation based on trailing zeroes would be somewhat arbitrary.) The padding of IMA4 samples to multiples of 64 leads to silences at the end of Mac sounds that can be up to 3 milliseconds long. This is not a problem for impulse sounds or looping sounds that are quiet near the end, but for loud music (e.g. the main menu's "Trailer" theme) the gap is noticeable. It has been confirmed that the gap is actually heard in Mac Oni, apparently without bothering anyone.
:As a hypothetical fix, the Oni engine could implement a sample count parameter (similar to a WAVE file's "fact"), or truncate the last 64-sample block after the last actual sample, upon encoding (the latter is slightly less straightforward for stereo – because there are ''two'' last blocks, one for each channel – but still not a big problem).
:Mac sound data also seems to suffer from encoding artifacts: there are spurious transients at the start of waveforms that prevent ambients and music from looping seamlessly even when cut at the right length. Unlike the padding, this cannot be remedied at all, therefore if one wishes to produce seamless tracks from Oni data (e.g. music) it is recommended to turn to PC SNDDs, which have no looping issues (but are slightly lossier because of the larger block size, as mentioned above).


====Initial transient====
==Modern support==
The biggest problem with seamless playback of Mac SNDDs (for music and ambient tracks) is that - even if you figure out the correct length of each segment - the waveform of each next segment does not pick up where the previous segment left off (or should have left off) - instead it builds up from zero over ~7 samples. This introduces about 0.3 milliseconds of silence, and an audible discontinuity in the waveform, even if the two segments are lined up properly.
===OniSplit===
Newer releases (OniSplit v0.9.99.3 and newer) include an implementation of the MS ADPCM and IMA4 codecs, along with the ability to generate .dat/.raw/.sep game data compatible with the PC-demo based OniX engine. The codecs allow OniSplit to transcode from Mac's IMA4 or from 44.1 kHz MS ADPCM to standard 22.05 kHz MS ADPCM. PCM is also supported both ways (either as .wav output coming from IMA4 or MS ADPCM, or as uncompressed SNDD intended for the PC retail or demo engines). Finally, for exotic sample rates, there is an option to store the waveform as-is (without resampling) and merely report the appropriate rate multiplier that can be used in OSGr to adjust the speed/pitch upon playback.


The values of those initial samples are not recoverable (unlike the padding at the end of SNDDs, which can be trimmed down). Therefore, if working with sound samples extracted from Oni, it is recommended to turn to a PC version's SNDDs.
Older releases (OniSplit v0.9.99.2 and older) did not support ADPCM encoding or decoding, so there was no way to transcode between IMA4, MS ADPCM and PCM. Generating instance files for the demo engine was not supported either.


The hybrid SNDD format (aligned with OniX's planned features) has not been implemented in OniSplit yet. The recently discovered PC retail engine features (IMA4 support, or default PCM support) are also not covered by OniSplit at the time of writing.


==PCM export and PC demo detection==
===OniX===
OniSplit v0.9.### implements export to uncompressed PCM (signed 16-bit linear) from both the PC retail and the Mac SNDD format: use '''-extract:pcm''' instead of either '''-extract:wav''' or '''-extract:aif'''. As compared to ADPCM, linear PCM is a much more straightforward format (almost human readable), and makes it easier to analyze artifacts.
Starting with v1.1, OniX will likely support all three standard streams (IMA4, MS ADPCM and PCM) with their default encoding/storage settings:
*for IMA4, 34-byte blocks (2 header bytes plus 32 data bytes) with 64 samples per block (interleaved Left-channel and Right-channel blocks if stereo), sample rate 22.05 kHz;
*for MS ADPCM, 512-byte blocks for mono, 1024-byte blocks for stereo, 1012 samples per block in both cases, sample rate 22.05 kHz;
*for PCM, signed linear 16-bit samples, Little Endian storage, sample rate 22.05 kHz.
The engine will distinguish between the three types of streams using flags in the short SNDD header. Possibly another flag will allow for doubled sample rate (44.1 kHz); in this configuration the default block size will probably be 1024 bytes for mono and 2048 bytes for stereo, with 2036 samples per block in each case.


Since the only difference between PC demo and Mac is the actual storage format of SNDD files (in the .raw part), and the template checksum is the same, OniSplit has no way of determining which ADPCM algorithm to use, other than by actually scanning and validating the data as IMA4 (or not). Starting with OniSplit v0.9.###, this automatic check is implemented, allowing both '''-extract:wav''' and '''-extract:pcm''' on PC-demo SNDDs. It is very unlikely that any Mac SNDDs will be falsely identified as MS ADPCM (or, rather, invalidated as IMA4). If it ever happens, do as instructed by the following warning: ''"PC-demo MS ADPCM detected; use '''-nodemo''' flag to treat as IMA4."''
Finally, even with a short SNDD header, it is possible for OniX to read a custom WAVEfmt chunk just like the PC retail engine does, by putting the chunk at the start of the .raw part and announcing its presence and/or size through another flag and/or the unused uint at 0x0E. The typical size of this ".raw header" will be 16 bytes for PCM and 50 bytes for ADPCM (the use of other WAVE formats will probably be discouraged).


Note that transcoding (between IMA4 and MS ADPCM) and encoding are not implemented at this point. So '''-extract:aif''' will not work on PC SNDDs, '''-extract:wav''' will not work on Mac SNDDs, and '''-create''' will only work on sound files that use the correct codec and sample rate supported by the PC retail/demo or Mac Oni engine.
Alternatively, both the flags and the uint at 0x0E can be used to specify a custom sample rate and/or block size (either as fully custom values or as power-of-two multiplicative factors), without the need for a detailed WAVEfmt header. Still, it is probably easiest to either adhere to the standard parameters (without too many extra flags) or go fully custom and read all the parameters from a standard-compliant WAVEfmt header.


Importing SNDDs for PC demo (with a short .dat part and MS ADPCM in the .raw part) is also not implemented yet. Last but not least, PC retail apparently supports uncompressed PCM sounds, but they need to be tested in the engine first. Possibly ADPCM encoding/transcoding will be implemented at some point, too.


==Notes==
<references/>
----
----
 
{{OBD_File_Footer | type=SNDD | prev=QTNA | next=StNA | name=Sound Data | family=General}}
 
{{OBD_File_Footer | type=SNDD | prev=QTNA | next=StNA | name=Sound Data | family=Generic}}


{{OBD}}
{{OBD}}

Latest revision as of 14:04, 6 January 2024

ONI BINARY DATA
QTNA << Other file types >> StNA
SNDD : Sound Data
For metadata instances used to group sounds together, randomize them, adjust their volume or frequency, etc, see OSBD and its subtypes: OSAm, OSIm and OSGr.

SNDD instances are where Oni stores sound data. In Vanilla Oni game data, sounds are either mono or stereo waveforms (with sampling frequencies of either 22.05 kHz or 44.1 kHz), compressed to save on storage space. Both the PC and Mac versions use a form of ADPCM compression (Adaptive Differential Pulse-Code Modulation), where 16-bit sound samples are encoded as 4-bit "nibbles" (resulting roughly in a 4:1 compression ratio as compared to uncompressed 16-bit PCM).

  • On PC (both retail and demo), sounds are encoded using Microsoft's ADPCM codec (implemented in FFmpeg as adpcm_ms). See HERE for a quick description.
  • On Mac, sounds are encoded using QuickTime's IMA4 codec (implemented in FFmpeg as adpcm_ima_qt). See HERE for a quick description.
  • On PS2, sounds are encoded using Sony's VAG codec (a.k.a. Sony PSX ADPCM, or adpcm_psx in FFmpeg). See HERE for a quick description.

As a unique feature of Oni game data, SNDD files have a significantly different structure depending on the engine version. For PC retail (.dat/.raw storage, no .sep files), the SNDD files are larger and include a 50-byte chunk of data that is equivalent to the "fmt " chunk of a WAVE file. For the other two versions (PC demo and Mac, .dat/.raw/.sep storage), this 50-byte block is missing. It turns out that the extra format data allows the PC retail to support both MS ADPCM and IMA4, as well as uncompressed PCM, whereas PC demo and Mac engines only support MS ADPCM and IMA4, respectively. (It has not been confirmed whether the PC retail engine supports other WAVE formats beyond PCM and MS ADPCM, such as Mu-Law or A-Law PCM, IEEE float PCM, etc.) The PS2 engine uses the same short data header as for PC demo and Mac, but the waveform is stored as VAG (a.k.a. PSX ADPCM) and resides in a completely separate SOUNDS folder, accessed through an additional layer of indexation beyond the usual .dat/.raw/.sep logic (not unlike PS2 TXMPs which rely on color palettes stored in additional level#_palette.pal files).


For clarity, the simpler and more straightforward SNDDs of PC demo and Mac are documented first, followed by the more complex and versatile PC retail SNDDs.

(Historically, though, the PC retail implementation is older, and the PC demo and Mac versions were trimmed-down iterations of PC retail.)

The exotic PS2 storage is documented last, after which we list some legacy tips (for manual sound conversion) and known issues/limitations, as well as the current sound capabilities of OniX and OniSplit.


Mac and PC demo

The example below was taken from Mac Oni. In PC demo the file would look the same, except for a possibly different res_id (at 0x00) and a smaller raw data size (at 0x10).


Offset Type Raw Hex Value Description
0x00 res_id 01 D6 08 00 2262 SNDDcomguy_dth2.aif, instance number #2262
0x04 lev_id 01 00 00 06 3 level 3
0x08 uint32 01 00 00 00 1 flags
1 - (compressed; always the same in Vanilla Oni)
2 - stereo (mono if disabled)
0x0C uint16 37 00 55 duration in 1/60 seconds (game ticks)
0x0E uint16 00 00 0x0000 padding (unused in Vanilla Oni)
0x10 uint32 5E 2A 00 00 10846 size of the part in the raw file, in bytes
0x14 offset 00 B1 01 00 0x1B100 at this position starts the part in the raw file
0x18 char[8] AD DE dead 8 unused bytes (padding; not part of the SNDD file)
Padding
The 8 bytes at the end are not part of the SNDD template. They are not loaded by the Oni engine.
The uint16 at 0x0E is loaded by the engine, and thus constitutes a potentially useful 16-bit field.
Duration
The duration (number of game ticks) is rounded to the lower value (a.k.a. "floor") in Vanilla Oni data.[1] In other words, it indicates the number of complete game ticks spanned by the sound's playback.
Compression
The effect of the "1" flag (or rather of its absence) is different in the PC demo and Mac engines.
  • On Mac, the flag seems to have no effect at all (the .raw data is still passed to the IMA4 decoding algorithm even if the "1" flag is missing).
  • On PC demo, a missing "1" flag causes playback to fail.[2] If the sound is part of a looping permutation, the game crashes (because playback keeps failing on the same sound, over and over).
The bottom line is that, both for PC demo SNDDs and for Mac SNDDs, the "1" flag should always be set.
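
As an illustration of the short layout tabulated above, here is a minimal Python sketch that unpacks the 24 loaded bytes of a PC demo or Mac SNDD. The function and key names are hypothetical, and the "instance number = res_id >> 8" rule is only an assumption that happens to match the example row (0x0008D601 gives 2262); the level ID is left undecoded.

```python
import struct

# Short SNDD layout (PC demo / Mac): 24 little-endian bytes, per the table above.
SHORT_SNDD = struct.Struct("<IIIHHII")

FLAG_COMPRESSED = 0x1  # "1" flag
FLAG_STEREO = 0x2      # "2" flag

def parse_short_sndd(data: bytes) -> dict:
    """Unpack the loaded part of a short SNDD (the trailing AD DE padding is not needed)."""
    res_id, lev_id, flags, duration, pad, raw_size, raw_offset = SHORT_SNDD.unpack_from(data, 0)
    return {
        "instance": res_id >> 8,       # assumption: matches 01 D6 08 00 -> 2262
        "lev_id": lev_id,              # kept raw; decoding is out of scope here
        "compressed": bool(flags & FLAG_COMPRESSED),
        "stereo": bool(flags & FLAG_STEREO),
        "duration_ticks": duration,    # 1/60 s units
        "field_0x0E": pad,             # blank in Vanilla Oni on PC demo and Mac
        "raw_size": raw_size,
        "raw_offset": raw_offset,
    }

# Bytes of the Mac example (SNDDcomguy_dth2.aif) from the table above.
example = bytes.fromhex("01D60800 01000006 01000000 3700 0000 5E2A0000 00B10100")
header = parse_short_sndd(example)
```

The same 24-byte layout applies to the PS2 SNDDs documented further down, where the field at 0x0E is no longer blank.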

The compressed storage of .raw data is described in the following two sections.

IMA4 ADPCM .raw data (Mac)

For an overview of the IMA ADPCM algorithm and IMA4 header (if interested), see HERE. For an actual implementation example, see FFmpeg.

The IMA4 ADPCM stream data (stored in the .raw file) consists of 34-byte blocks (in the case of stereo, there is an even number of such blocks, with Left and Right blocks interleaved).
The first two bytes of each block form a header that sets the initial predictor (upper 9 bits) and step (lower 7 bits) for decoding the block's samples. Typically they are used only for the first block, or for sudden changes of the waveform's value range.
The other 32 bytes consist of 64 samples stored as nibbles (half-bytes). In the case of stereo, all the nibbles in a block belong to the same channel (either all Left or all Right).
All the 34-byte blocks must be stored in their entirety, meaning that the overall sample count of the waveform is a multiple of 64 (this is a major difference from MS ADPCM storage in PC retail and PC demo, where the final block is truncated after the last actual sample).
For this reason, identical sounds do not have the same sample count on PC (both retail and demo) and on Mac. As an example, here are the stats for the main menu music:
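
Separately from those stats, the block layout described in this section can be sketched in a few lines of Python. The helper names are hypothetical. The big-endian reading of the 2-byte header is an assumption based on the QuickTime convention that FFmpeg follows for this codec; it is not stated on this page.

```python
IMA4_BLOCK_SIZE = 34         # 2-byte header + 32 bytes of nibbles
IMA4_SAMPLES_PER_BLOCK = 64  # one 4-bit nibble per sample

def ima4_samples_per_channel(raw_size: int, channels: int) -> int:
    """Sample count per channel implied by the size of the IMA4 .raw part."""
    blocks, leftover = divmod(raw_size, IMA4_BLOCK_SIZE)
    if leftover or blocks % channels:
        raise ValueError("not a whole number of 34-byte blocks per channel")
    return blocks // channels * IMA4_SAMPLES_PER_BLOCK

def split_ima4_header(block: bytes):
    """Split the block header into (initial predictor, step index)."""
    word = int.from_bytes(block[:2], "big")  # assumption: big-endian header
    step_index = word & 0x7F                 # lower 7 bits
    predictor = word & 0xFF80                # upper 9 bits, kept in place
    if predictor >= 0x8000:                  # reinterpret as signed 16-bit
        predictor -= 0x10000
    return predictor, step_index

# The Mac example documented above has a 10846-byte mono .raw part.
mac_samples = ima4_samples_per_channel(10846, channels=1)
```

For the 10846-byte example this gives 319 blocks, i.e. 20416 samples, which is necessarily a multiple of 64.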

MS ADPCM .raw data (PC demo)

For a detailed overview of the ADPCM algorithm (if interested), see HERE. For an actual implementation example, see FFmpeg.

The MS ADPCM stream data (stored in .raw) consists of 512- or 1024-byte blocks (512 bytes for 22.05 kHz mono, 1024 bytes for 22.05 kHz stereo).
Each block starts with a 7- or 14-byte header (7 bytes for mono, 14 bytes for stereo), which includes the 16-bit values of the block's first two samples.
The remaining 505 or 1010 bytes of each block consist of nibbles (half-bytes), each coding for a sample. In the case of stereo, Left and Right nibbles are interleaved.
Thus each block has room for 1010 samples per channel encoded as nibbles, and the total number of samples per block and per channel (including the two header samples) is 1012, be it for mono or stereo 22.05 kHz.
For space efficiency, the MS ADPCM stored in .raw deviates from the standard, in that the final block is truncated after the last actual sample. In a WAVE file, ADPCM blocks are stored in their entirety, and the actual sample count is specified in a "fact" chunk. Oni has no such field; instead, the actual sample count is inferred from the block size and the truncated raw data size.
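
The inference of the sample count from a truncated stream can be sketched as follows. This is only an illustration under stated assumptions, not the engine's actual routine: the function names are hypothetical, and the truncated final block is assumed to contribute its two header samples plus one sample per remaining nibble (per channel).

```python
def ms_adpcm_samples(raw_size: int, block_align: int, channels: int) -> int:
    """Samples per channel implied by a (possibly truncated) MS ADPCM .raw part."""
    header = 7 * channels                                 # 7 header bytes per channel
    per_block = 2 + (block_align - header) * 2 // channels
    full_blocks, tail = divmod(raw_size, block_align)
    samples = full_blocks * per_block
    if tail:
        if tail < header:
            raise ValueError("truncated block is shorter than its header")
        samples += 2 + (tail - header) * 2 // channels    # header samples + nibbles
    return samples

def duration_ticks(samples: int, sample_rate: int = 22050) -> int:
    """Duration in whole game ticks (1/60 s), rounded down as in Vanilla Oni data."""
    return samples * 60 // sample_rate

# 10326 bytes is the .raw size of the PC retail example documented further down.
pc_samples = ms_adpcm_samples(10326, block_align=512, channels=1)
pc_ticks = duration_ticks(pc_samples)
```

Under these assumptions the 10326-byte example yields 20400 samples and 55 ticks, which agrees with the duration field of that SNDD.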

PC retail

Below is the .dat file part used in the PC retail version.

Sndd all.gif

Offset Type Raw Hex Value Description
0x00 res_id 01 D7 08 00 2263 SNDDcomguy_dth2.aif, instance number #02263
0x04 lev_id 01 00 00 06 3 level 3
0x08 uint32 08 00 00 00 8 flags
1 - never used in Vanilla Oni; unknown (no effect?)
2 - never used in Vanilla Oni; unknown (no effect?)
4 - never used in Vanilla Oni; enables rudimentary format header and IMA4 decoding (overrules "8")
8 - always used in Vanilla Oni; enables fully-featured format header and PCM / MS ADPCM decoding
0x0C block[50]     format header (MS ADPCM variant here; can also be IMA4, see below)
0x3E uint16 37 00 55 duration in 1/60 seconds (game ticks), rounded to the lower value
0x40 uint32 56 28 00 00 10326 size of the part in the raw file in bytes
0x44 offset 20 10 59 00 0x591020 at this position starts the part in the raw file
0x48 char[24] AD DE dead 24 unused bytes (padding)
Padding
The 24 bytes at the end are not part of the SNDD template. They are not loaded by the Oni engine.
Duration
Same as for PC demo and Mac, the duration (number of game ticks) is rounded to the lower value (a.k.a. "floor") in Vanilla Oni data.[1] In other words, it indicates the number of complete game ticks spanned by the sound's playback.
Compression modes (flags)
Unlike for PC demo, there is no single compressed/uncompressed flag in PC retail SNDDs, and no stereo/mono flag either.
The channel count is specified in the 50-byte header if said header is enabled, otherwise it is inferred from OSGr.
As for compression, PC retail actually has three primary compression modes, commanded by the flag values. Only the first is used in Vanilla Oni.

WAVE-like format header ("8" flag)

If the "8" flag of the SNDD (at 0x08) is ON and the "4" flag is OFF (as is always the case in Vanilla Oni), the 50-byte block is interpreted as a standard "fmt " chunk that you find in WAVE files (see HERE for details).

Sndd hd.gif

Offset Type Raw Hex Value Description
0x0C int16 02 00 2 format ID (2 = MS ADPCM format)
N.B. At the time of writing, only "1" (linear PCM) and "2" (MS ADPCM) are known to work in Oni; in Vanilla Oni, only MS ADPCM is ever used.
0x0E int16 01 00 1 number of channels (1 = mono)
N.B. Both PCM and ADPCM support only mono and stereo sounds, i.e., 1 or 2 channels.
0x10 int32 22 56 00 00 22050 sample rate in Hz (samples per second), a.k.a. "sampling frequency"
0x14 int32 93 2B 00 00 11155 ADPCM average data rate: sample rate*block alignment/samples per block, truncated to an integer (here 22050*512/1012)
N.B. For PCM, the data rate is simply samples per second*block alignment, seeing as each sample gets its own block.
N.B. For ADPCM, the average data rate is based on whole ADPCM blocks (not accounting for how Oni truncates the .raw data).
0x18 int16 00 02 512 block alignment a.k.a. "block size", in bytes
N.B. The block size is trivially 2 bytes for PCM mono (one 16-bit sample) and 4 bytes for PCM stereo (Left and Right 16-bit samples).
N.B. For ADPCM, Oni's Vanilla data always uses 512 bytes per channel for 22050 Hz waveforms, and 1024 bytes for 44.1 kHz mono (see below).
0x1A int16 04 00 4 bits per sample (per channel); typically 4 bits for ADPCM, 16 bits for PCM
Special extended ADPCM wav format header (black outline); fully ignored if the format ID is 1
0x1C int16 20 00 32 size of the extra ADPCM parameters, in bytes; typically always 32
0x1E int16 F4 03 1012 samples per block: 2 + (block alignment - 7*channels)*8/bits per sample/channels (here 2 + (512 - 7)*8/4/1)
0x20 int16 07 00 7 number of the following coefficient pairs; always 7 in practice
0x22 int16-16 00 01 00 00 256, 0 The coefficient pairs themselves (always the same in practice).
0x26 int16-16 00 02 00 FF 512, -256
0x2A int16-16 00 00 00 00 0, 0
0x2E int16-16 C0 00 40 00 192, 64
0x32 int16-16 F0 00 00 00 240, 0
0x36 int16-16 CC 01 30 FF 460, -208
0x3A int16-16 88 01 18 FF 392, -232
PCM vs ADPCM
Although Vanilla Oni SNDDs only ever use MS ADPCM waveforms, some mods have successfully used the (bloated!!!) PCM format.
For PCM, the format ID (at 0x0C) is set to 1, the block size is either 2 or 4, the data rate formula is simplified, and everything between 0x1C and 0x3E is ignored.
See HERE for more details on the importing procedure, but keep in mind that PCM waveforms are not recommended!
ADPCM coefficients
The 14 coefficients are de-facto standard, but custom values are formally allowed by the MS ADPCM algorithm.
Therefore, each MS ADPCM waveform is always accompanied by the coefficient pairs that were used to encode it.
Thus, even though these numbers are practically always the same, they are required; don't ever mess with them.
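
To make the layout of the 50-byte block concrete, here is a minimal Python sketch that unpacks it with the standard struct module. The function and key names are hypothetical; the field order and sizes follow the table above, and the sample bytes are those of the 22.05 kHz mono header (offsets 0x0C through 0x3D of the SNDD).

```python
import struct

def parse_wavefmt_block(block: bytes) -> dict:
    """Unpack the 50-byte format block of a PC retail SNDD ("8" flag mode)."""
    if len(block) != 50:
        raise ValueError("expected exactly 50 bytes")
    fmt_id, channels, rate, avg_rate, block_align, bits = struct.unpack_from("<HHIIHH", block, 0)
    info = {
        "format_id": fmt_id,           # 1 = linear PCM, 2 = MS ADPCM
        "channels": channels,
        "sample_rate": rate,
        "avg_data_rate": avg_rate,
        "block_align": block_align,
        "bits_per_sample": bits,
    }
    if fmt_id == 2:                    # the ADPCM extension is ignored for PCM
        extra_size, samples_per_block, n_pairs = struct.unpack_from("<HHH", block, 16)
        coefs = struct.unpack_from("<%dh" % (2 * n_pairs), block, 22)
        info["extra_size"] = extra_size
        info["samples_per_block"] = samples_per_block
        info["coefficients"] = list(zip(coefs[0::2], coefs[1::2]))
    return info

mono_header = bytes.fromhex(
    "0200 0100 22560000 932B0000 0002 0400 2000 F403 0700"
    "0001 0000 0002 00FF 0000 0000 C000 4000 F000 0000 CC01 30FF 8801 18FF"
)
info = parse_wavefmt_block(mono_header)
```

The coefficient pairs are read as signed 16-bit values, which is why 00 FF decodes to -256 rather than 65280.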
Standard sets of ADPCM parameters

Below are the three types of headers occurring for Vanilla Oni's sounds (MS ADPCM).


22.05 kHz mono (used for the vast majority of sounds):

1 channel; sample rate 22050 Hz; average data rate 11155 B/s (truncated from ~11155.7312253 = 22050*512/1012);
block alignment 512 bytes; 4 bits per sample; 1012 samples per block (= 2 + (512 - 7)*8/4/1); standard coefficient table.
0x00:  °° °° °° °° °° °° °° °° 08 00 00 00 02 00 01 00  °°°°°°°°°°°°°°°°
0x10:  22 56 00 00 93 2B 00 00 00 02 04 00 20 00 F4 03  °°°°°°°°°°°°°°°°
0x20:  07 00 00 01 00 00 00 02 00 FF 00 00 00 00 C0 00  °°°°°°°°°°°°°°°°
0x30:  40 00 F0 00 00 00 CC 01 30 FF 88 01 18 FF °° °°  °°°°°°°°°°°°°°°°

22.05 kHz stereo (used for music and some ambients):

2 channels; sample rate 22050 Hz; average data rate 22311 B/s (truncated from ~22311.4624506 = 22050*1024/1012);
block alignment 1024 bytes; 4 bits per sample; 1012 samples per block (= 2 + (1024 - 2*7)*8/4/2); standard coefficient table.
0x00:  °° °° °° °° °° °° °° °° 08 00 00 00 02 00 02 00  °°°°°°°°°°°°°°°°
0x10:  22 56 00 00 27 57 00 00 00 04 04 00 20 00 F4 03  °°°°°°°°°°°°°°°°
0x20:  07 00 00 01 00 00 00 02 00 FF 00 00 00 00 C0 00  °°°°°°°°°°°°°°°°
0x30:  40 00 F0 00 00 00 CC 01 30 FF 88 01 18 FF °° °°  °°°°°°°°°°°°°°°°

44.1 kHz mono (used for ap_hit_shld and the 45 zap## sounds):

1 channel; sample rate 44100 Hz; average data rate 22179 B/s (truncated from ~22179.9607073 = 44100*1024/2036);
block alignment 1024 bytes; 4 bits per sample; 2036 samples per block (= 2 + (1024 - 7)*8/4/1); standard coefficient table.
0x00:  °° °° °° °° °° °° °° °° 08 00 00 00 02 00 01 00  °°°°°°°°°°°°°°°°
0x10:  44 AC 00 00 A3 56 00 00 00 04 04 00 20 00 F4 07  °°°°°°°°°°°°°°°°
0x20:  07 00 00 01 00 00 00 02 00 FF 00 00 00 00 C0 00  °°°°°°°°°°°°°°°°
0x30:  40 00 F0 00 00 00 CC 01 30 FF 88 01 18 FF °° °°  °°°°°°°°°°°°°°°°

Only the 22.05 kHz sounds (mono and stereo) are played back correctly by the PC retail engine. The forty-six 44.1 kHz sounds are interpreted as 22.05 kHz waveforms, and are therefore played back at half speed, one octave lower than intended.

For any other combinations of PCM or ADPCM parameters, please refer to the importing routine, HERE.
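
The figures quoted for the three standard headers can be re-derived from the sample rate, channel count and block size alone. The sketch below (hypothetical function names) applies the formulas given with each header, with the 7-byte-per-channel block header as stated for MS ADPCM, and adds the simpler PCM case for comparison.

```python
def ms_adpcm_params(sample_rate: int, channels: int, block_align: int, bits: int = 4):
    """Return (samples per block, average data rate in B/s) for an MS ADPCM header."""
    samples_per_block = 2 + (block_align - 7 * channels) * 8 // bits // channels
    avg_data_rate = sample_rate * block_align // samples_per_block  # truncated
    return samples_per_block, avg_data_rate

def pcm_params(sample_rate: int, channels: int, bits: int = 16):
    """Return (block alignment in bytes, data rate in B/s) for linear PCM."""
    block_align = channels * bits // 8
    return block_align, sample_rate * block_align

mono_22k = ms_adpcm_params(22050, 1, 512)
stereo_22k = ms_adpcm_params(22050, 2, 1024)
mono_44k = ms_adpcm_params(44100, 1, 1024)
```

The three results reproduce the values stored at 0x14 and 0x1E in the hex dumps above (0x2B93/0x03F4, 0x5727/0x03F4 and 0x56A3/0x07F4).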

IMA ADPCM decoding ("4" flag)

If the "4" flag of the SNDD (at 0x08) is ON, then (regardless of the "8" flag), the .raw data will be interpreted as IMA ADPCM blocks, and the 50-byte format block will be mostly ignored, as well as the ".raw data size" field at 0x40.

Here is what the .dat part of the above SNDDcomguy_dth2 would have looked like in IMA4 mode:

0x00:  01 D7 08 00 01 00 00 06 04 00 00 00 00 00 01 00  °×°°°°°°°°°°°°°°
0x10:  00 00 00 00 00 00 00 00 52 01 00 00 00 00 00 00  °°°°°°°°R°°°°°°°
0x20:  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  °°°°°°°°°°°°°°°°
0x30:  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  °°°°°°°°°°°°°°°°
0x40:  00 00 00 00 20 10 59 00 AD DE AD DE AD DE AD DE  °°°° °Y°°°°°°°°°
0x50:  AD DE AD DE AD DE AD DE AD DE AD DE AD DE AD DE  °°°°°°°°°°°°°°°°
The channel count field in the 50-byte "format" block (at 0x0E) is used in the same way as for a WAVE-like header.
The "block alignment" field (at 0x18) is used (somewhat counterintuitively) to store the number of IMA4 "packets":
  • for a mono IMA4 sound, packets are 34-byte blocks carrying 64 mono samples each (two bytes of header data and 64 "nibbles", or half-bytes);
  • for a stereo IMA4 sound, packets are pairs of consecutive 34-byte blocks (the 64 Left samples are stored first, then the 64 Right samples).
(The actual duration of SNDDcomguy_dth2 is 0.97941 seconds, which at 22.05 kHz requires 338 64-sample blocks, hence the value 0x152 appearing at 0x18.)
Everything else in the 50-byte format block is ignored, as well as the ".raw data size" usually found at 0x40 (Oni uses the number of packets instead).
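
The packet count stored at 0x18 follows directly from the sample count. A minimal sketch, with hypothetical helper names, and taking 21596 samples as the count corresponding to the quoted 0.97941 seconds at 22.05 kHz (an assumption obtained by rounding 0.97941*22050):

```python
def ima4_packet_count(samples_per_channel: int) -> int:
    """Number of IMA4 packets needed; each packet covers 64 samples per channel."""
    return -(-samples_per_channel // 64)   # ceiling division

def ima4_raw_size(packets: int, channels: int) -> int:
    """Size in bytes of the .raw part: one 34-byte block per channel per packet."""
    return packets * 34 * channels

packets = ima4_packet_count(21596)
field_0x18 = packets.to_bytes(2, "little")  # as it appears in the .dat part
```

The ceiling division reflects the fact that the last packet is always stored whole, even if only part of it carries actual samples.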

Raw PCM (no flags)

If neither the "4" nor the "8" flags are set (at 0x08 in the .dat part of the SNDD), then the .raw data is apparently copied as-is into the playback buffer.

In this "raw" playback mode, the sound will not be reproduced correctly unless it's linear PCM (Little Endian) with 16-bit sample depth, 22.05 kHz sampling rate, and a channel count consistent with the SNDD's OSGr (i.e., the metadata used by Oni's engine to set up the sound's playback).

Here is what the .dat part of the above SNDDcomguy_dth2 would have looked like in raw PCM mode:

0x00:  01 D7 08 00 01 00 00 06 00 00 00 00 00 00 00 00  °×°°°°°°°°°°°°°°
0x10:  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  °°°°°°°°°°°°°°°°
0x20:  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  °°°°°°°°°°°°°°°°
0x30:  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  °°°°°°°°°°°°°°°°
0x40:  00 00 00 00 20 10 59 00 AD DE AD DE AD DE AD DE  °°°° °Y°°°°°°°°°
0x50:  AD DE AD DE AD DE AD DE AD DE AD DE AD DE AD DE  °°°°°°°°°°°°°°°°
Effect of "1" and "2" flags
It is possible that the "1" and "2" flags used to affect playback in raw PCM mode (something about swapping the .raw data to allow both for Little Endian and Big Endian PCM samples), but currently they do not seem to have any effect.
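
The three PC retail modes described in this section can be summarized by a small decision function. This is a sketch of the documented behavior only, with hypothetical names; it does not claim to mirror the engine's actual code path.

```python
FLAG_IMA4 = 0x4      # rudimentary header + IMA4 decoding, overrules "8"
FLAG_WAVEFMT = 0x8   # WAVE-like "fmt " header, PCM or MS ADPCM

def pc_retail_mode(flags: int) -> str:
    """Map the flags at 0x08 of a PC retail SNDD to the documented storage mode."""
    if flags & FLAG_IMA4:
        return "ima4"
    if flags & FLAG_WAVEFMT:
        return "wavefmt"
    return "raw_pcm"   # the "1" and "2" flags currently seem to have no effect

vanilla_mode = pc_retail_mode(0x8)
```

Testing the "4" flag first encodes the rule that it overrules the "8" flag when both are set.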


PS2 implementation

The PS2 implementation has an unorthodox approach to raw data when it comes to SNDDs. Here are the .dat parts of the SNDDs in a PS2 level0_Final.dat file.

0xD5EA0:  01 5B 0E 00 01 00 00 00 00 00 00 00 9E 01 00 00  °°°°°°°°°°°°°°°°
0xD5EB0:  20 F7 00 00 40 30 76 13 AD DE AD DE AD DE AD DE  °°°°°°°°°°°°°°°°
0xD5EC0:  01 5C 0E 00 01 00 00 00 00 00 00 00 9E 01 01 00  °°°°°°°°°°°°°°°°
0xD5ED0:  A0 F6 00 00 C0 37 77 13 AD DE AD DE AD DE AD DE  °°°°°°°°°°°°°°°°
0xD5EE0:  01 5D 0E 00 01 00 00 00 00 00 00 00 8F 02 02 00  °°°°°°°°°°°°°°°°
0xD5EF0:  40 86 01 00 C0 2E 78 13 AD DE AD DE AD DE AD DE  °°°°°°°°°°°°°°°°
0xD5F00:  01 5E 0E 00 01 00 00 00 00 00 00 00 91 02 03 00  °°°°°°°°°°°°°°°°
0xD5F10:  40 87 01 00 40 B5 79 13 AD DE AD DE AD DE AD DE  °°°°°°°°°°°°°°°°
0xD5F20:  01 5F 0E 00 01 00 00 00 00 00 00 00 3C 01 04 00  °°°°°°°°°°°°°°°°
0xD5F30:  60 BC 00 00 00 0A 75 13 AD DE AD DE AD DE AD DE  °°°°°°°°°°°°°°°°

(Yes, the PS2 retail level0_Final.dat only has those five sounds. All the other sounds (weapons, particles, impacts, footsteps, etc) are stored per-chapter, which causes a lot of duplicates but supposedly lightens the memory usage for a given level.)

The layout is similar to the PC demo and Mac SNDDs described above, and indeed the SNDD template checksum is the same for PC demo, Mac and PS2, implying that the data structure in the .dat is the same. However, there are two major novelties/anomalies (apart from all the music being mono), both visible in the dump above. First, the five .raw offsets (one at the end of each SNDD) are obviously not pointers into level0_Final.raw or level0_Final.sep (the .raw's size is 4 MB, the .sep's 6 MB, and the pointers are in the 311 MB range, possibly pointing into a memory region where the sounds will be stored at runtime). Second, the 2-byte padding field between the duration and the .raw storage size is obviously not blank here; rather, it is an index into a file called SOUNDS\LEVEL0\SOUND.DAT, which looks like this:

0x00: 00 00 20 F7 00 00 00 00 00 00 01 00 A0 F6 00 00
0x10: 20 F7 00 00 02 00 40 86 01 00 C0 ED 01 00 03 00
0x20: 40 87 01 00 00 74 03 00 04 00 60 BC 00 00 40 FB
0x30: 04 00 FE FF

These are 5 blocks of 10 bytes each, followed by the two bytes FE FF which signal the end of the file. For each sound there is a 2-byte index, then a 4-byte data size (including padding), then the offset at which the data is stored in the SOUNDS\LEVEL0\SOUND.SEP file. The .SEP data for each sound consists of 32 blank bytes followed by a large number of VAG packets (16 bytes each), the final VAG packet having a terminating bit set. At the very end of the .SEP file is a terminating pair of bytes, FE FF, same as in the .DAT file. The file SOUNDS\LEVEL0\SOUND.RAW exists in the same folder, but has no data except for the two bytes FE FF.
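
The 10-byte index records described above can be read with a short Python sketch. The names are hypothetical, and the reader simply assumes that no sound index ever collides with the FE FF terminator. The sample data is the complete SOUNDS\LEVEL0\SOUND.DAT shown above.

```python
import struct

ENTRY = struct.Struct("<HII")   # 2-byte index, 4-byte size, 4-byte .SEP offset
TERMINATOR = b"\xFE\xFF"

def parse_sound_dat(data: bytes) -> list:
    """Return a list of (index, size, sep_offset) tuples, stopping at FE FF."""
    entries = []
    pos = 0
    while data[pos:pos + 2] != TERMINATOR:
        if pos + ENTRY.size > len(data):
            raise ValueError("missing FE FF terminator")
        entries.append(ENTRY.unpack_from(data, pos))
        pos += ENTRY.size
    return entries

level0_dat = bytes.fromhex(
    "0000 20F70000 00000000"
    "0100 A0F60000 20F70000"
    "0200 40860100 C0ED0100"
    "0300 40870100 00740300"
    "0400 60BC0000 40FB0400"
    "FEFF"
)
entries = parse_sound_dat(level0_dat)
# Each sound starts where the previous one ends.
contiguous = all(a[2] + a[1] == b[2] for a, b in zip(entries, entries[1:]))
end_of_data = entries[-1][2] + entries[-1][1]
```

In this file the sizes are also identical to the ".raw size" fields of the five SNDDs in level0_Final.dat (0xF720, 0xF6A0, 0x18640, 0x18740, 0xBC60), and the data ends at 0x5B7A0.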

For all the levels other than LEVEL0 (i.e., game chapters), some of the sounds are stored in SOUNDS\LEVEL#\SOUND.RAW (the size is about 1 MB for all chapters). RAW storage is indicated by a zero .SEP offset in the corresponding block of the SOUNDS\LEVEL#\SOUND.DAT file (note, however, that the first sound in SOUND.SEP also has a zero offset). As an example, here is a fragment of SOUNDS\LEVEL1\SOUND.DAT featuring the first .RAW-resident sounds.

0x690: A8 00 70 3B 02 00 22 B2 83 00 A9 00 50 98 00 00
0x6A0: 92 ED 85 00 AA 00 20 80 00 00 E2 85 86 00 AB 00
0x6B0: F0 F2 00 00 02 06 87 00 AC 00 10 2D 00 00 00 00
0x6C0: 00 00 AD 00 C0 1B 00 00 00 00 00 00 AE 00 40 0B
0x6D0: 00 00 00 00 00 00 AF 00 F0 11 00 00 00 00 00 00
0x6E0: B0 00 20 24 00 00 F2 F8 87 00 B1 00 50 25 00 00
0x6F0: 12 1D 88 00 B2 00 10 58 00 00 62 42 88 00 B3 00
0x700: 20 1F 00 00 72 9A 88 00 B4 00 A0 26 00 00 00 00
0x710: 00 00 B5 00 F0 0E 00 00 00 00 00 00 B6 00 B0 09
0x720: 00 00 00 00 00 00 B7 00 B0 0B 00 00 00 00 00 00

Here the .SEP offset field is zero for entries 0xAC, 0xAD, 0xAE and 0xAF, then non-zero for the next four entries, and zero again for the following four. The start of the SOUNDS\LEVEL1\SOUND.RAW file looks as follows:

0x00: AC 00 10 2D 00 00 00 00 00 00 00 00 00 00 00 00
0x10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x20: 00 00 00 00 00 00 1A 00 24 00 00 01 10 00 02 10
0x30: 0F 00 11 1F 1F 1E 1A 00 01 1F 10 0F 11 21 3E 10
0x40: 20 0F 20 41 D2 E3

Here AC 00 is the 2-byte index of the sound (the same as in the SOUNDS\LEVEL1\SOUND.DAT and in the corresponding SNDD in level1_Final.dat), then there is the 4-byte data size (also the same as announced in the .dat and .DAT), followed by the same data as in the .SEP (32 zero bytes, then some 16-byte VAG packets, the last packet having a terminating bit set). At the very end of the .RAW file is a terminating pair of bytes, FE FF.
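
The zero-offset rule and the record layout of SOUND.RAW can be checked against the two dumps above with the following sketch. All names are hypothetical, and the treatment of the very first SOUND.DAT entry (always counted as .SEP-resident) is an assumption derived from the remark that the first sound in SOUND.SEP also has a zero offset.

```python
import struct

ENTRY = struct.Struct("<HII")   # index, size, .SEP offset (10 bytes)

def storage_of(entry, is_first_entry=False) -> str:
    """Classify a SOUND.DAT entry as stored in SOUND.RAW or SOUND.SEP."""
    index, size, sep_offset = entry
    if sep_offset == 0 and not is_first_entry:
        return "RAW"
    return "SEP"

# Fragment of SOUNDS\LEVEL1\SOUND.DAT from 0x690 to 0x72F (entries 0xA8..0xB7).
fragment = bytes.fromhex(
    "A8 00 70 3B 02 00 22 B2 83 00 A9 00 50 98 00 00"
    "92 ED 85 00 AA 00 20 80 00 00 E2 85 86 00 AB 00"
    "F0 F2 00 00 02 06 87 00 AC 00 10 2D 00 00 00 00"
    "00 00 AD 00 C0 1B 00 00 00 00 00 00 AE 00 40 0B"
    "00 00 00 00 00 00 AF 00 F0 11 00 00 00 00 00 00"
    "B0 00 20 24 00 00 F2 F8 87 00 B1 00 50 25 00 00"
    "12 1D 88 00 B2 00 10 58 00 00 62 42 88 00 B3 00"
    "20 1F 00 00 72 9A 88 00 B4 00 A0 26 00 00 00 00"
    "00 00 B5 00 F0 0E 00 00 00 00 00 00 B6 00 B0 09"
    "00 00 00 00 00 00 B7 00 B0 0B 00 00 00 00 00 00"
)
entries = [ENTRY.unpack_from(fragment, i) for i in range(0, len(fragment), ENTRY.size)]
raw_resident = [e[0] for e in entries if storage_of(e) == "RAW"]

# First record of SOUNDS\LEVEL1\SOUND.RAW: index, size, 32 blank bytes, VAG packets.
raw_start = bytes.fromhex("AC00 102D0000" + "00" * 32 + "1A002400000110000210 0F00111F1F1E")
raw_index, raw_size = struct.unpack_from("<HI", raw_start, 0)
blank = raw_start[6:38]
first_vag_packet = raw_start[38:54]
```

The index and size read from SOUND.RAW (0xAC and 0x2D10) match the corresponding entry of the SOUND.DAT fragment.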

N.B. It appears that SOUND.RAW is loaded in its entirety when a level starts, whereas sound data from a level's SOUND.SEP is (re)loaded on demand. Accordingly, SOUND.RAW typically contains short recurrent sounds (gunshots, impacts, footsteps, hurt sounds, etc). Permanent storage is not decided based on size alone, though: for example, the rather long SNDDheliflyby2 (6 seconds) is stored in SOUND.RAW, whereas the much shorter SNDDconsole-locked (1.5 seconds long) is stored in SOUND.SEP.
N.B. A level's SOUND.DAT and SOUND.SEP always start with the same 5 entries (music segments) as in LEVEL0/SOUND.DAT and LEVEL0/SOUND.SEP, including the terminating code FE FF at 0x5B7A0. Those segments, indexed as 0 through 4 (same as in LEVEL0/SOUND.DAT), do not have a corresponding SNDD in the chapter's level#_Final (i.e., the 16-bit indices of the SNDD instances in level#_Final always start at 5 except for level0_Final, which only has those five SNDDs). It would seem that the sound indices listed in SOUND.DAT need to be unique across all the loaded level files. It would also seem that the duplicated storage of level0 music in all 14 chapters is highly suboptimal, using up 5 MB of disk space, and that the LEVEL0/SOUND.* files are redundant.
N.B. Because the contents of LEVEL0/SOUND.SEP are included at the start of each level's LEVEL#/SOUND.SEP, complete with the terminating FE FF code, the data for the following, level-specific sounds is shifted by two bytes, i.e., from then on the starting offsets of each sound and VAG packet look like 0x.......2 rather than 0x.......0 (this is reflected by the offsets in LEVEL#/SOUND.DAT). This terminating code in the middle of the .SEP file probably serves no purpose whatsoever (the reading routines stop whenever they encounter a terminating bit in a VAG packet).
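
The two-byte shift can be verified on the non-zero .SEP offsets of the LEVEL1 SOUND.DAT fragment shown earlier. A small sketch with hypothetical variable names:

```python
LEVEL0_MUSIC_END = 0x5B7A0   # where the five music segments end in SOUND.SEP
TERMINATOR_SIZE = 2          # the FE FF code copied along with them

# Non-zero .SEP offsets and sizes of entries 0xA8..0xAB and 0xB0..0xB3 (LEVEL1).
sep_offsets = [0x83B222, 0x85ED92, 0x8685E2, 0x870602,
               0x87F8F2, 0x881D12, 0x884262, 0x889A72]
sep_sizes = [0x23B70, 0x9850, 0x8020, 0xF2F0,
             0x2420, 0x2550, 0x5810, 0x1F20]

first_level_sound = LEVEL0_MUSIC_END + TERMINATOR_SIZE
misalignment = {offset % 16 for offset in sep_offsets}
whole_packets = all(size % 16 == 0 for size in sep_sizes)
```

Since every size is a whole number of 16-byte units, the misalignment introduced by the embedded FE FF propagates unchanged to all later sounds.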

Exporting and importing tips

  • For manual conversion between SNDD (PC retail or demo) and a WAVE file, see HERE.
  • For manual conversion between SNDD (PC retail or Mac) and an AIFC file, see HERE.

For automatic conversion, please use a sufficiently recent version of OniSplit.

Known engine issues

Custom sample/data rates
The PC retail engine formally allows for an arbitrary sample rate (and a corresponding data rate) to be specified in the WAVEfmt header, but actually the engine interprets all waveforms as 22.05 kHz.
  • If the WAVEfmt header is enabled and specifies MS ADPCM encoding (format ID 2), then the sample rate and data rate are completely ignored (you can fill those fields with zeroes or garbage, and it will still work).
  • If the WAVEfmt header is enabled and specifies PCM storage (format ID 1), then fully arbitrary (inconsistent) sample rate and data rate will cause playback glitches or interruption, whereas mutually consistent pairs ("data rate" equal to "sample rate"x"block size") are eventually ignored. In other words, you have to specify a valid sample rate and data rate, but the engine will end up using 22.05 kHz anyway.
  • If the IMA4 header is enabled (overriding WAVEfmt), then the header only specifies the channel count and the "number of packets", and the sample/data rates are again completely ignored.
PCM playback
PCM playback is only known to work in PC retail builds, either with IMA4 and WAVEfmt headers disabled (the stream is simply copied to the output buffer, as Little-Endian 16-bit linear PCM, with the channel count specified at OSGr level) or with the WAVEfmt header enabled (in which case the stream can have a custom bit depth and channel count).
In the case of PC demo, the PCM playback fails (the engine identifies the stream as already uncompressed and skips the initialization of ACM headers, but then proceeds with decompression anyway and stops because of zero input size). This causes no problems for impulse sounds (other than silence), but results in a crash for looping permutations (the same playback keeps failing over and over).
On the Mac, the stream is interpreted as IMA4 regardless of the 0x00000001 flag, therefore if you put PCM data in the .raw, it will play back as noise.

Known data issues

PC SNDD data
The MS ADPCM data used by Vanilla Oni on Windows (both retail and demo) is somewhat coarser (lossier) than the IMA4 ADPCM used on Mac. This is because of the much larger block size, and how the same predictor must be used for the whole block: whenever a block spans both high- and low-amplitude samples, the predictor adapts to the higher amplitudes, and the low-amplitude resolution is lost.
As a minor issue, the lack of an exact sample count (similar to a WAVE's "fact") makes it impossible to specify an odd number of samples for a mono block (because there is no way to tell if the last byte's second nibble counts as data or not). It is also impossible to have only one sample in the last block, be it for mono or stereo (because the block header reads as two samples by default).
Mac SNDD data
Unlike Oni's implementation of MS ADPCM, which cuts off the last block after the last actual sample, IMA4 data consists of full 64-sample blocks, with no way to determine the actual sample count other than by comparing with the same sound from PC Oni. (The "padding" of IMA4 sounds is not always zero, and even if it were, truncation based on trailing zeroes would be somewhat arbitrary.) The padding of IMA4 samples to multiples of 64 leads to silences at the end of Mac sounds that can be up to 3 milliseconds long. This is not a problem for impulse sounds or looping sounds that are quiet near the end, but for loud music (e.g. the main menu's "Trailer" theme) the gap is noticeable. It has been confirmed that the gap is actually heard in Mac Oni, apparently without bothering anyone.
As a hypothetical fix, the Oni engine could implement a sample count parameter (similar to a WAVE file's "fact"), or truncate the last 64-sample block after the last actual sample, upon encoding (the latter is slightly less straightforward for stereo – because there are two last blocks, one for each channel – but still not a big problem).
Mac sound data also seems to suffer from encoding artifacts: there are spurious transients at the start of waveforms that prevent ambients and music from looping seamlessly even when cut at the right length. Unlike the padding, this cannot be remedied at all, therefore if one wishes to produce seamless tracks from Oni data (e.g. music) it is recommended to turn to PC SNDDs, which have no looping issues (but are slightly lossier because of the larger block size, as mentioned above).
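
The "up to 3 milliseconds" figure for the IMA4 padding follows from simple arithmetic, sketched here with hypothetical helper names:

```python
def ima4_padding_samples(actual_samples: int) -> int:
    """Samples added per channel when rounding up to a whole 64-sample block."""
    return -actual_samples % 64

def samples_to_ms(samples: int, sample_rate: int = 22050) -> float:
    return samples * 1000.0 / sample_rate

worst_case_ms = samples_to_ms(63)             # at most 63 padding samples
example_padding = ima4_padding_samples(20400)
```

A waveform of 20400 samples, for instance, is padded by 16 samples to 20416, i.e. by well under one millisecond, while the worst case of 63 samples is just under 3 milliseconds.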

Modern support

OniSplit

Newer releases (OniSplit v0.9.99.3 and newer) include an implementation of the MS ADPCM and IMA4 codecs, along with the ability to generate .dat/.raw/.sep game data compatible with the PC demo-based OniX engine. The codecs allow OniSplit to transcode from Mac's IMA4 or from 44.1 kHz MS ADPCM to standard 22.05 kHz MS ADPCM. PCM is also supported both ways (either as .wav output coming from IMA4 or MS ADPCM, or as uncompressed SNDD intended for the PC retail or demo engines). Finally, for exotic sample rates, there is an option to store the waveform as-is (without resampling) and merely report the appropriate rate multiplier that can be used in OSGr to adjust the speed/pitch upon playback.

Older releases (OniSplit v0.9.99.2 and older) did not support ADPCM encoding or decoding, so there was no way to transcode between IMA4, MS ADPCM and PCM. Generating instance files for the demo engine was not supported either.

The hybrid SNDD format (aligned with OniX's planned features) has not been implemented in OniSplit yet. The recently discovered PC retail engine features (IMA4 support, or default PCM support) are also not covered by OniSplit at the time of writing.

OniX

Starting with v1.1, OniX will likely support all three standard streams (IMA4, MS ADPCM and PCM) with their default encoding/storage settings:

  • for IMA4, 34-byte blocks with 64 samples per block (interleaved Left-channel and Right-channel blocks if stereo), sample rate 22.05 kHz;
  • for MS ADPCM, 512-byte blocks for mono, 1024-byte blocks for stereo, 1012 samples per block in both cases, sample rate 22.05 kHz;
  • for PCM, signed linear 16-bit samples, Little Endian storage, sample rate 22.05 kHz.

The engine will determine between the three types of streams using flags in the short SNDD header. Possibly another flag will allow for doubled sample rate (44.1 kHz); in this configuration the default block size will probably be 1024 bytes for mono and 2048 bytes for stereo, with 2036 samples per block in each case.

Finally, even with a short SNDD header, it is possible for OniX to read a custom WAVEfmt chunk just like the PC retail engine does, by putting the chunk at the start of the .raw part and announcing its presence and/or size through another flag and/or the unused uint16 at 0x0E. The typical size of this ".raw header" will be 16 bytes for PCM and 50 bytes for ADPCM (the use of other WAVE formats will probably be discouraged).

Alternatively, both the flags and the uint16 at 0x0E can be used to specify a custom sample rate and/or block size (either as fully custom values or as power-of-two multiplicative factors), without the need for a detailed WAVEfmt header. Still, it is probably easiest to either adhere to the standard parameters (without too many extra flags) or go fully custom and read all the parameters from a standard-compliant WAVEfmt header.


Notes

  1. As an example, on PC (both demo and retail), SNDDmus_ot7.aif consists of 152360 samples, which at 22.05 kHz corresponds to 6.90975 seconds, or 414.585 ticks; the duration, however, is indicated as 414 ticks and not 415. In other words, "duration" corresponds to the number of whole game ticks spanned by the sound's playback.
  2. If a SNDD misses the "1" flag ("compressed"), the PC demo engine does identify the stream as already uncompressed (PCM samples) and skips the initialization phase of the decompression routine, but then proceeds with decompression anyway, and immediately stops because of zero input size.
