OBD:SNDD/wav
Latest revision as of 14:07, 6 January 2024
WAVE is a set of formats for storing waveform data in a RIFF container. Oni adopted a small subset of WAVE for its PC versions.
- The simplest and most widespread format is linear PCM (format ID 1), where waveform samples are simply stored one after the other, in chronological order, as integer values proportional to the amplitude of the waveform. The most common bit depth is 16 bits per sample, with sample values in the range [-32768:32767] (signed), stored in Little Endian order. In the case of stereo, Left and Right samples are interleaved.
- Another widespread format is MS ADPCM (format ID 2), where samples are stored as 4-bit nibbles packed into large blocks. Within each block, each sample of the waveform is inferred from the previous sample through a prediction-correction algorithm, initialized with two samples explicitly stored as 16-bit values at the start of the block (more details below). The compressed data is almost 4 times more compact than PCM.
There are hundreds of other formats (including MP3, Vorbis, Dolby, FLAC and lots of legacy codecs that no one cares about), but (L)PCM and (MS)ADPCM are the most widespread and useful.
- Vanilla Oni data exclusively uses MS ADPCM, but some mods have successfully used (L)PCM, which is why we are documenting it too. Also, since PCM is significantly simpler, we describe it first.
PCM
Below is the beginning of a standard-compliant WAVE file with stereo PCM data (decoded from SNDDmus_ot6.aif).
0x0000: | 52 | 49 | 46 | 46 | 20 | 50 | 09 | 00 | 57 | 41 | 56 | 45 | 66 | 6D | 74 | 20 | RIFF P°°WAVEfmt |
0x0010: | 10 | 00 | 00 | 00 | 01 | 00 | 02 | 00 | 22 | 56 | 00 | 00 | 88 | 58 | 01 | 00 | °°°°°°°°"V°°ˆX°° |
0x0020: | 04 | 00 | 10 | 00 | 64 | 61 | 74 | 61 | 00 | 50 | 09 | 00 | F5 | FF | F5 | FF | °°°°data°P°°õÿõÿ |
- (The contents of the "fmt " section, relevant to SNDD storage, are the 16 bytes from 0x14 to 0x23.)
Offset | Type | Raw Hex | Value | Description |
---|---|---|---|---|
0x00 | char[4] | 52 49 46 46 | RIFF | identifier for the "IBM/Microsoft RIFF" standard |
0x04 | uint32 | 20 50 09 00 | 610336 | size of the RIFF container from 0x08 to end of file |
0x08 | char[4] | 57 41 56 45 | WAVE | identifier for the "WAVE" format |
0x0C | char[4] | 66 6D 74 20 | "fmt " | identifier announcing the following "fmt " (format) section |
0x10 | uint32 | 10 00 00 00 | 16 | content size for the "fmt " section, in bytes (always 16 for PCM) |
0x14 | uint16 | 01 00 | 1 | format ID (1 = linear PCM format) |
0x16 | uint16 | 02 00 | 2 | number of channels (2 = stereo) |
0x18 | uint32 | 22 56 00 00 | 22050 | sample rate in Hz (samples per second), a.k.a. "sampling frequency" |
0x1C | uint32 | 88 58 01 00 | 88200 | data rate (= "sample rate" * "block alignment"), in bytes per second; N.B. for PCM, there is one block per sample, hence the simple formula |
0x20 | uint16 | 04 00 | 4 | block alignment a.k.a. "block size", in bytes |
0x22 | uint16 | 10 00 | 16 | bits per sample (per channel); typically 16 bits for PCM, although other bit depths are possible |
0x24 | char[4] | 64 61 74 61 | data | identifier announcing the following "data" section |
0x28 | uint32 | 00 50 09 00 | 610304 | content size for the "data" section, in bytes (implies 152576 stereo sample blocks, 4 bytes each) |
0x2C | block[4] | F5 FF F5 FF | (-11,-11) | first stereo sample; the left and the right sample values are both -11=0xFFF5 |
- Mono vs stereo
- For a mono sound, the layout would be the same, except for the channel count (1 instead of 2), block alignment (2 instead of 4), data rate (twice the sample rate instead of four times) and different data sizes.
- Sample rate
- Standard WAVE PCM supports completely arbitrary sample rates. Besides 22050 and 44100, common ones are 8000, 11025, 48000 and 88200. See HERE for more.
- Bit depth
- 16-bit sample depth provides satisfactory signal-to-noise ratio in most situations. Low-resolution 8-bit samples are sometimes used, and higher resolution waveforms can have 24-bit or 32-bit samples. See HERE for more.
- Data size
- The size at 0x28 is the size of the raw data that starts at 0x2C and exactly corresponds to the .raw data of an SNDD instance.
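As a cross-check of the table above, here is a minimal Python sketch that parses the 44 header bytes of the example and verifies the relations between the fields. The variable names are illustrative only, and the sketch assumes the fixed layout of this template ("fmt " at 0x0C, "data" at 0x24); real-world WAVE files may contain additional chunks.

```python
import struct

# The 44 header bytes of the PCM example, copied from the hex dump above.
header = bytes.fromhex(
    "52494646 20500900 57415645 666D7420"
    "10000000 01000200 22560000 88580100"
    "04001000 64617461 00500900"
)

# "<" selects Little Endian with no padding between fields.
(riff_id, riff_size, wave_id, fmt_id, fmt_size,
 format_id, channels, sample_rate, data_rate, block_align, bits_per_sample,
 data_id, data_size) = struct.unpack("<4sI4s4sIHHIIHH4sI", header)

# Relations stated in the table: one block per sample for PCM.
assert block_align == channels * bits_per_sample // 8
assert data_rate == sample_rate * block_align

# Number of stereo sample blocks implied by the "data" size.
sample_blocks = data_size // block_align

# The first stereo sample (bytes at 0x2C): two signed 16-bit values.
first_sample = struct.unpack("<hh", bytes.fromhex("F5FFF5FF"))
```

With the bytes above, `sample_blocks` is 152576 and `first_sample` is (-11, -11), matching the last two rows of the table.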
MS ADPCM
Below is the beginning of a standard-compliant WAVE file with stereo MS ADPCM data (adapted from SNDDalarm_loop.aif).
0x0000: | 52 | 49 | 46 | 46 | 4E | 0C | 01 | 00 | 57 | 41 | 56 | 45 | 66 | 6D | 74 | 20 | RIFFN°°°WAVEfmt |
0x0010: | 32 | 00 | 00 | 00 | 02 | 00 | 02 | 00 | 22 | 56 | 00 | 00 | 27 | 57 | 00 | 00 | 2°°°°°°°"V°°'W°° |
0x0020: | 00 | 04 | 04 | 00 | 20 | 00 | F4 | 03 | 07 | 00 | 00 | 01 | 00 | 00 | 00 | 02 | °°°° °ô°°°°°°°° |
0x0030: | 00 | FF | 00 | 00 | 00 | 00 | C0 | 00 | 40 | 00 | F0 | 00 | 00 | 00 | CC | 01 | °ÿ°°°°À°@°ð°°°Ì° |
0x0040: | 30 | FF | 88 | 01 | 18 | FF | 66 | 61 | 63 | 74 | 04 | 00 | 00 | 00 | 8A | 05 | 0ÿˆ°°ÿfact°°°°Š° |
0x0050: | 01 | 00 | 64 | 61 | 74 | 61 | 00 | 0C | 01 | 00 | 05 | 05 | 10 | 00 | 10 | 00 | °°data°°°°°°°°°° |
0x0060: | 0C | 00 | AF | FF | 2E | 00 | B4 | FF | F0 | F1 | 00 | 00 | 0F | 30 | 20 | 10 | °°¯ÿ.°´ÿðñ°°°0°° |
- (The contents of the "fmt " section, relevant to SNDD storage, are the 50 bytes from 0x14 to 0x45.)
Offset | Type | Raw Hex | Value | Description |
---|---|---|---|---|
0x00 | char[4] | 52 49 46 46 | RIFF | identifier for the "IBM/Microsoft RIFF" standard |
0x04 | uint32 | 4E 0C 01 00 | 68686 | size of the RIFF container from 0x08 to end of file |
0x08 | char[4] | 57 41 56 45 | WAVE | identifier for the "WAVE" format |
0x0C | char[4] | 66 6D 74 20 | "fmt " | identifier announcing the following "fmt " (format) section |
0x10 | uint32 | 32 00 00 00 | 50 | content size for the "fmt " section, in bytes (typically 50 for MS ADPCM) |
0x14 | uint16 | 02 00 | 2 | format ID (2 = MS ADPCM format) |
0x16 | uint16 | 02 00 | 2 | number of channels (2 = stereo) |
0x18 | uint32 | 22 56 00 00 | 22050 | sample rate in Hz (samples per second), a.k.a. "sampling frequency" |
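0x1C | uint32 | 27 57 00 00 | 22311 | data rate (= "sample rate" * "block alignment" / "samples per block"), in bytes per second; N.B. the value is truncated to the lower integer |
0x20 | uint16 | 00 04 | 1024 | block alignment a.k.a. "block size", in bytes |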
0x22 | uint16 | 04 00 | 4 | bits per sample (per channel); typically 4 bits for MS ADPCM |
0x24 | uint16 | 20 00 | 32 | size of the following extended format specification, in bytes |
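0x26 | uint16 | F4 03 | 1012 | samples per block (= ("block alignment" - 7 * "number of channels") * 8 / ("bits per sample" * "number of channels") + 2) |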
0x28 | uint16 | 07 00 | 7 | number of the following coefficient pairs; always 7 in practice |
0x2A | int16-16 | 00 01 00 00 | 256, 0 | The coefficient pairs themselves (always the same in practice). |
0x2E | int16-16 | 00 02 00 FF | 512, -256 | |
0x32 | int16-16 | 00 00 00 00 | 0, 0 | |
0x36 | int16-16 | C0 00 40 00 | 192, 64 | |
0x3A | int16-16 | F0 00 00 00 | 240, 0 | |
0x3E | int16-16 | CC 01 30 FF | 460, -208 | |
0x42 | int16-16 | 88 01 18 FF | 392, -232 | |
0x46 | char[4] | 66 61 63 74 | fact | identifier announcing the following "fact" section |
0x4A | int32 | 04 00 00 00 | 4 | size of the following "fact" section in bytes |
0x4E | int32 | 8A 05 01 00 | 66954 | actual number of samples (see below for calculation) |
0x52 | char[4] | 64 61 74 61 | data | identifier announcing the following "data" section |
0x56 | int32 | 00 0C 01 00 | 68608 | size of the following "data" section in bytes (67 blocks of 1024 bytes) |
0x5A | block[14] | 05 05 10 00 10 00 0C 00 AF FF 2E 00 B4 FF | (5,5) (16,16) (12,-81) (46,-76) | the header of the first 1024-byte block (14 bytes for a stereo block) |
0x68 | byte[8]... | F0 F1 00 00 00 30 20 10 | (-1,0) (-1,1) (0,0) (0,0) (0,0) (3,0) (2,0) (1,0) | the first 8 pairs of nibbles (stereo samples); 1002 more bytes follow |
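The last two rows of the table can be reproduced with a short Python sketch. It only unpacks the stored values (predictor indices, deltas, explicit samples, signed nibbles) and does not run the actual prediction-correction decoder. The variable names are illustrative, and the assumption that the high nibble of each byte belongs to the Left channel is inferred from the values shown in the table.

```python
import struct

# Header of the first stereo block (14 bytes at 0x5A in the example).
header_bytes = bytes.fromhex("0505 1000 1000 0C00 AFFF 2E00 B4FF")

# Two 8-bit predictor indices, two 16-bit deltas,
# then two (Left, Right) pairs of signed 16-bit PCM samples.
(pred_l, pred_r, delta_l, delta_r,
 first_l, first_r, second_l, second_r) = struct.unpack("<BBhhhhhh", header_bytes)

# First 8 data bytes of the block, as listed in the table (at 0x68).
nibble_bytes = bytes.fromhex("F0 F1 00 00 00 30 20 10")

def signed_nibble(n: int) -> int:
    """Interpret a 4-bit value as a signed integer in the range -8..7."""
    return n - 16 if n & 0x8 else n

# Assumption: high nibble = Left, low nibble = Right.
nibble_pairs = [(signed_nibble(b >> 4), signed_nibble(b & 0x0F)) for b in nibble_bytes]
```

The nibbles are correction values fed to the predictor, not amplitudes, so turning them into playable samples still requires the coefficient table and the deltas.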
- ADPCM coefficient table
- The 7 pairs of coefficients are a standard set, hardcoded in practically every implementation of MS ADPCM. In theory their number and values are allowed to vary, and therefore any ADPCM-compressed waveform still provides the coefficients that were used for encoding, even though they are always the same in practice. Think of it as a "key" that needs to be common to the compression and decompression phases.
- Sample rate
- As with PCM, standard WAVE supports completely arbitrary sample rates. (Besides 22050 and 44100, common ones are 8000, 11025, 48000 and 88200. See HERE for more.) Oni, however, ignores the sample rate and plays back all waveforms at 22.05 kHz.
- Bit depth
- The bit depth of the compressed nibbles is typically 4 bits; the decompressed data is ordinary 16-bit PCM (two 16-bit samples for each channel are explicitly provided at the start of each block).
- Variable sizes, mono vs stereo
- For a mono sound, the layout would be the same, except for the channel count (1 instead of 2) as well as a possibly different block alignment. From the channel count and block size follows the number of samples per block (at 0x26), the rules being as follows (details HERE):
- For a mono signal (not shown above), an ADPCM block starts with seven bytes: an 8-bit predictor index, a 16-bit delta, and two 16-bit PCM samples (starting with the second sample to play).
- For a stereo signal (as shown above), an ADPCM block starts with 14 bytes: two 8-bit predictor indices, two 16-bit deltas, and two pairs of 16-bit PCM samples (Left then Right in each case).
- The rest of the block is filled, in the case of stereo, with pairs of Left and Right nibbles (grouped into bytes). In the case of mono, the bytes consist of consecutive same-channel nibbles.
- From the "samples per block" follows the average rate (at 0x1C). From the actual number of samples (at 0x4E) follows the number of ADPCM blocks required to store the waveform, and therefore the data size at (0x56), a multiple of "block size", which in turn affects the total RIFF size at 0x04.
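The two derived "fmt " values can be sketched in Python as follows. The formulas are the ones used on this page: samples per block = (block alignment - 7 * channels) * 8 / (bits per sample * channels) + 2, and data rate = sample rate * block alignment / samples per block, truncated. The function names are illustrative only.

```python
def samples_per_block(block_align: int, channels: int, bits_per_sample: int = 4) -> int:
    """Value stored at 0x26: two header samples plus the nibbles in the rest of the block."""
    header_size = 7 * channels                     # 7 bytes of block header per channel
    data_bits = (block_align - header_size) * 8    # bits available for nibbles
    return data_bits // (bits_per_sample * channels) + 2

def adpcm_data_rate(sample_rate: int, block_align: int, per_block: int) -> int:
    """Value stored at 0x1C: average bytes per second, truncated to an integer."""
    return sample_rate * block_align // per_block

stereo_spb = samples_per_block(1024, 2)                  # stereo template
stereo_rate = adpcm_data_rate(22050, 1024, stereo_spb)
mono_spb = samples_per_block(512, 1)                     # mono variant
mono_rate = adpcm_data_rate(22050, 512, mono_spb)
```

For the stereo template this gives 1012 samples per block and a data rate of 22311; for 512-byte mono blocks it gives 1012 and 11155.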
- "fact" section VS truncated data
- It is not standard-compliant to truncate the final ADPCM block after the last actual sample like Oni does. The WAVE file is expected to contain whole ADPCM blocks (usually padded with zeroes), and the exact number of actual samples is specified in the "fact" section. This is not very space-efficient, but it allows a waveform to have a completely arbitrary number of samples. With Oni's truncated data and lack of an explicit sample count, it is impossible to specify an odd-numbered count of mono samples (because the bytes of mono ADPCM blocks consist of consecutive same-channel nibbles, and there is no way to tell whether the last nibble counts or not). Also, Oni can only truncate after the header of an ADPCM block, therefore the two explicitly stored samples in the header are always played, whereas "fact" can be used to indicate that only the first sample is actual waveform data, and the other is a dummy.
- Data size
- The size at 0x56 is the size of the raw data that starts at 0x5A, which consists of whole ADPCM blocks and will need to be truncated after the last actual sample for use in an Oni SNDD instance. If the data is not truncated, Oni will determine the sample count from the padded data size, and there will be a noticeable silence (up to 46 milliseconds, or almost 3 game ticks, at 22.05 kHz and for 1012 samples per block). This is not a problem for impulse sounds, but definitely not recommended for ambients or music.
Importing from WAVE into Oni
Filling in PC demo SNDD
- Block alignment
- The PC demo engine can only play back MS ADPCM sounds, and only with a block size of 512 bytes per channel. This means that a WAVE file cannot be adapted for Oni PC demo at all (not without reencoding, anyway), unless it conforms to the above "MS ADPCM" template with the following value at 0x20: 512 bytes per block for mono sounds; 1024 bytes per block for stereo sounds. The other non-trivial values (samples per block and data rate) follow unambiguously from the channel count, block alignment and sample rate, so you only need to check the block size against the channel count.
- Sample rate
- The PC demo will play waveforms at 22.05 kHz, so if your WAVE file has a different sample rate, you will need to adjust the pitch/speed rate in OSGr.
- Sample count and raw data size
- Oni uses the raw data size to determine the sample count of ADPCM data, and a WAVE file stores whole ADPCM blocks (zero-padded), so you will need to look up the sample count in the "fact" section and use it to determine the size of the trimmed data that should be copied to Oni's SNDD (.raw part). The formula is not entirely trivial, so it is recommended that you derive it on your own:
- Divide the actual number of samples (from "fact") by the number of samples per block, and look at the remainder (R).
- If R is 0, then you are lucky and the waveform exactly fits in a whole number of blocks; just copy the data as-is.
- If R is 1, then the waveform only uses the first sample from the last block's header, so you trim the final ADPCM block to the first 7 or 14 bytes (for mono or stereo respectively).
- If R is 2 or larger, then you need to keep the 7 or 14 first bytes of the last block, plus R-2 more bytes for stereo and ceil[(R-2)*0.5] for mono.
- Copy the right amount of bytes as the SNDD's .raw part, store its address at 0x14 in the SNDD's .dat part, and put the size of the data at 0x10.
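The trimming rules above can be sketched as a small Python function. The function and parameter names are illustrative only, and the sketch handles mono and stereo exclusively.

```python
def trimmed_data_size(sample_count: int, samples_per_block: int,
                      block_align: int, channels: int) -> int:
    """Size in bytes of the ADPCM data to copy into the SNDD's .raw part."""
    if channels not in (1, 2):
        raise ValueError("only mono and stereo are covered here")
    header_size = 7 * channels                       # 7 bytes (mono) or 14 bytes (stereo)
    whole_blocks, r = divmod(sample_count, samples_per_block)
    size = whole_blocks * block_align
    if r == 0:
        return size                                  # waveform fills whole blocks exactly
    if r == 1:
        return size + header_size                    # only the block header is needed
    extra_samples = r - 2                            # samples stored as nibbles
    if channels == 2:
        return size + header_size + extra_samples    # one byte per stereo sample
    return size + header_size + (extra_samples + 1) // 2   # two mono samples per byte

# The stereo example above: 66954 samples ("fact"), 1012 samples per 1024-byte block.
example_size = trimmed_data_size(66954, 1012, 1024, 2)
```

For the example file, 66954 = 66 * 1012 + 162, so the result is 66 * 1024 + 14 + 160 = 67758 bytes out of the 68608 stored in the WAVE file.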
- Duration in game ticks
- The duration (value at 0x0C in the PC demo SNDD) is computed from the number of samples (counting one Left/Right pair as a single sample for stereo): the implicit playback rate is 22.05 kHz, so divide the sample count by 367.5 and round down to the nearest integer.
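A minimal Python sketch of the duration calculation, using integer arithmetic to avoid floating-point rounding. The function name is illustrative; the divisor 367.5 is the one given above (it equals 22050 / 60).

```python
def duration_in_ticks(sample_count: int) -> int:
    """Sample count divided by 367.5, rounded down."""
    # Dividing by 367.5 is the same as multiplying by 2 and dividing by 735.
    return sample_count * 2 // 735

example_ticks = duration_in_ticks(66954)   # sample count of the MS ADPCM example
```

For the 66954 samples of the example file, this yields 182 ticks.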
- Flags (at 0x08 in SNDD)
- Use 0x00000001 (Little Endian 01 00 00 00) for mono, and 0x00000003 (Little Endian 03 00 00 00) for stereo.
Filling in PC retail SNDD
The PC retail engine is more flexible since it has a 50-byte format section specifically intended to receive the "fmt " chunk of a MS ADPCM .wav file along with all the non-trivial ADPCM settings. This format section is activated with the 0x00000008 flag, provided that the 0x00000004 flag is OFF (this "4" flag overrides the "8" flag and forces the interpretation of the data as IMA4 ADPCM).
The 50-byte format section in the SNDD can also be used to store the shorter "fmt " chunk of PCM .wav files, and this is the recommended procedure, as opposed to using the "raw PCM" functionality of PC retail SNDD (if you remove the 0x00000008 flag, then the format header is completely disabled and the .raw SNDD data is interpreted as 22.05 kHz PCM, using the channel count specified at OSGr level; don't do that, and just use the "8" flag).
- Copying the .raw data
- For PCM, just refer to the "PCM" template above: copy the data that starts at 0x2C in the WAVE file (its size can be found at 0x28, but it reaches to the end of the file anyway since there is no padding). Copy the data to the SNDD's .raw part, store the address at 0xC4 in the SNDD's .dat part and the size at 0xC0.
- For MS ADPCM, you need to trim the final ADPCM block, with the same calculation as laid out for MS ADPCM above. Copy the truncated data from WAVE's "data" to the SNDD's .raw, and fill in its address and size in the SNDD's .dat, at 0x44 and 0x40 respectively.
- Copying the "fmt " header
- For MS ADPCM, overwrite the whole format section in the SNDD (at 0x0C) with the whole contents of the WAVE file's "fmt " section (at 0x14).
- For PCM, use the "fmt " section's content to fill the first 16 bytes of the SNDD's format section (the other 34 bytes will be ignored).
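The header copy can be sketched in Python as below. This is an illustration under assumptions: the WAVE file is taken to follow one of the two templates (with "fmt " as the first chunk, at 0x0C), the names are hypothetical, and the unused trailing bytes for PCM are zero-filled, which is a choice made here since those bytes are ignored anyway.

```python
import struct

FORMAT_SECTION_SIZE = 50   # size of the format section at 0x0C in a PC retail SNDD

def sndd_format_section(wav: bytes) -> bytes:
    """Return the 50 bytes to store at 0x0C in the SNDD's .dat part."""
    if wav[0x0C:0x10] != b"fmt ":
        raise ValueError('expected the "fmt " chunk at 0x0C, as in the templates')
    (fmt_size,) = struct.unpack_from("<I", wav, 0x10)
    if fmt_size > FORMAT_SECTION_SIZE:
        raise ValueError("fmt content does not fit in the 50-byte section")
    content = wav[0x14:0x14 + fmt_size]
    # Assumption: unused trailing bytes (34 of them for PCM) are zero-filled.
    return content.ljust(FORMAT_SECTION_SIZE, b"\x00")

# Header of the PCM template, copied from the hex dump in the "PCM" section.
pcm_header = bytes.fromhex(
    "52494646 20500900 57415645 666D7420"
    "10000000 01000200 22560000 88580100"
    "04001000 64617461 00500900"
)
section = sndd_format_section(pcm_header)
```

For an MS ADPCM file the same function returns the full 50-byte "fmt " content unchanged.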
- Block alignment
- Unlike for PC demo, any block alignment is supported, so as long as you are working with a self-consistent WAVE file, you have nothing to worry about (apart from the sample rate).
- Sample rate
- Like PC demo, PC retail will play all waveforms at 22.05 kHz (yes, even the 44.1 kHz waveforms from Vanilla game data are actually interpreted as 22.05 kHz waveforms).
- Therefore, if your WAVE file has a sample rate other than 22.05 kHz, you will need to compensate for the difference by adjusting the pitch/speed rate in OSGr.
- Flags (at 0x08 in PC retail SNDD)
- The 0x00000008 flag (Little Endian 08 00 00 00) must be ON, and the 0x00000004 flag (Little Endian 04 00 00 00) must be OFF.
- The effect of the "1" and "2" flags is unknown, so it is recommended to just use the value 0x00000008 (Little Endian 08 00 00 00).
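The flag logic described in this section can be summarized with a small Python sketch. The constant and function names are hypothetical; only the "4" and "8" bits are interpreted, following the rules stated on this page.

```python
FLAG_IMA4 = 0x00000004            # the "4" flag: forces IMA4 ADPCM
FLAG_FORMAT_SECTION = 0x00000008  # the "8" flag: enables the 50-byte format section

def retail_data_interpretation(flags: int) -> str:
    """How PC retail interprets the .raw data, as described on this page."""
    if flags & FLAG_IMA4:
        return "IMA4 ADPCM"        # the "4" flag overrides the "8" flag
    if flags & FLAG_FORMAT_SECTION:
        return "format section"    # MS ADPCM or PCM, as specified by the "fmt " copy
    return "raw PCM"               # format section disabled

# The recommended flags value, as stored in the file (Little Endian).
recommended = (0x00000008).to_bytes(4, "little")
```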
Exporting from Oni to WAVE
From PC demo SNDD
PC demo SNDDs contain standard MS ADPCM data, but the parameters are not stored in the SNDD file. Instead they reside in the engine (hard-coded) and must be generated from scratch when creating the WAVE file. The determining parameters are the sample rate (22.05 kHz), the channel count (either 1 or 2) and the block alignment (512 bytes per channel).
- Generating the WAVE header (stereo)
- For stereo, use the template provided above for the MS ADPCM variant of the WAVE file, seeing as it already has all the right parameters: channel count 2 (at 0x16), block alignment 1024 (at 0x20), samples per block 1012 (at 0x26), sample rate 22050 (at 0x18), data rate 22311 (at 0x1C).
- For mono, use the same template, but with the following substitutions: channel count 1, block alignment 512, data rate 11155. The samples per block remain unchanged, since the block size is halved but each sample also takes half as much data.
- Calculating the sample count
- The calculation here is the reverse of the WAVE-to-SNDD conversion. We know the size of the truncated data and we must construct the actual sample count that will be stored in the WAVE's "fact" section.
- Divide the .raw data size (at 0x10 in the SNDD's .dat) by the block size (512 per channel) and consider the remainder (R):
- If R is 0, then the sample count is simply the number of whole blocks times 1012.
- If R is 14 or larger (for stereo), the sample count must be increased by (R-14)+2: the two samples stored in the block header, plus one stereo sample per remaining byte.
- If R is 7 or larger (for mono), the sample count must be increased by (R-7)*2+2: the two samples stored in the block header, plus two samples per remaining byte.
- If R is in the [1:6] range for mono, or in the [1:13] range for stereo, then something is wrong with your SNDD file (incomplete ADPCM header for the final block).
- Once the sample count is calculated, store it in the "fact" section of the WAVE file, at 0x4E.
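The reverse calculation can be sketched in Python as follows. The sketch assumes that a partial final block always contributes the two samples stored in its header (as explained in the "fact" discussion above), plus one stereo sample or two mono samples per remaining byte. The function name is illustrative, and the block size and samples per block are passed as parameters so that the same code also covers custom values.

```python
def sample_count_from_raw_size(raw_size: int, block_align: int,
                               samples_per_block: int, channels: int) -> int:
    """Sample count to store in "fact", derived from the truncated .raw data size."""
    if channels not in (1, 2):
        raise ValueError("only mono and stereo are covered here")
    header_size = 7 * channels
    whole_blocks, r = divmod(raw_size, block_align)
    count = whole_blocks * samples_per_block
    if r == 0:
        return count
    if r < header_size:
        raise ValueError("incomplete ADPCM header in the final block")
    samples_per_byte = 2 if channels == 1 else 1
    # Two explicit header samples, then the nibble-encoded samples.
    return count + 2 + (r - header_size) * samples_per_byte

# Stereo PC demo example: 67758 bytes of truncated data, 1024-byte blocks.
example_count = sample_count_from_raw_size(67758, 1024, 1012, 2)
```

With 67758 bytes of stereo data the result is 66 * 1012 + 2 + 160 = 66954 samples.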
- Raw data
- For a standard-compliant WAVE file, the .raw data of the SNDD must be padded to an integer multiple of "block size" (512 bytes for mono and 1024 for stereo), so that the data consists of whole ADPCM blocks. Once you have done that, put the padded .raw data at 0x5A in the WAVE file, and update the "data" section's size (at 0x56) as well as the RIFF container's size (at 0x04).
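Padding to whole blocks can be sketched in Python as below. The function name is illustrative, and zero bytes are used as padding, following the "usually padded with zeroes" remark above.

```python
def pad_to_whole_blocks(raw: bytes, block_align: int) -> bytes:
    """Append zero bytes so that the length is a multiple of the block size."""
    remainder = len(raw) % block_align
    if remainder == 0:
        return raw
    return raw + bytes(block_align - remainder)   # bytes(n) is n zero bytes

# 67758 bytes of truncated stereo data become 67 whole blocks of 1024 bytes.
padded = pad_to_whole_blocks(bytes(67758), 1024)
```

The length of the padded data is the value to store at 0x56.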
From PC retail SNDD
Since "raw PCM" remained undocumented for a long time, it is assumed that you will never need to convert to WAVE from an SNDD that uses the raw PCM functionality (i.e., with both the 0x00000004 and the 0x00000008 flags disabled). Most probably you will need to convert from an SNDD that contains a standard-compliant "fmt " section. (The raw PCM case will still be covered below, very briefly.)
At the time of writing, it has not been confirmed whether PC retail Oni supports any WAVE format IDs other than 1 (Linear PCM) and 2 (MS ADPCM), so we are only going to assume those two (or raw PCM).
MS ADPCM
If the format ID (at 0x0C in the SNDD) is 2, then you need to work from the "MS ADPCM" template for the WAVE file.
- Copying the "fmt " header
- Take the whole 50-byte block that starts at 0x0C in the SNDD, and use it to replace the contents of the "fmt " section in the WAVE file (starting at 0x14).
- Calculating the sample count
- Same calculation as above for the PC-demo-SNDD-to-WAVE conversion, except the "block alignment" and "samples per block" parameters can be custom now.
- Divide the .raw data size (at 0x40 in the SNDD's .dat) by the block size (custom value at 0x18 in the SNDD or at 0x20 in the new WAVE file) and consider the remainder (R):
- If R is 0, then the sample count is simply the number of whole blocks times the "samples per block" value (at 0x1E in the SNDD or at 0x26 in the new WAVE file).
- If R is 14 or larger (for stereo), the sample count must be increased by (R-14)+2: the two samples stored in the block header, plus one stereo sample per remaining byte.
- If R is 7 or larger (for mono), the sample count must be increased by (R-7)*2+2: the two samples stored in the block header, plus two samples per remaining byte.
- If R is in the [1:6] range for mono, or in the [1:13] range for stereo, then something is wrong with your SNDD file (incomplete ADPCM header for the final block).
- Once the sample count is calculated, store it in the "fact" section of the WAVE file, at 0x4E.
- Raw data
- Same as above for exporting from PC demo SNDDs, but with a possibly custom block size.
- For a standard-compliant WAVE file, the .raw data of the SNDD must be padded to an integer multiple of "block size" (custom value at 0x18 in the SNDD or at 0x20 in the new WAVE file), so that the data consists of whole ADPCM blocks. Once you have done that, put the padded .raw data at 0x5A in the WAVE file, and update the "data" section's size (at 0x56) as well as the RIFF container's size (at 0x04).
PCM (with "fmt ")
If the format ID (at 0x0C in the SNDD) is 1, then you need to work from the "PCM" template for the WAVE file.
- Copying the "fmt " header
- Grab a 16-byte block from the SNDD's .dat part, starting at 0x0C, and use it to replace the contents of the "fmt " section in the WAVE file (starting at 0x14). Be careful not to copy more than 16 bytes, so as not to overwrite the "data" section that follows.
- Raw data
- For PCM there is no padding, so copy the .raw data as-is into the "data" section of the WAVE file, at 0x2C, and store its size at 0x28. Update the size of the RIFF container (at 0x04) accordingly.
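Assembling the PCM WAVE file can be sketched in Python as below. The function name is illustrative; the layout follows the "PCM" template, and the RIFF size is computed as the total file length minus 8, following its description as the size "from 0x08 to end of file".

```python
import struct

def build_pcm_wave(fmt16: bytes, raw: bytes) -> bytes:
    """Assemble a WAVE file from 16 bytes of "fmt " content and the .raw PCM data."""
    if len(fmt16) != 16:
        raise ValueError('PCM "fmt " content must be exactly 16 bytes')
    # "WAVE" (4) + "fmt " chunk (8 + 16) + "data" chunk (8 + data size)
    riff_size = 4 + (8 + 16) + (8 + len(raw))
    return (b"RIFF" + struct.pack("<I", riff_size) + b"WAVE"
            + b"fmt " + struct.pack("<I", 16) + fmt16
            + b"data" + struct.pack("<I", len(raw)) + raw)

# Tiny illustration: the template's "fmt " content and a single stereo sample.
fmt16 = bytes.fromhex("0100 0200 22560000 88580100 0400 1000")
wav = build_pcm_wave(fmt16, bytes.fromhex("F5FF F5FF"))
```

The resulting bytes place the "fmt " content at 0x14, the data size at 0x28 and the data itself at 0x2C, as in the template.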
raw PCM
For raw PCM, everything is as with formatted PCM, except the format is a default one (Linear PCM, 22.05 kHz, 16 bits per sample). The channel count is inferred from OSGr.
- Filling in the "fmt " header
- Start with the "PCM" template. Leave the sample rate (at 0x18) as 22050 and the bit depth (at 0x22) as 16. Look up the channel count in the OSGr referencing the SNDD.
- If the sound is referenced as mono (from OSGr), then back in the WAVE file set the block size (at 0x20) to 2 and the data rate (at 0x1C) to 44100.
- If the sound is referenced as stereo (from OSGr), then back in the WAVE file set the block size (at 0x20) to 4 and the data rate (at 0x1C) to 88200.
- Raw data
- For PCM there is no padding, so copy the .raw data as-is into the "data" section of the WAVE file, at 0x2C, and store its size at 0x28. Update the size of the RIFF container (at 0x04) accordingly.
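The default "fmt " content for raw PCM can be generated with a short Python sketch. The function name is illustrative; the channel count is assumed to have been looked up in the OSGr beforehand, and the other values are the defaults listed above.

```python
import struct

def default_pcm_fmt(channels: int) -> bytes:
    """16-byte "fmt " content for raw PCM: linear PCM, 22.05 kHz, 16 bits per sample."""
    sample_rate = 22050
    bits_per_sample = 16
    block_align = channels * bits_per_sample // 8   # 2 for mono, 4 for stereo
    data_rate = sample_rate * block_align           # 44100 for mono, 88200 for stereo
    return struct.pack("<HHIIHH", 1, channels, sample_rate,
                       data_rate, block_align, bits_per_sample)

mono_fmt = default_pcm_fmt(1)
stereo_fmt = default_pcm_fmt(2)
```

The stereo result is byte-for-byte identical to the "fmt " content of the "PCM" template (0x14 to 0x23).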