OBD:SNDD: Difference between revisions
(→raw: ADPCM link) |
(→raw: info about data blocks, padding and looping) |
||
Line 50: | Line 50: | ||
---- | ---- | ||
---- | |||
---- | ---- | ||
==raw== | ==raw== | ||
The raw data part of a SNDD file contains the actual audio | The raw data part of a SNDD file contains the actual audio sample blocks without any other headers (other than block headers). | ||
===Exporting and importing tips=== | |||
=== | |||
To create a wav/aif file one needs to write a file header like below and then write the contents of the raw data part. | To create a wav/aif file one needs to write a file header like below and then write the contents of the raw data part. | ||
====WAV files==== | ====WAV files==== | ||
Line 113: | Line 114: | ||
{{OBDtr| 0x36 | int32 |FFC8FF| 00 00 00 00 | 0 | block size; used in conjunction with offset for block-aligning data; in Oni it's always zero }} | {{OBDtr| 0x36 | int32 |FFC8FF| 00 00 00 00 | 0 | block size; used in conjunction with offset for block-aligning data; in Oni it's always zero }} | ||
|} | |} | ||
===ADPCM format details=== | |||
;MS ADPCM - PC retail | |||
:For an overview of the ADPCM algorithm (if interested), see [https://wiki.multimedia.cx/index.php/Microsoft_ADPCM HERE]. | |||
::The MS ADPCM data has 512- or 1024-byte blocks (512 bytes for 22.05kHz mono, 1024 bytes for 22.05kHz stereo and 44.1kHz mono) | |||
::Each block consists of a 7- or 14-byte header (7 bytes for mono, 14 bytes for stereo), which includes the block's first two samples. | |||
::The remaining 505, 1010 or 1017 bytes of each block consist of nibbles (half-bytes), with left-right interleaving in the case of stereo. | |||
:::(that's 1010 more samples in the case of 22.05kHz mono or stereo, and 2034 more samples in the case of 44.1kHz mono) | |||
::Thus the total number of samples per block (including the two in the header) is 1012 for 22.05kHz (mono or stereo) and 2036 for 44.1kHz mono. | |||
::The final block in the file can be incomplete (the decoder can infer this from the block size and raw data size). | |||
;IMA ADPCM (IMA4) - Mac and PC demo | |||
:For an overview of the IMA ADPCM algorithm and IMA4 header (if interested), see [https://wiki.multimedia.cx/index.php/Apple_QuickTime_IMA_ADPCM HERE] | |||
::The IMA4 ADPCM data has 34-byte blocks (in the case of stereo, there is an even number of such blocks, because Left and Right blocks are interleaved). | |||
::The first two bytes of each block are used to set the initial predictor (upper 9 bits) and step (lower 7 bits) for decoding the block's samples. | |||
::The other 32 bytes consist of 64 samples stored as nibbles (half-bytes). In the case of stereo, all the nibbles in a block belong to the same channel (either all Left or all Right). | |||
::Unlike for MS ADPCM, incomplete trailing blocks (if any) are not announced in any way: the final blocks are stored in their entirety, with no way to tell how much of it is actual data. | |||
::For this reason, identical sounds do not have the same sample count on PC retail and Mac/demo. As an example, here are the stats for some stereo sounds ("atm_cl05" ambient): | |||
{| | |||
| | |||
{{Table}} | |||
!SNDD name (and frame count) | |||
!PC retail | |||
!Mac/demo | |||
!difference | |||
|- | |||
| | |||
:'''SNDDatm_cl05_in''' | |||
:100 frames = 1.6667 seconds | |||
:~= '''36750''' samples (@22.05kHz) | |||
| | |||
:0x916C = 37228 = 36x1024 + 364 bytes | |||
:= 36x1012 + 352 = '''36784''' stereo samples | |||
:= 1.668208616780045 s (@22.05kHz) | |||
| | |||
:0x98BC = 39100 = 1150x34 bytes | |||
:= 575x64 = '''36800''' stereo samples | |||
:= 1.668934240362812 s (@22.05kHz) | |||
| | |||
:As compared to PC retail, | |||
:Mac/demo has 16 extra samples at the end | |||
:(i.e., the last 8 bytes of the last two blocks). | |||
|- | |||
| | |||
:'''SNDDatm_cl05_lp1''' | |||
:897 frames = 14.95 seconds | |||
:~= '''329647.5''' samples (@22.05kHz) | |||
| | |||
:0x5172A = 333610 = 325x1024 + 810 bytes | |||
:= 325x1012 + 798 = '''329698''' stereo samples | |||
:= 14.95229024943311 s (@22.05kHz) | |||
| | |||
:0x55880 = 350336 = 10304x34 bytes | |||
:= 5152x64 = '''329728''' stereo samples | |||
:= 14.95365079365079 s (@22.05kHz) | |||
| | |||
:As compared to PC retail, | |||
:Mac/demo has 30 extra samples at the end | |||
:(i.e., the last 15 bytes of the last two blocks) | |||
|- | |||
| | |||
:'''SNDDatm_cl05_lp2''' | |||
:795 frames = 13.25 seconds | |||
:~= '''292162.5''' samples (@22.05kHz) | |||
| | |||
:0x48334 = 295732 = 288x1024 + 820 bytes | |||
:= 288x1012 + 808 = '''292264''' stereo samples | |||
:= 13.25460317460317 s (@22.05kHz) | |||
| | |||
:0x4BD1C = 310556 = 9134x34 bytes | |||
:= 4567x64 = '''292288''' stereo samples | |||
:= 13.25569160997732 s (@22.05kHz) | |||
| | |||
:As compared to PC retail, | |||
:Mac/demo has 24 extra samples at the end | |||
:(i.e., the last 12 bytes of the last two blocks) | |||
|- | |||
| | |||
:'''SNDDatm_cl05_lp3''' | |||
:428 frames = 7.133333 seconds | |||
:~= '''157290''' samples (@22.05kHz) | |||
| | |||
:0x26F12 = 159506 = 155x1024 + 786 bytes | |||
:= 155x1012 + 774 = '''157634''' stereo samples | |||
:= 7.148934240362812 s (@22.05kHz) | |||
| | |||
:0x28E80 = 167552 = 4928x34 bytes | |||
:= 2464x64 = '''157696''' stereo samples | |||
:= 7.151746031746032 s (@22.05kHz) | |||
| | |||
:As compared to PC retail, | |||
:Mac/demo has 62 extra samples at the end | |||
:(i.e., the last 31 bytes of the last two blocks) | |||
|- | |||
| | |||
:'''SNDDatm_cl05_lp4''' | |||
:478 frames = 7.9666667 seconds | |||
:~= '''175665''' samples (@22.05kHz) | |||
| | |||
:0x2B7BE = 178110 = 173x1024 + 958 bytes | |||
:= 173x1012 + 946 = '''176022''' stereo samples | |||
:= 7.982857142857143 s (@22.05kHz) | |||
| | |||
:0x2DABC = 187068 = 5502x34 bytes | |||
:= 2751x64 = '''176064''' stereo samples | |||
:= 7.984761904761905 s (@22.05kHz) | |||
| | |||
:As compared to PC retail, | |||
:Mac/demo has 42 extra samples at the end | |||
:(i.e., the last 21 bytes of the last two blocks) | |||
|- | |||
| | |||
:'''SNDDatm_cl05_out''' | |||
:109 frames = 1.816667 seconds | |||
:~= '''40057.5''' samples (@22.05kHz) | |||
| | |||
:0x9E7A = 40570 = 39x1024 + 634 bytes | |||
:= 39x1012 + 622 = '''40090''' stereo samples | |||
:= 1.818140589569161 s (@22.05kHz) | |||
| | |||
:0xA68C = 42636 = 1254x34 bytes | |||
:= 627x64 = '''40128''' stereo samples | |||
:= 1.819863945578231 s (@22.05kHz) | |||
| | |||
:As compared to PC retail, | |||
:Mac/demo has 38 extra samples at the end | |||
:(i.e., the last 19 bytes of the last two blocks) | |||
|} | |||
|} | |||
By looking at the end of the Mac/demo SNDDs (or the exported AIFF files), it can be confirmed that the extraneous samples are actually there, at the end of the last two 34-byte blocks (last Left block and last Right block). It is not clear how the Mac (or PC demo) engine detects these trailing samples and cuts them off (this is important when playing a sequence, e.g. music or ambient loops, see "looping issues" below). | |||
Possibly Mac/demo Oni just looks at the approximate length of each SNDD in frames (or game ticks, i.e., 1/60th of a second), which is listed in each SNDD's header and, once the announced frame count has been reached for the currently playing sound, starts playback on the next sound in the sequence. Depending on the hardware/software implementation of the audio pipelines, this logic can either interrupt the currently playing sound, or cause a slight overlap/crossfade between the current sound and the next. It is possible that PC retail Oni actually does the same, i.e., segments of a sequence are dispatched to the OS based on the frame count of the previous segment, rather than based on its actual play time (sample count). | |||
Another theoretical possibility is that, in the case of IMA4 ADPCM, an illegal step index (outside the expected 0-88 range) is used to signify the end of the stream. Decoders typically resolve this by forcing an out-of-bounds step index into the 0-88 interval, but perhaps a custom decoder can interrupt the stream instead. However, this would be very non-standard behavior, and would only be feasible if the decompressed audio stream is put together inside Oni's engine, rather than deferred to the OS. Therefore it is more likely that Oni dispatches each segment to the OS based on the frame count; the OS receives the ADPCM-compressed data, decompresses it and plays it back; the overlap/crossfade/interruption of the currently playing segment is handled at OS level. | |||
===Looping issues=== | |||
As detailed above, ADPCM data is stored in blocks, but the actual sound data does not necessarily end exactly at the end of a block. This is true both for MS ADPCM (PC retail) and IMA4 ADPCM (Mac and PC demo), but is especially noticeable for the comparatively large blocks of MS ADPCM, where the padding can be as large as ~1010 samples, i.e., a ~46-millisecond silence in the case of 22.05kHz (for IMA4, the biggest possible gap is 63 samples, or ~3 milliseconds). | |||
====MS ADPCM==== | |||
Although the final block of an MS ADPCM file is stored in incomplete form (with only the actual samples and no padding), the standard decoding behavior (e.g., in an audio editor) is to automatically add the padding up to the end of the last block of an ADPCM-compressed WAV, creating a silence at the end of the imported audio. This artificial silence can be a problem if one wants to join SNDDs that are supposed to play seamlessly one after another (e.g., a musical or ambient sequence). | |||
The solution is to use some alternative tools, which are more flexible about incomplete MS ADPCM blocks: | |||
*For [http://sox.sourceforge.net/ Sox], padding is disabled by default when joining several files. | |||
*For [https://www.ffmpeg.org/download.html ffmpeg], padding can be disabled as an optional setting. | |||
====IMA ADPCM==== | |||
In the case of IMA ADPCM, the padding is actually present in the stored audio, so it is impossible (both for OniSplit and for a third-party converter) to automatically trim it down to just the relevant audio data. In fact, just by looking at the Mac/demo SNDD itself, there is no way to tell how many of the trailing samples need to be cut for a truly seamless transition. | |||
There are two solutions - an approximate one and an exact one: | |||
#As an approximation, look up the frame count announced in the SNDD's header (using a hex viewer or an XML dump), divide that by 60 to get the length of the clip in seconds, and multiply by the sample rate 22.05kHz. You will get the number of samples (and the corresponding delay in seconds) that are actually played back by Oni before starting the next sound in the sequence. You can replicate this delay with ffmpeg, Sox, or the audio editor of your choice. A cross-fade between the two clips in the overlapping region should sound best. You can also examine the samples near the approximate transition time, and "manually" determine where the actual samples end and the padding begins. | |||
#As an exact solution, look up the sample count of an equivalent MS ADPCM file from a PC retail version of Oni. Padding should only be a problem for non-localized music and ambients, so the language version shouldn't matter. The electric "zap" sounds (which are sampled at 44.1kHz in PC retail Oni) are also not loopable, so you should be able to find a 22.05kHz sound with an intuitive sample count. It will be somewhat smaller than the raw sample count of the IMA4 ADPCM, because of the padding (see the atm_cl05 example above for a comparison). Once you know the actual sample count, use ffmpeg to convert the .aif file to .wav (either PCM or ADPCM), keeping only the actual samples and trimming out the padding. Then, use Sox or ffmpeg to seamlessly join the .wav files as you would for regular MS ADPCM files (see above). | |||
---- | |||
Revision as of 14:53, 9 May 2020
.dat
There are 2 different formats used by the SNDD files.
PC retail
Below is the .dat file part used in the PC retail version.
Offset | Type | Raw Hex | Value | Description |
---|---|---|---|---|
0x00 | res_id | 01 D7 08 00 | 2263 | 02263-comguy_dth2.aif.SNDD |
0x04 | lev_id | 01 00 00 06 | 3 | level 3 |
0x08 | int32 | 08 00 00 00 | 8 | flags |
0x0C | block[50] | wav header | ||
0x3E | int16 | 37 00 | 55 | duration in 1/60 seconds |
0x40 | int32 | 56 28 00 00 | 10326 | size of the part in the raw file in bytes |
0x44 | offset | 20 10 59 00 | 00 59 10 20 | at this position starts the part in the raw file |
0x48 | char[24] | AD DE | dead | unused |
PC demo and Mac
The Mac version and the PC demo version use a simpler format. It appears that there is no support for different sample rates (all sounds are sampled at 22050 Hz).
Offset | Type | Raw Hex | Value | Description |
---|---|---|---|---|
0x00 | res_id | 01 D6 08 00 | 2262 | 02262-comguy_dth2.aif.SNDD |
0x04 | lev_id | 01 00 00 06 | 3 | level 3 |
0x08 | int32 | 01 00 00 00 | 1 | "number of channels" (can be 1 for 1 channel or 3 for 2 channels) |
0x0C | int32 | 37 00 00 00 | 55 | duration in 1/60 seconds |
0x10 | int32 | 5E 2A 00 00 | 10846 | size of the part in the raw file in bytes |
0x14 | offset | 00 B1 01 00 | 00 01 B1 00 | at this position starts the part in the raw file |
0x18 | char[8] | AD DE | dead | unused |
raw
The raw data part of a SNDD file contains the actual audio sample blocks without any other headers (other than block headers).
Exporting and importing tips
To create a wav/aif file one needs to write a file header like below and then write the contents of the raw data part.
WAV files
- Write "RIFF"
- add the size of the part in the raw file + 70 bytes
- write "WAVE"
- write "fmt "
- write 50
- write the wav header
- write "data"
- add the size of the part in the raw file
- add the raw file data
- save it as a wav file.
Offset | Type | Raw Hex | Value | Description |
---|---|---|---|---|
Complete ADPCM wav format header (black outline) | ||||
0x00 | char[4] | 52 49 46 46 | RIFF | identifier for the "IBM/Microsoft RIFF" standard |
0x04 | int32 | 9C 28 00 00 | 10396 | size of the file from 0x08 to the end (= size of the .raw part + 70 bytes) |
0x08 | char[4] | 57 41 56 45 | WAVE | identifier for the "WAVE" format |
0x0C | char[4] | 66 6D 74 20 | fmt | identifier announcing the following wav format header |
0x10 | int32 | 32 00 00 00 | 50 | wave format header size |
0x14 | block[50] | wav header | ||
0x46 | char[4] | 64 61 74 61 | data | identifier announcing the following wav data |
0x4A | int32 | 56 28 00 00 | 10326 | size of the following wav data in bytes (= size of the .raw part) |
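The WAV export steps above can be sketched as follows. This is an illustration rather than the exporter's real code: `build_wav` is a hypothetical name, and the 50-byte wav header is treated as an opaque block taken from the SNDD record.

```python
import struct


def build_wav(wav_header: bytes, raw: bytes) -> bytes:
    """Wrap raw MS ADPCM data in the RIFF/WAVE header described above."""
    if len(wav_header) != 50:
        raise ValueError("expected the 50-byte wav header from the SNDD record")
    return b"".join([
        b"RIFF", struct.pack("<I", len(raw) + 70),  # size from 0x08 to the end
        b"WAVE",
        b"fmt ", struct.pack("<I", 50),             # wave format header size
        wav_header,
        b"data", struct.pack("<I", len(raw)),       # size of the raw part
        raw,
    ])
```

With a 10326-byte raw part, the size field at 0x04 comes out as 10396 (`9C 28 00 00`), matching the table.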
AIF files
- Write "FORM"
- add the size of the part in the raw file + 50 bytes
- write "AIFC"
- write "COMM"
- add the aif header + calculate its sample rate (always 22.05 kHz)
- write "SSND"
- add the size of the part in the raw file + 8 bytes
- add 8 zero bytes
- add the raw file data and save it as an aif file.
Note the Big Endian order
Offset | Type | Raw Hex | Value | Description |
---|---|---|---|---|
Complete aif format header (black outline) | ||||
0x00 | char[4] | 46 4F 52 4D | FORM | identifier for the "EA IFF 85" standard |
0x04 | int32 | 00 00 2A 90 | 10896 | size of the file from 0x08 to the end (= size of the .raw part + 50 bytes) |
0x08 | char[4] | 41 49 46 43 | AIFC | identifier for the "AIFC" format (compressed aif file) |
0x0C | char[4] | 43 4F 4D 4D | COMM | identifier announcing the following aif format header |
0x10 | block[26] | aif header | ||
0x2A | char[4] | 53 53 4E 44 | SSND | identifier announcing the following aif data |
0x2E | int32 | 00 00 2A 66 | 10854 | size of the file from 0x32 to the end (= size of the .raw part + 8 bytes) |
0x32 | int32 | 00 00 00 00 | 0 | offset; determines where the first sample in the data starts; in Oni it's always zero |
0x36 | int32 | 00 00 00 00 | 0 | block size; used in conjunction with offset for block-aligning data; in Oni it's always zero |
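The AIF export steps can be sketched in the same way, this time with big-endian size fields. The name `build_aifc` is hypothetical, and the 26-byte aif header block is taken as caller-supplied because its contents are not broken down in the table above.

```python
import struct


def build_aifc(aif_header: bytes, raw: bytes) -> bytes:
    """Wrap raw IMA4 data in the FORM/AIFC header described above."""
    if len(aif_header) != 26:
        raise ValueError("expected the 26-byte aif header block")
    return b"".join([
        b"FORM", struct.pack(">I", len(raw) + 50),  # size from 0x08 to the end
        b"AIFC",
        b"COMM",
        aif_header,
        b"SSND", struct.pack(">I", len(raw) + 8),   # size from 0x32 to the end
        struct.pack(">II", 0, 0),                   # offset and block size, both zero in Oni
        raw,
    ])
```

With a 10846-byte raw part, the two size fields come out as 10896 (`00 00 2A 90`) and 10854 (`00 00 2A 66`), matching the table.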
ADPCM format details
- MS ADPCM - PC retail
- For an overview of the ADPCM algorithm (if interested), see HERE.
- The MS ADPCM data has 512- or 1024-byte blocks (512 bytes for 22.05kHz mono, 1024 bytes for 22.05kHz stereo and 44.1kHz mono)
- Each block consists of a 7- or 14-byte header (7 bytes for mono, 14 bytes for stereo), which includes the block's first two samples.
- The remaining 505, 1010 or 1017 bytes of each block consist of nibbles (half-bytes), with left-right interleaving in the case of stereo.
- (that's 1010 more samples in the case of 22.05kHz mono or stereo, and 2034 more samples in the case of 44.1kHz mono)
- Thus the total number of samples per block (including the two in the header) is 1012 for 22.05kHz (mono or stereo) and 2036 for 44.1kHz mono.
- The final block in the file can be incomplete (the decoder can infer this from the block size and raw data size).
- IMA ADPCM (IMA4) - Mac and PC demo
- For an overview of the IMA ADPCM algorithm and IMA4 header (if interested), see HERE
- The IMA4 ADPCM data has 34-byte blocks (in the case of stereo, there is an even number of such blocks, because Left and Right blocks are interleaved).
- The first two bytes of each block are used to set the initial predictor (upper 9 bits) and step (lower 7 bits) for decoding the block's samples.
- The other 32 bytes consist of 64 samples stored as nibbles (half-bytes). In the case of stereo, all the nibbles in a block belong to the same channel (either all Left or all Right).
- Unlike for MS ADPCM, incomplete trailing blocks (if any) are not announced in any way: the final blocks are stored in their entirety, with no way to tell how much of it is actual data.
- For this reason, identical sounds do not have the same sample count on PC retail and Mac/demo. As an example, here are the stats for some stereo sounds ("atm_cl05" ambient):
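SNDD name (and frame count) | PC retail | Mac/demo | Difference |
---|---|---|---|
SNDDatm_cl05_in; 100 frames = 1.6667 seconds ~= 36750 samples (@22.05kHz) | 0x916C = 37228 = 36x1024 + 364 bytes = 36x1012 + 352 = 36784 stereo samples = 1.668208616780045 s (@22.05kHz) | 0x98BC = 39100 = 1150x34 bytes = 575x64 = 36800 stereo samples = 1.668934240362812 s (@22.05kHz) | Mac/demo has 16 extra samples at the end (i.e., the last 8 bytes of the last two blocks) |
SNDDatm_cl05_lp1; 897 frames = 14.95 seconds ~= 329647.5 samples (@22.05kHz) | 0x5172A = 333610 = 325x1024 + 810 bytes = 325x1012 + 798 = 329698 stereo samples = 14.95229024943311 s (@22.05kHz) | 0x55880 = 350336 = 10304x34 bytes = 5152x64 = 329728 stereo samples = 14.95365079365079 s (@22.05kHz) | Mac/demo has 30 extra samples at the end (i.e., the last 15 bytes of the last two blocks) |
SNDDatm_cl05_lp2; 795 frames = 13.25 seconds ~= 292162.5 samples (@22.05kHz) | 0x48334 = 295732 = 288x1024 + 820 bytes = 288x1012 + 808 = 292264 stereo samples = 13.25460317460317 s (@22.05kHz) | 0x4BD1C = 310556 = 9134x34 bytes = 4567x64 = 292288 stereo samples = 13.25569160997732 s (@22.05kHz) | Mac/demo has 24 extra samples at the end (i.e., the last 12 bytes of the last two blocks) |
SNDDatm_cl05_lp3; 428 frames = 7.133333 seconds ~= 157290 samples (@22.05kHz) | 0x26F12 = 159506 = 155x1024 + 786 bytes = 155x1012 + 774 = 157634 stereo samples = 7.148934240362812 s (@22.05kHz) | 0x28E80 = 167552 = 4928x34 bytes = 2464x64 = 157696 stereo samples = 7.151746031746032 s (@22.05kHz) | Mac/demo has 62 extra samples at the end (i.e., the last 31 bytes of the last two blocks) |
SNDDatm_cl05_lp4; 478 frames = 7.9666667 seconds ~= 175665 samples (@22.05kHz) | 0x2B7BE = 178110 = 173x1024 + 958 bytes = 173x1012 + 946 = 176022 stereo samples = 7.982857142857143 s (@22.05kHz) | 0x2DABC = 187068 = 5502x34 bytes = 2751x64 = 176064 stereo samples = 7.984761904761905 s (@22.05kHz) | Mac/demo has 42 extra samples at the end (i.e., the last 21 bytes of the last two blocks) |
SNDDatm_cl05_out; 109 frames = 1.816667 seconds ~= 40057.5 samples (@22.05kHz) | 0x9E7A = 40570 = 39x1024 + 634 bytes = 39x1012 + 622 = 40090 stereo samples = 1.818140589569161 s (@22.05kHz) | 0xA68C = 42636 = 1254x34 bytes = 627x64 = 40128 stereo samples = 1.819863945578231 s (@22.05kHz) | Mac/demo has 38 extra samples at the end (i.e., the last 19 bytes of the last two blocks) |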
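The per-channel sample counts for both formats follow directly from the block layouts described above. The sketch below shows the arithmetic only; the function names are hypothetical, and the values in the usage note are the atm_cl05_in figures (37228 bytes on PC retail, 39100 bytes on Mac/demo).

```python
def ms_adpcm_samples(raw_size: int, block_size: int, channels: int) -> int:
    """Samples per channel in MS ADPCM data whose final block may be short."""
    header = 7 * channels  # 7 header bytes per channel

    def block_samples(nbytes: int) -> int:
        # Two samples come from the header; every remaining byte holds two
        # nibbles, which are shared between the channels.
        return 2 + (nbytes - header) * 2 // channels

    full_blocks, tail = divmod(raw_size, block_size)
    total = full_blocks * block_samples(block_size)
    if tail:
        if tail < header:
            raise ValueError("trailing block is shorter than its header")
        total += block_samples(tail)
    return total


def ima4_samples(raw_size: int, channels: int) -> int:
    """Samples per channel in IMA4 data (34-byte blocks, always stored whole)."""
    blocks, rest = divmod(raw_size, 34)
    if rest or blocks % channels:
        raise ValueError("raw size is not a whole number of IMA4 block groups")
    return blocks // channels * 64  # 32 data bytes = 64 nibbles per block
```

For atm_cl05_in, `ms_adpcm_samples(37228, 1024, 2)` gives 36784 and `ima4_samples(39100, 2)` gives 36800, so the Mac/demo version carries 16 samples of padding per channel.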
By looking at the end of the Mac/demo SNDDs (or the exported AIFF files), it can be confirmed that the extraneous samples are actually there, at the end of the last two 34-byte blocks (last Left block and last Right block). It is not clear how the Mac (or PC demo) engine detects these trailing samples and cuts them off (this is important when playing a sequence, e.g. music or ambient loops, see "looping issues" below).
Possibly Mac/demo Oni just looks at the approximate length of each SNDD in frames (or game ticks, i.e., 1/60th of a second), which is listed in each SNDD's header and, once the announced frame count has been reached for the currently playing sound, starts playback on the next sound in the sequence. Depending on the hardware/software implementation of the audio pipelines, this logic can either interrupt the currently playing sound, or cause a slight overlap/crossfade between the current sound and the next. It is possible that PC retail Oni actually does the same, i.e., segments of a sequence are dispatched to the OS based on the frame count of the previous segment, rather than based on its actual play time (sample count).
Another theoretical possibility is that, in the case of IMA4 ADPCM, an illegal step index (outside the expected 0-88 range) is used to signify the end of the stream. Decoders typically resolve this by forcing an out-of-bounds step index into the 0-88 interval, but perhaps a custom decoder can interrupt the stream instead. However, this would be very non-standard behavior, and would only be feasible if the decompressed audio stream is put together inside Oni's engine, rather than deferred to the OS. Therefore it is more likely that Oni dispatches each segment to the OS based on the frame count; the OS receives the ADPCM-compressed data, decompresses it and plays it back; the overlap/crossfade/interruption of the currently playing segment is handled at OS level.
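One way to test the step-index hypothesis on actual data would be to scan the two-byte preamble of every 34-byte block and list any step index above 88. The sketch below is only a diagnostic aid with hypothetical names. It assumes the preamble is a big-endian 16-bit word, as in QuickTime IMA4, and that the predictor is a signed value whose lower 7 bits are zero.

```python
from typing import List, Tuple


def ima4_preamble(block: bytes) -> Tuple[int, int]:
    """Split a block's 2-byte preamble into (predictor, step index)."""
    word = int.from_bytes(block[:2], "big")  # byte order is an assumption
    predictor = word & 0xFF80                # upper 9 bits, lower 7 cleared
    if predictor >= 0x8000:                  # interpret as signed 16-bit
        predictor -= 0x10000
    step_index = word & 0x7F                 # lower 7 bits
    return predictor, step_index


def blocks_with_illegal_step(raw: bytes) -> List[int]:
    """Return the indices of 34-byte blocks whose step index exceeds 88."""
    return [
        offset // 34
        for offset in range(0, len(raw) - 33, 34)
        if ima4_preamble(raw[offset:offset + 34])[1] > 88
    ]
```

An empty result for the final blocks of a Mac/demo SNDD would count against the illegal-step-index explanation and in favor of the frame-count one.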
Looping issues
As detailed above, ADPCM data is stored in blocks, but the actual sound data does not necessarily end exactly at the end of a block. This is true both for MS ADPCM (PC retail) and IMA4 ADPCM (Mac and PC demo), but is especially noticeable for the comparatively large blocks of MS ADPCM, where the padding can be as large as ~1010 samples, i.e., a ~46-millisecond silence in the case of 22.05kHz (for IMA4, the biggest possible gap is 63 samples, or ~3 milliseconds).
MS ADPCM
Although the final block of an MS ADPCM file is stored in incomplete form (with only the actual samples and no padding), the standard decoding behavior (e.g., in an audio editor) is to automatically add the padding up to the end of the last block of an ADPCM-compressed WAV, creating a silence at the end of the imported audio. This artificial silence can be a problem if one wants to join SNDDs that are supposed to play seamlessly one after another (e.g., a musical or ambient sequence).
The solution is to use some alternative tools, which are more flexible about incomplete MS ADPCM blocks:
- For Sox, padding is disabled by default when joining several files.
- For ffmpeg, padding can be disabled as an optional setting.
IMA ADPCM
In the case of IMA ADPCM, the padding is actually present in the stored audio, so it is impossible (both for OniSplit and for a third-party converter) to automatically trim it down to just the relevant audio data. In fact, just by looking at the Mac/demo SNDD itself, there is no way to tell how many of the trailing samples need to be cut for a truly seamless transition.
There are two solutions - an approximate one and an exact one:
- As an approximation, look up the frame count announced in the SNDD's header (using a hex viewer or an XML dump), divide that by 60 to get the length of the clip in seconds, and multiply by the sample rate 22.05kHz. You will get the number of samples (and the corresponding delay in seconds) that are actually played back by Oni before starting the next sound in the sequence. You can replicate this delay with ffmpeg, Sox, or the audio editor of your choice. A cross-fade between the two clips in the overlapping region should sound best. You can also examine the samples near the approximate transition time, and "manually" determine where the actual samples end and the padding begins.
- As an exact solution, look up the sample count of an equivalent MS ADPCM file from a PC retail version of Oni. Padding should only be a problem for non-localized music and ambients, so the language version shouldn't matter. The electric "zap" sounds (which are sampled at 44.1kHz in PC retail Oni) are also not loopable, so you should be able to find a 22.05kHz sound with an intuitive sample count. It will be somewhat smaller than the raw sample count of the IMA4 ADPCM, because of the padding (see the atm_cl05 example above for a comparison). Once you know the actual sample count, use ffmpeg to convert the .aif file to .wav (either PCM or ADPCM), keeping only the actual samples and trimming out the padding. Then, use Sox or ffmpeg to seamlessly join the .wav files as you would for regular MS ADPCM files (see above).
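Both approaches above reduce to simple arithmetic, sketched here with hypothetical helper names. The approximate method converts the announced frame count to samples at 22.05 kHz; the exact method subtracts the PC retail sample count from the stored IMA4 sample count to get the padding to trim.

```python
from fractions import Fraction

SAMPLE_RATE = 22050      # Hz; all Mac/demo sounds use this rate
TICKS_PER_SECOND = 60    # the SNDD duration is given in 1/60 seconds


def announced_samples(frames: int) -> Fraction:
    """Approximate method: sample count implied by the header's frame count."""
    return Fraction(frames * SAMPLE_RATE, TICKS_PER_SECOND)


def padding_to_trim(ima4_stored: int, retail_actual: int) -> int:
    """Exact method: trailing samples per channel to cut from the IMA4 data."""
    if retail_actual > ima4_stored:
        raise ValueError("PC retail count exceeds the stored IMA4 count")
    return ima4_stored - retail_actual
```

For atm_cl05_in, `announced_samples(100)` is 36750 and `padding_to_trim(36800, 36784)` is 16. A `Fraction` is used so that odd frame counts such as 897 keep their half sample (329647.5) until the caller decides how to round.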
SNDD : Sound Data |