OBD:SNDD: Difference between revisions

From OniGalore
(→‎PCM export: PC demo detection)
(wrapping up the Mac SNDD knowledge; thanks to Ed for the main menu recording)
Line 1: Line 1:
{{OBD_File_Header | type=SNDD | prev=QTNA | next=StNA | name=Sound Data | family=Generic | align=center}}
{{OBD_File_Header | type=SNDD | prev=QTNA | next=StNA | name=Sound Data | family=Generic | align=center}}


 
:''For metadata instances used to group sounds together, randomize them, adjust their volume or frequency, etc, see [[OSBD]] and its subtypes: [[OSAm]], [[OSIm]] and [[OSGr]].''
There are 2 different formats used by the SNDD files. Both versions use a form of [[wp:ADPCM|ADPCM]] compression (Adaptive Differential Pulse-Code Modulation), where 16-bit sound samples are encoded as 4-bit nibbles (resulting roughly in a 4:1 compression ratio as compared to uncompressed 16-bit [[wp:PCM|PCM]]).
SNDD instances are where Oni stores sound data. Sounds can be either mono or stereo waveforms (with sampling frequencies of either 22.05 kHz or 44.1 kHz), and they are typically compressed to save on storage space. Both the PC and Mac versions use a form of [[wp:ADPCM|ADPCM]] compression (Adaptive Differential Pulse-Code Modulation), where 16-bit sound samples are encoded as 4-bit "nibbles" (resulting roughly in a 4:1 compression ratio as compared to uncompressed 16-bit [[wp:PCM|PCM]]).
*On PC (both retail and demo), sounds are encoded using Microsoft's ADPCM algorithm described [https://wiki.multimedia.cx/index.php/Microsoft_ADPCM HERE]. [[wp:FFmpeg|FFmpeg]] lists this codec as '''adpcm_ms'''.  
*On PC (both retail and demo), sounds are encoded using Microsoft's ADPCM algorithm described [https://wiki.multimedia.cx/index.php/Microsoft_ADPCM HERE]. [[wp:FFmpeg|FFmpeg]] lists this codec as '''adpcm_ms'''.  
*On Mac, sounds are encoded using the IMA4 algorithm described [https://wiki.multimedia.cx/index.php/Apple_QuickTime_IMA_ADPCM HERE]. [[wp:FFmpeg|FFmpeg]] lists this codec as '''adpcm_ima_qt'''.
*On Mac, sounds are encoded using the IMA4 algorithm described [https://wiki.multimedia.cx/index.php/Apple_QuickTime_IMA_ADPCM HERE]. [[wp:FFmpeg|FFmpeg]] lists this codec as '''adpcm_ima_qt'''.
The PC demo and Mac SNDDs come short of PC retail SNDDs in the following respects:  
;Key shortcomings of the PC demo and Mac SNDDs as compared to PC retail SNDDs:
*PC demo and Mac SNDDs only support one sample rate (22.05 kHz). PC retail supports arbitrary sample rate, allowing 46 electric spark sounds to use CD-quality 44.1 kHz (for crisper high frequencies).
*PC demo and Mac SNDDs have a "short" .dat part that specifies only the frame count (animation length in game ticks) and number of channels (mono or stereo). The waveform data is always assumed to be sampled at 22.05 kHz, and compressed into 4-bit ADPCM (either IMA4 or MS ADPCM).
The Mac SNDDs come short of PC SNDDs (both retail and demo) in the following respects:  
*PC retail supports arbitrary sample rate, which allows for crisper high frequencies: specifically, 44.1 kHz (CD quality) is used for the 46 electric spark sounds '''ap_hit_shld''' and '''zap##'''. Possibly PC retail supports uncompressed PCM waveforms as well.
*At 22.05 kHz, Mac SNDDs (IMA4) are about 5% larger than PC equivalents (MS ADPCM), because of a much smaller block size in the .raw part (the smaller .dat part of Mac SNDDs doesn't help).
; Key shortcomings of Mac SNDDs as compared to PC SNDDs (both retail and demo):  
*Mac SNDDs have encoding/editing artifacts at the ends of looping segments (music and ambient tracks). Only PC SNDDs (retail or demo) can be pieced together seamlessly. See [[OBD:SNDD#Looping_issues|BELOW]] for details.
*At 22.05 kHz, the storage size of Mac SNDDs (IMA4) is about 5% larger than for PC equivalents (MS ADPCM), because of a much smaller block size in the .raw part (the smaller .dat part of Mac SNDDs doesn't help).
*Mac SNDDs have encoding/editing artifacts at the ends of looping segments (music and ambient tracks). PC SNDDs (retail or demo) have no such artifacts and allow nearly seamless playback. See [[OBD:SNDD#Looping_issues|BELOW]] for details.
==Oni storage==
==Oni storage==
===PC retail===
===PC retail===
Line 33: Line 34:
|}
|}


The .raw data contains the actual audio sample blocks without any other headers (other than block headers).
The .raw data contains the actual audio sample blocks without any other headers (other than ADPCM block headers).


====.raw part (MS ADPCM)====
====.raw part (MS ADPCM)====
Line 63: Line 64:
|}
|}


The .raw data contains the actual audio sample blocks without any other headers (other than block headers).
The .raw data contains the actual audio sample blocks without any other headers (other than ADPCM block headers).
====.raw part (MS ADPCM, PC demo)====
====.raw part (MS ADPCM, PC demo)====
For PC demo the .raw SNDD data is actually the same as for PC retail, but with the same short .dat header as on Mac. The ADPCM block size is 512 bytes for mono, and 1024 for stereo. The sample rate is 22.05 kHz.
For PC demo the .raw SNDD data is actually the same as for PC retail, but with the same short .dat header as on Mac. The ADPCM block size is 512 bytes for mono, and 1024 for stereo. The sample rate is 22.05 kHz.
Line 71: Line 72:
:The first two bytes of each block are used to set the initial predictor (upper 9 bits) and step (lower 7 bits) for decoding the block's samples.
:The first two bytes of each block are used to set the initial predictor (upper 9 bits) and step (lower 7 bits) for decoding the block's samples.
:The other 32 bytes consist of 64 samples stored as nibbles (half-bytes). In the case of stereo, all the nibbles in a block belong to the same channel (either all Left or all Right).
:The other 32 bytes consist of 64 samples stored as nibbles (half-bytes). In the case of stereo, all the nibbles in a block belong to the same channel (either all Left or all Right).
:Unlike for MS ADPCM, incomplete trailing blocks (if any) are not announced in any way: the final blocks are stored in their entirety, with no way to tell how much of it is actual data.
:Unlike for MS ADPCM, incomplete trailing blocks (if any) are not indicated in any way: the final blocks are stored in their entirety, with no way to tell how much of it is actual data.
:For this reason, identical sounds do not have the same sample count on PC (both retail and demo) and on Mac. As an example, here are the stats for some stereo sounds ("atm_cl05" ambient):
:For this reason, identical sounds do not have the same sample count on PC (both retail and demo) and on Mac. As an example, here are the stats for the main menu music:
{{divhide|Comparative table (click to unhide)}}
{{divhide|Main menu music a.k.a. "Oni Trailer"}}
{|
{|
|
|
{{Table}}
{{Table}}
!SNDD name (and frame count)
!SNDD name (and frame count)
!PC
!PC (retail and demo)
!Mac
!Mac
!difference
!difference
|-
|-
|
|
:'''SNDDatm_cl05_in'''
:'''SNDDmus_ot6'''
:100 frames = 1.6667 seconds
:415 frames = 6.9166667 seconds
:~= '''36750''' samples (@ 22.05 kHz)
:~= '''152512.5''' samples (@ 22.05 kHz)
|
|
:0x916C = 37228 = 36x1024 + 364 bytes
:0x25B0C = 154380 = 150x1024 + 780 bytes
:= 36x1012 + 352 = '''36784''' stereo samples
:= 150x1012 + 768 = '''152568''' stereo samples
:= 1.668208616780045 s (@ 22.05 kHz)
:= 6.919183673469388 s (@ 22.05 kHz)
|
|
:0x98BC = 39100 = 1150x34 bytes
:0x27940 = 162112 = 4768x34 bytes
:= 575x64 = '''36800''' stereo samples
:= 2384x64 = '''152576''' stereo samples
:= 1.668934240362812 s (@ 22.05 kHz)
:= 6.919546485260771 s (@ 22.05 kHz)
|
|
:As compared to the PC version of the SNDD,
:As compared to the PC version of the SNDD,
:the Mac version has 16 extra samples at the end
:the Mac version has 8 extra samples at the end
:(i.e., the last 8 bytes of the last two blocks).
:(i.e., the last 4 bytes of the last two blocks).
|-
|-
|
|
:'''SNDDatm_cl05_lp1'''
:'''SNDDmus_ot7'''
:897 frames = 14.95 seconds
:414 frames = 6.9 seconds
:~= '''329647.5''' samples (@ 22.05 kHz)
:~= '''152145''' samples (@ 22.05 kHz)
|
|
:0x5172A = 333610 = 325x1024 + 810 bytes
:0x25A3C = 154172 = 150x1024 + 572 bytes
:= 325x1012 + 798 = '''329698''' stereo samples
:= 150x1012 + 560 = '''152360''' stereo samples
:= 14.95229024943311 s (@ 22.05 kHz)
:= 6.909750566893424 s (@ 22.05 kHz)
|
|
:0x55880 = 350336 = 10304x34 bytes
:0x27874 = 161908 = 4762x34 bytes
:= 5152x64 = '''329728''' stereo samples
:= 2381x64 = '''152384''' stereo samples
:= 14.95365079365079 s (@ 22.05 kHz)
:= 6.910839002267574 s (@ 22.05 kHz)
|
:As compared to the PC version of the SNDD,
:the Mac version has 30 extra samples at the end
:(i.e., the last 15 bytes of the last two blocks)
|-
|
:'''SNDDatm_cl05_lp2'''
:795 frames = 13.25 seconds
:~= '''292162.5''' samples (@ 22.05 kHz)
|
:0x48334 = 295732 = 288x1024 + 820 bytes
:= 288x1012 + 808 = '''292264''' stereo samples
:= 13.25460317460317 s (@ 22.05 kHz)
|
:0x4BD1C = 310556 = 9134x34 bytes
:= 4567x64 = '''292288''' stereo samples
:= 13.25569160997732 s (@ 22.05 kHz)
|
|
:As compared to the PC version of the SNDD,
:As compared to the PC version of the SNDD,
:the Mac version has 24 extra samples at the end  
:the Mac version has 24 extra samples at the end  
:(i.e., the last 12 bytes of the last two blocks)
:(i.e., the last 12 bytes of the last two blocks)
|-
|
:'''SNDDatm_cl05_lp3'''
:428 frames = 7.133333 seconds
:~= '''157290''' samples (@ 22.05 kHz)
|
:0x26F12 = 159506 = 155x1024 + 786 bytes
:= 155x1012 + 774 = '''157634''' stereo samples
:= 7.148934240362812 s (@ 22.05 kHz)
|
:0x28E80 = 167552 = 4928x34 bytes
:= 2464x64 = '''157696''' stereo samples
:= 7.151746031746032 s (@ 22.05 kHz)
|
:As compared to the PC version of the SNDD,
:the Mac version has 62 extra samples at the end
:(i.e., the last 31 bytes of the last two blocks)
|-
|
:'''SNDDatm_cl05_lp4'''
:478 frames = 7.9666667 seconds
:~= '''175665''' samples (@ 22.05 kHz)
|
:0x2B7BE = 178110 = 173x1024 + 958 bytes
:= 173x1012 + 946 = '''176022''' stereo samples
:= 7.982857142857143 s (@ 22.05 kHz)
|
:0x2DABC = 187068 = 5502x34 bytes
:= 2751x64 = '''176064''' stereo samples
:= 7.984761904761905 s (@ 22.05 kHz)
|
:As compared to the PC version of the SNDD,
:the Mac version has 42 extra samples at the end
:(i.e., the last 21 bytes of the last two blocks)
|-
|
:'''SNDDatm_cl05_out'''
:109 frames = 1.816667 seconds
:~= '''40057.5''' samples (@ 22.05 kHz)
|
:0x9E7A = 40570 = 39x1024 + 634 bytes
:= 39x1012 + 622 =  '''40090''' stereo samples
:= 1.818140589569161 s (@ 22.05 kHz)
|
:0xA68C = 42636 = 1254x34 bytes
:= 627x64 = '''40128''' stereo samples
:= 1.819863945578231 s (@ 22.05 kHz)
|
:As compared to the PC version of the SNDD,
:the Mac version has 38 extra samples at the end
:(i.e., the last 19 bytes of the last two blocks)
|}
|}
|}
|}
{{divhide|end}}
{{divhide|end}}
By looking at the end of the Mac SNDDs (or the exported AIFF files), it can be confirmed that the extraneous samples are actually there, at the end of the last two 34-byte blocks (last Left block and last Right block). Unless the Mac engine has a very non-standard implementation of IMA4 ADPCM, there is no way to interrupt playback for the last blocks by cutting off the trailing samples (because they are no different from regular samples). This is one of the aspects that impact seamless playback of sequences (music or ambient tracks), see "looping issues" below.


Possibly Mac Oni just looks at the approximate length of each SNDD in frames (or game ticks, i.e., 1/60th of a second), which is listed in each SNDD's header and, once the announced frame count has been reached for the currently playing sound, starts playback on the next sound in the sequence. Depending on the hardware/software implementation of the audio pipelines, this logic can either interrupt the currently playing sound, or cause a slight overlap/crossfade between the current sound and the next. It is possible that PC retail Oni actually does the same, i.e., segments of a sequence are dispatched to the OS based on the frame count of the previous segment, rather than based on its actual play time (sample count). The early interrupt/crossfade is barely noticeable.
By looking at the end of the Mac SNDDs (or the exported AIFF files), it can be confirmed that the extraneous samples are actually there, at the end of the last two 34-byte blocks (last Left block and last Right block), with no way to interrupt playback upon reaching these trailing samples - because they are no different from regular samples.


Another theoretical possibility is that, in the case of IMA4 ADPCM, an illegal step index (outside the expected 0-88 range) is used to signify the end of the stream. Decoders typically resolve this by forcing out-of-bounds step indices into the 0-88 interval, but perhaps a custom decoder can interrupt the stream instead. However, this would be very non-standard behavior, and would only be feasible if the decompressed audio stream is put together inside Oni's engine, rather than deferred to the OS. Therefore it is more likely that Oni dispatches each segment to the OS based on the frame count; the OS receives the ADPCM-compressed data, decompresses it and plays it back; as for the overlap/crossfade/interruption of the currently playing segment, it is handled at OS level.
Also, from a careful examination of the sound stream that is actually played back by Oni in the main menu, it is clear that all the Oni engines (both Mac and PC) play back all the available data (including the "padding" of the fixed-size IMA4 blocks) before switching to the next segment. The frame count (number of game ticks) is ignored or used only as an indication (e.g., for approximate cueing in [[BSL]]).


This uninterrupted playback of fixed-size IMA4 blocks is one of the aspects that impact seamless playback of sound sequences in Mac Oni (music or ambient tracks). See [[OBD:SNDD#Looping_issues|"Looping Issues"]] below.
;NOTE
:Musically, the two segments of the main menu music correspond to the same duration (four bars of a 4:4 beat). However, somewhat surprisingly, the two segments don't have the same sample count, or even the same frame count (in game ticks), not even when comparing the two sounds on the same platform. That means that, even on PC where the playback is nearly seamless, we are actually hearing musical loops of unequal length, ending about 10 milliseconds early or late, and it still sounds OK.


----
----
==Exporting and importing tips==
==Exporting and importing tips==
To create a wav/aif file one needs to write a file header like below and then write the contents of the raw data part.
To create a wav/aif file one needs to write a file header like below and then write the contents of the raw data part.
Line 281: Line 216:
As detailed above, ADPCM data is stored in blocks, but the actual sound data does not necessarily end exactly at the end of a block. This is true both for MS ADPCM (PC retail or demo) and IMA4 ADPCM (Mac), but is especially noticeable for the comparatively large blocks of MS ADPCM, where the padding can be as large as ~1010 samples, i.e., a ~46-millisecond silence in the case of 22.05 kHz (for IMA4, the biggest possible gap is 63 samples, or ~3 milliseconds).
As detailed above, ADPCM data is stored in blocks, but the actual sound data does not necessarily end exactly at the end of a block. This is true both for MS ADPCM (PC retail or demo) and IMA4 ADPCM (Mac), but is especially noticeable for the comparatively large blocks of MS ADPCM, where the padding can be as large as ~1010 samples, i.e., a ~46-millisecond silence in the case of 22.05 kHz (for IMA4, the biggest possible gap is 63 samples, or ~3 milliseconds).
===MS ADPCM===
===MS ADPCM===
Although the final block of a MS ADPCM SNDD file (PC retail) is stored in incomplete form (with only the actual samples and no padding), the standard decoding behavior (e.g., in an audio editor) is to automatically add the padding up to the end of the last block of an ADPCM-compressed WAV. Depending on the audio-editing tool, this can create a silence or some "bad data" at the end of the imported audio, which can be a problem if one wants to join SNDDs that are supposed to play seamlessly one after another (e.g., a musical or ambient sequence).  
Although the final block of a MS ADPCM SNDD file (PC retail) is stored in incomplete form (with only the actual samples and no padding), the standard decoding behavior when loading an ADPCM-compressed WAV (e.g., in a non-destructive audio program) is to assume full-sized blocks, with padding up to the end of the last block. Depending on the audio program, this can create a silence or some "bad data" at the end of the imported audio, which can be a problem if one wants to join SNDDs that are supposed to play seamlessly one after another (e.g., a musical or ambient sequence).  


As a workaround, one can preprocess .wav files with some tools that are more flexible about incomplete MS ADPCM blocks:
As a workaround, one can preprocess .wav files with some tools that can handle incomplete MS ADPCM blocks and convert to a less ambiguous format:
*For [http://sox.sourceforge.net/ Sox], padding is disabled by default when joining several files.
*For [http://sox.sourceforge.net/ Sox], padding is disabled by default when joining several files.
*For [https://www.ffmpeg.org/download.html ffmpeg], padding can be disabled as an optional setting.
*For [https://www.ffmpeg.org/download.html FFmpeg], padding can be disabled as an optional setting.
So, you'd either join the .wav files in Sox or FFmpeg, or convert them, e.g., to uncompressed PCM, and then import them into a fancy audio tool.


As an actual solution, the .wav file should be made compliant with RIFF WAVE standards, i.e., the last block should be padded to its full size, and a "fact" section should be used to specify the actual number of samples. This is implemented in OniSplit v 0.9.###
As an actual solution, the .wav file should be made compliant with RIFF WAVE standards, i.e., the last block should be padded to its full size, and a "fact" section should be used to specify the actual number of samples. This is implemented in OniSplit v 0.9.###
Line 294: Line 230:
===IMA ADPCM===
===IMA ADPCM===
====Padding====
====Padding====
In the case of IMA ADPCM, the padding is actually present in the stored audio, so it is impossible (both for OniSplit and for a third-party converter) to automatically trim it down to just the relevant audio data. In fact, just by looking at the Mac SNDD itself, there is no way to tell how many of the trailing samples need to be cut for a truly seamless transition.
In the case of IMA ADPCM, the padding is actually present in the stored audio, so it is impossible (both for OniSplit and for a third-party converter) to automatically trim it down to just the relevant audio data. In fact, just by looking at the Mac SNDD itself, there is no way to tell how many of the trailing samples need to be cut for a truly seamless transition (for one thing, the trailing samples are not flat zero).
 
As a workaround/solution, the correct sample count of a Mac SNDD can be looked up in a PC counterpart (always available, since we're only talking of music/ambients/sirens, which are neither localized nor sampled at 44.1 kHz), and then used to trim the .aif file in FFmpeg, while converting to .wav (either PCM or ADPCM). However, it's easier (and more reliable) to just grab a PC retail copy of Oni and extract the MS ADPCM sounds.
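The trimming step could be scripted roughly as follows. This is only a sketch: the helper name `build_trim_command` and the file names are hypothetical, and it assumes an FFmpeg build that provides the `atrim` audio filter with its `end_sample` option and the `pcm_s16le` encoder. The sample count is the one looked up in the PC counterpart.

```python
from typing import List


def build_trim_command(aif_path: str, wav_path: str, sample_count: int) -> List[str]:
    """Build an FFmpeg command line that keeps only the first
    `sample_count` samples per channel of a Mac .aif export and writes
    the result as uncompressed 16-bit PCM .wav.

    Assumption: `sample_count` comes from the equivalent PC SNDD.
    """
    if sample_count <= 0:
        raise ValueError("sample_count must be positive")
    return [
        "ffmpeg",
        "-i", aif_path,
        # atrim's end_sample is the first sample that gets dropped
        "-af", f"atrim=end_sample={sample_count}",
        "-c:a", "pcm_s16le",
        wav_path,
    ]


# Example: trim the first main menu segment to its PC sample count.
command = build_trim_command("SNDDmus_ot6.aif", "mus_ot6.wav", 152568)
```

The resulting list can be passed to `subprocess.run(command, check=True)` on a machine where FFmpeg is installed.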
 


There are two solutions - an approximate one and an exact one:
#As an approximation, look up the frame count announced in the SNDD's header (using a hex viewer or an XML dump), divide that by 60 to get the length of the clip in seconds, and multiply by the sample rate 22.05 kHz. You will get the number of samples (and the corresponding delay in seconds) that are actually played back by Mac Oni before starting the next sound in the sequence. You can then replicate this delay with ffmpeg, Sox, or the audio editor of your choice. A cross-fade between the two clips in the overlapping region should sound best. You can also examine the samples near the approximate transition time, and "manually" determine where the actual samples end and the padding begins.
#As an exact solution, look up the sample count of an equivalent MS ADPCM file from a PC version of Oni. Padding should only be a problem for non-localized music and ambients, so the language version shouldn't matter. The electric "zap" sounds (which are sampled at 44.1 kHz in PC retail Oni) are also not loopable, so you should be able to find a 22.05 kHz sound with a intuitive sample count, similar to the Mac version. It will be somewhat smaller than the raw sample count of the IMA4 ADPCM, because of the padding (see the atm_cl05 example above for a comparison). Once you know the actual sample count, use ffmpeg to convert the .aif file to .wav (either PCM or ADPCM), keeping only the actual samples and trimming out the padding. Then, use Sox or ffmpeg to seamlessly join the .wav files as you would for regular MS ADPCM files (see above).
Of course, you can just grab a PC retail copy of Oni and extract the MS ADPCM sounds directly from there.
====Initial transient====
====Initial transient====
The biggest problem with seamless playback of Mac SNDDs (for music and ambient tracks) is that - even if you figure out the correct length of each segment - the waveform of each next segment builds up from zero over ~7 samples, instead of picking up where the previous segment left. This introduces about 0.3 milliseconds of silence, and an audible discontinuity in the waveform, even if the two segments are lined up properly. The values of those initial samples is not recoverable, so again it is recommended to turn to a PC version of Oni.
The biggest problem with seamless playback of Mac SNDDs (for music and ambient tracks) is that - even if you figure out the correct length of each segment - the waveform of each next segment does not pick up where the previous segment left (or should have left) - instead it builds up from zero over ~7 samples. This introduces about 0.3 milliseconds of silence, and an audible discontinuity in the waveform, even if the two segments are lined up properly.
====Role of frame count on Mac====
 
It has been verified (on the main menu music) that in PC Oni (both demo and retail) SNDDs are looped/chained accurately based on the actual sample count (and not on an approximate number of game ticks). Mac Oni probably ''does'' use game ticks, ending each segment slightly early. This eliminates pops that would be caused by a full playback of the IMA4 padding at the end of a segment, followed by the initial transient of the next segment. Early segment end would also be perceived as either a slight pop or a slightly "rushed" beat, but ingame it would be barely noticeable. If a Mac user can record his/her main menu theme playing for half a minute or so, then the resulting audio would be sufficient to validate or invalidate the "early cut" hypothesis.
The values of those initial samples are not recoverable (unlike the padding at the end of SNDDs, which can be trimmed down). Therefore, if working with sound samples extracted from Oni, it is recommended to turn to a PC version's SNDDs.


For what it's worth, the two segments of the main menu theme, '''SNDDmus_ot6.aif''' and '''SNDDmus_ot7.aif''' do ''not'' have the exact same duration (sample count) - and not even the same frame count (415 frames for '''SNDDmus_ot6''' vs 414 frames for '''SNDDmus_ot7''') -, although musically they're both supposed to consist of four 4:4 bars. So, even for the main menu theme, and even on PC where playback is seamless (no dropped samples), the musical segments themselves can have wrong timing at the end (one tick early or late), and you really can't tell that anything is off:
{{divhide|Main menu theme a.k.a. "Oni Trailer"}}
{|
|
{{Table}}
!SNDD name (and frame count)
!PC
!Mac
!difference
|-
|
:'''SNDDmus_ot6'''
:415 frames = 6.9166667 seconds
:~= '''152512.5''' samples (@ 22.05 kHz)
|
:0x25B0C = 154380 = 150x1024 + 780 bytes
:= 150x1012 + 768 = '''152568''' stereo samples
:= 6.919183673469388 s (@ 22.05 kHz)
|
:0x27940 = 162112 = 4768x34 bytes
:= 2384x64 = '''152576''' stereo samples
:= 6.919546485260771 s (@ 22.05 kHz)
|
:As compared to the PC version of the SNDD,
:the Mac version has 8 extra samples at the end
:(i.e., the last 4 bytes of the last two blocks).
|-
|
:'''SNDDmus_ot7'''
:414 frames = 6.9 seconds
:~= '''152145''' samples (@ 22.05 kHz)
|
:0x25A3C = 154172 = 150x1024 + 572 bytes
:= 150x1012 + 560 = '''152360''' stereo samples
:= 6.909750566893424 s (@ 22.05 kHz)
|
:0x27874 = 161908 = 4762x34 bytes
:= 2381x64 = '''152384''' stereo samples
:= 6.910839002267574 s (@ 22.05 kHz)
|
:As compared to the PC version of the SNDD,
:the Mac version has 24 extra samples at the end
:(i.e., the last 12 bytes of the last two blocks)
|}
|}
{{divhide|end}}


==PCM export and PC demo detection==
==PCM export and PC demo detection==
Line 359: Line 248:
Note that transcoding (between IMA4 and MS ADPCM) and encoding is not implemented at this point. So '''-extract:aif''' will not work on PC SNDDs, '''-extract:wav''' will not work on Mac SNDDs, and '''-create''' will only work on sound files that use the correct codec and sample rate supported by the PC retail/demo or Mac Oni engine.
Note that transcoding (between IMA4 and MS ADPCM) and encoding is not implemented at this point. So '''-extract:aif''' will not work on PC SNDDs, '''-extract:wav''' will not work on Mac SNDDs, and '''-create''' will only work on sound files that use the correct codec and sample rate supported by the PC retail/demo or Mac Oni engine.


Importing SNDDs for PC demo (with short .dat part and MS ADPCM in the .raw part) is also not implemented yet. Last but not least, PC retail apparently supports uncompressed PCM sounds, but they need to be tested in the engine first. Possibly ADPCM encoding/transcoding will be implemented at some point, too.
Importing SNDDs for PC demo (with a short .dat part and MS ADPCM in the .raw part) is also not implemented yet. Last but not least, PC retail apparently supports uncompressed PCM sounds, but they need to be tested in the engine first. Possibly ADPCM encoding/transcoding will be implemented at some point, too.


----
----

Revision as of 18:27, 31 May 2020

ONI BINARY DATA
QTNA << Other file types >> StNA
SNDD : Sound Data
For metadata instances used to group sounds together, randomize them, adjust their volume or frequency, etc, see OSBD and its subtypes: OSAm, OSIm and OSGr.

SNDD instances are where Oni stores sound data. Sounds can be either mono or stereo waveforms (with sampling frequencies of either 22.05 kHz or 44.1 kHz), and they are typically compressed to save on storage space. Both the PC and Mac versions use a form of ADPCM compression (Adaptive Differential Pulse-Code Modulation), where 16-bit sound samples are encoded as 4-bit "nibbles" (resulting roughly in a 4:1 compression ratio as compared to uncompressed 16-bit PCM).

  • On PC (both retail and demo), sounds are encoded using Microsoft's ADPCM algorithm described HERE. FFmpeg lists this codec as adpcm_ms.
  • On Mac, sounds are encoded using the IMA4 algorithm described HERE. FFmpeg lists this codec as adpcm_ima_qt.
Key shortcomings of the PC demo and Mac SNDDs as compared to PC retail SNDDs
  • PC demo and Mac SNDDs have a "short" .dat part that specifies only the frame count (animation length in game ticks) and number of channels (mono or stereo). The waveform data is always assumed to be sampled at 22.05 kHz, and compressed into 4-bit ADPCM (either IMA4 or MS ADPCM).
  • PC retail supports arbitrary sample rate, which allows for crisper high frequencies: specifically, 44.1 kHz (CD quality) is used for the 46 electric spark sounds ap_hit_shld and zap##. Possibly PC retail supports uncompressed PCM waveforms as well.
Key shortcomings of Mac SNDDs as compared to PC SNDDs (both retail and demo)
  • At 22.05 kHz, the storage size of Mac SNDDs (IMA4) is about 5% larger than for PC equivalents (MS ADPCM), because of a much smaller block size in the .raw part (the smaller .dat part of Mac SNDDs doesn't help).
  • Mac SNDDs have encoding/editing artifacts at the ends of looping segments (music and ambient tracks). PC SNDDs (retail or demo) have no such artifacts and allow nearly seamless playback. See BELOW for details.

Oni storage

PC retail

Below is the .dat file part used in the PC retail version.



Offset | Type | Raw Hex | Value | Description
0x00 | res_id | 01 D7 08 00 | 2263 | 02263-comguy_dth2.aif.SNDD
0x04 | lev_id | 01 00 00 06 | 3 | level 3
0x08 | int32 | 08 00 00 00 | 8 | flags:
  1 - (never used in Vanilla Oni)
  2 - (never used in Vanilla Oni)
  4 - (never used in Vanilla Oni)
  8 - ADPCM compressed
0x0C | block[50] | | | wav header (corresponds to the "fmt " section of a RIFF WAVE file)
0x3E | int16 | 37 00 | 55 | duration in 1/60 seconds (game ticks)
0x40 | int32 | 56 28 00 00 | 10326 | size of the part in the raw file in bytes
0x44 | offset | 20 10 59 00 | 0x591020 | at this position starts the part in the raw file
0x48 | char[24] | AD DE | dead | 24 unused bytes (padding)

The .raw data contains the actual audio sample blocks without any other headers (other than ADPCM block headers).
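The PC retail .dat layout above can be read with a few lines of Python. This is a minimal sketch, not OniSplit's implementation: the names `RetailSndd` and `parse_retail_sndd` are made up for illustration, and it assumes the byte string passed in starts at offset 0x00 of the table (the res_id) and that all fields are little-endian, as the raw hex column suggests.

```python
import struct
from typing import NamedTuple


class RetailSndd(NamedTuple):
    flags: int
    wav_header: bytes      # 50 bytes, the "fmt " section of a RIFF WAVE file
    duration_ticks: int    # 1/60 s units
    raw_size: int          # bytes in the .raw file
    raw_offset: int        # position in the .raw file

    @property
    def is_adpcm(self) -> bool:
        return bool(self.flags & 8)


def parse_retail_sndd(instance: bytes) -> RetailSndd:
    """Decode the fields at 0x08..0x47 of a PC retail SNDD instance."""
    # int32 flags, 50-byte block, int16 duration, int32 size, uint32 offset
    flags, wav_header, ticks, size, offset = struct.unpack_from(
        "<i50shiI", instance, 0x08
    )
    return RetailSndd(flags, wav_header, ticks, size, offset)
```

Feeding it the example bytes from the table should give a duration of 55 ticks, a raw size of 10326 bytes and a raw offset of 0x591020.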

.raw part (MS ADPCM)

For a detailed overview of the ADPCM algorithm (if interested), see HERE. For an actual implementation example, see, e.g., FFmpeg.

The MS ADPCM .raw data has 512- or 1024-byte blocks (512 bytes for 22.05 kHz mono, 1024 bytes for 22.05 kHz stereo and 44.1 kHz mono)
Each block consists of a 7- or 14-byte header (7 bytes for mono, 14 bytes for stereo), which includes the block's first two samples.
The remaining 505, 1010 or 1017 bytes of each block consist of nibbles (half-bytes), with left-right interleaving in the case of stereo.
(that's 1010 more samples in the case of 22.05 kHz mono or stereo, and 2034 more samples in the case of 44.1 kHz mono)
Thus the total number of samples per block (including the two in the header) is 1012 for 22.05 kHz (mono or stereo) and 2036 for 44.1 kHz mono.
The final block in the file can be incomplete (the decoder can infer this from the block size and raw data size).
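The block arithmetic above can be turned into a small sample counter. The sketch below is an illustration only (the function name `ms_adpcm_sample_count` is hypothetical); it assumes a 7-byte header per channel and one 4-bit nibble per sample, as described above, and it treats a short final block as holding only real samples.

```python
def ms_adpcm_sample_count(raw_size: int, channels: int, block_size: int) -> int:
    """Number of samples per channel stored in MS ADPCM .raw data.

    raw_size   -- size of the .raw part in bytes
    channels   -- 1 (mono) or 2 (stereo)
    block_size -- 512 or 1024, per the layout described above
    """
    header = 7 * channels                      # includes 2 samples per channel
    full_blocks, tail = divmod(raw_size, block_size)

    def samples_in(block_bytes: int) -> int:
        # two header samples plus two nibbles per remaining byte,
        # shared between the channels
        return 2 + (block_bytes - header) * 2 // channels

    total = full_blocks * samples_in(block_size)
    if tail:
        if tail < header:
            raise ValueError("trailing block is shorter than its header")
        total += samples_in(tail)              # incomplete final block
    return total
```

For example, the 154380-byte stereo SNDDmus_ot6 with 1024-byte blocks gives 150 full blocks plus a 780-byte tail, i.e. 150x1012 + 768 = 152568 samples.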

PC demo and Mac

The Mac version and the PC demo version use a simpler format, with no support for different sample rates (all sounds are sampled at 22050 Hz).


Offset | Type | Raw Hex | Value | Description
0x00 | res_id | 01 D6 08 00 | 2262 | 02262-comguy_dth2.aif.SNDD
0x04 | lev_id | 01 00 00 06 | 3 | level 3
0x08 | int32 | 01 00 00 00 | 1 | flags:
  1 - (unknown; always the same in Vanilla Oni)
  2 - stereo (mono if disabled)
0x0C | int32 | 37 00 00 00 | 55 | duration in 1/60 seconds (game ticks)
0x10 | int32 | 5E 2A 00 00 | 10846 | size of the part in the raw file in bytes
0x14 | offset | 00 B1 01 00 | 0x1B100 | at this position starts the part in the raw file
0x18 | char[8] | AD DE | dead | 8 unused bytes (padding)

The .raw data contains the actual audio sample blocks without any other headers (other than ADPCM block headers).
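The short .dat layout above can be read the same way as the retail one. Again this is only a sketch under assumptions: the names `ShortSndd` and `parse_short_sndd` are invented for illustration, the input is assumed to start at offset 0x00 of the table, and the fields are assumed to be little-endian as the raw hex column suggests.

```python
import struct
from typing import NamedTuple


class ShortSndd(NamedTuple):
    flags: int
    duration_ticks: int    # 1/60 s units
    raw_size: int          # bytes in the .raw file
    raw_offset: int        # position in the .raw file

    @property
    def channels(self) -> int:
        # flag 2 marks stereo; mono otherwise
        return 2 if self.flags & 2 else 1


def parse_short_sndd(instance: bytes) -> ShortSndd:
    """Decode the fields at 0x08..0x17 of a PC demo or Mac SNDD instance."""
    flags, ticks, size, offset = struct.unpack_from("<iiiI", instance, 0x08)
    return ShortSndd(flags, ticks, size, offset)
```

Since this header carries no sample rate or codec field, a caller has to supply 22.05 kHz and the platform's codec (MS ADPCM for PC demo, IMA4 for Mac) itself.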

.raw part (MS ADPCM, PC demo)

For PC demo the .raw SNDD data is actually the same as for PC retail, but with the same short .dat header as on Mac. The ADPCM block size is 512 bytes for mono, and 1024 for stereo. The sample rate is 22.05 kHz.

.raw part (IMA4 ADPCM, Mac)

For an overview of the IMA ADPCM algorithm and IMA4 header (if interested), see HERE

The IMA4 ADPCM data has 34-byte blocks (in the case of stereo, there is an even number of such blocks, because Left and Right blocks are interleaved).
The first two bytes of each block are used to set the initial predictor (upper 9 bits) and step (lower 7 bits) for decoding the block's samples.
The other 32 bytes consist of 64 samples stored as nibbles (half-bytes). In the case of stereo, all the nibbles in a block belong to the same channel (either all Left or all Right).
Unlike for MS ADPCM, incomplete trailing blocks (if any) are not indicated in any way: the final blocks are stored in their entirety, with no way to tell how much of it is actual data.
For this reason, identical sounds do not have the same sample count on PC (both retail and demo) and on Mac. As an example, here are the stats for the main menu music:

By looking at the end of the Mac SNDDs (or the exported AIFF files), it can be confirmed that the extraneous samples are actually there, at the end of the last two 34-byte blocks (last Left block and last Right block), with no way to interrupt playback upon reaching these trailing samples - because they are no different from regular samples.
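
The 34-byte block layout described above can be illustrated with a short sketch. It assumes that the two header bytes form a big-endian 16-bit word and that the predictor is a signed 16-bit value whose lower 7 bits are cleared. The function names are made up for this example. Note that the sample count it returns includes any trailing padding, since the padding cannot be told apart from real data.

```python
import struct
from typing import Tuple

IMA4_BLOCK_SIZE = 34          # 2 header bytes + 32 bytes of nibbles
IMA4_SAMPLES_PER_BLOCK = 64   # 32 bytes * 2 nibbles


def ima4_block_preamble(block: bytes) -> Tuple[int, int]:
    """Return (initial predictor, step index) of one 34-byte IMA4 block."""
    if len(block) != IMA4_BLOCK_SIZE:
        raise ValueError("an IMA4 block is exactly 34 bytes")
    # Assumption: the header word is stored big-endian.
    (word,) = struct.unpack(">H", block[:2])
    predictor = word & 0xFF80          # upper 9 bits
    if predictor >= 0x8000:            # reinterpret as signed 16-bit
        predictor -= 0x10000
    step_index = word & 0x7F           # lower 7 bits
    return predictor, step_index


def ima4_stored_samples(raw_size: int, n_channels: int) -> int:
    """Samples per channel physically stored in the .raw part, padding included."""
    n_blocks, leftover = divmod(raw_size, IMA4_BLOCK_SIZE)
    if leftover or n_blocks % n_channels:
        raise ValueError("not a whole number of IMA4 blocks per channel")
    return n_blocks // n_channels * IMA4_SAMPLES_PER_BLOCK
```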

Also, from a careful examination of the sound stream that is actually played back by Oni in the main menu, it is clear that all the Oni engines (both Mac and PC) play back all the available data (including the "padding" of the fixed-size IMA4 blocks) before switching to the next segment. The frame count (number of game ticks) is ignored or used only as an indication (e.g., for approximate cueing in BSL).

This uninterrupted playback of fixed-size IMA4 blocks is one of the aspects that impact seamless playback of sound sequences in Mac Oni (music or ambient tracks). See "Looping Issues" below.

NOTE
Musically, the two segments of the main menu music correspond to the same duration (four bars in 4/4 time). However, somewhat surprisingly, the two segments don't have the same sample count, or even the same frame count (in game ticks), not even when comparing the two sounds on the same platform. That means that, even on PC where the playback is nearly seamless, we are actually hearing musical loops of unequal length, ending 10 milliseconds early or late, and it still sounds OK.

Exporting and importing tips

To create a wav/aif file one needs to write a file header like below and then write the contents of the raw data part.

WAV files (from PC retail/demo SNDDs)

  • Write "RIFF"
  • add the size of the part in the raw file + 70 bytes
  • write "WAVE"
  • write "fmt "
  • write 50
  • write the wav header (for PC demo, the wav header is not present in the .dat part, and has to be deduced)
  • OPTIONAL/RECOMMENDED: compute the number of samples and add a "fact" section announcing it
  • write "data"
  • add the size of the part in the raw file (OPTIONAL/RECOMMENDED: increase the size if the last sample block is incomplete)
  • add the raw file data (OPTIONAL/RECOMMENDED: add padding to the last sample block if it is incomplete)
  • save it as a wav file.

Sndd wav.gif

Offset Type Raw Hex Value Description
Complete ADPCM wav format header (black outline)
0x00 char[4] 52 49 46 46 RIFF identifier for the "IBM/Microsoft RIFF" standard
0x04 int32 9C 28 00 00 10396 size of the file from 0x08 to the end (= size of the .raw part + 70 bytes)
0x08 char[4] 57 41 56 45 WAVE identifier for the "WAVE" format
0x0C char[4] 66 6D 74 20 "fmt " identifier announcing the following wav format header
0x10 int32 32 00 00 00 50 wave format header size
0x14 block[50]     wav header
0x46 char[4] 64 61 74 61 data identifier announcing the following wav data
0x4A int32 56 28 00 00 10326 size of the following wav data in bytes (= size of the .raw part)
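
The basic layout in the table above (without the optional "fact" section or padding) can be sketched as follows. The function name is hypothetical, and the 50-byte wav header is taken as given (copied from the .dat part for PC retail, or deduced for PC demo).

```python
import struct


def build_adpcm_wav(fmt_header: bytes, raw: bytes) -> bytes:
    """Wrap raw MS ADPCM data in a minimal RIFF/WAVE container (sketch)."""
    if len(fmt_header) != 50:
        raise ValueError("expected the 50-byte ADPCM wav header")
    return b"".join([
        b"RIFF",
        struct.pack("<I", len(raw) + 70),      # file size counted from 0x08
        b"WAVE",
        b"fmt ",
        struct.pack("<I", len(fmt_header)),    # 50
        fmt_header,
        b"data",
        struct.pack("<I", len(raw)),           # size of the .raw part
        raw,
    ])
```

All sizes are little-endian 32-bit integers, as required by RIFF.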

The above is not 100% consistent with the WAVE storage rules, because it allows for a completely arbitrary "data" size. Microsoft ADPCM data is supposed to be stored as a number of fixed-size blocks (in Oni, each block is either 512 bytes for 22.05 kHz mono, or 1024 bytes for 22.05 kHz stereo and 44.1 kHz mono). Thus, according to the standard, the last block - even if incomplete - must be stored in its entirety, and the "data" size must be a multiple of the block size. In the above example, since the format is 22.05 kHz mono, the "data" size should be increased from 10326 to 10752=21x512, and 426 empty bytes should be added as padding, so that there are 21 complete data blocks.

The standard way to deal with incomplete blocks is to specify not just the data size, but the actual number of samples, by adding a "fact" section to the WAVE header, like this:

Offset Type Raw Hex Value Description
Complete ADPCM wav format header
0x00 char[4] 52 49 46 46 RIFF identifier for the "IBM/Microsoft RIFF" standard
0x04 int32 52 2A 00 00 10834 size of the file from 0x08 to the end (= padded "data" size + 82 bytes, i.e., the 70 bytes of the basic layout plus the 12-byte "fact" section)
0x08 char[4] 57 41 56 45 WAVE identifier for the "WAVE" format
0x0C char[4] 66 6D 74 20 "fmt " identifier announcing the following wav format header section
0x10 int32 32 00 00 00 50 wave format header size
0x14 block[50]     wav header
0x46 char[4] 66 61 63 74 fact identifier announcing the following "fact" section
0x4A int32 04 00 00 00 4 size of the following "fact" section in bytes
0x4E int32 B0 4F 00 00 20400 actual number of samples (see below for calculation)
0x52 char[4] 64 61 74 61 data identifier announcing the following wav data
0x56 int32 00 2A 00 00 10752 size of the following wav data in bytes (= size of the .raw part + 426 empty bytes)

The actual number of samples is implied from the actual data size (size of the .raw part) and wav header properties as follows:

  • n_whole_blocks = floor(raw_size/block_size); // EXAMPLE: floor(10326/512) = 20
  • last_block_size = raw_size - n_whole_blocks*block_size; // EXAMPLE: 10326 - 20x512 = 86
  • last_block_samples = (last_block_size - 7*n_channels)*(8/bits_per_sample/n_channels) + 2; // EXAMPLE: (86 - 7)*(8/4) + 2 = 160
  • n_samples = n_whole_blocks*samples_per_block + last_block_samples; // EXAMPLE: 20*1012 + 160 = 20400
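
The calculation above can be written as a small helper that also returns the padded "data" size. This is only a sketch: the function name is made up, and it treats an empty last block (raw size an exact multiple of the block size) as contributing no samples, a case the formulas above do not spell out.

```python
from typing import Tuple


def ms_adpcm_layout(raw_size: int, block_size: int, n_channels: int,
                    bits_per_sample: int = 4) -> Tuple[int, int, int]:
    """Return (n_samples, padded_data_size, pad_bytes) for MS ADPCM data."""
    header_bytes = 7 * n_channels  # per-block header; it carries 2 samples

    def samples_in(n_bytes: int) -> int:
        return (n_bytes - header_bytes) * 8 // (bits_per_sample * n_channels) + 2

    n_whole_blocks, last_block_size = divmod(raw_size, block_size)
    if last_block_size == 0:
        last_block_samples = 0       # no incomplete block at all
    elif last_block_size < header_bytes:
        raise ValueError("trailing block is shorter than its header")
    else:
        last_block_samples = samples_in(last_block_size)

    n_samples = n_whole_blocks * samples_in(block_size) + last_block_samples
    pad_bytes = (block_size - last_block_size) % block_size
    return n_samples, raw_size + pad_bytes, pad_bytes
```

For the example above (10326 bytes, 512-byte blocks, mono) this gives 20400 samples, a padded size of 10752 bytes and 426 padding bytes.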

AIF files (from Mac SNDDs)

  • Write "FORM"
  • add the size of the part in the raw file + 50 bytes
  • write "AIFC"
  • write "COMM"
  • add the aif header (after filling in the number of channels, the sample rate - always 22.05 kHz -, the bits per sample - always 16 - and the number of sample frames/blocks)
  • write "SSND"
  • add the size of the part in the raw file + 8 bytes
  • add 8 zero bytes (custom "offset" and "block size" fields)
  • add the raw file data and save it as an aif file.

Note the Big Endian order

Sndm aif.gif

Offset Type Raw Hex Value Description
Complete aif format header (black outline)
0x00 char[4] 46 4F 52 4D FORM identifier for the "EA IFF 85" standard
0x04 int32 00 00 2A 90 10896 size of the file from 0x08 to the end (= size of the .raw part + 50 bytes)
0x08 char[4] 41 49 46 43 AIFC identifier for the "AIFC" format (compressed aif file)
0x0C char[4] 43 4F 4D 4D COMM identifier announcing the following aif format header
0x10 block[26]     aif header
0x2A char[4] 53 53 4E 44 SSND identifier announcing the following aif data
0x2E int32 00 00 2A 66 10854 size of the file from 0x32 to the end (= size of the .raw part + 8 bytes)
0x32 int32 00 00 00 00 0 offset; determines where the first sample in the data starts; use zero
0x36 int32 00 00 00 00 0 block size; used in conjunction with offset for block-aligning data; use zero
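
The layout in the table above can be sketched in the same way as for WAV, this time with big-endian sizes. The function name is hypothetical, and the 26-byte aif header at 0x10 is taken as an opaque, already filled-in block.

```python
import struct


def build_ima4_aifc(aif_header: bytes, raw: bytes) -> bytes:
    """Wrap raw IMA4 data in an AIFC container, per the table above (sketch)."""
    if len(aif_header) != 26:
        raise ValueError("expected the 26-byte aif header block")
    return b"".join([
        b"FORM",
        struct.pack(">I", len(raw) + 50),   # file size counted from 0x08
        b"AIFC",
        b"COMM",
        aif_header,                          # opaque block at 0x10
        b"SSND",
        struct.pack(">I", len(raw) + 8),     # raw data + offset/block size fields
        struct.pack(">II", 0, 0),            # offset = 0, block size = 0
        raw,
    ])
```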


Looping issues

As detailed above, ADPCM data is stored in blocks, but the actual sound data does not necessarily end exactly at the end of a block. This is true both for MS ADPCM (PC retail or demo) and IMA4 ADPCM (Mac), but is especially noticeable for the comparatively large blocks of MS ADPCM, where the padding can be as large as ~1010 samples, i.e., a ~46-millisecond silence in the case of 22.05 kHz (for IMA4, the biggest possible gap is 63 samples, or ~3 milliseconds).

MS ADPCM

Although the final block of a MS ADPCM SNDD file (PC retail) is stored in incomplete form (with only the actual samples and no padding), the standard decoding behavior when loading an ADPCM-compressed WAV (e.g., in a non-destructive audio program) is to assume full-sized blocks, with padding up to the end of the last block. Depending on the audio program, this can create a silence or some "bad data" at the end of the imported audio, which can be a problem if one wants to join SNDDs that are supposed to play seamlessly one after another (e.g., a musical or ambient sequence).

As a workaround, one can preprocess .wav files with some tools that can handle incomplete MS ADPCM blocks and convert to a less ambiguous format:

  • For Sox, padding is disabled by default when joining several files.
  • For FFmpeg, padding can be disabled as an optional setting.

So, you'd either join the .wav files in Sox or FFmpeg, or convert them, e.g., to uncompressed PCM, and then import them into a fancy audio tool.

As an actual solution, the .wav file should be made compliant with RIFF WAVE standards, i.e., the last block should be padded to its full size, and a "fact" section should be used to specify the actual number of samples. This is implemented in OniSplit v 0.9.###

Slight distortions are sometimes observed near the ends of looping SNDDs (music and ambient tracks). These artifacts were likely caused by Bungie's audio tools, and cannot be undone automatically. Barely noticeable, they can be healed by manually editing audio samples near the seams.


IMA ADPCM

Padding

In the case of IMA ADPCM, the padding is actually present in the stored audio, so it is impossible (both for OniSplit and for a third-party converter) to automatically trim it down to just the relevant audio data. In fact, just by looking at the Mac SNDD itself, there is no way to tell how many of the trailing samples need to be cut for a truly seamless transition (for one thing, the trailing samples are not flat zero).

As a workaround/solution, the correct sample count of a Mac SNDD can be looked up in a PC counterpart (always available, since we're only talking of music/ambients/sirens, which are neither localized nor sampled at 44.1 kHz), and then used to trim the .aif file in FFmpeg, while converting to .wav (either PCM or ADPCM). However, it's easier (and more reliable) to just grab a PC retail copy of Oni and extract the MS ADPCM sounds.
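
As an illustration of the trimming step, the helper below only builds an FFmpeg command line. It assumes FFmpeg's atrim audio filter with its end_sample option, and converts to 16-bit PCM. The file names and the function name are placeholders, and the sample count has to come from the PC counterpart of the sound.

```python
from typing import List


def ffmpeg_trim_command(src_aif: str, dst_wav: str, n_samples: int) -> List[str]:
    """Build an FFmpeg command that keeps only the first n_samples samples."""
    if n_samples <= 0:
        raise ValueError("n_samples must be positive")
    return [
        "ffmpeg",
        "-i", src_aif,
        # atrim drops everything from sample number n_samples onward.
        "-af", f"atrim=end_sample={n_samples}",
        "-c:a", "pcm_s16le",   # uncompressed signed 16-bit PCM
        dst_wav,
    ]
```

The resulting list can be passed to subprocess.run(cmd, check=True) on a machine where FFmpeg is installed.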


Initial transient

The biggest problem with seamless playback of Mac SNDDs (for music and ambient tracks) is that - even if you figure out the correct length of each segment - the waveform of each next segment does not pick up where the previous segment left off (or should have left off) - instead it builds up from zero over ~7 samples. This introduces about 0.3 milliseconds of silence, and an audible discontinuity in the waveform, even if the two segments are lined up properly.

The values of those initial samples are not recoverable (unlike the padding at the end of SNDDs, which can be trimmed down). Therefore, if working with sound samples extracted from Oni, it is recommended to turn to a PC version's SNDDs.


PCM export and PC demo detection

OniSplit v0.9.### implements export to uncompressed PCM (signed 16-bit linear) from both the PC retail and the Mac SNDD format: use -extract:pcm instead of either -extract:wav or -extract:aif. As compared to ADPCM, linear PCM is a much more straightforward format (almost human readable), and makes it easier to analyze artifacts.

Since the only difference between PC demo and Mac is the actual storage format of SNDD files (in the .raw part), and the template checksum is the same, OniSplit has no way of determining which ADPCM algorithm to use, other than by actually scanning and validating the data as IMA4 (or not). Starting with OniSplit v0.9.###, this automatic check is implemented, allowing both -extract:wav and -extract:pcm on PC-demo SNDDs. It is very unlikely that any Mac SNDDs will be falsely identified as MS ADPCM (or, rather, invalidated as IMA4). If it ever happens, do as instructed by the following warning: "PC-demo MS ADPCM detected; use -nodemo flag to treat as IMA4."
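
OniSplit's actual validation is not documented here. Purely as an illustration, one plausible heuristic is to check that the data splits into whole 34-byte blocks and that every block's step index (the lower 7 bits of its header) is within the IMA step table range of 0 to 88. The function name and the exact criteria below are assumptions, not OniSplit's code.

```python
def looks_like_ima4(raw: bytes, stereo: bool) -> bool:
    """Heuristic check: could this .raw part be IMA4 data? (hypothetical)"""
    n_blocks, leftover = divmod(len(raw), 34)
    if n_blocks == 0 or leftover:
        return False              # IMA4 data is a whole number of 34-byte blocks
    if stereo and n_blocks % 2:
        return False              # Left and Right blocks come in pairs
    for start in range(0, len(raw), 34):
        # Lower 7 bits of the 2-byte header sit in the second byte.
        step_index = raw[start + 1] & 0x7F
        if step_index > 88:       # the IMA step table has 89 entries
            return False
    return True
```

MS ADPCM data will usually fail one of these checks, but a false positive is possible in principle, which is where an override such as the -nodemo flag comes in.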

Note that transcoding (between IMA4 and MS ADPCM) and encoding are not implemented at this point. So -extract:aif will not work on PC SNDDs, -extract:wav will not work on Mac SNDDs, and -create will only work on sound files that use the correct codec and sample rate supported by the PC retail/demo or Mac Oni engine.

Importing SNDDs for PC demo (with a short .dat part and MS ADPCM in the .raw part) is also not implemented yet. Last but not least, PC retail apparently supports uncompressed PCM sounds, but they need to be tested in the engine first. Possibly ADPCM encoding/transcoding will be implemented at some point, too.


