(→PCM export: PC demo detection) |
(wrapping up the Mac SNDD knowledge; thanks to Ed for the main menu recording) |
||
Line 1: | Line 1: | ||
{{OBD_File_Header | type=SNDD | prev=QTNA | next=StNA | name=Sound Data | family=Generic | align=center}} | {{OBD_File_Header | type=SNDD | prev=QTNA | next=StNA | name=Sound Data | family=Generic | align=center}} | ||
:''For metadata instances used to group sounds together, randomize them, adjust their volume or frequency, etc, see [[OSBD]] and its subtypes: [[OSAm]], [[OSIm]] and [[OSGr]].'' | |||
SNDD instances are where Oni stores sound data. Sounds can be either mono or stereo waveforms (with sampling frequencies of either 22.05 kHz or 44.1 kHz), and they are typically compressed to save on storage space. Both the PC and Mac versions use a form of [[wp:ADPCM|ADPCM]] compression (Adaptive Differential Pulse-Code Modulation), where 16-bit sound samples are encoded as 4-bit "nibbles" (resulting roughly in a 4:1 compression ratio as compared to uncompressed 16-bit [[wp:PCM|PCM]]). | |||
*On PC (both retail and demo), sounds are encoded using Microsoft's ADPCM algorithm described [https://wiki.multimedia.cx/index.php/Microsoft_ADPCM HERE]. [[wp:FFmpeg|FFmpeg]] lists this codec as '''adpcm_ms'''. | *On PC (both retail and demo), sounds are encoded using Microsoft's ADPCM algorithm described [https://wiki.multimedia.cx/index.php/Microsoft_ADPCM HERE]. [[wp:FFmpeg|FFmpeg]] lists this codec as '''adpcm_ms'''. | ||
*On Mac, sounds are encoded using the IMA4 algorithm described [https://wiki.multimedia.cx/index.php/Apple_QuickTime_IMA_ADPCM HERE]. [[wp:FFmpeg|FFmpeg]] lists this codec as '''adpcm_ima_qt'''. | *On Mac, sounds are encoded using the IMA4 algorithm described [https://wiki.multimedia.cx/index.php/Apple_QuickTime_IMA_ADPCM HERE]. [[wp:FFmpeg|FFmpeg]] lists this codec as '''adpcm_ima_qt'''. | ||
;Key shortcomings of the PC demo and Mac SNDDs as compared to PC retail SNDDs: | |||
*PC demo and Mac SNDDs only | *PC demo and Mac SNDDs have a "short" .dat part that specifies only the frame count (animation length in game ticks) and number of channels (mono or stereo). The waveform data is always assumed to be sampled at 22.05 kHz, and compressed into 4-bit ADPCM (either IMA4 or MS ADPCM). | ||
*PC retail supports arbitrary sample rates, which allows for crisper high frequencies: specifically, 44.1 kHz (CD quality) is used for the 46 electric spark sounds '''ap_hit_shld''' and '''zap##'''. Possibly PC retail supports uncompressed PCM waveforms as well. | |||
*At 22.05 kHz, Mac SNDDs (IMA4) | ; Key shortcomings of Mac SNDDs as compared to PC SNDDs (both retail and demo): | ||
*Mac SNDDs have encoding/editing artifacts at the ends of looping segments (music and ambient tracks). | *At 22.05 kHz, the storage size of Mac SNDDs (IMA4) is about 5% larger than for PC equivalents (MS ADPCM), because of a much smaller block size in the .raw part (the smaller .dat part of Mac SNDDs doesn't help). | ||
*Mac SNDDs have encoding/editing artifacts at the ends of looping segments (music and ambient tracks). PC SNDDs (retail or demo) have no such artifacts and allow nearly seamless playback. See [[OBD:SNDD#Looping_issues|BELOW]] for details. | |||
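The "about 5% larger" figure can be checked from the block sizes given elsewhere on this page: a stereo MS ADPCM block is 1024 bytes and holds 1012 stereo samples, whereas IMA4 needs two 34-byte blocks (one per channel) for 64 stereo samples. The following is only an illustrative sketch; the constant and function names are made up for this example.

```python
# Per-sample storage cost of the two codecs for stereo sound at the same rate.
# Block figures are taken from the tables on this page; names are illustrative.

MS_STEREO_BLOCK_BYTES = 1024    # one MS ADPCM stereo block
MS_SAMPLES_PER_BLOCK = 1012     # stereo samples decoded from one such block

IMA4_BLOCK_BYTES = 34           # one IMA4 block (single channel)
IMA4_SAMPLES_PER_BLOCK = 64     # samples decoded from one such block


def ms_bytes_per_stereo_sample() -> float:
    return MS_STEREO_BLOCK_BYTES / MS_SAMPLES_PER_BLOCK


def ima4_bytes_per_stereo_sample() -> float:
    # Two blocks (Left + Right) are needed for 64 stereo samples.
    return 2 * IMA4_BLOCK_BYTES / IMA4_SAMPLES_PER_BLOCK


def ima4_overhead() -> float:
    """Relative size increase of IMA4 over MS ADPCM (0.05 means 5%)."""
    return ima4_bytes_per_stereo_sample() / ms_bytes_per_stereo_sample() - 1.0


if __name__ == "__main__":
    print(f"IMA4 overhead: {ima4_overhead():.2%}")
```

The result agrees with the .raw sizes listed for the main menu music (162112 bytes on Mac versus 154380 bytes on PC for '''SNDDmus_ot6''').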
==Oni storage== | ==Oni storage== | ||
===PC retail=== | ===PC retail=== | ||
Line 33: | Line 34: | ||
|} | |} | ||
The .raw data contains the actual audio sample blocks without any other headers (other than block headers). | The .raw data contains the actual audio sample blocks without any other headers (other than ADPCM block headers). | ||
====.raw part (MS ADPCM)==== | ====.raw part (MS ADPCM)==== | ||
Line 63: | Line 64: | ||
|} | |} | ||
The .raw data contains the actual audio sample blocks without any other headers (other than block headers). | The .raw data contains the actual audio sample blocks without any other headers (other than ADPCM block headers). | ||
====.raw part (MS ADPCM, PC demo)==== | ====.raw part (MS ADPCM, PC demo)==== | ||
For PC demo the .raw SNDD data is actually the same as for PC retail, but with the same short .dat header as on Mac. The ADPCM block size is 512 bytes for mono, and 1024 for stereo. The sample rate is 22.05 kHz. | For PC demo the .raw SNDD data is actually the same as for PC retail, but with the same short .dat header as on Mac. The ADPCM block size is 512 bytes for mono, and 1024 for stereo. The sample rate is 22.05 kHz. | ||
Line 71: | Line 72: | ||
:The first two bytes of each block are used to set the initial predictor (upper 9 bits) and step (lower 7 bits) for decoding the block's samples. | :The first two bytes of each block are used to set the initial predictor (upper 9 bits) and step (lower 7 bits) for decoding the block's samples. | ||
:The other 32 bytes consist of 64 samples stored as nibbles (half-bytes). In the case of stereo, all the nibbles in a block belong to the same channel (either all Left or all Right). | :The other 32 bytes consist of 64 samples stored as nibbles (half-bytes). In the case of stereo, all the nibbles in a block belong to the same channel (either all Left or all Right). | ||
:Unlike for MS ADPCM, incomplete trailing blocks (if any) are not | :Unlike for MS ADPCM, incomplete trailing blocks (if any) are not indicated in any way: the final blocks are stored in their entirety, with no way to tell how much of them is actual data. | ||
:For this reason, identical sounds do not have the same sample count on PC (both retail and demo) and on Mac. As an example, here are the stats for | :For this reason, identical sounds do not have the same sample count on PC (both retail and demo) and on Mac. As an example, here are the stats for the main menu music: | ||
{{divhide| | {{divhide|Main menu music a.k.a. "Oni Trailer"}} | ||
{| | {| | ||
| | | | ||
{{Table}} | {{Table}} | ||
!SNDD name (and frame count) | !SNDD name (and frame count) | ||
!PC | !PC (retail and demo) | ||
!Mac | !Mac | ||
!difference | !difference | ||
|- | |- | ||
| | | | ||
:''' | :'''SNDDmus_ot6''' | ||
: | :415 frames = 6.9166667 seconds | ||
:~= ''' | :~= '''152512.5''' samples (@ 22.05 kHz) | ||
| | | | ||
: | :0x25B0C = 154380 = 150x1024 + 780 bytes | ||
:= | := 150x1012 + 768 = '''152568''' stereo samples | ||
:= | := 6.919183673469388 s (@ 22.05 kHz) | ||
| | | | ||
: | :0x27940 = 162112 = 4768x34 bytes | ||
:= | := 2384x64 = '''152576''' stereo samples | ||
:= | := 6.919546485260771 s (@ 22.05 kHz) | ||
| | | | ||
:As compared to the PC version of the SNDD, | :As compared to the PC version of the SNDD, | ||
:the Mac version has | :the Mac version has 8 extra samples at the end | ||
:(i.e., the last | :(i.e., the last 4 bytes of the last two blocks). | ||
|- | |- | ||
| | | | ||
:''' | :'''SNDDmus_ot7''' | ||
: | :414 frames = 6.9 seconds | ||
:~= ''' | :~= '''152145''' samples (@ 22.05 kHz) | ||
| | | | ||
: | :0x25A3C = 154172 = 150x1024 + 572 bytes | ||
:= | := 150x1012 + 560 = '''152360''' stereo samples | ||
:= | := 6.909750566893424 s (@ 22.05 kHz) | ||
| | | | ||
: | :0x27874 = 161908 = 4762x34 bytes | ||
:= | := 2381x64 = '''152384''' stereo samples | ||
:= | := 6.910839002267574 s (@ 22.05 kHz) | ||
| | | | ||
:As compared to the PC version of the SNDD, | :As compared to the PC version of the SNDD, | ||
:the Mac version has 24 extra samples at the end | :the Mac version has 24 extra samples at the end | ||
:(i.e., the last 12 bytes of the last two blocks) | :(i.e., the last 12 bytes of the last two blocks) | ||
|} | |} | ||
|} | |} | ||
{{divhide|end}} | {{divhide|end}} | ||
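The sample counts in the table above can be reproduced from the .raw sizes alone. The sketch below assumes stereo data and the block layouts described on this page: 1024-byte MS ADPCM blocks yielding 1012 stereo samples each (with a shorter final block), and 34-byte IMA4 blocks yielding 64 samples each, alternating between channels. The function names are hypothetical and not part of any tool.

```python
# Stereo sample counts derived from the size of the .raw part.

MS_BLOCK = 1024        # bytes per full stereo MS ADPCM block
MS_OVERHEAD = 12       # 1024 bytes -> 1012 samples, i.e. 12 bytes carry no extra sample
MS_HEADER = 14         # smallest possible stereo block (header only, 2 samples)
IMA4_BLOCK = 34        # bytes per IMA4 block (one channel)
IMA4_SAMPLES = 64      # samples per IMA4 block


def ms_adpcm_stereo_samples(raw_size: int) -> int:
    """Sample count for PC data, where the last block may be stored incomplete."""
    full, rest = divmod(raw_size, MS_BLOCK)
    samples = full * (MS_BLOCK - MS_OVERHEAD)
    if rest:
        if rest < MS_HEADER:
            raise ValueError("trailing block is shorter than a block header")
        samples += rest - MS_OVERHEAD
    return samples


def ima4_stereo_samples(raw_size: int) -> int:
    """Sample count for Mac data, where every block is stored in full."""
    blocks, rest = divmod(raw_size, IMA4_BLOCK)
    if rest or blocks % 2:
        raise ValueError("expected an even number of whole 34-byte blocks")
    return (blocks // 2) * IMA4_SAMPLES


if __name__ == "__main__":
    for name, pc, mac in (("SNDDmus_ot6", 0x25B0C, 0x27940),
                          ("SNDDmus_ot7", 0x25A3C, 0x27874)):
        pc_n, mac_n = ms_adpcm_stereo_samples(pc), ima4_stereo_samples(mac)
        print(name, pc_n, mac_n, mac_n - pc_n)
```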
By looking at the end of the Mac SNDDs (or the exported AIFF files), it can be confirmed that the extraneous samples are actually there, at the end of the last two 34-byte blocks (last Left block and last Right block), with no way to interrupt playback upon reaching these trailing samples - because they are no different from regular samples. | |||
Also, from a careful examination of the sound stream that is actually played back by Oni in the main menu, it is clear that all the Oni engines (both Mac and PC) play back all the available data (including the "padding" of the fixed-size IMA4 blocks) before switching to the next segment. The frame count (number of game ticks) is ignored or used only as an indication (e.g., for approximate cueing in [[BSL]]). | |||
This uninterrupted playback of fixed-size IMA4 blocks is one of the aspects that impact seamless playback of sound sequences in Mac Oni (music or ambient tracks). See [[OBD:SNDD#Looping_issues|"Looping Issues"]] below. | |||
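To inspect the trailing blocks by hand, the 34-byte block layout described earlier (a 2-byte header holding a 9-bit predictor and a 7-bit step, followed by 32 bytes of nibbles) can be parsed roughly as follows. This is a sketch under assumptions that this page does not state explicitly: the header is read as a big-endian 16-bit word, the predictor is treated as a signed 16-bit value whose low 7 bits are zero (as in the usual QuickTime IMA4 layout), and stereo blocks are assumed to alternate Left, Right. All function names are made up for illustration.

```python
import struct

IMA4_BLOCK = 34  # bytes: 2-byte header + 32 bytes of 4-bit samples


def parse_ima4_block(block: bytes):
    """Return (predictor, step_index, nibble_bytes) for one 34-byte block."""
    if len(block) != IMA4_BLOCK:
        raise ValueError("an IMA4 block must be exactly 34 bytes")
    (header,) = struct.unpack(">H", block[:2])   # assumed big-endian
    predictor = header & 0xFF80                  # upper 9 bits
    if predictor >= 0x8000:                      # interpret as signed 16-bit
        predictor -= 0x10000
    step_index = header & 0x7F                   # lower 7 bits
    return predictor, step_index, block[2:]


def split_stereo_blocks(raw: bytes):
    """Split stereo .raw data into (left_blocks, right_blocks).

    Assumes blocks alternate Left, Right, starting with Left.
    """
    if len(raw) % (2 * IMA4_BLOCK):
        raise ValueError("expected an even number of whole 34-byte blocks")
    blocks = [raw[i:i + IMA4_BLOCK] for i in range(0, len(raw), IMA4_BLOCK)]
    return blocks[0::2], blocks[1::2]
```

With these helpers, the last element of each returned list is the block whose final bytes contain the extraneous samples discussed above.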
;NOTE | |||
:Musically, the two segments of the main menu music correspond to the same duration (four bars in 4/4 time). However, somewhat surprisingly, the two segments have neither the same sample count nor the same frame count (in game ticks), even when comparing the two sounds on the same platform. That means that, even on PC where the playback is nearly seamless, we are actually hearing musical loops of unequal length, ending about 10 milliseconds early or late, and it still sounds OK.
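For reference, the length mismatch between the two segments follows directly from the sample counts in the table above. The snippet is just the arithmetic spelled out, with illustrative names.

```python
SAMPLE_RATE = 22050  # Hz

# Stereo sample counts of SNDDmus_ot6 and SNDDmus_ot7, from the table above.
PC_SAMPLES = (152568, 152360)
MAC_SAMPLES = (152576, 152384)


def mismatch_ms(counts) -> float:
    """Difference in duration between the two segments, in milliseconds."""
    first, second = counts
    return (first - second) / SAMPLE_RATE * 1000.0


if __name__ == "__main__":
    print(f"PC:  {mismatch_ms(PC_SAMPLES):.2f} ms")
    print(f"Mac: {mismatch_ms(MAC_SAMPLES):.2f} ms")
```

On PC the first segment is 208 samples longer than the second, which is where the figure of roughly 10 milliseconds comes from.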
---- | ---- | ||
==Exporting and importing tips== | ==Exporting and importing tips== | ||
To create a wav/aif file one needs to write a file header like below and then write the contents of the raw data part. | To create a wav/aif file one needs to write a file header like below and then write the contents of the raw data part. | ||
Line 281: | Line 216: | ||
As detailed above, ADPCM data is stored in blocks, but the actual sound data does not necessarily end exactly at the end of a block. This is true both for MS ADPCM (PC retail or demo) and IMA4 ADPCM (Mac), but is especially noticeable for the comparatively large blocks of MS ADPCM, where the padding can be as large as ~1010 samples, i.e., a ~46-millisecond silence in the case of 22.05 kHz (for IMA4, the biggest possible gap is 63 samples, or ~3 milliseconds). | As detailed above, ADPCM data is stored in blocks, but the actual sound data does not necessarily end exactly at the end of a block. This is true both for MS ADPCM (PC retail or demo) and IMA4 ADPCM (Mac), but is especially noticeable for the comparatively large blocks of MS ADPCM, where the padding can be as large as ~1010 samples, i.e., a ~46-millisecond silence in the case of 22.05 kHz (for IMA4, the biggest possible gap is 63 samples, or ~3 milliseconds). | ||
===MS ADPCM=== | ===MS ADPCM=== | ||
Although the final block of a MS ADPCM SNDD file (PC retail) is stored in incomplete form (with only the actual samples and no padding), the standard decoding behavior (e.g., in | Although the final block of a MS ADPCM SNDD file (PC retail) is stored in incomplete form (with only the actual samples and no padding), the standard decoding behavior when loading an ADPCM-compressed WAV (e.g., in a non-destructive audio program) is to assume full-sized blocks, with padding up to the end of the last block. Depending on the audio program, this can create a silence or some "bad data" at the end of the imported audio, which can be a problem if one wants to join SNDDs that are supposed to play seamlessly one after another (e.g., a musical or ambient sequence). | ||
As a workaround, one can preprocess .wav files with some tools that | As a workaround, one can preprocess .wav files with some tools that can handle incomplete MS ADPCM blocks and convert to a less ambiguous format: | ||
*For [http://sox.sourceforge.net/ Sox], padding is disabled by default when joining several files. | *For [http://sox.sourceforge.net/ Sox], padding is disabled by default when joining several files. | ||
*For [https://www.ffmpeg.org/download.html | *For [https://www.ffmpeg.org/download.html FFmpeg], padding can be disabled as an optional setting. | ||
So, you'd either join the .wav files in Sox or FFmpeg, or convert them, e.g., to uncompressed PCM, and then import them into a fancy audio tool. | |||
As an actual solution, the .wav file should be made compliant with RIFF WAVE standards, i.e., the last block should be padded to its full size, and a "fact" section should be used to specify the actual number of samples. This is implemented in OniSplit v 0.9.### | As an actual solution, the .wav file should be made compliant with RIFF WAVE standards, i.e., the last block should be padded to its full size, and a "fact" section should be used to specify the actual number of samples. This is implemented in OniSplit v 0.9.### | ||
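A minimal sketch of the two ingredients mentioned above, namely padding the final block to its full size and building a "fact" chunk that carries the true sample count. This is not OniSplit's actual code: the helper names are hypothetical, and filling the padding with zero bytes is an assumption (a decoder that honors the "fact" chunk discards the padded samples anyway).

```python
import struct


def pad_last_block(adpcm_data: bytes, block_align: int) -> bytes:
    """Pad ADPCM data so its length is a whole number of blocks.

    block_align is 512 for mono and 1024 for stereo Oni sounds.
    Zero fill is an assumption made for this sketch.
    """
    rest = len(adpcm_data) % block_align
    if rest:
        adpcm_data += b"\x00" * (block_align - rest)
    return adpcm_data


def fact_chunk(sample_count: int) -> bytes:
    """Build a RIFF "fact" chunk: 4-byte ID, 4-byte size, 4-byte sample count.

    All integers in RIFF WAVE are little-endian; sample_count is the number
    of samples per channel.
    """
    return b"fact" + struct.pack("<II", 4, sample_count)
```

For example, the 154380-byte stereo .raw part of '''SNDDmus_ot6''' would be padded to 151 full blocks (154624 bytes), with a "fact" chunk giving 152568 samples.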
Line 294: | Line 230: | ||
===IMA ADPCM=== | ===IMA ADPCM=== | ||
====Padding==== | ====Padding==== | ||
In the case of IMA ADPCM, the padding is actually present in the stored audio, so it is impossible (both for OniSplit and for a third-party converter) to automatically trim it down to just the relevant audio data. In fact, just by looking at the Mac SNDD itself, there is no way to tell how many of the trailing samples need to be cut for a truly seamless transition. | In the case of IMA ADPCM, the padding is actually present in the stored audio, so it is impossible (both for OniSplit and for a third-party converter) to automatically trim it down to just the relevant audio data. In fact, just by looking at the Mac SNDD itself, there is no way to tell how many of the trailing samples need to be cut for a truly seamless transition (for one thing, the trailing samples are not flat zero). | ||
As a workaround/solution, the correct sample count of a Mac SNDD can be looked up in a PC counterpart (always available, since we're only talking of music/ambients/sirens, which are neither localized nor sampled at 44.1 kHz), and then used to trim the .aif file in FFmpeg, while converting to .wav (either PCM or ADPCM). However, it's easier (and more reliable) to just grab a PC retail copy of Oni and extract the MS ADPCM sounds. | |||
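The trimming step can be sketched as follows, using FFmpeg's '''atrim''' audio filter with its '''end_sample''' option (the index of the first sample to drop). The file names and the helper function are hypothetical; the sample count is the PC value for '''SNDDmus_ot6''' from the table above, and the output is written as uncompressed 16-bit PCM.

```python
import subprocess


def build_trim_command(src_aif: str, dst_wav: str, sample_count: int) -> list:
    """Return an ffmpeg command line that keeps only the first sample_count samples."""
    return [
        "ffmpeg",
        "-i", src_aif,
        "-af", f"atrim=end_sample={sample_count}",
        "-c:a", "pcm_s16le",   # uncompressed 16-bit PCM output
        dst_wav,
    ]


if __name__ == "__main__":
    # Hypothetical file names; 152568 is the sample count of the PC counterpart.
    cmd = build_trim_command("SNDDmus_ot6.aif", "SNDDmus_ot6.wav", 152568)
    subprocess.run(cmd, check=True)
```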
====Initial transient==== | ====Initial transient==== | ||
The biggest problem with seamless playback of Mac SNDDs (for music and ambient tracks) is that - even if you figure out the correct length of each segment - the waveform of each next segment builds up from zero over ~7 samples | The biggest problem with seamless playback of Mac SNDDs (for music and ambient tracks) is that - even if you figure out the correct length of each segment - the waveform of each next segment does not pick up where the previous segment left off (or should have left off) - instead it builds up from zero over ~7 samples. This introduces about 0.3 milliseconds of silence, and an audible discontinuity in the waveform, even if the two segments are lined up properly. | ||
The values of those initial samples are not recoverable (unlike the padding at the end of SNDDs, which can be trimmed down). Therefore, if working with sound samples extracted from Oni, it is recommended to turn to a PC version's SNDDs.
==PCM export and PC demo detection== | ==PCM export and PC demo detection== | ||
Line 359: | Line 248: | ||
Note that transcoding (between IMA4 and MS ADPCM) and encoding are not implemented at this point. So '''-extract:aif''' will not work on PC SNDDs, '''-extract:wav''' will not work on Mac SNDDs, and '''-create''' will only work on sound files that use the correct codec and sample rate supported by the PC retail/demo or Mac Oni engine. | Note that transcoding (between IMA4 and MS ADPCM) and encoding are not implemented at this point. So '''-extract:aif''' will not work on PC SNDDs, '''-extract:wav''' will not work on Mac SNDDs, and '''-create''' will only work on sound files that use the correct codec and sample rate supported by the PC retail/demo or Mac Oni engine. | ||
Importing SNDDs for PC demo (with short .dat part and MS ADPCM in the .raw part) is also not implemented yet. Last but not least, PC retail apparently supports uncompressed PCM sounds, but they need to be tested in the engine first. Possibly ADPCM encoding/transcoding will be implemented at some point, too. | Importing SNDDs for PC demo (with a short .dat part and MS ADPCM in the .raw part) is also not implemented yet. Last but not least, PC retail apparently supports uncompressed PCM sounds, but they need to be tested in the engine first. Possibly ADPCM encoding/transcoding will be implemented at some point, too. | ||
---- | ---- |