(→Header: wording) |
(added note about unused field in template descriptor) |
||
(37 intermediate revisions by 2 users not shown) | |||
Line 1: | Line 1: | ||
{{UpdatedForOniX|1.0.0}} | |||
{{OBD Home}} | {{OBD Home}} | ||
{{Hatnote|".dat" redirects here; for other files ending in ".dat", see [[Oni (folder)]].<br> | |||
:''You should read the [[ | :''You should read the [[Game data terminology]] page before this one.''<br> | ||
Files in GameDataFolder/ named "level[0-19]_Final.dat", together with ".raw" and sometimes ".sep" counterparts, contain the game data for Oni | :''The [[Raw|Raw and separate file formats]] page should be read after this one.''}} | ||
Files in GameDataFolder/ named "level[0-19]_Final.dat", together with ".raw" and sometimes ".sep" counterparts, contain the game data for Oni. | |||
The same format was used for the | The same format was used for the tool files, named level0_Tools.dat/.raw/.sep, however the retail Oni game application does not load tool files; for the story behind the tool files, see [[level0_Tools]]. | ||
The level 0 files do not actually contain a level, but instances (resources) shared across all levels. Level 0 is loaded when the game starts, and never unloaded. All other level files, 1-19, are only loaded when their corresponding level starts, and unloaded when it ends. Since Oni can only hold these two levels in memory concurrently, resources have to be duplicated on disk whenever a character class, sound effect, etc. occurs in more than one level. For instance, although there are only 2,380 unique sounds in the game, there are 7,386 sounds stored across all level data files. | |||
{{TOClimit}} | |||
==Backwards and garbage data== | ==Backwards and garbage data== | ||
During development, Oni had an [[level0_Tools|in-game editor]] which presented a GUI for manipulating AIs, particles, etc. in a level. When a developer saved his work, the contents of the level, stored in RAM, were | During development, Oni had an [[level0_Tools|in-game editor]] which presented a GUI for manipulating AIs, particles, etc. in a level. When a developer saved his work, the contents of the level, stored in his PC's RAM, were flushed directly to disk. Thus the structure of the .dat/.raw/.sep files reflects the way in which Bungie West chose to store levels in memory. So when we read the data in the files with a hex editor, we can see eccentricities such as blank space (coming from unused fields and byte-alignment padding) and garbage data (such as now-meaningless pointer values). [[OBD:Raw and separate file formats#Gaps|Further gaps]], mostly representing orphaned obsolete resources, add up to about 25 MB for the whole game. | ||
Additionally, because the levels were built on Intel-based machines, which use a little-endian architecture, sequences of bytes which represent numbers were written from least-significant to most-significant byte. [[ | Additionally, because the levels were built on Intel-based machines, which use a little-endian architecture, sequences of bytes which represent numbers were written from least-significant to most-significant byte. [[wp:FourCC|FourCCs]] in the data are stored "backwards", such as "13RV" which is meant to be read "VR31", because Bungie defined those four bytes as a 32-bit integer, not a string, causing them to be written to disk in little-endian order. | ||
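As a quick illustration of the "backwards" FourCCs described above, here is a minimal Python sketch. The helper name <code>decode_fourcc</code> is our own invention, not something from Oni or OniSplit; it simply reads the four bytes as a little-endian 32-bit integer and then spells that integer out most-significant byte first.

```python
import struct

def decode_fourcc(raw: bytes) -> str:
    """Read 4 bytes as a little-endian 32-bit integer, then return the
    characters in reading order (most significant byte first)."""
    (value,) = struct.unpack("<I", raw)
    return value.to_bytes(4, "big").decode("ascii")

# The bytes 31 33 52 56 appear as "13RV" in a hex editor.
print(decode_fourcc(b"13RV"))
# The bytes 54 42 55 53 appear as "TBUS".
print(decode_fourcc(bytes.fromhex("54425553")))
```

On a FourCC this amounts to reversing the four bytes, but going through the integer makes it clearer why the on-disk order comes out reversed.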
==File limits== | ==File limits== | ||
Line 25: | Line 27: | ||
{{Table}} | {{Table}} | ||
{{OBD_Table_Header}} | {{OBD_Table_Header}} | ||
{{OBDtr| 0x00 | int64 | | 1F 27 DC 33 DF BC 03 00 | 0x0003BCDF33DC271F | | {{OBDtr| 0x00 | int64 | | 1F 27 DC 33 DF BC 03 00 | 0x0003BCDF33DC271F | Total template checksum (main indicator of engine compatibility): | ||
{{OBDtr| 0x08 | int32 | | 31 33 52 56 | ' | *0x0003BCDF33DC271F ("PC") - templates compatible with Windows retail engine(s) | ||
{{OBDtr| 0x0C | | *0x0003BCDF23C13061 ("Mac") - templates compatible with Windows demo and Mac engines | ||
*0x0003BA70A8D8AE11 ("PS2") - templates compatible with PlayStation 2 engine(s) | |||
*0x0000000000000000 (blank) - for use with [[OniX]] engine(s) (VR33), which handle data versioning using the 0x3C field below | |||
OniSplit's .oni files (VR32) use PC checksum by default and Mac/PS2 checksums where required (SNDD, TXMP, AGQG, M3GM, IGSt, TSFT/TSGA, TRAM/TREX) }} | |||
{{OBDtr| 0x08 | int32 | | 31 33 52 56 | '13RV' | .dat version (meant to be read as "VR31")<br>OniSplit's .oni files use '23RV' ("VR32") instead<br>OniX's [[Oni (folder)|GDFX]] uses '33RV' ("VR33") to signify that the new data versioning system is in use }} | |||
{{OBDtr| 0x0C | int16 | | 40 00 | 64 | size of this header }} | |||
{{OBDtr| 0x0E | int16 | | 14 00 | 20 | size of instance descriptor }} | |||
{{OBDtr| 0x10 | int16 | | 10 00 | 16 | size of template descriptor }} | |||
{{OBDtr| 0x12 | int16 | | 08 00 | 8 | size of name descriptor }} | |||
{{OBDtr| 0x14 | int32 | | 83 24 00 00 | 9347 | instance descriptor count }} | {{OBDtr| 0x14 | int32 | | 83 24 00 00 | 9347 | instance descriptor count }} | ||
{{OBDtr| 0x18 | int32 | | D4 1B 00 00 | 7124 | name descriptor count }} | {{OBDtr| 0x18 | int32 | | D4 1B 00 00 | 7124 | name descriptor count }} | ||
Line 35: | Line 45: | ||
{{OBDtr| 0x28 | int32 | | 40 F2 28 00 | 0x28F240 | name table offset }} | {{OBDtr| 0x28 | int32 | | 40 F2 28 00 | 0x28F240 | name table offset }} | ||
{{OBDtr| 0x2C | int32 | | 04 4F 02 00 | 151300 | name table size }} | {{OBDtr| 0x2C | int32 | | 04 4F 02 00 | 151300 | name table size }} | ||
{{OBDtr| 0x30 | int32 | | | {{OBDtr| 0x30 | int32 | | 99 CF 40 00 | (garbage) | used by OniSplit for raw table offset }} | ||
{{OBDtr| 0x34 | int32 | | | {{OBDtr| 0x34 | int32 | | 90 4F 63 00 | (garbage) | used by OniSplit for raw table size }} | ||
{{OBDtr| 0x38 | int32 | | | {{OBDtr| 0x38 | int32 | | F4 55 5F 00 | (garbage) | unused }} | ||
{{OBDtr| 0x3C | int32 | | | {{OBDtr| 0x3C | int32 | | 90 4F 63 00 | (garbage) | used by OniX (three high bytes) for data versioning; contains the highest data version (timestamp) found in any instance in this .dat; see instance descriptor table's 0x10 for format }} | ||
|} | |} | ||
The file's '''template checksum''' tells us that this level data is in the .dat/.raw file scheme, | The file's '''total template checksum''' is the sum of all the template checksums (see "Template descriptors" below). Oni looks at this number in order to validate that it can read this version of the game data format. In practical terms, the total checksum value given for Windows above tells us that this level data is in the .dat/.raw file scheme, and the value given for Mac Oni and the Windows demo tells us that the level data uses the .dat/.raw/.sep file scheme. | ||
The '''version''' of the instance file is the format version. Reading it backwards, as discussed under the "Backwards and garbage data" section, we get "VR31", which | The '''version''' of the instance file is the format version. Reading it backwards, as discussed under the "Backwards and garbage data" section, we get "VR31", which probably means "version 3.1". This is the format version of all instance files in all releases of Oni, regardless of file scheme. | ||
The ''' | The '''descriptor sizes''' are the sizes of the instance, template, and name descriptors which are coming up in this file (see breakdowns in later sections). For instance, each instance descriptor will be 0x14, or 20 bytes, in length. | ||
The '''descriptor counts''' are the sizes of arrays which are coming up | The '''descriptor counts''' are the sizes of arrays which are coming up in this file: the instance, name and template descriptors. For instance, the size of the instance descriptor array will be 0x2483, or 9,347 items, in length. | ||
Next we are told the addresses and sizes of the '''data and name tables''' in the instance file. The name table simply follows the data table, as you'll see if you add the data table offset plus the data table size, but that doesn't mean the name table offset is redundant; if its start was not 32-bit-aligned, it probably would be moved down to start at the next 32-bit word, but this is unnecessary because it happens to be aligned already. | Next we are told the addresses and sizes of the '''data and name tables''' in the instance file. The name table simply follows the data table, as you'll see if you add the data table offset plus the data table size, but that doesn't mean the name table offset is redundant; if its start was not 32-bit-aligned, it probably would be moved down to start at the next 32-bit word, but this is unnecessary because it happens to be aligned already. | ||
After this comes four "int"s of ''' | After this come four "int"s of '''garbage'''. Space occupied by random values like this is common in the data files, and indicates that something stored in memory at this relative position was written to disk even though it wouldn't be meaningful on disk (probably pointers or uninitialized memory in a space that was being reserved for possible future use). The first two 32-bit fields are, however, used in .oni files generated by OniSplit, and the last 32-bit field is partly used by OniX for a new form of data versioning. Future usage of these fields by OniSplit and/or OniX may change (hopefully not too much). | ||
That concludes the header of the instance file. Immediately after this header, we find the instance descriptors array. | That concludes the header of the instance file. Immediately after this header, we find the instance descriptors array. | ||
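To tie the header fields together, here is a rough Python sketch that unpacks the 64-byte header with the standard <code>struct</code> module. Treat it as an illustration under assumptions: the fields at 0x1C, 0x20 and 0x24 are not among the table rows visible here, so we assume from the prose that they hold the template descriptor count, the data table offset and the data table size. The sample values for those three (56, 0x03BCA0, 0x2535A0) are derived from numbers quoted elsewhere on this page, and all field names are our own.

```python
import struct
from collections import namedtuple

Header = namedtuple("Header", [
    "total_checksum", "version",
    "header_size", "instance_desc_size", "template_desc_size", "name_desc_size",
    "instance_count", "name_count",
    "template_count",      # 0x1C (assumed position)
    "data_table_offset",   # 0x20 (assumed position)
    "data_table_size",     # 0x24 (assumed position)
    "name_table_offset", "name_table_size",
    "field_30", "field_34", "field_38", "field_3c",  # garbage in vanilla files
])

# "<" = little-endian with no padding: int64, 4-byte tag, 4 int16, 11 int32.
HEADER_STRUCT = struct.Struct("<q4s4h3i4i4i")

def parse_header(raw: bytes) -> Header:
    header = Header(*HEADER_STRUCT.unpack(raw[:HEADER_STRUCT.size]))
    # The version is stored "backwards", so reverse it for reading.
    return header._replace(version=header.version[::-1].decode("ascii"))

# Header bytes from the table above, with the three assumed fields filled in.
SAMPLE_HEADER = bytes.fromhex(
    "1F27DC33DFBC0300" "31335256" "4000" "1400" "1000" "0800"
    "83240000" "D41B0000" "38000000"
    "A0BC0300" "A0352500" "40F22800" "044F0200"
    "99CF4000" "904F6300" "F4555F00" "904F6300"
)

header = parse_header(SAMPLE_HEADER)
print(header.version, header.instance_count, header.name_count)
print(hex(header.data_table_offset + header.data_table_size))
```

With a real file you would pass the first 64 bytes of the .dat instead of <code>SAMPLE_HEADER</code>.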
Line 58: | Line 68: | ||
The instance descriptor array tells Oni where to find the data and the name of every instance (resource) indexed by the .dat file. The descriptors start at 0x40 in the .dat file, but below is a descriptor found at 0x017B50 in the file which makes a better example. In the table below, we use offsets relative to the start of this descriptor. | The instance descriptor array tells Oni where to find the data and the name of every instance (resource) indexed by the .dat file. The descriptors start at 0x40 in the .dat file, but below is a descriptor found at 0x017B50 in the file which makes a better example. In the table below, we use offsets relative to the start of this descriptor. | ||
{ | {| class="wikitable" | ||
|- bgcolor="#E9E9E9" | |||
! width=5% | Offset | |||
! width=5% | Type | |||
! width=10% | Raw Hex | |||
! width=10% | Value | |||
! width=35% | Description (vanilla) | |||
! width=35% | Description (GDFX) | |||
|- align=center | |||
| 0x00 | |||
| tag | |||
| 54 42 55 53 | |||
| 'SUBT' | |||
|colspan="2" align=left | template tag | |||
|- align=center | |||
| 0x04 | |||
| int32 | |||
| C8 30 22 00 | |||
| 0x2230C8 | |||
|colspan="2" align=left | data offset (relative to data table) | |||
|- align=center | |||
| 0x08 | |||
| int32 | |||
| 01 CB 00 00 | |||
| 0xCB01 | |||
|colspan="2" align=left | name offset (relative to name table) | |||
|- align=center | |||
| 0x0C | |||
| int32 | |||
| C0 09 00 00 | |||
| 2496 | |||
|colspan="2" align=left | data size | |||
|- align=center valign=top | |||
| 0x10 | |||
| int32 | |||
| 00 00 00 00 | |||
| 0 | |||
| align=left | flags; possible values: | |||
:0x'''01''' 00 00 00 - unnamed | |||
:0x'''02''' 00 00 00 - empty | |||
:0x'''04''' 00 00 00 - never used; intended to mark instance as pointing to duplicate data rather than its own data | |||
:0x'''08''' 00 00 00 - instance's data is being used by duplicate instances as a source | |||
The first two of the following bits occur throughout the original .dat files. However these bits are ignored by the engine when loading data because they only have relevance during runtime, when Oni is in Tool mode: | |||
:0x00 00 '''10''' 00 - touched (unsaved data) | |||
:0x00 00 '''20''' 00 - "in batch file" | |||
:0x00 00 '''40''' 00 - delete upon next save | |||
| align=left | flags; possible values: | |||
:0x'''01''' 00 00 00 - unnamed | :0x'''01''' 00 00 00 - unnamed | ||
:0x'''02''' 00 00 00 - empty | :0x'''02''' 00 00 00 - empty | ||
:0x'''04''' 00 00 00 - never used; | :0x'''04''' 00 00 00 - never used; intended to mark instance as pointing to duplicate data rather than its own data | ||
: | :0x'''08''' 00 00 00 - instance's data is being used by duplicate instances as a source | ||
The Tool mode bits have been moved to the upper half of the flags byte (they are cleared altogether in the GDFX data, but this is their location in memory): | |||
:0x'''10''' 00 00 00 - touched (unsaved data) | |||
:0x'''20''' 00 00 00 - "in batch file" | |||
:0x'''40''' 00 00 00 - delete upon next save | |||
This frees up the three higher bytes for the data versioning timestamp which is in YY/MM/DD format, stored thusly: | |||
:0x00 '''00''' 00 00 - versioning timestamp – day | |||
:0x00 00 '''00''' 00 - versioning timestamp – month | |||
:0x00 00 00 '''00''' - versioning timestamp – year | |||
|} | |} | ||
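The descriptor layout in the table above can be sketched in Python as follows. This is only an illustration: the constant and function names are ours, and we assume that the flag notation in the table lists bytes in file order, so that the first byte is the low byte of the little-endian int32. The encoding of the GDFX timestamp bytes (plain binary or otherwise) is not stated here, so the sketch just returns the raw byte values.

```python
import struct

# tag, data offset, name offset, data size, flags = 20 bytes
INSTANCE_DESCRIPTOR = struct.Struct("<4sIIII")

FLAG_UNNAMED = 0x01
FLAG_EMPTY = 0x02
FLAG_DUPLICATE = 0x04         # never used
FLAG_DUPLICATE_SOURCE = 0x08

def parse_instance_descriptor(raw: bytes) -> dict:
    tag, data_offset, name_offset, data_size, flags = INSTANCE_DESCRIPTOR.unpack(raw)
    return {
        "tag": tag[::-1].decode("ascii"),  # stored backwards on disk
        "data_offset": data_offset,        # relative to data table
        "name_offset": name_offset,        # relative to name table
        "data_size": data_size,
        "flags": flags,
    }

def split_gdfx_flags(flags: int) -> dict:
    """GDFX layout (assumed byte order): flags byte, then day, month, year."""
    return {
        "flags": flags & 0xFF,
        "day": (flags >> 8) & 0xFF,
        "month": (flags >> 16) & 0xFF,
        "year": (flags >> 24) & 0xFF,
    }

# The SUBT descriptor from the table above.
SAMPLE = bytes.fromhex("54425553" "C8302200" "01CB0000" "C0090000" "00000000")
descriptor = parse_instance_descriptor(SAMPLE)
print(descriptor["tag"], hex(descriptor["data_offset"]), descriptor["data_size"])

# A made-up GDFX flags field: file bytes 01 0F 06 18.
(example_flags,) = struct.unpack("<I", bytes([0x01, 0x0F, 0x06, 0x18]))
print(split_gdfx_flags(example_flags))
```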
Line 76: | Line 134: | ||
You'll notice that the level file header lists fewer names (7,124) than instances (9,347). That's because there are 3 types of instance: | You'll notice that the level file header lists fewer names (7,124) than instances (9,347). That's because there are 3 types of instance: | ||
*Unnamed and not empty - they are only referenced by other instances in the same file, generally as child data (e.g., 3D geometry elements like ABNA are "contained" by AKEV, a level's environment). | *Unnamed and not empty - they are only referenced by other instances in the same file, generally as child data (e.g., 3D geometry elements like ABNA are "contained" by AKEV, a level's environment). | ||
*:In vanilla Oni .dats there are some rare occurrences of unnamed non-empty ''orphan'' instances (e.g., [[OBD:File types/Naming#TRCM|TRCM]]). These are a form of garbage and are discarded by OniSplit when unpacking a level. | |||
*Named and not empty - they can be referenced by other instances in any file and the engine can use their name or template tag to find them. | *Named and not empty - they can be referenced by other instances in any file and the engine can use their name or template tag to find them. | ||
*Named and empty - "empty" instances are used in level-specific instance files (i.e. not in level0_Final.dat) to associate an instance ID with a name. For every empty resource, there's another one with a matching name in level0_Final.dat that has data in it. The empty resource in the instance file is (usually) looked up by ID, then the engine searches all the loaded files for a non-empty instance with the same name, causing it to find the actual file in the global data in level0_Final.dat. | *Named and empty - "empty" instances are used in level-specific instance files (i.e. not in level0_Final.dat) to associate an instance ID with a name. For every empty resource, there's another one with a matching name in level0_Final.dat that has data in it. The empty resource in the instance file is (usually) looked up by ID, then the engine searches all the loaded files for a non-empty instance with the same name, causing it to find the actual file in the global data in level0_Final.dat. | ||
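The three kinds of instance listed above can be told apart from the two lowest flag bits. Here is a small Python sketch; the function name and the returned labels are our own, and the fourth combination (unnamed and empty) is not described on this page, so the sketch only reports it as such.

```python
FLAG_UNNAMED = 0x01
FLAG_EMPTY = 0x02

def classify_instance(flags: int) -> str:
    unnamed = bool(flags & FLAG_UNNAMED)
    empty = bool(flags & FLAG_EMPTY)
    if unnamed and not empty:
        return "unnamed, not empty"   # child data referenced within this file
    if not unnamed and not empty:
        return "named, not empty"     # can be found by name or template tag
    if not unnamed and empty:
        return "named, empty"         # placeholder resolved by name in level 0
    return "unnamed, empty (not described here)"

for flags in (0x01, 0x00, 0x02):
    print(hex(flags), classify_instance(flags))
```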
Line 110: | Line 169: | ||
Likewise, the template descriptor array starts directly after the name descriptors. Since name descriptors are 8 bytes, 8 * 7124 (taken from the header) = 56992, or 0xDEA0, and adding that to the name descriptor array's start address (0x02DA7C) gives us 0x03B91C as the start of the template descriptors. | Likewise, the template descriptor array starts directly after the name descriptors. Since name descriptors are 8 bytes, 8 * 7124 (taken from the header) = 56992, or 0xDEA0, and adding that to the name descriptor array's start address (0x02DA7C) gives us 0x03B91C as the start of the template descriptors. | ||
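The address arithmetic above can be checked with a few lines of Python. The counts and descriptor sizes are the ones quoted from this file's header; the variable names are just for illustration.

```python
HEADER_SIZE = 0x40

INSTANCE_DESC_SIZE, INSTANCE_COUNT = 20, 9347
NAME_DESC_SIZE, NAME_COUNT = 8, 7124
TEMPLATE_DESC_SIZE, TEMPLATE_COUNT = 16, 56

# Each array starts where the previous one ends.
instance_array_start = HEADER_SIZE
name_array_start = instance_array_start + INSTANCE_DESC_SIZE * INSTANCE_COUNT
template_array_start = name_array_start + NAME_DESC_SIZE * NAME_COUNT
template_array_end = template_array_start + TEMPLATE_DESC_SIZE * TEMPLATE_COUNT

print(hex(name_array_start))      # 0x2da7c
print(hex(template_array_start))  # 0x3b91c
print(hex(template_array_end))    # 0x3bc9c
```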
The template descriptor array contains information about all templates (that is, resource types, | The template descriptor array contains information about all templates (that is, resource types, aka tags) used in the file (56 in this case, as we learned from the file header). Any resource occurring in this instance file has to have its type listed here. Here is the template descriptor at 0x3B9FC: | ||
{{Table}} | {{Table}} | ||
{{OBD_Table_Header}} | {{OBD_Table_Header}} | ||
{{OBDtr| 0x00 | int64 | | | {{OBDtr| 0x00 | int64 | | 3C B9 A6 71 08 00 00 00 | 0x871A6B93C | template checksum }} | ||
{{OBDtr| | {{OBDtr| 0x08 | tag | | 45 47 52 54 | 'EGRT' | template tag }} | ||
{{OBDtr| | {{OBDtr| 0x0C | int32 | | 01 00 00 00 | 1 | unused: number of resources in file that use this template }} | ||
|} | |} | ||
The '''template checksum''' is used to prevent loading of instance files that are not compatible with the current engine version. The '''tag''' is the same kind of number-written-as-backwards-ASCII that we discussed in the "Backwards and garbage data" section. The '''number of resources''' is | The '''template checksum''' is used to prevent loading of instance files that are not compatible with the current engine version. The '''tag''' is the same kind of number-written-as-backwards-ASCII that we discussed in the "Backwards and garbage data" section; in this case, 'EGRT' means [[TRGE]]. The field for the '''number of resources''' using this template is unused. The number should be correct for each template, but Oni never uses it for anything. | ||
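For reference, a template descriptor like the one in the table above can be unpacked as in the following Python sketch. The names are our own; the layout (int64 checksum, 4-byte tag, int32 count) is taken from the table.

```python
import struct

# checksum (int64), tag (4 bytes), resource count (int32) = 16 bytes
TEMPLATE_DESCRIPTOR = struct.Struct("<q4sI")

def parse_template_descriptor(raw: bytes) -> dict:
    checksum, tag, count = TEMPLATE_DESCRIPTOR.unpack(raw)
    return {
        "checksum": checksum,
        "tag": tag[::-1].decode("ascii"),  # stored backwards on disk
        "count": count,                    # present in the file, ignored by Oni
    }

# The descriptor found at 0x3B9FC.
SAMPLE = bytes.fromhex("3CB9A67108000000" "45475254" "01000000")
template = parse_template_descriptor(SAMPLE)
print(template["tag"], hex(template["checksum"]), template["count"])
```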
You might wonder how Oni knows how to read each type of data, such as a SUBT or an ABNA. The simple answer is that this information is hard-coded into Oni. In fact, the information on each instance type, as stored in Oni's code, is actually the real "template". The file data only gives the tag and checksum that refer to a certain template. Which types of data fields are encountered in which order is already known by Oni. These hardcoded templates also tell Oni which parts of the file data are reserved for pointers. | You might wonder how Oni knows how to read each type of data, such as a SUBT or an ABNA. The simple answer is that this information is hard-coded into Oni. In fact, the information on each instance type, as stored in Oni's code, is actually the real "template". The file data only gives the tag and checksum that refer to a certain template. Which types of data fields are encountered in which order is already known by Oni. These hardcoded templates also tell Oni which parts of the file data are reserved for pointers. | ||
Line 128: | Line 187: | ||
==Data table== | ==Data table== | ||
The data table stores all the instance data. We peeked at this before when we looked at the instance descriptor for SUBTsubtitles. | The data table stores all the instance data (or points to its actual location in a raw/separate file). We peeked at this before when we looked at the instance descriptor for SUBTsubtitles. | ||
The start of each instance's record, the ID number, is always 32-byte-aligned. Thus, even though the template descriptors ended at 0x03BC9C, there are four empty bytes here so that the data table can begin at 0x03BCA0, which divides evenly by 32. This alignment also means that the instance-specific data will always be found at an offset like 0x0008, 0x0028, 0x0148 etc. | The start of each instance's record, the ID number, is always 32-byte-aligned. Thus, even though the template descriptors ended at 0x03BC9C, there are four empty bytes here so that the data table can begin at 0x03BCA0, which divides evenly by 32. This alignment also means that the instance-specific data will always be found at an offset like 0x0008, 0x0028, 0x0148 etc. | ||
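The padding described above is ordinary round-up-to-a-multiple arithmetic. A minimal Python sketch, with a helper name of our own choosing:

```python
def align_up(offset: int, alignment: int = 32) -> int:
    """Round offset up to the next multiple of alignment (a power of two)."""
    return (offset + alignment - 1) & ~(alignment - 1)

template_array_end = 0x03BC9C
data_table_start = align_up(template_array_end)

print(hex(data_table_start))                  # 0x3bca0
print(data_table_start - template_array_end)  # 4 bytes of padding
```

Because each record starts on a 32-byte boundary and begins with 8 bytes of IDs, the instance-specific data always sits at an offset whose value modulo 32 is 8.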
Line 143: | Line 202: | ||
The '''instance's ID''' is computed as: | The '''instance's ID''' is computed as: | ||
(instance_descriptor_index << 8) <nowiki>|</nowiki> 1 | (instance_descriptor_index << 8) <nowiki>|</nowiki> 1 | ||
The 1 allows the engine to know which IDs have already been converted to pointers ( | The 1 allows the engine to know which IDs have already been converted to pointers (an instance pointer will always be 8-byte-aligned, so its lowest bit will never already be set). | ||
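Here is the instance ID formula as a short Python sketch. The function names are hypothetical. As a worked example we use the SUBT descriptor mentioned earlier, found at 0x017B50 in the file; since descriptors start at 0x40 and are 20 bytes each, its index works out to 4852.

```python
def instance_id(descriptor_index: int) -> int:
    return (descriptor_index << 8) | 1

def is_unconverted_id(value: int) -> bool:
    """An ID has its lowest bit set; an 8-byte-aligned pointer never does."""
    return bool(value & 1)

def descriptor_index_from_id(value: int) -> int:
    return value >> 8

index = (0x017B50 - 0x40) // 20
print(index)                    # 4852
print(hex(instance_id(index)))  # 0x12f401
```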
The '''file ID''' is computed from the name of the instance file. For "_Final" files the file ID is computed as: | The '''file ID''' is computed from the name of the instance file. For "_Final" files the file ID is computed as: | ||
Line 149: | Line 208: | ||
Again, the 1 allows the engine to know which file IDs have already been converted to pointers. | Again, the 1 allows the engine to know which file IDs have already been converted to pointers. | ||
As you can see, after the header, the size of the actual instance data can be almost anything. Thus, we cannot compute the end of the data table in any simple way. That's why the instance file header explicitly gives us the address of the name table. | As you can see, after the header, the size of the actual instance data can be almost anything. Thus, we cannot compute the end of the data table in any simple way. That's why the instance file header explicitly gives us the address of the name table that comes after this. | ||
By the way, how do we know which resource's data we're looking at in the data table? Let's look at the very first data, at 0x03BCA0. Noting that the first two numbers, the instance and file ID, do not count as data, there must be a resource with a data offset of 0x08, the lowest offset possible into the table. We can find this offset listed right at the start of the instance descriptor array: | By the way, how do we know which resource's data we're looking at in the data table? Let's look at the very first data, at 0x03BCA0. Noting that the first two numbers, the instance and file ID, do not count as data, there must be a resource with a data offset of 0x08, the lowest offset possible into the table. We can find this offset listed right at the start of the instance descriptor array: | ||
Line 159: | Line 218: | ||
{{OBDtr| 0x08 | int32 | | 00 00 00 00 | 0x00 | name offset (relative to name table) }} | {{OBDtr| 0x08 | int32 | | 00 00 00 00 | 0x00 | name offset (relative to name table) }} | ||
{{OBDtr| 0x0C | int32 | | 60 0F 00 00 | 3936 | data size }} | {{OBDtr| 0x0C | int32 | | 60 0F 00 00 | 3936 | data size }} | ||
{{OBDtr| 0x10 | int32 | | 00 00 00 00 | 0 | flags | {{OBDtr| 0x10 | int32 | | 00 00 00 00 | 0 | flags }} | ||
|} | |} | ||
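Going the other way, from a descriptor to its bytes in the file, is a matter of adding the descriptor's data offset to the start of the data table. A small Python sketch follows; the function names are ours, and the data table start (0x03BCA0) is the one worked out in the text above.

```python
DATA_TABLE_OFFSET = 0x03BCA0
RECORD_HEADER_SIZE = 8  # instance ID + file ID precede the data

def data_file_offset(data_offset: int) -> int:
    """Absolute position in the .dat of an instance's data."""
    return DATA_TABLE_OFFSET + data_offset

def record_file_offset(data_offset: int) -> int:
    """Absolute position of the record, i.e. of its instance ID."""
    return data_file_offset(data_offset) - RECORD_HEADER_SIZE

# First descriptor in the array: data offset 0x08.
print(hex(record_file_offset(0x08)))      # 0x3bca0
# The SUBT descriptor looked at earlier: data offset 0x2230C8.
print(hex(data_file_offset(0x2230C8)))    # 0x25ed68
print(record_file_offset(0x2230C8) % 32)  # 0
```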