OBD:Instance file format: Difference between revisions

m
wording tweaks, mostly in the introduction
m (→‎Header: tidying up total checksum info, mentioning the use of multiple checksums by .oni files)
m (wording tweaks, mostly in the introduction)
 
(9 intermediate revisions by the same user not shown)
Line 1: Line 1:
{{UpdatedForOniX|1.0.0}}
{{UpdatedForOniX|1.0.0}}<!--Documentation below is waiting to be un-commented.-->
{{OBD Home}}
{{OBD Home}}
{{Hatnote|".dat" redirects here; for other files ending in ".dat", see [[Oni (folder)]].}}
{{Hatnote|".dat" redirects here; for other files ending in ".dat", see [[Oni (folder)]].<br>
{{Hatnote|You should read the [[Game data terminology]] page before this one.}}
:You should read the [[Game data terminology]] page before this one.<br>
{{Hatnote|The [[Raw|Raw and separate file formats]] page should be read after this one.}}
:The [[Raw|Raw and separate file formats]] page should be read after this one.}}
Files in GameDataFolder/ named "level[0-19]_Final.dat", together with ".raw" and sometimes ".sep" counterparts, contain the game data for Oni.
Files in GameDataFolder/ named "level[0-19]_Final.dat", together with ".raw" and sometimes ".sep" counterparts, contain the game data for Oni. These are called "instance files" internally, but a more common-sense name for them is level data files. The format described below was also used for the tool files which supplied the GUI for the in-game editor. However, the retail Oni game application refuses to load tool files; for the story behind the tool files, see [[level0_Tools]].


The same format was used for the tool files, named level0_Tools.dat/.raw/.sep, however the retail Oni game application does not load tool files; for the story behind the tool files, see [[level0_Tools]].
The level 0 files do not contain resources for a specific level but rather resources (instances) shared across all levels. Level 0 is loaded when the game starts and is never unloaded. All other level files, 1-19, are only loaded when their corresponding level starts and then unloaded when it ends. Oni can only hold two level files in memory concurrently. Thus, resources have to be duplicated on disk whenever a character class, sound effect, etc. occurs in more than one level. For instance, although there are only 2,380 unique sounds in the game, there are 7,386 sounds stored across all level data files.
 
The level 0 files do not actually contain a level, but instances (resources) shared across all levels. Level 0 is loaded when the game starts, and never unloaded. All other level files, 1-19, are only loaded when their corresponding level starts, and unloaded when it ends. Since Oni can only hold these two levels in memory concurrently, resources have to be duplicated on disk whenever a character class, sound effect, etc. occurs in more than one level. For instance, although there are only 2,380 unique sounds in the game, there are 7,386 sounds stored across all level data files.
{{TOClimit}}
{{TOClimit}}
==Backwards and garbage data==
==Backwards and garbage data==
During development, Oni had an [[level0_Tools|in-game editor]] which presented a GUI for manipulating AIs, particles, etc. in a level. When a developer saved his work, the contents of the level, stored in his PC's RAM, were flushed directly to disk. Thus the structure of the .dat/.raw/.sep files reflects the way in which Bungie West chose to store levels in memory. So when we read the data in the files with a hex editor, we can see eccentricities such as blank space (coming from unused fields and byte-alignment padding) and garbage data (such as now-meaningless pointer values). [[OBD:Raw_and_separate_file_formats#Gaps|These gaps]] between data chunks add up to about 25 MB for the whole game.
As mentioned, the game's developers used the in-game editor to create AIs, particles, etc. in a level. When one of these developers saved his work, the contents of the level, stored in his PC's RAM, were flushed directly to disk. Thus the structure of the .dat/.raw/.sep files reflects the way in which Bungie West chose to store levels in memory. So when we read the data in the files with a hex editor, we can see eccentricities such as blank space (coming from unused fields and byte-alignment padding) and garbage data (such as now-meaningless pointer values). [[OBD:Raw and separate file formats#Gaps|Further gaps]], mostly representing orphaned obsolete resources, add up to about 25 MB for the whole game.


Additionally, because the levels were built on Intel-based machines, which use a little-endian architecture, sequences of bytes which represent numbers were written from least-significant to most-significant byte. [[wikipedia:FourCC|FourCCs]] in the data are stored "backwards", such as "13RV" which is meant to be read "VR31", because Bungie defined those four bytes as a 32-bit integer, not a string, causing them to be written to disk in little-endian order.
Additionally, because the levels were built on Intel-based machines, which use a little-endian architecture, sequences of bytes which represent numbers were written from least-significant to most-significant byte. [[wp:FourCC|FourCCs]] in the data are stored "backwards", such as "13RV" which is meant to be read "VR31", because Bungie defined those four bytes as a 32-bit integer, not a string, causing them to be written to disk in little-endian order.
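As a minimal sketch of the "backwards" FourCC behaviour described above, the following Python snippet reads the four bytes found at 0x08 of level0_Final.dat both as a little-endian 32-bit integer and as a tag. The helper name <code>tag_from_disk</code> is hypothetical and only serves the illustration; it is not part of Oni or OniSplit.

```python
import struct

# The four bytes as they appear on disk at offset 0x08 ('13RV').
RAW_VERSION = bytes.fromhex("31335256")


def tag_from_disk(raw: bytes) -> str:
    """Return the human-readable tag for four bytes read from disk.

    Hypothetical helper: the bytes are a little-endian 32-bit integer,
    so reversing them restores the intended reading order.
    """
    if len(raw) != 4:
        raise ValueError("a FourCC is exactly four bytes")
    return raw[::-1].decode("ascii")


# Read as a little-endian unsigned 32-bit integer, the way the engine would.
(version_as_int,) = struct.unpack("<I", RAW_VERSION)

print(hex(version_as_int))         # the integer whose hex digits spell 'V','R','3','1'
print(tag_from_disk(RAW_VERSION))  # the tag in reading order
```

Reversing the byte string and interpreting the bytes as a little-endian integer are two views of the same thing: the most significant byte of the integer is the first character of the tag.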


==File limits==
==File limits==
Line 24: Line 22:


==Header==
==Header==
Here is a walkthrough of an instance file using the level0_Final.dat in English Windows Oni. Follow along in a hex editor for maximum learnage. Each term will be explained in-depth when we fully consider the related data. First, here is how the file begins:
Here is a walkthrough of an instance file using the level0_Final.dat in English Windows Oni. Follow along in a hex editor for maximum educational value. Each term will be explained in-depth when we fully consider the related data. First, here is how the file begins:
{{Table}}
{{Table}}
{{OBD_Table_Header}}
{{OBD_Table_Header}}
{{OBDtr| 0x00 | int64   | | 1F 27 DC 33 DF BC 03 00 | 0x0003BCDF33DC271F | Total template checksum (main indicator of engine compatibility):
{{OBDtr| 0x00 | uint64   | | 1F 27 DC 33 DF BC 03 00 | 0x0003BCDF33DC271F | Total template checksum (main indicator of engine compatibility):
*0x0003BCDF33DC271F ("PC") - templates compatible with Windows retail engine(s)
*0x0003BCDF33DC271F (v1.0) - templates compatible with Windows retail engine(s)
*0x0003BCDF23C13061 ("Mac") - templates compatible with Windows demo and Mac engines
*0x0003BCDF23C13061 (v1.1) - templates compatible with Windows demo and Mac engines
*0x0003BA70A8D8AE11 ("PS2") - templates compatible with PlayStation 2 engine(s)
*0x0003BA70A8DBAE11 (PS2) - templates compatible with PlayStation 2 engine(s)
*0x0000000000000000 (blank) - for use with [[OniX]] engine(s) (VR33), which handle data versioning using the 0x3C field below
*0x0000000000000000 (blank) - for use with [[OniX]] engine(s)<!--, which instead handle data versioning using the 0x3C field below-->
OniSplit's .oni files (VR32) use PC checksum by default and Mac/PS2 checksums where required (SNDD, TXMP, AGQG, M3GM, IGSt, TSFT/TSGA, TRAM/TREX) }}
OniSplit's .oni files use PC 1.0 checksum by default and 1.1 checksums when holding data that is stored differently in the 1.1 format (SNDD, TXMP, AGQG, M3GM, IGSt, TSFT/TSGA, TRAM/TREX) }}
{{OBDtr| 0x08 | int32   | | 31 33 52 56             | '13RV'             | .dat version (meant to be read as "VR31")<br>OniSplit's .oni files use '23RV' ("VR32") instead<br>OniX's [[Oni (folder)|GDFX]] uses '33RV' ("VR33") to signify that the new data versioning system is in use }}
{{OBDtr| 0x08 | uint32   | | 31 33 52 56 | '13RV'   | .dat version (meant to be read as "VR31")<br>OniSplit's .oni files use '23RV' ("VR32") instead<br>OniX's [[Oni (folder)|GDFX]] uses '33RV' ("VR33") to signify that the new data versioning system is in use }}
{{OBDtr| 0x0C | int16   | | 40 00 | 64 | size of this header }}
{{OBDtr| 0x0C | uint16   | | 40 00       | 64       | size of this header }}
{{OBDtr| 0x0E | int16   | | 14 00 | 20 | size of instance descriptor }}
{{OBDtr| 0x0E | uint16   | | 14 00       | 20       | size of instance descriptor }}
{{OBDtr| 0x10 | int16   | | 10 00 | 16 | size of template descriptor }}
{{OBDtr| 0x10 | uint16   | | 10 00       | 16       | size of template descriptor }}
{{OBDtr| 0x12 | int16   | | 08 00 | 8 | size of name descriptor }}
{{OBDtr| 0x12 | uint16   | | 08 00       | 8         | size of name descriptor }}
{{OBDtr| 0x14 | int32   | | 83 24 00 00 | 9347      | instance descriptor count  }}
{{OBDtr| 0x14 | uint32   | | 83 24 00 00 | 9347      | instance descriptor count  }}
{{OBDtr| 0x18 | int32   | | D4 1B 00 00 | 7124      | name descriptor count }}
{{OBDtr| 0x18 | uint32   | | D4 1B 00 00 | 7124      | name descriptor count }}
{{OBDtr| 0x1C | int32   | | 38 00 00 00 | 56        | template descriptor count }}
{{OBDtr| 0x1C | uint32   | | 38 00 00 00 | 56        | template descriptor count }}
{{OBDtr| 0x20 | int32   | | A0 BC 03 00 | 0x03BCA0  | data table offset }}
{{OBDtr| 0x20 | uint32   | | A0 BC 03 00 | 0x03BCA0  | data table offset }}
{{OBDtr| 0x24 | int32   | | A0 35 25 00 | 2438560  | data table size }}
{{OBDtr| 0x24 | uint32   | | A0 35 25 00 | 2438560  | data table size }}
{{OBDtr| 0x28 | int32   | | 40 F2 28 00 | 0x28F240  | name table offset }}
{{OBDtr| 0x28 | uint32   | | 40 F2 28 00 | 0x28F240  | name table offset }}
{{OBDtr| 0x2C | int32   | | 04 4F 02 00 | 151300    | name table size }}
{{OBDtr| 0x2C | uint32   | | 04 4F 02 00 | 151300    | name table size }}
{{OBDtr| 0x30 | int32   | | 99 CF 40 00 | (garbage) | used by OniSplit for raw table offset }}
{{OBDtr| 0x30 | uint32   | | 99 CF 40 00 | (garbage) | used by OniSplit for raw table offset }}
{{OBDtr| 0x34 | int32   | | 90 4F 63 00 | (garbage) | used by OniSplit for raw table size }}
{{OBDtr| 0x34 | uint32   | | 90 4F 63 00 | (garbage) | used by OniSplit for raw table size }}
{{OBDtr| 0x38 | int32   | | F4 55 5F 00 | (garbage) | unused }}
{{OBDtr| 0x38 | uint32   | | F4 55 5F 00 | (garbage) | unused<!--used by OniX for data versioning; the three high bytes contain the highest data version (timestamp) found in any instance in this .dat; see instance descriptor table's 0x10 for format--> }}
{{OBDtr| 0x3C | int32   | | 90 4F 63 00 | (garbage) | used by OniX (three high bytes) for data versioning; contains the highest data type version found in any instance in this .dat }}
{{OBDtr| 0x3C | uint32   | | 90 4F 63 00 | (garbage) | unused<!--used by OniX for content versioning; the three high bytes contain the highest content version (timestamp) found in any instance in this .dat; see instance descriptor table's 0x10 for format--> }}
|}
|}
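The header table above can be read with a single <code>struct</code> format string. The following is a rough sketch, not a reference implementation: the dictionary keys and the function name <code>parse_header</code> are made up for this example, and the sample bytes are simply the ones listed in the table for level0_Final.dat (new-revision field types, i.e. unsigned).

```python
import struct

# uint64 checksum, 4-byte version tag, four uint16 sizes, three uint32 counts,
# then eight uint32 values (offsets, sizes, and the garbage/unused fields).
HEADER_FORMAT = "<Q4s4H3I8I"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)  # 64 bytes, matching field 0x0C

# Bytes copied from the table above.
SAMPLE_HEADER = bytes.fromhex(
    "1F27DC33DFBC0300"                      # 0x00 total template checksum
    "31335256"                              # 0x08 '13RV'
    "4000" "1400" "1000" "0800"             # 0x0C-0x12 sizes
    "83240000" "D41B0000" "38000000"        # 0x14-0x1C counts
    "A0BC0300" "A0352500"                   # 0x20-0x24 data table
    "40F22800" "044F0200"                   # 0x28-0x2C name table
    "99CF4000" "904F6300" "F4555F00" "904F6300"  # 0x30-0x3C garbage in vanilla .dat
)


def parse_header(data: bytes) -> dict:
    """Unpack the 64-byte instance file header into a dict (hypothetical key names)."""
    f = struct.unpack_from(HEADER_FORMAT, data, 0)
    return {
        "template_checksum": f[0],
        "version": f[1][::-1].decode("ascii"),  # stored backwards on disk
        "header_size": f[2],
        "instance_descriptor_size": f[3],
        "template_descriptor_size": f[4],
        "name_descriptor_size": f[5],
        "instance_count": f[6],
        "name_count": f[7],
        "template_count": f[8],
        "data_table_offset": f[9],
        "data_table_size": f[10],
        "name_table_offset": f[11],
        "name_table_size": f[12],
        "trailing_fields": f[13:],  # 0x30-0x3C; meaning depends on the tool
    }


header = parse_header(SAMPLE_HEADER)
print(header["version"], header["instance_count"], hex(header["data_table_offset"]))
```

To try this on a real file, read the first 64 bytes of a .dat in binary mode and pass them to <code>parse_header</code>.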


Line 123: Line 121:
:0x'''20''' 00 00 00 - "in batch file"
:0x'''20''' 00 00 00 - "in batch file"
:0x'''40''' 00 00 00 - delete upon next save
:0x'''40''' 00 00 00 - delete upon next save
This frees up the three higher bytes for the data versioning timestamp:
This frees up the three higher bytes for the data versioning timestamp, which is in YY/MM/DD format and is stored as follows:
:0x00 '''00''' 00 00 - versioning timestamp – year
:0x00 '''00''' 00 00 - versioning timestamp – day
:0x00 00 '''00''' 00 - versioning timestamp – month
:0x00 00 '''00''' 00 - versioning timestamp – month
:0x00 00 00 '''00''' - versioning timestamp – day
:0x00 00 00 '''00''' - versioning timestamp – year
|}
|}


Line 134: Line 132:
You'll notice that the level file header lists fewer names (7,124) than instances (9,347). That's because there are 3 types of instance:
You'll notice that the level file header lists fewer names (7,124) than instances (9,347). That's because there are 3 types of instance:
*Unnamed and not empty - they are only referenced by other instances in the same file, generally as child data (e.g., 3D geometry elements like ABNA are "contained" by AKEV, a level's environment).
*Unnamed and not empty - they are only referenced by other instances in the same file, generally as child data (e.g., 3D geometry elements like ABNA are "contained" by AKEV, a level's environment).
*:In vanilla Oni .dats there are some rare occurrences of unnamed non-empty ''orphan'' instances (e.g., [[OBD:File_types/Named#TRCM|TRCM]]). These are a form of garbage and are discarded by OniSplit when unpacking a level.  
*:In vanilla Oni .dats there are some rare occurrences of unnamed non-empty ''orphan'' instances (e.g., [[OBD:File types/Naming#TRCM|TRCM]]). These are a form of garbage and are discarded by OniSplit when unpacking a level.  
*Named and not empty - they can be referenced by other instances in any file and the engine can use their name or template tag to find them.
*Named and not empty - they can be referenced by other instances in any file and the engine can use their name or template tag to find them.
*Named and empty - "empty" instances are used in level-specific instance files (i.e. not in level0_Final.dat) to associate an instance ID with a name. For every empty resource, there's another one with a matching name in level0_Final.dat that has data in it. The empty resource in the instance file is (usually) looked up by ID, then the engine searches all the loaded files for a non-empty instance with the same name, causing it to find the actual file in the global data in level0_Final.dat.
*Named and empty - "empty" instances are used in level-specific instance files (i.e. not in level0_Final.dat) to associate an instance ID with a name. For every empty resource, there's another one with a matching name in level0_Final.dat that has data in it. The empty resource in the instance file is (usually) looked up by ID, then the engine searches all the loaded files for a non-empty instance with the same name, causing it to find the actual file in the global data in level0_Final.dat.
Line 169: Line 167:
Likewise, the template descriptor array starts directly after the name descriptors. Since name descriptors are 8 bytes, 8 * 7124 (taken from the header) = 56992, or 0xDEA0, and adding that to the name descriptor array's start address (0x02DA7C) gives us 0x03B91C as the start of the template descriptors.
Likewise, the template descriptor array starts directly after the name descriptors. Since name descriptors are 8 bytes, 8 * 7124 (taken from the header) = 56992, or 0xDEA0, and adding that to the name descriptor array's start address (0x02DA7C) gives us 0x03B91C as the start of the template descriptors.
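The offset arithmetic above can be reproduced in a few lines. This sketch assumes that the instance descriptor array begins immediately after the 64-byte header, which is consistent with the name descriptor start address quoted above; the function name and dictionary keys are illustrative only.

```python
# Values taken from the level0_Final.dat header.
HEADER_SIZE = 64
INSTANCE_DESCRIPTOR_SIZE = 20
NAME_DESCRIPTOR_SIZE = 8
TEMPLATE_DESCRIPTOR_SIZE = 16
INSTANCE_COUNT = 9347
NAME_COUNT = 7124
TEMPLATE_COUNT = 56


def descriptor_array_offsets() -> dict:
    """Compute where each descriptor array starts (hypothetical helper).

    Assumption: the three arrays are laid out back to back, starting
    right after the header.
    """
    instance_start = HEADER_SIZE
    name_start = instance_start + INSTANCE_DESCRIPTOR_SIZE * INSTANCE_COUNT
    template_start = name_start + NAME_DESCRIPTOR_SIZE * NAME_COUNT
    template_end = template_start + TEMPLATE_DESCRIPTOR_SIZE * TEMPLATE_COUNT
    return {
        "instance_start": instance_start,
        "name_start": name_start,
        "template_start": template_start,
        "template_end": template_end,
    }


offsets = descriptor_array_offsets()
for label, value in offsets.items():
    print(f"{label}: 0x{value:06X}")
```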


The template descriptor array contains information about all templates (that is, resource types, AKA tags), used in the file (56 in this case, as we learned from the file header). Any resource occurring in this instance file has to have its type listed here. Here is the template descriptor at 0x3B9FC:
The template descriptor array contains information about all templates (that is, resource types, aka tags) used in the file (56 in this case, as we learned from the file header). Any resource occurring in this instance file has to have its type listed here. Here is the template descriptor at 0x3B9FC:


{{Table}}
{{Table}}
Line 175: Line 173:
{{OBDtr| 0x00 | int64  | | 3C B9 A6 71 08 00 00 00 | 0x871A6B93C | template checksum }}
{{OBDtr| 0x00 | int64  | | 3C B9 A6 71 08 00 00 00 | 0x871A6B93C | template checksum }}
{{OBDtr| 0x08 | tag    | | 45 47 52 54            | 'EGRT'      | template tag }}
{{OBDtr| 0x08 | tag    | | 45 47 52 54            | 'EGRT'      | template tag }}
{{OBDtr| 0x0C | int32  | | 01 00 00 00            | 1          | number of resources in file that use this template }}
{{OBDtr| 0x0C | int32  | | 01 00 00 00            | 1          | unused: number of resources in file that use this template }}
|}
|}


The '''template checksum''' is used to prevent loading of instance files that are not compatible with the current engine version. The '''tag''' is the same kind of number-written-as-backwards-ASCII that we discussed in the "Backwards and garbage data" section; in this case, 'EGRT' means [[TRGE]]. The '''number of resources''' is self-explanatory.
The '''template checksum''' is used to prevent loading of instance files that are not compatible with the current engine version. The '''tag''' is the same kind of number-written-as-backwards-ASCII that we discussed in the "Backwards and garbage data" section; in this case, 'EGRT' means [[TRGE]]. The field for the '''number of resources''' using this template is unused. The number should be correct for each template, but Oni never uses it for anything.
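For illustration, a single 16-byte template descriptor can be unpacked as follows. The sample bytes are the ones shown in the table for the descriptor at 0x3B9FC; the function name <code>parse_template_descriptor</code> is an assumption made for this sketch.

```python
import struct

# int64 checksum, 4-byte tag (stored backwards), int32 resource count.
TEMPLATE_DESCRIPTOR_FORMAT = "<q4si"

SAMPLE_DESCRIPTOR = bytes.fromhex(
    "3CB9A67108000000"  # 0x00 template checksum
    "45475254"          # 0x08 'EGRT'
    "01000000"          # 0x0C number of resources (not used by Oni)
)


def parse_template_descriptor(data: bytes, offset: int = 0) -> tuple:
    """Return (checksum, tag, resource_count) for one descriptor (hypothetical helper)."""
    checksum, raw_tag, count = struct.unpack_from(
        TEMPLATE_DESCRIPTOR_FORMAT, data, offset
    )
    # Reverse the tag bytes so that 'EGRT' reads as 'TRGE'.
    return checksum, raw_tag[::-1].decode("ascii"), count


checksum, tag, count = parse_template_descriptor(SAMPLE_DESCRIPTOR)
print(hex(checksum), tag, count)
```

When reading a whole file, the same function could be called in a loop with <code>offset</code> advancing by 16 bytes from the start of the template descriptor array.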


You might wonder how Oni knows how to read each type of data, such as a SUBT or an ABNA. The simple answer is that this information is hard-coded into Oni. In fact, the information on each instance type, as stored in Oni's code, is actually the real "template". The file data only gives the tag and checksum that refer to a certain template. Which types of data fields are encountered in which order is already known by Oni. These hardcoded templates also tell Oni which parts of the file data are reserved for pointers.
You might wonder how Oni knows how to read each type of data, such as a SUBT or an ABNA. The simple answer is that this information is hard-coded into Oni. In fact, the information on each instance type, as stored in Oni's code, is actually the real "template". The file data only gives the tag and checksum that refer to a certain template. Which types of data fields are encountered in which order is already known by Oni. These hardcoded templates also tell Oni which parts of the file data are reserved for pointers.