
Another failed attempt at concealing data

Another notable game developer also tried to conceal its game data on the disc. The game is Kingdom Hearts 2. Their idea was to identify each file by the hash of its name. Smart, don't you think? Well, not quite...

In theory, using a hash instead of the real name is a good way to hide the true file names from the curious. But in this particular case, they made a big mistake that rendered the whole protection useless. It's probable that during development, the game engine accessed all its data by file name, and of course some data files referenced other files by name. As a result, the real names of lots of files are spread across the whole game. So, when the time came to create the game master, they didn't modify the data to use only the hash of a file to access it; they just made a function that takes a file name, computes its hash, and accesses the data based on that hash.

To circumvent the whole protection around the file names, one can simply hook the function that computes the hash and add a printf call, so that the clear file names are output to the console as the game runs. Using those dumped file names, it's easy to figure out the naming layout of the game and bruteforce the other file names. For example, I then knew that all level files were named "level-*.dat", with * being the level number.
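To make the two steps above concrete, here is a minimal sketch in Python rather than the game's own code. Everything in it is an assumption for illustration: `name_hash` uses CRC-32 as a stand-in because the game's real hash function is not described here, and `open_by_name`, `seen_names` and `bruteforce_template` are hypothetical names. It shows a loader that still receives clear names (so one extra line leaks them) and a bruteforce driven by a naming pattern learned from the dump.

```python
import zlib


def name_hash(name: str) -> int:
    """Stand-in hash (CRC-32). The game's real hash function is an unknown here."""
    return zlib.crc32(name.encode("ascii")) & 0xFFFFFFFF


seen_names = []  # clear names captured by the hook


def open_by_name(archive: dict, name: str) -> bytes:
    """Hypothetical loader: receives a clear name, looks the data up by hash."""
    seen_names.append(name)  # plays the role of the added printf call
    return archive[name_hash(name)]


def bruteforce_template(known_hashes: set, template: str, numbers: range) -> dict:
    """Hash template.format(n) for each n; keep the names whose hash is in the table."""
    found = {}
    for n in numbers:
        candidate = template.format(n)
        h = name_hash(candidate)
        if h in known_hashes:
            found[h] = candidate
    return found


# Toy archive: on disc, only the hashes are visible as keys.
archive = {name_hash(f"level-{n}.dat"): f"data {n}".encode() for n in (1, 2, 3, 7)}

# Running the "game" leaks one name, which reveals the pattern "level-*.dat".
open_by_name(archive, "level-1.dat")

# The pattern is then enough to recover the names the game never loaded.
recovered = bruteforce_template(set(archive), "level-{}.dat", range(100))
print(sorted(recovered.values()))
```

The point of the sketch is that the hash only protects names that never appear in clear anywhere. Once a single name leaks, the search space for its siblings collapses to a handful of candidates.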

At the end of the process, I was left with a bunch of files whose names I hadn't managed to bruteforce. But by taking a look at the way the hash table was laid out, I noticed that it was sorted alphabetically by file name. And that was the second big mistake in that game.

This is an example of what the hash table would look like:

hash          file name
...           datafile1001.dat
0x89ABCDEF    (unknown)
...           datafile1002.dat

In that case, my bruteforce method had failed to figure out the name of the file whose hash is 0x89ABCDEF. But since I know the hash table is in alphabetical order, my missing file must come after "datafile1001.dat" and before "datafile1002.dat". So the missing file's name must be of the form "datafile100*". Using that knowledge, I wrote another bruteforce system that would try all the file names starting with "datafile100" and compare each hash to the missing file's hash, 0x89ABCDEF. Of course, in practice, the bruteforce would start with names that sort after "datafile1001.dat".
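The neighbour trick can be sketched like this. Again, this is an illustration under assumptions, not the tool I actually used: CRC-32 stands in for the game's unknown hash, the ".dat" extension is assumed to be known, plain Python string comparison is assumed to match the table's sort order, and `bruteforce_between` is a hypothetical name. Since the stand-in hash cannot produce 0x89ABCDEF on demand, the demo computes its target hash from a made-up secret name.

```python
import itertools
import os
import string
import zlib


def name_hash(name: str) -> int:
    """Stand-in hash (CRC-32). The game's real hash function is an unknown here."""
    return zlib.crc32(name.encode("ascii")) & 0xFFFFFFFF


def bruteforce_between(target_hash, lower, upper, alphabet, max_len, ext=".dat"):
    """Search names that sort strictly between two known neighbours.

    Candidates are built as <common prefix> + <1..max_len chars> + ext, and
    anything outside the (lower, upper) window is skipped before hashing.
    """
    prefix = os.path.commonprefix([lower, upper])  # "datafile100" in the demo
    for length in range(1, max_len + 1):
        for combo in itertools.product(alphabet, repeat=length):
            candidate = prefix + "".join(combo) + ext
            if not (lower < candidate < upper):
                continue  # the sorted table rules this name out
            if name_hash(candidate) == target_hash:
                return candidate
    return None


# Made-up example: the hidden name sits between the two known neighbours.
secret = "datafile1001b.dat"
alphabet = string.ascii_lowercase + string.digits
found = bruteforce_between(
    name_hash(secret), "datafile1001.dat", "datafile1002.dat", alphabet, max_len=3
)
print(found)
```

The sort order does most of the work here: out of tens of thousands of generated candidates, only the ones that fall inside the window between the two neighbours are ever hashed.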

In conclusion, if you want to hide your files, plan it beforehand and use the file hashes early in development so you don't leave clear file names in your build. And of course, don't sort the hash table by file name. You can, for example, sort it by hash value for a potential speed boost when searching for a hash.
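As a rough sketch of that last suggestion, a table sorted by hash value can be searched with a binary search, and its order says nothing about the names. The layout below is an assumption for illustration (two parallel lists, CRC-32 as a stand-in hash, hypothetical helper names `build_table` and `find_entry`), not the game's actual format.

```python
import bisect
import zlib


def name_hash(name: str) -> int:
    """Stand-in hash (CRC-32). Any fixed-width hash would work the same way."""
    return zlib.crc32(name.encode("ascii")) & 0xFFFFFFFF


def build_table(names):
    """Build-time step: return (hashes, entries) as parallel lists sorted by hash."""
    pairs = sorted((name_hash(n), index) for index, n in enumerate(names))
    hashes = [h for h, _ in pairs]
    entries = [index for _, index in pairs]
    return hashes, entries


def find_entry(hashes, entries, name):
    """Run-time lookup: binary search on the hash, O(log n) comparisons."""
    h = name_hash(name)
    pos = bisect.bisect_left(hashes, h)
    if pos < len(hashes) and hashes[pos] == h:
        return entries[pos]
    return None


names = ["level-1.dat", "level-2.dat", "datafile1001.dat", "datafile1002.dat"]
hashes, entries = build_table(names)
print(find_entry(hashes, entries, "level-2.dat"))
```

In a real build, the run-time code would ideally receive precomputed hashes instead of names, so that `name_hash` never has to be called on a clear string in the shipped game.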