The Valkyrie Profile CD only contain a very big file, and that's about it. This file starts with some huge chunks of obviously encrypted data, then is followed with normal data. This is just another case of proprietary archive system. Here's some extract of the encrypted part at the top:

00000000   05 01 7e 7d bd 89 bc 37 19 54 57 c3 f5 36 be de
00000010 c1 99 64 f8 9d 65 de 93 79 62 ae 2f 55 68 c5 eb
00000020 21 70 c7 8a bf 1b e1 73 1b 62 b6 0f f7 e8 4f 8b
00000030 c3 77 4c 06 9f d6 5a 91 7b 35 5a 28 57 11 5a e3
00000040 23 7d 51 be bf c9 44 59 1b 14 15 70 f7 72 61 0b
00000050 c3 d9 f3 86 9f 24 4f 10 7b 23 ac ab 57 2b 81 67
00000060 23 33 dd 33 bf 1b 70 cf 1b 62 87 ca f7 e8 98 c4
00000070 c3 77 b7 d6 9f d7 98 e0 7b 34 76 ba 57 10 45 55
00000080 23 7c 3b 64 bf c8 b9 37 1b 15 3e c3 f7 73 9c de
.
.
.
00000260 23 22 dc 33 bf 09 3b cf 1b 70 d5 ca f7 fb f4 c4
00000270 c3 64 d6 d6 9f c5 ce e0 7b 26 1b ba 57 0d 73 55
00000280 23 61 1c 64 bf d5 db 37 1b 09 2f c3 f7 6f ef de
00000290 c3 c4 74 f8 9e 60 dd 93 7a 67 af 2f 56 6e f7 eb
000002a0 22 76 d0 a7 be 5e a0 73 1a 27 e5 0f f6 ae 52 8b
000002b0 c2 31 3b 06 9e 91 48 91 7a 72 58 28 56 51 6d e3
000002c0 22 3d 77 be be 89 28 59 1a 54 35 70 f6 33 15 0b
000002d0 c2 98 fb 86 9e 66 1c 10 7a 61 e3 ab 56 69 8f 67
000002e0 22 71 da 33 be 58 21 cf 1a 21 a0 ca f6 a4 f9 c4
000002f0 c2 3b e5 d6 9e 9a fc e0 7a 78 7e ba 56 5c 17 55
.
.
.

There's a few interesting things to spot there. First, if you look at certain columns, you can see that the value flips between two extreme values. For example, the very first column flips from c3 to 23 back and forth for a long time. 23 in binary is 00100011 and C3 is 11000011, so only a few bits are actually flipping. Second, you can see that these value slowly change. The c3/23 column changes to c2/22 after a while. There's another very similar column later, flipping between 1b/7b and then flipping between 1a/7a.

For me, that's actually a big hint that there's XOR encryption at work here, but with a fixed-length key. A kinda big one, but still fixed length. If you try a bit harder, you'll finally notice that the whole pattern of flipping bits repeats itself after 256 bytes. Now, looking up to the end of the encrypted part, you'll notice that the same 256-bytes sequence repeats over and over. That's another hint about the fact the people doing this are quite braindead: the index is non-compressed, zero-padded, which means that you get the encryption stream in clear during that part. So a very simple bruteforce is to take the 256-bytes pattern at the end of the encrypted part, and use it to decrypt the whole index.

First rule: if you're going to try encrypting stuff, compress it first. That's one basic golden rule for encryption.

Now, let's dig further. This algorithm is obviously using XOR, right ? So, let's disassemble the binary using a very simple home-made MIPS disassembler, and search for the first "xor" instruction. Ho, here it goes:

(--) 80011DE0 02002021: move    $a0, $s0        ; a0 = s0
(--) 80011DE4 3C056428: lui $a1, 0x6428 ; Load upper immediate
(--) 80011DE8 34A53921: ori $a1, 0x3921 ; Or immediate
(--) 80011DEC 00003821: move $a3, $0 ; a3 = 0
(--) 80011DF0 3C086428: lui $t0, 0x6428 ; Load upper immediate
(--) 80011DF4 35083921: ori $t0, 0x3921 ; Or immediate
(--) 80011DF8 02003021: move $a2, $s0 ; a2 = s0
(--) 80011DFC 02072021: addu $a0, $s0, $a3 ; a0 = s0 + a3
(--) Reference from 0x80011E30
(--) 80011E00 24E70001: addiu $a3, 0x0001 ; Add immediate
(--) 80011E04 8CC30004: lw $v1, 0x0004($a2) ; Load word
(--) 80011E08 00051040: sll $v0, $a1, 0x01 ; v0 = a1 << immediate
(--) 80011E0C 00651826: xor $v1, $a1 ; v1 = v1 ^ a1
(--) 80011E10 00A22826: xor $a1, $v0 ; a1 = a1 ^ v0
(--) 80011E14 ACC30004: sw $v1, 0x0004($a2) ; Store word
(--) 80011E18 24C60004: addiu $a2, 0x0004 ; Add immediate
(--) 80011E1C 90835004: lbu $v1, 0x5004($a0) ; Load unsigned byte
(--) 80011E20 00051027: nor $v0, $0, $a1 ; v0 = ~(0 & a1)
(--) 80011E24 00651826: xor $v1, $a1 ; v1 = v1 ^ a1
(--) 80011E28 00482826: xor $a1, $v0, $t0 ; a1 = v0 ^ t0
(--) 80011E2C 28E21400: slti $v0, $a3, 0x1400 ; v0 = a3 < immediate ? 1 : 0 (signed)
(--) 80011E30 1440FFF2: bnz $v0, 0x80011DFC ; Branch if v0 != 0
(--) Reference to 0x80011DFC
(--) 80011E34 A0835004: sb $v1, 0x5004($a0) ; Store byte
(--) 80011E38 8FBF0020: lw $ra, 0x0020($sp) ; Load word
(--) 80011E3C 8FB1001C: lw $s1, 0x001C($sp) ; Load word
(--) 80011E40 8FB00018: lw $s0, 0x0018($sp) ; Load word
(--) 80011E44 27BD0028: addiu $sp, 0x0028 ; Add immediate
(--) 80011E48 03E00008: jr $ra ; Jump register
(--) 80011E4C 00000000: nop

And THAT's the actual encryption algorithm. Found in about 8 seconds.

Second rule: if you ever ship the encryption algorithm within the product alongside the encrypted data, OBFUSCATE it, and don't use extra-obvious encryption instructions such as xor. There are other very simple mechanisms you can use such as add/sub for instance that'll achieve the same goal, but will be way less obvious to spot within the code.

Finally, let's have a look at the algorithm itself. For you who don't really like MIPS assembly, here's the a simplified C++ version of it:

class VP_crypto() {
public:
    static unsigned int key = 0x64283921;
    VP_crypto() : m_state(key) { }
    unsigned int GetNext() {
        unsigned int next = m_state;
        m_state ^= m_state << 1;
        m_state = ~m_state ^ key;
        return next;
    }
private:
    unsigned int m_state;
}

At least two things to note there.

  • The "key" should be something secret. Even though this is shipped within the binary, you SHOULD NOT emit the key itself as the first encryption bytes! The "next" value gets the key value at the first iteration.
  • You actually can do a proper random number generator which would look this way. Almost. But the fact this one is trying its best to be smart or fast makes it a huge failure, as this will invariably loop after a very short amount of bytes! The failure lies in not trying to multiply the state by a prime number. Simply doing this would get you million of random number bytes without too much sweat. The result would still be very weak, but not as weak as this awful beast here.

As a conclusion: never ever try to cook your own encryption algorithm. Use AES. Use even RC4. And try to write implementations of them which are NOT obvious to find. Remember that just everyone is going to use a pseudo random number generator, and use its output to XOR the values. Just avoiding XORing is going to help a lot, since the opcodes are going to be awfully obvious to find within the disassembled output. Oh and CHECK your implementation against known examples! How many times did I see buggy RC4 implementations broken to the point that it wasn't really encrypting anything. Encryption is serious business.