Parts
- Part 1 — Searching The Data → /posts/wows_depack_part1/
- Part 2 — Getting The Metadata → /posts/wows_depack_part2/
- Part 3 — Reading The Database → /posts/wows_depack_part3/
- Part 4 — Tidying-Up The Project → /posts/wows_depack_part4/
Searching & Reading The Metadata
In Part 1, we discovered:
- Data lives in res_packages/ as custom .pkg archives.
- Each .pkg is a sequence of DEFLATE-compressed blobs separated by 64-bit IDs with zero padding.
- No file names are stored inside the .pkg files; names/paths must exist elsewhere (in indexes).
Now we need to find where and how the file and directory structure is stored.
Back To File Exploration
The metadata hopefully isn’t embedded in executables. Let’s search:
# List all files
# then remove uninteresting bits like replays, crashes, DLLs, logs, .pkg
# or CEF stuff (embedded Chrome used for the armory, inventory, dockyard and clan base)
kakwa@linux Games/World of Warships » find ./ -type f | grep -v cef | grep -v replays \
| grep -v crashes | grep -v '\.pkg' | grep -v '\.dll' | grep -v '\.log' \
| grep -v '\.exe' | less
[...]
./bin/6775398/res/texts/pl/LC_MESSAGES/global.mo
./bin/6775398/res/texts/zh_tw/LC_MESSAGES/global.mo
./bin/6775398/res/texts/fr/LC_MESSAGES/global.mo
./bin/6775398/res/texts/zh_sg/LC_MESSAGES/global.mo
./bin/6775398/res/camerasConsumer.xml
./bin/6775398/bin32/paths.xml
./bin/6775398/bin32/Licenses.txt
./bin/6775398/bin32/monitor_config.json
./bin/6775398/bin64/paths.xml
./bin/6775398/bin64/Licenses.txt
./bin/6775398/bin64/monitor_config.json
[...]
./bin/6775398/idx/spaces_dock_dry.idx
./bin/6775398/idx/spaces_dock_hsf.idx
./bin/6775398/idx/spaces_sea_hope.idx
./bin/6775398/idx/vehicles_pve.idx
./bin/6775398/idx/vehicles_level8_fr.idx
./bin/6775398/idx/vehicles_level5_panasia.idx
[...]
The .idx files look promising, especially since their names match the .pkg files:
- spaces_dock_hsf.idx → spaces_dock_hsf_0001.pkg
- vehicles_level9_ned.idx → vehicles_level9_ned_0001.pkg
Checking We Have The Metadata
Let’s take a look:
kakwa@linux bin/6775398/idx » strings -n 5 system_data.idx
#%B'E
#%B'E
#%B'E
E)zj'
FM'lb
%}n:b
( ?A+
c|'lY
zc78'
tKDStorage.bin
waves_heights1.dds
animatedMiscs.xml
LowerDeck.dds
[...]
maps
helpers
FoamMapLowFreq.dds
tritanopia.dds
color_correct_default.dds
Color.dds
Shimmer.dds
highlight_noise.dds
space_variation_dummy.dds
waves_heights0.dds
4F$)p
E)zjp
|5*y
FM'lp
k4|W8
%}n:p
( ?A+
c|'lp
LrL)t
atw|$
:M+Xp
F?mep
wsystem_data_0001.pkg
Bingo, we have all the file names, and at the end, the name of the corresponding .pkg file.
These .idx files, as the extension indicates, are our indexes containing all the file names and metadata.
Also, note that a few names (like maps or helpers) have no extension; these are probably directory names.
Note: bin/ directory and game versions
The ./bin directory contains build-numbered subdirectories (5241351, 6081105, etc.), with WoWs keeping the current and previous builds. We’ll use the highest number for the latest indexes.
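Picking the most recent build can be scripted. The sketch below assumes every build directory under bin/ has a purely numeric name, as in the listing above; latest_build_dir is a hypothetical helper name, not part of any existing tool.

```python
from pathlib import Path
from typing import Optional


def latest_build_dir(bin_dir: str) -> Optional[Path]:
    """Return the build sub-directory with the highest number, or None."""
    builds = [
        p for p in Path(bin_dir).iterdir()
        # Assumption: build directories are named with digits only.
        if p.is_dir() and p.name.isdigit()
    ]
    # Compare as integers so that 10000000 would sort after 6775398.
    return max(builds, key=lambda p: int(p.name), default=None)
```

Calling latest_build_dir("./bin") on the install shown here would be expected to return the 6775398 directory, whose idx/ subdirectory holds the indexes.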
IDX File Structure
General Layout
Let’s examine an index file:
kakwa@linux 6775398/idx » hexdump -C system_data.idx | less
00000000 49 53 46 50 00 00 00 02 91 9d 39 b4 40 00 00 00 |ISFP......9.@...|
00000010 37 01 00 00 1c 01 00 00 01 00 00 00 00 00 00 00 |7...............|
00000020 28 00 00 00 00 00 00 00 f6 3b 00 00 00 00 00 00 |(........;......|
00000030 36 71 00 00 00 00 00 00 0e 00 00 00 00 00 00 00 |6q..............|
00000040 e0 26 00 00 00 00 00 00 8f 0c 9a ba 4f 40 b6 93 |.&..........O@..|
00000050 27 b9 08 b1 d1 a1 b1 db 13 00 00 00 00 00 00 00 |'...............|
00000060 ce 26 00 00 00 00 00 00 8f ec 87 4a 28 d0 f7 c7 |.&.........J(...|
00000070 a4 eb 1b 3e 50 21 d8 74 12 00 00 00 00 00 00 00 |...>P!.t........|
00000080 c1 26 00 00 00 00 00 00 ad 70 a2 e7 ac 2c 4f 6b |.&.......p...,Ok|
00000090 27 b9 08 b1 d1 a1 b1 db 0e 00 00 00 00 00 00 00 |'...............|
000000a0 b3 26 00 00 00 00 00 00 4e 84 a5 6a 94 dc 1f 7f |.&......N..j....|
000000b0 62 f5 aa 4b 5e 15 7f 93 10 00 00 00 00 00 00 00 |b..K^...........|
000000c0 a1 26 00 00 00 00 00 00 4e b0 fe 23 62 40 a5 65 |.&......N..#b@.e|
000000d0 27 b9 08 b1 d1 a1 b1 db 20 00 00 00 00 00 00 00 |'....... .......|
000000e0 91 26 00 00 00 00 00 00 8e c7 6a 58 7c 86 62 33 |.&........jX|.b3|
000000f0 27 b9 08 b1 d1 a1 b1 db 0f 00 00 00 00 00 00 00 |'...............|
00000100 91 26 00 00 00 00 00 00 8e 43 3d e9 cf 49 52 a4 |.&.......C=..IR.|
00000110 62 f5 aa 4b 5e 15 7f 93 16 00 00 00 00 00 00 00 |b..K^...........|
00000120 80 26 00 00 00 00 00 00 0e 3c 9a 6d 22 de 7b da |.&.......<.m".{.|
00000130 4e b0 fe 23 62 40 a5 65 0b 00 00 00 00 00 00 00 |N..#b@.e........|
00000140 76 26 00 00 00 00 00 00 0e 48 58 ea 50 44 1a 47 |v&.......HX.PD.G|
00000150 df 61 50 67 c7 3a dd 7a 13 00 00 00 00 00 00 00 |.aPg.:.z........|
00000160 61 26 00 00 00 00 00 00 e0 9f c3 bd d2 12 20 04 |a&............ .|
00000170 09 28 f1 df 2d 04 93 de 0b 00 00 00 00 00 00 00 |.(..-...........|
00000180 54 26 00 00 00 00 00 00 e0 95 53 f6 cc 08 c0 46 |T&........S....F|
00000190 06 cf 85 bd 69 99 e2 46 0d 00 00 00 00 00 00 00 |....i..F........|
000001a0 3f 26 00 00 00 00 00 00 e0 99 68 5e 2d 70 13 72 |?&........h^-p.r|
000001b0 4c d1 2e 30 73 38 d9 13 10 00 00 00 00 00 00 00 |L..0s8..........|
[...]
000026c0 0d 15 00 00 00 00 00 00 ce 6a 28 bc cf e7 79 c8 |.........j(...y.|
000026d0 a4 eb 1b 3e 50 21 d8 74 1a 00 00 00 00 00 00 00 |...>P!.t........|
000026e0 01 15 00 00 00 00 00 00 6c c0 c9 f7 7e 00 03 05 |........l...~...|
000026f0 a4 eb 1b 3e 50 21 d8 74 13 00 00 00 00 00 00 00 |...>P!.t........|
00002700 fb 14 00 00 00 00 00 00 74 d1 1b 8d f4 ff 7a ce |........t.....z.|
00002710 a4 eb 1b 3e 50 21 d8 74 4b 44 53 74 6f 72 61 67 |...>P!.tKDStorag|
00002720 65 2e 62 69 6e 00 77 61 76 65 73 5f 68 65 69 67 |e.bin.waves_heig|
00002730 68 74 73 31 2e 64 64 73 00 61 6e 69 6d 61 74 65 |hts1.dds.animate|
00002740 64 4d 69 73 63 73 2e 78 6d 6c 00 4c 6f 77 65 72 |dMiscs.xml.Lower|
00002750 44 65 63 6b 2e 64 64 73 00 63 6f 6d 6d 61 6e 64 |Deck.dds.command|
[...]
00003bc0 2e 64 64 73 00 68 69 67 68 6c 69 67 68 74 5f 6e |.dds.highlight_n|
00003bd0 6f 69 73 65 2e 64 64 73 00 73 70 61 63 65 5f 76 |oise.dds.space_v|
00003be0 61 72 69 61 74 69 6f 6e 5f 64 75 6d 6d 79 2e 64 |ariation_dummy.d|
00003bf0 64 73 00 77 61 76 65 73 5f 68 65 69 67 68 74 73 |ds.waves_heights|
00003c00 30 2e 64 64 73 00 8f 0c 9a ba 4f 40 b6 93 70 11 |0.dds.....O@..p.|
00003c10 03 07 0d 33 ed 77 00 00 00 00 00 00 00 00 05 00 |...3.w..........|
00003c20 00 00 01 00 00 00 f5 21 00 00 bf 00 45 5c 6c 36 |.......!....E\l6|
00003c30 00 00 00 00 00 00 8f ec 87 4a 28 d0 f7 c7 70 11 |.........J(...p.|
00003c40 03 07 0d 33 ed 77 1e 9b ef 05 00 00 00 00 05 00 |...3.w..........|
00003c50 00 00 01 00 00 00 15 15 01 00 03 77 63 97 3e ab |...........wc.>.|
00003c60 02 00 00 00 00 00 ad 70 a2 e7 ac 2c 4f 6b 70 11 |.......p...,Okp.|
00003c70 03 07 0d 33 ed 77 05 22 00 00 00 00 00 00 05 00 |...3.w."........|
00003c80 00 00 01 00 00 00 cb 01 00 00 6d b9 de c1 ad 0c |..........m.....|
[...]
000070a0 00 00 01 00 00 00 90 24 00 00 62 c1 d9 06 f8 d8 |.......$..b.....|
000070b0 00 00 00 00 00 00 57 b3 82 06 56 f0 2a e6 70 11 |......W...V.*.p.|
000070c0 03 07 0d 33 ed 77 ea cc 15 0a 00 00 00 00 05 00 |...3.w..........|
000070d0 00 00 01 00 00 00 7c 01 00 00 84 a8 12 1d 61 04 |......|.......a.|
000070e0 00 00 00 00 00 00 57 64 91 29 c9 3c f0 96 70 11 |......Wd.).<..p.|
000070f0 03 07 0d 33 ed 77 a9 ef 15 0a 00 00 00 00 05 00 |...3.w..........|
00007100 00 00 01 00 00 00 6f 09 00 00 fc 56 94 f8 9a 37 |......o....V...7|
00007110 00 00 00 00 00 00 21 67 ac 70 22 ec ca b8 70 11 |......!g.p"...p.|
00007120 03 07 0d 33 ed 77 28 f9 15 0a 00 00 00 00 05 00 |...3.w(.........|
00007130 00 00 01 00 00 00 0b 2d 00 00 03 bd b0 50 67 e9 |.......-.....Pg.|
00007140 00 00 00 00 00 00 15 00 00 00 00 00 00 00 18 00 |................|
00007150 00 00 00 00 00 00 70 11 03 07 0d 33 ed 77 73 79 |......p....3.wsy|
00007160 73 74 65 6d 5f 64 61 74 61 5f 30 30 30 31 2e 70 |stem_data_0001.p|
00007170 6b 67 00 |kg.|
The file appears to be laid out in at least 3 chunks:
- A first chunk of metadata
- All the file name strings, \0-separated, including a few strings without extensions (probably directory names)
- A second chunk of metadata
Let’s look for the IDs we found in the corresponding .pkg (here system_data_0001.pkg), for example 00 00 00 00 | bf 00 45 5c | 6c 36 00 00 | 00 00 00 00.
[......................] 00 8f 0c 9a ba 4f 40 b6 93 70 11 |0.dds.....O@..p.|
00003c10 03 07 0d 33 ed 77 00 00 00 00 00 00 00 00 05 00 |...3.w..........|
00003c20 00 00 01 00 00 00 f5 21 00 00 bf 00 45 5c 6c 36 |.......!....E\l6|
00003c30 00 00 00 00 00 00 8f ec [...]
Ok, it’s there, in the second chunk, and the same holds when we test other IDs. So we have at least a link by ID between the .idx and the .pkg file.
We will come back to the second chunk later with that in mind, but let’s focus on the first chunk for now.
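The same lookup can be scripted rather than eyeballed. This is a minimal sketch: find_all is a hypothetical helper, the needle is only the non-zero part of the example ID plus two trailing zero bytes, and the sample bytes are copied from the dump above (offsets 0x3c20 to 0x3c35) rather than read from a real file.

```python
def find_all(data: bytes, needle: bytes) -> list:
    """Return every offset at which needle occurs in data (overlaps included)."""
    hits = []
    pos = data.find(needle)
    while pos != -1:
        hits.append(pos)
        pos = data.find(needle, pos + 1)
    return hits


# Significant bytes of the ID seen in system_data_0001.pkg.
pkg_id = bytes.fromhex("bf 00 45 5c 6c 36 00 00")

# Bytes 0x3c20..0x3c35 of system_data.idx, copied from the hexdump.
idx_sample = bytes.fromhex(
    "00 00 01 00 00 00 f5 21 00 00 bf 00 45 5c 6c 36 "
    "00 00 00 00 00 00"
)

base = 0x3C20  # file offset of the first sample byte
print([hex(base + off) for off in find_all(idx_sample, pkg_id)])  # ['0x3c2a']
```

On a real file, reading the whole .idx with open(path, "rb").read() and passing it as data would list every place an ID shows up.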
Metadata Chunk Format
The first section contains file metadata entries:
kakwa@linux 6775398/idx » hexdump -C system_data.idx | less
00000000 49 53 46 50 00 00 00 02 91 9d 39 b4 40 00 00 00 |ISFP......9.@...|
00000010 37 01 00 00 1c 01 00 00 01 00 00 00 00 00 00 00 |7...............|
00000020 28 00 00 00 00 00 00 00 f6 3b 00 00 00 00 00 00 |(........;......|
00000030 36 71 00 00 00 00 00 00 0e 00 00 00 00 00 00 00 |6q..............|
00000040 e0 26 00 00 00 00 00 00 8f 0c 9a ba 4f 40 b6 93 |.&..........O@..|
00000050 27 b9 08 b1 d1 a1 b1 db 13 00 00 00 00 00 00 00 |'...............|
00000060 ce 26 00 00 00 00 00 00 8f ec 87 4a 28 d0 f7 c7 |.&.........J(...|
00000070 a4 eb 1b 3e 50 21 d8 74 12 00 00 00 00 00 00 00 |...>P!.t........|
00000080 c1 26 00 00 00 00 00 00 ad 70 a2 e7 ac 2c 4f 6b |.&.......p...,Ok|
00000090 27 b9 08 b1 d1 a1 b1 db 0e 00 00 00 00 00 00 00 |'...............|
000000a0 b3 26 00 00 00 00 00 00 4e 84 a5 6a 94 dc 1f 7f |.&......N..j....|
000000b0 62 f5 aa 4b 5e 15 7f 93 10 00 00 00 00 00 00 00 |b..K^...........|
000000c0 a1 26 00 00 00 00 00 00 4e b0 fe 23 62 40 a5 65 |.&......N..#b@.e|
000000d0 27 b9 08 b1 d1 a1 b1 db 20 00 00 00 00 00 00 00 |'....... .......|
000000e0 91 26 00 00 00 00 00 00 8e c7 6a 58 7c 86 62 33 |.&........jX|.b3|
000000f0 27 b9 08 b1 d1 a1 b1 db 0f 00 00 00 00 00 00 00 |'...............|
00000100 91 26 00 00 00 00 00 00 8e 43 3d e9 cf 49 52 a4 |.&.......C=..IR.|
00000110 62 f5 aa 4b 5e 15 7f 93 16 00 00 00 00 00 00 00 |b..K^...........|
00000120 80 26 00 00 00 00 00 00 0e 3c 9a 6d 22 de 7b da |.&.......<.m".{.|
00000130 4e b0 fe 23 62 40 a5 65 0b 00 00 00 00 00 00 00 |N..#b@.e........|
00000140 76 26 00 00 00 00 00 00 0e 48 58 ea 50 44 1a 47 |v&.......HX.PD.G|
00000150 df 61 50 67 c7 3a dd 7a 13 00 00 00 00 00 00 00 |.aPg.:.z........|
00000160 61 26 00 00 00 00 00 00 e0 9f c3 bd d2 12 20 04 |a&............ .|
00000170 09 28 f1 df 2d 04 93 de 0b 00 00 00 00 00 00 00 |.(..-...........|
[...]
00002690 a4 eb 1b 3e 50 21 d8 74 0c 00 00 00 00 00 00 00 |...>P!.t........|
000026a0 21 15 00 00 00 00 00 00 00 40 cc c7 49 c4 54 09 |!........@..I.T.|
000026b0 a4 eb 1b 3e 50 21 d8 74 14 00 00 00 00 00 00 00 |...>P!.t........|
000026c0 0d 15 00 00 00 00 00 00 ce 6a 28 bc cf e7 79 c8 |.........j(...y.|
000026d0 a4 eb 1b 3e 50 21 d8 74 1a 00 00 00 00 00 00 00 |...>P!.t........|
000026e0 01 15 00 00 00 00 00 00 6c c0 c9 f7 7e 00 03 05 |........l...~...|
000026f0 a4 eb 1b 3e 50 21 d8 74 13 00 00 00 00 00 00 00 |...>P!.t........|
00002700 fb 14 00 00 00 00 00 00 74 d1 1b 8d f4 ff 7a ce |........t.....z.|
00002710 a4 eb 1b 3e 50 21 d8 74 4b 44 53 74 6f 72 61 67 |...>P!.tKDStorag|
00002720 65 2e 62 69 6e 00 77 61 76 65 73 5f 68 65 69 67 |e.bin.waves_heig|
00002730 68 74 73 31 2e 64 64 73 00 61 6e 69 6d 61 74 65 |hts1.dds.animate|
00002740 64 4d 69 73 63 73 2e 78 6d 6c 00 4c 6f 77 65 72 |dMiscs.xml.Lower|
00002750 44 65 63 6b 2e 64 64 73 00 63 6f 6d 6d 61 6e 64 |Deck.dds.command|
Stare at the hexdump long enough and some patterns start to emerge.
At regular intervals, every 256 bits, we get a 64-bit integer with a relatively low value, hinting at individual metadata sets of 256 bits.
This looks suspiciously like some kind of constant enum coded into a 64-bit integer. My best guess right now would be some kind of file type code, itself defined as a constant in the game engine like this:
#define FILE_TYPE_1 0x01
#define FILE_TYPE_2 0x02
// [...]
Let’s call it file_type for now.
Looking at the end of the first chunk (right before we get KDStorage.bin), we have to go back 4 x 64 bits to get something that looks like a file_type.
This means file_type is the first field in the 256-bit structure.
Let’s look at the next 64 bits: ce 26 00 00 00 00 00 00, c1 26 00 00 00 00 00 00, 80 26 00 00 00 00 00 00, etc. These values are again rather small.
Also, these values, at least the first ones, are suspiciously close to 00002718, i.e. right where the section containing the file names starts.
The last ones, 01 15 00 00 00 00 00 00, fb 14 00 00 00 00 00 00, etc., are smaller, and suspiciously, they are close to the length of the file name section (approximately 0x00003c00 - 0x00002710 = 0x000014f0).
The second field is then probably some kind of offset, most likely from the start of one 256-bit chunk to the start of one of the file names.
Looking at other .idx files seems to confirm that.
Let’s call it offset for now.
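The offset guess can be checked with a bit of arithmetic on the values in the dump. The sketch below assumes that the first entry starts at 0x38 (where the first low-value 64-bit integer shows up), that entries are 32 bytes each, and that the offset is counted from the start of its own entry; name_position is a hypothetical helper.

```python
FIRST_ENTRY = 0x38  # assumption: where the first 256-bit entry starts in the dump
ENTRY_SIZE = 32     # 256 bits

# 'offset' fields of the first three entries, read as little-endian integers.
offsets = [0x26E0, 0x26CE, 0x26C1]


def name_position(index: int, offset: int) -> int:
    """Absolute position of the name if offset is counted from the entry start."""
    return FIRST_ENTRY + index * ENTRY_SIZE + offset


for i, off in enumerate(offsets):
    print(i, hex(name_position(i, off)))
# 0 0x2718  -> 'K' of KDStorage.bin
# 1 0x2726  -> 'w' of waves_heights1.dds
# 2 0x2739  -> 'a' of animatedMiscs.xml
```

All three results land on the first character of a name in the hexdump, which fits the idea of an offset relative to each entry.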
Next, let’s look at the remaining 128 bits.
8f 0c 9a ba 4f 40 b6 93 | 27 b9 08 b1 d1 a1 b1 db
8f ec 87 4a 28 d0 f7 c7 | a4 eb 1b 3e 50 21 d8 74
ad 70 a2 e7 ac 2c 4f 6b | 27 b9 08 b1 d1 a1 b1 db
4e 84 a5 6a 94 dc 1f 7f | 62 f5 aa 4b 5e 15 7f 93
4e b0 fe 23 62 40 a5 65 | 27 b9 08 b1 d1 a1 b1 db
8e c7 6a 58 7c 86 62 33 | 27 b9 08 b1 d1 a1 b1 db
First thing to note: all the bits are used, which disqualifies offsets or simple enum IDs like before.
Right now, we are not even sure these 128 bits are part of one 128-bit field (for example a hash), two 64-bit integers, four 32-bit integers, or any combination of 16, 32 or 64 bits that ends up making a 128-bit chunk.
Looking at it more closely, the first 64 bits look rather random. In the last 64 bits, however, we see quite a few values repeating themselves (ex: 27 b9 08 b1 d1 a1 b1 db).
Let’s check the whole file:
kakwa@linux 6775398/idx » hexdump -C system_data.idx | grep '00 00 00 00 00 00 00' | \
sed 's/^..........//' | sed 's/ .*//' | sort | uniq -c | sort -n
# note: first number is the number of occurrences
1 00 00 00 00 00 00 00 af
1 1c 6a 8d 7f df 8e b3 35
1 28 00 00 00 00 00 00 00
1 36 71 00 00 00 00 00 00
1 37 01 00 00 1c 01 00 00
1 7d e3 1c a4 35 3e 98 8b
1 e0 95 53 f6 cc 08 c0 46
2 12 46 58 36 c8 ec 47 8b
2 4e b0 fe 23 62 40 a5 65
2 d0 d7 a5 ce a8 86 0e ae
3 1a aa c7 3c 4e 76 ad 94
3 4c d1 2e 30 73 38 d9 13
3 4c f0 ea c1 d5 5a 8d 12
3 88 57 fc 1c 72 f3 84 fa
3 ac 82 d5 f6 9e db 47 f9
3 b1 86 45 dc 7a 63 37 38
3 fc 27 83 2d 44 46 30 a3
6 76 83 17 b5 cf dd b7 0e
8 06 cf 85 bd 69 99 e2 46
9 09 28 f1 df 2d 04 93 de
10 a4 eb 1b 3e 50 21 d8 74
10 d7 22 2f fc 0a 67 7a 0d
14 aa db f0 18 01 89 b6 d8
15 df 61 50 67 c7 3a dd 7a
18 19 ac 65 3f 91 78 97 dc
19 38 e6 83 3c 74 a7 20 b2
19 59 dc e0 43 fc 88 b7 7c
23 d3 9e 86 23 25 42 27 45
24 0e 48 58 ea 50 44 1a 47
33 d7 19 f3 03 3e 6e 59 03
34 27 b9 08 b1 d1 a1 b1 db
38 62 f5 aa 4b 5e 15 7f 93
Indeed, the repetitions are quite frequent, and running the same command on other files yields roughly the same values:
kakwa@linux 6775398/idx » hexdump -C particles.idx | grep '00 00 00 00 00 00 00' | \
sed 's/^..........//' | sed 's/ .*//' | sort | uniq -c | sort -n
1 27 b9 08 b1 d1 a1 b1 db
1 28 00 00 00 00 00 00 00
1 8e ae 00 00 00 00 00 00
1 bd 01 00 00 b7 01 00 00
4 ca 2c b7 97 24 b8 1c 86
5 1f 19 a6 0c a2 9b 7f b3
16 94 a1 23 f4 c5 41 b8 42
20 91 cc 55 52 25 2a 42 d4
81 de 3e 45 0e 99 dc 30 14
317 66 52 00 d6 89 64 1d 2e
Moreover, sound_music.idx, which as the name implies probably only contains sound files, returns mostly one type:
kakwa@linux 6775398/idx » hexdump -C sound_music.idx | grep '00 00 00 00 00 00 00' | \
sed 's/^..........//' | sed 's/ .*//' | sort | uniq -c | sort -n
513 93 63 67 56 c2 97 75 69
Note: these commands are by no means accurate; they are likely to catch garbage and miscount. But these are quick and dirty ways to validate hypotheses.
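A less noisy count is possible by walking the entries instead of grepping the hexdump. This is only a sketch under assumptions: 32-byte little-endian entries, with the position of the first entry and the number of entries supplied by the caller; count_last_field is a hypothetical helper.

```python
import struct
from collections import Counter

ENTRY = struct.Struct("<QQQQ")  # four little-endian 64-bit fields, 32 bytes


def count_last_field(data: bytes, first_entry: int, entry_count: int) -> Counter:
    """Count how often each value of the last 64-bit field occurs."""
    counts = Counter()
    for i in range(entry_count):
        fields = ENTRY.unpack_from(data, first_entry + i * ENTRY.size)
        counts[fields[3]] += 1  # fields[3] is the last 64 bits of the entry
    return counts
```

Printing count_last_field(data, 0x38, n).most_common() would then give the same kind of table as the sort | uniq -c pipeline, without the garbage lines.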
So it seems we are dealing with another file_type field. Let’s call it file_type2, and rename the first one file_type1.
So in the end, making a few assumptions for now, we have figured out the rough format of this section:
+====+====+====+====+====+====+====+====++====+====+====+====+====+====+====+====+
| T1 | T1 | T1 | T1 | T1 | T1 | T1 | T1 || OF | OF | OF | OF | OF | OF | OF | OF |
+====+====+====+====+====+====+====+====++====+====+====+====+====+====+====+====+
|<---------- file type 1 -------------->||<------------ offset ----------------->|
| 64 bits || 64 bits |
+====+====+====+====+====+====+====+====++====+====+====+====+====+====+====+====+
| UN | UN | UN | UN | UN | UN | UN | UN || T2 | T2 | T2 | T2 | T2 | T2 | T2 | T2 |
+====+====+====+====+====+====+====+====++====+====+====+====+====+====+====+====+
|<------------ unknown ---------------->||<---------- file type 2 -------------->|
| 64 bits || 64 bits |
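The layout above maps directly onto a fixed-size record. Here is a minimal sketch of decoding one entry, assuming little-endian 64-bit fields; the field names are the working names used so far and Entry / parse_entry are hypothetical. The sample bytes are the first entry of system_data.idx (offset 0x38 in the dump).

```python
import struct
from typing import NamedTuple


class Entry(NamedTuple):
    file_type1: int  # low-value 64-bit integer (meaning still a guess)
    offset: int      # offset towards the name
    unknown: int     # looks random
    file_type2: int  # repeats a lot across entries


def parse_entry(raw: bytes) -> Entry:
    """Decode one 256-bit entry as four little-endian 64-bit integers."""
    return Entry(*struct.unpack("<QQQQ", raw[:32]))


first = bytes.fromhex(
    "0e 00 00 00 00 00 00 00 e0 26 00 00 00 00 00 00 "
    "8f 0c 9a ba 4f 40 b6 93 27 b9 08 b1 d1 a1 b1 db"
)
entry = parse_entry(first)
print(entry.file_type1, hex(entry.offset))  # 14 0x26e0
```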
Header Section Analysis
The .idx file header contains:
00000000 49 53 46 50 00 00 00 02 91 9d 39 b4 40 00 00 00 |ISFP......9.@...|
00000010 37 01 00 00 1c 01 00 00 01 00 00 00 00 00 00 00 |7...............|
00000020 28 00 00 00 00 00 00 00 f6 3b 00 00 00 00 00 00 |(........;......|
00000030 36 71 00 00 00 00 00 00 0e 00 00 00 00 00 00 00 |6q..............|
00000040 e0 26 00 00 00 00 00 00 8f 0c 9a ba 4f 40 b6 93 |.&..........O@..|
00000050 27 b9 08 b1 d1 a1 b1 db 13 00 00 00 00 00 00 00 |'...............|
00000060 ce 26 00 00 00 00 00 00 8f ec 87 4a 28 d0 f7 c7 |.&.........J(...|
00000070 a4 eb 1b 3e 50 21 d8 74 12 00 00 00 00 00 00 00 |...>P!.t........|
00000080 c1 26 00 00 00 00 00 00 ad 70 a2 e7 ac 2c 4f 6b |.&.......p...,Ok|
00000090 27 b9 08 b1 d1 a1 b1 db 0e 00 00 00 00 00 00 00 |'...............|
These first bytes don’t look like the metadata entries described previously; we have a magic number (ISFP), and the rest doesn’t look like an entry either (too many low-value 32-bit integers).
This means we most likely have a header section containing things like:
- magic numbers
- types
- sizes
- number of entries/files
The first thing to determine is the size of the header. Looking at it, the first section starts at 0x38 (recognisable by the full 64-bit integers).
This means the header is 7 × 64 bits (56 bytes).
Let’s analyze the content.
Looking at all the files, for the first 128 bits, we get:
kakwa@linux 6775398/idx » for i in *;do hexdump -C $i | head -n 1;done
[...]
00000000 49 53 46 50 00 00 00 02 1b 73 f9 d5 40 00 00 00 |ISFP.....s..@...|
00000000 49 53 46 50 00 00 00 02 78 f0 2c 09 40 00 00 00 |ISFP....x.,.@...|
00000000 49 53 46 50 00 00 00 02 b8 fe ba b9 40 00 00 00 |ISFP........@...|
00000000 49 53 46 50 00 00 00 02 06 24 fa 2d 40 00 00 00 |ISFP.....$.-@...|
00000000 49 53 46 50 00 00 00 02 1e 7e f6 d9 40 00 00 00 |ISFP.....~..@...|
00000000 49 53 46 50 00 00 00 02 dd 21 74 c2 40 00 00 00 |ISFP.....!t.@...|
00000000 49 53 46 50 00 00 00 02 33 28 63 bd 40 00 00 00 |ISFP....3(c.@...|
00000000 49 53 46 50 00 00 00 02 cb 5c e2 0d 40 00 00 00 |ISFP.....\..@...|
00000000 49 53 46 50 00 00 00 02 cb e8 8a fd 40 00 00 00 |ISFP........@...|
00000000 49 53 46 50 00 00 00 02 6e 04 b1 62 40 00 00 00 |ISFP....n..b@...|
00000000 49 53 46 50 00 00 00 02 15 1c a2 f9 40 00 00 00 |ISFP........@...|
[...]
As we can see, the 1st, 2nd, and 4th 32-bit chunks are always the same, and looking at the values, we have respectively:
- a magic number (ISFP)
- 00 00 00 02, which is rather weird (it would read as a small ID if the format were big-endian, but every other field is little-endian). Maybe it is actually part of the magic number. As it doesn’t vary, it’s not too important for the task at hand here.
- 40 00 00 00, which, like type 1, looks like a low-value enum, and given its position in the index file, within the header, we are most likely dealing with an archive type. Again, as it doesn’t vary, it’s not really important.
The 3rd 32-bit integer uses all the available bits, so it’s unlikely a size. Maybe it’s a CRC32 or a unique ID for the archives.
Edit: upon closer inspection, the 40 00 00 00 value might not be an archive type; its value is 64 in decimal, which might be a header size, or simply the size of an integer (64 bits).
So we have:
+====+====+====+====++====+====+====+====++====+====+====+====++====+====+====+====+
| MA | MA | MA | MA || 00 | 00 | 00 | 02 || ID | ID | ID | ID || 40 | 00 | 00 | 00 |
+====+====+====+====++====+====+====+====++====+====+====+====++====+====+====+====+
|<----- magic ----->||<----- ???? ------>||<---- id/crc ----->||<----- ??????? --->|
Now, let’s look at the next 128 bits (second line in the hexdump).
kakwa@linux 6775398/idx » for i in *;do hexdump -C $i | head -n 2 | tail -n 1;done
00000010 bd 01 00 00 b7 01 00 00 01 00 00 00 00 00 00 00 |................|
00000010 35 01 00 00 1d 01 00 00 01 00 00 00 00 00 00 00 |5...............|
00000010 5b 00 00 00 57 00 00 00 01 00 00 00 00 00 00 00 |[...W...........|
00000010 5c 00 00 00 58 00 00 00 01 00 00 00 00 00 00 00 |\...X...........|
00000010 10 18 00 00 ff 17 00 00 01 00 00 00 00 00 00 00 |................|
00000010 29 3b 00 00 0b 3b 00 00 01 00 00 00 00 00 00 00 |);...;..........|
00000010 04 02 00 00 02 02 00 00 01 00 00 00 00 00 00 00 |................|
00000010 1e 05 00 00 c2 03 00 00 01 00 00 00 00 00 00 00 |................|
00000010 a5 01 00 00 36 01 00 00 01 00 00 00 00 00 00 00 |....6...........|
00000010 13 04 00 00 fb 02 00 00 01 00 00 00 00 00 00 00 |................|
00000010 79 03 00 00 a9 02 00 00 01 00 00 00 00 00 00 00 |y...............|
So here, we recognize two 32-bit integers due to the 00 00, and then either a fixed 64-bit integer that is always 01 00 00 00 00 00 00 00, or something like two 32-bit integers with values 01 00 00 00 and 00 00 00 00 (as the value never varies, again, it’s not that important).
Let’s try to determine the two 32-bit values. Let’s look at one of the files in particular:
kakwa@linux 6775398/idx » hexdump -C system_data.idx | less
[...]
00000010 37 01 00 00 1c 01 00 00 01 00 00 00 00 00 00 00 |7...............|
[...]
The first value is 37 01 00 00, i.e. 311 in decimal. Doing a strings system_data.idx > listing and removing the garbage (ex: w6~n) as best as possible, plus the .pkg file name, only keeping file and directory names, we get 310 entries, a remarkably close value.
Looking at other files, the story is similar: this field roughly matches the number of strings we get from strings (never perfectly, however; strings ignores names that are too short, which most likely explains the small delta we get each time).
Consequently, we can deduce it’s most likely the number of entries (files and directories) in the index file.
Next, we have 1c 01 00 00, i.e. 284 in decimal. This value is suspiciously close to the previous one. As we have both directory and file names, this number probably represents the number of items that are actual files.
Let’s validate that:
# Q&D filtering out names without an extension (no '.')
kakwa@linux 6775398/idx » cat listing | grep '\.' | wc -l
284
Bingo, we have the exact number we were looking for.
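For reference, the hex-to-decimal conversions used here are plain little-endian reads. A minimal sketch, with le32 as a hypothetical helper name:

```python
def le32(hex_bytes: str) -> int:
    """Interpret space-separated hex bytes as a little-endian unsigned integer."""
    return int.from_bytes(bytes.fromhex(hex_bytes), "little")


print(le32("37 01 00 00"))  # 311, files + directories
print(le32("1c 01 00 00"))  # 284, files only
```

Read as big-endian instead, the same four bytes would give a count in the hundreds of millions, which is a good hint that the fields are little-endian.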
The last 64 bits could simply be ignored for now since they always have the same value.
So we have:
+====+====+====+====++====+====+====+====++====+====+====+====++====+====+====+====+
| FD | FD | FD | FD || FI | FI | FI | FI || 01 | 00 | 00 | 00 || 00 | 00 | 00 | 00 |
+====+====+====+====++====+====+====+====++====+====+====+====++====+====+====+====+
|<file + dir count >||<-- file count --->||<-------------- ???? ------------------>|
Next 128 bits:
kakwa@linux 6775398/idx » for i in *;do hexdump -C $i | head -n 3 | tail -n 1;done | less
[...]
00000020 28 00 00 00 00 00 00 00 58 8d 00 00 00 00 00 00 |(.......X.......|
00000020 28 00 00 00 00 00 00 00 7c 3f 02 00 00 00 00 00 |(.......|?......|
00000020 28 00 00 00 00 00 00 00 77 2f 00 00 00 00 00 00 |(.......w/......|
00000020 28 00 00 00 00 00 00 00 bc e9 01 00 00 00 00 00 |(...............|
00000020 28 00 00 00 00 00 00 00 c6 ea 02 00 00 00 00 00 |(...............|
00000020 28 00 00 00 00 00 00 00 a9 0f 00 00 00 00 00 00 |(...............|
00000020 28 00 00 00 00 00 00 00 c4 22 00 00 00 00 00 00 |(........"......|
00000020 28 00 00 00 00 00 00 00 b6 2b 00 00 00 00 00 00 |(........+......|
00000020 28 00 00 00 00 00 00 00 d6 41 00 00 00 00 00 00 |(........A......|
00000020 28 00 00 00 00 00 00 00 fd 21 00 00 00 00 00 00 |(........!......|
00000020 28 00 00 00 00 00 00 00 e1 25 00 00 00 00 00 00 |(........%......|
00000020 28 00 00 00 00 00 00 00 f5 31 00 00 00 00 00 00 |(........1......|
00000020 28 00 00 00 00 00 00 00 1e 42 00 00 00 00 00 00 |(........B......|
00000020 28 00 00 00 00 00 00 00 70 42 02 00 00 00 00 00 |(.......pB......|
[...]
So here, we have two 64-bit integers. The first one is 28 00 00 00 00 00 00 00 and always has the same value. Not sure what it represents; the value is somewhat close to the header size in bytes: 40 for this value, 56 for the full header size.
Maybe the header could vary in size in certain situations, and this represents its size minus some fixed part (like the first 16 bytes/128 bits). I’m also kind of betting that if the previous 64-bit integer (01 00 00 00 00 00 00 00) changes, this will also change.
But as it never varies in the set of index files we have here, we cannot really make any deduction, only guesses. So once again, let’s ignore it.
At this point, the idea of downloading other Wargaming games like World of Tanks or World of Warplanes popped up; maybe this will give complementary information regarding the unknown fields that start to pile up.
But let’s continue for now.
EDIT: It’s no help; World of Tanks and World of Warplanes simply pack their resources in .zip files…
It’s a wild guess, but I kind of expect WoWs to do the same in the future.
The WoWs packing format feels, in fact, somewhat legacy, custom, and far less efficient than a standard run-of-the-mill .zip file. Not to mention that using .zip files means one less bit of code to maintain.
The next value is again a 64-bit integer; it changes between each file.
Let’s focus on one file:
[...]
00000020 28 00 00 00 00 00 00 00 f6 3b 00 00 00 00 00 00 |(........;......|
[...]
At first, I thought it might be a file size, but quickly checking the index file size, I got:
- index file size: 29043
- value of this field in decimal: 15350
Checking another file, I got 77461 and 44807.
So no, it’s not the index size. However it is suspiciously ~1/2 of the file size, and after having stared at hexdumps for hours, I had another idea.
The third chunk of the file is right after the bundle of dir/file name strings which varies in length wildly (i.e., it’s not a fixed length or a multiple of a fixed length).
We probably need an offset pointing to the start of this section in the header.
And sure enough, looking where the bundle of strings stops, we get:
00003bd0 6f 69 73 65 2e 64 64 73 00 73 70 61 63 65 5f 76 |oise.dds.space_v|
00003be0 61 72 69 61 74 69 6f 6e 5f 64 75 6d 6d 79 2e 64 |ariation_dummy.d|
00003bf0 64 73 00 77 61 76 65 73 5f 68 65 69 67 68 74 73 |ds.waves_heights|
00003c00 30 2e 64 64 73 00 8f ec 87 4a 28 d0 f7 c7 70 11 |0.dds....J(...p.|
00003c10 03 07 0d 33 ed 77 1e 9b ef 05 00 00 00 00 05 00 |...3.w..........|
Okay, the string bundle ends at 0x3c05; that’s quite near 0x3bf6, so this is certainly the offset to this third section or to the end of the bundle of strings.
Most likely, the offset is not from the start of the file but from a specific point in the header (this field? end of header?); that’s why we get a difference of 15 (0x3c05 - 0x3bf6 = 0xF = 15). This difference is constant between files.
So we have:
+====+====+====+====+====+====+====+====++====+====+====+====+====+====+====+====+
| HS | HS | HS | HS | HS | HS | HS | HS || OF | OF | OF | OF | OF | OF | OF | OF |
+====+====+====+====+====+====+====+====++====+====+====+====+====+====+====+====+
|<------------ header size (?) -------->||<---- offset third section start ----->|
Last 64 bits:
kakwa@linux 6623042/idx » for i in *;do hexdump -C $i | head -n 4 | tail -n 1;done | less
[...]
00000030 40 af 03 00 00 00 00 00 1b 00 00 00 00 00 00 00 |@...............|
00000030 54 8d 1f 00 00 00 00 00 21 00 00 00 00 00 00 00 |T.......!.......|
00000030 8e ae 00 00 00 00 00 00 17 00 00 00 00 00 00 00 |................|
00000030 02 81 00 00 00 00 00 00 13 00 00 00 00 00 00 00 |................|
00000030 c9 24 00 00 00 00 00 00 1f 00 00 00 00 00 00 00 |.$..............|
[...]
So, we have a 64-bit integer, which is a relatively low value. This means it’s most likely a size or an offset.
If we pick one:
kakwa@linux 6623042/idx » hexdump -C system_data.idx | less
[...]
00000030 36 71 00 00 00 00 00 00 13 00 00 00 00 00 00 00 |6q..............|
[...]
We can see that 36 71 00 00 00 00 00 00, 28982 once converted to decimal, is remarkably close to the file size (29043 bytes).
From there, we can guess it might be three things:
- the actual index file size
- an offset to something at the end of the file
- pointer to the end of the third section
Let’s note that for now, and figure out the finer details at implementation time.
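One hypothesis that happens to fit all three 64-bit header values of system_data.idx is that they are offsets counted from byte 0x10, i.e. right after the first 128 bits. This is only a guess checked against the numbers of a single file, not something established here; BASE and absolute are names made up for the sketch.

```python
BASE = 0x10  # assumption: offsets are counted from the end of the first 16 bytes


def absolute(value: int) -> int:
    """Turn a header value into a file position under the BASE assumption."""
    return BASE + value


fields = {
    "header size (?)": 0x28,
    "third section start": 0x3BF6,
    "third section end (?)": 0x7136,
}
for name, value in fields.items():
    print(f"{name}: {hex(absolute(value))}")
# header size (?): 0x38
# third section start: 0x3c06
# third section end (?): 0x7146
```

In the first hexdump, 0x38 is where the first entry starts, 0x3c06 is the byte right after the final \0 of the string bundle (at 0x3c05), and 0x7146 is where the trailing bytes preceding the .pkg file name begin (15 00 00 00 ...).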
So, to recap, here is the header section format:
+====+====+====+====++====+====+====+====++====+====+====+====++====+====+====+====+
| MA | MA | MA | MA || 00 | 00 | 00 | 02 || ID | ID | ID | ID || 40 | 00 | 00 | 00 |
+====+====+====+====++====+====+====+====++====+====+====+====++====+====+====+====+
|<----- magic ----->||<----- ???? ------>||<---- id/crc ----->||<----- ??????? --->|
+====+====+====+====++====+====+====+====++====+====+====+====++====+====+====+====+
| FD | FD | FD | FD || FI | FI | FI | FI || 01 | 00 | 00 | 00 || 00 | 00 | 00 | 00 |
+====+====+====+====++====+====+====+====++====+====+====+====++====+====+====+====+
|<file + dir count >||<-- file count --->||<-------------- ???? ------------------>|
+====+====+====+====+====+====+====+=====++=====+====+====+====+====+====+====+====+
| HS | HS | HS | HS | HS | HS | HS | HS || OF | OF | OF | OF | OF | OF | OF | OF |
+====+====+====+====+====+====+====+=====++=====+====+====+====+====+====+====+====+
|<------------ header size (?) --------->||<----- offset third section start ----->|
+====+====+====+====+====+====+====+=====+
| OE | OE | OE | OE | OE | OE | OE | OE |
+====+====+====+====+====+====+====+=====+
|<---- offset third section end -------->|
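The recap above can be turned into a small parser. This is a sketch under the assumptions made so far: all fields are little-endian, and the field names simply mirror the guesses in the diagram (several are still unknown). Header and parse_header are hypothetical names.

```python
import struct
from typing import NamedTuple

# 4-byte magic, five 32-bit integers, four 64-bit integers: 56 bytes in total.
HEADER = struct.Struct("<4sIIIIIQQQQ")


class Header(NamedTuple):
    magic: bytes
    unknown_1: int            # 00 00 00 02
    id_or_crc: int
    unknown_2: int            # 40 00 00 00
    entry_count: int          # files + directories
    file_count: int
    unknown_3: int            # always 1 in the files seen so far
    header_size: int          # 0x28, meaning uncertain
    third_section_start: int
    third_section_end: int


def parse_header(raw: bytes) -> Header:
    """Decode the first 56 bytes of an .idx file."""
    header = Header(*HEADER.unpack_from(raw, 0))
    if header.magic != b"ISFP":
        raise ValueError("not an ISFP index file")
    return header
```

Fed the first 56 bytes of system_data.idx, this should report 311 entries, 284 files and the two offsets 0x3bf6 and 0x7136 discussed above.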
Filename Section
Not much to say there.
Here’s what it looks like:
kakwa@linux 6775398/idx » hexdump -C system_data.idx| less
[...]
00002700 fb 14 00 00 00 00 00 00 74 d1 1b 8d f4 ff 7a ce |........t.....z.|
00002710 a4 eb 1b 3e 50 21 d8 74 4b 44 53 74 6f 72 61 67 |...>P!.tKDStorag|
00002720 65 2e 62 69 6e 00 77 61 76 65 73 5f 68 65 69 67 |e.bin.waves_heig|
00002730 68 74 73 31 2e 64 64 73 00 61 6e 69 6d 61 74 65 |hts1.dds.animate|
[...]
000028c0 74 73 2e 62 69 6e 00 6d 69 73 63 53 65 74 74 69 |ts.bin.miscSetti|
000028d0 6e 67 73 2e 78 6d 6c 00 64 61 6d 61 67 65 5f 64 |ngs.xml.damage_d|
000028e0 65 63 5f 32 5f 64 2e 64 64 32 00 64 61 6d 61 67 |ec_2_d.dd2.damag|
000028f0 65 5f 64 65 63 5f 31 5f 65 2e 64 64 73 00 63 72 |e_dec_1_e.dds.cr|
[...]
00003be0 61 72 69 61 74 69 6f 6e 5f 64 75 6d 6d 79 2e 64 |ariation_dummy.d|
00003bf0 64 73 00 77 61 76 65 73 5f 68 65 69 67 68 74 73 |ds.waves_heights|
00003c00 30 2e 64 64 73 00 8f 0c 9a ba 4f 40 b6 93 70 11 |0.dds.....O@..p.|
00003c10 03 07 0d 33 ed 77 00 00 00 00 00 00 00 00 05 00 |...3.w..........|
00003c20 00 00 01 00 00 00 f5 21 00 00 bf 00 45 5c 6c 36 |.......!....E\l6|
So it’s a bunch of \0-separated strings. The only interesting thing to note is that it’s not a fixed-size section.
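Splitting such a section is straightforward. The sketch below uses a hand-built sample made of names seen in the strings output earlier, not a real slice of the file, and treats names without a dot as probable directories, which is only the working guess from above. split_names is a hypothetical helper.

```python
def split_names(blob: bytes) -> list:
    """Split a \\0-separated name section into a list of strings."""
    return [
        part.decode("ascii", errors="replace")
        for part in blob.split(b"\0")
        if part  # drop the empty chunk after the final terminator
    ]


sample = b"KDStorage.bin\0waves_heights1.dds\0maps\0helpers\0"
names = split_names(sample)
probable_dirs = [n for n in names if "." not in n]
print(probable_dirs)  # ['maps', 'helpers']
```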
PKG Pointer Section
Here is what it looks like:
kakwa@linux 6775398/idx » hexdump -C system_data.idx| less
[...]
00003bf0 64 73 00 77 61 76 65 73 5f 68 65 69 67 68 74 73 |ds.waves_heights|
00003c00 30 2e 64 64 73 00 8f 0c 9a ba 4f 40 b6 93 70 11 |0.dds.....O@..p.|
00003c10 03 07 0d 33 ed 77 00 00 00 00 00 00 00 00 05 00 |...3.w..........|
00003c20 00 00 01 00 00 00 f5 21 00 00 bf 00 45 5c 6c 36 |.......!....E\l6|
00003c30 00 00 00 00 00 00 8f ec 87 4a 28 d0 f7 c7 70 11 |.........J(...p.|
00003c40 03 07 0d 33 ed 77 1e 9b ef 05 00 00 00 00 05 00 |...3.w..........|
00003c50 00 00 01 00 00 00 15 15 01 00 03 77 63 97 3e ab |...........wc.>.|
00003c60 02 00 00 00 00 00 ad 70 a2 e7 ac 2c 4f 6b 70 11 |.......p...,Okp.|
00003c70 03 07 0d 33 ed 77 05 22 00 00 00 00 00 00 05 00 |...3.w."........|
00003c80 00 00 01 00 00 00 cb 01 00 00 6d b9 de c1 ad 0c |..........m.....|
00003c90 00 00 00 00 00 00 8e c7 6a 58 7c 86 62 33 70 11 |........jX|.b3p.|
00003ca0 03 07 0d 33 ed 77 e0 23 00 00 00 00 00 00 05 00 |...3.w.#........|
00003cb0 00 00 01 00 00 00 65 00 00 00 f1 d2 87 5a d2 00 |......e......Z..|
00003cc0 00 00 00 00 00 00 8e 43 3d e9 cf 49 52 a4 70 11 |.......C=..IR.p.|
00003cd0 03 07 0d 33 ed 77 4d 1f 6a 05 00 00 00 00 05 00 |...3.wM.j.......|
00003ce0 00 00 01 00 00 00 bb 19 00 00 f7 53 4a b1 38 ab |...........SJ.8.|
00003cf0 00 00 00 00 00 00 0e 3c 9a 6d 22 de 7b da 70 11 |.......<.m".{.p.|
00003d00 03 07 0d 33 ed 77 55 24 00 00 00 00 00 00 05 00 |...3.wU$........|
00003d10 00 00 01 00 00 00 72 0d 00 00 83 0a 72 88 a3 5c |......r.....r..\|
00003d20 00 00 00 00 00 00 98 45 00 7a 16 6e 84 21 70 11 |.......E.z.n.!p.|
00003d30 03 07 0d 33 ed 77 73 ce cb 06 00 00 00 00 05 00 |...3.ws.........|
00003d40 00 00 01 00 00 00 f9 b6 b7 00 ff 7e f7 7b 5b 30 |...........~.{[0|
00003d50 b8 00 00 00 00 00 98 51 4a 00 2c 12 71 ad 70 11 |.......QJ.,.q.p.|
00003d60 03 07 0d 33 ed 77 7e 63 75 05 00 00 00 00 05 00 |...3.w~cu.......|
[...]
00007100 00 00 01 00 00 00 6f 09 00 00 fc 56 94 f8 9a 37 |......o....V...7|
00007110 00 00 00 00 00 00 21 67 ac 70 22 ec ca b8 70 11 |......!g.p"...p.|
00007120 03 07 0d 33 ed 77 28 f9 15 0a 00 00 00 00 05 00 |...3.w(.........|
00007130 00 00 01 00 00 00 0b 2d 00 00 03 bd b0 50 67 e9 |.......-.....Pg.|
00007140 00 00 00 00 00 00 15 00 00 00 00 00 00 00 18 00 |................|
00007150 00 00 00 00 00 00 70 11 03 07 0d 33 ed 77 73 79 |......p....3.wsy|
00007160 73 74 65 6d 5f 64 61 74 61 5f 30 30 30 31 2e 70 |stem_data_0001.p|
00007170 6b 67 00 |kg.|
00007173
(END)
This section contains 384-bit records (48 bytes each) linking files to their data in .pkg
files through IDs and offsets.
Each record contains metadata IDs, PKG file offsets, data sizes, and compression types.
# let's skip the first 6 bytes to align hexdump output
kakwa@linux 6775398/idx » hexdump -s 6 -C system_data.idx| less
[...]
00003be6 6f 6e 5f 64 75 6d 6d 79 2e 64 64 73 00 77 61 76 |on_dummy.dds.wav|
00003bf6 65 73 5f 68 65 69 67 68 74 73 30 2e 64 64 73 00 |es_heights0.dds.|
00003c06 8f 0c 9a ba 4f 40 b6 93 70 11 03 07 0d 33 ed 77 |....O@..p....3.w|
00003c16 00 00 00 00 00 00 00 00 05 00 00 00 01 00 00 00 |................|
00003c26 f5 21 00 00 bf 00 45 5c 6c 36 00 00 00 00 00 00 |.!....E\l6......|
00003c36 8f ec 87 4a 28 d0 f7 c7 70 11 03 07 0d 33 ed 77 |...J(...p....3.w|
00003c46 1e 9b ef 05 00 00 00 00 05 00 00 00 01 00 00 00 |................|
00003c56 15 15 01 00 03 77 63 97 3e ab 02 00 00 00 00 00 |.....wc.>.......|
00003c66 ad 70 a2 e7 ac 2c 4f 6b 70 11 03 07 0d 33 ed 77 |.p...,Okp....3.w|
[...]
Much nicer!
With that, we immediately notice a cycle, with the 70 11 03 07 0d 33 ed 77
value.
There are 6 × 64 = 384 bits between each 70 11 03 07 0d 33 ed 77
value.
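The spacing can also be measured rather than eyeballed. The sketch below searches a few rows copied from the aligned hexdump above for the repeating value and prints the distance between hits; `marker_positions` is a hypothetical helper name.

```python
# Seven rows copied from the aligned hexdump above (starting at 0x3c06).
sample = bytes.fromhex(
    "8f 0c 9a ba 4f 40 b6 93 70 11 03 07 0d 33 ed 77"
    "00 00 00 00 00 00 00 00 05 00 00 00 01 00 00 00"
    "f5 21 00 00 bf 00 45 5c 6c 36 00 00 00 00 00 00"
    "8f ec 87 4a 28 d0 f7 c7 70 11 03 07 0d 33 ed 77"
    "1e 9b ef 05 00 00 00 00 05 00 00 00 01 00 00 00"
    "15 15 01 00 03 77 63 97 3e ab 02 00 00 00 00 00"
    "ad 70 a2 e7 ac 2c 4f 6b 70 11 03 07 0d 33 ed 77"
)
MARKER = bytes.fromhex("70 11 03 07 0d 33 ed 77")


def marker_positions(data, marker):
    """Return every offset at which `marker` occurs in `data` (hypothetical helper)."""
    positions = []
    start = data.find(marker)
    while start != -1:
        positions.append(start)
        start = data.find(marker, start + 1)
    return positions


positions = marker_positions(sample, MARKER)
# Distance in bytes between consecutive occurrences of the marker.
gaps = [b - a for a, b in zip(positions, positions[1:])]
print(gaps, [g * 8 for g in gaps])
```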
Let’s try to confirm that with the IDs.
We spot the bf 00 45 5c 6c 36 ID seen previously in the .pkg file. 384 bits later, we see 03 77 63 97 3e ab 02.
With a bit of digging (it’s right in the middle of the .pkg file, and this ID is not neatly aligned on a 64-bit boundary because of the variable size of the DEFLATE blocks), we do find that second ID in the .pkg file as well.
So we do have a 384-bit (48-byte) cycle, which neatly fits in 3 lines of hexdump!
So each of these is one record:
00003c06 8f 0c 9a ba 4f 40 b6 93 70 11 03 07 0d 33 ed 77 |....O@..p....3.w|
00003c16 00 00 00 00 00 00 00 00 05 00 00 00 01 00 00 00 |................|
00003c26 f5 21 00 00 bf 00 45 5c 6c 36 00 00 00 00 00 00 |.!....E\l6......|
00003c36 8f ec 87 4a 28 d0 f7 c7 70 11 03 07 0d 33 ed 77 |...J(...p....3.w|
00003c46 1e 9b ef 05 00 00 00 00 05 00 00 00 01 00 00 00 |................|
00003c56 15 15 01 00 03 77 63 97 3e ab 02 00 00 00 00 00 |.....wc.>.......|
00003c66 ad 70 a2 e7 ac 2c 4f 6b 70 11 03 07 0d 33 ed 77 |.p...,Okp....3.w|
00003c76 05 22 00 00 00 00 00 00 05 00 00 00 01 00 00 00 |."..............|
00003c86 cb 01 00 00 6d b9 de c1 ad 0c 00 00 00 00 00 00 |....m...........|
00003c96 8e c7 6a 58 7c 86 62 33 70 11 03 07 0d 33 ed 77 |..jX|.b3p....3.w|
00003ca6 e0 23 00 00 00 00 00 00 05 00 00 00 01 00 00 00 |.#..............|
00003cb6 65 00 00 00 f1 d2 87 5a d2 00 00 00 00 00 00 00 |e......Z........|
00003cc6 8e 43 3d e9 cf 49 52 a4 70 11 03 07 0d 33 ed 77 |.C=..IR.p....3.w|
00003cd6 4d 1f 6a 05 00 00 00 00 05 00 00 00 01 00 00 00 |M.j.............|
00003ce6 bb 19 00 00 f7 53 4a b1 38 ab 00 00 00 00 00 00 |.....SJ.8.......|
All these records are from the same file,
let’s grab a few from other files:
File 2:
0000f6fd 58 fe 65 b4 59 b6 b0 77 a4 8c 78 6a 58 aa 65 84 |X.e.Y..w..xjX.e.|
0000f70d 00 00 00 00 00 00 00 00 05 00 00 00 01 00 00 00 |................|
0000f71d bc 99 05 00 f4 3a 67 8b 80 00 08 00 00 00 00 00 |.....:g.........|
0000f72d a0 e5 c7 22 cc 49 d3 31 a4 8c 78 6a 58 aa 65 84 |...".I.1..xjX.e.|
0000f73d 3e 9e 7e 05 00 00 00 00 05 00 00 00 01 00 00 00 |>.~.............|
0000f74d 79 c6 03 00 dc b8 5f 80 80 00 08 00 00 00 00 00 |y....._.........|
File 3:
00002d0c ed cf 33 f8 a5 94 53 56 0d a7 9c b9 bf 60 f5 3e |..3...SV.....`.>|
00002d1c 40 a8 08 07 00 00 00 00 00 00 00 00 00 00 00 00 |@...............|
00002d2c 38 ab 00 00 5d cf 4e b6 38 ab 00 00 00 00 00 00 |8...].N.8.......|
00002d3c ed ea da 52 4e 8f 70 ed 0d a7 9c b9 bf 60 f5 3e |...RN.p......`.>|
00002d4c 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00002d5c 98 00 00 00 a4 63 f2 9b 98 00 00 00 00 00 00 00 |.....c..........|
Let’s study the first record:
00003c06 8f 0c 9a ba 4f 40 b6 93 70 11 03 07 0d 33 ed 77 |....O@..p....3.w|
00003c16 00 00 00 00 00 00 00 00 05 00 00 00 01 00 00 00 |................|
00003c26 f5 21 00 00 bf 00 45 5c 6c 36 00 00 00 00 00 00 |.!....E\l6......|
For the first 128 bits we get:
00003c06 8f 0c 9a ba 4f 40 b6 93 70 11 03 07 0d 33 ed 77 |....O@..p....3.w|
Since the last 64 bits are the same for every record of a given file, we probably have two 64-bit integers in the first 128 bits.
Once again, these use all the available bits and look rather random. It’s difficult to link them to a role, so… let’s simply ignore them for now.
+====+====+====+====+====+====+====+====++====+====+====+====+====+====+====+====+
| UO | UO | UO | UO | UO | UO | UO | UO || UT | UT | UT | UT | UT | UT | UT | UT |
+====+====+====+====+====+====+====+====++====+====+====+====+====+====+====+====+
|<------------ unknown 1 -------------->||<------------- unknown 2 ------------->|
|                64 bits                ||                64 bits                |
The next 64 bits hold a low-value 64-bit integer, so most likely an offset. It is also suspiciously 0x0 for the first record, which happens to be the first data chunk in the .pkg file.
So it’s most likely the start offset of a data chunk in a .pkg file. Looking at other records confirms that.
Next, we have two extremely low 32-bit values (0x5 and 0x1). So once again, most likely some kind of enum. Let’s call them type1 and type2.
+====+====+====+====+====+====+====+====++====+====+====+====++====+====+====+====+
| OF | OF | OF | OF | OF | OF | OF | OF || T1 | T1 | T1 | T1 || T2 | T2 | T2 | T2 |
+====+====+====+====+====+====+====+====++====+====+====+====++====+====+====+====+
|<----------start offset pkg ---------->||<---- type 1 ----->||<----- type 2 ---->|
|                64 bits                ||      32 bits      ||      32 bits      |
For the last 128 bits, we get the following:
00003c26 f5 21 00 00 bf 00 45 5c 6c 36 00 00 00 00 00 00 |.!....E\l6......|
So, here, we see our .pkg ID (bf 00 45 5c 6c 36) right in the middle. Given its size, this ID is probably stored on 64 bits.
After that, in the last 32 bits, we get a bunch of 00, maybe a reserved field, but more likely some kind of padding.
Before that, we get a low-value 32-bit integer. When comparing with the .pkg file, f5 21 00 00 (0x21f5) is the offset where the first data chunk ends.
So it’s the end offset. But 32 bits seems rather small to store such an offset, especially given that the start offset is 64 bits. Also, for other data chunks, this doesn’t line up.
However, it could very well be a relative offset (relative to the start of the data chunk).
Let’s validate that with the second record:
00003c36 8f ec 87 4a 28 d0 f7 c7 70 11 03 07 0d 33 ed 77 |...J(...p....3.w|
00003c46 1e 9b ef 05 00 00 00 00 05 00 00 00 01 00 00 00 |................|
00003c56 15 15 01 00 03 77 63 97 3e ab 02 00 00 00 00 00 |.....wc.>.......|
More hexdump! (searching the data chunk using the 03 77 63 97 3e ab 02 ID and the start offset 1e 9b ef 05 00 00 00 00, or 0x05ef9b1e once we take endianness into account):
kakwa@linux World of Warships/res_packages » hexdump -C system_data_0001.pkg | less
[...]
05ef9b10 00 00 17 4b 28 42 80 40 00 00 00 00 00 00 8c 7d |...K(B.@.......}|
05ef9b20 3b 50 5d d9 b6 dd 7d b6 3e 76 15 b2 5f 20 89 04 |;P]...}.>v.._ ..|
05ef9b30 55 bd 00 44 82 aa ec 7a 9c bd f7 79 45 57 39 40 |U..D...z...yEW9@|
05ef9b40 90 d0 4e 8c c0 01 9d 21 48 e8 0c 41 a2 ce 10 24 |..N....!H..A...$|
[...]
05f0b010 d1 d7 52 b0 de cc fa 9c 2b 8d b5 57 a4 02 ff 1a |..R.....+..W....|
05f0b020 43 5d 2d 43 3f cc d1 0b f8 88 92 a8 9f 5a 70 64 |C]-C?........Zpd|
05f0b030 46 fb 7f 00 00 00 00 03 77 63 97 3e ab 02 00 00 |F.......wc.>....|
05f0b040 00 00 00 94 b7 67 90 db 68 9e e6 d9 1f 36 62 6f |.....g..h....6bo|
05f0b050 6f 22 76 f6 62 e3 62 77 67 7a 7a ba ab bb 8c aa |o"v.b.bwgzz.....|
05f0b060 4a 52 95 5c ca a5 cf 64 d2 24 bd 27 41 d0 00 04 |JR.\...d.$.'A...|
[...]
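We indeed find the start of our data chunk at 0x05ef9b1e (the 8c after a bunch of 00 on the first line).
And looking for the 00 00 00 00 ID ID [...] pattern in between the data chunks, we can determine that the last byte of this chunk is at 0x05f0b032.
Doing 0x05f0b032 - 0x05ef9b1e, we get 0x11514. Both offsets point at bytes that belong to the chunk, so the chunk is 0x11514 + 1 = 0x11515 bytes long, which is exactly our 15 15 01 00 once we swap endianness.
So this field is the end offset relative to the start of the chunk, in other words the size of the chunk.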
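The same arithmetic, spelled out so the byte swapping is explicit. The only inputs are the two fields of the second record and the last-byte offset read by hand from the .pkg hexdump above.

```python
# Fields of the second record, as raw little-endian bytes from the .idx hexdump.
start = int.from_bytes(bytes.fromhex("1e 9b ef 05 00 00 00 00"), "little")
stored = int.from_bytes(bytes.fromhex("15 15 01 00"), "little")

# Offset of the last data byte, located by hand in the .pkg hexdump.
last_byte = 0x05F0B032

print(hex(start))                  # 0x5ef9b1e
print(hex(last_byte - start))      # 0x11514
print(hex(last_byte - start + 1))  # 0x11515
print(hex(stored))                 # 0x11515
```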
+====+====+====+====++====+====+====+====+====+====+====+====++====+====+====+====+
| OE | OE | OE | OE || ID | ID | ID | ID | ID | ID | ID | ID || 00 | 00 | 00 | 00 |
+====+====+====+====++====+====+====+====+====+====+====+====++====+====+====+====+
|<-- offset end --->||<------------- ID '.pkg' ------------->||<---- padding ---->|
|      32 bits      ||                64 bits                ||      32 bits      |
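With a start offset and a relative end offset, a record is enough to cut one chunk out of a .pkg file. The following is only a sketch under assumptions: `read_chunk` and `inflate` are hypothetical helper names, the "file" is a synthetic in-memory stand-in built on the spot, and the chunk is assumed to be a raw DEFLATE stream (no zlib or gzip header), following what Part 1 found.

```python
import io
import zlib


def read_chunk(pkg, start, size):
    """Return the `size` raw bytes stored at offset `start` of an open .pkg stream."""
    pkg.seek(start)
    data = pkg.read(size)
    if len(data) != size:
        raise ValueError("chunk extends past the end of the stream")
    return data


def inflate(raw):
    """Decompress a raw DEFLATE stream. Assumption: no zlib/gzip header, hence wbits=-15."""
    return zlib.decompress(raw, wbits=-15)


# Synthetic stand-in for a .pkg file: one raw DEFLATE blob followed by
# zero padding, a made-up 64-bit ID, and more zero padding.
payload = b"hello world " * 10
compressor = zlib.compressobj(wbits=-15)
chunk = compressor.compress(payload) + compressor.flush()
fake_pkg = io.BytesIO(chunk + bytes(4) + bytes.fromhex("0102030405060000") + bytes(4))

# In a real run, start and size would come from a record of the .idx file.
raw = read_chunk(fake_pkg, 0, len(chunk))
print(inflate(raw) == payload)
```

Whether every chunk is actually compressed is not established at this point; the two type fields may well encode that, so a real extractor should not assume DEFLATE blindly.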
So to recap, we have:
+====+====+====+====+====+====+====+====++=====+====+====+====+====+====+====+====+
| UO | UO | UO | UO | UO | UO | UO | UO || UT | UT | UT | UT | UT | UT | UT | UT |
+====+====+====+====+====+====+====+====++=====+====+====+====+====+====+====+====+
|<------------ unknown 1 -------------->||<-------------- unknown 2 ------------->|
|                64 bits                ||                64 bits                 |
+====+====+====+====+====+====+====+====++====+====+====+====++====+====+====+====+
| OF | OF | OF | OF | OF | OF | OF | OF || T1 | T1 | T1 | T1 || T2 | T2 | T2 | T2 |
+====+====+====+====+====+====+====+====++====+====+====+====++====+====+====+====+
|<--------- start offset pkg ---------->||<---- type 1 ----->||<----- type 2 ---->|
|                64 bits                ||      32 bits      ||      32 bits      |
+====+====+====+====++====+====+====+====+====+====+====+====++====+====+====+====+
| OE | OE | OE | OE || ID | ID | ID | ID | ID | ID | ID | ID || 00 | 00 | 00 | 00 |
+====+====+====+====++====+====+====+====+====+====+====+====++====+====+====+====+
|<-- offset end --->||<------------- ID '.pkg' ------------->||<---- padding ---->|
|      32 bits      ||                64 bits                ||      32 bits      |
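The layout above maps directly onto a fixed-size structure. Here is a minimal sketch of a parser for it, fed with the first two records of system_data.idx copied from the hexdump. The field names are simply the labels used in the diagrams, and the little-endian byte order is the one we observed on the offsets.

```python
import struct
from collections import namedtuple

# Little-endian, no alignment padding:
# 3 x uint64, 3 x uint32, 1 x uint64, 1 x uint32 = 48 bytes (384 bits).
RECORD = struct.Struct("<QQQIIIQI")

Record = namedtuple(
    "Record",
    "unknown_1 unknown_2 start_offset type_1 type_2 offset_end pkg_id padding",
)


def parse_records(data):
    """Parse a buffer whose length is a multiple of 48 bytes into Record tuples."""
    return [Record._make(fields) for fields in RECORD.iter_unpack(data)]


# First two records, copied from the aligned hexdump (0x3c06 to 0x3c65).
records_blob = bytes.fromhex(
    "8f 0c 9a ba 4f 40 b6 93 70 11 03 07 0d 33 ed 77"
    "00 00 00 00 00 00 00 00 05 00 00 00 01 00 00 00"
    "f5 21 00 00 bf 00 45 5c 6c 36 00 00 00 00 00 00"
    "8f ec 87 4a 28 d0 f7 c7 70 11 03 07 0d 33 ed 77"
    "1e 9b ef 05 00 00 00 00 05 00 00 00 01 00 00 00"
    "15 15 01 00 03 77 63 97 3e ab 02 00 00 00 00 00"
)

records = parse_records(records_blob)
for rec in records:
    print(hex(rec.start_offset), rec.type_1, rec.type_2, hex(rec.offset_end), hex(rec.pkg_id))
```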
The Last Bits
So, okay, we have the core of the 3 parts of the file.
But we can see the last few bytes don’t follow this pattern (especially with the .pkg file name), which means we have a footer:
# Last block
00007116 21 67 ac 70 22 ec ca b8 70 11 03 07 0d 33 ed 77 |!g.p"...p....3.w|
00007126 28 f9 15 0a 00 00 00 00 05 00 00 00 01 00 00 00 |(...............|
00007136 0b 2d 00 00 03 bd b0 50 67 e9 00 00 00 00 00 00 |.-.....Pg.......|
# Footer
00007146 15 00 00 00 00 00 00 00 18 00 00 00 00 00 00 00 |................|
00007156 70 11 03 07 0d 33 ed 77 73 79 73 74 65 6d 5f 64 |p....3.wsystem_d|
00007166 61 74 61 5f 30 30 30 31 2e 70 6b 67 00 |ata_0001.pkg.|
If we look at the content, we have 3 × 64 bits before the name starts (73 79 73 74 65 6d 5f 64 for system_d).
Looking at it more closely, given all the 00, it seems we have three 64-bit integers there:
15 00 00 00 00 00 00 00
18 00 00 00 00 00 00 00
70 11 03 07 0d 33 ed 77
70 11 03 07 0d 33 ed 77 is the “unknown 2” we saw previously, so still no luck, but let’s name it the same way.
18 00 00 00 00 00 00 00 seems to have the same value across all files, so probably not that important.
The only one that varies across files is 15 00 00 00 00 00 00 00, but it always stays close to 0x15, which is 21 in decimal.
Interestingly, system_data_0001.pkg is 20 characters long, 21 if we include the \0 at the end.
So we can deduce it’s actually the size of the .pkg file name string.
So we have:
+====+====+====+====+====+====+====+====++=====+====+====+====+====+====+====+====+
| UO | UO | UO | UO | UO | UO | UO | UO || U3 | U3 | U3 | U3 | U3 | U3 | U3 | U3 |
+====+====+====+====+====+====+====+====++=====+====+====+====+====+====+====+====+
|<--------- size pkg file name -------->||<-------------- unknown 3 ------------->|
|                64 bits                ||                64 bits                 |
+=====+====+====+====+====+====+====+====+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~...
| UT | UT | UT | UT | UT | UT | UT | UT | file name string
+=====+====+====+====+====+====+====+====+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~...
|<-------------- unknown 2 ------------->|
|                64 bits                 |
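A minimal sketch of reading this footer, using the footer bytes of system_data.idx copied from the hexdump above. `parse_footer` is a hypothetical helper name, and the returned field names follow the diagram.

```python
import struct

# Three little-endian uint64: name size, unknown 3, unknown 2.
FOOTER_HEAD = struct.Struct("<QQQ")


def parse_footer(data):
    """Return (name_size, unknown_3, unknown_2, name) from raw footer bytes."""
    name_size, unknown_3, unknown_2 = FOOTER_HEAD.unpack_from(data, 0)
    raw_name = data[FOOTER_HEAD.size:FOOTER_HEAD.size + name_size]
    # The stored size counts the trailing NUL byte, so strip it before decoding.
    return name_size, unknown_3, unknown_2, raw_name.rstrip(b"\x00").decode("ascii")


# Footer bytes copied from the hexdump (0x7146 to the end of the file).
footer_blob = bytes.fromhex(
    "15 00 00 00 00 00 00 00 18 00 00 00 00 00 00 00"
    "70 11 03 07 0d 33 ed 77 73 79 73 74 65 6d 5f 64"
    "61 74 61 5f 30 30 30 31 2e 70 6b 67 00"
)

footer = parse_footer(footer_blob)
print(footer)
```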
Recap (Part 2)
- Identified .idx files in ./bin/<latest>/idx/ as indexes for .pkg archives and matched them by name.
- Mapped overall .idx layout into 4 parts:
  - Header: magic ISFP, counts (entries and files), header-size-like field, offsets to start/end of the third section, and a per-file id/crc-like value.
  - Names: \0-separated file and directory names.
  - Records section (per-record 384 bits): two 64-bit unknowns (likely ids, one enabling parent/child hierarchy), 64-bit start offset in .pkg, 2×32-bit types, 32-bit relative end size, 64-bit .pkg chunk id, and padding.
  - Footer: .pkg filename length, an unknown 64-bit value, a repeated 64-bit constant, then the .pkg filename string.
- Established the link between names and .pkg chunks via IDs and offsets, enough to reconstruct directories and locate data.
In the next part of this series we will start the actual implementation.