The VEXX File Format

2022-06-17


This is a very high-level description of the VEXX file format, also known as ".vex" after its file extension. The format first appeared in the PSP game WipEout Pure and has since been used, in one form or another, in newer games in the WipEout series.

The file header is quite simple:

struct VexxHeader {
    uint32_t version; // Pure on PSP == 4, Pulse on PSP == 6, obsolete version 3 seen in Pure's data.wad
    uint32_t part1_length; // length in bytes of part 1
    uint32_t part2_length; // length in bytes of part 2
    char magic[4]; // "VEXX" for PSP/PS2/Vita (little endian MIPS/ARM), "XXEV" for PS3 (big endian PPC)
};

Part 1 follows directly after the header, and part 2 follows directly after part 1. Part 1 therefore starts at offset 16 in the file, and part 2 starts at offset 16 + part1_length.

As the magic value indicates, everything is stored big-endian on PS3 and little-endian on all other platforms (PSP, PS2, Vita).
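To make the header layout and the endianness rule concrete, here is a minimal sketch that picks the byte order from the magic and then slices out the two parts. It assumes the magic bytes appear in the file literally as "VEXX" or "XXEV", as described above; the function names read_vexx_header and split_parts are made up for this example, and the big-endian path has not been checked against real PS3 files.

```python
import struct

HEADER_SIZE = 16  # three uint32 fields plus the 4-byte magic


def read_vexx_header(data):
    """Return (endian, version, part1_length, part2_length).

    'endian' is a struct prefix: '<' for little-endian, '>' for big-endian.
    """
    magic = data[12:16]
    if magic == b'VEXX':
        endian = '<'  # PSP / PS2 / Vita
    elif magic == b'XXEV':
        endian = '>'  # PS3 (assumption: magic bytes appear literally like this)
    else:
        raise ValueError(f'not a VEXX file: magic={magic!r}')

    version, part1_length, part2_length = struct.unpack_from(endian + 'III', data, 0)
    return endian, version, part1_length, part2_length


def split_parts(data):
    """Return (part1, part2) as byte slices, based on the header lengths."""
    _endian, _version, part1_length, part2_length = read_vexx_header(data)
    part1_start = HEADER_SIZE
    part2_start = HEADER_SIZE + part1_length
    part1 = data[part1_start:part2_start]
    part2 = data[part2_start:part2_start + part2_length]
    return part1, part2
```

The returned endian prefix can be prepended to any later struct format string, so the rest of a parser does not need to know which platform the file came from.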

Part 1 contains the serialized object tree.

Part 2 contains any attached texture objects (referenced from texture nodes in part 1).

In this post, we'll only have a quick look at the basic structure of part 1.

Part 1 (The Object Tree)

Each object has a 16-byte header like this:

struct TreeNodeHeader {
    uint32_t signature; // Signature/marker of object (its type), depends on version
    uint16_t header_len; // Length of the header, plus name, plus potentially other(?) data
    uint16_t unk1; // ???
    uint32_t payload_len; // Length of the payload
    uint16_t nchildren; // How many child objects this object has
    uint16_t unk2; // ???
};

The VEXX tree starts with a single root node that contains all other nodes.

If the object has any children (nchildren > 0), the children are serialized directly after the "parent" object. This can be parsed recursively: the next object starts header_len + payload_len bytes after the start offset of the current TreeNodeHeader.
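Recursion is the natural fit here, but the same rule also works with an explicit stack, which can be handy for very deep trees. The following is only a sketch of that alternative, assuming little-endian data; the names walk and NODE_HEADER are invented for this example.

```python
import struct

NODE_HEADER = struct.Struct('<IHHIHH')  # little-endian TreeNodeHeader, 16 bytes


def walk(data, offset=0):
    """Yield (depth, offset, signature, nchildren) for each node, depth-first.

    'data' is expected to start with the root node at 'offset'.
    """
    # One counter per open tree level: how many nodes are still to be read there.
    remaining = [1]  # a single root node
    while remaining:
        if remaining[-1] == 0:
            remaining.pop()  # this level is finished, go back up
            continue
        remaining[-1] -= 1

        signature, header_len, _unk1, payload_len, nchildren, _unk2 = \
            NODE_HEADER.unpack_from(data, offset)
        yield len(remaining) - 1, offset, signature, nchildren

        # The next node (first child or next sibling) follows header and payload.
        offset += header_len + payload_len
        remaining.append(nchildren)
```

Because children are stored directly after their parent, the walker never has to seek backwards: it only advances the offset and tracks how many siblings are left on each level.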

Immediately after the TreeNodeHeader follows a '\0'-terminated ASCII string that describes the node.

The payload for the node starts at the offset of the TreeNodeHeader plus header_len. There are payload_len bytes of payload, after which the next TreeNodeHeader starts.
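Put together, the offsets for a single node can be computed as in the sketch below. The function name read_node and its return shape are assumptions for illustration only; the endian argument is a struct prefix ('<' or '>').

```python
import struct


def read_node(data, offset, endian='<'):
    """Return (name, payload, nchildren, next_offset) for the node at 'offset'."""
    signature, header_len, unk1, payload_len, nchildren, unk2 = \
        struct.unpack_from(endian + 'IHHIHH', data, offset)

    # Everything between the 16-byte header and the payload is the name area.
    # Only the bytes up to the first '\0' are the name; other data may trail it.
    name_area = data[offset + 16:offset + header_len]
    name = name_area.split(b'\0', 1)[0].decode('ascii')

    payload_start = offset + header_len
    payload = data[payload_start:payload_start + payload_len]

    # The next TreeNodeHeader starts right after the payload.
    next_offset = payload_start + payload_len
    return name, payload, nchildren, next_offset
```

Splitting on the first '\0' rather than stripping trailing zero bytes keeps the name intact even when the name area carries extra data after the terminator.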

Example Parser

As a working example, here's a small Python 3 script that can parse the tree structure of any .vex file. It has only been tested with PSP Pure and PSP Pulse files, and it reads everything as little-endian, so big-endian PS3 files would need the format strings changed:

#
# walk-vex.py: VEXX file tree walker
# 2022-06-17 Thomas Perl <m@thp.io>
#

import struct
import argparse

class Vexx(object):
    def __init__(self, d):
        self.d = d
        self.offset = 0
        self.indent = 0

    def eat(self, fmt):
        return struct.unpack(fmt, self.read(struct.calcsize(fmt)))

    def read(self, size):
        result = self.d[self.offset:self.offset+size]
        self.offset += size
        return result

    def parse(self, n):
        self.indent += 1
        result = [self.parse_one() for _ in range(n)]
        self.indent -= 1
        return result

    def parse_one(self):
        # TreeNodeHeader: 16 bytes, little-endian
        signature, header_len, unk1, payload_len, nchildren, unk2 = self.eat('<IHHIHH')

        namedata = self.read(header_len - 16)
        payload = self.read(payload_len)

        # Sometimes there's other data trailing the string in "namedata"
        name = namedata[:namedata.index(b'\0')].decode()

        print(f'{" "*self.indent}{signature=:08x} {header_len=} {unk1=} {payload_len=} {nchildren=} {unk2=} {name=}')

        children = self.parse(nchildren)

        return (signature, name, payload, children)


parser = argparse.ArgumentParser(description='Parse the tree structure of a VEXX file')
parser.add_argument('filename', type=str, metavar='filename.vex', help='Path to the .vex file to parse')
args = parser.parse_args()

with open(args.filename, 'rb') as fp:
    vexx = Vexx(fp.read())

    version, part1_length, part2_length, magic = vexx.eat('<III4s')
    print(f'{version=}, {part1_length=}, {part2_length=}, {magic=}')

    assert len(vexx.d) == 16 + part1_length + part2_length

    vexx.parse_one()

    assert vexx.offset == 16 + part1_length

Download this script here: walk-vex.py

Where To Go From Here

This tool should allow parsing and inspecting the tree structure contained in ".vex" files. Depending on the version field and the signatures of the tree nodes, the payload data of each node can then be interpreted.
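As a starting point for such inspection, the nested (signature, name, payload, children) tuples returned by parse_one() can be flattened into a stream of nodes with their path in the tree. This is a sketch that assumes the return value of vexx.parse_one() is kept (the script above discards it); iter_nodes is a hypothetical helper, not part of walk-vex.py.

```python
def iter_nodes(node, path=()):
    """Yield (path, signature, payload) for 'node' and all of its descendants.

    'node' is a (signature, name, payload, children) tuple as returned by
    Vexx.parse_one(); 'path' is a tuple of node names from the root down.
    """
    signature, name, payload, children = node
    path = path + (name,)
    yield path, signature, payload
    for child in children:
        yield from iter_nodes(child, path)
```

With that in place, something like `for path, signature, payload in iter_nodes(root)` visits every node in file order, which makes it easy to filter by signature or dump payload sizes.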

For example, the signature 0x3c1 in version 6 (Pulse) files is a texture, whereas in version 4 (Pure) the signature 0x373 is used for textures.
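Since signatures depend on the version, a lookup keyed by version is one simple way to express this. The table below contains only the two texture signatures mentioned above; the names TEXTURE_SIGNATURES and is_texture are made up for this sketch, and other versions or node types are deliberately left out.

```python
# Texture node signature per VEXX version (only the values named in this post).
TEXTURE_SIGNATURES = {
    4: 0x373,  # WipEout Pure (PSP)
    6: 0x3c1,  # WipEout Pulse (PSP)
}


def is_texture(version, signature):
    """Return True if 'signature' marks a texture node in the given version."""
    # Unknown versions are not in the table, so they never match.
    return TEXTURE_SIGNATURES.get(version) == signature
```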

More on this in a future post.

Thomas Perl · 2022-06-17