Hooking file open operations in DOS

2023-09-15

This article is about how to hook the DOS interrupt handler and redirect file open operations, so that e.g. unmodified DOS games running from CD-ROM can still read and write savegames to hard disk (without them knowing or needing to have special code).

System Calls on Modern Systems

Let's quickly recap how system calls are done on modern systems.

On Linux, system calls on 32-bit x86 are done using int 0x80 (or syscall/sysenter depending on your CPU vendor, which is one of the original reasons why the VDSO exists, but we'll ignore that here), and those syscalls are usually wrapped by glibc. For completeness: on x86-64 Linux, syscalls are done using the syscall instruction. It already existed when the AMD64 architecture was designed, and Intel adopted it along with the architecture, so even Intel CPUs support the syscall instruction on x86-64 (and this is why on x86-64, the VDSO doesn't need a __kernel_vsyscall symbol).

On Windows NT (which means every 32-/64-bit Windows that isn't 3.x/9x/Me) the system call interface is unstable, meaning that the system call numbers change regularly and are not supposed to have well-known fixed numbers. Unstable in this context doesn't mean that the machine will crash when doing system calls, of course - it just means that there is no guarantee that syscalls will have the same number or even exist between two versions of the operating system.

System calls in DOS

On DOS, the "system call" interface for DOS itself is int 0x21. Some features aren't exposed by DOS but by the BIOS (e.g. I/O, keyboard, timer), some are exposed by the video BIOS (e.g. VGA on int 0x10), and yet others are specified for drivers (e.g. mouse on int 0x33, or MSCDEX for CD-ROM access on int 0x2f, the DOS multiplex interrupt).

So which interrupts/syscalls exist in DOS? There's the well-known Ralf Brown's Interrupt List (it even has a Wikipedia page) and the HelpPC Reference Library also has - among other things - a list of interrupt services. I personally find the DOS interrupt list on HelpPC quite useful.

We will be using Open Watcom V2.0 as the compiler, and restrict ourselves to 16-bit DOS code, so we don't need to deal with protected mode. The hooking will still work for protected-mode programs, as they need to call down to real mode anyway when calling the DOS interrupt.

AH=0x0F, AH=0x16: Open/Create file using FCB

Quite high in the DOS interrupt list is int 0x21, ah = 0x0f, which is named "Open a File Using FCB" and int 0x21, ah = 0x16, which is "Create a File Using FCB". Some old DOS applications might be using those, but in my case (DOS Game Jam submissions) those weren't used, so I didn't look further. The techniques described in this article could also be used to hook those file open methods.

AH=0x3C, AH=0x3D: Create/Open File Using Handle

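We are interested in these two function calls: int 0x21 with ah = 0x3c ("Create File Using Handle") and int 0x21 with ah = 0x3d ("Open File Using Handle").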

In both cases, the file name is specified using segment addressing, which means the DS register holds the "data segment" information, and the DX register holds the 16-bit "pointer" to the zero-terminated string within that segment -- this is written as DS:DX and means that the linear address used is calculated as DS * 16 + DX (because DS is shifted 4 bits to the left, which is the same as multiplying by 2⁴, which is 16).
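
A minimal sketch of that address calculation in portable C, so it can be tried on a modern host. The function name linear_address is made up for this example and is not part of any DOS or Open Watcom API:

```c
#include <stdint.h>

/* Real-mode address translation: the 16-bit segment is shifted left
 * by 4 bits (multiplied by 16) and the 16-bit offset is added. */
uint32_t linear_address(uint16_t segment, uint16_t offset)
{
    return ((uint32_t)segment << 4) + offset;
}
```

For example, DS = 0x1234 and DX = 0x0010 give the linear address 0x12350. Different segment:offset pairs can alias the same byte: 0x1235:0x0000 points to the very same address.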

As with most DOS functions, if the CF (carry flag, which resides in the FLAGS register) is set, then AX will contain the error code, but if CF is not set, AX will contain the file handle.
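
To make the convention concrete, here is a small host-side sketch that interprets AX and FLAGS the way a caller of the open/create functions would. The struct and function names are hypothetical and only exist for this illustration:

```c
#include <stdbool.h>
#include <stdint.h>

#define FLAG_CF 0x0001u /* the carry flag is bit 0 of FLAGS */

struct open_result {
    bool ok;             /* true if CF was clear */
    uint16_t handle;     /* only meaningful if ok */
    uint16_t error_code; /* only meaningful if !ok */
};

/* Interpret the register state after int 0x21, ah = 0x3c / 0x3d returned. */
struct open_result decode_open_result(uint16_t ax, uint16_t flags)
{
    struct open_result res = { false, 0, 0 };

    if (flags & FLAG_CF) {
        res.error_code = ax; /* CF set: AX is the DOS error code */
    } else {
        res.ok = true;
        res.handle = ax;     /* CF clear: AX is the file handle */
    }

    return res;
}
```

With CF clear and AX = 5, the caller got file handle 5. With CF set and AX = 2, the call failed with DOS error code 2 ("file not found").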

Since we are only interested in redirecting opening of files, as soon as the file is opened we don't need to do any additional hooking, so our task is to intercept the call, override DS:DX with a custom string, and then chain the original ISR (interrupt service routine, the software interrupt handler).

Hooking on Modern Systems

Just like above, let's recap the mechanisms on modern systems.

On Linux, the LD_PRELOAD mechanism can be used to override symbols from shared libraries. This only works for dynamically-linked libraries, and there are some other restrictions on when it applies. But for a normal (non-statically-linked) binary, LD_PRELOAD can be used easily to hook e.g. fopen(), and using dlsym(RTLD_NEXT, "fopen") one can get a pointer to the original function to call/chain. The Maemulator uses this technique to hook functions.

The LD_PRELOAD mechanism doesn't work for code that does the syscalls on its own or doesn't use the dynamic linker. For these cases, an approach like zpoline could be used.

On Windows, there are multiple techniques for API hooking, some of them involving ReadProcessMemory and WriteProcessMemory. There's also the more intrusive AppInit_DLLs setting, and/or you can use CreateRemoteThread to run a thread in another process.

Hooking Interrupts in DOS

Hooking an ISR is easy. First you store the old interrupt handler as a FAR pointer (which is a 32-bit pointer, 16-bit segment, and 16-bit offset of the code to run):

#include <stdio.h>
#include <dos.h>

void _interrupt _far (*old_int21_handler)(void);

int main()
{
    old_int21_handler = _dos_getvect(0x21);

    printf("Got handler: 0x%08lx\n", (unsigned long)old_int21_handler);

    return 0;
}

Assuming you save this as tut1.c, you would compile it with owcc (a GCC-like wrapper around the Open Watcom toolchain) using:

owcc -o tut1.exe tut1.c -I$WATCOM/h -bdos

This assumes that $WATCOM is set to where you extracted/installed Open Watcom (in case you are on 64-bit Linux, owcc would be in $WATCOM/binl64/owcc, which you can add to your $PATH). -bdos makes sure you are building a 16-bit DOS EXE (using -bcom would for example build a 16-bit DOS "COM" file).

Running the resulting tut1.exe in DOSBox yields:

Got handler: 0xf00014a0

Running it in DOSBox-X yields:

Got handler: 0xf000d0e0

The absolute address doesn't really matter, and it may differ depending on where in memory DOS was loaded. In this case, the segment address 0xF000 (remember, the function pointer is a far pointer) means that the ISR resides in the upper memory area, which starts at linear address 0xA0000 (that is, the 0xA000 segment); segment 0xF000 is the part of it where the BIOS ROM is mapped.
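
If you want to take the printed value apart, the following sketch splits it into segment and offset. It assumes that the cast to unsigned long puts the segment into the upper 16 bits and the offset into the lower 16 bits, which is what the output above suggests. The helper names are invented for this example:

```c
#include <stdint.h>

/* Upper 16 bits of the printed far pointer value: the segment. */
uint16_t far_segment(uint32_t far_ptr)
{
    return (uint16_t)(far_ptr >> 16);
}

/* Lower 16 bits of the printed far pointer value: the offset. */
uint16_t far_offset(uint32_t far_ptr)
{
    return (uint16_t)(far_ptr & 0xFFFFu);
}

/* The linear address the far pointer refers to (segment * 16 + offset). */
uint32_t far_to_linear(uint32_t far_ptr)
{
    return ((uint32_t)far_segment(far_ptr) << 4) + far_offset(far_ptr);
}
```

For the DOSBox value 0xf00014a0 this yields F000:14A0, which is linear address 0xF14A0; the DOSBox-X value 0xf000d0e0 is F000:D0E0, linear address 0xFD0E0.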

Random aside: If you are interested how DOSBox-X implements the INT 0x21 interrupt handler, the file src/dos/dos.cpp has a function DOS_21Handler.

Now, to actually do the hooking, we need to write a different ISR to the interrupt vector 0x21, so we create a small function and set it up:

#include <stdio.h>
#include <dos.h>
#include <stdint.h>

void _interrupt _far (*old_int21_handler)(void);

void _interrupt
hook_int21(union INTPACK r)
{
    /* .. do something .. */

    _chain_intr(old_int21_handler);
}

int
main(int argc, char *argv[])
{
    old_int21_handler = _dos_getvect(0x21);
    printf("Old handler: 0x%08lx\n", (unsigned long)old_int21_handler);

    _dos_setvect(0x21, hook_int21);
    printf("New handler: 0x%08lx\n", (unsigned long)_dos_getvect(0x21));

    /* .. do something .. */

    _dos_setvect(0x21, old_int21_handler);

    return 0;
}

Do not forget to restore the original INT 21 handler on program exit, or bad things will happen. Unlike modern OSes, which close most open resources (files, sockets, mapped memory, etc.) for you at process exit, DOS doesn't provide such a feature and doesn't restore the interrupt handler.

For one, this code example shows how to define a new interrupt routine (a void function decorated with the _interrupt keyword, taking a union INTPACK as its single parameter). It also shows how to chain to the previous handler using _chain_intr() from Open Watcom. This function is defined in bld/clib/intel/a/chint086.asm, but no need to look into it now, as we'll get to it in due time.
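
The save / install / chain / restore sequence can also be tried on a modern host with a toy model of the interrupt vector table. This is only a sketch: every name in it (model_getvect, model_setvect, model_hook, ...) is invented for the illustration, and plain function pointers stand in for far pointers to real ISRs.

```c
#include <stddef.h>

typedef void (*handler_fn)(void);

static handler_fn vector_table[256]; /* stand-in for the interrupt vector table */
static handler_fn saved_handler;     /* plays the role of old_int21_handler */

int original_calls; /* how often the "DOS" handler ran */
int hook_calls;     /* how often our hook ran */

handler_fn model_getvect(unsigned n) { return vector_table[n & 0xFF]; }
void model_setvect(unsigned n, handler_fn h) { vector_table[n & 0xFF] = h; }

/* Pretend this is the original DOS handler. */
void model_dos_handler(void) { original_calls++; }

/* Our hook: do something, then chain to the saved handler. */
void model_hook(void)
{
    hook_calls++;
    saved_handler();
}

/* Save the old vector first, then install the hook. */
void model_install(void)
{
    saved_handler = model_getvect(0x21);
    model_setvect(0x21, model_hook);
}

/* Put the old vector back; on DOS nobody does this for us at exit. */
void model_uninstall(void)
{
    model_setvect(0x21, saved_handler);
}

/* "Fire" an interrupt by calling whatever is in the table. */
void model_int(unsigned n)
{
    handler_fn h = model_getvect(n);
    if (h != NULL) {
        h();
    }
}
```

While the hook is installed, every model_int(0x21) runs the hook and then the original handler; after model_uninstall() only the original handler runs again.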

How software interrupts work

Software interrupts are fired by the INT opcode in the x86 instruction set. The action of the INT n instruction is like a far call, except that the content of the FLAGS register (in case of 16-bit x86) is pushed to the stack before the return address; the return address is a far pointer consisting of the current CS (code segment) and IP (instruction pointer / program counter) register contents.

Once the ISR has finished running, it uses the IRET opcode to return from the interrupt, which consumes the 3 items pushed to the stack in the INT call: the program counter gets restored to IP, the code segment gets restored to CS and the flags register is restored from the stack (as if the POPF opcode was executed).

One special case that is worth mentioning is how the DOS ISR 0x21 takes care of setting/clearing the CF (carry flag) in the FLAGS register. It cannot set the flag manually, because the IRET will overwrite the FLAGS register with the flags stored on ISR entry. So instead of setting the flag, it reaches down into the stack to modify the stored-on-the-stack FLAGS content that will get applied when IRET is executed. In the DOSBox-X implementation, this can be found as CALLBACK_SCF() in src/cpu/callback.cpp, which reads the 16-bit word from SP + 4 (that's real_readw(SegValue(ss),reg_sp+4)), then sets or unsets the carry flag, and then stores the new flags again (real_writew(SegValue(ss),reg_sp+4,(uint16_t)tempf);).
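
Here is a toy model of that mechanism in portable C, mostly to show why the saved FLAGS word sits at SP + 4 on ISR entry. The struct cpu16 and all function names are made up for this sketch; it ignores segmentation of the stack itself, and it leaves out that a real CPU also clears IF and TF when entering the ISR.

```c
#include <stdint.h>

#define FLAG_CF 0x0001u /* carry flag, bit 0 of FLAGS */

struct cpu16 {
    uint16_t ip, cs, flags;
    uint16_t sp;        /* byte offset into the toy stack */
    uint16_t stack[64]; /* stack grows downwards from sizeof(stack) */
};

static void push16(struct cpu16 *c, uint16_t v)
{
    c->sp -= 2;
    c->stack[c->sp / 2] = v;
}

static uint16_t pop16(struct cpu16 *c)
{
    uint16_t v = c->stack[c->sp / 2];
    c->sp += 2;
    return v;
}

/* INT n: push FLAGS, then CS, then IP, and jump to the ISR. */
void do_int(struct cpu16 *c, uint16_t isr_cs, uint16_t isr_ip)
{
    push16(c, c->flags);
    push16(c, c->cs);
    push16(c, c->ip);
    c->cs = isr_cs;
    c->ip = isr_ip;
}

/* What the ISR does to report status: patch the saved FLAGS at SP + 4
 * (SP + 0 is the saved IP, SP + 2 is the saved CS). */
void isr_set_carry(struct cpu16 *c, int carry)
{
    uint16_t saved = c->stack[(c->sp + 4) / 2];

    if (carry) {
        saved = (uint16_t)(saved | FLAG_CF);
    } else {
        saved = (uint16_t)(saved & ~FLAG_CF);
    }

    c->stack[(c->sp + 4) / 2] = saved;
}

/* IRET: pop IP, CS and FLAGS, in that order. */
void do_iret(struct cpu16 *c)
{
    c->ip = pop16(c);
    c->cs = pop16(c);
    c->flags = pop16(c);
}
```

Whatever the ISR does to the live flags is lost on do_iret(); only the word patched by isr_set_carry() reaches the caller.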

Why am I telling you this? Because correctly forwarding the CF result from the DOS ISR will be important if we are to hook the file open function.

On The Stack

So, what else is on the stack? In case of Open Watcom, decorating a function with _interrupt will make sure that it saves all registers on entry, restores them on exit, and also that it uses IRET to return from the ISR.

Let's look at an example, create tut3.c:

#include <i86.h>

extern void function_body(void);

void _interrupt
empty_interrupt_function(union INTPACK r)
{
    function_body();
}

Then compile (don't link) it with owcc -c tut3.c -bdos -I$WATCOM/h - this should leave us with an object file called tut3.o. You can inspect the assembly that was generated using wdis tut3.o, which outputs (edited/trimmed for readability and comments added):

Segment: _TEXT BYTE USE16 00000024 bytes
0000  empty_interrupt_function_:

; ISR execution begins here, first all registers are
; pushed to the stack for later restoration, so that
; the interrupted function won't have its register
; contents overwritten (the layout is the same as
; the "union INTPACK r" used as parameter)

0000  50        push  ax
0001  51        push  cx
0002  52        push  dx
0003  53        push  bx
0004  54        push  sp
0005  55        push  bp
0006  56        push  si
0007  57        push  di
0008  1E        push  ds
0009  06        push  es
000A  50        push  ax   ; dummy gs entry is a copy of ax
000B  50        push  ax   ; dummy fs entry is a copy of ax

; from here, the new stack frame is established:

000C  89 E5     mov   bp,sp

; the direction flag is cleared because compiled C code
; assumes DF=0 for string instructions, and the interrupted
; program may have left it set:

000E  FC        cld

; the function body is called here; "DS" is first loaded
; with this module's data group (DGROUP) so that the C code
; can access its globals, while the near call (E8 xx xx)
; stays within the current code segment:

000F  B8 00 00  mov   ax,DGROUP:CONST
0012  8E D8     mov   ds,ax
0014  E8 00 00  call  function_body_

; after the function body is done, the stack is
; unwound and backed-up registers are restored:

0017  58        pop   ax  ; dummy fs entry is restored
0018  58        pop   ax  ; dummy gs entry is restored
0019  07        pop   es
001A  1F        pop   ds
001B  5F        pop   di
001C  5E        pop   si
001D  5D        pop   bp
001E  5B        pop   bx
001F  5B        pop   bx
0020  5A        pop   dx
0021  59        pop   cx
0022  58        pop   ax
0023  CF        iret

This is quite simple, and it allows modifying the registers on the stack (by modifying the contents of union INTPACK r).

The FS and GS registers were only added with the 80386, so while union INTPACK includes them, their values are bogus for 16-bit code targeting the 8086: the OpenWatcom code pushes the content of AX into both slots, and on restore it pops both into AX before popping the real AX, so in practice those values are not used (but you can use their stack space to store additional data...).
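
Reading the push sequence backwards gives the memory layout of the saved registers, lowest address (last push) first. The struct below is a hypothetical mirror of that layout written for this article, not the real union INTPACK from the Open Watcom headers; it is only meant to make the byte offsets easy to check on a modern host.

```c
#include <stddef.h>
#include <stdint.h>

/* Saved registers as they sit in memory after the ISR prologue,
 * followed by the three words pushed by the INT instruction itself. */
struct saved_frame16 {
    uint16_t dummy0, dummy1; /* the two placeholder words for fs/gs */
    uint16_t es, ds;
    uint16_t di, si, bp, sp;
    uint16_t bx, dx, cx, ax;
    uint16_t ip, cs, flags;  /* return address and FLAGS of the caller */
};

/* Overriding DS:DX for the rest of the handler is just a matter of
 * writing to the saved copies, since those are what gets popped later. */
void override_dsdx(struct saved_frame16 *frame, uint16_t segment, uint16_t offset)
{
    frame->ds = segment;
    frame->dx = offset;
}
```

With this layout, the saved DS sits 6 bytes above the frame start, the saved DX 18 bytes, and the caller's FLAGS 28 bytes.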

Chaining to the original ISR

So we now know how OpenWatcom backs up all registers on the stack for an _interrupt function. In case we want to call the original ISR, we could just use _chain_intr, but we want to modify DS:DX before chaining (so that the modified file name is used), and we want to restore the original DS:DX on return. In order to do this, we declare a new chaining function on the C side of things:

extern void _chain_intr_dsdx(
    void (_interrupt _far *__handler)(),
    unsigned short ds,
    unsigned short dx);

This function takes 3 parameters: a far pointer to the original ISR, a 16-bit DS value to restore, and a 16-bit DX value to restore. Both values are the ones to put back when we return from the ISR; the override DS:DX values we want to pass to the original ISR can be set in union INTPACK r directly.

When called, this function sets up the stack in such a way that the original ISR can be called (with modified DS:DX). The original ISR then returns into another function (_restore_ds_dx) that we implement in assembly, which takes care of restoring the original DS:DX and then copies the carry flag (now in the FLAGS register, from the return of the original ISR) to the stack-saved FLAGS that will be restored by the "IRET" carried out by _restore_ds_dx. This might be a bit nested, but it seems to work fine in my tests, and seems easier to pull off compared to modifying the stack even further to allow a normal far "RET" to take care of jumping to the right caller.

Here's the assembler implementation of this functionality, which you can download as chain.s:

; chain a DOS interrupt with restoring of DS:DX to
; original values passed into the chain function
; 2023-09-12 Thomas Perl <m@thp.io>

_TEXT   segment word public 'CODE'
_TEXT   ends

_TEXT   segment

_restore_ds_dx proc far
    public "C", _restore_ds_dx

    ; we are almost done -- we just need to restore the original
    ; DS:DX that we overwrote with our custom filename, and we
    ; need to transfer the carry flag (used as status bit) from
    ; the current context (because the DOS ISR assumes "we" are
    ; the caller, so we see the carry flag, but it hasn't been
    ; written to the flags stored on the stack for our "iret")

    ; save ax and bp, as we are going to use ax as temporary, and
    ; bp is used so we can read/write the return flags buried in
    ; the stack (that will eventually be loaded by "iret")
    push ax
    push bp
    mov bp,sp

    ; offset is 12 bytes from the stack pointer, because:
    ;  2 bytes bp (pushed above)
    ;  2 bytes ax (pushed above)
    ;  2 bytes dx (stored for us)
    ;  2 bytes ds (stored for us)
    ;  4 bytes return address far pointer (CS:IP) (for "iret")
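    mov ax,12[bp]

    ; branch on the carry flag returned by the DOS ISR right away:
    ; push and mov leave the flags untouched, but "and"/"or" always
    ; clear CF, so the test has to happen before either of them
    jc set_carry

    ; clear the carry flag
    and ax,0FFFEh
    jmp store_flags

set_carry:
    ; set the carry flag
    or ax,0001h

store_flags:
    ; store new FLAGS with carry flag from DOS ISR
    mov 12[bp],ax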

    ; restore sp, bp and ax
    mov sp,bp
    pop bp
    pop ax

    ; restore saved ds/dx
    pop dx
    pop ds

    iret ; finally- return from ISR (with the correct carry flag)
_restore_ds_dx endp

_chain_intr_dsdx     proc far
    public "C", _chain_intr_dsdx
    ; never return to the caller
    ; doesn't have return address on the stack

    mov     sp,bp                   ; reset SP to point to saved registers

    ; incoming variables:
    ; ax = offset
    ; dx = segment
    ; bx = ds to restore
    ; cx = dx to restore

    xchg ax,bx

    ; incoming variables:
    ; bx = offset
    ; dx = segment
    ; ax = ds to restore
    ; cx = dx to restore

    ; stack layout before:
    ; bp +  0 = (dummy fs) (free for overwriting immediately)
    ; bp +  2 = (dummy gs) (free for overwriting immediately)
    ; bp +  4 = saved es
    ; bp +  6 = saved ds
    ; bp +  8 = saved di
    ; bp + 10 = saved si (can be overwritten after restore)
    ; bp + 12 = saved bp (moved)
    ; bp + 14 = saved sp (not restored)
    ; bp + 16 = saved bx (moved)
    ; bp + 18 = saved dx (can be overwritten after restore)
    ; bp + 20 = saved cx (swapped for restore)
    ; bp + 22 = saved ax (swapped for restore)
    ; bp + 24 = offset of ISR CALLER
    ; bp + 26 = segment of ISR CALLER
    ; bp + 28 = flags to restore for ISR CALLER

    xchg    cx,20[bp] ; restore cx, & put in "dx to restore"
    xchg    ax,22[bp] ; restore ax, & put in "ds to restore"

    mov si,10[bp] ; restore si immediately, so we can overwrite it
    mov 10[bp],bx ; store offset of ISR

    mov bx,12[bp] ; move saved bp
    mov 0[bp],bx

    mov 12[bp],dx ; store segment of ISR

    mov bx,16[bp] ; move saved bx
    mov 2[bp],bx

    mov dx,18[bp] ; restore dx immediately, so we can overwrite it

    mov bx,offset _restore_ds_dx ; load offset
    mov 14[bp],bx

    mov bx,seg _restore_ds_dx ; load segment
    mov 16[bp],bx

    mov bx,28[bp] ; load flags
    and bx,0FCFFh ; except for IF and TF
    mov 18[bp],bx

    ; required stack layout:              __
    ; bp +  0 = saved bp OK                 |
    ; bp +  2 = saved bx OK                 |
    ; bp +  4 = saved es -- STAYS THE SAME  |-- Restored using "pop"
    ; bp +  6 = saved ds -- STAYS THE SAME  |
    ; bp +  8 = saved di -- STAYS THE SAME _|
    ; bp + 10 = offset of ISR to call OK  \____ Called using "ret"
    ; bp + 12 = segment of ISR to call OK /           ____
    ; bp + 14 = offset of _restore_ds_dx -- CALCULATED    |
    ; bp + 16 = segment of _restore_ds_dx -- CALCULATED   |-- Used up by "iret" - return to _restore_ds_dx
    ; bp + 18 = flags for _restore_ds_dx -- COPY FROM 28 _|
    ; bp + 20 = dx to restore OK \___ Restored by _restore_ds_dx using "pop"
    ; bp + 22 = ds to restore OK /                     ____________
    ; bp + 24 = offset of ISR CALLER -- STAYS THE SAME             |
    ; bp + 26 = segment of ISR CALLER -- STAYS THE SAME            |-- Used up by "iret" in _restore_ds_dx,
    ; bp + 28 = flags to restore for ISR CALLER -- STAYS THE SAME _|   finally return to original ISR caller

    mov     bx,28[bp] ; restore flags
    and     bx,0FCFFh ; except for IF and TF
    push    bx        ; bx -> flags via stack
    popf

    ; restore saved registers
    pop bp
    pop bx
    pop es
    pop ds
    pop di

    ; consume the offset + segment of the ISR to call
    ret

    ; the ISR will eventually "iret" to _restore_ds_dx, and this will
    ; restore the original "dx" and "ds" values from the stack and then
    ; do its own "iret" to return to the original ISR caller
_chain_intr_dsdx     endp
_TEXT ends
end
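
To convince yourself that the shuffling in _chain_intr_dsdx really produces the "required stack layout", the same moves can be replayed on a modern host. The sketch below models the 15 words at bp + 0 .. bp + 28 as an array and the registers as a struct; all names are made up for this model, and restore_off / restore_seg merely stand in for the address of _restore_ds_dx.

```c
#include <stdint.h>

#define W(byte_offset) ((byte_offset) / 2) /* bp-relative byte offset -> word index */

struct regs16 { uint16_t ax, bx, cx, dx, si, di, bp, ds, es; };

static void xchg16(uint16_t *a, uint16_t *b)
{
    uint16_t t = *a;
    *a = *b;
    *b = t;
}

/* frame[] holds the stack words at bp + 0 .. bp + 28 on entry. Returns the
 * register contents at the point where "ret" executes; frame[W(10)] and up
 * is what remains on the stack at that moment. */
struct regs16 model_chain_dsdx(uint16_t frame[15],
                               uint16_t isr_off, uint16_t isr_seg,
                               uint16_t ds_to_restore, uint16_t dx_to_restore,
                               uint16_t restore_off, uint16_t restore_seg)
{
    struct regs16 r = {0};

    /* incoming register arguments, as listed in the assembly comments */
    r.ax = isr_off;
    r.dx = isr_seg;
    r.bx = ds_to_restore;
    r.cx = dx_to_restore;

    xchg16(&r.ax, &r.bx);
    xchg16(&r.cx, &frame[W(20)]); /* restore cx, store "dx to restore" */
    xchg16(&r.ax, &frame[W(22)]); /* restore ax, store "ds to restore" */

    r.si = frame[W(10)];  frame[W(10)] = r.bx;  /* offset of ISR */
    r.bx = frame[W(12)];  frame[W(0)] = r.bx;   /* move saved bp */
    frame[W(12)] = r.dx;                        /* segment of ISR */
    r.bx = frame[W(16)];  frame[W(2)] = r.bx;   /* move saved bx */
    r.dx = frame[W(18)];                        /* restore dx */

    frame[W(14)] = restore_off;
    frame[W(16)] = restore_seg;
    frame[W(18)] = frame[W(28)] & 0xFCFF;       /* flags without IF and TF */

    /* the five pops before "ret" */
    r.bp = frame[W(0)];
    r.bx = frame[W(2)];
    r.es = frame[W(4)];
    r.ds = frame[W(6)];
    r.di = frame[W(8)];

    return r;
}
```

Feeding it a frame with recognizable values shows that all general-purpose registers end up with their saved contents, and that the words left on the stack are, in order: ISR offset and segment, the return address into _restore_ds_dx plus its flags, the DX and DS values to restore, and finally the untouched return frame of the original caller.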

Putting it all together

The calling side in C is given below (doshook.c), abbreviated and simplified for readability. Since entering an ISR only sets up its code segment (CS) via the far call, but doesn't set up its data segment (DS), we resort to a dirty trick: storing data in the code segment by overwriting the bytes of a function that lives in the code segment but is never called, so nobody will care about it. In cases where CS equals DS, this might not be necessary, and you could just address data via MK_FP(CS, DS-relative-offset), but this solution also works in situations where CS != DS (by storing DS plus two offsets at a known position in the code segment).

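#include <dos.h>     /* _dos_getvect(), _dos_setvect() */
#include <i86.h>     /* union INTPACK, MK_FP() and friends, segread() */
#include <string.h>  /* strcpy() */

static void _interrupt _far
(*old_int21_handler)(void);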

static char
original_filename_buf[64];

static char
redirect_filename_buf[64];

/**
 * This dummy function is used here to provide
 * some data storage in the code segment.
 *
 * While we are executing in hook_int21(), "cs" is
 * set to this function's code segment, and we
 * can use its offset to read the "ds" of this
 * module and get a pointer to redirect_filename_buf.
 **/
static int
dummy_function_for_data_storage(int a)
{
    int res = 0;
    for (int i=0; i<a; ++i) {
        res += a;
    }
    return res;
}

struct DummyFunctionDataStorage {
    short ds;
    short original_filename_buf_offset;
    short redirect_filename_buf_offset;
};

static void _interrupt
hook_int21(union INTPACK r)
{
    /**
     * 3C = create/truncate file
     * 3D = open existing file (AL = 0 read only, AL = 1 write only, AL = 2 read/write)
     **/
    if (r.h.ah == 0x3c || r.h.ah == 0x3d) {
        int is_write = (r.h.ah == 0x3c || (r.h.ah == 0x3d && (r.h.al == 1 || r.h.al == 2)));

        // Store original ds/dx values here
        unsigned short orig_ds = r.w.ds;
        unsigned short orig_dx = r.w.dx;

        /* Determine DS from CS */
        struct DummyFunctionDataStorage far *dfds =
            (struct DummyFunctionDataStorage far *)dummy_function_for_data_storage;

        /* filename to open */
        char far *filename = MK_FP(r.w.ds, r.w.dx);
        char far *original_filename_far = MK_FP(dfds->ds, dfds->original_filename_buf_offset);
        char far *redirect_filename_far = MK_FP(dfds->ds, dfds->redirect_filename_buf_offset);

        char far *fn_cmp = filename;
        char far *or_cmp = original_filename_far;

        /* Do not call strcmp() here, as it's a library function,
         * and our segments might not be set up properly (for its
         * globals) */
        while (*fn_cmp != '\0' && *fn_cmp == *or_cmp) {
            ++fn_cmp;
            ++or_cmp;
        }

        if (*fn_cmp == '\0' && *or_cmp == '\0') {
            filename = redirect_filename_far;
            r.w.ds = FP_SEG(filename);
            r.w.dx = FP_OFF(filename);

            _chain_intr_dsdx(old_int21_handler, orig_ds, orig_dx);
        }
    }

    // Just chain normally
    _chain_intr(old_int21_handler);
}

static void
init_fileopen_hook()
{
    /**
     * Store our current data segment and pointer
     * to the redirect filename buffer in the
     * DummyFunctionDataStorage, so we can access
     * it in situations where we just know CS.
     **/
    struct SREGS segs;
    segread(&segs);

    struct DummyFunctionDataStorage far *dfds =
        (struct DummyFunctionDataStorage far *)dummy_function_for_data_storage;

    dfds->ds = segs.ds;
    dfds->original_filename_buf_offset = (short)original_filename_buf;
    dfds->redirect_filename_buf_offset = (short)redirect_filename_buf;

    // TODO: Retrieve original and redirect file names from game catalog,
    // and only if we have figured out that we are running from CD-ROM :)

    strcpy(original_filename_buf, "loonies8.hig");
    strcpy(redirect_filename_buf, "C:\\LOON8RE.DIR");

    old_int21_handler = _dos_getvect(0x21);
    _dos_setvect(0x21, hook_int21);
}

static void
deinit_fileopen_hook()
{
    _dos_setvect(0x21, old_int21_handler);
}

This just uses a single hardcoded filename, but one could see how this could be extended to work more dynamically, and maybe do an on-demand copy + redirect like overlayfs in Linux does (reads come from the original path, but when a file is opened for writing, it is copied to the writable path and that copy is opened instead).
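
As a rough sketch of what such an overlay decision could look like, the function below only decides what to do and leaves the actual copying out. It is not part of doshook.c: the names are invented, it assumes that reads should go to the writable copy once one exists, and it masks AL with 0x07 before looking at the access mode (the upper bits can carry sharing flags on later DOS versions), which differs from the direct comparison in the listing above.

```c
enum redirect_action {
    OPEN_ORIGINAL,      /* read-only open, no writable copy yet */
    OPEN_REDIRECTED,    /* open the file in the writable location */
    COPY_THEN_REDIRECT  /* first write access: copy, then open the copy */
};

/* AH = 0x3D: the low bits of AL hold the access mode
 * (0 = read only, 1 = write only, 2 = read/write). */
static int is_write_access(unsigned char al)
{
    unsigned char mode = al & 0x07;
    return mode == 1 || mode == 2;
}

/* Decide how to treat an open/create call for a file we manage.
 * copy_exists tells us whether a writable copy was made earlier. */
enum redirect_action choose_action(unsigned char ah, unsigned char al,
                                   int copy_exists)
{
    if (ah == 0x3c) {
        /* create/truncate: old contents are discarded, nothing to copy */
        return OPEN_REDIRECTED;
    }

    if (copy_exists) {
        /* assumption: once a copy exists, it is the current version */
        return OPEN_REDIRECTED;
    }

    if (ah == 0x3d && is_write_access(al)) {
        return COPY_THEN_REDIRECT;
    }

    return OPEN_ORIGINAL;
}
```

A highscore file that is only ever read keeps coming from the CD-ROM, while the first open for writing triggers the copy and every later open lands on the hard disk.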

Because we're writing an ISR, we have to be careful not to call any functions that are inappropriate to call in the context of an ISR.
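
The hand-rolled comparison loop in hook_int21() is one instance of that. A possible variation, sketched here in portable C, is a comparison that also ignores ASCII case, since DOS treats file names case-insensitively and a game might just as well ask for LOONIES8.HIG. The function names are hypothetical, and in the real hook the parameters would be char far * pointers.

```c
/* ASCII-only upper-casing without <ctype.h>, so that no library
 * function (and none of its global state) is involved. */
static char ascii_upper(char c)
{
    return (c >= 'a' && c <= 'z') ? (char)(c - 'a' + 'A') : c;
}

/* Returns 1 if both zero-terminated names are equal when ASCII case is
 * ignored, 0 otherwise. Same loop structure as in hook_int21(). */
int names_equal_nocase(const char *a, const char *b)
{
    while (*a != '\0' && ascii_upper(*a) == ascii_upper(*b)) {
        ++a;
        ++b;
    }

    /* equal only if both strings ended at the same position */
    return *a == '\0' && *b == '\0';
}
```

Note that this still compares the name exactly as the program passed it in, so a request with a drive letter or a path prefix would not match a bare file name.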

Making this generic is left as an exercise to the reader. This article is mostly for me to remember the reasoning and stack effects behind the hooking of DOS INT 0x21 functions, and how to implement it with 16-bit OpenWatcom (C and ASM).