IRIX Binary Compatibility, Part 4

by Emmanuel Dreyfus

Native Implementation of Signals

Signals are the difficult part of IRIX emulation. However, before examining the way they work on IRIX, let us study the signal implementation in NetBSD/mips.

A user process enters the kernel by a trap. When a trap is caught, the hardware transfers control to the kernel. Assembly code in sys/arch/mips/mips/locore.S builds a trap frame (this is a struct frame, defined in sys/arch/mips/include/proc.h) on the kernel stack, in which CPU registers are saved. Then the trap() function from sys/arch/mips/mips/trap.c is called to handle the trap.

When resuming the process execution, the kernel just restores the CPU registers from the trap frame. This restores the program counter, stack pointer, and so on.

When the kernel invokes a signal handler, it has to trick the user process so that on return to userland, it executes the signal handler instead of resuming normal execution where it was stopped when the trap occurred.

This is done by modifying the trap frame. The saved program counter is modified so that on return to userland the signal handler is run. Registers A0, A1, and A2 are set to the signal handler's arguments. For native processes, this is done in sys/arch/mips/mips/mips_machdep.c:sendsig().
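The redirection described above can be sketched in a few lines of C. This is only an illustration, not the NetBSD code: struct toy_frame and toy_redirect_to_handler() are hypothetical names, and the real struct frame holds the full register set rather than four fields.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced stand-in for the MIPS trap frame. */
struct toy_frame {
    uint32_t pc;          /* saved program counter */
    uint32_t a0, a1, a2;  /* first three argument registers */
};

/*
 * Rewrite the saved registers so that the next return to userland
 * starts the signal handler with its three arguments in A0..A2.
 */
static void toy_redirect_to_handler(struct toy_frame *f, uint32_t handler,
                                    uint32_t sig, uint32_t code,
                                    uint32_t scp)
{
    assert(f != NULL);
    f->pc = handler;  /* resume in the handler, not at the trap site */
    f->a0 = sig;      /* first argument: signal number */
    f->a1 = code;     /* second argument: cause code */
    f->a2 = scp;      /* third argument: user address of the saved context */
}
```

Note that nothing here preserves the old register values; that is the job of the signal context discussed next.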

The difficult task is resuming the process normally once the signal handler is executed. To achieve this, we must save the machine state from the trap frame somewhere, and restore it correctly later. This is done by copying the machine state to a struct sigcontext (defined in sys/arch/mips/include/signal.h) on the process' user stack. The struct sigcontext address is handed to the signal handler through its third argument, as documented in sigaction(2).

Another requirement is to get control back after the signal handler is executed. On MIPS processors, there is an RA register, which holds the Return Address. Function returns are implemented in assembly by a jump to RA:

 j      $ra

The RA saved in the trap frame is used to control where the signal handler will return. We want to return to the kernel in order to undo the trap frame modification so that the process will be able to resume normally the next time it returns to userland. Unfortunately, it is not possible to just jump to kernel code from a user program: the user process must do a trap to return to the kernel.

This is done through a dedicated system call: sigreturn(2). The RA is set to point to a small piece of code known as the signal trampoline, which the kernel copies onto the user stack. This works because the signal handler returns to code in user space, and the signal trampoline then calls sigreturn(2).

Here is the signal trampoline for NetBSD/mips, as defined in sys/arch/mips/mips/locore.S:

 addu    a0, sp, 16              # address of sigcontext
 li      v0, SYS___sigreturn14   # sigreturn(scp)
 syscall
 break   0                       # just in case sigreturn fails

If the signal handler did not screw up the stack pointer, the signal context structure will be 16 bytes above the stack pointer. The signal trampoline sets up A0 with the address of the struct sigcontext, which will be the first argument to sigreturn(2).

It is important to hand the struct sigcontext to sigreturn(2), because it needs it in order to restore the process trap frame. This is done in sys/arch/mips/mips/mips_machdep.c:sigreturn().

sigreturn(2) is a system call that does not return: once the trap frame is restored and we return to user space, execution will resume after the trap that occurred before the signal handler execution. The context of the signal handler does not exist anymore.
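The save and restore halves of this mechanism can be modelled in ordinary C. The sketch below is a simulation under stated assumptions, not the code from mips_machdep.c: the toy_ types and functions are hypothetical, the signal context is reduced to the general-purpose registers plus the PC, and user addresses are plain integers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_NREGS 32                  /* MIPS general-purpose registers */
enum { TOY_SP = 29, TOY_RA = 31 };    /* stack pointer, return address */

/* Hypothetical reduced trap frame and signal context. */
struct toy_frame      { uint32_t regs[TOY_NREGS]; uint32_t pc; };
struct toy_sigcontext { uint32_t regs[TOY_NREGS]; uint32_t pc; };

/*
 * Delivery side: copy the machine state into the signal context, then
 * arrange for the handler to return into the trampoline with the
 * context sitting 16 bytes above the stack pointer.
 */
static void toy_save_context(struct toy_frame *f, struct toy_sigcontext *scp,
                             uint32_t scp_uaddr, uint32_t trampoline)
{
    assert(f != NULL && scp != NULL);
    memcpy(scp->regs, f->regs, sizeof(scp->regs));
    scp->pc = f->pc;
    f->regs[TOY_SP] = scp_uaddr - 16;  /* trampoline computes sp + 16 */
    f->regs[TOY_RA] = trampoline;      /* handler's "j $ra" lands here */
}

/* Return side: put the saved state back into the trap frame. */
static void toy_sigreturn(struct toy_frame *f,
                          const struct toy_sigcontext *scp)
{
    assert(f != NULL && scp != NULL);
    memcpy(f->regs, scp->regs, sizeof(f->regs));
    f->pc = scp->pc;
}
```

After toy_sigreturn() the frame is indistinguishable from the one captured at trap time, which is why the handler's own context is simply gone once the process resumes.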

IRIX Implementation of Signal Delivery

On IRIX, things are much more complicated. Once again, doing things incrementally is a good solution, allowing different problems to be addressed separately.

Hence, implementing irix_sendsig() and irix_sigreturn() as plain copies of the native sendsig() and sigreturn() is a good first step. The only original item we need is an IRIX signal trampoline. It can be made from the native signal trampoline, since the only thing that needs to change is the system call number for irix_sigreturn:

 addu      $a0, $sp, 16             # address of sigcontext
 li        $v0, IRIX_SYS_sigreturn + SYSCALL_SHIFT
 syscall                            # irix_sys_sigreturn(scp)
 break     0                        # just in case sigreturn fails

Here we are storing into A0 the address of the sigcontext, which is 16 bytes above the stack pointer (sp + 16). A0 is used to store the first argument to the irix_sys_sigreturn system call.

This is really basic, and we are far from emulating what really happens on IRIX, but, in fact, it will even "just work" for a lot of programs where the signal handler does not use its arguments.

Let us review the arguments of the signal handler on IRIX. According to the IRIX sigaction(2) man page, we have two situations.

If the SA_SIGINFO flag was not set on sigaction(2) call, then the signal handler is invoked with:

  • the signal number
  • a code indicating the cause of the signal (may be 0 for no known reason)
  • a pointer to the struct irix_sigcontext (defined in sys/compat/irix/irix_signal.h) where the process context was saved.

If SA_SIGINFO was set:

  • the signal number
  • a pointer to a struct irix_siginfo (defined in sys/compat/irix/irix_signal.h) that explains in greater detail the causes of the signal. This pointer may be NULL.
  • a pointer to the struct irix_ucontext (again see sys/compat/irix/irix_signal.h) where the process context was saved
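For comparison, here is how the two handler styles look from a user program. This is a sketch using the portable POSIX sigaction(2) interface, not IRIX-specific code: in POSIX a handler installed without SA_SIGINFO is declared with only the signal number, whereas IRIX, as listed above, also passes a code and a context pointer. The function and variable names are made up for the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>
#include <string.h>

/* Set by the handlers so the caller can see which one ran. */
static volatile sig_atomic_t last_plain_sig = 0;
static volatile sig_atomic_t last_info_sig = 0;

/* Handler style used when SA_SIGINFO is not set. */
static void plain_handler(int sig)
{
    last_plain_sig = sig;
}

/* Handler style used when SA_SIGINFO is set. */
static void info_handler(int sig, siginfo_t *info, void *ctx)
{
    (void)info;  /* details about the cause of the signal */
    (void)ctx;   /* saved user context, normally a ucontext_t */
    last_info_sig = sig;
}

/* Install plain_handler for sig; returns 0 on success. */
static int install_plain(int sig)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = plain_handler;
    sigemptyset(&sa.sa_mask);
    return sigaction(sig, &sa, NULL);
}

/* Install info_handler for sig with SA_SIGINFO; returns 0 on success. */
static int install_info(int sig)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof(sa));
    sa.sa_sigaction = info_handler;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    return sigaction(sig, &sa, NULL);
}
```

Calling install_plain(SIGUSR1) and install_info(SIGUSR2), then raise() on each signal, runs the matching handler before raise() returns in a single-threaded program.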

If the signal handler attempts to use its arguments, we must accurately emulate them. When SA_SIGINFO is not set, things are simple because this is exactly what the native sendsig() sets up. However, there is a problem when SA_SIGINFO is set.

It is easy to modify irix_sendsig() so that it saves the context in a struct irix_ucontext instead of a struct irix_sigcontext when SA_SIGINFO was set. But the problem is in irix_sys_sigreturn(): how can it tell whether the context is to be restored from a struct irix_sigcontext or from a struct irix_ucontext? Assuming the wrong structure will restore a corrupted context, and this means a crash for the process, because execution will resume at whatever location the PC register was wrongly set to.

We have the beginning of an answer by looking at IRIX's sigreturn man page.

IRIX's sigreturn expects three arguments:

  • a pointer to the struct sigcontext (called scp) where the context was saved
  • a mysterious pointer of type void *, named ucp
  • the signal number

Once again, Linux has already implemented this, so it is tempting to borrow good ideas from there. Looking at the Linux source for sigreturn is not very instructive, however: when the first argument is NULL, Linux uses the mysterious ucp pointer to find the process context. Because we have no real idea how to handle this ucp pointer and the SA_SIGINFO flag, we will just leave them aside for now. The easy thing to do is to modify irix_sendsig() and irix_sigreturn() to use struct irix_sigcontext instead of the native struct sigcontext. At least this will make programs that do not set SA_SIGINFO happy with the content of the signal context if they happen to access it.
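The Linux behaviour mentioned above (fall back to ucp when scp is NULL) amounts to a simple dispatch on the first argument. The following sketch only illustrates that rule; it is not the Linux or NetBSD code, and the toy_ structures are hypothetical one-register reductions of the real layouts in sys/compat/irix/irix_signal.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, heavily reduced layouts: only the PC is modelled. */
struct toy_sigcontext { uint32_t pc; };
struct toy_ucontext   { uint32_t flags; uint32_t pc; };
struct toy_frame      { uint32_t pc; };

/*
 * Restore the trap frame from scp when it is given; otherwise treat
 * ucp as a pointer to a ucontext. Returns 0 on success, -1 when
 * neither pointer is usable.
 */
static int toy_irix_sigreturn(struct toy_frame *f,
                              const struct toy_sigcontext *scp,
                              const void *ucp, int signo)
{
    (void)signo;  /* unused in this sketch */
    assert(f != NULL);
    if (scp != NULL) {
        f->pc = scp->pc;
        return 0;
    }
    if (ucp != NULL) {
        const struct toy_ucontext *uc = ucp;
        f->pc = uc->pc;
        return 0;
    }
    return -1;
}
```

Whether IRIX itself follows the same rule is exactly what remains to be discovered at this point, so this should be read as a description of the hint, not of the final implementation.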
