Machine Check Handling

Current PowerPC kernels have a number of flaws in their machine check handling. The first is that they raise SIGSEGV against the current process if the machine check occurs while user-mode code is running. Many causes of machine check are not directly related to the current process, so delivering a signal to that process does not address the underlying fault. For example, on some platforms a PCI parity error detected while a PCI bus master writes into memory will cause a machine check, as will a parity error reading data from the L2 cache in order to write it out to memory.

It’s not possible to deal with this by treating all machine checks as fatal, because some platforms generate machine checks in recoverable situations, such as configuration cycles to empty PCI slots. The patch therefore adds a platform-specific machine check handler, with example implementations for the Qspan PCI bridge, the Powermac, and the MPC107 bridge. The Qspan and Powermac implementations are based on the existing kernel code for those machines. Only the MPC107 implementation has been tested.
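
The idea of a platform-specific handler can be sketched as a function pointer that each platform sets at boot. This is only an illustration in plain C and is not taken from the patch: the names mc_result, mc_info, set_platform_mc_handler, handle_machine_check and example_bridge_mc_handler are all hypothetical, and the config_cycle flag stands in for whatever bridge status a real handler would read.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical result codes; the patch's own names are not given in the text. */
enum mc_result {
    MC_RECOVERED,   /* platform dealt with it, execution may resume */
    MC_FATAL        /* nothing sensible can be done */
};

/* Hypothetical snapshot of what the platform code knows about the fault. */
struct mc_info {
    unsigned long fault_addr;  /* address associated with the machine check */
    int config_cycle;          /* non-zero if a PCI configuration cycle was in progress */
};

typedef enum mc_result (*mc_handler_t)(const struct mc_info *info);

/* Default used when no platform handler is installed: treat as fatal. */
static enum mc_result mc_default_handler(const struct mc_info *info)
{
    (void)info;
    return MC_FATAL;
}

static mc_handler_t platform_mc_handler = mc_default_handler;

/* Called once by platform setup code to install its own handler. */
void set_platform_mc_handler(mc_handler_t handler)
{
    assert(handler != NULL);
    platform_mc_handler = handler;
}

/* Example platform handler: a machine check raised by a configuration
 * cycle (for instance, probing an empty PCI slot) is recoverable;
 * anything else is treated as fatal. */
enum mc_result example_bridge_mc_handler(const struct mc_info *info)
{
    return info->config_cycle ? MC_RECOVERED : MC_FATAL;
}

/* Generic entry point: defer the decision to the platform. */
enum mc_result handle_machine_check(const struct mc_info *info)
{
    return platform_mc_handler(info);
}
```

With this shape the generic code never has to know which bridge is present; a platform that installs nothing keeps the conservative "fatal" behaviour.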

The patch also contains two minor features: it turns on the internal error checking of the 7400 and 7410, and it provides a mechanism to attach handlers to SMI, for boards which use that interrupt.
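
One way such an SMI attachment mechanism could look is a small table of callbacks that the interrupt entry point walks in order. The sketch below is an assumption about the general shape, not the patch's interface: smi_attach_handler, smi_dispatch, example_board_smi and the limit SMI_MAX_HANDLERS are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define SMI_MAX_HANDLERS 4   /* arbitrary limit for this sketch */

/* A handler returns non-zero if it claimed the interrupt. */
typedef int (*smi_handler_t)(void *data);

struct smi_slot {
    smi_handler_t fn;
    void *data;      /* opaque pointer passed back to the handler */
};

static struct smi_slot smi_handlers[SMI_MAX_HANDLERS];
static size_t smi_count;

/* Attach a handler; returns 0 on success, -1 if the table is full. */
int smi_attach_handler(smi_handler_t fn, void *data)
{
    assert(fn != NULL);
    if (smi_count == SMI_MAX_HANDLERS)
        return -1;
    smi_handlers[smi_count].fn = fn;
    smi_handlers[smi_count].data = data;
    smi_count++;
    return 0;
}

/* Called from the SMI entry point: run every attached handler and
 * return how many of them claimed the interrupt. */
int smi_dispatch(void)
{
    int claimed = 0;
    for (size_t i = 0; i < smi_count; i++) {
        if (smi_handlers[i].fn(smi_handlers[i].data))
            claimed++;
    }
    return claimed;
}

/* Example board handler: counts how many SMIs it has seen. */
int example_board_smi(void *data)
{
    int *count = data;
    (*count)++;
    return 1;
}
```

A board that uses SMI would call smi_attach_handler during its setup; boards that do not use the interrupt attach nothing, and smi_dispatch then reports that no handler claimed it.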