On 9/10/07, Oren Beck [email protected] wrote:
Ok folks, here's a mindset wrench.
CAN we make a set of fallbacks to allow a certain minimal function allowing a potential "panic" to seek external help?
Not much.
The defining characteristic of a kernel panic is an error that the kernel has detected within its own data structures and procedures, from which recovery is not possible. An application shouldn't be able to cause a panic; only kernel code itself (including drivers) or a HW problem like bad RAM can do so. AIX and Macs have some NVRAM set aside into which the kernel is able to write some information about the error before shutting down or rebooting. At boot time, these OSes copy the error information to a file on the hard drive from which further analysis is possible. That part could easily be automated.
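To make the "easily automated" part concrete, here is a rough sketch of the two halves: a panic-time write into a reserved region and a boot-time harvest. It is only a simulation; the bytearray stands in for NVRAM, and the record layout (MAGIC, HEADER, NVRAM_SIZE) and both function names are made up for illustration, not how AIX or Mac firmware actually lay things out.

```python
import struct
import zlib

# Hypothetical record layout: magic, payload length, CRC32 of payload, payload.
MAGIC = b"PNIC"
HEADER = struct.Struct("<4sII")
NVRAM_SIZE = 256  # assumed size of the reserved region


def write_panic_record(nvram: bytearray, message: str) -> None:
    """Panic path: store a (possibly truncated) message in the reserved region."""
    payload = message.encode("utf-8")[: len(nvram) - HEADER.size]
    HEADER.pack_into(nvram, 0, MAGIC, len(payload), zlib.crc32(payload))
    nvram[HEADER.size : HEADER.size + len(payload)] = payload


def harvest_panic_record(nvram: bytearray):
    """Boot path: return the stored message, or None if no valid record exists."""
    magic, length, crc = HEADER.unpack_from(nvram, 0)
    if magic != MAGIC or length > len(nvram) - HEADER.size:
        return None
    payload = bytes(nvram[HEADER.size : HEADER.size + length])
    if zlib.crc32(payload) != crc:
        return None  # record is damaged, do not trust it
    # Clear the header so the same panic is not reported on every boot.
    nvram[: HEADER.size] = bytes(HEADER.size)
    return payload.decode("utf-8", errors="replace")
```

At boot, the harvest step would append whatever it returns to a log file on disk, which is safe at that point because a freshly booted kernel is doing the writing. The checksum is there because the panicking kernel may not have finished the write.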
But the only reason that IBM and Apple are able to do this is that they control the hardware. They are able to spec the exact way in which this NVRAM is to be written. It isn't safe for a kernel that has panicked to write to any hard drive, because the kernel data structures that keep track of what files belong where on the drive are suspect. Writing any data to the drive risks corrupting the entire filesystem.
So we have to get either this special reserved NVRAM, ideally supported by a BIOS ROM routine that can't possibly have been corrupted, or a network interface operating under the same constraints that can send a kernel panic report somewhere that it can be safely saved...
Or we virtualize.
We write a VM system that sets up one or more virtual machines, and do our real computing within the VM(s). Each guest kernel would have panic() configured to put the panic report into a specific location in (high?) memory before calling for a warm reset, and the host system would trap the reset instruction. The host should be able to detect the signature of a panic, and write the guest's memory image to a file on the host while rebooting the VM. Being able to put the whole mess into a debugger would be so valuable.
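A toy model of that handshake, just to show the shape of it. Guest memory is a plain bytearray here, and the signature bytes, the fixed offset, and the function names are all assumptions for illustration; a real hypervisor would hook its own reset trap and pick its own layout.

```python
import struct

PANIC_SIGNATURE = b"VMPANIC!"          # hypothetical 8-byte marker
REPORT_OFFSET = 0x1000                 # hypothetical fixed guest address
REPORT_HEADER = struct.Struct("<8sI")  # signature, report length in bytes


def guest_panic(memory: bytearray, report: str) -> None:
    """Guest side: stand-in for a panic() that leaves a report before warm reset.

    Assumes the report fits in memory after REPORT_OFFSET.
    """
    data = report.encode("ascii", errors="replace")
    REPORT_HEADER.pack_into(memory, REPORT_OFFSET, PANIC_SIGNATURE, len(data))
    start = REPORT_OFFSET + REPORT_HEADER.size
    memory[start : start + len(data)] = data


def on_guest_reset(memory: bytearray, dump_path: str):
    """Host side: called when the reset is trapped, before the VM is rebooted.

    Returns the panic report, or None if this was an ordinary reboot.
    """
    signature, length = REPORT_HEADER.unpack_from(memory, REPORT_OFFSET)
    if signature != PANIC_SIGNATURE:
        return None
    # The host's filesystem state is not suspect, so writing here is safe.
    with open(dump_path, "wb") as dump:
        dump.write(memory)
    start = REPORT_OFFSET + REPORT_HEADER.size
    report = bytes(memory[start : start + length]).decode("ascii", errors="replace")
    # Wipe the header so the next reset is not mistaken for another panic.
    end = REPORT_OFFSET + REPORT_HEADER.size
    memory[REPORT_OFFSET:end] = bytes(REPORT_HEADER.size)
    return report
```

The file written by on_guest_reset is the full guest image, which is what you would hand to a debugger afterwards; the short report is only there so the host can log something readable right away.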
As multi-core CPUs and hardware assistance for virtualization become more common, this should get easier.