Debugging Windows NT Kernel STOPs on RISC-Based Platforms

ID: Q157472


The information in this article applies to:


SUMMARY

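This article discusses the basic steps involved in reading and interpreting the kernel stack on RISC platforms, using the DEC Alpha AXP as a base example. The following basic areas are covered: the Windows NT stack on a RISC system, finding the trap frame, and tracing the function call arguments.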


MORE INFORMATION

It is assumed that the reader has a basic understanding of kernel debugging Windows NT on Intel-based systems, as that is used as a basis for comparison. Any debug commands listed should work in WINDBG as well as the KD debuggers (ALPHAKD, MIPSKD, PPCKD).

Part 1: The Windows NT Stack on a RISC System

On an Intel-based system, a stack dump using the KB or KV command gives you a wealth of information. This is because of three things that are done in Intel assembly when a function call occurs:
  1. The arguments to the function are pushed on the stack.
  2. The return address is pushed on the stack.
  3. The frame pointer (EBP) is pushed on the stack.

These three steps allow the debugger to display a stack showing the first three arguments passed to each function, as well as other information (such as trap frames). On a RISC system, the frame pointers (equivalent to EBP) are present, as are the return addresses. However, arguments are passed in special argument registers and cannot be found on the stack. RISC systems use the stack for more permanent storage of variables (as you will see later).

On MIPS-based and DEC Alpha-based systems, the arguments are passed in order, first to last, in registers labeled a0, a1, a2, and so on (referred to as the argument registers). There are 4 such registers on MIPS systems (a0 - a3) and 6 on an Alpha (a0 - a5). If there are more arguments than can fit in the aX registers, both types of systems use the temporary (t0 - t7) registers. On PPC systems, the arguments are passed in order, first to last, in the registers r3, r4, r5 - r31.
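As a rough illustration of the register assignment just described, the following Python sketch maps an argument position to the argument register that carries it. The table and function names are hypothetical, and only the aX registers listed above for MIPS and Alpha are modeled; arguments beyond them are reported as None rather than guessed at.

```python
# Hypothetical lookup table built from the register counts given in the text.
ARG_REGISTERS = {
    "mips": ["a0", "a1", "a2", "a3"],
    "alpha": ["a0", "a1", "a2", "a3", "a4", "a5"],
}


def arg_register(platform, position):
    """Return the argument register for a 1-based argument position.

    Returns None when the argument does not fit in the aX registers.
    """
    regs = ARG_REGISTERS[platform]
    if 1 <= position <= len(regs):
        return regs[position - 1]
    return None


if __name__ == "__main__":
    for pos in range(1, 8):
        print(pos, arg_register("mips", pos), arg_register("alpha", pos))
```

For a function with five parameters, such as the one traced in Part 3, this gives a0 through a4 on an Alpha.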

The following examples show an Intel system stack and an Alpha system stack (using the KB and KV commands on the Intel system, and the KB command only on the Alpha):

   KDx86> kb
   FramePtr  RetAddr   Param1   Param2   Function Name
   fcbc69ac  80128bfb  fcc86588 fcdb3808 NT!KiTrap0E+0x252
   fcbc6a58  8013b26b  00000001 f8bc6ee0 NT!MmAccessFault+0x1cd
   fcbc6a58  80102af6  00000001 f8bc6ee0 NT!KiTrap0E+0xa7
   fcbc6af8  801bc367  fcc83000 fa4e6000 NT!@IofCompleteRequest@8+0x15c
   fcbc6b04  801bd84b  fcbc6c80 fcbc6c84 NTFS!NtfsCompleteRequest+0x58
   fcbc6b14  801bdb7b  fcdb3808 fcd75020 NTFS!NtfsCommonWrite+0xee8
   fcdb3808  fcbc6d34  00000043 00000000 NTFS!NtfsCommonWrite+0x1218

   KDx86> kv
   fcbc69ac  80128bfb  NT!KiTrap0E+0x252 (FPO: [0,0] TrapFrame @ fcbc69ac)
   fcbc6a58  8013b26b  NT!MmAccessFault+0x1cd
   fcbc6a58  80102af6  NT!KiTrap0E+0xa7 (FPO: [0,0] TrapFrame @ fcbc6a6c)
   fcbc6af8  801bc367  NT!@IofCompleteRequest@8+0x15c
   fcbc6b04  801bd84b  NTFS!NtfsCompleteRequest+0x58 (FPO: [3,0,2])
   fcbc6b14  801bdb7b  NTFS!NtfsCommonWrite+0xee8 (FPO: [seh] [0,0,0])
   fcdb3808  fcbc6d34  NTFS!NtfsCommonWrite+0x1218

   KDalpha> kb
   FramePtr  RetAddr   Param1   Param2   Function Name
   f18a6f20  80554ca8  818e20a0 001901ac NT!KeBugCheckEx+0x58
   f18a7220  8056c920  818e20a0 001901ac NTFS!NtfsExceptionFilter+0x118
   f18a7250  800b0dc8  001901ac 001901ac NTFS!NtfsCommonFileSystemControl+0xa0
   f18a7260  800d8ef0  001901ac 001901ac NT!OtsCSpecificHandler+0x78
   f18a72b0  800b08cc  001901ac 001901ac NT!RtlpExecuteHandlerForException+0x10
   f18a72c0  800c4360  001901ac 001901ac NT!RtlDispatchException+0xec
   f18a7600  800c2840  001901ac 001901ac NT!KiDispatchException+0x3f0
   f18a7900  800c2980  001901ac 001901ac NT!KiExceptionDispatch+0x50
   f18a79a0  80082f80  001901ac 001901ac NT!KiMemoryManagementException+0xc8
   f18a7ba0  80552c94  001901ac 001901ac NT!ExFreePool+0x270
   f18a7bf0  800a758c  001901ac 001901ac NTFS!NtfsFreeFcbTableEntry+0xa4 

NOTE: In the above stacks, the Param3 column was removed for readability.

The first two stacks are from a STOP 0xA on an Intel system. In the KB command output, the first two parameters passed to each function can be seen on the stack, and you would normally see the third parameter as well. In the KV command output, trap frames associated with the function calls also show up.

The third stack is from a STOP 0x24 occurring on a DEC Alpha system. The frame pointers and the return addresses are all valid. However, the parameters, although they may appear to be valid addresses, are not necessarily arguments. Generally, they are values that are pushed on the stack by the functions themselves.

Part 2: Finding the Trap Frame On a Windows NT RISC System

On a RISC system, the KV command does not show the trap frame as it does on an Intel system; more work is required to find it. In most cases, the frame pointer of one of the exception handling routines can be used as the trap frame address, and you will need to identify the correct function. If that is not possible, you might also be able to identify a function to which the trap frame was passed as an argument.

Here is an example from the STOP 0x24 stack listed above:

   KDalpha> kb
   FramePtr  RetAddr   Function Name
   f18a6f20  80554ca8  NT!KeBugCheckEx+0x58
   f18a7220  8056c920  NTFS!NtfsExceptionFilter+0x118
   f18a7250  800b0dc8  NTFS!NtfsCommonFileSystemControl+0xa0
   f18a7260  800d8ef0  NT!OtsCSpecificHandler+0x78
   f18a72b0  800b08cc  NT!RtlpExecuteHandlerForException+0x10
   f18a72c0  800c4360  NT!RtlDispatchException+0xec
   f18a7600  800c2840  NT!KiDispatchException+0x3f0
   f18a7900  800c2980  NT!KiExceptionDispatch+0x50
   f18a79a0  80082f80  NT!KiMemoryManagementException+0xc8
   f18a7ba0  80552c94  NT!ExFreePool+0x270
   f18a7bf0  800a758c  NTFS!NtfsFreeFcbTableEntry+0xa4
   f18a7c20  80580a80  NT!RtlDeleteElementGenericTable+0x6c 

First, recognize where the exception handling code starts and ends. This tells you which portion of the stack is exception handling code and where the trap occurred. In the above stack, the first exception handling routine entered is NT!KiMemoryManagementException; every frame listed above it is exception handling code. This means that the instruction that caused the trap is at NT!ExFreePool+0x270.
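The step above can be sketched in Python as follows. The frame list is transcribed by hand from the kb output (function names only, most recent frame first), and the function name find_faulting_frame is hypothetical. The sketch assumes you have already recognized which routine is the first exception handling routine; it does not try to detect that on its own.

```python
# Function names from the kb output above, most recent frame first.
FRAMES = [
    "NT!KeBugCheckEx+0x58",
    "NTFS!NtfsExceptionFilter+0x118",
    "NTFS!NtfsCommonFileSystemControl+0xa0",
    "NT!OtsCSpecificHandler+0x78",
    "NT!RtlpExecuteHandlerForException+0x10",
    "NT!RtlDispatchException+0xec",
    "NT!KiDispatchException+0x3f0",
    "NT!KiExceptionDispatch+0x50",
    "NT!KiMemoryManagementException+0xc8",
    "NT!ExFreePool+0x270",
    "NTFS!NtfsFreeFcbTableEntry+0xa4",
    "NT!RtlDeleteElementGenericTable+0x6c",
]


def find_faulting_frame(frames, first_handler):
    """Return the frame listed directly below the first exception handler.

    frames is ordered as kb prints it (most recent first), so the frame
    after the handler in the list is the code that was running at the trap.
    Returns None if the handler is not found or is the last frame listed.
    """
    for index, frame in enumerate(frames):
        name = frame.split("+")[0]  # strip the "+0x..." offset
        if name == first_handler:
            if index + 1 < len(frames):
                return frames[index + 1]
            return None
    return None


if __name__ == "__main__":
    print(find_faulting_frame(FRAMES, "NT!KiMemoryManagementException"))
```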

Next, walk through the code for each function, starting with KiMemoryManagementException, until you find one that deals with the trap frame. There are two ways to find it in this stack. First, the FramePtr of KiMemoryManagementException should point to the trap frame. Second, the trap frame is passed to KiDispatchException as its third parameter.

Running !trap on the FramePtr value for KiMemoryManagementException (f18a79a0) shows the following:

   KDalpha> !trap f18a79a0
   Debugger extension library [kdextalp.dll] loaded
   v0 = 00000000 00000040     a0 = 00000000 00000000
   t0 = 00000000 00000000     a1 = 00000000 00000001
   t1 = 00000000 0002f89c     a2 = ffffffff e1836048
   t2 = 00000000 00000000     a3 = ffffffff 81918008
   t3 = 00000000 00000000     a4 = ffffffff e19d69ec
   t4 = 00000000 00000001     a5 = 00000000 0039a014
   t5 = 00000000 00000000     t8 = ffffffff e1b10f88
   t6 = ffffffff c1b11008     t9 = ffffffff e188af8c
   t7 = 00000000 000003c0    t10 = ffffffff e188af8c
                             t11 = ffffffff 809fcb08
                              ra = ffffffff 80082ed0
                             t12 = ffffffff 809fcb08
                              at = ffffffff 818e0065
                              gp = ffffffff 800ee088
   fp = 00000000 00000004     sp = ffffffff f18a7ba0
   fir= ffffffff 80082f80
   ExFreePool+0x270
   0x80082f80  a0e70000         ldl         t6,0x0(t6) 
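
One way to read this output, sketched in Python: the faulting instruction is a load through t6 with a displacement of 0, so the address being read is whatever t6 held when the trap was taken. The helper names below are hypothetical; the register value is copied from the trap frame above.

```python
def reg64(text):
    """Convert a register printed as two 32-bit hex halves into an integer."""
    high, low = text.split()
    return (int(high, 16) << 32) | int(low, 16)


def effective_address(base, displacement):
    """Base register plus signed displacement, wrapped to 64 bits."""
    return (base + displacement) & 0xFFFFFFFFFFFFFFFF


if __name__ == "__main__":
    t6 = reg64("ffffffff c1b11008")        # t6 as shown by !trap
    address = effective_address(t6, 0x0)   # ldl t6,0x0(t6)
    print("faulting load reads from %016x" % address)
```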

Part 3: Tracing the Function Call Arguments

Even though arguments are not pushed on the stack on a RISC system, it is still possible to trace them as they are passed from function to function, but doing so requires some knowledge of the assembly language involved. This is possible because, in RISC assembly, most functions begin by saving several of the commonly used registers to more permanent storage locations on the stack. You can therefore trace the arguments by performing the following steps:
  1. Disassemble each function on the stack in two places: at the very beginning, and just before the call to the next function.
  2. Look at what is being put into the argument registers before each function call, and what happens to both the argument registers and any other registers in use after the function call.

In most cases, you will find one of two things: either the argument registers themselves are saved on the stack at the beginning of the called function, or the registers whose values were loaded into the argument registers are saved on the stack.

Here is an example from an Alpha dump file. The portion of the stack you wish to trace is:

   FramePtr  RetAddr Function Name
   f16c7550  ec138ae8 NTFS!BinarySearchIndex+0x134
   f16c7670  ec133254 NTFS!FindFirstIndexEntry+0xf8
   f16c76d0  ec13ba00 NTFS!NtfsRestartIndexEnumeration+0xe4
   f16c7830  ec1375f4 NTFS!NtfsQueryDirectory+0x728
   f16c7a50  ec12d930 NTFS!NtfsCommonDirectoryControl+0x124
   f16c7a90  8008607c NTFS!NtfsFsdDirectoryControl+0xe0
   f16c7b10  ec324d40 NT!IofCallDriver+0x8c 

Starting with the first function, BinarySearchIndex, the code shows 5 parameters, which means that on an Alpha, registers a0 through a4 will be used to pass them.

Now disassemble the function which called BinarySearchIndex, right before the return address. This will reveal where the values in the a0 through a4 registers came from:

NTFS!FindFirstIndexEntry+0xe4:
0xec138ad4  47ea0411         bis          zero,s1,a1
0xec138ad8  a21e0054         ldl          a0,0x54(sp)
0xec138adc  47eb0413         bis          zero,s2,a3
0xec138ae0  47ec0414         bis          zero,s3,a4
0xec138ae4  47e90412         bis          zero,s0,a2
0xec138ae8  d3401c29         bsr          ra,BinarySearchIndex 

The above assembly instructions would be read as follows:
  1. bis zero,s1,a1: bis is the mnemonic for a logical OR, and zero refers to a special register on the Alpha that always holds the value zero. ORing a register with zero is a fast copy from one register to another, in this case from s1 to a1. The other three bis instructions do the same thing with different registers.
  2. ldl a0,0x54(sp): ldl is the mnemonic for load longword, which loads the longword (dword) value at the memory address in operand 2 into the register in operand 1. 0x54(sp) is the Alpha counterpart of an Intel memory operand such as dword ptr [esp+54h].
  3. bsr ra,BinarySearchIndex: bsr is the branch-to-subroutine instruction, which is effectively the same as a call on an Intel system. The return address is placed in the ra register.
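
To make the reading of these instructions concrete, here is a minimal Python sketch that decodes the raw instruction words shown in the disassembly. It handles only the handful of instructions that appear in this article (bis in register form, lda, ldl, stl, stq, and the bsr target) and is not a general Alpha disassembler. The function names decode and bsr_target are hypothetical.

```python
# Alpha integer register names as the debugger prints them, r0 through r31.
REG = (["v0"] + ["t%d" % i for i in range(8)] + ["s%d" % i for i in range(6)]
       + ["fp"] + ["a%d" % i for i in range(6)] + ["t8", "t9", "t10", "t11"]
       + ["ra", "t12", "at", "gp", "sp", "zero"])

# Memory-format opcodes used in this article.
MEMORY_OPS = {0x08: "lda", 0x28: "ldl", 0x2C: "stl", 0x2D: "stq"}


def decode(word):
    """Decode one 32-bit instruction word; return None if not recognized."""
    opcode = word >> 26
    ra = (word >> 21) & 0x1F
    rb = (word >> 16) & 0x1F
    if opcode in MEMORY_OPS:
        disp = word & 0xFFFF
        if disp & 0x8000:          # sign-extend the 16-bit displacement
            disp -= 0x10000
        sign = "-" if disp < 0 else ""
        return "%s %s,%s0x%x(%s)" % (
            MEMORY_OPS[opcode], REG[ra], sign, abs(disp), REG[rb])
    is_register_form = (word & 0x1000) == 0
    function = (word >> 5) & 0x7F
    if opcode == 0x11 and function == 0x20 and is_register_form:
        return "bis %s,%s,%s" % (REG[ra], REG[rb], REG[word & 0x1F])
    return None


def bsr_target(pc, word):
    """Return the 32-bit target of a bsr located at address pc."""
    disp = word & 0x1FFFFF
    if disp & 0x100000:            # sign-extend the 21-bit displacement
        disp -= 0x200000
    return (pc + 4 + 4 * disp) & 0xFFFFFFFF


if __name__ == "__main__":
    for word in (0x47EA0411, 0xA21E0054, 0x47EB0413, 0x47EC0414, 0x47E90412):
        print("%08x  %s" % (word, decode(word)))
    print("bsr target: %08x" % bsr_target(0xEC138AE8, 0xD3401C29))
```

Decoding the words by hand in this way is a useful cross-check when you are unsure which source register feeds which argument register.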


Based on the above assembly, the following values are being placed in argument registers:

a0 = 0x54(SP)
a1 = s1
a2 = s0
a3 = s2
a4 = s3 

Now disassemble the beginning of BinarySearchIndex to see what is done with all of the above registers:

NTFS!BinarySearchIndex+0x0:
0xec13fb90  23defee0         lda          sp,-0x120(sp)
0xec13fb94  b53e0000         stq          s0,0x0(sp)
0xec13fb98  b55e0008         stq          s1,0x8(sp)
0xec13fb9c  b57e0010         stq          s2,0x10(sp)
0xec13fba0  b59e0018         stq          s3,0x18(sp)
0xec13fba4  b5be0020         stq          s4,0x20(sp)
0xec13fba8  b5de0028         stq          s5,0x28(sp)
0xec13fbac  b5fe0030         stq          fp,0x30(sp)
KDalpha> u
NTFS!BinarySearchIndex+0x20:
0xec13fbb0  b75e0038         stq          ra,0x38(sp)
0xec13fbb4  47f1040a         bis          zero,a1,s1
0xec13fbb8  b21e0040         stl          a0,0x40(sp)

In the above code, the function saves a number of registers using the instructions stq (store quadword) and stl (store longword, a dword). These instructions work like ldl (load longword) in reverse: the value in the register is written to the memory address given by the second operand. After the lda instruction adjusts sp, the s0 through s5 registers are written to consecutive locations on the stack, and later a0 is also written to the stack. Because the s registers still hold the caller's values when they are saved, you now know the following:

a0 = 0x40(sp)
a1 = s1 = 0x8(sp)
a2 = s0 = 0x0(sp)
a3 = s2 = 0x10(sp)
a4 = s3 = 0x18(sp) 

At the beginning of a function like this, sp is equal to the FramePtr value given in the stack dump for this function (f16c7550). You can use that value to dump out the arguments:

KDalpha> dd f16c7550+40 l1    Argument 1
0xF16C7590  80da1848
KDalpha> dd f16c7550+8 l1     Argument 2
0xF16C7558  e18d95c8
KDalpha> dd f16c7550 l1       Argument 3
0xF16C7550  e1ce9a40
KDalpha> dd f16c7550+10 l1    Argument 4
0xF16C7560  80e3f948
KDalpha> dd f16c7550+18 l1    Argument 5

To verify that you have found the correct values, check the function code to determine the variable types, and use that information to determine whether the values are plausible. This method will work for tracing the values of most arguments passed from function to function, although occasionally you might have to follow a variable through a couple of functions before you find it saved on the stack in an identifiable location.
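
The bookkeeping in this example can be sketched in Python as follows. The dictionaries are transcribed by hand from the two disassembly listings above (the caller's bis instructions and the callee's stq and stl instructions). The names CALLER_MOVES, CALLEE_SAVES, and argument_slots are hypothetical and are not part of any debugger. The sketch only computes the stack addresses to dump; it does not read memory.

```python
FRAME_PTR = 0xF16C7550  # FramePtr for BinarySearchIndex in the stack dump

# Caller side (FindFirstIndexEntry+0xe4): argument register <- source register.
# a0 is loaded from memory by ldl, so it has no source register here.
CALLER_MOVES = {"a1": "s1", "a2": "s0", "a3": "s2", "a4": "s3"}

# Callee side (start of BinarySearchIndex): register -> offset from sp.
CALLEE_SAVES = {"s0": 0x00, "s1": 0x08, "s2": 0x10, "s3": 0x18,
                "s4": 0x20, "s5": 0x28, "a0": 0x40}


def argument_slots(frame_ptr, moves, saves, count):
    """Map a0..a(count-1) to the stack address holding each value.

    An argument register saved directly by the callee is used as is;
    otherwise the register the caller copied it from is looked up.
    """
    slots = {}
    for index in range(count):
        arg = "a%d" % index
        saved_reg = arg if arg in saves else moves[arg]
        slots[arg] = frame_ptr + saves[saved_reg]
    return slots


if __name__ == "__main__":
    slots = argument_slots(FRAME_PTR, CALLER_MOVES, CALLEE_SAVES, 5)
    for number, arg in enumerate(sorted(slots), start=1):
        print("dd %08x l1    Argument %d (%s)" % (slots[arg], number, arg))
```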

Additional query words: csu/dsu debugref


Keywords          : NTSrvWkst 
Version           : WinNT:3.1,3.5,3.51,4.0
Platform          : winnt 
Issue type        : kbinfo 

Last Reviewed: January 26, 1999