Monday, December 26, 2011

Troubleshooting 0x1E KMODE_EXCEPTION_NOT_HANDLED

The Debugging Tools for Windows are required to analyze crash dump files. If you do not have the Debugging Tools for Windows installed or dump files are not being generated on system crash, see this post for installation/configuration instructions:
http://mikemstech.blogspot.com/2011/11/windows-crash-dump-analysis.html

0x1E KMODE_EXCEPTION_NOT_HANDLED (also reported as 0x1000001E KMODE_EXCEPTION_NOT_HANDLED_M) is a common bug check (BSOD) that occurs on Windows systems (Windows XP, Windows Server 2003, Windows Vista, Windows Server 2008, Windows 7, Windows Server 2008 R2, and Windows 8). Similar to 0x8E KERNEL_MODE_EXCEPTION_NOT_HANDLED, it indicates that an exception was raised in privileged mode (kernel mode) and no code was in place to handle it. Like 0x8E, it comes in two main variations: crashes that are caused by other drivers (through memory corruption, often with a sub-status of 0xC0000005 STATUS_ACCESS_VIOLATION), and crashes that are caused by the driver in which the exception is detected.
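As a rough illustration of the sub-status codes mentioned above, the following Python sketch maps the unhandled exception code (Arg1 of the bug check, or the ExceptionCode in the exception record) to its NTSTATUS name. The helper name `describe_exception` is made up for this example, and the lookup table is deliberately tiny; treat it as a starting point rather than a complete reference.

```python
# Minimal sketch: name the exception code reported with a 0x1E bug check.
# The table is illustrative and far from complete.
KNOWN_STATUS = {
    0x80000003: "STATUS_BREAKPOINT",
    0xC0000005: "STATUS_ACCESS_VIOLATION",
    0xC000001D: "STATUS_ILLEGAL_INSTRUCTION",
}

def describe_exception(code: int) -> str:
    # Arg1 is printed sign-extended to 64 bits (e.g. ffffffffc0000005),
    # so keep only the low 32 bits before the lookup.
    status = code & 0xFFFFFFFF
    return KNOWN_STATUS.get(status, f"unknown status 0x{status:08X}")

print(describe_exception(0xFFFFFFFFC0000005))
print(describe_exception(0x80000003))
```

The two calls use the exception codes that appear in the example dumps in this post.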

Troubleshooting these bug checks is fairly straightforward and starts with a !analyze -v. If there is no memory corruption and a driver is identified, alternate versions of the identified driver should be tested, along with upgraded/downgraded BIOS versions. The same approach should be taken with memory corruption issues, but in addition, hardware problems should be ruled out and Driver Verifier should be enabled to see if more informative dumps can be generated (specifically, dumps that implicate a driver that is reading or writing invalid memory).
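The triage described above can be sketched in a few lines of Python. This is only an illustration: `parse_analyze` and `next_steps` are hypothetical helpers, the input is assumed to be the saved text of a !analyze -v run, and using a c0000005 sub-status as a hint of possible memory corruption is a simplification, not a rule.

```python
import re

# Fields of interest in "!analyze -v" output; each appears as "NAME:  value".
FIELDS = ("BUGCHECK_STR", "MODULE_NAME", "IMAGE_NAME", "BUCKET_ID")

def parse_analyze(text: str) -> dict:
    """Collect the first value of each field in FIELDS from the debugger output."""
    found = {}
    for line in text.splitlines():
        match = re.match(r"^([A-Z_]+):\s+(\S.*?)\s*$", line)
        if match and match.group(1) in FIELDS and match.group(1) not in found:
            found[match.group(1)] = match.group(2)
    return found

def next_steps(info: dict) -> list:
    """Hypothetical triage helper that mirrors the advice in the paragraph above."""
    steps = [
        "test alternate versions of " + info.get("IMAGE_NAME", "the identified driver"),
        "test upgraded/downgraded BIOS versions",
    ]
    # Assumption: an access violation sub-status hints at possible memory corruption.
    if info.get("BUGCHECK_STR", "").lower().endswith("c0000005"):
        steps += [
            "rule out hardware problems",
            "enable Driver Verifier and collect a new dump",
        ]
    return steps

sample = "BUGCHECK_STR:  0x1E_c0000005\nMODULE_NAME: nt\nIMAGE_NAME:  ntoskrnU.exe\n"
for step in next_steps(parse_analyze(sample)):
    print(step)
```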

Below are a few debugging examples for this blue screen of death.

Example 1: Sloppy Programming/Development Practices

During the development process, programmers insert breakpoints into specific sections of code so that they can use a debugger to inspect the driver/application state at a specific point of execution. Before an application/driver is released, the breakpoints should be removed; otherwise they will generate an exception. In kernel mode, this crashes the system. In user mode, this simply crashes the application (unless a debugger is attached and the command is given to continue past the breakpoint). Allowing a breakpoint to ship in production code is a sloppy practice that results from a poorly controlled and audited development process.

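In this dump, the silabser.sys driver is implicated. The exception record shows a break instruction exception (0x80000003) at nt!DbgBreakPoint, reached through the Wdf01000 verifier routines after silabser called WdfRequestUnmarkCancelable, so the exception is clearly due to a breakpoint being hit in code that shipped with the final release.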

5: kd> !analyze -v
*******************************************************************************
*                                                                             *
*                        Bugcheck Analysis                                    *
*                                                                             *
*******************************************************************************

KMODE_EXCEPTION_NOT_HANDLED (1e)
This is a very common bugcheck.  Usually the exception address pinpoints
the driver/function that caused the problem.  Always note this address
as well as the link date of the driver/image that contains this address.
Arguments:
Arg1: 0000000000000000, The exception code that was not handled
Arg2: 0000000000000000, The address that the exception occurred at
Arg3: 0000000000000000, Parameter 0 of the exception
Arg4: 0000000000000000, Parameter 1 of the exception

Debugging Details:
------------------


EXCEPTION_CODE: (Win32) 0 (0) - The operation completed successfully.

FAULTING_IP: 
+3132323761623761
00000000`00000000 ??              ???

EXCEPTION_PARAMETER1:  0000000000000000

EXCEPTION_PARAMETER2:  0000000000000000

ERROR_CODE: (NTSTATUS) 0 - STATUS_WAIT_0

BUGCHECK_STR:  0x1E_0

CUSTOMER_CRASH_COUNT:  1

DEFAULT_BUCKET_ID:  VISTA_DRIVER_FAULT

PROCESS_NAME:  System

CURRENT_IRQL:  2

EXCEPTION_RECORD:  fffff880030b08a8 -- (.exr 0xfffff880030b08a8)
ExceptionAddress: fffff800030d4a70 (nt!DbgBreakPoint)
   ExceptionCode: 80000003 (Break instruction exception)
  ExceptionFlags: 00000000
NumberParameters: 1
   Parameter[0]: 0000000000000000

TRAP_FRAME:  fffff880030b0950 -- (.trap 0xfffff880030b0950)
NOTE: The trap frame does not contain all registers.
Some register values may be zeroed or incorrect.
rax=0000000000000000 rbx=0000000000000000 rcx=fffff88000f60f70
rdx=fffffa8007ad26b0 rsi=0000000000000000 rdi=0000000000000000
rip=fffff800030d4a71 rsp=fffff880030b0ae8 rbp=0000000000000000
 r8=fffffa8007ad26b0  r9=fffff88000f60f40 r10=7efefeff71647261
r11=fffff880030b0ad0 r12=0000000000000000 r13=0000000000000000
r14=0000000000000000 r15=0000000000000000
iopl=0         nv up ei pl nz na pe nc
nt!DbgBreakPoint+0x1:
fffff800`030d4a71 c3              ret
Resetting default scope

LAST_CONTROL_TRANSFER:  from fffff800030d45fe to fffff800030dcc10

STACK_TEXT:  
... : nt!KeBugCheck
... : nt!KiKernelCalloutExceptionHandler+0xe
... : nt!RtlpExecuteHandlerForException+0xd
... : nt!RtlDispatchException+0x415
... : nt!KiDispatchException+0x135
... : nt!KiExceptionDispatch+0xc2
... : nt!KiBreakpointTrap+0xf4
... : nt!DbgBreakPoint+0x1
... : Wdf01000!FxRequest::VerifierVerifyRequestIsCancelable+0x80
... : Wdf01000!FxIoQueue::RequestCancelable+0xe7
... : Wdf01000!imp_WdfRequestUnmarkCancelable+0xbc
... : silabser+0x7319
... : 0xfffffa80`0878b880
... : 0xfffffa80`0878ba10
... : 0x57f`f7874778
... : 0x57f`f7874778
... : 0x1
... : 0xfffffa80`07038d28
... : 0xc0000120
... : silabser+0x804e
... : 0xfffffa80`0878ba10


STACK_COMMAND:  kb

FOLLOWUP_IP: 
silabser+7319
fffff880`04dbf319 ??              ???

SYMBOL_STACK_INDEX:  b

SYMBOL_NAME:  silabser+7319

FOLLOWUP_NAME:  MachineOwner

MODULE_NAME: silabser

IMAGE_NAME:  silabser.sys

DEBUG_FLR_IMAGE_TIMESTAMP:  4e83514f

FAILURE_BUCKET_ID:  X64_0x1E_0_silabser+7319

BUCKET_ID:  X64_0x1E_0_silabser+7319

Followup: MachineOwner
--------- 
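The analysis header says to always note the link date of the driver that contains the faulting address. The DEBUG_FLR_IMAGE_TIMESTAMP value is that date, stored as a hexadecimal count of seconds since 1970-01-01 UTC. A small Python sketch (the function name `link_date` is just for illustration) converts it to a readable date:

```python
from datetime import datetime, timezone

def link_date(timestamp_hex: str) -> datetime:
    # DEBUG_FLR_IMAGE_TIMESTAMP is the image's link timestamp:
    # seconds since 1970-01-01 UTC, printed in hexadecimal.
    return datetime.fromtimestamp(int(timestamp_hex, 16), tz=timezone.utc)

# Timestamp reported for silabser.sys in the dump above.
print(link_date("4e83514f").date().isoformat())
```

For silabser.sys this gives a link date of 28 September 2011, which is useful when checking whether a newer or older build of the driver is available for testing.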
 

Example 2: Memory Corruption

This is a particularly interesting case of memory corruption because the state that was dumped was corrupted enough to make the symbols unrecognizable to the debugger. Even though we can tell that the problem was detected in the NT kernel, the debugger has difficulty analyzing the dump and complains (incorrectly) about missing symbols.


kd> !analyze -v
*******************************************************************************
*                                                                             *
*                        Bugcheck Analysis                                    *
*                                                                             *
*******************************************************************************

KMODE_EXCEPTION_NOT_HANDLED (1e)
This is a very common bugcheck.  Usually the exception address pinpoints
the driver/function that caused the problem.  Always note this address
as well as the link date of the driver/image that contains this address.
Arguments:
Arg1: ffffffffc0000005, The exception code that was not handled
Arg2: fffff80002e7c1d1, The address that the exception occurred at
Arg3: 0000000000000000, Parameter 0 of the exception
Arg4: ffffffffffffffff, Parameter 1 of the exception

Debugging Details:
------------------

***** Kernel symbols are WRONG. Please fix symbols to do analysis.

*************************************************************************
***                                                                   ***
***                                                                   ***
***    Your debugger is not using the correct symbols                 ***
***                                                                   ***
***    In order for this command to work properly, your symbol path   ***
***    must point to .pdb files that have full type information.      ***
***                                                                   ***
***    Certain .pdb files (such as the public OS symbols) do not      ***
***    contain the required information.  Contact the group that      ***
***    provided you with these symbols if you need this command to    ***
***    work.                                                          ***
***                                                                   ***
***    Type referenced: nt!_KPRCB                                     ***
***                                                                   ***
*************************************************************************
*************************************************************************
***                                                                   ***
***                                                                   ***
***    Your debugger is not using the correct symbols                 ***
***                                                                   ***
***    In order for this command to work properly, your symbol path   ***
***    must point to .pdb files that have full type information.      ***
***                                                                   ***
***    Certain .pdb files (such as the public OS symbols) do not      ***
***    contain the required information.  Contact the group that      ***
***    provided you with these symbols if you need this command to    ***
***    work.                                                          ***
***                                                                   ***
***    Type referenced: nt!_KPRCB                                     ***
***                                                                   ***
*************************************************************************
*************************************************************************
***                                                                   ***
***                                                                   ***
***    Your debugger is not using the correct symbols                 ***
***                                                                   ***
***    In order for this command to work properly, your symbol path   ***
***    must point to .pdb files that have full type information.      ***
***                                                                   ***
***    Certain .pdb files (such as the public OS symbols) do not      ***
***    contain the required information.  Contact the group that      ***
***    provided you with these symbols if you need this command to    ***
***    work.                                                          ***
***                                                                   ***
***    Type referenced: nt!_KPRCB                                     ***
***                                                                   ***
*************************************************************************

ADDITIONAL_DEBUG_TEXT:  
Use '!findthebuild' command to search for the target build information.
If the build information is available, run '!findthebuild -s ; .reload' 
to set symbol path and load symbols.

MODULE_NAME: nt

FAULTING_MODULE: fffff80002e0c000 nt

DEBUG_FLR_IMAGE_TIMESTAMP:  4a5bc600

EXCEPTION_CODE: (NTSTATUS) 0xc0000005 - 
     The instruction at 0x%08lx referenced memory at 0x%08lx. 
     The memory could not be %s.

FAULTING_IP: 
nt+701d1
fffff800`02e7c1d1 0fae55ac        ldmxcsr dword ptr [rbp-54h]

EXCEPTION_PARAMETER1:  0000000000000000

EXCEPTION_PARAMETER2:  ffffffffffffffff

READ_ADDRESS: unable to get nt!MmSpecialPoolStart
unable to get nt!MmSpecialPoolEnd
unable to get nt!MmPoolCodeStart
unable to get nt!MmPoolCodeEnd
 ffffffffffffffff 

ERROR_CODE: (NTSTATUS) 0xc0000005 - The instruction at 0x%08lx referenced 
                            memory at 0x%08lx. The memory could not be %s.

BUGCHECK_STR:  0x1E_c0000005

CUSTOMER_CRASH_COUNT:  1

DEFAULT_BUCKET_ID:  VISTA_DRIVER_FAULT

CURRENT_IRQL:  0

LAST_CONTROL_TRANSFER:  from fffff80002ebda17 to fffff80002e7df00

STACK_TEXT:  
... : nt+0x71f00
... : nt+0xb1a17
... : 0x1e
... : 0xffffffff`c0000005
... : nt+0x701d1


STACK_COMMAND:  kb

FOLLOWUP_IP: 
nt+701d1
fffff800`02e7c1d1 0fae55ac        ldmxcsr dword ptr [rbp-54h]

SYMBOL_STACK_INDEX:  4

SYMBOL_NAME:  nt+701d1

FOLLOWUP_NAME:  MachineOwner

IMAGE_NAME:  ntoskrnU.exe

BUCKET_ID:  WRONG_SYMBOLS

Followup: MachineOwner
--------- 
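Even without usable symbols, the raw numbers in this dump can be checked by hand. The sketch below (helper names are made up for this example) reproduces the nt+701d1 offset from the faulting address and the FAULTING_MODULE base, and decodes the two exception parameters. It assumes the usual layout for STATUS_ACCESS_VIOLATION, where parameter 0 is the access type and parameter 1 is the address that was referenced.

```python
def module_offset(address: int, module_base: int) -> int:
    # Offset of the faulting instruction inside the module image.
    return address - module_base

def describe_access_violation(param0: int, param1: int) -> str:
    # Parameter 0: type of access, parameter 1: address that was referenced.
    kind = {0: "read", 1: "write", 8: "execute"}.get(param0, "unknown access")
    return f"{kind} at 0x{param1:016x}"

# Arg2 (faulting address) and FAULTING_MODULE base from the dump above.
offset = module_offset(0xFFFFF80002E7C1D1, 0xFFFFF80002E0C000)
print(f"nt+{offset:x}")

# Arg3 and Arg4 from the dump above.
print(describe_access_violation(0x0, 0xFFFFFFFFFFFFFFFF))
```

The offset matches the nt+701d1 shown in FAULTING_IP, and the parameters describe a read of address ffffffffffffffff, which is not a valid pointer.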
 
See Also,
Windows Crash Dump Analysis
Troubleshooting Memory Errors
