Thursday, August 4, 2011

Debugging Minifilters: Finding the FLT_CALLBACK_DATA - Part I

When debugging minifilters I frequently have to figure out what's wrong in the system at a point when something bughecks or fails in some OS component other than my minifilter. For example if I find something fails with a certain error code in NTFS I might put a breakpoint on where that error code is returned. In such a case I might be able to find the IRP representing the operation but finding the FLT_CALLBACK_DATA structure associated with that IRP is not that easy. I'll show a couple of techniques that I find useful.

First let me explain why finding the FLT_CALLBACK_DATA is in general more complicated than finding the IRP. As you probably expect, this is due to FltMgr's design. The IO model in NT is a call-through model. This means that each layer in an IO stack must call the next component directly. This design choice has a couple of side effects:

  • Each device in the IO stack knows what the device right below it is. This means that a device cannot be inserted at an arbitrary point in an IO stack but must always be attached on top. It also means that a device cannot be removed from an IO stack.
  • Considering that fact that each device must call the next device, if you have enough devices, that stack space used by these calls (local variables, SEH frames, function parameters and so on) will add up and it's possible to run out of stack.
  • Whoever allocates an IRP must know what the size of the IO stack is so that there are enough stack locations or the system will bugcheck when using that IRP. This makes preallocating IRPs for either pooling them or for forward progress a complicated proposition.
Now, these side effects are really unpleasant for file system filters for the following reasons:
  • A file system filter might need to be at a certain position in the IO stack and so the requirement that a device can only be attached on top of a stack can be very limiting. An example of this is an encryption filter that is not attached at boot time and that will end up on the top of the stack and will encrypt all data that the other filters see, including anti-virus filters, which prevents the anti-virus product from detecting viruses…
  • A file system filter might need to unload and it might not be able to do that if it's not at the top of the stack. Think about file system filters that need to run on enterprise servers that are never rebooted but that might need to have the filters updated every now and then.
  • A file system filter might reenter multiple times while processing one operation which means it needs to be very conservative with stack usage or risk running out of stack.
Anyway, in order to deal with all these problems FltMgr uses a callback model. This means that FltMgr will call each minifilter, then the minifilter processes the operation and when it's done processing it returns to FltMgr, unwinding the stack. This leads to better stack usage overall but it also means one cannot look on the stack and get to the stack frame corresponding to the minifilter and analyze it. Here is an example using a passthrough minifilter that I've rigged to break on some IRP_MJ_READ operations. The first stack is from when FltMgr is calling my minifilter's callback and the second one is on the same thread, when the request finally reaches NTFS after my minifilter is done with it. It’s the stack for the same thread at two different moments in time. While looking at this piece of debugger output, please note that because the minifilter returns control to FltMgr the ammount of used stack for file system filters (which is pretty much the stack used by FltMgr) is rather small by the time the IRP finally reaches NTFS. Another thing to note is that just by looking at the first two stack traces it's impossible to tell how many filters are attached on the volume because all the data on the stack frame is lost (the number of filters attached to this volume can be seen in the output ot the !fltkd.volume command).
1: kd> kf
  Memory  ChildEBP RetAddr  
          b3989ad4 92647627 PassThrough!PtPreReadPassThrough+0x90 [c:\temp6\passthrough\passthrough.c @ 989]
       78 b3989b4c 92622aeb fltmgr!FltvPreOperation+0x81
       6c b3989bb8 926259f0 fltmgr!FltpPerformPreCallbacks+0x34d
       18 b3989bd0 92625f01 fltmgr!FltpPassThroughInternal+0x40
       24 b3989bf4 926263ba fltmgr!FltpPassThrough+0x203
       30 b3989c24 82b816c3 fltmgr!FltpDispatch+0xb4
       24 b3989c48 8288e473 nt!IovCallDriver+0x258
       14 b3989c5c 82a8ff6e nt!IofCallDriver+0x1b
       20 b3989c7c 82aa1f32 nt!IopSynchronousServiceTail+0x1f8
       8c b3989d08 8289544a nt!NtReadFile+0x644
 …

1: kd> kf
  Memory  ChildEBP RetAddr  
          b3989b20 9267dbae Ntfs!NtfsCommonRead
       70 b3989b90 82b816c3 Ntfs!NtfsFsdRead+0x279
       24 b3989bb4 8288e473 nt!IovCallDriver+0x258
       14 b3989bc8 9262620c nt!IofCallDriver+0x1b
       24 b3989bec 926263cb fltmgr!FltpLegacyProcessingAfterPreCallbacksCompleted+0x2aa
       38 b3989c24 82b816c3 fltmgr!FltpDispatch+0xc5
       24 b3989c48 8288e473 nt!IovCallDriver+0x258
       14 b3989c5c 82a8ff6e nt!IofCallDriver+0x1b
       20 b3989c7c 82aa1f32 nt!IopSynchronousServiceTail+0x1f8
       8c b3989d08 8289544a nt!NtReadFile+0x644
 ...

1: kd> !fltkd.volume 0x94de4ae0 

FLT_VOLUME: 94de4ae0 "\Device\HarddiskVolume2"
 ...
   NameCacheCtrl            : (94de4e60)
   InstanceList             : (94de4b30)
      FLT_INSTANCE: b4c22b40 "PassThrough Instance" "370030"
      FLT_INSTANCE: a3290a50 "luafv" "135000"
      FLT_INSTANCE: 98a8adc8 "FileInfo" "45000"


Ok, so now with the stack where the request reached NTFS we need to try to find the IRP and the FLT_CALLBACK_DATA structure. There is a particularly useful structure that we can use here, the IRP_CALL_CTRL (which I've talked about in a couple of other posts). This structure is undocumented and not available to minifilters, but it is available in the public symbols from Microsoft. This is the first parameter to the call to fltmgr!FltpLegacyProcessingAfterPreCallbacksCompleted, so we can get it from the stack. So here we go (I've highlighted the IRP, FLT_VOLUME, IRP_CALL_CTRL and FLT_CALLBACK_DATA):
1: kd> kb
ChildEBP RetAddr  Args to Child              
b3989b20 9267dbae 9d1acee0 b52c8e28 21f2a263 Ntfs!NtfsCommonRead
b3989b90 82b816c3 97914020 b52c8e28 00000000 Ntfs!NtfsFsdRead+0x279
b3989bb4 8288e473 00000000 b52c8e28 97914020 nt!IovCallDriver+0x258
b3989bc8 9262620c 979133a8 b52c8e28 00000000 nt!IofCallDriver+0x1b
b3989bec 926263cb b3989c0c 979133a8 00000000 fltmgr!FltpLegacyProcessingAfterPreCallbacksCompleted+0x2aa
b3989c24 82b816c3 979133a8 b52c8e28 985c8268 fltmgr!FltpDispatch+0xc5
b3989c48 8288e473 00000000 b52c8e28 979133a8 nt!IovCallDriver+0x258
b3989c5c 82a8ff6e b52c8e28 b52c8fdc 985c8268 nt!IofCallDriver+0x1b
b3989c7c 82aa1f32 979133a8 985c8268 00000001 nt!IopSynchronousServiceTail+0x1f8
b3989d08 8289544a 979133a8 b52c8e28 00000000 nt!NtReadFile+0x644
...
1: kd> !irp b52c8e28
Irp is active with 10 stacks 9 is current (= 0xb52c8fb8)
 No Mdl: No System Buffer: Thread 981d68e0:  Irp stack trace.  
     cmd  flg cl Device   File     Completion-Context
 [  0, 0]   0  0 00000000 00000000 00000000-00000000    

   Args: 00000000 00000000 00000000 00000000
 [  0, 0]   0  0 00000000 00000000 00000000-00000000    

   Args: 00000000 00000000 00000000 00000000
 [  0, 0]   0  0 00000000 00000000 00000000-00000000    

   Args: 00000000 00000000 00000000 00000000
 [  0, 0]   0  0 00000000 00000000 00000000-00000000    

   Args: 00000000 00000000 00000000 00000000
 [  0, 0]   0  0 00000000 00000000 00000000-00000000    

   Args: 00000000 00000000 00000000 00000000
 [  0, 0]   0  0 00000000 00000000 00000000-00000000    

   Args: 00000000 00000000 00000000 00000000
 [  0, 0]   0  0 00000000 00000000 00000000-00000000    

   Args: 00000000 00000000 00000000 00000000
 [  0, 0]   0 10 00000000 00000000 00000000-00000000    

   Args: 00000000 00000000 00000000 00000000
>[  3, 0]   0 e0 97914020 985c8268 92625aae-b8ce4de0 Success Error Cancel 
        \FileSystem\Ntfs fltmgr!FltpPassThroughCompletion
   Args: 00001000 00000000 00000000 00000000
 [  3, 0]   0  1 979133a8 985c8268 00000000-00000000    pending
        \FileSystem\FltMgr
   Args: 00001000 00000000 00000000 00000000
1: kd> dt b3989c0c fltmgr!_IRP_CALL_CTRL
   +0x000 Volume           : 0x94de4ae0 _FLT_VOLUME
   +0x004 Irp              : 0xb52c8e28 _IRP
   +0x008 IrpCtrl          : 0xb8ce4de0 _IRP_CTRL
   +0x00c StartingCallbackNode : 0xffffffff _CALLBACK_NODE
   +0x010 OperationStatusCallbackListHead : _SINGLE_LIST_ENTRY
   +0x014 Flags            : 0x204 (No matching name)
1: kd> !fltkd.cbd 0xb8ce4de0 

IRP_CTRL: b8ce4de0  READ (3) [00000001] Irp
Flags                    : [10000004] DontCopyParms FixedAlloc
Irp                      : b52c8e28 
DeviceObject             : 979133a8 "\Device\HarddiskVolume2"
FileObject               : 985c8268 
CompletionNodeStack      : b8ce4e98   Size=5  Next=2
SyncEvent                : (b8ce4df0)
InitiatingInstance       : 00000000 
Icc                      : b3989c0c 
PendingCallbackNode      : ffffffff 
PendingCallbackContext   : 00000000 
PendingStatus            : 0x00000000 
CallbackData             : (b8ce4e40)
 Flags                    : [00000001] Irp
 Thread                   : 981d68e0 
 Iopb                     : b8ce4e6c 
 RequestorMode            : [01] UserMode
 IoStatus.Status          : 0xffffffff 
 IoStatus.Information     : 00000000 
 TagData                  : 00000000 
 FilterContext[0]         : 00000000 
 FilterContext[1]         : 00000000 
 FilterContext[2]         : 00000000 
 FilterContext[3]         : 00000000 

   Cmd     IrpFl   OpFl  CmpFl  Instance FileObjt Completion-Context  Node Adr
--------- -------- ----- -----  -------- -------- ------------------  --------
 [0,0]    00000000  00   0000   00000000 00000000 00000000-00000000   b8ce4fb8
     Args: 00000000 00000000 00000000 00000000 00000000 0000000000000000
 [0,0]    00000000  00   0000   00000000 00000000 00000000-00000000   b8ce4f70
     Args: 00000000 00000000 00000000 00000000 00000000 0000000000000000
 [0,0]    00000000  00   0000   00000000 00000000 00000000-00000000   b8ce4f28
     Args: 00000000 00000000 00000000 00000000 00000000 0000000000000000
 [3,0]    40060900  00   0002   98a8adc8 985c8268 92647760-00000000   b8ce4ee0
            ("FileInfo","FileInfo")  fileinfo!FIPostReadWriteCallback  (verified)
     Args: 00001000 00000000 00000000 00000000 0183f930 0000000000000000
 [3,0]    40060900  00   0002   b4c22b40 985c8268 92647760-00000000   b8ce4e98
            ("PassThrough","PassThrough Instance")  PassThrough!PtPostOperationPassThrough  (verified)
     Args: 00001000 00000000 00000000 00000000 0183f930 0000000000000000
Working IOPB:
>[3,0]    40060900  00          98a8adc8 985c8268                     b8ce4e6c
            ("FileInfo","FileInfo")  
     Args: 00001000 00000000 00000000 00000000 0183f930 0000000000000000
Another thing to note is that the IRP_CALL_CTRL structure is allocated on the stack. So if for whatever reason you can't get it from the stack (amd64 for example where parameters are passed in registers), you can simply search the stack for it. The first member of the structure is a pointer to the FLT_VOLUME structure for the volume and the second member is a pointer to the IRP so you can just search for those two (i wrote it in a more generic fashion to automatically search the stack of the current thread; it also assumes that the default expression evaluator is masm). Please note that the proper size must be used (d for x86, q for amd64) because for whatever reason the search command doesn't implement a 'p' type which would have been useful (that would be the pointer size on the machine like used in the dp command):
1: kd>  s -d @$csp L(@@(((_KTHREAD*)@$thread)->InitialStack) - @$csp) 0x94de4ae0 0xb52c8e28 
b3989c0c  94de4ae0 b52c8e28 b8ce4de0 ffffffff  .J..(.,..M......
And here is what this looks like on an amd64 machine. Please note that the IRP is in a different location on the stack. Also, I'm not showing how I got the FLT_VOLUME in this case (eyeballing the output of !fltkd.volume). The same color scheme but of course the values are different:
0: kd> kb
RetAddr           : Args to Child                                                           : Call Site
fffff880`0121ba68 : fffffa80`1a9c4010 fffff880`046608f8 fffff880`046608e0 fffffa80`1abb6b00 : Ntfs!NtfsCommonRead
fffff880`010f7bcf : fffffa80`1b2c4bd8 fffffa80`1b2c4880 fffffa80`1abb6b90 00000000`00000001 : Ntfs!NtfsFsdRead+0x1b8
fffff880`010f66df : fffffa80`19a94350 00000000`00000001 fffffa80`19a94300 fffffa80`1b2c4880 : fltmgr!FltpLegacyProcessingAfterPreCallbacksCompleted+0x24f
fffff800`02ba171b : 00000000`00000000 fffffa80`1aff75b0 00000000`00000001 fffffa80`1b2c4880 : fltmgr!FltpDispatch+0xcf
fffff800`02b830b3 : fffffa80`1aff75b0 fffffa80`1aff75b0 fffffa80`1aff75b0 fffff800`02a0be80 : nt!IopSynchronousServiceTail+0xfb
fffff800`028988d3 : fffffa80`1b2d2060 00000000`00000000 00000000`00000000 00000000`00000000 : nt!NtReadFile+0x631
0: kd> s -q @$csp L(@@(((_KTHREAD*)@$thread)->InitialStack) - @$csp) fffffa8019ab0450 fffffa801b2c4880
fffff880`04660a10  fffffa80`19ab0450 fffffa80`1b2c4880
0: kd> dt fffff880`04660a10 fltmgr!_IRP_CALL_CTRL
   +0x000 Volume           : 0xfffffa80`19ab0450 _FLT_VOLUME
   +0x008 Irp              : 0xfffffa80`1b2c4880 _IRP
   +0x010 IrpCtrl          : 0xfffffa80`1abb6b90 _IRP_CTRL
   +0x018 StartingCallbackNode : 0xffffffff`ffffffff _CALLBACK_NODE
   +0x020 OperationStatusCallbackListHead : _SINGLE_LIST_ENTRY
   +0x028 Flags            : 0x204 (No matching name)
0: kd> !fltkd.cbd 0xfffffa80`1abb6b90

IRP_CTRL: fffffa801abb6b90  READ (3) [00000001] Irp
Flags                    : [10000004] DontCopyParms FixedAlloc
Irp                      : fffffa801b2c4880 
DeviceObject             : fffffa8019a94350 "\Device\HarddiskVolume1"
FileObject               : fffffa801aff75b0 
CompletionNodeStack      : fffffa801abb6ce0   Size=5  Next=2
SyncEvent                : (fffffa801abb6ba8)
InitiatingInstance       : 0000000000000000 
Icc                      : fffff88004660a10 
PendingCallbackNode      : ffffffffffffffff 
PendingCallbackContext   : 0000000000000000 
PendingStatus            : 0x00000000 
CallbackData             : (fffffa801abb6c40)
 Flags                    : [00000001] Irp
 Thread                   : fffffa801aff3660 
 Iopb                     : fffffa801abb6c98 
 RequestorMode            : [01] UserMode
 IoStatus.Status          : 0x00000000 
 IoStatus.Information     : 0000000000000000 
 TagData                  : 0000000000000000 
 FilterContext[0]         : 0000000000000000 
 FilterContext[1]         : 0000000000000000 
 FilterContext[2]         : 0000000000000000 
 FilterContext[3]         : 0000000000000000 

   Cmd     IrpFl   OpFl  CmpFl  Instance FileObjt Completion-Context  Node Adr
--------- -------- ----- -----  -------- -------- ------------------  --------
 [0,0]    00000000  00   0000   0000000000000000 0000000000000000 0000000000000000-0000000000000000   fffffa801abb6ee0
     Args: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000
 [0,0]    00000000  00   0000   0000000000000000 0000000000000000 0000000000000000-0000000000000000   fffffa801abb6e60
     Args: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000
 [0,0]    00000000  00   0000   0000000000000000 0000000000000000 0000000000000000-0000000000000000   fffffa801abb6de0
     Args: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000
 [3,0]    00060900  00   0002   fffffa8019b40bb0 fffffa801aff75b0 fffff880011422f4-0000000000000000   fffffa801abb6d60
            ("FileInfo","FileInfo")  fileinfo!FIPostReadWriteCallback 
     Args: 0000000000000004 0000000000000000 0000000000000000 0000000000268fd4 0000000000000000 0000000000000000
 [3,0]    00060900  00   0002   fffffa801b365010 fffffa801aff75b0 fffff88003b4b1a0-0000000000000000   fffffa801abb6ce0
            ("PassThrough","PassThrough Instance")  PassThrough!PtPostOperationPassThrough 
     Args: 0000000000000004 0000000000000000 0000000000000000 0000000000268fd4 0000000000000000 0000000000000000
Working IOPB:
>[3,0]    00060900  00          fffffa8019b40bb0 fffffa801aff75b0                     fffffa801abb6c98
            ("FileInfo","FileInfo")  
     Args: 0000000000000004 0000000000000000 0000000000000000 0000000000268fd4 0000000000000000 0000000000000000
This is getting too long so I'll split the post here. In the next post I'll show how to do this (get the FLT_CALLBACK_DATA) when the IRP was pended and therefore there might not be a call to fltmgr!FltpLegacyProcessingAfterPreCallbacksCompleted at all or even worse the thread can be gone altogether so we can't search the stack.