Thursday, August 25, 2011

Using EX_RUNDOWN_REF

This time I'd like to talk about a pretty cool OS primitive, the RundownReference. For some reason there doesn't seem to be an MSDN page on it so as usual with undocumented stuff, use at your own risk. Still, it's available in the WDK.H file and it's simple enough that the function declarations are pretty much all one needs to figure out how it works.

I imagine everyone reading this blog is familiar with reference counting. However, there is one particular access pattern that I run into a lot where reference counting is almost right but not quite. When writing filters I often find myself in a position where I get an object and I have some structure (a context) associated with it. The context contains some member variables or pointers to structures that are not necessarily valid for the lifetime of the context. Lets say that at some point I want to tear those down while the rest of the context is still around. I guess an example would explain thing a lot better:

//
//  Stream context data structure
//

typedef struct _MY_STREAM_CONTEXT {

    ...

    PVOID MyBuffer;
    ...


} MY_STREAM_CONTEXT, *PMY_STREAM_CONTEXT;

The scenario I have in mind is one where I have a buffer that I store in my StreamContext. However, when some event happens I need to change the buffer (possibly reallocate it). The problem I have is that when I get the event I can't tell who is using the buffer. One way to deal with the situation would be to add a lock to the structure and acquire the lock (could even be a shared lock) and use the buffer and then release the lock, but that places some restrictions on what I can do with the buffer (IO with a lock held should be avoided where possible). So if I can't hold a lock while using the buffer, the only alternative is to count how many threads are using it. So now I need to add a ref count. However, if it's just a regular ULONG then I have no mechanism to specify that I'm waiting for people to stop using my buffer and that they shouldn't use it anymore. So now I'd need to add a flag that I'd set when I'm waiting to free the buffer and then whoever gets a reference to the buffer would need to check the flag first and if the flag is set then they'd know that I'm waiting for it to tear down. Of course, the buffer and the refcount would have to be kept in sync so setting the flag and incrementing the refcount will need to be protected by a lock. Moreover, I'd like to wait for the threads that already got a reference to the buffer to release it but I don't want to busy-wait so I'll need an event or some other sort of signaling mechanism.. So as you can see, things have already gotten out of hand and I'd need to build a lot of additional infrastructure for such a simple thing… Enter the RundownReference!

The RundownReference works like a regular reference count, except that at some point someone can say "ok, no more references, let's wait for all the references to be released". After that point all new attempts to get a reference fail and the caller can block until all the references are gone…

This is the structure definition:

typedef struct _EX_RUNDOWN_REF {

#define EX_RUNDOWN_ACTIVE      0x1
#define EX_RUNDOWN_COUNT_SHIFT 0x1
#define EX_RUNDOWN_COUNT_INC   (1<<EX_RUNDOWN_COUNT_SHIFT)

    union {
        __volatile ULONG_PTR Count;
        __volatile PVOID Ptr;
    };
} EX_RUNDOWN_REF, *PEX_RUNDOWN_REF;

You can see that it's a very small structure, basically the size of a pointer. The APIs one can use with it are:

  • ExInitializeRundownProtection() - well, this initializes it...
  • ExAcquireRundownProtection() - this simply takes a reference. It is a BOOLEAN and it can fail if the RundownRef doesn't allow more references to be taken.
  • ExReleaseRundownProtection() - this releases a reference.
  • ExWaitForRundownProtectionRelease() - this sets the RundownRef into a "draining" mode, where no more references can be taken (ExAcquireRundownProtection() returns FALSE) and it blocks until the last reference is returned.
  • ExRundownCompleted() - this should be called after ExWaitForRundownProtectionRelease() returns. Calls to ExAcquireRundownProtection() will still fail, but it performs some cleanup on the structure.
  • ExReInitializeRundownProtection() - finally this will reinitialize the structure when and if it should be reused. Though looking at it in the debugger it's not really all that different from ExInitializeRundownProtection()… still, there it is.

So now all I need to do in my StreamContext is to add a RundownRef and I'm set:

typedef struct _MY_STREAM_CONTEXT {

    ...

    EX_RUNDOWN_REF MyBufferRundownRef;
    PVOID MyBuffer;
    ...


} MY_STREAM_CONTEXT, *PMY_STREAM_CONTEXT;

This is a pretty versatile primitive in that it can be allocated from regular pool (if you don't need to access it at high IRQL). I also think the routines should work at DISPATCH_LEVEL but I've never actually used them in that way (I see some calls for KeGetCurrentIrql() but I've not looked at everything going on so maybe it'll just check for DISPATCH_LEVEL and fail in a very elaborate way). All the functions are using the FASTCALL calling convention (no stack frame) which makes then pretty quick.

For completeness I'll post the declarations here but you should check WDM.H and you'll get the latest and greatest from there..

NTKERNELAPI
VOID
FASTCALL
ExInitializeRundownProtection (
    __out PEX_RUNDOWN_REF RunRef
    );

NTKERNELAPI
VOID
FASTCALL
ExReInitializeRundownProtection (
    __inout PEX_RUNDOWN_REF RunRef
    );

__checkReturn
__drv_valueIs(==0;==1)
NTKERNELAPI
BOOLEAN
FASTCALL
ExAcquireRundownProtection (
    __inout PEX_RUNDOWN_REF RunRef
    );

__checkReturn
__drv_valueIs(==0;==1)
NTKERNELAPI
BOOLEAN
FASTCALL
ExAcquireRundownProtectionEx (
    __inout PEX_RUNDOWN_REF RunRef,
    __in ULONG Count
    );

NTKERNELAPI
VOID
FASTCALL
ExReleaseRundownProtection (
    __inout PEX_RUNDOWN_REF RunRef
    );

NTKERNELAPI
VOID
FASTCALL
ExReleaseRundownProtectionEx (
    __inout PEX_RUNDOWN_REF RunRef,
    __in ULONG Count
    );

NTKERNELAPI
VOID
FASTCALL
ExRundownCompleted (
    __out PEX_RUNDOWN_REF RunRef
    );

NTKERNELAPI
VOID
FASTCALL
ExWaitForRundownProtectionRelease (
    __inout PEX_RUNDOWN_REF RunRef
    );

Thursday, August 18, 2011

Handling Handles in Minifilters

For the purpose of this article I'm going to use the very basic definition that a handle is simply an index in a table of objects. This definition is accurate but not particularly useful. It doesn't explain why handles are necessary or anything else about them for that matter. However, it focused on what's important for this post. In addition, there are a couple of other things about handles that are important for this post:

  • Kernel components use handles only when they need to talk to other kernel components using a well defined interface. This means that if the registry (for example) wants to read a file from disk, it shouldn't need to care about the internal implementation of a FILE_OBJECT and its size. Likewise, any kernel component using tokens shouldn't need to care about token structures and their sizes and so on. For one, just think about what the included header list would look like if everything had to have definitions for all the structures of the other kernel components it might interface with.
  • Some handle tables are global (the kernel handle table) while some are not (process handle tables). Since the handle is simply an index (a number) it doesn't have any information about which table it belongs to; that is except for kernel handles which are marked in a special way. So just by looking at a user mode handle it's impossible to tell which table it belongs to and so a user handle can only be resolved to the right object in the right process context.
  • Even though they are numbers, handles are actually defined as pointers.

So now let's see what the implications of all these things are for file system writers. File systems (and file system filters) need handles to various other OS components (like tokens in the above example) but one thing they don't need handles for are the files they manage. The normal approach is that the NT layer converts the handle it receives into the actual object it refers to before sending the request to the appropriate component for processing. In fact, in some cases the NT layer doesn't even know what the appropriate component is until it resolves the handle to the actual object. So this means that a file system or a filter should never see a request from the IO manager that has a handle to a FILE_OBJECT, right ? Unfortunately this is not the case. There are some cases where a file system might indeed receive a handle to a FILE_OBJECT that it manages. Personally I'm not convinced it's necessary and I think it's pretty bad design but that's neither here nor there. A quick peek in the WDK (in ntifs.h) shows a couple of structures that are used by file systems and filters:

typedef struct _FILE_LINK_INFORMATION {
    BOOLEAN ReplaceIfExists;
    HANDLE RootDirectory;
    ULONG FileNameLength;
    WCHAR FileName[1];
} FILE_LINK_INFORMATION, *PFILE_LINK_INFORMATION;

typedef struct _FILE_MOVE_CLUSTER_INFORMATION {
    ULONG ClusterCount;
    HANDLE RootDirectory;
    ULONG FileNameLength;
    WCHAR FileName[1];
} FILE_MOVE_CLUSTER_INFORMATION, *PFILE_MOVE_CLUSTER_INFORMATION;

typedef struct _FILE_RENAME_INFORMATION {
    BOOLEAN ReplaceIfExists;
    HANDLE RootDirectory;
    ULONG FileNameLength;
    WCHAR FileName[1];
} FILE_RENAME_INFORMATION, *PFILE_RENAME_INFORMATION;

typedef struct _FILE_TRACKING_INFORMATION {
    HANDLE DestinationFile;
    ULONG ObjectInformationLength;
    CHAR ObjectInformation[1];
} FILE_TRACKING_INFORMATION, *PFILE_TRACKING_INFORMATION;

#if (_WIN32_WINNT >= 0x0400)
//
// Structure for FSCTL_MOVE_FILE
//

typedef struct {

    HANDLE FileHandle;
    LARGE_INTEGER StartingVcn;
    LARGE_INTEGER StartingLcn;
    ULONG ClusterCount;

} MOVE_FILE_DATA, *PMOVE_FILE_DATA;

typedef struct {

    HANDLE FileHandle;
    LARGE_INTEGER SourceFileRecord;
    LARGE_INTEGER TargetFileRecord;

} MOVE_FILE_RECORD_DATA, *PMOVE_FILE_RECORD_DATA;


#if defined(_WIN64)
//
//  32/64 Bit thunking support structure
//

typedef struct _MOVE_FILE_DATA32 {

    UINT32 FileHandle;
    LARGE_INTEGER StartingVcn;
    LARGE_INTEGER StartingLcn;
    ULONG ClusterCount;

} MOVE_FILE_DATA32, *PMOVE_FILE_DATA32;
#endif
#endif /* _WIN32_WINNT >= 0x0400 */


//
//  Structure for FSCTL_MARK_HANDLE
//

typedef struct {

    ULONG UsnSourceInfo;
    HANDLE VolumeHandle;
    ULONG HandleInfo;

} MARK_HANDLE_INFO, *PMARK_HANDLE_INFO;

#if defined(_WIN64)
//
//  32/64 Bit thunking support structure
//

typedef struct {

    ULONG UsnSourceInfo;
    UINT32 VolumeHandle;
    ULONG HandleInfo;

} MARK_HANDLE_INFO32, *PMARK_HANDLE_INFO32;
#endif

I haven't yet been able to figure out what the FILE_TRACKING_INFORMATION structure is used for so I'm going to ignore it for now.

So why and when should a minifilter care about these operations ? Well, it should care about them whenever the handle resolution in the file system might lead to incorrect results. For example, if a minifilter pends some operations and then reissues them in the context of a different process then the file system (or any filter below) might do the resolution in the wrong process context and either fail or it would get the wrong object. The same goes for minifilters that might replace a FILE_OBJECT with a different FILE_OBJECT in some circumstances. In this case the file system (or any filter below) might get a different FILE_OBJECT than the one the caller intended.

I've discussed how minifilters process FILE_RENAME_INFORMATION in a previous post. As far as I know, everything in that post applies to FILE_MOVE_CLUSTER_INFORMATION and FILE_LINK_INFORMATION as well. Since the handle in the structure is created through IopOpenLinkOrRenameTarget(), it is always a kernel handle. This means that a minifilter doesn't need to worry about the process context.

There is yet another problem that minifilters must deal with, as hinted by the 32bit/64bit structures. The problem is that since HANDLE is a pointer, it has different sizes between 32bit and 64bit OSes and binaries. In most cases this doesn’t really cause any issues but on a 64bit machine someone might run a 32bit process, which is naturally using 32bit handles. The OS manager normally handles converting the 32bit handles (WOW64), but there isn't anything it can do about handles that are passed as data structures in general since it doesn't know the internal structure of that information. It could know it for OS defined structures like MOVE_FILE_DATA and MARK_HANDLE_INFO but it doesn't. So a minifilter that needs to handle those must take the size of the handle into account. Fortunately the FastFat sample shows how to deal with this. The main idea is to see whether the data in the IRP originated in a 32bit or a 64bit user mode process and then use the appropriate size (please note that minifilters have their very own FltIs32bitProcess() function):

#if defined(_WIN64)
    if (IoIs32bitProcess( Irp )) {

        if (InputBuffer == NULL || InputBufferLength < sizeof(MOVE_FILE_DATA32)) {

            FatCompleteRequest( IrpContext, Irp, STATUS_BUFFER_TOO_SMALL );
            return STATUS_BUFFER_TOO_SMALL;
        }

        MoveFileData32 = (PMOVE_FILE_DATA32) InputBuffer;

        LocalMoveFileData.FileHandle = (HANDLE) LongToHandle( MoveFileData32->FileHandle );
        LocalMoveFileData.StartingVcn = MoveFileData32->StartingVcn;
        LocalMoveFileData.StartingLcn = MoveFileData32->StartingLcn;
        LocalMoveFileData.ClusterCount = MoveFileData32->ClusterCount;

        InputBuffer = &LocalMoveFileData;

    } else {
#endif
        if (InputBuffer == NULL || InputBufferLength < sizeof(MOVE_FILE_DATA)) {

            FatCompleteRequest( IrpContext, Irp, STATUS_BUFFER_TOO_SMALL );
            return STATUS_BUFFER_TOO_SMALL;
        }
#if defined(_WIN64)
    }
#endif

This code fragment shows how to use the data, which is fine for file systems (or the consumer of the data), but things are a lot trickier for filters. A filter that issues the request directly instead of calling the Zw function (ZwSetInformationFile()) to set the previous mode properly might end up using a kernel handle (which is always 64bit) while IoIs32bitProcess() might return TRUE and so the file system (or underlying filters) would tread the handle as 32bit and the structure as being the 32bit version of the structure. Please note that issuing the request directly is that same as calling FltSetInformationFile() in this case since FltSetInformationFile() doesn't change the previous mode.

In conclusion, what a minifilter should do to properly handle these operations is:

  • If a filter changes the process context it must allocate a new handle in the table of the process where the request will be sent down or replace the handle with a kernel handle.
  • If the request comes from a 32bit process and the filter needs to change it, it should use the 32bit sized version of the structure.
  • If a filter needs to issue its own request then it must take into account the process context and the PreviousMode and use the appropriate sized structure.

Thursday, August 11, 2011

Debugging Minifilters: Finding the FLT_CALLBACK_DATA - Part II

In the post last week I've shown how to get to the FLT_CALLBACK_DATA when the original thread where the minifilter's callback was called is still available. This week I want to talk about pending. There are multiple scenarios to discuss, depending on who pends the IRP (the minifilter or a legacy filter or the file system) and where processing is with regards to the IRP (FltMgr still processing it or after it's been sent down).

When a minifilter is pending a request and processes it in a different thread, the original thread might not be available anymore so we can't rely on that to find the call to fltmgr!FltpLegacyProcessingAfterPreCallbacksCompleted. A real-world case where this might happen is when after installing such a minifilter the system freezes. By looking at the various threads one can see that some thread is blocked waiting for a request to complete but it's hard to tell if that IRP is stuck in the queue of some minifilter or which minifilter that might be. For this example I'll use an unmodified WDK sample, cancelSafe, which shows how a minifilter should implement pending of requests. This is what the threads involved in the process look like (using the same color scheme from the previous post for the IRP, FLT_VOLUME, IRP_CALL_CTRL and FLT_CALLBACK_DATA):
0: kd> !thread 981d68e0  
THREAD 981d68e0  Cid 0b00.0b04  Teb: 7ffdf000 Win32Thread: fe78aa90 WAIT: (Executive) UserMode Non-Alertable
    921c8f14  NotificationEvent
IRP List:
    bc57ce28: (0006,01d8) Flags: 40060900  Mdl: 00000000
    b83dae28: (0006,01d8) Flags: 40060000  Mdl: 00000000
Not impersonating
DeviceMap                 94529c98
Owning Process            b62124b8       Image:         Far.exe
Attached Process          N/A            Image:         N/A
Wait Start TickCount      11251          Ticks: 763 (0:00:00:11.921)
Context Switch Count      59079             
UserTime                  00:00:00.140
KernelTime                00:00:04.984
Win32 Start Address 0x004ed48a
Stack Init b3989fd0 Current b3989b58 Base b398a000 Limit b3987000 Call 0
Priority 9 BasePriority 8 UnusualBoost 0 ForegroundBoost 0 IoPriority 2 PagePriority 5
ChildEBP RetAddr  Args to Child              
b3989b70 828c0b25 981d68e0 8297ef08 8297bd20 nt!KiSwapContext+0x26 (FPO: [Uses EBP] [0,0,4])
b3989ba8 828bf423 981d69a0 981d68e0 921c8f14 nt!KiSwapThread+0x266
b3989bd0 828b92cf 981d68e0 981d69a0 00000000 nt!KiCommitThreadWait+0x1df
b3989c48 82a8ffe6 921c8f14 00000000 00000101 nt!KeWaitForSingleObject+0x393
b3989c7c 82aa1f32 00000103 921c8eb8 00000001 nt!IopSynchronousServiceTail+0x270
b3989d08 8289544a 979133a8 bc57ce28 00000000 nt!NtReadFile+0x644
b3989d08 770564f4 979133a8 bc57ce28 00000000 nt!KiFastCallEntry+0x12a (FPO: [0,3] TrapFrame @ b3989d34)
WARNING: Frame IP not in any known module. Following frames may be wrong.
00129e68 00000000 00000000 00000000 00000000 0x770564f4

0: kd> !irp bc57ce28
Irp is active with 10 stacks 10 is current (= 0xbc57cfdc)
 No Mdl: No System Buffer: Thread 981d68e0:  Irp stack trace.  
     cmd  flg cl Device   File     Completion-Context
 [  0, 0]   0  0 00000000 00000000 00000000-00000000    

   Args: 00000000 00000000 00000000 00000000
 [  0, 0]   0  0 00000000 00000000 00000000-00000000    

   Args: 00000000 00000000 00000000 00000000
 [  0, 0]   0  0 00000000 00000000 00000000-00000000    

   Args: 00000000 00000000 00000000 00000000
 [  0, 0]   0  0 00000000 00000000 00000000-00000000    

   Args: 00000000 00000000 00000000 00000000
 [  0, 0]   0  0 00000000 00000000 00000000-00000000    

   Args: 00000000 00000000 00000000 00000000
 [  0, 0]   0  0 00000000 00000000 00000000-00000000    

   Args: 00000000 00000000 00000000 00000000
 [  0, 0]   0  0 00000000 00000000 00000000-00000000    

   Args: 00000000 00000000 00000000 00000000
 [  0, 0]   0  0 00000000 00000000 00000000-00000000    

   Args: 00000000 00000000 00000000 00000000
 [  0, 0]   0 10 00000000 00000000 00000000-00000000    

   Args: 00000000 00000000 00000000 00000000
>[  3, 0]   0  1 979133a8 921c8eb8 00000000-00000000    pending
        \FileSystem\FltMgr
   Args: 00001000 00000000 00000000 00000000
So we can see that the IRP is pending, and that it's pending somewhere inside FltMgr. This usually means that a minifilter pended it because FltMgr doesn't pend IRPs on its own. So let's look at what is in the IRP's DriverContext:

0: kd> dt bc57ce28 nt!_IRP -a Tail.Overlay.DriverContext
   +0x040 Tail                       : 
      +0x000 Overlay                    : 
         +0x000 DriverContext              : 
          [00] (null) 
          [01] 0xbb9a6e40 Void
          [02] (null) 
          [03] 0x98049ea8 Void
0: kd> !fltkd.cbd 0xbb9a6e40 

IRP_CTRL: bb9a6de0  READ (3) [00000001] Irp
Flags                    : [10000004] DontCopyParms FixedAlloc
Irp                      : bc57ce28 
DeviceObject             : 979133a8 "\Device\HarddiskVolume2"
FileObject               : 921c8eb8 
CompletionNodeStack      : bb9a6e98   Size=5  Next=0
SyncEvent                : (bb9a6df0)
InitiatingInstance       : 00000000 
Icc                      : b3989c0c 
PendingCallbackNode      : baacefb4 
PendingCallbackContext   : 00000000 
PendingStatus            : 0x00000000 
CallbackData             : (bb9a6e40)
 Flags                    : [00000001] Irp
 Thread                   : 981d68e0 
 Iopb                     : bb9a6e6c 
 RequestorMode            : [01] UserMode
 IoStatus.Status          : 0x00000000 
 IoStatus.Information     : 00000000 
 TagData                  : 00000000 
 FilterContext[0]         : b7c46fc4 
 FilterContext[1]         : b7c46fc4 
 FilterContext[2]         : 98049ea8 
 FilterContext[3]         : 00000000 

   Cmd     IrpFl   OpFl  CmpFl  Instance FileObjt Completion-Context  Node Adr
--------- -------- ----- -----  -------- -------- ------------------  --------
 [0,0]    00000000  00   0000   00000000 00000000 00000000-00000000   bb9a6fb8
     Args: 00000000 00000000 00000000 00000000 00000000 0000000000000000
 [0,0]    00000000  00   0000   00000000 00000000 00000000-00000000   bb9a6f70
     Args: 00000000 00000000 00000000 00000000 00000000 0000000000000000
 [0,0]    00000000  00   0000   00000000 00000000 00000000-00000000   bb9a6f28
     Args: 00000000 00000000 00000000 00000000 00000000 0000000000000000
 [0,0]    00000000  00   0000   00000000 00000000 00000000-00000000   bb9a6ee0
     Args: 00000000 00000000 00000000 00000000 00000000 0000000000000000
 [0,0]    00000000  00   0000   00000000 00000000 00000000-00000000   bb9a6e98
     Args: 00000000 00000000 00000000 00000000 00000000 0000000000000000
Working IOPB:
>[3,0]    40060900  00          baaceea0 921c8eb8                     bb9a6e6c
            ("CancelSafe","CancelSafe Instance")  
     Args: 00001000 00000000 00000000 00000000 0183e6d0 0000000000000000
So as you can see the Tail.Overlay.DriverContext[1] has a pointer to the IRP_CTRL structure associated with the IRP. As far as I can tell this is set both by the FltCbdqXxx routines as well as by FltSetCancelCompletion(). This means that minifilters can benefit from this even if they aren't using the FltCbdqXxx routines for whatever reason (though frankly they should use those routines pretty much every time they're delaying processing of a request). So this makes the process of finding an FLT_CALLBACK_DATA associated with an IRP that is pended by a minifilter a fairly straightforward thing.

Another question is what happens when the IRP reaches the file system or a different legacy filter after being pended by a minifilter. This is different from the case where the minifilter didn't pend the request (which we discussed last week) because this is a different stack from the original stack the IO was issued on, so there is no fltmgr!FltpDispatch to call fltmgr!FltpLegacyProcessingAfterPreCallbacksCompleted in this case. But nevertheless, as you may have guessed from the name, fltmgr!FltpLegacyProcessingAfterPreCallbacksCompleted will be called once all the preOp callbacks for the frame have been called anyway so we still can use the first parameter for it:

0: kd> kb
ChildEBP RetAddr  Args to Child              
94f13ba0 9267dbae b7deaee0 bc57ce28 069b05e3 Ntfs!NtfsCommonRead
94f13c10 82b816c3 97914020 bc57ce28 00000000 Ntfs!NtfsFsdRead+0x279
94f13c34 8288e473 00000000 bc57ce28 97914020 nt!IovCallDriver+0x258
94f13c48 9262620c 8288e3f1 00000000 baaceea0 nt!IofCallDriver+0x1b
94f13c6c 92629df7 94f13c84 979133a8 00000001 fltmgr!FltpLegacyProcessingAfterPreCallbacksCompleted+0x2aa
94f13c9c b08c0831 bb9a6e40 00000001 00000000 fltmgr!FltCompletePendedPreOperation+0x8b
94f13ccc 926453c8 bc5c4fc8 baaceea0 baaceea0 CancelSafe!PreReadWorkItemRoutine+0x131 [c:\temp6\cancelsafe\cancelsafe.c @ 1537]
94f13d00 828bff3b 00000000 00000000 82fae020 fltmgr!FltpProcessGenericWorkItem+0x38
94f13d50 82a606d3 00000001 ccd6e460 00000000 nt!ExpWorkerThread+0x10d
94f13d90 829120f9 828bfe2e 00000001 00000000 nt!PspSystemThreadStartup+0x9e
00000000 00000000 00000000 00000000 00000000 nt!KiThreadStartup+0x19
0: kd> dt 94f13c84 fltmgr!_IRP_CALL_CTRL
   +0x000 Volume           : 0x94de4ae0 _FLT_VOLUME
   +0x004 Irp              : 0xbc57ce28 _IRP
   +0x008 IrpCtrl          : 0xbb9a6de0 _IRP_CTRL
   +0x00c StartingCallbackNode : 0xbaacefb4 _CALLBACK_NODE
   +0x010 OperationStatusCallbackListHead : _SINGLE_LIST_ENTRY
   +0x014 Flags            : 1 ( ICCFL_SKIP_STARTING_NODE )
Finally, there is one more case to consider, the case of the IRP being pended after FltMgr called fltmgr!FltpLegacyProcessingAfterPreCallbacksCompleted, in either a legacy filter below the frame or in the file system. In this case there is no call to fltmgr!FltpLegacyProcessingAfterPreCallbacksCompleted on the stack and Tail.Overlay.DriverContext[1] has no reason to point to the FLT_CALLBACK_DATA. In this case the IRP_CTRL structure can be found as the context for the completion routine on the IRP:

0: kd> !irp 0xbc57ce28 
Irp is active with 10 stacks 9 is current (= 0xbc57cfb8)
 Mdl=b7dc2fd8: No System Buffer: Thread 981d68e0:  Irp stack trace.  
     cmd  flg cl Device   File     Completion-Context
 [  0, 0]   0  0 00000000 00000000 00000000-00000000    

   Args: 00000000 00000000 00000000 00000000
 [  0, 0]   0  0 00000000 00000000 00000000-00000000    

   Args: 00000000 00000000 00000000 00000000
 [  0, 0]   0  0 00000000 00000000 00000000-00000000    

   Args: 00000000 00000000 00000000 00000000
 [  0, 0]   0  0 00000000 00000000 00000000-00000000    

   Args: 00000000 00000000 00000000 00000000
 [  0, 0]   0  0 00000000 00000000 00000000-00000000    

   Args: 00000000 00000000 00000000 00000000
 [  0, 0]   0  0 00000000 00000000 00000000-00000000    

   Args: 00000000 00000000 00000000 00000000
 [  0, 0]   0  0 00000000 00000000 00000000-00000000    

   Args: 00000000 00000000 00000000 00000000
 [  0, 0]   0 10 00000000 00000000 00000000-00000000    

   Args: 00000000 00000000 00000000 00000000
>[  3, 0]   0 e0 97914020 921c8eb8 92625aae-bb9a6de0 Success Error Cancel 
        \FileSystem\Ntfs fltmgr!FltpPassThroughCompletion
   Args: 00001000 00000000 00000000 00000000
 [  3, 0]   0  1 979133a8 921c8eb8 00000000-00000000    pending
        \FileSystem\FltMgr
   Args: 00001000 00000000 00000000 00000000
Finally there are a couple of things I'd like to mention:

  • There is a FLT_CALLBACK_DATA structure per frame, so if there are multiple frames you need to find the right frame for your filter before looking for the right FLT_CALLBACK_DATA. The IRP_CALL_CTRL structure actually starts with a FLT_VOLUME pointer which is also unique per frame and so if you find the IRP_CALL_CTRL you can quickly check if it is the right one by checking whether !fltkd.volume shows your filter as belonging to that volume.
  • Even when looking at the right FLT_CALLBACK_DATA you might not see any entry for your filter if you did not return FLT_PREOP_SUCCESS_WITH_CALLBACK or FLT_PREOP_SYNCHRONIZE from the preOp callback.
  • The FLT_CALLBACK_DATA has a pointer to the IRP which is why I focused only on finding the FLT_CALLBACK_DATA given the IRP.
  • I've been using the terms FLT_CALLBACK_DATA and IRP_CTRL interchangeably in this post, and i've even colored them with the same color. That's because from the perspective of this article we don't care about the differences between them and !fltkd.cbd can work with either one.

Thursday, August 4, 2011

Debugging Minifilters: Finding the FLT_CALLBACK_DATA - Part I

When debugging minifilters I frequently have to figure out what's wrong in the system at a point when something bughecks or fails in some OS component other than my minifilter. For example if I find something fails with a certain error code in NTFS I might put a breakpoint on where that error code is returned. In such a case I might be able to find the IRP representing the operation but finding the FLT_CALLBACK_DATA structure associated with that IRP is not that easy. I'll show a couple of techniques that I find useful.

First let me explain why finding the FLT_CALLBACK_DATA is in general more complicated than finding the IRP. As you probably expect, this is due to FltMgr's design. The IO model in NT is a call-through model. This means that each layer in an IO stack must call the next component directly. This design choice has a couple of side effects:

  • Each device in the IO stack knows what the device right below it is. This means that a device cannot be inserted at an arbitrary point in an IO stack but must always be attached on top. It also means that a device cannot be removed from an IO stack.
  • Considering that fact that each device must call the next device, if you have enough devices, that stack space used by these calls (local variables, SEH frames, function parameters and so on) will add up and it's possible to run out of stack.
  • Whoever allocates an IRP must know what the size of the IO stack is so that there are enough stack locations or the system will bugcheck when using that IRP. This makes preallocating IRPs for either pooling them or for forward progress a complicated proposition.
Now, these side effects are really unpleasant for file system filters for the following reasons:
  • A file system filter might need to be at a certain position in the IO stack and so the requirement that a device can only be attached on top of a stack can be very limiting. An example of this is an encryption filter that is not attached at boot time and that will end up on the top of the stack and will encrypt all data that the other filters see, including anti-virus filters, which prevents the anti-virus product from detecting viruses…
  • A file system filter might need to unload and it might not be able to do that if it's not at the top of the stack. Think about file system filters that need to run on enterprise servers that are never rebooted but that might need to have the filters updated every now and then.
  • A file system filter might reenter multiple times while processing one operation which means it needs to be very conservative with stack usage or risk running out of stack.
Anyway, in order to deal with all these problems FltMgr uses a callback model. This means that FltMgr will call each minifilter, then the minifilter processes the operation and when it's done processing it returns to FltMgr, unwinding the stack. This leads to better stack usage overall but it also means one cannot look on the stack and get to the stack frame corresponding to the minifilter and analyze it. Here is an example using a passthrough minifilter that I've rigged to break on some IRP_MJ_READ operations. The first stack is from when FltMgr is calling my minifilter's callback and the second one is on the same thread, when the request finally reaches NTFS after my minifilter is done with it. It’s the stack for the same thread at two different moments in time. While looking at this piece of debugger output, please note that because the minifilter returns control to FltMgr the ammount of used stack for file system filters (which is pretty much the stack used by FltMgr) is rather small by the time the IRP finally reaches NTFS. Another thing to note is that just by looking at the first two stack traces it's impossible to tell how many filters are attached on the volume because all the data on the stack frame is lost (the number of filters attached to this volume can be seen in the output ot the !fltkd.volume command).
1: kd> kf
  Memory  ChildEBP RetAddr  
          b3989ad4 92647627 PassThrough!PtPreReadPassThrough+0x90 [c:\temp6\passthrough\passthrough.c @ 989]
       78 b3989b4c 92622aeb fltmgr!FltvPreOperation+0x81
       6c b3989bb8 926259f0 fltmgr!FltpPerformPreCallbacks+0x34d
       18 b3989bd0 92625f01 fltmgr!FltpPassThroughInternal+0x40
       24 b3989bf4 926263ba fltmgr!FltpPassThrough+0x203
       30 b3989c24 82b816c3 fltmgr!FltpDispatch+0xb4
       24 b3989c48 8288e473 nt!IovCallDriver+0x258
       14 b3989c5c 82a8ff6e nt!IofCallDriver+0x1b
       20 b3989c7c 82aa1f32 nt!IopSynchronousServiceTail+0x1f8
       8c b3989d08 8289544a nt!NtReadFile+0x644
 …

1: kd> kf
  Memory  ChildEBP RetAddr  
          b3989b20 9267dbae Ntfs!NtfsCommonRead
       70 b3989b90 82b816c3 Ntfs!NtfsFsdRead+0x279
       24 b3989bb4 8288e473 nt!IovCallDriver+0x258
       14 b3989bc8 9262620c nt!IofCallDriver+0x1b
       24 b3989bec 926263cb fltmgr!FltpLegacyProcessingAfterPreCallbacksCompleted+0x2aa
       38 b3989c24 82b816c3 fltmgr!FltpDispatch+0xc5
       24 b3989c48 8288e473 nt!IovCallDriver+0x258
       14 b3989c5c 82a8ff6e nt!IofCallDriver+0x1b
       20 b3989c7c 82aa1f32 nt!IopSynchronousServiceTail+0x1f8
       8c b3989d08 8289544a nt!NtReadFile+0x644
 ...

1: kd> !fltkd.volume 0x94de4ae0 

FLT_VOLUME: 94de4ae0 "\Device\HarddiskVolume2"
 ...
   NameCacheCtrl            : (94de4e60)
   InstanceList             : (94de4b30)
      FLT_INSTANCE: b4c22b40 "PassThrough Instance" "370030"
      FLT_INSTANCE: a3290a50 "luafv" "135000"
      FLT_INSTANCE: 98a8adc8 "FileInfo" "45000"


Ok, so now with the stack where the request reached NTFS we need to try to find the IRP and the FLT_CALLBACK_DATA structure. There is a particularly useful structure that we can use here, the IRP_CALL_CTRL (which I've talked about in a couple of other posts). This structure is undocumented and not available to minifilters, but it is available in the public symbols from Microsoft. This is the first parameter to the call to fltmgr!FltpLegacyProcessingAfterPreCallbacksCompleted, so we can get it from the stack. So here we go (I've highlighted the IRP, FLT_VOLUME, IRP_CALL_CTRL and FLT_CALLBACK_DATA):
1: kd> kb
ChildEBP RetAddr  Args to Child              
b3989b20 9267dbae 9d1acee0 b52c8e28 21f2a263 Ntfs!NtfsCommonRead
b3989b90 82b816c3 97914020 b52c8e28 00000000 Ntfs!NtfsFsdRead+0x279
b3989bb4 8288e473 00000000 b52c8e28 97914020 nt!IovCallDriver+0x258
b3989bc8 9262620c 979133a8 b52c8e28 00000000 nt!IofCallDriver+0x1b
b3989bec 926263cb b3989c0c 979133a8 00000000 fltmgr!FltpLegacyProcessingAfterPreCallbacksCompleted+0x2aa
b3989c24 82b816c3 979133a8 b52c8e28 985c8268 fltmgr!FltpDispatch+0xc5
b3989c48 8288e473 00000000 b52c8e28 979133a8 nt!IovCallDriver+0x258
b3989c5c 82a8ff6e b52c8e28 b52c8fdc 985c8268 nt!IofCallDriver+0x1b
b3989c7c 82aa1f32 979133a8 985c8268 00000001 nt!IopSynchronousServiceTail+0x1f8
b3989d08 8289544a 979133a8 b52c8e28 00000000 nt!NtReadFile+0x644
...
1: kd> !irp b52c8e28
Irp is active with 10 stacks 9 is current (= 0xb52c8fb8)
 No Mdl: No System Buffer: Thread 981d68e0:  Irp stack trace.  
     cmd  flg cl Device   File     Completion-Context
 [  0, 0]   0  0 00000000 00000000 00000000-00000000    

   Args: 00000000 00000000 00000000 00000000
 [  0, 0]   0  0 00000000 00000000 00000000-00000000    

   Args: 00000000 00000000 00000000 00000000
 [  0, 0]   0  0 00000000 00000000 00000000-00000000    

   Args: 00000000 00000000 00000000 00000000
 [  0, 0]   0  0 00000000 00000000 00000000-00000000    

   Args: 00000000 00000000 00000000 00000000
 [  0, 0]   0  0 00000000 00000000 00000000-00000000    

   Args: 00000000 00000000 00000000 00000000
 [  0, 0]   0  0 00000000 00000000 00000000-00000000    

   Args: 00000000 00000000 00000000 00000000
 [  0, 0]   0  0 00000000 00000000 00000000-00000000    

   Args: 00000000 00000000 00000000 00000000
 [  0, 0]   0 10 00000000 00000000 00000000-00000000    

   Args: 00000000 00000000 00000000 00000000
>[  3, 0]   0 e0 97914020 985c8268 92625aae-b8ce4de0 Success Error Cancel 
        \FileSystem\Ntfs fltmgr!FltpPassThroughCompletion
   Args: 00001000 00000000 00000000 00000000
 [  3, 0]   0  1 979133a8 985c8268 00000000-00000000    pending
        \FileSystem\FltMgr
   Args: 00001000 00000000 00000000 00000000
1: kd> dt b3989c0c fltmgr!_IRP_CALL_CTRL
   +0x000 Volume           : 0x94de4ae0 _FLT_VOLUME
   +0x004 Irp              : 0xb52c8e28 _IRP
   +0x008 IrpCtrl          : 0xb8ce4de0 _IRP_CTRL
   +0x00c StartingCallbackNode : 0xffffffff _CALLBACK_NODE
   +0x010 OperationStatusCallbackListHead : _SINGLE_LIST_ENTRY
   +0x014 Flags            : 0x204 (No matching name)
1: kd> !fltkd.cbd 0xb8ce4de0 

IRP_CTRL: b8ce4de0  READ (3) [00000001] Irp
Flags                    : [10000004] DontCopyParms FixedAlloc
Irp                      : b52c8e28 
DeviceObject             : 979133a8 "\Device\HarddiskVolume2"
FileObject               : 985c8268 
CompletionNodeStack      : b8ce4e98   Size=5  Next=2
SyncEvent                : (b8ce4df0)
InitiatingInstance       : 00000000 
Icc                      : b3989c0c 
PendingCallbackNode      : ffffffff 
PendingCallbackContext   : 00000000 
PendingStatus            : 0x00000000 
CallbackData             : (b8ce4e40)
 Flags                    : [00000001] Irp
 Thread                   : 981d68e0 
 Iopb                     : b8ce4e6c 
 RequestorMode            : [01] UserMode
 IoStatus.Status          : 0xffffffff 
 IoStatus.Information     : 00000000 
 TagData                  : 00000000 
 FilterContext[0]         : 00000000 
 FilterContext[1]         : 00000000 
 FilterContext[2]         : 00000000 
 FilterContext[3]         : 00000000 

   Cmd     IrpFl   OpFl  CmpFl  Instance FileObjt Completion-Context  Node Adr
--------- -------- ----- -----  -------- -------- ------------------  --------
 [0,0]    00000000  00   0000   00000000 00000000 00000000-00000000   b8ce4fb8
     Args: 00000000 00000000 00000000 00000000 00000000 0000000000000000
 [0,0]    00000000  00   0000   00000000 00000000 00000000-00000000   b8ce4f70
     Args: 00000000 00000000 00000000 00000000 00000000 0000000000000000
 [0,0]    00000000  00   0000   00000000 00000000 00000000-00000000   b8ce4f28
     Args: 00000000 00000000 00000000 00000000 00000000 0000000000000000
 [3,0]    40060900  00   0002   98a8adc8 985c8268 92647760-00000000   b8ce4ee0
            ("FileInfo","FileInfo")  fileinfo!FIPostReadWriteCallback  (verified)
     Args: 00001000 00000000 00000000 00000000 0183f930 0000000000000000
 [3,0]    40060900  00   0002   b4c22b40 985c8268 92647760-00000000   b8ce4e98
            ("PassThrough","PassThrough Instance")  PassThrough!PtPostOperationPassThrough  (verified)
     Args: 00001000 00000000 00000000 00000000 0183f930 0000000000000000
Working IOPB:
>[3,0]    40060900  00          98a8adc8 985c8268                     b8ce4e6c
            ("FileInfo","FileInfo")  
     Args: 00001000 00000000 00000000 00000000 0183f930 0000000000000000
Another thing to note is that the IRP_CALL_CTRL structure is allocated on the stack. So if for whatever reason you can't get it from the stack (amd64 for example where parameters are passed in registers), you can simply search the stack for it. The first member of the structure is a pointer to the FLT_VOLUME structure for the volume and the second member is a pointer to the IRP so you can just search for those two (i wrote it in a more generic fashion to automatically search the stack of the current thread; it also assumes that the default expression evaluator is masm). Please note that the proper size must be used (d for x86, q for amd64) because for whatever reason the search command doesn't implement a 'p' type which would have been useful (that would be the pointer size on the machine like used in the dp command):
1: kd>  s -d @$csp L(@@(((_KTHREAD*)@$thread)->InitialStack) - @$csp) 0x94de4ae0 0xb52c8e28 
b3989c0c  94de4ae0 b52c8e28 b8ce4de0 ffffffff  .J..(.,..M......
And here is what this looks like on an amd64 machine. Please note that the IRP is in a different location on the stack. Also, I'm not showing how I got the FLT_VOLUME in this case (eyeballing the output of !fltkd.volume). The same color scheme but of course the values are different:
0: kd> kb
RetAddr           : Args to Child                                                           : Call Site
fffff880`0121ba68 : fffffa80`1a9c4010 fffff880`046608f8 fffff880`046608e0 fffffa80`1abb6b00 : Ntfs!NtfsCommonRead
fffff880`010f7bcf : fffffa80`1b2c4bd8 fffffa80`1b2c4880 fffffa80`1abb6b90 00000000`00000001 : Ntfs!NtfsFsdRead+0x1b8
fffff880`010f66df : fffffa80`19a94350 00000000`00000001 fffffa80`19a94300 fffffa80`1b2c4880 : fltmgr!FltpLegacyProcessingAfterPreCallbacksCompleted+0x24f
fffff800`02ba171b : 00000000`00000000 fffffa80`1aff75b0 00000000`00000001 fffffa80`1b2c4880 : fltmgr!FltpDispatch+0xcf
fffff800`02b830b3 : fffffa80`1aff75b0 fffffa80`1aff75b0 fffffa80`1aff75b0 fffff800`02a0be80 : nt!IopSynchronousServiceTail+0xfb
fffff800`028988d3 : fffffa80`1b2d2060 00000000`00000000 00000000`00000000 00000000`00000000 : nt!NtReadFile+0x631
0: kd> s -q @$csp L(@@(((_KTHREAD*)@$thread)->InitialStack) - @$csp) fffffa8019ab0450 fffffa801b2c4880
fffff880`04660a10  fffffa80`19ab0450 fffffa80`1b2c4880
0: kd> dt fffff880`04660a10 fltmgr!_IRP_CALL_CTRL
   +0x000 Volume           : 0xfffffa80`19ab0450 _FLT_VOLUME
   +0x008 Irp              : 0xfffffa80`1b2c4880 _IRP
   +0x010 IrpCtrl          : 0xfffffa80`1abb6b90 _IRP_CTRL
   +0x018 StartingCallbackNode : 0xffffffff`ffffffff _CALLBACK_NODE
   +0x020 OperationStatusCallbackListHead : _SINGLE_LIST_ENTRY
   +0x028 Flags            : 0x204 (No matching name)
0: kd> !fltkd.cbd 0xfffffa80`1abb6b90

IRP_CTRL: fffffa801abb6b90  READ (3) [00000001] Irp
Flags                    : [10000004] DontCopyParms FixedAlloc
Irp                      : fffffa801b2c4880 
DeviceObject             : fffffa8019a94350 "\Device\HarddiskVolume1"
FileObject               : fffffa801aff75b0 
CompletionNodeStack      : fffffa801abb6ce0   Size=5  Next=2
SyncEvent                : (fffffa801abb6ba8)
InitiatingInstance       : 0000000000000000 
Icc                      : fffff88004660a10 
PendingCallbackNode      : ffffffffffffffff 
PendingCallbackContext   : 0000000000000000 
PendingStatus            : 0x00000000 
CallbackData             : (fffffa801abb6c40)
 Flags                    : [00000001] Irp
 Thread                   : fffffa801aff3660 
 Iopb                     : fffffa801abb6c98 
 RequestorMode            : [01] UserMode
 IoStatus.Status          : 0x00000000 
 IoStatus.Information     : 0000000000000000 
 TagData                  : 0000000000000000 
 FilterContext[0]         : 0000000000000000 
 FilterContext[1]         : 0000000000000000 
 FilterContext[2]         : 0000000000000000 
 FilterContext[3]         : 0000000000000000 

   Cmd     IrpFl   OpFl  CmpFl  Instance FileObjt Completion-Context  Node Adr
--------- -------- ----- -----  -------- -------- ------------------  --------
 [0,0]    00000000  00   0000   0000000000000000 0000000000000000 0000000000000000-0000000000000000   fffffa801abb6ee0
     Args: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000
 [0,0]    00000000  00   0000   0000000000000000 0000000000000000 0000000000000000-0000000000000000   fffffa801abb6e60
     Args: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000
 [0,0]    00000000  00   0000   0000000000000000 0000000000000000 0000000000000000-0000000000000000   fffffa801abb6de0
     Args: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000
 [3,0]    00060900  00   0002   fffffa8019b40bb0 fffffa801aff75b0 fffff880011422f4-0000000000000000   fffffa801abb6d60
            ("FileInfo","FileInfo")  fileinfo!FIPostReadWriteCallback 
     Args: 0000000000000004 0000000000000000 0000000000000000 0000000000268fd4 0000000000000000 0000000000000000
 [3,0]    00060900  00   0002   fffffa801b365010 fffffa801aff75b0 fffff88003b4b1a0-0000000000000000   fffffa801abb6ce0
            ("PassThrough","PassThrough Instance")  PassThrough!PtPostOperationPassThrough 
     Args: 0000000000000004 0000000000000000 0000000000000000 0000000000268fd4 0000000000000000 0000000000000000
Working IOPB:
>[3,0]    00060900  00          fffffa8019b40bb0 fffffa801aff75b0                     fffffa801abb6c98
            ("FileInfo","FileInfo")  
     Args: 0000000000000004 0000000000000000 0000000000000000 0000000000268fd4 0000000000000000 0000000000000000
This is getting too long so I'll split the post here. In the next post I'll show how to do this (get the FLT_CALLBACK_DATA) when the IRP was pended and therefore there might not be a call to fltmgr!FltpLegacyProcessingAfterPreCallbacksCompleted at all or even worse the thread can be gone altogether so we can't search the stack.