Thursday, February 24, 2011

Tracking a minifilter's ActiveOpens files

I've recently done a bit of detective work that I thought might be an interesting thing to share, especially since we've been talking about contexts so much lately. The issue was how to find the files that were opened by a minifilter which prevent that minifilter from unloading. FltMgr actually keeps track of the files that were opened by a minifilter and if the minifilter gets unloaded it will wait for those files to be closed before it unloads the driver. You can see this in the debugger here:

1: kd> !fltkd.filter 94130008 

FLT_FILTER: 94130008 "luafv" "135000"
   FLT_OBJECT: 94130008  [02000000] Filter
      RundownRef               : 0x0000000a (5)
      PointerCount             : 0x00000001 
      PrimaryLink              : [922c75a4-92cdaa24] 
   Frame                    : 92cda9c8 "Frame 0" 
   Flags                    : [00000006] FilteringInitiated NameProvider
   DriverObject             : 9306bc50 
   FilterLink               : [922c75a4-92cdaa24] 
   PreVolumeMount           : 81fbe0cc  luafv!LuafvPreRedirect 
   PostVolumeMount          : 00000000  (null) 
   FilterUnload             : 00000000  (null) 
   InstanceSetup            : 81fca62b  luafv!LuafvInstanceSetup 
   InstanceQueryTeardown    : 00000000  (null) 
   InstanceTeardownStart    : 00000000  (null) 
   InstanceTeardownComplete : 00000000  (null) 
   ActiveOpens              : (941300dc)  mCount=1 
   Communication Port List  : (94130108)  mCount=0 
   Client Port List         : (94130134)  mCount=0 
   VerifierExtension        : 00000000 

So the task is to find that one file that ActiveOpens is tracking for LUAFV. Before we go any further, let's see what the chain of structures looks like, starting from the FILE_OBJECT. Some of the structures are documented while others are not. I've marked the undocumented structures with a '?' at the end of the name. We don't know (or care for this post) what the other members of the structures are.

Now, we need to walk the arrows backwards so the steps we need to follow are:

  1. From fltmgr!_FLT_FILTER->ActiveOpens find fltmgr!FO_CONTEXT?
  2. From fltmgr!FO_CONTEXT? find the nt!FILE_OBJECT_CONTEXTS_HEADER? structure
  3. From the nt!FILE_OBJECT_CONTEXTS_HEADER? structure find the nt!_IOP_FILE_OBJECT_EXTENSION pointing to it
  4. From the nt!_IOP_FILE_OBJECT_EXTENSION find the nt!_FILE_OBJECT structure that points to it
  5. ???
  6. Profit!!!!

Starting with ActiveOpens, let's look at the structure. Please note that mCount appears to be shifted by 1. Also, please note that mList is a regular doubly linked list and we expect that it contains one entry (since mCount is 1):

1: kd> dt 941300dc _FLT_MUTEX_LIST_HEAD
fltmgr!_FLT_MUTEX_LIST_HEAD
   +0x000 mLock            : _FAST_MUTEX
   +0x020 mList            : _LIST_ENTRY [ 0x93712498 - 0x93712498 ]
   +0x028 mCount           : 2
   +0x028 mInvalid         : 0y0
1: kd> dl 0x93712498 
93712498  941300fc 941300fc 0421000e 706e5043
941300fc  93712498 93712498 00000002 00000001
1: kd> !pool 0x93712498 2
Pool page 93712498 region is Nonpaged pool
*93712430 size:   70 previous size:    8  (Allocated) *FMfc
  Pooltag FMfc : FLTMGR_FILE_OBJECT_CONTEXT structure, Binary : fltmgr.sys

So now we know that the structure that we've been calling fltmgr!FO_CONTEXT? is in fact called FLTMGR_FILE_OBJECT_CONTEXT. Step 1 is done and we're moving on to step 2. We also know the size isn't larger than 0x70. However, we don't know where the LIST_ENTRY is in there. The only other thing we know is that in that structure there must be a member that is of type FSRTL_PER_FILEOBJECT_CONTEXT (and we know this because that's how FILE_OBJECT contexts are implemented; see the documentation for FsRtlInsertPerFileObjectContext).

Since the beginning of the _LIST_ENTRY is at address 0x93712498 and the pool block starts at 0x93712430, we can guess our LIST_ENTRY is towards the end of the structure, so we'll go back a bit and display the words. Then we'll look for something that looks like an FSRTL_PER_FILEOBJECT_CONTEXT. FSRTL_PER_FILEOBJECT_CONTEXT doesn't really contain much easily identifiable information but it does contain a LIST_ENTRY, which means we should find two words that look like kernel mode addresses next to each other. I've highlighted possible candidates and then we simply try "dl" on them (if anyone knows a better way I'd love to hear about it, please leave a comment). Of course, if you look at the numbers carefully you can see that it's very likely that there is a LIST_ENTRY at 93712480 if the list has only one element. But in most cases the list has more than one element so it's hard to tell at a quick glance. So we simply issue a "dl" on each candidate and hope that if they're not doubly linked lists they'll simply run into some invalid address sooner or later.

1: kd> dp 0x93712498-0x60 
93712438  48706345 00000000 00000000 00000000
93712448  00ac5851 45e4702b 2a5794ac 7e7ac1fe
93712458  00000000 00000047 00000068 96040c00
93712468  00000000 0034f110 001d0000 92cda9c8
93712478  94134ce8 93011008 9306b0c8 9306b0c8
93712488  938f1230 00000000 94130008 941307d8
93712498  941300fc 941300fc 0421000e 706e5043
937124a8  937395a8 937126b8 937699d8 93739ad8
1: kd> dl 92cda9c8
92cda9c8  0340f103 960409f8 960409f8 00000000
0340f103  00000000 00000000 00000000 00000000
1: kd> dl 94134ce8 
94134ce8  00800005 92f05788 92f06870 92fa4998
…
1: kd> dl 9306b0c8  
9306b0c8  93712480 93712480 00010006 e56c6946
93712480  9306b0c8 9306b0c8 938f1230 00000000
1: kd> !pool 9306b0c8 2
Pool page 9306b0c8 region is Nonpaged pool
*9306b0a0 size:   30 previous size:   10  (Allocated) *FOCX
  Pooltag FOCX : File System Run Time File Object Context structure, Binary : nt!fsrtl
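
The candidate hunt above can also be scripted outside the debugger. What follows is only a minimal Python sketch, not a debugger extension. It assumes a 32-bit target where kernel addresses start at 0x80000000, it uses the words from rows 93712478 and 93712488 of the "dp" output, and it reads memory through a hypothetical read_word helper that only knows the values the "dl" commands displayed. It first flags adjacent pairs of kernel-looking words and then keeps the ones whose Flink leads to an entry whose Blink points back.

```python
KERNEL_BASE = 0x80000000  # assumption: 32-bit target, default 2 GB user/kernel split
WORD = 4                  # pointer size on x86

def list_entry_candidates(base, words):
    """Return addresses where two adjacent words both look like kernel pointers."""
    hits = []
    for i in range(len(words) - 1):
        if words[i] >= KERNEL_BASE and words[i + 1] >= KERNEL_BASE:
            hits.append(base + i * WORD)
    return hits

def confirmed_entries(base, words, read_word):
    """Keep candidates whose Flink->Blink points back at the candidate."""
    confirmed = []
    for addr in list_entry_candidates(base, words):
        flink = words[(addr - base) // WORD]
        if read_word(flink + WORD) == addr:   # Blink sits one word after Flink
            confirmed.append(addr)
    return confirmed

# Rows 93712478 and 93712488 from the dp output above.
DUMP_BASE = 0x93712478
DUMP = [0x94134CE8, 0x93011008, 0x9306B0C8, 0x9306B0C8,
        0x938F1230, 0x00000000, 0x94130008, 0x941307D8]

# Only the memory that the dl commands above actually displayed.
KNOWN_MEMORY = {
    0x9306B0C8: 0x93712480, 0x9306B0CC: 0x93712480,
    0x94134CE8: 0x00800005, 0x94134CEC: 0x92F05788,
}

def read_word(address):
    """Hypothetical stand-in for a debugger memory read; None means unknown."""
    return KNOWN_MEMORY.get(address)

candidates = list_entry_candidates(DUMP_BASE, DUMP)
confirmed = confirmed_entries(DUMP_BASE, DUMP, read_word)
print([hex(a) for a in candidates])
print([hex(a) for a in confirmed])
```

The back-pointer check is stricter than just following the list with "dl", so it can reject a real list whose neighbours were not captured in KNOWN_MEMORY. Treat a miss as "unknown", not as "not a list".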

It looks like we may have found our structure. Now we're starting our step 3 and we have a pointer into a structure of type nt!FILE_OBJECT_CONTEXTS_HEADER?, but since we don't know the type we don't know where the structure starts. Normally what I do is assume it starts right after the pool tag and search for that address, and if that fails search for the address at the next word boundary and so on. Let's see:

1: kd> db 9306b0a0 L0x20
9306b0a0  02 00 06 04 46 4f 43 58-01 00 00 00 00 00 00 00  ....FOCX........
9306b0b0  00 00 00 00 01 00 04 20-00 00 00 00 bc b0 06 93  ....... ........
1: kd> s -d 80000000 L?0x20000000 9306b0a8
9306ba78  9306b0a8 00000000 00000000 00000000  ................
1: kd> !pool 9306ba78  2
Pool page 9306ba78 region is Nonpaged pool
*9306ba60 size:   30 previous size:   10  (Allocated) *Io  
  Pooltag Io   : general IO allocations, Binary : nt!io

So actually this looks pretty good. In other cases I've found multiple random values that looked like references so I've just had to look at each one. But in this case it looks pretty clean. We expect that the pointer is in a structure that's allocated by the IO manager, and that the structure's size is about the size of nt!_IOP_FILE_OBJECT_EXTENSION. So this means we've completed step 3 and we have the nt!_IOP_FILE_OBJECT_EXTENSION structure. In our picture I've shown that in the _IOP_FILE_OBJECT_EXTENSION it is FoExtPerTypeExtension[3] that points to this structure that we just found. I've figured this out by experimenting. I simply got a FILE_OBJECT and added a context by calling FsRtlInsertPerFileObjectContext and watched which value changed. So now we know where the structure starts and we need to search for a pointer to it. The pointer we would find here would be a FILE_OBJECT->FileObjectExtension so we expect the pool tag to be "File". Also, we expect the FILE_OBJECT to start 0x7c bytes before the pointer. Please note that the first two results returned by the search were invalid (!pool told me so):

1: kd> ? 0x9306ba78-0x10
Evaluate expression: -1828275608 = 9306ba68
1: kd> s -d 80000000 L?0x20000000 9306ba68
82b80008  9306ba68 9306ba68 00000000 abcddcba  h...h...........
82b8000c  9306ba68 00000000 abcddcba 00000001  h...............
94134d64  9306ba68 04530015 6661754c 00000000  h.....S.Luaf....
1: kd> !pool 94134d64  2
Pool page 94134d64 region is Nonpaged pool
*94134cc0 size:   a8 previous size:  2d8  (Allocated) *File (Protected)
  Pooltag File : File objects
1: kd> !fileobj 0x94134d64-0x7c  



Device Object: 0x92f05788   \Driver\volmgr
Vpb: 0x92f06870
Event signalled
Access: Read Write SharedRead SharedWrite SharedDelete 

Flags:  0x440008
 No Intermediate Buffering
 Handle Created
 Volume Open

FsContext: 0x92fa4998 FsContext2: 0x97e0cbc0
CurrentByteOffset: 0
Cache Data:
  Section Object Pointers: 92f0cd14
  Shared Cache Map: 00000000


File object extension is at 9306ba68:

So this is it: we now know that LUAFV opens the volume using FltCreateFile (if it didn't, the file wouldn't be on the ActiveOpens list). For the record, this is win7:

1: kd> vertarget
Windows 7 Kernel Version 7600 MP (2 procs) Free x86 compatible
Product: WinNt, suite: TerminalServer SingleUserTS
Built by: 7600.16617.x86fre.win7_gdr.100618-1621
Machine Name:
Kernel base = 0x82809000 PsLoadedModuleList = 0x82951810
Debug session time: Thu Feb 17 07:44:10.641 2011 (UTC - 8:00)
System Uptime: 0 days 0:02:42.733
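
The two subtractions in the walkthrough above are the same "step back from a field to the start of its structure" arithmetic. Here is a small Python sketch of it, using only addresses and offsets that appear in the post. The 0x10 slot offset is simply the value subtracted in the debugger session (it was found by experiment, not from documentation), and both offsets apply to this x86 Windows 7 build only.

```python
def containing_record(field_address, field_offset):
    """Return the start of the structure that holds a field at field_offset."""
    return field_address - field_offset

# Offsets as used in the post; valid for this x86 build only.
FO_EXT_SLOT_OFFSET = 0x10            # FoExtPerTypeExtension[3], found by experiment
FILE_OBJECT_EXTENSION_OFFSET = 0x7C  # nt!_FILE_OBJECT.FileObjectExtension

slot_address = 0x9306BA78            # where the first search found 9306b0a8
extension = containing_record(slot_address, FO_EXT_SLOT_OFFSET)

field_address = 0x94134D64           # where the second search found 9306ba68
file_object = containing_record(field_address, FILE_OBJECT_EXTENSION_OFFSET)

print(hex(extension))
print(hex(file_object))
```

As a cross-check, the FILE_OBJECT address this produces is also the first word of row 93712478 in the earlier "dp" output, and it is the address on which "dl 94134ce8" was tried.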

Thursday, February 17, 2011

Filter Layering and IO Targeting in FltMgr - part II

Let's look at how targeting in filter manager can fail and what it looks like when it happens. I'd like to say that while such things do happen and I've analyzed a couple of cases over the years, they are far less frequent with minifilters than with legacy filters. Anyway, layering violations in some cases can just go unnoticed, but if they do cause trouble most likely what will happen is an infinite recursion which will result in a bugcheck. Deadlocks can also happen, but the cases I've investigated so far were all infinite recursions.

One way layering can fail is when a FILE_OBJECT is used in postCreate by a minifilter and the minifilter is not using Flt APIs to perform the IO. For example, using the setup from the previous blog post, if minifilter1 in postCreate calls a Zw function or creates a handle for a user mode app and lets the user mode app use that handle to do something like scan the file or recall it or decompress it, what will happen is that the requests generated by the user mode service or the Zw calls will go to Frame0 because the FILE_OBJECT stores the information about the device hint, but FltMgr will not find its targeting structure and will show the requests to all minifilters in the frame, starting with Minifilter 3. However, this is a clear violation of the minifilter rules because minifilter1 broke the layering contract by sending IO to the top of the IO stack. If it wants to do this sort of thing it needs to call its own FltCreateFile and after that create succeeds it can either use a Zw API or create a handle for that FILE_OBJECT to be used by a user mode service.

Now, if filter manager always used a device hint things wouldn't be too bad. However, there is a case where FltMgr violates the rule of never sending IO to the top of the stack. This case is in the naming path (and more precisely in the FltpExpandFilePathWorker function). When a minifilter calls FltGetFileNameInformation and requests a normalized name, FltMgr gets a name for the file and then proceeds to normalize it. It does so by opening folders along the path to the file and querying information that contains the long name for each component on the path. In this case however, FltMgr does use a targeting structure to identify which minifilters should see the create, but it does not use a DeviceHint for the IoCreateFileEx call. The reason, as far as I can tell, is that the request will fail if the name contains a reparse point that reparses to a different volume (remember that bit about IopCheckTopDeviceHint a couple of posts back?) so FltMgr just sends it to the top of the stack.

So in this case the IRP_MJ_CREATE is not targeted in the IO manager (it will go to the top of the IO stack) but there will be FltMgr targeting information attached to the IRP_MJ_CREATE. Looking at our picture from the previous post, we can see that if the IRP_MJ_CREATE issued by (or on behalf of) minifilter2 is one of these creates then Frame1, Legacy Filter B and Legacy Filter A will all see the IRP_MJ_CREATE and the subsequent requests. Frame1 will find the targeting information and infer that no minifilter in that frame should see the request and just send it below. However, Legacy Filter B and Legacy Filter A will not be aware that they shouldn't see this request and will perform their usual functions. So, since the targeting information has not been attached to the FILE_OBJECT yet (it will only happen when IoCreateFileEx returns to FltMgr) and there is no DeviceHint on the FILE_OBJECT, if Legacy Filter B issues any requests to the device below it, then not only will Legacy Filter A see that request (which is expected), but also when the request reaches Frame 0 all the minifilters will see it. But minifilter3 and minifilter2 should not have seen it, and in fact they haven't seen the IRP_MJ_CREATE either, so there are quite a few things that can go wrong here.

Another interesting case I've seen was when a legacy filter (let's use Legacy Filter B again as an example) tried to implement something similar to FltReissueSynchronousIo in the minifilter world and in its postCreate, if the IRP_MJ_CREATE failed, it changed something in the request and sent it down again. This worked well on Vista and newer but in XP it failed. As you remember, in XP the targeting information is stored in an EA, and for whatever reason the EA mechanism was designed such that the structure is associated with an IRP_MJ_CREATE by storing the EaLength in the IO_STACK_LOCATION and the EA buffer in the IRP (Irp->AssociatedIrp.SystemBuffer). This is in my opinion a pretty poor design, because it suggests that the EaLength can be layered but the EA buffer cannot. It also means there is only one EA buffer per IRP and so FltMgr must use EA chaining. When FltMgr receives such a create it needs to remove its EA information from the buffer before it sends the request down. However, once the EA buffer has been changed, if a legacy filter sends the request down again in the manner we explained then FltMgr (in Frame0 in our example) will not find the targeting information and will therefore show the request to all the minifilters in the frame, which can also result in a layering violation. Moreover, the EaLength and the EA buffer are now out of sync and the file system might not like this. For a very clear example of what this issue looks like, please read this thread: http://www.osronline.com/showthread.cfm?link=187295. Please note that in this thread though we're not dealing with recursion but with the file system not being able to cope with the EaLength and the EA buffer being in an inconsistent state. Infinite recursion could still have happened, though, if there were more minifilters installed on the system.
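
To make the "out of sync" part more concrete, here is a rough Python sketch of an EA chain in the FILE_FULL_EA_INFORMATION layout (NextEntryOffset, Flags, EaNameLength, EaValueLength, then the name, a NUL and the value, with entries starting on 4-byte boundaries). The EA names are hypothetical, and the post doesn't say how FltMgr actually edits the buffer, so the "stripped" buffer below is only an illustration of a buffer that no longer matches a previously recorded EaLength.

```python
import struct

# FILE_FULL_EA_INFORMATION header: NextEntryOffset, Flags, EaNameLength, EaValueLength
HEADER = struct.Struct("<IBBH")

def pack_ea_chain(entries):
    """Pack (name, value) byte pairs as a chain of EA entries."""
    out = bytearray()
    for index, (name, value) in enumerate(entries):
        body = name + b"\x00" + value
        size = HEADER.size + len(body)
        padded = (size + 3) & ~3              # next entry starts 4-byte aligned
        last = index == len(entries) - 1
        out += HEADER.pack(0 if last else padded, 0, len(name), len(value))
        out += body
        if not last:
            out += b"\x00" * (padded - size)
    return bytes(out)

def walk_ea_chain(buffer):
    """Return (names, bytes_used) by following the NextEntryOffset links."""
    names, offset = [], 0
    while True:
        next_off, _flags, name_len, value_len = HEADER.unpack_from(buffer, offset)
        start = offset + HEADER.size
        names.append(buffer[start:start + name_len])
        end = start + name_len + 1 + value_len
        if next_off == 0:
            return names, end
        offset += next_off

# Hypothetical EA names; the real name FltMgr uses is not given in the post.
original = pack_ea_chain([(b"USER_EA", b"abc"),
                          (b"TARGETING_EA", b"\x01\x02\x03\x04")])
ea_length = len(original)     # what an upper filter's stack location recorded

# Illustration only: the shared buffer after the targeting entry is gone.
stripped = pack_ea_chain([(b"USER_EA", b"abc")])
names, used = walk_ea_chain(stripped)

print(names, used, ea_length)
```

A filter that re-sends the create with its old EaLength is now describing more bytes than the chain in the shared buffer actually contains, which is the inconsistency the file system trips over.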

Before ending this post, I'd like to point out that pretty much all layering issues require a legacy filter in the picture. In general FltMgr by itself with just minifilters is pretty good about it. Please note that even perfectly written legacy filters could trigger this issue (which is another way of saying that it isn't the legacy filter's fault); the real problem is that FltMgr breaks layering. What I would like FltMgr to do (in fact, what I wish it had done already) is to offer an API to legacy filters by which such a filter can tell whether a certain IRP or FILE_OBJECT is one they should ignore. Also, I think that FltMgr could address a large class of issues very easily by simply moving the targeting information to the FILE_OBJECT immediately after the IRP_MJ_CREATE completes in the file system.

Thursday, February 10, 2011

Filter Layering and IO Targeting in FltMgr

I've been talking about layering quite a lot on this blog. I've also mentioned how FltMgr performs IO targeting when a minifilter calls FltCreateFile in this post and how after such a FILE_OBJECT is created, targeting works even when using Zw APIs in this post. However, let's take a closer look at how it is actually implemented in FltMgr and what some of the implications of the design are.

As might be apparent from the previous links, there are two different kinds of targeting going on: the IO manager's targeting, which directs an IRP at the appropriate device, and FltMgr's targeting, which identifies the appropriate minifilter for that operation. Please take a look at the following picture, where the blue blocks represent devices (FltMgr's Frame0 and Frame1 and the attachments for the two legacy filters are all devices) and the red blocks are minifilters. The picture shows how an FltCreateFile request goes to the IO manager, and how it then finds the minifilter below the one issuing that call.

The steps involved in this are as follows:

  1. Minifilter2 calls FltCreateFile
  2. FltMgr allocates targeting information (fltmgr calls it TargetedIoControl) and inserts it into an ECP structure and then it calls IoCreateFileEx with the ECP and a device hint that points to the device for Frame0.
  3. IoCreateFileEx goes through the usual steps in the OB manager, the OPEN_PACKET is initialized, and it eventually gets to IopParseDevice
  4. IoMgr in IopParseDevice allocates an IRP_MJ_CREATE, attaches the ECP and sends it directly to the hint device, Frame0.
  5. FltMgr gets the IRP_MJ_CREATE on the device for Frame0, looks for the targeting ECP and extracts the TargetedIoControl from it and then it analyzes it and figures out that the first minifilter that should see this request is Minifilter1.
  6. Minifilter1's preCreate callback gets called.
  7. The IRP_MJ_CREATE is processed further by the IO stack, file system and so on
  8. The IRP_MJ_CREATE completes to the IO manager
  9. IoCreateFileEx returns to FltMgr
  10. The original call to FltCreateFile returns control to Minifilter2
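
Step 5 is the part where FltMgr's own targeting happens, and it can be pictured as a simple lookup in an ordered list. The Python sketch below is just a toy model of that decision. The frame layout (Minifilter3 on top, Minifilter1 at the bottom) follows the picture as described in these posts, and the function name and the "issuer" parameter are made up for the illustration.

```python
# Frame0 from the picture, listed top to bottom.
FRAME0 = ["Minifilter3", "Minifilter2", "Minifilter1"]

def filters_that_see(frame, issuer=None):
    """Return the minifilters in a frame that are shown a request.

    issuer is the minifilter recorded in the targeting information, or
    None when the frame finds no targeting information at all.
    """
    if issuer is None:
        return list(frame)                  # untargeted: start at the top
    return frame[frame.index(issuer) + 1:]  # targeted: only filters below

print(filters_that_see(FRAME0, "Minifilter2"))
print(filters_that_see(FRAME0))
```

The second call shows what a frame does when it cannot find any targeting information: every minifilter in it gets the request, including the one that issued it.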

Please note that things are a bit different in XP, for example instead of an ECP FltMgr uses an EA and instead of IoCreateFileEx FltMgr calls IoCreateFileSpecifyDeviceObjectHint. I will only focus on the behavior in Vista and newer releases but XP should be pretty similar anyway.

Let's take a look at how IO manager's targeting is implemented. Each FILE_OBJECT structure has something called a FileObjectExtension and there are some functions in the IO manager that can set things in the extension. Please note that this is not the same as the FILE_OBJECT context support added in Vista (and which is available through APIs like FsRtlLookupPerFileObjectContext and friends):

0: kd> dt nt!_FILE_OBJECT
   +0x000 Type             : Int2B
   +0x002 Size             : Int2B
   +0x004 DeviceObject     : Ptr32 _DEVICE_OBJECT
   …
   +0x07c FileObjectExtension : Ptr32 Void

0: kd> x nt!*Extension*
...
828b1d09 nt!IopAllocateFileObjectExtension = 
...
828a40e2 nt!IopGetFileObjectExtension = 
...
828c8fe7 nt!IopSetTypeSpecificFoExtension = 
...
82a686f0 nt!IopDeleteFileObjectExtension = 
82aa3238 nt!IopSymlinkSetFoExtension = 
...
82a6c1e7 nt!IopAllocateFoExtensionsOnCreate = 

So these extensions are of different types, for internal use by various OS components. The interesting function here is nt!IopAllocateFoExtensionsOnCreate which initializes some extensions whenever a FILE_OBJECT is initialized. For example, if a DeviceHint was specified then some specific extension is allocated and then the IO manager will always use that extension on the FILE_OBJECT to figure out which device an IO request needs to be sent to. So IO manager's targeting information is associated with the FILE_OBJECT immediately upon creation.

FltMgr takes a different approach. For one, it is not involved in FILE_OBJECT creation and so it doesn't know when the FILE_OBJECT is created. So the approach it takes is a bit more complex. Looking at the steps listed above for the picture, in step 9 FltMgr takes the TargetedIoControl structure that was associated with the IRP_MJ_CREATE and associates it with the FILE_OBJECT, before returning from FltCreateFile. In fact, the flow in FltCreateFile looks something like this:

  1. Allocate TargetedIoControl
  2. Call IoCreateFileEx with the DeviceHint pointing to the current FltMgr device and the TargetedIoControl
  3. When IoCreateFileEx returns, if the create was successful, associate the TargetedIoControl with the FILE_OBJECT.

This algorithm is also employed in cases where FltMgr needs to open a file itself (mostly in the Naming code) because it doesn't internally call FltCreateFile and instead it simply follows these steps. In fact let's take one more look in the debugger at the functions in FltMgr that are associated with targeting:

0: kd> x fltmgr!*target*
96050f68 fltmgr!FreeTargetedIoCtrl = 
960394b6 fltmgr!FltpGetIoTargetFromFileObject = 
...
96051114 fltmgr!TargetedIOCtrlGenerateECP = 
9605132e fltmgr!TargetedIOCtrlAttachAsFoCtx = 

So as you can see, we have a function to add a TargetedIoControl as an ECP (undoubtedly for the IRP_MJ_CREATE case) and as a FILE_OBJECT context (after the IRP_MJ_CREATE is complete), as well as a function to get the target of an IO operation from the FILE_OBJECT context (fltmgr!FltpGetIoTargetFromFileObject). There doesn't seem to be a function that figures out the target from an ECP so that's probably only handled inline.

The really important thing to note here is that this mechanism is different from the IO manager mechanism in that the FILE_OBJECT doesn't have FltMgr's targeting information associated with it until after IoCreateFileEx returns. So for a fair bit of time, between the moment when the IRP_MJ_CREATE is completed by the file system (and when the FILE_OBJECT becomes initialized) and the moment when the IoCreateFileEx call returns to FltMgr, the FILE_OBJECT is initialized but it doesn't have any FltMgr targeting information (it does however have IO manager's targeting information). We'll discuss the implications of this particular approach (and the whole class of issues it introduces) in the next blog post, as well as a couple of different approaches FltMgr could have used.
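
The window described above is easy to see in a toy model. The Python sketch below mimics the three-step flow with made-up names (FileObject, device_hint, flt_target and the callback are all illustrative, not real structures or fields) and records what any code running between the completion of the create and the return of IoCreateFileEx would find on the FILE_OBJECT.

```python
class FileObject:
    """Toy stand-in for a FILE_OBJECT; both fields are illustrative names."""
    def __init__(self, device_hint):
        self.device_hint = device_hint   # IO manager targeting, set at creation
        self.flt_target = None           # FltMgr targeting, attached later

def io_create_file_ex(device_hint, on_create_completed):
    """Model the IO manager: the hint is on the FILE_OBJECT from the start."""
    file_object = FileObject(device_hint)
    on_create_completed(file_object)     # code that runs before the call returns
    return file_object

def flt_create_file(issuer, frame_device, on_create_completed):
    """Model the three-step flow described in the post."""
    targeted_io_control = {"issuer": issuer}                            # step 1
    file_object = io_create_file_ex(frame_device, on_create_completed)  # step 2
    file_object.flt_target = targeted_io_control                        # step 3
    return file_object

seen_in_window = []

def observer(file_object):
    """Record what is visible on the FILE_OBJECT inside the window."""
    seen_in_window.append((file_object.device_hint, file_object.flt_target))

result = flt_create_file("Minifilter2", "Frame0", observer)
print(seen_in_window)
print(result.flt_target)
```

Inside the window the device hint is already there but the FltMgr target is still missing, so any IO issued on the FILE_OBJECT at that point reaches the right device but is treated as untargeted by the frame.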

Thursday, February 3, 2011

More contexts: tracking hardlinks

In one of the comments to my previous post, Lyndon pointed out that there is not a lot of support from either the OS or FltMgr when it comes to tracking hardlinks. So I figured I'd explain why this is so complicated and explain what a filter would need to do to implement this. I'm not going to describe what hardlinks are or how they operate, focusing instead on what FltMgr does and what a filter might need to do as well.

However, there is one particularity about hardlinks that I'll keep referring to. Once a file is opened the file system remembers which link was used to open the file and it will return that name when querying the file name. If that link is renamed, the FS will of course return the new name.

So the problem with hardlinks is, like Lyndon pointed out, that the SCB model isn't granular enough. The SCB is associated with the stream and it doesn't really matter how the stream was opened (by which name), the SCB is the same. So a StreamContext is the same, regardless of how many hardlinks were used. On the other hand, StreamHandleContexts are too granular, in that they simply track the FILE_OBJECT and different opens even from the same link (using the same name for the file) will obviously get different FILE_OBJECTs and thus different StreamHandleContexts.

Filter manager doesn't offer an additional type of context. However, it does need to deal with hardlinks because it implements a name cache. The name cache is pretty simple to implement for files that only have one name: the name is stored in a structure associated with the stream. However, for hardlinks, clearly the structure needs to be different so that opens for the same name are cached properly. FltMgr solves this problem by not caching the file name in a structure associated with the SCB if the file has more than one link (as reported by the FileStandardInformation information class) and instead it caches the name per FILE_OBJECT.
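
The caching policy described above boils down to choosing the cache key based on the link count. Here is a minimal Python sketch of that idea. It is not FltMgr's implementation, the class and method names are invented, and number_of_links stands for the NumberOfLinks value reported by FileStandardInformation.

```python
class NameCache:
    """Toy name cache: per stream for single-link files, per FILE_OBJECT otherwise."""
    def __init__(self):
        self.by_stream = {}
        self.by_file_object = {}

    def store(self, stream, file_object, number_of_links, name):
        if number_of_links > 1:
            self.by_file_object[file_object] = name   # hardlinked: per open
        else:
            self.by_stream[stream] = name             # one name: per stream

    def lookup(self, stream, file_object):
        if file_object in self.by_file_object:
            return self.by_file_object[file_object]
        return self.by_stream.get(stream)             # None means "ask the FS"

cache = NameCache()
cache.store("scb1", "fo1", 2, r"\dir\link_a")
cache.store("scb1", "fo2", 2, r"\dir\link_b")
cache.store("scb2", "fo3", 1, r"\dir\plain")

print(cache.lookup("scb1", "fo1"), cache.lookup("scb1", "fo2"))
print(cache.lookup("scb2", "fo4"))   # a different open of the single-link file
print(cache.lookup("scb1", "fo5"))   # a new open of the hardlinked file
```

A new open of the single-link file hits the per-stream entry, while a new open of the hardlinked file misses and has to query the file system, because nothing cached so far says which link it came in through.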

If a filter wanted to keep track of hardlinks it would need to, as Lyndon indicated in his comment, look at the name that it gets from the file system (the FileNameInformation class) and from that deduce which link was used. This is complicated because a link can be renamed at any time so that must be taken into account. A possible implementation would need to keep some structure in a perStream context that would map each FILE_OBJECT to a link (possibly introducing an artificial concept like linkID or linkGuid or something) and in postCreate would map the newly opened FILE_OBJECT to the appropriate link (which requires looking at link names using the FileHardLinkInformation class) while disabling renames for that stream.
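
As a rough illustration of the bookkeeping this would take, here is a Python sketch of the per-stream structure. Everything in it is hypothetical: the class, the link ID scheme and the callback names are made up, the opened name is assumed to have been obtained already, and the sketch ignores the locking (or rename blocking) a real filter would need around postCreate.

```python
import itertools

class StreamLinkTracker:
    """Per-stream bookkeeping that maps each FILE_OBJECT to a link ID."""
    def __init__(self):
        self._next_id = itertools.count(1)
        self.link_id_by_name = {}          # current link name -> link ID
        self.link_id_by_file_object = {}   # FILE_OBJECT -> link ID

    def on_post_create(self, file_object, opened_name):
        """Map a newly opened FILE_OBJECT to the link it was opened through."""
        if opened_name not in self.link_id_by_name:
            self.link_id_by_name[opened_name] = next(self._next_id)
        link_id = self.link_id_by_name[opened_name]
        self.link_id_by_file_object[file_object] = link_id
        return link_id

    def on_rename(self, old_name, new_name):
        """Keep the link ID stable when a link is renamed."""
        self.link_id_by_name[new_name] = self.link_id_by_name.pop(old_name)

    def on_close(self, file_object):
        """Forget a FILE_OBJECT once it is closed."""
        self.link_id_by_file_object.pop(file_object, None)

tracker = StreamLinkTracker()
first = tracker.on_post_create("fo1", r"\a\x.txt")
second = tracker.on_post_create("fo2", r"\b\y.txt")
tracker.on_rename(r"\a\x.txt", r"\a\z.txt")
third = tracker.on_post_create("fo3", r"\a\z.txt")
print(first, second, third)
```

The point of the artificial link ID is that "fo1" and "fo3" end up on the same link even though they were opened with different names, because the rename was folded into the map before the second open.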

I was planning on writing more on this topic and playing with hardlinks some more, but I'm busy at work and it'll have to wait for a future post.

Thursday, January 27, 2011

Contexts in legacy filters and FSRTL_ADVANCED_FCB_HEADER

I've seen some questions about how a legacy filter can implement contexts similar to the ones fltmgr provides for a minifilter.

So what is a context? A context is a structure that is owned by some system component (in our case a filter, legacy or mini) that is associated with some other structure. In a very general way, a context is a "value" and the object that it is associated with is a "key". In general contexts are necessary when the flow of execution is controlled by some other component in the system than the one that implements the actual code (for example for callbacks and services and library functions, where the code is provided by the library or service, but when the code is called depends on something else). Anyway, because the context is simply a key-value pair, anyone can implement a generic context mechanism by using hashes, and this allows great flexibility in what one can attach a context to. For example, one can associate a context with a thread or a logged on user or even a sector on a volume if they feel so inclined. One issue with this approach is how to know when the underlying object is released so that the context can be released as well. For example, if a context is associated with thread 128 and then thread 128 terminates and then at some later point in time another thread is created with the same ID of 128, clearly the context should be released since it's not referring to the same underlying object, but unless the entity implementing the context is notified that thread 128 was terminated, it won't know to release it.
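
The thread 128 example is easy to reproduce with a plain dictionary. The Python sketch below is a generic illustration of the key-value idea and of why the release notification matters. The class and its method names are invented for the example.

```python
class ContextTable:
    """Generic key -> context map with an explicit release notification."""
    def __init__(self):
        self._contexts = {}

    def set(self, key, context):
        self._contexts[key] = context

    def get(self, key):
        return self._contexts.get(key)

    def on_object_released(self, key):
        """Must be called when the underlying object goes away."""
        self._contexts.pop(key, None)

table = ContextTable()
table.set(128, "context for the first thread 128")

# Thread 128 terminates and, later, a new thread is given the ID 128.
stale = table.get(128)            # no notification: the old context is returned

table.on_object_released(128)     # what the termination notification would do
fresh = table.get(128)            # the new thread 128 starts without a context

print(stale, fresh)
```

Without the on_object_released call the table has no way to tell the two threads apart, since the key is the only thing it knows about the object.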

So returning to filters, filter manager offers support for the following types of contexts (at least, these are the ones that are typically interesting; the other contexts can usually be implemented fairly easily by legacy filters): Streams, StreamHandles and Files. Let's look at how each of these contexts can be implemented. These are just examples about how it could be done with little support from the OS, but it's definitely not the best way it can be done… I'll address that after this section.

StreamHandle contexts

In terms of implementation in a legacy filter, the StreamHandle is probably the easiest to implement since the key is the FILE_OBJECT and the time to remove the context is during IRP_MJ_CLOSE. Of course the context can be created either the first time the FILE_OBJECT is seen by the filter in an operation or when the filter processes the IRP_MJ_CREATE. Because of stream file objects the filter can't assume that it will always see an IRP_MJ_CREATE for each FILE_OBJECT, so a filter must always be prepared to get a FILE_OBJECT that it hasn't seen an IRP_MJ_CREATE for.

Stream contexts

The key for this type of context is the SCB, so whatever the FILE_OBJECT->FsContext member points to is a good key. Unfortunately, FILE_OBJECT->FsContext is not initialized until the file system processes the IRP_MJ_CREATE and opens the stream on disk, which means that a Stream context isn't available in preCreate (the same restriction as for minifilters). The more complicated part is how to know when the SCB is freed by the file system. One way to do this is to simply keep track of all the FILE_OBJECTs that the filter is interested in that all reference that SCB and then when the last FILE_OBJECT is processing its IRP_MJ_CLOSE, free the context associated with the SCB. This is a bit more complicated than in the StreamHandle context, but not much more so. One notable thing is that since the SCB is a structure that belongs to the file system, it is possible that some file system is implemented in such a way that the address of the SCB changes throughout the lifetime of an SCB (for example, the FS can copy the SCB to a different memory location under some circumstances). I haven't seen this in practice and there may be other issues with it (since the OS uses some fields in the FSRTL_COMMON_FCB_HEADER) but I haven't seen anything definitive that disallows it either.
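
The "free the context on the last close" scheme above can be sketched in a few lines. This is a Python model of the bookkeeping only, with invented names. fs_context stands for the FILE_OBJECT->FsContext value used as the key, and the sketch ignores synchronization as well as the possibility that the SCB outlives the last FILE_OBJECT.

```python
class StreamContexts:
    """Stream contexts keyed by FsContext, freed on the last tracked close."""
    def __init__(self):
        self._contexts = {}       # fs_context -> context
        self._file_objects = {}   # fs_context -> set of tracked FILE_OBJECTs

    def track(self, fs_context, file_object, make_context):
        """Call from postCreate or the first time a FILE_OBJECT is seen."""
        self._file_objects.setdefault(fs_context, set()).add(file_object)
        if fs_context not in self._contexts:
            self._contexts[fs_context] = make_context()
        return self._contexts[fs_context]

    def on_close(self, fs_context, file_object):
        """Call from IRP_MJ_CLOSE; returns True when the context was freed."""
        tracked = self._file_objects.get(fs_context)
        if tracked is None:
            return False
        tracked.discard(file_object)
        if tracked:
            return False              # other FILE_OBJECTs still use the stream
        del self._file_objects[fs_context]
        del self._contexts[fs_context]
        return True

streams = StreamContexts()
ctx1 = streams.track("scb", "fo1", dict)
ctx2 = streams.track("scb", "fo2", dict)
freed_first = streams.on_close("scb", "fo1")
freed_last = streams.on_close("scb", "fo2")
print(ctx1 is ctx2, freed_first, freed_last)
```

Since track can be called for a FILE_OBJECT that was never seen in IRP_MJ_CREATE, the same entry point also covers stream file objects that show up for the first time in some other operation.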


File contexts

For file systems that implement alternate data streams (ADS) it might be important to know whether a stream belongs to the same file or not. In this case, the key for the context must be something that identifies the file. For example, if the file ID is guaranteed to be unique for the lifetime of the file (which is true for NTFS for example but is not true for the FASTFAT implementation; however, FASTFAT doesn't support alternate data streams so it doesn't really matter from this perspective) then the file ID can be used as a key. In terms of removing the context, it depends on the structure that was used as the key. For example, if the file ID is used, then the context would need to be removed when the file is deleted (and detecting that is a complicated problem in itself).


Fortunately the nice folks at MS decided to offer some help to filter writers and developed some support APIs. They are covered in the MSDN pages "Tracking Per-Stream Context in a Legacy File System Filter Driver" (which is currently here) and "Tracking Per-File Context in a Legacy File System Filter Driver" (which is here). These APIs rely on the file system implementing support for the FSRTL_ADVANCED_FCB_HEADER structure. Please note that a file system is not required to implement this support but if it doesn't then it won't work with Filter Manager. Anyway, these APIs allow any kernel component (filter or not) to associate a context with an SCB and to be notified when the SCB itself is torn down. Please note that the SCB might not be torn down immediately when the last FILE_OBJECT for it is closed, because some file systems implement SCB caching and the filter might be able to benefit from this (benefit from it because it can keep its context and if someone opens a new handle to the same stream the filter's context is also cached).

There is another useful structure when implementing contexts, the RTL_GENERIC_TABLE (MSDN page currently here). A generic table is an OS structure that can be used as a general purpose lookup table, so that the filter doesn't need to implement its own. However, please note that it is implemented as a tree so if performance must be really good then a custom hash might still be necessary.

To wrap it up, a filter can implement something similar to FltMgr's contexts as follows:
  • Use OS support for stream contexts (FsRtlInsertPerStreamContext, FsRtlLookupPerStreamContext and so on)
  • Use OS support for file contexts (FsRtlInsertPerFileContext, FsRtlLookupPerFileContext and so on)
  • Implement a hash for per FILE_OBJECT context. Either use a straight hash or use a per Stream structure which includes a hash for FILE_OBJECTs for that stream (which is useful because the number of entries in each hash is much smaller so the RTL_GENERIC_TABLE might be a good fit).

Finally, I'd like to point out that any filter (legacy or mini) that implements its own streams (that completes an IRP_MJ_CREATE and puts something in FILE_OBJECT->FsContext) should implement support for FSRTL_ADVANCED_FCB_HEADER otherwise contexts won't work for those files and it might cause problems for other filters. This should be fairly easy to implement though following the MSDN documentation.

Thursday, January 20, 2011

About IRP_MJ_CREATE and minifilter design considerations - Part VI

I'm pretty much done with what I wanted to cover about IRP_MJ_CREATE. I'd just like to go through a couple more things that I think are important before closing this topic.

FILE_DELETE_ON_CLOSE create option

This create option sets a flag associated with the current FILE_OBJECT, in a file system structure associated with the FILE_OBJECT itself and not the stream. There is no way to query whether this flag was set after the fact. Once the FILE_OBJECT is cleaned up, the flag moves to the SCB (a per stream structure) and it can be queried using IRP_MJ_QUERY_INFORMATION and FileStandardInformation. The same flag can be set on a stream by an IRP_MJ_SET_INFORMATION with the FileDispositionInformation information class. Please note that if there are no other FILE_OBJECTs for that same stream at the time when the FILE_OBJECT that was created with FILE_DELETE_ON_CLOSE is closed, then the flag will be moved to the SCB and the stream will immediately be deleted, so there is no opportunity for a filter to query the flag or remove it. Filters that want to be able to potentially clear the "delete intent" from a file can do something like remove the FILE_DELETE_ON_CLOSE flag from the CreateOptions and then in postCreate set it on the stream with an IRP_MJ_SET_INFORMATION. This is not exactly the same as FILE_DELETE_ON_CLOSE, but it's a pretty good approximation. It also allows the delete on close flag in the SCB to be queried and possibly reset at any time.
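The bit manipulation behind "remove the flag in preCreate, remember the intent for postCreate" can be sketched as a small user-mode model. It assumes the usual packing of the create parameters (CreateOptions in the low 24 bits of Options, the disposition in the high 8 bits); DeleteIntent and both function names are hypothetical, and the actual IRP_MJ_SET_INFORMATION call is deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define FILE_DELETE_ON_CLOSE 0x00001000u  /* create option value from the WDK headers */
#define CREATE_OPTIONS_MASK  0x00FFFFFFu  /* low 24 bits hold CreateOptions */

/* Hypothetical record a filter would hand from preCreate to postCreate. */
typedef struct DeleteIntent {
    bool delete_requested;
} DeleteIntent;

/* Models preCreate: remember whether the caller asked for delete on close,
 * then return the Options value with that bit cleared. */
uint32_t strip_delete_on_close(uint32_t options, DeleteIntent *intent)
{
    assert(intent != NULL);
    intent->delete_requested =
        ((options & CREATE_OPTIONS_MASK) & FILE_DELETE_ON_CLOSE) != 0;
    return options & ~FILE_DELETE_ON_CLOSE;
}

/* Models the postCreate decision: the disposition should only be set on the
 * stream if the create succeeded and the caller originally asked for it. */
bool should_set_disposition(const DeleteIntent *intent, bool create_succeeded)
{
    assert(intent != NULL);
    return create_succeeded && intent->delete_requested;
}
```

Only the FILE_DELETE_ON_CLOSE bit is touched, so the disposition and every other create option reach the file system unchanged.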

STATUS_REPARSE in postCreate

A filter can return STATUS_REPARSE in postCreate. It can do so if the create failed or even if it was successful, provided that the filter takes care of undoing what was done in the file system (see FltCancelFileOpen and IoCancelFileOpen).

FltGetFileNameInformation behavior

FltGetFileNameInformation can be called during a create, both in preCreate and postCreate. Calling FltGetFileNameInformation might result in FltMgr actually opening the file if the file doesn't have a path (open by ID), but there should be no open to the actual file in any other case. If the caller is asking for a normalized path in preCreate, FltMgr will try to open the parent directory and enumerate its entries in order to get the long file name. However, if the file doesn't exist (if the IRP_MJ_CREATE is actually trying to create a file) then it is possible that even the normalized name contains a short name as the final component (for example, if a filter is trying to create "/Foo/Ba~1.txt" then the normalized path will have Ba~1 as a final component; everything else in the path should be normalized though). There is a really big performance hit associated with requesting a normalized name in preCreate, so it should be avoided if possible (it might not be possible in all cases, but perhaps the request can be moved to postCreate or maybe the opened name will do). The perf hit is much smaller when getting a normalized path in postCreate, primarily because of the cache.

Contexts in preCreate

Before the IRP_MJ_CREATE hits the file system the FILE_OBJECT is not associated with a file system stream, so any mechanism that requires the SCB will not function. For minifilters this includes file related contexts (stream, streamhandle, file), the name cache (hence the perf penalty when getting a normalized name) and possibly other things. Please note that because of renames, even if a minifilter opens a file with the same name in preCreate and then lets the IRP_MJ_CREATE continue, there is no guarantee that they're going to be opening the same stream. This is one reason security products should not attempt to scan files in preCreate (because there is no way to guarantee that what they scanned will be the stream that the original IRP_MJ_CREATE will end up opening).

Opening a new FILE_OBJECT for an existing FILE_OBJECT

Sometimes a minifilter needs a new handle to the same stream that a user has opened (call the user's FILE_OBJECT FO1). Rather than getting the file name of the user's file and then calling FltCreateFile with that name, a minifilter can simply call create (IoCreateFile, ZwCreateFile) with an empty name and use a handle to FO1 as the RootDirectory handle when setting up the OBJECT_ATTRIBUTES structure. This results in an IRP_MJ_CREATE where the FILE_OBJECT->FileName is empty and FILE_OBJECT->RelatedFileObject is FO1, and the file system will simply open a new handle to the same stream. This is a much better approach because it doesn't require using file names, so there is no hit associated with FltGetFileNameInformation, and it is also not vulnerable to renames of the original file. Of course, the user's FILE_OBJECT must already be open (the create for FO1 must have completed successfully).

Writing to read-only files

A pretty interesting behavior of file systems is that when an IRP_MJ_CREATE creates a read-only file, the handle associated with that IRP_MJ_CREATE can be used to write to the file. This is interesting because if a filter tries to open the same file the user has opened in postCreate and it is using the same parameters, it doesn't necessarily mean it will get the same rights, depending on whether the file existed before that IRP_MJ_CREATE or not.

FileObject->FileName is not meaningful after a successful IRP_MJ_CREATE

Because FileObject->FileName is only a vehicle to pass the name information from the IO manager to the file system, once the IRP_MJ_CREATE actually reaches a file system and a stream is opened, it should be ignored. This is because once the FILE_OBJECT is associated with an SCB, the name of that SCB can immediately change (a rename on another FILE_OBJECT for that SCB) and the entity that knows the name of the SCB at all times is the file system, but there is no mechanism for a file system to go in and update all FILE_OBJECTs associated with an SCB when the name changes. As a side note, I still believe that the FILE_OBJECT structure would have been better off without a FileName member and that the FileName should have been a member of the IRP_MJ_CREATE.

FILE_OBJECT->RelatedFileObject is not recursive

In an IRP_MJ_CREATE, if FILE_OBJECT->RelatedFileObject is not null, then that related FILE_OBJECT (RFO) cannot itself have a RelatedFileObject. However, since the RFO has already been opened, one cannot rely on its FileName member (see above), and so whether it had a RelatedFileObject or not is irrelevant.

SL_OPEN_TARGET_DIRECTORY in preCreate

SL_OPEN_TARGET_DIRECTORY means that this create is actually targeted at the parent directory of the FILE_OBJECT->FileName path (if FILE_OBJECT->FileName is "\foo\bar\baz" and SL_OPEN_TARGET_DIRECTORY is set then the SCB that will be associated with this FILE_OBJECT is for "\foo\bar"). FltGetFileNameInformation in preCreate is aware of this and it will actually return the name "\foo\bar". So if a minifilter needs to get the full path even when SL_OPEN_TARGET_DIRECTORY is set, it must remove this flag before calling FltGetFileNameInformation (and set it back before sending the IRP_MJ_CREATE down, of course).
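The naming rule can be illustrated with a tiny user-mode model that computes which part of the path the create really targets. It is a sketch only: effective_target_length is a hypothetical helper, it works on plain C strings rather than UNICODE_STRINGs, and it ignores relative opens through RelatedFileObject.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SL_OPEN_TARGET_DIRECTORY 0x04u  /* IO_STACK_LOCATION flag value from the WDK headers */

/* Returns the length of the prefix of `path` that names the object the
 * create will actually open. With SL_OPEN_TARGET_DIRECTORY set, that is the
 * parent directory of the final component. */
size_t effective_target_length(const char *path, unsigned flags)
{
    assert(path != NULL);
    size_t len = strlen(path);
    if ((flags & SL_OPEN_TARGET_DIRECTORY) == 0) {
        return len;  /* ordinary create: the whole name is the target */
    }
    /* Drop the final component, stopping right after the last separator. */
    while (len > 0 && path[len - 1] != '\\') {
        len--;
    }
    /* Drop the separator as well, unless the parent is the root directory. */
    if (len > 1) {
        len--;
    }
    return len;
}
```

For "\foo\bar\baz" with the flag set, the returned length covers "\foo\bar", which matches the name described above.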

Thursday, January 13, 2011

About IRP_MJ_CREATE and minifilter design considerations - Part V

In this post I want to talk about the IoCreateStreamFileObject API, as well as the difference between what is normally referred to as an FCB or an SCB and a FILE_OBJECT. But before that I'm going to rant a bit about my favorite subject, namespaces :).

Any stateful communication protocol needs a way to identify the connection once it has been established. In a lot of cases the requestor of the service initiates the connection and receives back a token that identifies the connection to the provider of the service. This token belongs to the namespace of the service provider. This is the case with file handles (a caller requests a file to be opened and they get back a token which they can then use when requesting reads and writes), network connections (where the token is a socket), web pages (the user logs on to a server, receives a session token or a cookie) and many other things. However, the protocol could also go the other way around, where the user could create a token (which would then belong to the user's namespace) and pass it in with the session initiation request. In such a scenario, opening a file would be more like "hey file system, I plan to read from a file and I will use the value 123 for it, so whenever you see value 123 know that I'm talking about that file". So as you can see, an important question when designing a protocol is who should learn the other's context. Should it be the service provider or the service requestor? It's pretty easy with people because they can never remember someone else's context (a token that someone else gives them), so in that case whichever end of the protocol needs human interaction should be the one that generates the token. For example, each file in a file system can be opened by ID, but humans still prefer names even though they could in theory learn the ID. Or when browsing a web page people don't remember the URL at the top for each page they visit, even for web pages where that URL doesn't change. In fact the reason Google is such a big company is because they figured out that people don't remember URLs even when they aren't random characters, but instead people remember key phrases about the page they're looking for (that's their token, not the URL).
Going even further one could argue that the whole history of computer science is the history of building contexts for people. Assembly language was a way to associate names that were meaningful to people with memory locations (so instead of the human operator remembering the address, which is the machine's context, the human operator would remember a name and the assembler would convert that name into the address). A file system is a way to associate disk locations with a name and so on.

Anyway, now that my rant is over, let's get back to IRP_MJ_CREATE. IRP_MJ_CREATE is the type of protocol where the IO manager tells the file system to open something and it also tells it the token it's going to use to refer to it in the future, the FILE_OBJECT. The IO manager allocates a FILE_OBJECT that it will use to identify that stream and it needs to tell the file system about it. However, the file system also needs a context associated with that stream. It needs to know where the stream is located on disk, whether it is encrypted and so on. This is all very specific to the file system (clearly only a file system that supports encryption will need to know whether the stream is encrypted) and so there is no general purpose structure that all the file systems can use. Therefore each file system needs to keep its own internal structure for streams it opens and it needs to learn how to associate the FILE_OBJECT with its internal structure. It could implement a key-value structure where the FILE_OBJECT is the key and the file system internal structure is the value, but that would potentially be time consuming (the key lookup would need to happen for each operation). The decision was made to allocate a field in the FILE_OBJECT to be used by the file system to store this context, and that field is FILE_OBJECT->FsContext. The way the protocol works is that during IRP_MJ_CREATE FsContext is NULL and when the request reaches the file system, the file system will allocate its internal context and store a pointer to it in FsContext. In other words IRP_MJ_CREATE is a mechanism that allows a file system to initialize its fields in the FILE_OBJECT.
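The FsContext protocol can be modeled in a few lines of user-mode C. Everything here is hypothetical (ModelFileObject, ModelScb, ModelFileSystem and both functions are illustration only, and a real file system's SCB holds far more state), but the shape is the one described above: the create path finds or allocates the per-stream structure and stores a pointer to it in the FILE_OBJECT, and every later operation gets to it with a plain pointer read instead of a key lookup.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for FILE_OBJECT: only the field under discussion. */
typedef struct ModelFileObject {
    void *FsContext;   /* NULL until the file system handles the create */
} ModelFileObject;

/* Hypothetical stand-in for a file system's private per-stream structure. */
typedef struct ModelScb {
    const char *name;
    int open_count;
} ModelScb;

#define MODEL_MAX_STREAMS 8

typedef struct ModelFileSystem {
    ModelScb streams[MODEL_MAX_STREAMS];
    size_t count;
} ModelFileSystem;

/* Models IRP_MJ_CREATE reaching the file system: find or allocate the SCB
 * for `name`, then publish it in FsContext. Returns 0 if the table is full. */
int model_create(ModelFileSystem *fs, ModelFileObject *fo, const char *name)
{
    assert(fs != NULL && fo != NULL && name != NULL);
    assert(fo->FsContext == NULL);   /* the protocol: NULL during create */

    ModelScb *scb = NULL;
    for (size_t i = 0; i < fs->count; i++) {
        if (strcmp(fs->streams[i].name, name) == 0) {
            scb = &fs->streams[i];
            break;
        }
    }
    if (scb == NULL) {
        if (fs->count == MODEL_MAX_STREAMS) {
            return 0;
        }
        scb = &fs->streams[fs->count++];
        scb->name = name;
        scb->open_count = 0;
    }
    scb->open_count++;
    fo->FsContext = scb;
    return 1;
}

/* Models any later operation: no lookup, just read the pointer back. */
ModelScb *model_scb_from_file_object(const ModelFileObject *fo)
{
    assert(fo != NULL);
    return (ModelScb *)fo->FsContext;
}
```

Two FILE_OBJECTs opened for the same name end up with the same FsContext value, which is why FsContext identifies the stream rather than the individual open.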

This internal context that identifies a stream to the file system is traditionally called an FCB (file control block) because there used to be only one stream per file. However, when file systems added the ability to associate multiple streams of data with a file (alternate data streams, or ADS), the file system needed to be aware of the distinction between the stream and the file. In such file systems (NTFS and UDFS are examples) the FCB actually means "context associated with the file" and what traditionally used to be the FCB is now called an SCB (stream control block). On a final note about SCBs, it is worth mentioning that they aren't really completely private and that in fact the OS cares about some information associated with the SCB. As such, all SCBs should start with an FSRTL_ADVANCED_FCB_HEADER. So for any file system developers, please make sure to implement this. I've done it for file systems that didn't support it and it can probably be done in a couple of hours. Without this your file system won't support FltMgr and minifilters and will probably break some legacy filters as well (there are other negative side effects too).

So now that I mentioned that the IRP_MJ_CREATE can be seen as a way to associate a FILE_OBJECT with the SCB, let's talk about IoCreateStreamFileObject. A file system needs to track a lot of metadata. Some of it is related to user files (names, directory structures, permissions) and some is internal to the file system (transaction log, journal). This metadata can be split into logical units (the journal, the transaction log, the directory information, the permissions hash) and they must be dynamic. So it makes sense that a file system would treat most of this data as if they were user files. In this way it can reuse a lot of the code it already has implemented for reads and writes and so on. So when a user does something like enumerate a directory, the file system can simply say "open the stream associated with the directory and read it all" using its internal functions that deal with converting file offsets into disk offsets and so on. However, reading metadata from disk for every user operation will make things very slow so a file system might prefer to cache things. It could of course allocate memory and remember which data was more frequently accessed and it might decide that if there is memory pressure the size of the cache should decrease, but there is already a component in the system that does all that, the Cache Manager. So if the file system could use the Cache Manager then it could benefit from all the logic in there. But the Cache Manager is a system component and it doesn't know anything about SCBs (which are specific to each file system). So in order to keep things generic, it would be nice if the file system could use FILE_OBJECTs for its internal streams and then use all the facilities in the OS that use FILE_OBJECTs.

Of course, creating FILE_OBJECTs for internal streams is a noble goal, but how does one create such FILE_OBJECTs? Allocating memory and initializing the structure by hand is just asking for trouble, since the structure is different between OS versions, and if we don't call ObCreateObject (which is not documented) then we're probably going to break some OB integration anyway. One possible solution would be to call IoCreateFile from the file system. However, not all internal streams have names and while the file system could do something like use ECPs or allocate GUIDs as file names and use those as keys, this would still be a pretty ugly hack. Moreover, as we've discussed above, the IRP_MJ_CREATE is nothing but a way for the IO manager to tell the file system which internal stream to associate with a FILE_OBJECT, but since the file system already knows exactly which stream it wants to open, why even have an IRP_MJ_CREATE? What a file system needs is a way to request a new FILE_OBJECT from the IO manager, which it then can associate with the right internal structure. IoCreateStreamFileObjectEx is an API to do just that. There are some examples in the WDK about how to call it and when it should be used.

This is a brief overview of what IoCreateStreamFileObjectEx does (IoCreateStreamFileObject simply calls IoCreateStreamFileObjectEx with a NULL handle):
  1. Call ObCreateObject to create the actual FILE_OBJECT
  2. Set up a minimal set of the FILE_OBJECT fields
  3. Set the FO_STREAM_FILE flag in the FILE_OBJECT
  4. Call ObInsertObjectEx to create a handle for the FILE_OBJECT
  5. If the caller passed in a NULL pointer, close the handle

From a file system filtering perspective, the implication is that filters should expect IO and possibly other operations on FILE_OBJECTs that they haven't seen an IRP_MJ_CREATE for. Depending on the filter's functionality the filter might want to ignore such FILE_OBJECTS. Since all stream FILE_OBJECTs have the FO_STREAM_FILE flag set in the FILE_OBJECT->Flags, checking for this flag is a pretty reliable way to identify such FILE_OBJECTs.
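The flag check itself is a single bit test. The sketch below models it in user-mode C; SketchFileObject is a hypothetical stand-in that carries only the Flags field, and whether a given filter should really skip stream file objects depends on what the filter does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define FO_STREAM_FILE 0x00000100u  /* FILE_OBJECT flag value from the WDK headers */

/* Hypothetical stand-in for FILE_OBJECT: only the Flags field is modeled. */
typedef struct SketchFileObject {
    uint32_t Flags;
} SketchFileObject;

/* Returns true for FILE_OBJECTs created through IoCreateStreamFileObject(Ex),
 * i.e. ones the filter never saw an IRP_MJ_CREATE for. */
bool is_stream_file_object(const SketchFileObject *fo)
{
    assert(fo != NULL);
    return (fo->Flags & FO_STREAM_FILE) != 0;
}
```

A filter that keeps per-FILE_OBJECT state set up in postCreate could call a check like this at the top of its other callbacks and pass the operation through untouched when it returns true.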