Thursday, February 17, 2011

Filter Layering and IO Targeting in FltMgr - part II

Let's look at how targeting in filter manager can fail and what it looks like when it happens. I'd like to say that while such things do happen and I've analyzed a couple of cases over the years, they are far less frequent with minifilters than with legacy filters. Anyway, layering violations in some cases can just go unnoticed, but if they do cause trouble most likely what will happen is an infinite recursion which will result in a bugcheck. Deadlocks can also happen, but the cases I've investigated so far were all infinite recursions.

One way layering can fail is when a FILE_OBJECT is used in postCreate by a minifilter and the minifilter is not using Flt APIs to perform the IO. For example, using the setup from the previous blog post, if minifilter1 in postCreate calls a Zw function or creates a handle for a user mode app and lets the user mode app use that handle to do something like scan the file or recall it or decompress it, what will happen is that the requests generated by the user mode service or the Zw calls will go to Frame0 because the FILE_OBJECT stores the information about the device hint, but FltMgr will not find its targeting structure and will show the requests all minifilters in the frame starting with Minifilter 3. However, this is a clear violation of the minifilter rules because minifilter1 broke the layering contract by sending IO to the top of the IO stack. If it wants to do this sort of thing it needs to call its own FltCreateFile and after that create succeeds it can either use a Zw API or create a handle for that FILE_OBJECT to be used by a user mode service.

Now, if filter manager would always use a device hint things wouldn't be too bad. However, there is a case where FltMgr violates the rule of never sending IO to the top of the stack. This case is in the naming path (and more precisely in the FltpExpandFilePathWorker function). When a minifilter calls FltGetFileNameInformation and requests a normalized name, FltMgr gets a name for the file and then proceeds to normalize it. It does so by opening folders along the path to the file and querying information that contains the long name for each component on the path. In this case however, FltMgr does use a targeting structure to identify which minifilters should see the create, but it does not use a DeviceHint for the IoCreateFileEx call. The reason, as far as I can tell, is that the request will fail if the name contains a reparse point that reparses to a different volume (remember that bit about IopCheckTopDeviceHint a couple of posts back ?) so FltMgr just sends it to the top of the stack.

So in this case the IRP_MJ_CREATE is not targeted in the IO manager (it will go to the top of the IO stack) but there will be FltMgr targeting information attached to the IRP_MJ_CREATE. Looking at our picture from the previous post, we can see that if the IRP_MJ_CREATE issued by (or on behalf of) minifilter2 is one of these creates then Frame1, Legacy Filter B and Legacy Filter A will all see the IRP_MJ_CREATE and the subsequent requests. Frame1 will find the targeting information and infer that no minifilter in that frame should see the request and just send it below. However, Legacy Filter B and Legacy Filter A will not be aware that they shouldn't see this request and will perform their usual functions. So, since the targeting information has not been attached to the FILE_OBJECT yet (it will only happen when IoCreateFileEx returns to FltMgr) and there is no DeviceHint on the FILE_OBJECT if Legacy Filter B issues any requests to the device below them, then not only will legacy filter A see that request (which is expected), but also when the request reaches Frame 0 all the minifilters will see it. But minifilter3 and minifilter2 should not have seen it an in fact they haven't seen the IRP_MJ_CREATE as well so there are quite a few things that can go wrong here.

Another interesting case I've seen was when a legacy filter (let's use Legacy Filter B again as an example) tried to implement something similar to FltReissueSynchronousIo in the minifilter world and in its postCreate, if the IRP_MJ_CREATE failed, it changed something in the request and sent it down again. This worked well on Vista and newer but in XP it failed. As you remember, in XP the targeting information is stored in an EA, and for whatever reason the EA mechanism was designed such that the structure is associated with an IRP_MJ_CREATE by storing the EaLength in the IO_STACK_LOCATION and the EA buffer in the IRP (Irp->AssociatedIrp.SystemBuffer). This is in my opinion a pretty poor design, because it suggests that the EaLength can be layered but the EA buffer cannot. It also means there is only one EA buffer per IRP and so FltMgr must use EA chaining. When FltMgr receives such a create it needs to remove its EA information from the buffer before it sends the request down. However, once the EA buffer has been changed if a legacy filter sends the request down again in the manner we explained then FltMgr (in Frame0 in our example) will not find the targeting information and will therefore show the information to all the minifilters in the frame, which can also result in a layering violation. Moreover, the EaLength and the EA buffer are now out of sync and the file system might not like this. For a very clear example of this issue looks like, please read this thread: http://www.osronline.com/showthread.cfm?link=187295. Please note that in this thread though we're not dealing with recursion but with the file system not being able to cope with the EaLength and the EA buffer being in an inconsistent state. Though infinite recursion could still have happened if there were more minifilters installed on the system.

Before ending this post, I'd like to point out that pretty much all layering issues require a legacy filter in the picture. In general FltMgr by itself with just minifilters is pretty good about it. Please note that even perfectly written legacy filters could trigger this issue (which is another way of saying that it isn't the legacy filter's fault), the real problem is that FltMgr breaks layering. What I would like FltMgr to do (in fact, what I wish it had done already) is to offer an API to legacy filters by which such a filter can tell whether a certain IRP or FILE_OBJECT is one they should ignore. Also, I think that FltMgr could address a large class of issues very easily by simply moving the targeting information to the FILE_OBJECT immediately after the IRP_MJ_CREATE completes in the file system.