Thursday, September 8, 2011

File IO Redirection Between Volumes Using FltMgr

So what do I mean by IO redirection? File system filters occasionally need to redirect an operation to a different file (possibly on a different volume) than the file the request was originally issued for. Please note that if all operations need to be redirected then there are alternative mechanisms available. I'd say that in general one can classify redirection based on how often it needs to happen, and there are methods specific to each class:

  • Always redirect to the same file - in this case one way to achieve this is to use hardlinks.
  • Always redirect to a file but the target of the redirection might change - symlinks might be just the thing for this.
  • If a file is opened in a certain way (for example opened by a special process or class of processes) then redirect all IO to a different file (but all IO goes to the same file) - a filter that detects when the file was opened in that way and returns STATUS_REPARSE.
  • After a file is opened redirect only certain operations to a different file (for example have just some reads from a certain region come from a different file) or to multiple different files - this is where IO redirection might be necessary...

In a legacy filter this can be achieved by changing the IO_STACK_LOCATION parameters (like the FileObject) and by sending the IO to a different device when calling IoCallDriver to send the IRP below.

FltMgr offers the same ability to redirect an operation to a different file or even to a different volume. Pretty much all a filter has to do is set FLT_CALLBACK_DATA->Iopb->TargetFileObject to a different FILE_OBJECT, and FltMgr will make sure that all the layers below the filter see the request on that new FILE_OBJECT. Unfortunately, things aren't that easy when trying to redirect to a different volume. The filter would also need to change FLT_CALLBACK_DATA->Iopb->TargetInstance, but that requires a bit of extra work that we're going to explore in this post.

Before we go on I'd like to point out a couple of things. First, a minifilter must never redirect IO to an instance of a different filter or to an instance of its own at a different altitude (which implies it must not redirect to an instance on the same volume either). This is not supported (and I think Verifier will complain). Also, please note that the extra work involved is not specific to minifilters. Legacy filters need to do the same thing when redirecting to a different device; it just involves different APIs for minifilters.

So what is the problem with just changing the TargetInstance? The problem is that instances are associated with frames, which are simply FltMgr devices attached to a volume. In one of my earlier posts I mentioned that the number of devices between frames on different volumes might be different. For example, if a legacy filter attaches only to the D: drive, it's possible that on the C: volume the FltMgr frames that would sit below and above the legacy filter are actually right on top of each other. For example, see this picture:

So as you can see if you look at HarddiskVolume1 and HarddiskVolume2, the number of devices between Frame1 (the one in the middle; Frame0 is always on the bottom) and the file system is different. So what happens if a minifilter redirects IO from a stack with fewer devices (the top instance in Frame1 on \Device\HarddiskVolume1 in our example) to a stack with more devices (\Device\HarddiskVolume2, also in Frame1 since IO redirection can't go to a different frame)? Well, it'll mostly work, but in some cases the IRP will run out of IO_STACK_LOCATIONs (since it was allocated for a stack with fewer devices) and the system will bugcheck.
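To make the failure mode concrete, here is a toy user-mode model in C. It is not driver code: ToyFrame, ToyIrp and both helper functions are made up for illustration, and the device counts mentioned in the comments are assumed numbers, not values read from a real system. The only idea it captures is that an IRP sized for the source stack may not have enough stack locations left for a taller target stack.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: none of these types exist in the WDK. */
typedef struct {
    /* The frame device itself plus every device under it, file system included. */
    int devices_at_or_below_frame;
} ToyFrame;

typedef struct {
    /* Stack locations still unused when the IRP reaches the frame. */
    int locations_left;
} ToyIrp;

/* Assumption: the IRP was sized for the stack it was issued on, so when it
 * reaches the frame it has exactly enough locations for the rest of that stack. */
static ToyIrp toy_irp_arriving_at(const ToyFrame *source)
{
    assert(source != NULL);
    ToyIrp irp = { source->devices_at_or_below_frame };
    return irp;
}

/* Redirection is safe in this model only if the IRP can also cover
 * every device it would have to visit on the target stack. */
static bool toy_irp_fits(const ToyIrp *irp, const ToyFrame *target)
{
    assert(irp != NULL && target != NULL);
    return irp->locations_left >= target->devices_at_or_below_frame;
}
```

With an assumed source frame of 3 devices (Frame1, Frame0, file system) and an assumed target of 4 (the same plus one legacy filter), toy_irp_fits() returns false for an IRP arriving at the source frame, while redirecting in the opposite direction returns true.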

Legacy filters have more control over the number of IO_STACK_LOCATIONs they get from the IO manager because they can set up the proper number when they create their attachment devices. Also, they can tell if a certain IRP has enough stack locations to work on the target stack. Minifilters however don't have direct access to either FltMgr's DEVICE_OBJECT associated with the volume or to the IRP. So for this scenario FltMgr provides a couple of APIs:

NTSTATUS
FLTAPI
FltAdjustDeviceStackSizeForIoRedirection(
    __in PFLT_INSTANCE SourceInstance,
    __in PFLT_INSTANCE TargetInstance,
    __out_opt PBOOLEAN SourceDeviceStackSizeModified
    );

NTSTATUS
FLTAPI
FltIsIoRedirectionAllowed(
    __in PFLT_INSTANCE SourceInstance,   
    __in PFLT_INSTANCE TargetInstance,
    __out PBOOLEAN RedirectionAllowed 
    );

NTSTATUS
FLTAPI
FltIsIoRedirectionAllowedForOperation(
    __in PFLT_CALLBACK_DATA Data,
    __in PFLT_INSTANCE TargetInstance,
    __out PBOOLEAN RedirectionAllowedThisIo,    
    __out_opt PBOOLEAN RedirectionAllowedAllIo 
    );

The documentation for these functions is pretty good and, considering how descriptive their names are, almost unnecessary :). FltIsIoRedirectionAllowed() tells the caller whether redirection between the two instances is allowed with the current stack sizes, FltAdjustDeviceStackSizeForIoRedirection() updates the device stack size so that subsequent requests can be redirected, and FltIsIoRedirectionAllowedForOperation() looks at the IRP to see if it has enough IO stack locations to be redirected.

Filters can decide where they'll redirect requests early on (either at instance setup or when a file is created) and update the stack size then, or they can do so on an operation-by-operation basis (in case they don't know exactly which volume they'll redirect a certain request to). Either way, the pseudo-code would be something like this:

  • For each operation that needs to be redirected:
    • Check if the operation can be redirected by calling FltIsIoRedirectionAllowedForOperation().
    • If redirection is possible then just redirect (change the TargetInstance to the other instance).
    • If redirection is not possible then the only solution is to duplicate the FLT_CALLBACK_DATA structure and issue the operation on the other instance manually. Please note that currently FltMgr does not have a mechanism to do this automatically.
  • If the instance that will be used for redirection is known at instance setup time then the filter should call FltIsIoRedirectionAllowed() and, if it needs to, it should then call FltAdjustDeviceStackSizeForIoRedirection(). The check on each operation that needs to be redirected should still be performed.
  • If the instance that will be used for redirection is known at IRP_MJ_CREATE time then the minifilter should also call FltIsIoRedirectionAllowed() and, if it needs to, it should then call FltAdjustDeviceStackSizeForIoRedirection(). The check on each operation that needs to be redirected should still be performed.
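The flow in the list above can be sketched as a small user-mode model in C. Again, this is not driver code: every Toy* type and toy_* function is a hypothetical stand-in, and the simple size comparisons are my assumption about what the real APIs boil down to. It is only meant to show why the per-operation check still matters after the stack size has been adjusted: operations that were already in flight were sized for the old stack.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins; the real objects are PFLT_INSTANCE and PFLT_CALLBACK_DATA. */
typedef struct { int stack_size; } ToyInstance;
typedef struct {
    int locations_left;          /* fixed when the operation was issued */
    const ToyInstance *target;   /* plays the role of Iopb->TargetInstance */
} ToyOperation;

typedef enum { TOY_REDIRECTED, TOY_REISSUE_MANUALLY } ToyOutcome;

/* Plays the role of FltIsIoRedirectionAllowed(). */
static bool toy_allowed(const ToyInstance *src, const ToyInstance *dst)
{
    return src->stack_size >= dst->stack_size;
}

/* Plays the role of FltAdjustDeviceStackSizeForIoRedirection().
 * Returns true if the source stack size had to be raised. */
static bool toy_adjust(ToyInstance *src, const ToyInstance *dst)
{
    if (toy_allowed(src, dst)) {
        return false;
    }
    src->stack_size = dst->stack_size;
    return true;
}

/* Plays the role of FltIsIoRedirectionAllowedForOperation(). */
static bool toy_allowed_for_op(const ToyOperation *op, const ToyInstance *dst)
{
    return op->locations_left >= dst->stack_size;
}

/* Per-operation step: redirect if this operation fits on the target,
 * otherwise tell the caller to issue a separate operation on the target. */
static ToyOutcome toy_redirect(ToyOperation *op, const ToyInstance *dst)
{
    assert(op != NULL && dst != NULL);
    if (toy_allowed_for_op(op, dst)) {
        op->target = dst;
        return TOY_REDIRECTED;
    }
    return TOY_REISSUE_MANUALLY;
}
```

In this model an operation issued before toy_adjust() still comes back as TOY_REISSUE_MANUALLY, while one issued afterwards is redirected. In a real minifilter, changing Iopb fields in a pre-operation callback also means calling FltSetCallbackDataDirty() so that FltMgr picks up the change.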

Please note that there is one aspect that the documentation highlights. If you plan to redirect IO to a different volume you will need a FILE_OBJECT opened on that volume. Using a FILE_OBJECT that belongs to a certain volume on a different volume will result in very interesting (and broken) NTFS behavior. So that means that in some cases you'll need to issue your own create to get a FILE_OBJECT on that volume. It's impossible to make a hard rule about it because a filter might only be redirecting certain operations to only one existing file on a different volume (can't come up with a good example so I'll just leave it at that) so the minifilter won't need to call FltCreateFile(Ex(2)) to open a file on the other volume for each IRP_MJ_CREATE it processes. In this case, the call to FltIsIoRedirectionAllowed() should happen in the postCreate callback.

One more thing I'd like to point out is that I see quite a lot of code that uses Data->Iopb->TargetFileObject and Data->Iopb->TargetInstance to get the FILE_OBJECT and the FLT_INSTANCE for the current operation. While this isn't wrong, FltObjects->FileObject and FltObjects->Instance were meant for this purpose. Also, if the minifilter actually does change Data->Iopb->TargetFileObject or Data->Iopb->TargetInstance, then using them in various functions and macros after they've been changed might lead to broken behavior. So it's generally a good idea to either pass the FILE_OBJECT and the FLT_INSTANCE as parameters to functions or to pass the PFLT_RELATED_OBJECTS pointer as a parameter instead of relying on the Iopb.

2 comments:

  1. Great post!

    I'm confused about a thing. Suppose the same scenario described by the picture you used and you're implementing an I/O redirection from one volume to another on frame1 on both volumes. Wouldn't the redirected I/O skip the frame2's instances on the target volume? Isn't that an issue?

    1. Hi Fernando,

      You're right, if redirection happens in frame1 then the instances on the target volume in any frame above frame1 (and in fact even inside frame1 for instances that are above the altitude of the minifilter that's doing the redirection) will not see any IO on that volume. That shouldn't be a problem in itself since it's expected that any filter can perform IO which would only be seen by filters below itself, which is very similar to this case. However, for filters that complete IO on one volume by using files from a different volume it may also be necessary that the IO requests are sent to the top of the stack on the other volume (to make sure that all the filters on the target volume get a chance to process the IO). It really depends on the specifics of the filter, IO redirection isn't the only way of doing this but there are cases when it's the best way and that's when these APIs come into play.
