Thursday, February 9, 2012

Problems with STATUS_REPARSE - Part II

In this post I'd like to talk about some of the more complicated issues that might arise when using STATUS_REPARSE, in particular the ones that are introduced when crossing a volume boundary (when reparsing from a file on one volume to a file on a different volume).

First I'd like to talk about names (yes, the eternal problem of the file system filter developer). If a filter simply uses STATUS_REPARSE to reparse from FileA to FileB, then any name query will return the proper name of the file that was actually opened, FileB. So an application that queries the name of a file might get a different path than the one the user actually tried to open. It's generally pretty rare for an application to open a file and then query its name, so that in itself isn't much of a problem, but there are other applications on the system that rely on the file path to make policy decisions. For example, a firewall application might have rules that identify applications by their path. If a filter redirects one of those opens, then even if the file contents are exactly the same as before (as in the case of a deduplication solution), the firewall rule might not match because the path is different.

So a filter that uses STATUS_REPARSE will probably end up implementing a mechanism to return the path the user actually tried to open. This is complicated by the fact that in some cases it is impossible to know what that name was before the reparse (see my previous post). However, assuming the filter does know the path the user tried to open, it can return that path to requests that query the name from the file system (like IRP_MJ_QUERY_INFORMATION with the FileNameInformation class). A minifilter will also need to implement name provider functionality, because just completing IRP_MJ_QUERY_INFORMATION isn't enough for that to work (see the WDK SimRep minifilter sample).

Now, if the reparse crossed a volume boundary (so FileA is on Volume1 while FileB is on Volume2), things get a bit more complicated, because IRP_MJ_QUERY_INFORMATION only returns a path relative to the volume and not the volume name itself (\FileA and not Volume1\FileA). In most cases the volume is derived from the FILE_OBJECT and the filter can't really change that. In fact, since the mapping between the FILE_OBJECT and the volume device and the mapping between the volume device and the drive letter are internal to the object manager (OB), a file system filter won't see any operation at all when they are resolved. So it is quite likely that if FileA and FileB are on different volumes and have different paths, then a filter or even a user mode app trying to resolve the name of the file will get a path that isn't right (like Volume2\FileA or even Volume1\FileB).
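To make the mismatch concrete, here is a minimal toy model in portable C. It is not kernel code and does not use any real Windows API: the OpenedFile type and the compose_full_name() helper are hypothetical names invented for this sketch. It only assumes what the text above describes, namely that the volume part of a name comes from the file that was actually opened, while the volume-relative part is whatever the name query returns (possibly overridden by the filter).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Toy model only: these types and names are hypothetical, not WDK types. */
typedef struct {
    const char *volume;        /* volume the opened file really lives on */
    const char *relative_path; /* volume-relative path the file system reports */
} OpenedFile;

/*
 * Build the full name a caller would end up with.
 * The volume always comes from the opened file (the filter cannot change it).
 * The volume-relative part is the file system's answer unless the filter
 * overrides it with the path the user originally asked for.
 * Returns 0 on success, -1 if the output buffer is too small.
 */
static int compose_full_name(const OpenedFile *file, const char *filter_override,
                             char *out, size_t out_len)
{
    assert(file != NULL && out != NULL && out_len > 0);
    const char *rel = filter_override ? filter_override : file->relative_path;
    int n = snprintf(out, out_len, "%s%s", file->volume, rel);
    return (n >= 0 && (size_t)n < out_len) ? 0 : -1;
}
```

With a file opened as Volume2\FileB after a reparse from Volume1\FileA, passing no override yields Volume2\FileB, and passing "\FileA" as the override yields Volume2\FileA. Neither result is the Volume1\FileA path the user actually tried to open.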

Just to point out how big this problem is, the MSDN post on Obtaining a File Name From a File Handle uses a method that gets the actual NT name for a handle by calling GetMappedFileName(). Because the handle points to the target file (Volume2\FileB), the device that is returned is the device for Volume2, and the call to QueryDosDevice() then resolves this device to Volume2. (By the way, QueryDosDevice() is NOT the proper way to resolve a device name to a drive letter; it doesn't cover many cases and I wish MSDN would point this out.) Still, the point to note here is that there is nothing a filter can do to change the volume the application gets (Volume2). So if the filter returned the original path (FileA) for name queries, the resulting name would be Volume2\FileA, which is wrong. Another API that can be used to get the file name (available only since Vista, though) is GetFinalPathNameByHandle(). This function works by querying the full path like GetMappedFileName() does, but it gets it directly from OB without involving the memory manager, and then it figures out the device to drive letter mapping, though in a much more complicated way than by calling QueryDosDevice(). However, it has the same problem as the previous approach because it always returns the actual volume, Volume2.

Another problem with STATUS_REPARSE when it traverses volume boundaries is that renames stop working (see my posts on renames (Part I and Part II) for a quick refresher). It is possible that some renames actually continue to work because the caller passed the MOVEFILE_COPY_ALLOWED flag to MoveFileEx(), but there are cases where this flag isn't set or where the rename is issued directly via IRP_MJ_SET_INFORMATION. This is also a pretty big problem because handling it in a filter is extremely complicated.
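The rename behavior described above can be sketched as a small decision function. This is a toy model in portable C, not a reproduction of what the file system or MoveFileEx() actually executes: the RenameResult enum and toy_rename() are hypothetical names, and the only assumptions are the ones stated in the text (a rename only succeeds directly within one volume, and a caller that allows copying can fall back to copy and delete).

```c
#include <assert.h>
#include <string.h>

/* Toy model only: hypothetical result codes, not real NTSTATUS values. */
typedef enum {
    RENAME_OK,              /* same volume: a plain rename works */
    RENAME_COPIED,          /* different volume, caller allowed a copy fallback */
    RENAME_NOT_SAME_VOLUME  /* different volume, no fallback: the rename fails */
} RenameResult;

/*
 * actual_source_volume: where the opened file really lives after the reparse.
 * dest_volume:          the volume named in the rename target.
 * copy_allowed:         nonzero if the caller asked for a copy fallback
 *                       (the role MOVEFILE_COPY_ALLOWED plays for MoveFileEx).
 */
static RenameResult toy_rename(const char *actual_source_volume,
                               const char *dest_volume,
                               int copy_allowed)
{
    assert(actual_source_volume != NULL && dest_volume != NULL);
    if (strcmp(actual_source_volume, dest_volume) == 0)
        return RENAME_OK;
    return copy_allowed ? RENAME_COPIED : RENAME_NOT_SAME_VOLUME;
}
```

The user believes they are renaming Volume1\FileA to another name on Volume1, but after the reparse the source really lives on Volume2. In that case toy_rename("Volume2", "Volume1", 0) fails, which corresponds to a rename sent directly through IRP_MJ_SET_INFORMATION, while the same call with copy_allowed set succeeds only through the copy fallback.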

One other problem that STATUS_REPARSE might introduce is specific to cross-volume reparses. I've already mentioned this in my previous post on IRP_MJ_CREATE, but I'll explain it again because I believe it is really important. Basically, whenever a filter calls FltCreateFile(), FltMgr will issue a targeted IRP_MJ_CREATE by calling IoCreateFileEx() with a device hint (or a similar mechanism in OSes where IoCreateFileEx() isn't available). The problem is that if a filter below that filter tries to reparse to a different volume, then the IRP_MJ_CREATE request will fail because the device specified in the hint by FltMgr is not attached to the stack of the new volume. As such, the call to FltCreateFile() will itself fail. So, in a nutshell, whenever a filter reparses to a different volume, no filter above it can open any file whose path will be reparsed by the lower filter. In my opinion this is a really big issue because it's an interop issue (so it might be hard to find during testing unless testing with a lot of filters) and because FltCreateFile() is a very frequent operation in minifilters (for example, most antivirus filters open files for scanning by calling FltCreateFile()). Moreover, this is not a problem most filter developers are aware of, and so most filters are not written to handle FltCreateFile() failing in this way. Besides, there really isn't a good way to work around the issue anyway, and so in practice I have yet to see any filter that does anything more than fail the original operation it is processing with the error returned from FltCreateFile(). I believe this is pretty much the biggest problem with STATUS_REPARSE, and so far I haven't been able to come up with any good workaround. Most of the other problems I've listed require more development effort (in some cases significantly so) or some clever workarounds, but this particular issue is a blocker.
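The device hint failure can be illustrated with another toy model in portable C. Again, this is not kernel code and not how FltMgr or the I/O manager are implemented: VolumeStack, toy_targeted_create() and the device names are all hypothetical, and devices are represented as plain strings. The sketch only assumes what the text above states, that a targeted create carries a hint naming a device on the original volume's stack and that the create fails if that device is not attached to the stack of the volume where the open finally lands.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_DEVICES 8

/* Toy model only: a volume's device stack as a list of device names. */
typedef struct {
    const char *volume;
    const char *devices[MAX_DEVICES]; /* top of the stack first */
    size_t device_count;
} VolumeStack;

typedef enum { CREATE_OK, CREATE_HINT_NOT_IN_STACK } CreateResult;

/* Returns 1 if the named device is attached to this volume's stack. */
static int stack_contains(const VolumeStack *stack, const char *device)
{
    for (size_t i = 0; i < stack->device_count; i++) {
        if (strcmp(stack->devices[i], device) == 0)
            return 1;
    }
    return 0;
}

/*
 * final_volume: the stack where the open ends up after any reparse.
 * device_hint:  device the create was targeted at, or NULL for an
 *               ordinary (untargeted) create.
 */
static CreateResult toy_targeted_create(const VolumeStack *final_volume,
                                        const char *device_hint)
{
    assert(final_volume != NULL);
    if (device_hint != NULL && !stack_contains(final_volume, device_hint))
        return CREATE_HINT_NOT_IN_STACK;
    return CREATE_OK;
}
```

In this model an upper filter on Volume1 issues a create targeted at its own attachment, say "Scanner@Volume1". If nothing reparses, the open lands on Volume1 and succeeds. If a lower filter reparses to Volume2, whose stack only contains "Scanner@Volume2" and "Redirector@Volume2", the hint is not found and the create fails, while an untargeted create (a NULL hint) to the same place would have gone through.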

So this is it. In my experience these are the major issues (but not all the issues) that a filter must be aware of (and work around) if it implements a solution using STATUS_REPARSE. In my opinion this long list of problems makes using STATUS_REPARSE in a project pretty complicated outside of a few limited scenarios. The main reason I've seen people use STATUS_REPARSE so far is that it's very easy to use and one can have a prototype rather quickly, but I don't think it pays off in the long run.