Thursday, September 29, 2011

FILE_OBJECT Names in IRP_MJ_CREATE

I get a lot of questions about what the FILE_OBJECT->FileName is in preCreate or when a legacy filter receives the IRP_MJ_CREATE IRP. This is pretty much the only time when the FILE_OBJECT->FileName is defined. File systems can and do change the name in the FILE_OBJECT, sometimes immediately after they process the IRP_MJ_CREATE, so even during postCreate the name is not defined.

So what I'd like to do for this post is to go through all the various values that I've seen and what they mean. Please note that these are file system specific and so this list only really applies to the current MS file systems (NTFS, FAT, CDFS, UDFS, ExFAT). Also, this is not necessarily a complete list, it's simply a list of things I've seen (and as such it might also be inaccurate, so feel free to comment and mention other cases). I certainly plan to update this post whenever I run into new possible values. Please note that FltMgr offers an API (FltGetFileNameInformation()) which will give you the right name in all these circumstances, so if you need the file name then use it. This post is not meant to show how to build a name parser but rather to show how the names in IRP_MJ_CREATE work in general, so that a minifilter can issue its own creates that take advantage of these features, or to help a minifilter identify certain cases without resorting to calling FltGetFileNameInformation() and then parsing the name in all cases (for example when it only cares about volume opens).

Other than FILE_OBJECT->FileName there is another member that is relevant to the name: FILE_OBJECT->RelatedFileObject. This is a pointer to an already opened FILE_OBJECT that the IRP_MJ_CREATE is somehow related to. It is the FILE_OBJECT referenced by the HANDLE that is passed in to the call to the InitializeObjectAttributes() macro in the RootDirectory parameter. Even though the name suggests it, it does not have to be a handle to a directory (as we'll see in the later examples). Also, please note that when I say that RelatedFileObject = "\foo" I don't mean that RelatedFileObject->FileName = "\foo" but rather that RelatedFileObject is pointing to the object "\foo". Since RelatedFileObject is an already opened FILE_OBJECT we can't rely on its FileName member to be relevant.

So anyway, here are the cases:

  • full path (FileObject->FileName = "\directory\file.bin" and RelatedFileObject is NULL). This is one of the most common cases. It simply means the caller is trying to open the file specifying the full path from the volume.
  • relative open (FileObject->FileName = "directory2\file.bin" and RelatedFileObject = "\directory1"). This is also pretty common; the intention is to open the path "\directory1\directory2\file.bin" relative to a handle the user has for "\directory1". I'm guessing this is the case that gave the OBJECT_ATTRIBUTES->RootDirectory member its name.
  • reopen (FileObject->FileName is empty (Length == 0 and Buffer == NULL) but RelatedFileObject is not NULL). This is used when the caller wants to open a new FILE_OBJECT for the same stream as an existing FILE_OBJECT. This is not the same as opening a new handle to the existing FILE_OBJECT (duplicating the handle) because the end result is a new FILE_OBJECT for the same underlying stream, and the two FILE_OBJECTs are not linked in any other way. For example, a filter might want to open its own handle to a user file without bothering to figure out whether the user has enough access (if the minifilter wants to write to the file and the original handle didn't allow it then duplicating the handle is more complicated), or without interfering with the additional information stored in the FILE_OBJECT (like the current pointer position). According to the FASTFAT source code this should work for volume opens as well. This actually happens occasionally so filters should be prepared to deal with it.
  • open an ADS for a file (FileObject->FileName = ":foo:$DATA" and RelatedFileObject is a file or directory). See my previous post on opening alternate data streams for more information on this scenario (why it's useful and so on).
  • opening a volume (FileObject->FileName is empty (Length == 0, Buffer == NULL) and RelatedFileObject is NULL). FltMgr guarantees that even in preCreate the FILE_OBJECT will have the FO_VOLUME_OPEN flag set, but IIRC for a legacy filter that might not be true (in other words the IO manager won't necessarily always set this flag).

So these are pretty much all the cases. There are also some cases where the name needs to be interpreted in a special way and this is indicated by special flags:

  • opening a file by ID (FileObject->FileName is either a 64-bit or a 128-bit identifier (Length == 8 or Length == 16 and Buffer should be treated as PVOID)). If I remember correctly the FileName might also be "\" followed by a 64-bit or 128-bit identifier (so the Length is now 10 or 18, respectively). The name should only be interpreted as an ID if FLT_CALLBACK_DATA->Iopb->Parameters.Create.Options has the FILE_OPEN_BY_FILE_ID flag set. If called in this case FltGetFileNameInformation() will actually open the object specified by the ID and return its name.
  • opening the target of a rename (either full path or relative path). The key thing here is that the object that will actually be opened in the file system is the parent directory of the name specified by FILE_OBJECT->FileName and FILE_OBJECT->RelatedFileObject. In other words, if the full path is "\directory1\directory2\file.bin", then the actual object that will be opened is the directory "\directory1\directory2". This is indicated by the SL_OPEN_TARGET_DIRECTORY flag set in FLT_CALLBACK_DATA->Iopb->OperationFlags. See my post on renames for details (http://fsfilters.blogspot.com/2011/06/rename-in-file-system-filters-part-i.html). If called for this case FltGetFileNameInformation() returns the name of the parent directory (if you want the full path including the file then just clear the SL_OPEN_TARGET_DIRECTORY flag before calling FltGetFileNameInformation() and set it back once it returns and you should get the actual name that the file system will receive).
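
The cases above can be summarized as a small decision procedure. What follows is only a sketch in portable user-mode C, not filter code: ModelCreate, the MODEL_* flag values and classify_create_name() are hypothetical stand-ins I made up for FILE_OBJECT->FileName, RelatedFileObject, FILE_OPEN_BY_FILE_ID and SL_OPEN_TARGET_DIRECTORY, and the rules are just the ones listed in this post (so they inherit the same "might be inaccurate" caveat).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration; NOT the real kernel definitions. */
#define MODEL_OPEN_BY_FILE_ID       0x1u /* models FILE_OPEN_BY_FILE_ID in Create.Options */
#define MODEL_OPEN_TARGET_DIRECTORY 0x2u /* models SL_OPEN_TARGET_DIRECTORY in OperationFlags */

typedef struct ModelCreate {
    const uint16_t *name_buffer; /* models FileObject->FileName.Buffer (UTF-16 units) */
    uint16_t        name_length; /* models FileObject->FileName.Length, in bytes */
    int             has_related; /* nonzero when RelatedFileObject != NULL */
    unsigned        flags;       /* MODEL_* bits */
} ModelCreate;

typedef enum CreateNameKind {
    NAME_FULL_PATH,       /* "\directory\file.bin", no related object */
    NAME_RELATIVE,        /* "directory2\file.bin" relative to a related object */
    NAME_REOPEN,          /* empty name, related object present */
    NAME_STREAM_RELATIVE, /* ":foo:$DATA" relative to a related object */
    NAME_VOLUME_OPEN,     /* empty name, no related object */
    NAME_FILE_ID,         /* name bytes are an ID, not text */
    NAME_UNKNOWN          /* something this post does not cover */
} CreateNameKind;

CreateNameKind classify_create_name(const ModelCreate *c)
{
    if (c->flags & MODEL_OPEN_BY_FILE_ID) {
        /* Raw 64-bit or 128-bit ID. */
        if (c->name_length == 8 || c->name_length == 16)
            return NAME_FILE_ID;
        /* Same IDs preceded by a single '\' (2 more bytes). */
        if ((c->name_length == 10 || c->name_length == 18) &&
            c->name_buffer != NULL && c->name_buffer[0] == '\\')
            return NAME_FILE_ID;
        return NAME_UNKNOWN;
    }
    if (c->name_length == 0)
        return c->has_related ? NAME_REOPEN : NAME_VOLUME_OPEN;
    if (c->name_buffer == NULL)
        return NAME_UNKNOWN;
    if (c->has_related)
        return c->name_buffer[0] == ':' ? NAME_STREAM_RELATIVE : NAME_RELATIVE;
    return c->name_buffer[0] == '\\' ? NAME_FULL_PATH : NAME_UNKNOWN;
}

/* With the target-directory flag the object actually opened is the parent
 * directory of whatever path the name and related object describe. */
int opens_parent_directory(const ModelCreate *c)
{
    return (c->flags & MODEL_OPEN_TARGET_DIRECTORY) != 0;
}
```

Note that the ID check comes first because with that flag set the buffer is not text at all, so looking at its first character only makes sense for the "\"-prefixed form. In a real minifilter FltGetFileNameInformation() remains the way to get the actual name.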

Other than the name, one important factor in the preCreate path is whether the user wants to open the path in a case insensitive or a case sensitive fashion. This is specified in FLT_CALLBACK_DATA->Iopb->OperationFlags; check for SL_CASE_SENSITIVE. Of course, once the IRP_MJ_CREATE is completed successfully and a FILE_OBJECT is opened, the way to know whether the file was opened with a case sensitive open is to look at FO_OPENED_CASE_SENSITIVE.

One more thing to note is that when calling FltGetFileNameInformation() the name that is returned is a full path (that includes the volume and everything that is necessary to pass it in to an FltCreateFile call and open the stream). The code handles all the cases I mentioned and returns the name of the object that will actually be opened.

Thursday, September 22, 2011

Opening an Alternate Data Stream

Alternate Data Streams (ADS) are a pretty interesting feature of a filesystem, and one that can be quite useful for filter developers. ADS are an ideal way to store information related to a file, without interfering with operations on the main data stream. There are many reasons why such a feature is interesting to a file system filter developer so I'm not going to go into those. Instead I'd like to focus on one particular aspect of using them, opening the stream.

So why is opening an ADS such an interesting topic ? Because in almost all cases when a filter uses an ADS it wants to open it for a specific file that the user has open and is working on. Most implementations I've seen go about this by querying the name of the file that the user has opened ("\Device\HarddiskVolume1\test.bin"), then appending the ADS name at the end of that name ("\Device\HarddiskVolume1\test.bin:foo") and then issuing an open for that. So what's wrong with this ? Well, as I've already said in one of my previous posts on names, trying to open the same file that the user has open by name is problematic because the name of the file can change between the time the name is queried and the time the create reaches the file system (for example if there is a simple script that renames file a.bin to b.bin when a.bin exists and file b.bin to a.bin when b.bin exists, which in effect creates a scenario where a file very quickly changes name from a.bin to b.bin and back).

So I'd like to show a mechanism to open an ADS relative to an already open file. The local variable "openFileHandle" is a handle to the open file for which we want to open the ADS. Then the code would look something like this:


                    //
                    // this is the name of the ADS that we want to open for the file.
                    //

                    UNICODE_STRING ADSName = RTL_CONSTANT_STRING(L":foo:$DATA");

		...

                    //
                    // initialize OBJECT_ATTRIBUTES with the handle we have and the name of the 
                    // ADS.
                    //

                    InitializeObjectAttributes( &objectAttributes,
                                                &ADSName,
                                                OBJ_KERNEL_HANDLE,
                                                openFileHandle,
                                                NULL );

                    //
                    // and now issue our open for the stream.
                    //

                    status = FltCreateFile( gFilterHandle,
                                            FltObjects->Instance,
                                            &ADSHandle,
                                            FILE_READ_DATA | FILE_READ_ATTRIBUTES,
                                            &objectAttributes,
                                            &ioStatus,
                                            0,
                                            FILE_ATTRIBUTE_NORMAL,
                                            FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE,
                                            FILE_OPEN_IF, 
                                            FILE_OPEN_REPARSE_POINT,
                                            NULL,
                                            0,
                                            0 );

There are a couple of things worth mentioning about this approach:

  • Clearly this requires the main stream to be opened. So it can't be called from a preCreate for a file.
  • One cool feature is that it works regardless of whether openFileHandle is a handle to the main stream of a file or to another ADS. This is nice because normally building the name of the ADS to open when opening it by name requires special treatment for when the user has opened the ADS compared to when they have the main data stream.
  • Finally, please notice the FILE_OPEN_REPARSE_POINT flag. This is added because in case openFileHandle is a handle to a file that is a reparse point then our create will reparse which isn't what we want. So I put the FILE_OPEN_REPARSE_POINT flag just in case openFileHandle was opened with FILE_OPEN_REPARSE_POINT itself. The flag should have no effect otherwise.
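
For contrast, here is roughly the "special treatment" that the by-name approach needs and that the relative open avoids. This is a sketch in plain user-mode C with narrow strings for readability (a filter would work with UNICODE_STRINGs); build_ads_name() is a hypothetical helper of mine, and it assumes that a ':' in the last path component always starts a stream suffix. It still suffers from the rename race described above.

```c
#include <stdio.h>
#include <string.h>

/*
 * Build "<file>:<stream>:$DATA" from the name the user opened.
 * If the user opened an ADS ("...\test.bin:bar:$DATA") the existing
 * stream suffix has to be stripped first; if they opened the main
 * stream the name is used as is.
 * Returns 0 on success, -1 if the result does not fit in out.
 */
int build_ads_name(const char *opened_name, const char *stream,
                   char *out, size_t out_size)
{
    /* Only look for ':' in the last path component. */
    const char *last_sep = strrchr(opened_name, '\\');
    const char *component = last_sep ? last_sep + 1 : opened_name;
    const char *colon = strchr(component, ':');

    /* Length of the name without any stream suffix. */
    size_t base_len = colon ? (size_t)(colon - opened_name)
                            : strlen(opened_name);

    int n = snprintf(out, out_size, "%.*s:%s:$DATA",
                     (int)base_len, opened_name, stream);
    return (n >= 0 && (size_t)n < out_size) ? 0 : -1;
}
```

With the relative open none of this parsing is necessary: the same ":foo:$DATA" name works whether openFileHandle refers to the main stream or to another ADS.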

Thursday, September 15, 2011

What's New in Win8 for File System Filters

So I guess I'm as excited as everyone about Windows 8 and I figured I'd get the preview and take a look at what's new. I haven't had much time to tinker with it, but here are some highlights of what to expect.

One thing that's been missing from FltMgr since the beginning, and that developers have complained about, is the ability to filter other devices that have file system semantics, and in particular named pipes (NPFS). It looks like in Win8 that's finally going to happen and we'll be able to write minifilters for NPFS. Moreover, it looks like the folks in Redmond went the extra mile and have added support for the Mailslot file system (MSFS) as well:

C:\Windows\system32>fltmc volumes                                                            
Dos Name                        Volume Name                              FileSystem   Status 
------------------------------  ---------------------------------------  ----------  --------
                                \Device\Mup                              Remote              
C:                              \Device\HarddiskVolume2                  NTFS                
                                \Device\HarddiskVolume1                  NTFS                
                                \Device\NamedPipe                        NamedPipe           
                                \Device\Mailslot                         Mailslot            
D:                              \Device\CdRom0                           UDFS        Detached
                                \Device\HarddiskVolumeShadowCopy1        NTFS                
D:                              \Device\CdRom0                           UDFS                

There is even an in-box filter attached to NPFS:

C:\Far>fltmc filters

Filter Name                     Num Instances    Altitude    Frame
------------------------------  -------------  ------------  -----
WdFilter                                6       328010         0
luafv                                   1       135000         0
npsvctrig                               1        46000         0
FileInfo                                6        45000         0

C:\Far>fltmc instances -f npsvctrig

Instances for npsvctrig filter:

Volume Name                              Altitude        Instance Name       Frame  VlStatus
-------------------------------------  ------------  ----------------------  -----  --------
\Device\NamedPipe                          46000     npsvctrig                 0

As you can see there are two more minifilters installed by default (at least in this version). There is npsvctrig, which seems to be (according to the INF file) the "Named pipe service trigger provider". There is also WdFilter, which I couldn't find the INF for, but the entry in the services key seems to suggest it's related to some anti-malware protection feature.

One more thing I've done was to look at what's new in fltmgr.sys. It has increased in size from the Win7 version I have by about 15% (on the x86 version), so there's bound to be something interesting in there. Looking at the exports we can see a bunch of new APIs, which I've grouped by name into some categories. Some of them are actually documented in MSDN and I'll leave reading the documentation for them up to you:

FltGetContextsEx
FltReleaseContextsEx

FltGetSectionContext
FltCreateSectionForDataScan
FltRegisterForDataScan
FltCloseSectionForDataScan

FltFastIoMdlReadComplete
FltFastIoMdlWriteComplete
FltFastIoMdlRead
FltFastIoPrepareMdlWrite

FltPrepareToReuseEcp

FltOplockKeysEqual
FltOplockFsctrlEx

FltSetQuotaInformationFile
FltQueryQuotaInformationFile

FltCreateNamedPipeFile
FltCreateMailslotFile

FltGetActivityIdCallbackData
FltSetActivityIdCallbackData

FltEnumerateInstanceInformationByVolumeName
FltEnumerateInstanceInformationByDeviceObject

FltPropagateActivityIdToThread

FltWriteFileEx
FltReadFileEx

Please note that I'm pretty sure I'm barely scratching the surface here; for example, some APIs might have changed their parameters and my simple perl script wouldn't be able to figure that out. I'm looking forward to the presentations and new samples in the WDK for these things and I wanted to get everybody as excited as I am.

Thursday, September 8, 2011

File IO Redirection Between Volumes Using FltMgr

So what do I mean by IO redirection ? File system filters occasionally need to redirect a certain operation to a different file (and possibly on a different volume) than the file the request was originally issued for. Please note that if all operations need to be redirected then there are alternate mechanisms available. I'd say that in general one can classify redirection based on how often it needs to happen and there are methods specific to each class:

  • Always redirect to the same file - in this case one way to achieve this is to use hardlinks.
  • Always redirect to a file but the target of the redirection might change - symlinks might be just the thing for this.
  • If a file is opened in a certain way (for example opened by a special process or class of processes) then redirect all IO to a different file (but all IO goes to the same file) - a filter that detects when the file was opened in that way and returns STATUS_REPARSE.
  • After a file is opened redirect only certain operations to a different file (for example have just some reads from a certain region come from a different file) or to multiple different files - this is where IO redirection might be necessary...

In a legacy filter this can be achieved by changing the IO_STACK_LOCATION parameters (like the FileObject) and by sending the IO to a different device when calling IoCallDriver to send the IRP below.

FltMgr offers the same ability to redirect an operation to a different file or even to a different volume. Pretty much all a filter has to do is to set the FLT_CALLBACK_DATA->Iopb->TargetFileObject to a different FILE_OBJECT and FltMgr will make sure that all the layers below the filter will see the request on that new FILE_OBJECT. Unfortunately, things aren't that easy when trying to redirect to a different volume. The caller would need to change FLT_CALLBACK_DATA->Iopb->TargetInstance but that requires a bit more extra work that we're going to explore in this post.

Before we go on I'd like to point out a couple of things. First, a minifilter must never redirect IO to an instance of a different filter or to an instance of its own at a different altitude (which implies it must not redirect to an instance on the same volume as well). This is not supported (and I think Verifier will complain). Also, please note that the extra work involved is not specific to minifilters, legacy filters need to do the same thing when redirecting to a different device, it's just that it involves different APIs for minifilters.

So what is the problem with just changing the TargetInstance ? The problem is that instances are associated with frames, which are simply FltMgr devices attached to a volume. In one of my earlier posts I mentioned that the number of devices between frames on different volumes might be different. For example, if a legacy filter attaches only to the D: drive, it's possible that on the C: volume FltMgr's frames that are attached below and above the legacy filter are actually right on top of each other. For example see this picture:

So as you can see if you look at HarddiskVolume1 and HarddiskVolume2 the number of devices between Frame1 (the one in the middle, Frame0 is always on the bottom) and the file system is different. So what happens if a minifilter redirects IO from a stack with fewer devices (the top instance in Frame1 on \Device\HarddiskVolume1 in our example) to a stack with more devices (\Device\HarddiskVolume2, also in frame 1 since IO redirection can't go to a different frame) ? Well, it'll mostly work, but it's possible that in some cases the IRP will run out of IO_STACK_LOCATIONs (since it expected fewer devices on that stack) and bugcheck.

Legacy filters have more control over the number of IO_STACK_LOCATIONs they get from the IO manager because they can set up the proper number when they create their attachment devices. Also, they can tell if a certain IRP has enough stack locations to work on the target stack. Minifilters however don't have direct access to either FltMgr's DEVICE_OBJECT associated with the volume or to the IRP. So for this scenario FltMgr provides a couple of APIs:

NTSTATUS
FLTAPI
FltAdjustDeviceStackSizeForIoRedirection(
    __in PFLT_INSTANCE SourceInstance,
    __in PFLT_INSTANCE TargetInstance,
    __out_opt PBOOLEAN SourceDeviceStackSizeModified
    );

NTSTATUS
FLTAPI
FltIsIoRedirectionAllowed(
    __in PFLT_INSTANCE SourceInstance,   
    __in PFLT_INSTANCE TargetInstance,
    __out PBOOLEAN RedirectionAllowed 
    );

NTSTATUS
FLTAPI
FltIsIoRedirectionAllowedForOperation(
    __in PFLT_CALLBACK_DATA Data,
    __in PFLT_INSTANCE TargetInstance,
    __out PBOOLEAN RedirectionAllowedThisIo,    
    __out_opt PBOOLEAN RedirectionAllowedAllIo 
    );

The documentation for these functions is pretty good and considering how descriptive their names are, almost unnecessary :). FltIsIoRedirectionAllowed() tells the caller whether redirection is allowed with the current stack sizes and between the two instances, FltAdjustDeviceStackSizeForIoRedirection() updates the device stack size so that subsequent requests can be redirected and FltIsIoRedirectionAllowedForOperation() looks at the IRP to see if it has enough IO stack locations to be redirected.

Filters can decide where they'll redirect requests early on (either at instance setup or when a file is created) and update the stack size, or they can do so on an operation by operation basis (in case they don't know exactly to which volume they'll redirect a certain request). Either way, the pseudo-code would be something like this:

  • For each operation that needs to be redirected:
    • Check if the operation can be redirected by calling FltIsIoRedirectionAllowedForOperation().
    • If redirection is possible then just redirect (change the TargetInstance to the other instance)
    • If redirection is not possible then the only solution is to duplicate the FLT_CALLBACK_DATA structure and issue the operation on the other instance manually. Please note that currently FltMgr does not have a mechanism to do this automatically.
  • If the instance that will be used for redirection is known at instance setup time then the filter should call FltIsIoRedirectionAllowed() and, if it needs to, it should then call FltAdjustDeviceStackSizeForIoRedirection(). The check on each operation that needs to redirect should still be performed.
  • If the instance that will be used for redirection is known at IRP_MJ_CREATE time then the minifilter should also call FltIsIoRedirectionAllowed() and, if it needs to, it should then call FltAdjustDeviceStackSizeForIoRedirection(). The check on each operation that needs to redirect should still be performed.
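
To make the reasoning behind these steps concrete, here is a toy model of the stack-location bookkeeping in plain user-mode C. Everything in it is hypothetical: ModelInstance, ModelOperation and the model_* functions only mimic what the three FltMgr APIs are described as doing above (the real ones take PFLT_INSTANCE and PFLT_CALLBACK_DATA and return NTSTATUS), and how FltMgr implements them internally is an assumption on my part.

```c
#include <stdbool.h>

/* Toy model: an instance is reduced to the number of IO stack locations
 * a request needs on its volume's device stack. */
typedef struct ModelInstance {
    int stack_size;
} ModelInstance;

/* Toy model: an operation is reduced to the number of stack locations
 * its IRP still has available. */
typedef struct ModelOperation {
    int stack_locations_left;
} ModelOperation;

typedef enum RedirectAction {
    REDIRECT_IN_PLACE, /* just change the TargetInstance */
    REISSUE_ON_TARGET  /* duplicate the request and issue it manually */
} RedirectAction;

/* Mimics FltIsIoRedirectionAllowed(): compares the two stacks. */
bool model_redirection_allowed(const ModelInstance *source,
                               const ModelInstance *target)
{
    return source->stack_size >= target->stack_size;
}

/* Mimics FltAdjustDeviceStackSizeForIoRedirection(): grows the source
 * stack size if needed and reports whether it changed anything. */
bool model_adjust_stack_size(ModelInstance *source,
                             const ModelInstance *target)
{
    if (model_redirection_allowed(source, target))
        return false;
    source->stack_size = target->stack_size;
    return true;
}

/* Mimics FltIsIoRedirectionAllowedForOperation(): looks at this one IRP. */
bool model_allowed_for_operation(const ModelOperation *op,
                                 const ModelInstance *target)
{
    return op->stack_locations_left >= target->stack_size;
}

/* The per-operation decision from the list above. */
RedirectAction choose_redirect_action(const ModelOperation *op,
                                      const ModelInstance *target)
{
    return model_allowed_for_operation(op, target) ? REDIRECT_IN_PLACE
                                                   : REISSUE_ON_TARGET;
}
```

The model also shows why the per-operation check stays even after the stack size was adjusted: an IRP that was allocated before the adjustment still carries the old, smaller number of stack locations.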

Please note that there is one aspect that the documentation highlights. If you plan to redirect IO to a different volume you will need a FILE_OBJECT opened on that volume. Using a FILE_OBJECT that belongs to a certain volume on a different volume will result in very interesting (and broken) NTFS behavior. So that means that in some cases you'll need to issue your own create to get a FILE_OBJECT on that volume. It's impossible to make a hard rule about it because a filter might only be redirecting certain operations to only one existing file on a different volume (can't come up with a good example so I'll just leave it at that) so the minifilter won't need to call FltCreateFile(Ex(2)) to open a file on the other volume for each IRP_MJ_CREATE it processes. In this case, the call to FltIsIoRedirectionAllowed() should happen in the postCreate callback.

One more thing I'd like to point out is that I see quite a lot of code that uses Data->Iopb->TargetFileObject and Data->Iopb->TargetInstance to get the FILE_OBJECT and the FLT_INSTANCE for the current operation. While this isn't wrong, FltObjects->FileObject and FltObjects->Instance were meant for this purpose. Also, if the minifilter actually does change Data->Iopb->TargetFileObject or Data->Iopb->TargetInstance then using them in various functions and macros after they've been changed might lead to broken behavior, so it's generally a good idea to either pass the FILE_OBJECT and the FLT_INSTANCE as parameters to functions or to pass the PFLT_RELATED_OBJECTS pointer as a parameter instead of relying on the Iopb.

Thursday, September 1, 2011

Using EX_PUSH_LOCK

After talking about EX_RUNDOWN_REF I'd like to talk about another primitive that is pretty cool yet also rather undocumented, the EX_PUSH_LOCK. Even though they are OS primitives (you can see the Exf functions if you look at the exports from the kernel) they are not documented or even declared in the WDK headers. However, there are some wrapper functions for them exported by FltMgr and there is some documentation about them there. The cool thing is that you don't have to be a minifilter to use them, though you'll still need to access them through FltMgr's functions (but those don't require any minifilter specific structures like instances or such). Anyway, here is the link to FltInitializePushLock() and then there is this thread on OSR's list that is pretty interesting.

So let's talk briefly about what pushlocks actually are. They are shared-exclusive locks with similar semantics to ERESOURCEs, but that have a couple of different properties:

  • They are smaller than ERESOURCEs (size of a machine pointer)
  • They are more efficient for mostly shared access
  • They can live in paged pool.
  • They CANNOT be acquired recursively.
  • They have different fairness guarantees (or no guarantees if you prefer)

A very important difference (and a major drawback in my opinion) is the fact that they are not at all convenient to debug. There is no !locks debugger extension for them and looking at the structures directly isn't easy (or at least I've had a really hard time trying). Still, the structure is available in the debugger:

0: kd> dt nt!_EX_PUSH_LOCK
   +0x000 Locked           : Pos 0, 1 Bit
   +0x000 Waiting          : Pos 1, 1 Bit
   +0x000 Waking           : Pos 2, 1 Bit
   +0x000 MultipleShared   : Pos 3, 1 Bit
   +0x000 Shared           : Pos 4, 28 Bits
   +0x000 Value            : Uint4B
   +0x000 Ptr              : Ptr32 Void
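
If you do end up staring at a raw pushlock value in the debugger, the bitfields above can be pulled apart by hand. Here is a small sketch in plain C that decodes a 32-bit Value exactly according to the dt output above; PushLockBits and decode_push_lock32() are names I made up, and since the layout is undocumented it should only be assumed to hold for the x86 build this output came from.

```c
#include <stdint.h>

/* Fields of the 32-bit nt!_EX_PUSH_LOCK as shown by dt above. */
typedef struct PushLockBits {
    unsigned locked;          /* Pos 0, 1 bit   */
    unsigned waiting;         /* Pos 1, 1 bit   */
    unsigned waking;          /* Pos 2, 1 bit   */
    unsigned multiple_shared; /* Pos 3, 1 bit   */
    uint32_t shared;          /* Pos 4, 28 bits */
} PushLockBits;

/* Split a raw Value into the fields listed above. */
PushLockBits decode_push_lock32(uint32_t value)
{
    PushLockBits bits;
    bits.locked          = value & 1u;
    bits.waiting         = (value >> 1) & 1u;
    bits.waking          = (value >> 2) & 1u;
    bits.multiple_shared = (value >> 3) & 1u;
    bits.shared          = value >> 4;
    return bits;
}
```

For example, a Value of 0x11 decodes to Locked = 1 and Shared = 1 with the other bits clear. What each combination means for the state of the lock is not something this sketch tries to answer.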

And here is a list of FltMgr's functions that operate on pushlocks:

VOID
FLTAPI
FltInitializePushLock(
    __out PEX_PUSH_LOCK PushLock
    );

VOID
FLTAPI
FltDeletePushLock(
    __in PEX_PUSH_LOCK PushLock
    );

VOID
FLTAPI
FltAcquirePushLockExclusive(
    __inout __deref __drv_acquiresExclusiveResource(ExPushLockType)
    PEX_PUSH_LOCK PushLock
    );

VOID
FLTAPI
FltAcquirePushLockShared(
    __inout __deref __drv_acquiresExclusiveResource(ExPushLockType)        
    PEX_PUSH_LOCK PushLock
    );

VOID
FLTAPI
FltReleasePushLock(
    __inout __deref __drv_releasesExclusiveResource(ExPushLockType)        
    PEX_PUSH_LOCK PushLock
    );

So one thing to note is that all the Flt wrappers are pretty thin and in general all they do is make sure that the Exf function that does the actual work is called while in a critical region (which makes things easier for the caller and makes them almost drop-in replacements for the FltXxxResource functions that have similar semantics). Another interesting aspect is that even though there are functions for initialization and cleanup, they don't seem to be doing much. Looking in the debugger we can see that FltDeletePushLock is empty and FltInitializePushLock and ExInitializePushLock (which is the only pushlock function actually declared in the headers in the WDK) do nothing more than zero out the pushlock. In fact, the FsRtlSetupAdvancedHeader() function (which is an inline) has this bit of code which confirms this:

//
//  API not avaialble down level
//  We want to support a driver compiled to the last version running downlevel,
//  so continue to use use the direct init of the push lock and not call
//  ExInitializePushLock.
//

    *((PULONG_PTR)(&localAdvHdr->PushLock)) = 0;
    /*ExInitializePushLock( &localAdvHdr->PushLock ); API not avaialble down level*/

So they seem to be pretty low cost (small size and low initialization overhead), which makes them pretty attractive if you want to have a shared-exclusive lock in many structures in case you'll need it (like something you add to each context, for example, where you don't want to pay the price of an ERESOURCE).

Because of how complicated they are to debug what I've done (and I've seen others do the same thing as well) was to use ERESOURCEs in debug builds and in initial releases of a product in order to make sure there are no deadlocks and such and only once the code is thoroughly tested switch to using pushlocks. You can even use a runtime flag to enable your code to use ERESOURCEs instead of pushlocks in case you need the option to run with a primitive that's easier to debug for whatever reason (and I can actually guess that reason :)).
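
The runtime flag idea can be sketched as a thin wrapper that picks its backend once, at initialization. The code below is a single-threaded user-mode model, not driver code: FilterLock and the filter_lock_* functions are hypothetical, there is no real locking in it, and the comments only point at the kernel calls a real wrapper would presumably make in each branch.

```c
#include <stdbool.h>

/* Which primitive backs the lock; chosen once when the lock is set up. */
typedef enum LockBackend {
    BACKEND_PUSHLOCK,  /* small and fast, hard to debug */
    BACKEND_ERESOURCE  /* bigger, but debugger friendly */
} LockBackend;

typedef struct FilterLock {
    LockBackend backend;
    /* Model-only counters so the dispatch is observable; a real wrapper
     * would hold an EX_PUSH_LOCK and an ERESOURCE here instead. */
    unsigned pushlock_calls;
    unsigned eresource_calls;
} FilterLock;

void filter_lock_init(FilterLock *lock, bool debug_locks_enabled)
{
    /* Kernel: ExInitializeResourceLite() vs. FltInitializePushLock(). */
    lock->backend = debug_locks_enabled ? BACKEND_ERESOURCE
                                        : BACKEND_PUSHLOCK;
    lock->pushlock_calls = 0;
    lock->eresource_calls = 0;
}

void filter_lock_acquire_exclusive(FilterLock *lock)
{
    if (lock->backend == BACKEND_ERESOURCE) {
        /* Kernel: FltAcquireResourceExclusive(). */
        lock->eresource_calls++;
    } else {
        /* Kernel: FltAcquirePushLockExclusive(). */
        lock->pushlock_calls++;
    }
}

void filter_lock_release(FilterLock *lock)
{
    if (lock->backend == BACKEND_ERESOURCE) {
        /* Kernel: FltReleaseResource(). */
        lock->eresource_calls++;
    } else {
        /* Kernel: FltReleasePushLock(). */
        lock->pushlock_calls++;
    }
}
```

Callers only ever see filter_lock_*(), so flipping the flag (or a debug-build default) changes the primitive without touching the rest of the driver. A shared-acquire wrapper would follow the same pattern.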

Finally, please note that since these are not magical (not being invented by Apple) they won't make crappy code that uses suboptimal implementations of algorithms any faster. In most cases your performance buck is better spent improving the logic and algorithms used by your driver instead of focusing on faster primitives. Also, please note that the guarantees are not the same as with ERESOURCEs, so if you rely on the ordering guarantees of ERESOURCEs or you need to be able to acquire them recursively you are better off with ERESOURCEs.

The best way to describe them is "use at your own risk" (especially since they seem to be changing in behavior from one OS release to the other, as indicated by the OSR post I mentioned earlier). Still, if you need a very small and fast shared-exclusive lock you should give these guys a try.