## Thursday, January 27, 2011

I've seen some questions about how a legacy filter can implement contexts similar to the ones fltmgr provides for a minifilter.

So what is a context ? A context is a structure that is owned by some system component (in our case a filter, legacy or mini) that is associated with some other structure. In a very general way, a context is a "value" and at the object that it is associated with is a "key". In general contexts are necessary when the flow of execution is controlled by some other component in the system than the one that implements the actual code (for example for callbacks and services and library functions, where the code is provided by the library or service, but when the code is called depends on something else). Anyway, because the context is simply a key-value pair, anyone can implement a generic context mechanism by using hashes, and this allows great flexibility in what one can attach a context to. For example, one can associate a context with a thread or a logged on user or even a sector on a volume if they feel so inclined. One issue with this approach is how to know when the underlying object is released so that the context can be released as well. For example, if a context is associated with thread 128 and then thread 128 terminates and then at some later point in time another thread is created with the same ID of 128, clearly the context should be released since it's not referring to the same underlying object, but unless the entity implementing the context is notified that thread 128 was terminated, it won't know to release it.

So returning to filters, filter manager offers support for the following types of contexts (at least, these the ones that are typically interesting; the other contexts can usually be implemented fairly easily by legacy filters): Streams, StreamHandles and Files. Let's look at how each of these contexts can be implemented. These are just examples about how it could be done with little support from the OS, but it's definitely not the best way it can be done… I'll address that after this section.
StreamHandle contexts

In terms of implementation in a legacy filter, the StreamHandle is probably the easiest to implement since the key is the FILE_OBJECT and the time to remove the context is during IRP_MJ_CLOSE. Of course the context can be created either the first time the FILE_OBJECT is seen by the filter in an operation or when the filter processes the IRP_MJ_CREATE. Because of stream file objects the filter can't assume that it will always see an IRP_MJ_CREATE for each FILE_OBJECT, so a filter must always be prepared to get a FILE_OBJECT that it hasn't seen an IRP_MJ_CREATE for.

Stream contexts

The key for this type of context is the SCB, so whatever the FILE_OBJECT->FsContext member points to is a good key. Unfortunately, FILE_OBJECT->FsContext is not initialized until the file system processes the IRP_MJ_CREATE and opens the stream on disk, which means that a Stream context isn't available in preCreate (the same restriction as for minifilters). The more complicated part is how to know when the SCB is freed by the FILE_SYSTEM. One way to do this is to simply keep track of all the FILE_OBJECTs that the filter is interested in that all reference that SCB and then when the last FILE_OBJECT is processing it's IRP_MJ_CLOSE, free the context associated with the SCB. This is a bit more complicated than in the StreamHandle context, but not much more so. One notable thing is that since the SCB is a structure that belongs to the file system, it is possible that some file system is implemented in such a way that the address of the SCB changes throught the lifetime of an SCB (for example, the FS can copy the SCB to a different memory location under some circumstances). I haven't seen this in practice and there may be other issues with it (since the OS uses some fields in the FSRTL_COMMON_FCB_HEADER) but I haven't either seen anything definitive that disallows it.

File contexts

For file systems that implement alternate data streams (ADS) it might be important to know whether a stream belongs to the same file or not. In this case, the key for the context must be something that identifies the file. For example, if the file ID is guaranteed to be unique for the lifetime of the file (which is true for NTFS for example but is not true for the FASTFAT implementation; however, FASTFAT doesn't support alternate data streams so it doesn't really matter from this perspective) then the file ID can be used as a key. In terms of removing the context, it depends on the structure that was used as the key. For example, if the file ID is used, then the context would need to be removed when the file is deleted (and detecting that is a complicated problem in itself).

Fortunately the nice folks at MS decided to offer some help to the filters writers and developed some support APIs. They are covered in the MSDN pages "Tracking Per-Stream Context in a Legacy File System Filter Driver" (which is currently here) and "Tracking Per-File Context in a Legacy File System Filter Driver" (which is here). These APIs rely on the file system implementing support for the FSRTL_ADVANCED_FCB_HEADER structure. Please note that a file system is not required to implement this support but if it doesn't then it won't work with Filter Manager. Anyway, these APIs allow any kernel component (filter or not) to associate a context with an SCB and to be notified when the SCB itself is torn down. Please note that the SCB might not be torn down immediately when the last FILE_OBJECT for it is closed, because some file systems implement SCB caching and the filter might be able to benefit from this (benefit from it because it can keep its context and if someone opens a new handle to the same stream the filter's context is also cached).

There is another useful structure when implementing contexts, the RTL_GENERIC_TABLE (MSDN page currently here). A generic table is an OS structure that can be used as a general purpose hash, so that the filter doesn't need to implement their own. However, please note that it is implemented as a tree so if performance must be really good then a custom hash might still be necessary.

To wrap it up, in order for a filter to implement a similar scheme to FltMgr's contexts it can use the following scheme:
• Use OS support for stream contexts (FsRtlInsertPerStreamContext, FsRtlLookupPerStreamContext and so on)
• Use OS support for file contexts (FsRtlInsertPerFileContext, FsRtlLookupPerFileContext and so on)
• Implement a hash for per FILE_OBJECT context. Either use a straight hash or use a per Stream structure which includes a hash for FILE_OBJECTS for that stream (which is useful because the number of entries in each hash is much smaller so the RTL_GENERIC_TABLE might be a good fit).

Finally, I'd like to point out that any filter (legacy or mini) that implements its own streams (that completes an IRP_MJ_CREATE and puts something in FILE_OBJECT->FsContext) should implement support for FSRTL_ADVANCED_FCB_HEADER otherwise contexts won't work for those files and it might cause problems for other filters. This should be fairly easy to implement though following the MSDN documentation.

## Thursday, January 20, 2011

### About IRP_MJ_CREATE and minifilter design considerations - Part VI

I'm pretty much done with what I wanted to cover about IRP_MJ_CREATE. I'd just like to go through a couple more things that I think are important before closing this topic.

FILE_DELETE_ON_CLOSE create option

This flag sets a flag associated with the current FILE_OBJECT, in a file system structure associated with the FILE_OBJECT itself and not the stream. There is no way to query whether this flag was set after the fact. Once the FILE_OBJECT is cleaned up, the flag moves to the SCB (a per stream structure) and it can be queried using  IRP_MJ_QUERY_INFORMATION and FileStandardInformation. The same flag can be set on a stream by an IRP_MJ_SET_INFORMATION with the FileDispositionInformation information class. Please note that if at the time when the FILE_OBJECT that was created with FILE_DELETE_ON_CLOSE is closed there are no other FILE_OBJECTs for that same stream, then the flag will be moved to the SCB and then the stream will immediately be deleted, so there is no opportunity for a filter to query the flag or remove it. Filters that want to to be able to potentially clear the "delete intent" from a file can do something like remove the FILE_DELETE_ON_CLOSE flag from the CreateOptions and then in postCreate set it to the stream with an IRP_MJ_SET_INFORMATION. This is not exactly the same as FILE_DELETE_ON_CLOSE, but it's a pretty good approximation. It also allows the delete on close flag in the SCB to be queried and possibly reset at any time.

STATUS_REPARSE in postCreate

A filter can return STATUS_REPARSE in postCreate. It can do so if the create failed or even if it was successful, provided that the filter takes care of undoing what was done in the file system (see FltCancelFileOpen and IoCancelFileOpen).

FltGetFileNameInformation behavior

FltGetFileNameInformation can be called during a create, both in preCreate and postCreate. Calling FltGetFileNameInformation might result in the fltmgr actually opening the file if the file doesn't have a path (open by ID), but there should be no open to the actual file in any other case. If the caller is asking for a normalized path in preCreate, FltMgr will try to open the parent directory and enumerate its entries in order to get the long file name. However, if the file doesn't exist (if the IRP_MJ_CREATE is actually trying to create a file) then it is possible that even the normalized name contains a short name as the final component (for example, if a filter is trying to create "/Foo/Ba~1.txt" then the normalized path will have Ba~1 as a final component; everything else in the path should be normalized though). However, there is a really big performance hit associated with requesting a normalized name in preCreate and so it should be avoided if possible (might not be possible in all cases, but perhaps it can be moved to postCreate or maybe the opened name will do). The perf hit is much smaller when getting a normalized path in postCreate, primarily because of the cache.

Contexts in preCreate

Since before the IRP_MJ_CREATE hits the file system the FILE_OBJECT is not associated with a file system stream, any mechanism that requires the SCB will not function. For minifilters this includes file related contexts (stream, streamhandle, file), the name cache (hence the perf penalty when getting a normalized name) and possibly other things. Please note that because of renames, even if a minifilter opens a file with the same name in preCreate and then lets the IRP_MJ_CREATE continue there is no guarantee that they're going to be opening the same stream. This is one reason security products should not attempt to scan files in preCreate (because there is no way to guarantee that what they scanned will be the stream that original IRP_MJ_CREATE will end up opening).

Opening a new FILE_OBJECT for an existing FILE_OBJECT

Sometimes a minifilter needs a new handle to the same FILE_OBJECT that a user has opened(FO1). Rather than getting the file name of the user's file and then calling FltCreateFile with that name, a minifilter can simply call create (IoCreateFile, ZwCreateFile ) with an empty name and use a handle to FO1 as the RootDirectory handle when setting up the OBJECT_ATTRIBUTES structure. This results in an IRP_MJ_CREATE where the FILE_OBJECT->FileName is empty and FILE_OBJECT->RelatedFileObject is FO1  and the file system will simply open a new handle to the same stream. This is a much better approach because it doesn't require using file names so there is no hit associated with FltGetFileNameInformation and also it is not vulnerable to renames of the original file. Of course, the user's FILE_OBJECT must be opened.

A pretty interesting behavior of file systems is that when an IRP_MJ_CREATE creates a read-only file the handle associated with that IRP_MJ_CREATE can be used to write to the file. This is interesting because if a filters tries to open the same file the user has opened in postCreate  and it is using the same parameters, it doesn't necessarily mean it will get the same rights, depending on whether the file existed before that IRP_MJ_CREATE or not.

FileObject->FileName is not meaningful after a successful IRP_MJ_CREATE

Because FileObject->FileName is only a vehicle to pass the name information from the IO manager to the file system, once the IRP_MJ_CREATE actually reaches a file system and a stream is opened, it should be ignored. This is because once the FILE_OBJECT is associated with an SCB, the name of that SCB can immediately change (a rename on another FILE_OBJECT for that SCB) and the entity that knows the name of the SCB at all times is the file system, but there is no mechanism for a file system to go in and update all FILE_OBJECTs associated with an SCB when the name changes. As a side note, I still believe that the FILE_OBJECT structure would have been better off without a FileName member and that the FileName should have been a member of the IRP_MJ_CREATE.

FILE_OBJECT->RelatedFileObject is not recursive

In an IRP_MJ_CREATE if FILE_OBJECT->RelatedFileObject is not null, then that FILE_OBJECT (RFO) cannot also have a FILE_OBJECT->RelatedFileObject. However, since the RelatedFileObject has already been opened it means one cannot rely on its FileName member (see above) and so whether it had a RelatedFileObject or not is irrelevant.

SL_OPEN_TARGET_DIRECTORY in preCreate

SL_OPEN_TARGET_DIRECTORY means that this create is actually targeted at the parent directory of the FILE_OBJECT->FileName path ( if FILE_OBJECT->FileName is "\foo\bar\baz" and SL_OPEN_TARGET_DIRECTORY is set then the SCB that will be associated with this FILE_OBJECT is for "\foo\bar"). FltGetFileNameInformation in preCreate is aware of this and it will actually return the name "\foo\bar". So if a minifilter needs to get the full path even when SL_OPEN_TARGET_DIRECTORY is set, they must remove this flag before calling FltGetFileNameInformation (and set it back before sending the IRP_MJ_CREATE down, of course).

## Thursday, January 13, 2011

### About IRP_MJ_CREATE and minifilter design considerations - Part V

In this post I want to talk about the IoCreateStreamFileObject API, as well as the difference between what is normally referred to as an FCB or an SCB and a FILE_OBJECT. But before that I'm going to rant a bit about my favorite subject, namespaces :).

Anyway, now that my rant is over, let's get back to IRP_MJ_CREATE. IRP_MJ_CREATE is the type of protocol where the IO manager tells the file system to open something and it also tells it the token it's going to use to refer to it in the future, the FILE_OBJECT. IO manager allocates a FILE_OBJECT that it will use to identify that stream and it needs to tell the file system about it. However, the file system also needs a context associated with that stream. It needs to know where the stream is located on disk, whether it is encrypted and so on. This is all very specific to the file system (clearly only a file system that supports encryption will need to know whether the stream is encrypted) and so there is no general purpose structure that all the file systems can use. Therefore each file system needs to keep its own internal structure for streams it opens and it needs to learn how to associate the FILE_OBJECT with its internal structure. It could implement a key-value structure where the FILE_OBJECT is the key and the file system internal structure is the value, but that would potentially be time consuming (the key lookup would need to happen for each operation). The decision was made to allocate a field in the FILE_OBJECT to be used by the file system to store this context, and that field is FILE_OBJECT->FsContext. The way the protocol works is that during IRP_MJ_CREATE FsContext is NULL and when the request reaches the file system, the file system will allocate its internal context and store a pointer to it in the FsContext. In other words IRP_MJ_CREATE is a mechanism that allows a file system to initialize its fields in the FILE_OBJECT.

So now that I mentioned that the IRP_MJ_CREATE can be seen as way to associate an a FILE_OBJECT with the SCB, let's talk about IoCreateStreamFileObject. A file system needs to track a lot of metadata. Some of it is related to user files (names, directory structures, permissions) and some is internal to the file system (transaction log, journal). This metadata can be split into logical units (the journal, the transaction log, the directory information, the permissions hash) and they must be dynamic. So it makes sense that a file system would treat most of this data as if they were user files. In this way it can reuse a lot of the code it already has implemented for reads and writes and so on. So when user does something like enumerate a directory, the file system can simply say "open the stream associated with the directory and read it all" using its internal functions that deal with converting file offsets into disk offsets and so on. However, reading metadata from disk for every user operation will make things very slow so a file system might prefer to cache things. It could of course allocate memory and remember which data was more frequently accessed and it might decide that if there is memory pressure the size of the cache should decrease but there is already a component in the system that does all that, the Cache Manager. So if the file system could use the Cache Manager then it could benefit from all the logic in there. But the Cache Manager is a system component and it doesn't know anything about SCBs (which are specific to each file system). So in order to keep things generic, it would be nice if the file system could use FILE_OBJECTs for its internal streams and then use all the facilities in the OS that use FILE_OBJECTs.

Of course, creating FILE_OBJECTs for internal streams is a noble goal, but how does one create such FILE_OBJECTs ? Allocating memory and initializing the structure by hand is just asking for trouble, since the structure is different between OS versions, and if we don't call ObCreateObject (which is not documented) then we're probably going to break some OB integration anyway. One possible solution would be to call IoCreateFile from the file system. However, not all internal streams have names and while the file system could do something like use ECPs or allocate GUIDs as file names and use those as keys, this would still be  a pretty ugly hack. Moreover, as we've discussed above, the IRP_MJ_CREATE is nothing but a way for the IO manager to tell the file system which internal stream to associate with a FILE_OBJECT, but since the file system already knows exactly which stream is wants to open, why even have an IRP_MJ_CREATE ? What a file system needs is a way to request a new FILE_OBJECT from the IO manager, which it then can associate with the right internal structure. IoCreateStreamFileObjectEx is an API to do just that. There are some examples in the WDK about how to call it and when it should be used.

This is a brief overview of what IoCreateStreamFileObjectEx does (IoCreateStreamFileObject simply calls IoCreateStreamFileObjectEx with a NULL handle).:
1. Call ObCreateObject to create the actual FILE_OBJECT
2. Setup a minimal set of the FILE_OBJECT fields
3. Set the FO_STREAM_FILE flag in the FILE_OBJECT.
4. Call ObInsertObjectEx to create a handle for the FILE_OBJECT
5. If the caller passed in a NULL pointer, close the handle.

From a file system filtering perspective, the implication is that filters should expect IO and possibly other operations on FILE_OBJECTs that they haven't seen an IRP_MJ_CREATE for. Depending on the filter's functionality the filter might want to ignore such FILE_OBJECTS. Since all stream FILE_OBJECTs have the FO_STREAM_FILE flag set in the FILE_OBJECT->Flags, checking for this flag is a pretty reliable way to identify such FILE_OBJECTs.

## Thursday, January 6, 2011

### About IRP_MJ_CREATE and minifilter design considerations - Part IV

It is pretty common for a miniflter to attempt to redirect an open to a file to a different file. For example, when the user is trying to open "c:\temp\foo.txt" they would instead end up opening "d:\bar.txt". By far the easiest way to achieve this behavior is by using STATUS_REPARSE. There are some disadvantages to this method as compared to other methods (that I plan to discuss in a future post in this series) but it is widely used nevertheless because it is quite simple.

As described in one of the previous posts, ObpLookupObjectName is the OB function that is responsible for resolving a name to an actual OB object. Because the OB namespace design requires symbolic links, there needed to be a mechanism to implement this functionality and STATUS_REPARSE happens to be that mechanism. The whole OB symbolic link resolution code is encapsulated into just one function, ObpLookupObjectName. The contract is that when the object found has a parse routine (symbolic link objects, file objects and device objects do), the OB manager will call that function with a pointer to a UNICODE_STRING describing the path. If the parse procedure returns STATUS_REPARSE then the function must restart the lookup with the new name supplied in the UNICODE_STRING.

This is a pretty clean mechanism and it is fairly easy to use by things plugging into the OB namespace, like file systems and file system filters. In the latest WDK there is a minifilter sample that is an example of how a minifilter can use STATUS_REPARSE to redirect the create to a file. Here are the steps in the function SimRepPreCreate(which are similar to most other filters doing this):
1. Eliminate cases where we don't want to reparse (paging files and volume opens)
2. Get the name of the file that the user is trying to open (please note that this is a preCreate callback and the request is for the opened name, which is much faster to get in preCreate than the normalized name)
3. Replace the name in the FILE_OBJECT (allocate a new buffer for the UNICODE_STRING if needed)
4. Return STATUS_REPARSE

This is a list of things worth mentioning about using STATUS_REPARSE as a redirection mechanism:
• ObpLookupObjectName will use the name that is returned in the FILE_OBJECT->FileName as a completely new name. This means that a filter (or a file system) must return a full path to the new file, complete with a device path (since this is an OB name after all). This can be seen in the the SimRepPreCreate function where the new name is built. However, SimRep reparses a path to a different path on the same volume, so the device name is the same. If the new path needs to be on a different device, the name in the FILE_OBJECT can either be something like "\Device\HarddiskVolume1\bar.txt" or even "\??\D:\bar.txt". This last path with the device name written as "\??\D:" works because ObpLookupObjectName restarts the lookup before the point where it resolves the "\??\" shortcut.
• Sometimes the FILE_OBEJCT contains a RelatedFileObject member and the name in the FILE_OBJECT is relative to that RelatedFileObject. If STATUS_REPARSE is returned then the IO manager will simply ignore the RelatedFileObject from that point on and assume that the path that was returned in the FILE_OBJECT is a full path. This also simplifies things for the filter writer since it means that they don't need to care about RelatedFileObjects at all when returning STATUS_REPARSE, the path is always a full path.
• It is possible to specify a DEVICE_OBJECT hint when calling IoCreateFileSpecifyDeviceObjectHint or IoCreateFileEx (so only kernel mode callers). When this happens the device specified is stored in the OPEN_PACKET and it is evaluated in IopParseDevice (by calling nt!IopCheckTopDeviceHint). This will fail if the path returned in the STATUS_REPARSE points to a different device. Moreover, the IRP_MJ_CREATE will be sent to the device specified in the device hint, so the file name must be a name that is meaningful at that layer, which might be different from the name at the top of the file system stack on that volume. This isn't generally a problem since the filter must be below that device in order to even see the request so it can know what the file system namespace looks like below the hint device level.
• Another request that comes up a lot is how to track a request that a filter reparsed. For example, if my filter returns STATUS_REPARSE, I might not want to process that request when it comes down again (assuming that I reparse to another place that my filter filters as well). This is pretty complicated to do because all this happens before a stream is opened in the file system so stream-based contexts (like FltMgr contexts) will not work. In fact, in order for a filter to be able to track a create they must find a variable with the following properties:
1. The variable must be accessible to the filter (this is pretty obvious but important nevertheless)…
2. The variable must persist (keep either its value or its address the same) for the same call to IopCreateFile.
3. The variable must be unique enough so that there is no chance of confusion between two IRP_MJ_CREATEs that happen at the same time.
4. The variable must be changed (freed or released or it must get a new value) at the end of the IopCreateFile scope, so that new calls to IopCreateFile will not get the same value (otherwise a filter might record the variable value, return STATUS_REPARSE and then it might see a completely unrelated future create with the same value and assume it is the reparse it's been waiting for all along).
5. The variable must be torn down cleanly even if the filter never receives it back, because there are no guarantees that after returning STATUS_REPARSE there will actually be another IRP_MJ_CREATE (maybe there was a device hint and the reparse was for a different stack, or maybe the maximum number of reparses was hit and so on).

So it is easy to see that most variables that a filter has access to won't work:
• the IRP doesn't work because it isn't persistent (it is freed at the end of the IopParseDevice call, so subsequent calls to IopParseDevice will likely get a new IRP)
• the FILE_OBJECT doesn't work because its scope is also the IopParseDevice call.
• the OPEN_PACKET would be nice, but it might be the same between different calls, and besides a filter doesn't have access to it anyway.
• the thread will be the same, but it is not unique enough. There may a new completely unrelated create sent down on this same thread.
• Finally, allocating some filter structure and sticking a pointer to it into an unused field someplace won't work because if there is never a new IRP sent to the filter then the structure will be leaked.

So for Vista Microsoft decided to do something about this and introduced a bunch of new calls and a couple of new structures. In the new model, in the OPEN_PACKET there is a new structure nt!_IO_DRIVER_CREATE_CONTEXT:
1: kd> dt
nt!_OPEN_PACKET DriverCreateContext.   +0x05c DriverCreateContext  :       +0x000 Size                 : Int2B      +0x004 ExtraCreateParameter : Ptr32
_ECP_LIST      +0x008 DeviceObjectHint     : Ptr32 Void      +0x00c TxnParameters        : Ptr32 _TXN_PARAMETER_BLOCK1: kd> dt nt!_ECP_LIST   +0x000 Signature        : Uint4B   +0x004 Flags            : Uint4B   +0x008 EcpList          : _LIST_ENTRY


In this structure there is another structure, the _ECP_LIST, which stores an unlimited number of other structures called ECPs (extra create parameters). These structures (in fact the list containing the structure) can be passed in as a parameter to IoCreateFileEx and, more importantly for our case, can be added by filters (both legacy and minifilters) to an existing IRP_MJ_CREATE request (there are two largely similar sets of APIs, FsRtl and Flt, with functions such as FsRtlAllocateExtraCreateParameter and FltAllocateExtraCreateParameter, respectively). The guarantee is that these ECPs will be passed to all IRP_MJ_CREATE IRPs associated with a call to IoCreateFile. They are guaranteed to be unique because each ECP is identified by a GUID and are also guaranteed to be torn down along with the OPEN_PACKET, at the end of the create operation.
This mechanism allows the filter (legacy or mini) to allocate a new structure, associate it with an IRP_MJ_CREATE for which it wants to return STATUS_REPARSE, and then be able to know for any subsequent IRP_MJ_CREATE if it is related to this create it has already processed.
Incidentally, another good use for this mechanism is to send some additional information with a create request to a filter or file system. A scenario where this might be useful is when a more complex product that also has a filter component (like an anti-virus product) wants to open a file but would like to tell the filter that this create originated from the product so perhaps the usual rules (for example scanning the file) might not apply.
There two downsides to this mechanism. It is only available to kernel mode callers (you cannot pass in an ECP or an ECP list to NtCreateFile) and it is not available for XP...