Of Filesystems And Other Demons

Thursday, January 20, 2011

About IRP_MJ_CREATE and minifilter design considerations - Part VI

I'm pretty much done with what I wanted to cover about IRP_MJ_CREATE. I'd just like to go through a couple more things that I think are important before closing this topic.

FILE_DELETE_ON_CLOSE create option

This create option sets a flag associated with the current FILE_OBJECT, in a file system structure that belongs to the FILE_OBJECT itself and not to the stream. There is no way to query whether this flag was set after the fact. Once the FILE_OBJECT is cleaned up, the flag moves to the SCB (a per-stream structure) and it can be queried using IRP_MJ_QUERY_INFORMATION and FileStandardInformation. The same flag can be set on a stream by an IRP_MJ_SET_INFORMATION with the FileDispositionInformation information class. Please note that if there are no other FILE_OBJECTs for the same stream at the time the FILE_OBJECT that was created with FILE_DELETE_ON_CLOSE is closed, then the flag will be moved to the SCB and the stream will immediately be deleted, so there is no opportunity for a filter to query the flag or remove it. Filters that want to be able to clear the "delete intent" from a file can remove the FILE_DELETE_ON_CLOSE flag from the CreateOptions in preCreate and then, in postCreate, set it on the stream with an IRP_MJ_SET_INFORMATION. This is not exactly the same as FILE_DELETE_ON_CLOSE, but it's a pretty good approximation. It also allows the delete on close flag in the SCB to be queried and possibly reset at any time.
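To make the flag's lifetime easier to follow, here is a toy user-mode model of the behavior described above. It is only a sketch: ToyScb, ToyFileObject and the toy_* functions are names I made up for illustration, they are not the real file system structures or kernel APIs, and the model ignores everything except the delete flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins (hypothetical); NOT the real kernel structures. */
typedef struct ToyScb {
    int  open_count;      /* FILE_OBJECTs not yet cleaned up for this stream */
    bool delete_pending;  /* per-stream flag, the one that can be queried */
    bool deleted;         /* set once the stream has been removed */
} ToyScb;

typedef struct ToyFileObject {
    ToyScb *scb;
    bool    delete_on_close;  /* per-FILE_OBJECT flag, cannot be queried */
} ToyFileObject;

/* Models a successful create: binds the FILE_OBJECT to the stream. */
void toy_open(ToyFileObject *fo, ToyScb *scb, bool delete_on_close)
{
    assert(fo != NULL && scb != NULL);
    fo->scb = scb;
    fo->delete_on_close = delete_on_close;
    scb->open_count++;
}

/* Models IRP_MJ_SET_INFORMATION with FileDispositionInformation. */
void toy_set_disposition(ToyFileObject *fo, bool delete_file)
{
    assert(fo != NULL && fo->scb != NULL);
    fo->scb->delete_pending = delete_file;
}

/* Models cleanup: the per-FILE_OBJECT flag moves to the stream, and the
 * stream is deleted if this was the last FILE_OBJECT for it. */
void toy_cleanup(ToyFileObject *fo)
{
    ToyScb *scb;

    assert(fo != NULL && fo->scb != NULL);
    scb = fo->scb;
    if (fo->delete_on_close) {
        scb->delete_pending = true;
    }
    scb->open_count--;
    if (scb->open_count == 0 && scb->delete_pending) {
        scb->deleted = true;
    }
    fo->scb = NULL;
}
```

In this model, a second FILE_OBJECT on the same stream can still call toy_set_disposition(fo, false) after the first one is cleaned up, which is the window a filter gets. With a single FILE_OBJECT opened with delete_on_close there is no such window, because the flag moves and the stream is deleted inside the same toy_cleanup call.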

STATUS_REPARSE in postCreate

A filter can return STATUS_REPARSE in postCreate. It can do so if the create failed or even if it was successful, provided that the filter takes care of undoing what was done in the file system (see FltCancelFileOpen and IoCancelFileOpen).

FltGetFileNameInformation behavior

FltGetFileNameInformation can be called during a create, both in preCreate and postCreate. Calling FltGetFileNameInformation might result in FltMgr actually opening the file if the file doesn't have a path (open by ID), but there should be no open to the actual file in any other case. If the caller is asking for a normalized path in preCreate, FltMgr will try to open the parent directory and enumerate its entries in order to get the long file name. However, if the file doesn't exist (if the IRP_MJ_CREATE is actually trying to create a file) then it is possible that even the normalized name contains a short name as the final component (for example, if a filter is trying to create "/Foo/Ba~1.txt" then the normalized path will have Ba~1 as a final component; everything else in the path should be normalized though). There is also a really big performance hit associated with requesting a normalized name in preCreate, so it should be avoided if possible (this might not be possible in all cases, but perhaps the request can be moved to postCreate, or maybe the opened name will do). The perf hit is much smaller when getting a normalized path in postCreate, primarily because of the name cache.

Contexts in preCreate

Since the FILE_OBJECT is not associated with a file system stream before the IRP_MJ_CREATE hits the file system, any mechanism that requires the SCB will not function. For minifilters this includes file related contexts (stream, streamhandle, file), the name cache (hence the perf penalty when getting a normalized name) and possibly other things. Please note that because of renames, even if a minifilter opens a file with the same name in preCreate and then lets the IRP_MJ_CREATE continue, there is no guarantee that they're going to be opening the same stream. This is one reason security products should not attempt to scan files in preCreate (because there is no way to guarantee that what they scanned will be the stream that the original IRP_MJ_CREATE will end up opening).

Opening a new FILE_OBJECT for an existing FILE_OBJECT

Sometimes a minifilter needs a new FILE_OBJECT for the same stream that a user has opened (FO1). Rather than getting the file name of the user's file and then calling FltCreateFile with that name, a minifilter can simply call create (IoCreateFile, ZwCreateFile) with an empty name and use a handle to FO1 as the RootDirectory handle when setting up the OBJECT_ATTRIBUTES structure. This results in an IRP_MJ_CREATE where the FILE_OBJECT->FileName is empty and FILE_OBJECT->RelatedFileObject is FO1, and the file system will simply open a new handle to the same stream. This is a much better approach because it doesn't require using file names, so there is no hit associated with FltGetFileNameInformation, and it is not vulnerable to renames of the original file. Of course, the user's FILE_OBJECT must already be open.

Writing to read-only files

A pretty interesting behavior of file systems is that when an IRP_MJ_CREATE creates a read-only file, the handle associated with that IRP_MJ_CREATE can be used to write to the file. This is interesting because if a filter tries to open the same file the user has opened in postCreate and it is using the same parameters, it won't necessarily get the same rights, depending on whether the file existed before that IRP_MJ_CREATE or not.

FileObject->FileName is not meaningful after a successful IRP_MJ_CREATE

Because FileObject->FileName is only a vehicle to pass the name information from the IO manager to the file system, once the IRP_MJ_CREATE actually reaches a file system and a stream is opened, it should be ignored. This is because once the FILE_OBJECT is associated with an SCB, the name of that SCB can immediately change (a rename on another FILE_OBJECT for that SCB) and the entity that knows the name of the SCB at all times is the file system, but there is no mechanism for a file system to go in and update all FILE_OBJECTs associated with an SCB when the name changes. As a side note, I still believe that the FILE_OBJECT structure would have been better off without a FileName member and that the FileName should have been a member of the IRP_MJ_CREATE.

FILE_OBJECT->RelatedFileObject is not recursive

In an IRP_MJ_CREATE if FILE_OBJECT->RelatedFileObject is not null, then that FILE_OBJECT (RFO) cannot also have a FILE_OBJECT->RelatedFileObject. However, since the RelatedFileObject has already been opened it means one cannot rely on its FileName member (see above) and so whether it had a RelatedFileObject or not is irrelevant.

SL_OPEN_TARGET_DIRECTORY in preCreate

SL_OPEN_TARGET_DIRECTORY means that this create is actually targeted at the parent directory of the FILE_OBJECT->FileName path (if FILE_OBJECT->FileName is "\foo\bar\baz" and SL_OPEN_TARGET_DIRECTORY is set then the SCB that will be associated with this FILE_OBJECT is for "\foo\bar"). FltGetFileNameInformation in preCreate is aware of this and it will actually return the name "\foo\bar". So if a minifilter needs to get the full path even when SL_OPEN_TARGET_DIRECTORY is set, it must remove this flag before calling FltGetFileNameInformation (and set it back before sending the IRP_MJ_CREATE down, of course).
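The naming rule and the "clear the flag, query, restore the flag" workaround can be sketched with a small user-mode model. Everything here is an assumption made for illustration: TOY_OPEN_TARGET_DIRECTORY is an arbitrary stand-in bit, and toy_get_name only mimics the behavior described above; it is not FltGetFileNameInformation and it uses plain C strings instead of UNICODE_STRINGs.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the SL_OPEN_TARGET_DIRECTORY flag bit. */
#define TOY_OPEN_TARGET_DIRECTORY 0x04u

/* Copies the parent directory of `path` into `out`.
 * Returns 0 on success, -1 if there is no separator or `out` is too small. */
int toy_target_directory(const char *path, char *out, size_t out_size)
{
    const char *last;
    size_t len;

    assert(path != NULL && out != NULL);
    last = strrchr(path, '\\');
    if (last == NULL) {
        return -1;              /* not a full path */
    }
    len = (size_t)(last - path);
    if (len == 0) {
        len = 1;                /* the parent of "\baz" is the root "\" */
    }
    if (len + 1 > out_size) {
        return -1;
    }
    memcpy(out, path, len);
    out[len] = '\0';
    return 0;
}

/* Toy name query: returns the parent directory when the flag is set,
 * and the full path otherwise. */
int toy_get_name(const char *path, unsigned flags, char *out, size_t out_size)
{
    size_t len;

    assert(path != NULL && out != NULL);
    if (flags & TOY_OPEN_TARGET_DIRECTORY) {
        return toy_target_directory(path, out, out_size);
    }
    len = strlen(path);
    if (len + 1 > out_size) {
        return -1;
    }
    memcpy(out, path, len + 1);
    return 0;
}

/* The workaround: temporarily clear the flag, query the name, then
 * restore the flag so the create continues down unchanged. */
int toy_get_full_name(const char *path, unsigned *flags,
                      char *out, size_t out_size)
{
    unsigned saved;
    int rc;

    assert(flags != NULL);
    saved = *flags;
    *flags &= ~TOY_OPEN_TARGET_DIRECTORY;
    rc = toy_get_name(path, *flags, out, out_size);
    *flags = saved;             /* restore on both success and failure */
    return rc;
}
```

The one detail worth copying into a real filter is that the flag is restored on every path out of the function, including the failure path.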

Thursday, January 13, 2011

About IRP_MJ_CREATE and minifilter design considerations - Part V

In this post I want to talk about the IoCreateStreamFileObject API, as well as the difference between what is normally referred to as an FCB or an SCB and a FILE_OBJECT. But before that I'm going to rant a bit about my favorite subject, namespaces :).

Any stateful communication protocol needs a way to identify the connection once it has been established. In a lot of cases the requestor of the service initiates the connection and receives back a token that identifies the connection to the provider of the service. This token belongs to the namespace of the service provider. This is the case with file handles (a caller requests a file to be opened and they get back a token which they can then use when requesting reads and writes), network connections (where the token is a socket), web pages (the user logs on to a server, receives a session token or a cookie) and many other things. However, the protocol could also go the other way around, where the user could create a token (which would then belong to the user's namespace) and pass it in with the session initiation request. In such a scenario, opening a file would be more like "hey file system, I plan to read from a file and I will use the value 123 for it, so whenever you see value 123 know that I'm talking about that file". So as you can see, an important question when designing a protocol is who should learn the other's context. Should it be the service provider or the service requestor? It's pretty easy with people because they can never remember someone else's context (a token that someone else gives them), so in that case whichever end of the protocol needs human interaction should be the one that generates the token. For example, each file in a file system can be opened by ID, but humans still prefer names even though they could in theory learn the ID. Or when browsing a web page people don't remember the URL at the top for each page they visit, even for web pages where that URL doesn't change. In fact the reason Google is such a big company is because they figured out that people don't remember URLs even when they aren't random characters, but instead people remember key phrases about the page they're looking for (that's their token, not the URL).
Going even further, one could argue that the whole history of computer science is the history of building contexts for people. Assembly language was a way to associate names that were meaningful to people with memory locations (so instead of the human operator remembering the address, which is the machine's context, the human operator would remember a name and the assembler would convert that name into the address). A file system is a way to associate disk locations with a name and so on.

Anyway, now that my rant is over, let's get back to IRP_MJ_CREATE. IRP_MJ_CREATE is the type of protocol where the IO manager tells the file system to open something and it also tells it the token it's going to use to refer to it in the future, the FILE_OBJECT. IO manager allocates a FILE_OBJECT that it will use to identify that stream and it needs to tell the file system about it. However, the file system also needs a context associated with that stream. It needs to know where the stream is located on disk, whether it is encrypted and so on. This is all very specific to the file system (clearly only a file system that supports encryption will need to know whether the stream is encrypted) and so there is no general purpose structure that all the file systems can use. Therefore each file system needs to keep its own internal structure for streams it opens and it needs to learn how to associate the FILE_OBJECT with its internal structure. It could implement a key-value structure where the FILE_OBJECT is the key and the file system internal structure is the value, but that would potentially be time consuming (the key lookup would need to happen for each operation). The decision was made to allocate a field in the FILE_OBJECT to be used by the file system to store this context, and that field is FILE_OBJECT->FsContext. The way the protocol works is that during IRP_MJ_CREATE FsContext is NULL and when the request reaches the file system, the file system will allocate its internal context and store a pointer to it in the FsContext. In other words IRP_MJ_CREATE is a mechanism that allows a file system to initialize its fields in the FILE_OBJECT.
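Here is a toy user-mode model of that protocol, just to make the idea concrete. All the names (ToyFileSystem, ToyStreamContext, toy_fs_create and so on) are hypothetical and the "file system" is a fixed array; the only point is that the lookup happens once, at create time, and every later operation reaches the internal structure directly through the pointer stored in the file object.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_STREAMS 8

/* Hypothetical per-stream structure private to the toy file system. */
typedef struct ToyStreamContext {
    char name[32];
    long disk_offset;   /* where the stream "lives" on disk */
    int  in_use;
} ToyStreamContext;

typedef struct ToyFileSystem {
    ToyStreamContext streams[TOY_MAX_STREAMS];
} ToyFileSystem;

/* The token allocated by the caller (the IO manager's role). */
typedef struct ToyFileObject {
    void *fs_context;   /* plays the role of FILE_OBJECT->FsContext */
} ToyFileObject;

/* Models create: find or allocate the internal structure for `name` and
 * store a pointer to it in the file object. Returns 0 on success. */
int toy_fs_create(ToyFileSystem *fs, ToyFileObject *fo, const char *name)
{
    size_t i;
    ToyStreamContext *free_slot = NULL;

    assert(fs != NULL && fo != NULL && name != NULL);
    assert(fo->fs_context == NULL);   /* the context is NULL before create */
    if (strlen(name) >= sizeof fs->streams[0].name) {
        return -1;
    }
    for (i = 0; i < TOY_MAX_STREAMS; i++) {
        ToyStreamContext *s = &fs->streams[i];
        if (s->in_use && strcmp(s->name, name) == 0) {
            fo->fs_context = s;       /* the stream is already open */
            return 0;
        }
        if (!s->in_use && free_slot == NULL) {
            free_slot = s;
        }
    }
    if (free_slot == NULL) {
        return -1;
    }
    strcpy(free_slot->name, name);
    free_slot->disk_offset = 4096L * (long)(free_slot - fs->streams + 1);
    free_slot->in_use = 1;
    fo->fs_context = free_slot;
    return 0;
}

/* Any later operation: no key lookup, just follow the pointer. */
long toy_fs_disk_offset(const ToyFileObject *fo)
{
    const ToyStreamContext *s;

    assert(fo != NULL && fo->fs_context != NULL);
    s = (const ToyStreamContext *)fo->fs_context;
    return s->disk_offset;
}
```

Two file objects created for the same name end up with the same fs_context value, which is the many-FILE_OBJECTs-to-one-stream relationship that the rest of this post relies on. The model finds streams by name only to keep things short; it says nothing about how a real file system locates them.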

This internal context that identifies a stream to the file system is traditionally called an FCB (file control block) because there used to be only one stream per file. However, when file systems added the ability to associate multiple streams of data with a file (alternate data streams), the file system needed to be aware of the distinction between the stream and the file, and so in such file systems (NTFS and UDFS are examples of file systems that support alternate data streams, or ADS) the FCB actually means "context associated with the file" and what traditionally used to be the FCB is now called an SCB (stream control block). On a final note about SCBs, it is worth mentioning that they aren't really completely private and that in fact the OS cares about some information associated with the SCB. As such, all SCBs should start with an FSRTL_ADVANCED_FCB_HEADER. So for any file system developers, please make sure to implement this. I've done it for file systems that didn't support it and it can probably be done in a couple of hours. Without this your file system won't support FltMgr and minifilters and will probably break some legacy filters as well (there are other negative side effects too).

So now that I mentioned that the IRP_MJ_CREATE can be seen as a way to associate a FILE_OBJECT with the SCB, let's talk about IoCreateStreamFileObject. A file system needs to track a lot of metadata. Some of it is related to user files (names, directory structures, permissions) and some is internal to the file system (transaction log, journal). This metadata can be split into logical units (the journal, the transaction log, the directory information, the permissions hash) and they must be dynamic. So it makes sense that a file system would treat most of this data as if they were user files. In this way it can reuse a lot of the code it already has implemented for reads and writes and so on. So when a user does something like enumerate a directory, the file system can simply say "open the stream associated with the directory and read it all" using its internal functions that deal with converting file offsets into disk offsets and so on. However, reading metadata from disk for every user operation will make things very slow, so a file system might prefer to cache things. It could of course allocate memory and remember which data was more frequently accessed, and it might decide that if there is memory pressure the size of the cache should decrease, but there is already a component in the system that does all that, the Cache Manager. So if the file system could use the Cache Manager then it could benefit from all the logic in there. But the Cache Manager is a system component and it doesn't know anything about SCBs (which are specific to each file system). So in order to keep things generic, it would be nice if the file system could use FILE_OBJECTs for its internal streams and then use all the facilities in the OS that use FILE_OBJECTs.

Of course, creating FILE_OBJECTs for internal streams is a noble goal, but how does one create such FILE_OBJECTs? Allocating memory and initializing the structure by hand is just asking for trouble, since the structure is different between OS versions, and if we don't call ObCreateObject (which is not documented) then we're probably going to break some OB integration anyway. One possible solution would be to call IoCreateFile from the file system. However, not all internal streams have names and while the file system could do something like use ECPs or allocate GUIDs as file names and use those as keys, this would still be a pretty ugly hack. Moreover, as we've discussed above, the IRP_MJ_CREATE is nothing but a way for the IO manager to tell the file system which internal stream to associate with a FILE_OBJECT, but since the file system already knows exactly which stream it wants to open, why even have an IRP_MJ_CREATE? What a file system needs is a way to request a new FILE_OBJECT from the IO manager, which it then can associate with the right internal structure. IoCreateStreamFileObjectEx is an API to do just that. There are some examples in the WDK about how to call it and when it should be used.

This is a brief overview of what IoCreateStreamFileObjectEx does (IoCreateStreamFileObject simply calls IoCreateStreamFileObjectEx with a NULL handle):

1. Call ObCreateObject to create the actual FILE_OBJECT.
2. Set up a minimal set of the FILE_OBJECT fields.
3. Set the FO_STREAM_FILE flag in the FILE_OBJECT.
4. Call ObInsertObjectEx to create a handle for the FILE_OBJECT.
5. If the caller passed in a NULL pointer, close the handle.

From a file system filtering perspective, the implication is that filters should expect IO and possibly other operations on FILE_OBJECTs that they haven't seen an IRP_MJ_CREATE for. Depending on the filter's functionality, the filter might want to ignore such FILE_OBJECTs. Since all stream FILE_OBJECTs have the FO_STREAM_FILE flag set in the FILE_OBJECT->Flags, checking for this flag is a pretty reliable way to identify such FILE_OBJECTs.

Thursday, January 6, 2011

About IRP_MJ_CREATE and minifilter design considerations - Part IV

It is pretty common for a minifilter to attempt to redirect an open of one file to a different file. For example, when the user is trying to open "c:\temp\foo.txt" they would instead end up opening "d:\bar.txt". By far the easiest way to achieve this behavior is by using STATUS_REPARSE. There are some disadvantages to this method as compared to other methods (that I plan to discuss in a future post in this series) but it is widely used nevertheless because it is quite simple.

As described in one of the previous posts, ObpLookupObjectName is the OB function that is responsible for resolving a name to an actual OB object. Because the OB namespace design requires symbolic links, there needed to be a mechanism to implement this functionality and STATUS_REPARSE happens to be that mechanism. The whole OB symbolic link resolution code is encapsulated into just one function, ObpLookupObjectName. The contract is that when the object found has a parse routine (symbolic link objects, file objects and device objects do), the OB manager will call that function with a pointer to a UNICODE_STRING describing the path. If the parse procedure returns STATUS_REPARSE then the function must restart the lookup with the new name supplied in the UNICODE_STRING.

This is a pretty clean mechanism and it is fairly easy to use by things plugging into the OB namespace, like file systems and file system filters. In the latest WDK there is a minifilter sample that is an example of how a minifilter can use STATUS_REPARSE to redirect the create to a file. Here are the steps in the function SimRepPreCreate (which are similar to most other filters doing this):

1. Eliminate cases where we don't want to reparse (paging files and volume opens)
2. Get the name of the file that the user is trying to open (please note that this is a preCreate callback and the request is for the opened name, which is much faster to get in preCreate than the normalized name)
3. Replace the name in the FILE_OBJECT (allocate a new buffer for the UNICODE_STRING if needed)
4. Return STATUS_REPARSE
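The name replacement in the steps above can be sketched in user mode as a prefix swap. This is my own toy model, not the SimRep code: toy_redirect, the ToyPreCreateResult values and the idea of a single old-prefix to new-prefix mapping are assumptions made for illustration, and it works on plain C strings with a case-sensitive compare rather than on UNICODE_STRINGs.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef enum ToyPreCreateResult {
    TOY_PASS_THROUGH,   /* name is outside the mapping, leave it alone */
    TOY_REPARSE,        /* name replaced, a filter would return STATUS_REPARSE */
    TOY_ERROR           /* the new name does not fit in the buffer */
} ToyPreCreateResult;

/* Rewrites `name` in place if it lies under `old_prefix`, swapping that
 * prefix for `new_prefix`. Both prefixes are expected to be full paths,
 * device name included, because the reparsed name is treated as a
 * completely new name. */
ToyPreCreateResult toy_redirect(char *name, size_t name_size,
                                const char *old_prefix,
                                const char *new_prefix)
{
    size_t old_len, new_len, tail_len;

    assert(name != NULL && old_prefix != NULL && new_prefix != NULL);
    old_len = strlen(old_prefix);
    new_len = strlen(new_prefix);

    if (strncmp(name, old_prefix, old_len) != 0) {
        return TOY_PASS_THROUGH;
    }
    /* Only match on a component boundary, so that "\temp" does not
     * capture "\temporary". */
    if (name[old_len] != '\0' && name[old_len] != '\\') {
        return TOY_PASS_THROUGH;
    }
    tail_len = strlen(name + old_len);
    if (new_len + tail_len + 1 > name_size) {
        return TOY_ERROR;   /* a real filter would allocate a bigger buffer */
    }
    /* Move the tail first (regions may overlap), then write the prefix. */
    memmove(name + new_len, name + old_len, tail_len + 1);
    memcpy(name, new_prefix, new_len);
    return TOY_REPARSE;
}
```

The TOY_ERROR case corresponds to the "allocate a new buffer for the UNICODE_STRING if needed" part of step 3. The model leaves out step 1 (skipping paging files and volume opens) entirely.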

This is a list of things worth mentioning about using STATUS_REPARSE as a redirection mechanism:

ObpLookupObjectName will use the name that is returned in the FILE_OBJECT->FileName as a completely new name. This means that a filter (or a file system) must return a full path to the new file, complete with a device path (since this is an OB name after all). This can be seen in the SimRepPreCreate function where the new name is built. However, SimRep reparses a path to a different path on the same volume, so the device name is the same. If the new path needs to be on a different device, the name in the FILE_OBJECT can either be something like "\Device\HarddiskVolume1\bar.txt" or even "\??\D:\bar.txt". This last path with the device name written as "\??\D:" works because ObpLookupObjectName restarts the lookup before the point where it resolves the "\??\" shortcut.
Sometimes the FILE_OBJECT contains a RelatedFileObject member and the name in the FILE_OBJECT is relative to that RelatedFileObject. If STATUS_REPARSE is returned then the IO manager will simply ignore the RelatedFileObject from that point on and assume that the path that was returned in the FILE_OBJECT is a full path. This also simplifies things for the filter writer since it means that they don't need to care about RelatedFileObjects at all when returning STATUS_REPARSE; the path is always a full path.
It is possible to specify a DEVICE_OBJECT hint when calling IoCreateFileSpecifyDeviceObjectHint or IoCreateFileEx (so only kernel mode callers). When this happens the device specified is stored in the OPEN_PACKET and it is evaluated in IopParseDevice (by calling nt!IopCheckTopDeviceHint). This will fail if the path returned in the STATUS_REPARSE points to a different device. Moreover, the IRP_MJ_CREATE will be sent to the device specified in the device hint, so the file name must be a name that is meaningful at that layer, which might be different from the name at the top of the file system stack on that volume. This isn't generally a problem since the filter must be below that device in order to even see the request so it can know what the file system namespace looks like below the hint device level.
Another request that comes up a lot is how to track a request that a filter reparsed. For example, if my filter returns STATUS_REPARSE, I might not want to process that request when it comes down again (assuming that I reparse to another place that my filter filters as well). This is pretty complicated to do because all this happens before a stream is opened in the file system, so stream-based contexts (like FltMgr contexts) will not work. In fact, in order for a filter to be able to track a create it must find a variable with the following properties:
1. The variable must be accessible to the filter (this is pretty obvious but important nevertheless).
2. The variable must persist (keep either its value or its address the same) for the same call to IopCreateFile.
3. The variable must be unique enough so that there is no chance of confusion between two IRP_MJ_CREATEs that happen at the same time.
4. The variable must be changed (freed or released or it must get a new value) at the end of the IopCreateFile scope, so that new calls to IopCreateFile will not get the same value (otherwise a filter might record the variable value, return STATUS_REPARSE and then it might see a completely unrelated future create with the same value and assume it is the reparse it's been waiting for all along).
5. The variable must be torn down cleanly even if the filter never receives it back, because there are no guarantees that after returning STATUS_REPARSE there will actually be another IRP_MJ_CREATE (maybe there was a device hint and the reparse was for a different stack, or maybe the maximum number of reparses was hit and so on).
So it is easy to see that most variables that a filter has access to won't work:
- the IRP doesn't work because it isn't persistent (it is freed at the end of the IopParseDevice call, so subsequent calls to IopParseDevice will likely get a new IRP)
- the FILE_OBJECT doesn't work because its scope is also the IopParseDevice call.
- the OPEN_PACKET would be nice, but it might be the same between different calls, and besides a filter doesn't have access to it anyway.
- the thread will be the same, but it is not unique enough. There may be a new, completely unrelated create sent down on this same thread.
- Finally, allocating some filter structure and sticking a pointer to it into an unused field someplace won't work because if there is never a new IRP sent to the filter then the structure will be leaked.
So for Vista Microsoft decided to do something about this and introduced a bunch of new calls and a couple of new structures. In the new model, in the OPEN_PACKET there is a new structure nt!_IO_DRIVER_CREATE_CONTEXT:
```
1: kd> dt nt!_OPEN_PACKET DriverCreateContext.
   +0x05c DriverCreateContext  :
      +0x000 Size                 : Int2B
      +0x004 ExtraCreateParameter : Ptr32 _ECP_LIST
      +0x008 DeviceObjectHint     : Ptr32 Void
      +0x00c TxnParameters        : Ptr32 _TXN_PARAMETER_BLOCK
1: kd> dt nt!_ECP_LIST
   +0x000 Signature        : Uint4B
   +0x004 Flags            : Uint4B
   +0x008 EcpList          : _LIST_ENTRY
```
In this structure there is another structure, the _ECP_LIST, which stores an unlimited number of other structures called ECPs (extra create parameters). These structures (in fact the list containing the structure) can be passed in as a parameter to IoCreateFileEx and, more importantly for our case, can be added by filters (both legacy and minifilters) to an existing IRP_MJ_CREATE request (there are two largely similar sets of APIs, FsRtl and Flt, with functions such as FsRtlAllocateExtraCreateParameter and FltAllocateExtraCreateParameter, respectively). The guarantee is that these ECPs will be passed to all IRP_MJ_CREATE IRPs associated with a call to IoCreateFile. They are guaranteed to be unique because each ECP is identified by a GUID and are also guaranteed to be torn down along with the OPEN_PACKET, at the end of the create operation.
This mechanism allows the filter (legacy or mini) to allocate a new structure, associate it with an IRP_MJ_CREATE for which it wants to return STATUS_REPARSE, and then be able to know for any subsequent IRP_MJ_CREATE if it is related to this create it has already processed.
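The tracking logic can be sketched with a toy user-mode model in which the ECP list is just an array of GUIDs that lives as long as one create operation. None of this is the real API: ToyGuid, ToyEcpList and the toy_* functions are hypothetical names, and a real filter would use the FsRtl or Flt ECP functions mentioned above instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_ECPS 4

typedef struct ToyGuid { unsigned char bytes[16]; } ToyGuid;

/* Hypothetical stand-in for the ECP list: one per create operation,
 * shared by every IRP_MJ_CREATE issued for that operation. */
typedef struct ToyEcpList {
    ToyGuid entries[TOY_MAX_ECPS];
    size_t  count;
} ToyEcpList;

typedef enum ToyCreateAction {
    TOY_CREATE_PASS_THROUGH,  /* already reparsed by us, or cannot track */
    TOY_CREATE_REPARSE        /* first time we see this create */
} ToyCreateAction;

static bool toy_guid_equal(const ToyGuid *a, const ToyGuid *b)
{
    return memcmp(a->bytes, b->bytes, sizeof a->bytes) == 0;
}

bool toy_find_ecp(const ToyEcpList *list, const ToyGuid *type)
{
    size_t i;

    assert(list != NULL && type != NULL);
    for (i = 0; i < list->count; i++) {
        if (toy_guid_equal(&list->entries[i], type)) {
            return true;
        }
    }
    return false;
}

bool toy_add_ecp(ToyEcpList *list, const ToyGuid *type)
{
    assert(list != NULL && type != NULL);
    if (list->count == TOY_MAX_ECPS) {
        return false;
    }
    list->entries[list->count++] = *type;
    return true;
}

/* The filter's preCreate decision: reparse only if our marker ECP is
 * not attached to this create operation yet. */
ToyCreateAction toy_pre_create(ToyEcpList *list, const ToyGuid *my_ecp)
{
    if (toy_find_ecp(list, my_ecp)) {
        return TOY_CREATE_PASS_THROUGH;
    }
    if (!toy_add_ecp(list, my_ecp)) {
        /* Without a marker we could not recognize our own reparse,
         * so do not start one. */
        return TOY_CREATE_PASS_THROUGH;
    }
    return TOY_CREATE_REPARSE;
}
```

The same list passed to toy_pre_create twice models the reparsed IRP_MJ_CREATE coming back down within one create operation, while a fresh, empty list models an unrelated create. Because the list goes away with the operation, nothing leaks if the second IRP_MJ_CREATE never arrives.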
Incidentally, another good use for this mechanism is to send some additional information with a create request to a filter or file system. A scenario where this might be useful is when a more complex product that also has a filter component (like an anti-virus product) wants to open a file but would like to tell the filter that this create originated from the product so perhaps the usual rules (for example scanning the file) might not apply.
There are two downsides to this mechanism. It is only available to kernel mode callers (you cannot pass in an ECP or an ECP list to NtCreateFile) and it is not available for XP.

Thursday, December 30, 2010

About IRP_MJ_CREATE and minifilter design considerations - Part III

An interesting topic when discussing creates is the context (thread and process context) in which the create happens. This isn't really interesting from the OS perspective (since the OS always receives the request in the context of the requestor) but from a filtering perspective. In the previous post we discussed how the OS takes the request and eventually sends an IRP to the file system. There are some things to note:

CREATE operations must be synchronized by the OS. I think this is true for any stateful protocol (and stateless protocols don't really have a CREATE operation anyway). The CREATE operation simply means "hey everyone, there will be some requests in this context for this object so you'd better set up your contexts so you know what we're talking about when you get the next request". So the requestor can't really do anything until the request is complete since they don't even have a handle. This means that the IO manager will pretty much execute in a single thread and when it needs to wait for some other service (like the FS) it will send a request (the IRP_MJ_CREATE IRP) and wait for it to come back.
The FS stack however is layered. The implication of this is that while the user can treat the CREATE operation as synchronous, the layers involved in processing that create can't. For file system filters (legacy and minifilters), there are 3 distinct steps:
1. Before the request makes it to the minifilter (before the preCreate callback is called)
2. After the request is seen by the minifilter, but before the minifilter knows the request has been completed by the lower layers (after the preCreate callback but before the postCreate callback)
3. After the minifilter knows the request has completed, but before the IO manager knows about it (after the postCreate callback)
This is important to understand because there are certain limitations, depending on what each layer of the OS knows about the request. For example, during a preCreate callback, the IO manager knows someone wants to open a file but the FS doesn't yet know about that file. So even though the minifilter has a FILE_OBJECT structure (which comes from the IO manager), trying to use it to request something from the FS (like reading or writing or even queries) cannot work since the FS has not yet seen the request and has no idea what the FILE_OBJECT is supposed to represent (the information about which stream on disk the FILE_OBJECT will represent is stored in the create IRP and not in the FILE_OBJECT). In a similar fashion, during the postCreate callback the filter knows how the FS handled the request (whether it was a successful request or not) but the IO manager doesn't, so trying to call a function that involves the IO manager for that FILE_OBJECT (for example ObOpenObjectByPointer, which will create a HANDLE given an OBJECT) will fail.
FltMgr will also synchronize IRP_MJ_CREATE requests for a couple of reasons. From a minifilter perspective, this is beneficial because it simplifies the model quite a bit. In general synchronized operations are somewhat simpler to handle in the postOp callback but synchronizing every operation will have a negative impact on the system. So FltMgr won't synchronize by default any operation except CREATE, where there is no negative impact because the IO manager synchronizes it already. While this is guaranteed by documentation, minifilters should still always return FLT_PREOP_SYNCHRONIZE instead of FLT_PREOP_SUCCESS_WITH_CALLBACK for IRP_MJ_CREATE just so this behavior is made obvious.
This brings us to the most important point. FltMgr documentation mentions in a bunch of different places that the postCreate callback will be called in the same context as the preCreate callback. In some cases I've seen this statement being interpreted as "FltMgr guarantees that the postCreate will be called in the same thread where the user request was issued". However, this is not the case. FltMgr makes no guarantees about what thread the preCreate callback will be called on, just that it will call postCreate on the same thread. What can happen is that a filter (legacy or minifilter) can return STATUS_PENDING for an IRP_MJ_CREATE and then continue the request on a different thread, in a different process altogether. This is a legal option and what happens is that the filter below the filter that returned pending will have its preCreate callback called on the new thread, in the new process context. This is a brief example of what happens in this case (let's say the FS will return STATUS_REPARSE):
1. The IO manager receives the CREATE request on Thread1 and issues an IRP_MJ_CREATE on the same thread.
2. FilterA (let's say it's a legacy filter) sees the IRP_MJ_CREATE request on Thread1, pends it and then sends it down on a different thread, Thread2.
3. MinifilterB (below FilterA) sees the IRP_MJ_CREATE request (i.e. minifilter B's preCreate callback is called) on Thread2, where it queues the request and returns FLT_PREOP_PENDING.
4. MinifilterB then dequeues the request on a different thread (Thread3) and it sends it down (calls FltCompletePendedPreOperation with FLT_PREOP_SYNCHRONIZE for example)
5. The FS receives the IRP_MJ_CREATE on Thread3, processes it, discovers it is a reparse point and so it returns STATUS_REPARSE.
6. FltMgr's completion routine gets called on Thread3 and since FltMgr knows the operation is synchronized, it simply signals Event2.
7. FltMgr resumes the operation on Thread2 where it was waiting for the event and calls the postCreate callback for minifilterB.
8. Minifilter B does whatever processing it does for STATUS_REPARSE and returns FLT_POSTOP_FINISHED_PROCESSING.
9. FltMgr completes the request (we're still on Thread2).
10. FilterA's IoCompletion routine gets called on Thread2 and FilterA performs whatever processing it needs before completing the IRP.
11. The IO manager's IoCompletion routine gets called (still on Thread2), but the IO manager is synchronizing the operation so it signals Event1.
12. The IO manager's wait on Thread1 returns so the IO manager can inspect the result of the call. Since the FS returned STATUS_REPARSE, it might return back to OB and restart parsing from there. This in turn might come down the same path and issue a new IRP_MJ_CREATE on Thread1 and so on.
Here is a picture of what this would look like.

As you can see, it is impossible for a filter to guarantee that its preCreate callback will be called on the thread of the original request. So what can a file system filter (or a file system) do? Well, there are three main reasons why a file system (or filter) might care about the context of a certain operation:

The operation refers to some buffer and the VA is only valid in the process context of the originator.
The operation refers to some other variable that is process specific (for example, a handle), like IRP_MJ_SET_INFORMATION with FileRenameInformation or FileLinkInformation, where the parameters contain a handle.
The operation needs to evaluate security so it needs to know who is the requestor for the operation.

IRP_MJ_CREATE doesn't care about user buffers or other process-dependent variables (they are all captured before getting to the IO manager), so file systems and filters don't need to worry about those. Security, on the other hand, is a really big part of IRP_MJ_CREATE processing, so filters often need to know who is requesting the operation. As I mentioned in the previous post in this series, the security context is captured in nt!ObOpenObjectByName and sent in the IRP parameters (Parameters.Create.SecurityContext), so the file system and the filters can simply use the context there to decide who is requesting the operation.
In conclusion, the fact that a filter can't guarantee that it will be called in the context of the thread where the original request was issued doesn't matter much.
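To make the point about the captured security context a bit more concrete, here is a minimal user-mode sketch. It is only a model: `create_request_t`, `captured_security_t` and all the function names are hypothetical stand-ins I made up for illustration, and the "current thread user" is just a global variable. The idea it shows is the one from the text: capture the requestor once at the top of the create path and read it from the request, not from whatever thread the callback happens to run on.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the captured security context; not a WDK type. */
typedef struct {
    char requestor[32];   /* who issued the original request */
} captured_security_t;

/* Hypothetical create request that carries the captured context with it. */
typedef struct {
    captured_security_t security;
} create_request_t;

/* Models "whoever the current thread runs as". It changes when a filter
 * pends the request and resumes it on a worker thread. */
static const char *g_current_thread_user = "alice";

/* Capture once, at the top of the create path (models ObOpenObjectByName). */
void capture_request(create_request_t *req)
{
    assert(req != NULL);
    strncpy(req->security.requestor, g_current_thread_user,
            sizeof(req->security.requestor) - 1);
    req->security.requestor[sizeof(req->security.requestor) - 1] = '\0';
}

/* Models a filter above us pending the create and continuing it elsewhere. */
void resume_on_worker_thread(void)
{
    g_current_thread_user = "SYSTEM";
}

/* Fragile: trusts the thread the callback happens to run on. */
const char *requestor_from_thread(void)
{
    return g_current_thread_user;
}

/* Robust: uses the context that travels with the request. */
const char *requestor_from_request(const create_request_t *req)
{
    assert(req != NULL);
    return req->security.requestor;
}
```

If you call `capture_request`, then `resume_on_worker_thread`, the request still reports the original requestor while the thread-based lookup reports the worker's identity. In a real minifilter the equivalent of `requestor_from_request` is reading Parameters.Create.SecurityContext.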

Thursday, December 23, 2010

About IRP_MJ_CREATE and minifilter design considerations - Part II

Since we've discussed the concepts last time we can finally start looking at the debugger. Because we're mostly interested in the create operation from a filter perspective, I put a breakpoint on fltmgr!FltpCreate so that we can see exactly what the stack looks like when the request reaches a filter. Let's say we're trying to open the file "C:\Foo\Bar.txt". Here is what the stack looks like.

00 9b5c5a70 828484bc fltmgr!FltpCreate
01 9b5c5a88 82a4c6ad nt!IofCallDriver+0x63
02 9b5c5b60 82a2d26b nt!IopParseDevice+0xed7
03 9b5c5bdc 82a532d9 nt!ObpLookupObjectName+0x4fa
04 9b5c5c38 82a4b62b nt!ObOpenObjectByName+0x165
05 9b5c5cb4 82a56f42 nt!IopCreateFile+0x673
06 9b5c5d00 8284f44a nt!NtCreateFile+0x34

In order to discuss the flow of the IO through the OS we're going to look at what each of these functions does.

nt!NtCreateFile

This is how the OS receives a request to open a file or a device (at this level there is no distinction between the two yet). NtCreateFile doesn't really do much, it's just a wrapper over an internal OS function (IopCreateFile). The file name here is something like "\??\C:\Foo\Bar.txt".

nt!IopCreateFile

This is the function to open a device (or a file) at the IO manager level. This is an internal function where most requests to open a file or a device end up (NtOpenFile, IoCreateFile and friends and so on). This is what happens here:

The parameters for the operation are validated and checked to see if they make sense. Here is where STATUS_INVALID_PARAMETER is returned if you do something like ask for FILE_DELETE_ON_CLOSE but not ask for DELETE access… There are a lot of checks to validate the parameters, but no actual security or sharing checks.

A very important structure is allocated, the OPEN_PACKET. This is an internal structure to the IO manager and it is the context that the IO manager has for this create. The create parameters are copied in initially. This is a structure that's available in the debugger:

1: kd> dt nt!_OPEN_PACKET
    +0x000 Type             : Int2B
    +0x002 Size             : Int2B
    +0x004 FileObject       : Ptr32 _FILE_OBJECT
    +0x008 FinalStatus      : Int4B
    +0x00c Information      : Uint4B
    +0x010 ParseCheck       : Uint4B
    +0x014 RelatedFileObject : Ptr32 _FILE_OBJECT
    +0x018 OriginalAttributes : Ptr32 _OBJECT_ATTRIBUTES
    +0x020 AllocationSize   : _LARGE_INTEGER
    +0x028 CreateOptions    : Uint4B
    +0x02c FileAttributes   : Uint2B
    +0x02e ShareAccess      : Uint2B
    +0x030 EaBuffer         : Ptr32 Void
    +0x034 EaLength         : Uint4B
    +0x038 Options          : Uint4B
    +0x03c Disposition      : Uint4B
    +0x040 BasicInformation : Ptr32 _FILE_BASIC_INFORMATION
    +0x044 NetworkInformation : Ptr32 _FILE_NETWORK_OPEN_INFORMATION
    +0x048 CreateFileType   : _CREATE_FILE_TYPE
    +0x04c MailslotOrPipeParameters : Ptr32 Void
    +0x050 Override         : UChar
    +0x051 QueryOnly        : UChar
    +0x052 DeleteOnly       : UChar
    +0x053 FullAttributes   : UChar
    +0x054 LocalFileObject  : Ptr32 _DUMMY_FILE_OBJECT
    +0x058 InternalFlags    : Uint4B
    +0x05c DriverCreateContext : _IO_DRIVER_CREATE_CONTEXT

This structure is pretty important to the flow of the IO operation but there is no way to access it as a developer so it's going to be just an important concept to remember later on.

Finally, since we've copied all internal parameters and all the IO manager has at this point is an OB manager path (in the ObjectAttributes parameter to the call), it must call the OB manager to open the device (ObOpenObjectByName, see below).
After ObOpenObjectByName returns this function cleans up and returns.
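As an illustration of the kind of parameter check mentioned in the first step above (delete on close without DELETE access), here is a small sketch. The flag and status values are the ones from the Windows headers, but `create_params_t`, `DELETE_ACCESS` (the headers call it DELETE) and `validate_create_params` are names I made up. This models one check only and is in no way the real IopCreateFile logic.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Values as defined in the Windows headers. */
#define FILE_DELETE_ON_CLOSE      0x00001000u
#define DELETE_ACCESS             0x00010000u  /* called DELETE in the headers */
#define STATUS_SUCCESS            0x00000000u
#define STATUS_INVALID_PARAMETER  0xC000000Du

/* Hypothetical subset of the create parameters. */
typedef struct {
    uint32_t desired_access;
    uint32_t create_options;
} create_params_t;

/* Models a single sanity check: the combination must make sense on its own,
 * before any security or sharing check is even attempted. */
uint32_t validate_create_params(const create_params_t *params)
{
    assert(params != NULL);

    if ((params->create_options & FILE_DELETE_ON_CLOSE) != 0 &&
        (params->desired_access & DELETE_ACCESS) == 0) {
        return STATUS_INVALID_PARAMETER;
    }
    return STATUS_SUCCESS;
}
```

Note that nothing here looks at who the caller is or what the file's ACL says. That is the distinction the text makes: these are consistency checks on the request itself.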

nt!ObOpenObjectByName

This is the call to have the OB manager create a handle for an object when we know its name. This isn't a public interface since 3rd party drivers only need to open objects that have their own create or open APIs (for example ZwCreateFile, ZwOpenKey, ZwOpenSection, ZwCreateSection, ZwOpenProcess and so on). Another thing to note about the OB APIs is that they fall largely into two classes:

Functions that reference objects (that just operate on the reference count of objects), like ObReferenceObject, ObReferenceObjectByName and ObReferenceObjectByPointer.
Functions that create handles to objects in addition to referencing them (which is called an "open"), like ObOpenObjectByName and ObOpenObjectByPointer.

Anyway, this is roughly what goes on in here:

Capture the security context for this open, so that whoever needs to open the actual object can perform access checks. This also means that the file system itself doesn't rely on the thread context being the same and instead uses the context captured here. So minifilters should do the same when they care about the security context of a create.
Call the actual function that looks up the path in the namespace (ObpLookupObjectName, see below).
If ObpLookupObjectName was able to find an object then a handle is created for that object (since this was an open type function).

nt!ObpLookupObjectName

This is the function where the OB manager actually looks in the namespace for the path it needs to open (which at this point is still "\??\C:\Foo\Bar.txt"). One thing to note is that the OB namespace has a hierarchical structure, with DIRECTORY_OBJECT types of objects that hold other objects. The root of the namespace ("\") is such a DIRECTORY_OBJECT.
Anyway, this is what happens in this function. The parsing starts at the root of the namespace, "\". This is a loop that runs until we find the final object to return to the user or find that there is no object by that name (and therefore fail the request):

If the current directory is the root directory then check if the name starts with "\??\" and make it point to the \GLOBAL?? directory (so our name becomes "\GLOBAL??\C:\Foo\Bar.txt"). This is a hardcoded hack in IO manager (which is why calling "!object \" in WinDbg doesn't show a "??" folder).
Find the first component in the path (which is GLOBAL??) in the current directory.
If the component found is a DIRECTORY_OBJECT, open it and continue parsing from that point using the rest of the name (in our case, "C:\Foo\Bar.txt" is the remaining name). Continue the loop with the remaining path.
If the object has a parse procedure, call that parse procedure and give it the rest of the path. If the parse procedure returns STATUS_REPARSE (and it hasn't reparsed too many times already), start again at the root of the namespace with the new name returned by the parse procedure. Otherwise the parse procedure should either return STATUS_SUCCESS and return an object or a failure status.

Some notable things are:

OB will do a case sensitive or a case insensitive search of the OB namespace, depending on the OBJ_CASE_INSENSITIVE flag that is passed into the OBJECT_ATTRIBUTES, which is why it's important to set this correctly when calling FltCreateFile in a filter (specifically from a NormalizeNameComponent callback), since if it's not correctly set the request might not make it down the IO stack at all.
The OB namespace uses symlinks quite a lot. OB symlinks are a special type of object that has a string member that points to a different point in the namespace, and a parse procedure:
```
0: kd> dt _OBJECT_SYMBOLIC_LINK
 nt!_OBJECT_SYMBOLIC_LINK
    +0x000 CreationTime     : _LARGE_INTEGER
    +0x008 LinkTarget       : _UNICODE_STRING
    +0x010 DosDeviceDriveIndex : Uint4B
 
```
So in our example, when OB gets to "\GLOBAL??\C:" it discovers it is a symlink and it calls the parse procedure with the rest of the remaining name ("\Foo\Bar.txt"). The symlink for "\GLOBAL??\C:" points to "\Device\HarddiskVolume2" and the symlink's parse procedure concatenates that name with the remaining path that it got, so the new name after the symlink is "\Device\HarddiskVolume2\Foo\Bar.txt". See this:
```
0: kd> !object \GLOBAL??\C:
 Object: 96f7f188  Type: (922b7f78) SymbolicLink
     ObjectHeader: 96f7f170 (new version)
     HandleCount: 0  PointerCount: 1
     Directory Object: 96e08f38  Name: C:
     Target String is '\Device\HarddiskVolume2'
     Drive Letter Index is 3 (C:)
 
```
The parse procedure of a symlink always returns STATUS_REPARSE.
Once we get to the "\Device\HarddiskVolume2\Foo\Bar.txt" path, while parsing OB will find that "\Device\HarddiskVolume2" is a DEVICE_OBJECT type of object and that it has a parse procedure. The parse procedure for a DEVICE_OBJECT is IopParseDevice, so that function gets called.
Another thing to note is that there is a limit to the number of times OB will reparse, and each STATUS_REPARSE it sees counts against that limit (so it doesn't matter whether it was a reparse from a symlink or a DEVICE_OBJECT, everything counts). So it is possible to reparse to the point where OB won't reparse anymore.
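The lookup loop, the symlink substitution and the reparse limit are easier to see in a toy model. The sketch below is a heavily simplified user-mode approximation: it matches whole prefixes instead of walking DIRECTORY_OBJECTs component by component, the namespace is a hardcoded table, the `Loop:` symlink is invented to show the limit being hit, and `MAX_REPARSES` is an arbitrary value I picked (the text doesn't say what the real limit is). All function and type names are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_NAME      260
#define MAX_REPARSES  8   /* assumed value, for illustration only */

typedef enum {
    LOOKUP_OK,
    LOOKUP_NOT_FOUND,
    LOOKUP_TOO_MANY_REPARSES,
    LOOKUP_NAME_TOO_LONG
} lookup_status_t;

typedef struct { const char *name; const char *target; } symlink_t;

/* Toy namespace: a drive letter symlink, a symlink pointing at itself,
 * and one device that has a parse procedure. */
static const symlink_t g_symlinks[] = {
    { "\\GLOBAL??\\C:",    "\\Device\\HarddiskVolume2" },
    { "\\GLOBAL??\\Loop:", "\\GLOBAL??\\Loop:" },
};
static const char *const g_devices[] = { "\\Device\\HarddiskVolume2" };

/* Returns the rest of path if it starts with prefix at a component
 * boundary, otherwise NULL. */
static const char *match_prefix(const char *path, const char *prefix)
{
    size_t n = strlen(prefix);
    if (strncmp(path, prefix, n) != 0) return NULL;
    return (path[n] == '\0' || path[n] == '\\') ? path + n : NULL;
}

/* Writes a followed by b into out. Returns 0 on success, -1 if too long. */
static int join(char *out, size_t out_size, const char *a, const char *b)
{
    int n = snprintf(out, out_size, "%s%s", a, b);
    return (n < 0 || (size_t)n >= out_size) ? -1 : 0;
}

/* On LOOKUP_OK, remaining holds the path handed to the device's parse
 * procedure (what IopParseDevice would see). */
lookup_status_t lookup_name(const char *name, char *remaining, size_t remaining_size)
{
    char cur[MAX_NAME], next[MAX_NAME];
    int reparses = 0;

    assert(name != NULL && remaining != NULL && remaining_size > 0);
    if (join(cur, sizeof(cur), name, "") != 0) return LOOKUP_NAME_TOO_LONG;

    for (;;) {
        const char *rest = NULL;
        size_t i;

        /* The "\??\" special case: redirect to \GLOBAL??. */
        if (strncmp(cur, "\\??\\", 4) == 0) {
            if (join(next, sizeof(next), "\\GLOBAL??\\", cur + 4) != 0)
                return LOOKUP_NAME_TOO_LONG;
            strcpy(cur, next);
        }

        /* Symlink parse procedure: target + remaining name, then reparse. */
        for (i = 0; i < sizeof(g_symlinks) / sizeof(g_symlinks[0]); i++) {
            rest = match_prefix(cur, g_symlinks[i].name);
            if (rest != NULL) break;
        }
        if (rest != NULL) {
            if (++reparses > MAX_REPARSES) return LOOKUP_TOO_MANY_REPARSES;
            if (join(next, sizeof(next), g_symlinks[i].target, rest) != 0)
                return LOOKUP_NAME_TOO_LONG;
            strcpy(cur, next);
            continue;   /* start again from the root with the new name */
        }

        /* Device with a parse procedure: hand over the remaining path. */
        for (i = 0; i < sizeof(g_devices) / sizeof(g_devices[0]); i++) {
            rest = match_prefix(cur, g_devices[i]);
            if (rest != NULL) {
                return join(remaining, remaining_size, rest, "") == 0
                           ? LOOKUP_OK : LOOKUP_NAME_TOO_LONG;
            }
        }
        return LOOKUP_NOT_FOUND;
    }
}
```

With this table, looking up "\??\C:\Foo\Bar.txt" takes one reparse and ends at the device with "\Foo\Bar.txt" as the remaining path, while "\??\Loop:\x" keeps reparsing to itself until the limit is exceeded.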

nt!IopParseDevice

The name here is just "\Foo\Bar.txt" and the parse procedure gets a reference to the device where the path should be searched. This is where the difference between a file and a device becomes relevant. If there is no remaining path, this is treated as an open to the device. If there is a path, then this is assumed to be a file (or directory) open. This is a pretty involved function with many special cases. However, there are only a couple of steps that we're going to talk about:

Get the context for this create, which is the OPEN_PACKET structure from before. This works because the OPEN_PACKET is IO manager's structure passed from IopCreateFile to IopParseDevice. This is important because this is a nice way to have context across calls through other subsystems (OB manager) and still keep context that is opaque to those subsystems. This isn't always the case unfortunately and whenever two subsystems share the same structure the architecture gets complicated.
Check to see if a file system is mounted on this device and if not then mount it.
Process the device hint if there was any.
Allocate the IRP_MJ_CREATE IRP.
Allocate the FILE_OBJECT that will represent the open file.
Call the FastIoQueryOpen function (which minifilters see as the IRP_MJ_NETWORK_QUERY_OPEN). The IRP parameter to this call is the IRP that was just allocated.
If the FastIoQueryOpen didn't work, send the full IRP to the file system stack by calling IoCallDriver.
Wait for the IRP to complete (i.e. the IRP is synchronized by the IO manager).
If the request returned STATUS_REPARSE, then first check if it is a directory junction or a symlink and do some additional processing for those. In any case, copy the new name to open from the FILE_OBJECT (the actual name to open is passed in and out of this function through a parameter).
If the status from the IRP was not a success status or it was STATUS_REPARSE, clean up the FILE_OBJECT and release the references associated with it. The IRP is always released anyway.
Return the status. If this was successful, the FILE_OBJECT will be the one used to represent the file.
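Two of the decisions above are worth a tiny sketch: the device-versus-file distinction based on the remaining path, and the fact that STATUS_REPARSE has to be singled out when deciding whether the FILE_OBJECT survives. The reason is that STATUS_REPARSE (0x00000104) is a success-class status, so an NT_SUCCESS style check alone would treat it like a successful open. The status values below are the ones from the Windows headers; `ntstatus_t`, `classify_open` and `file_object_survives` are names I made up for this model.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef int32_t ntstatus_t;   /* stand-in for NTSTATUS */

/* Values as defined in the Windows headers. */
#define STATUS_SUCCESS                ((ntstatus_t)0x00000000)
#define STATUS_REPARSE                ((ntstatus_t)0x00000104)
#define STATUS_OBJECT_NAME_NOT_FOUND  ((ntstatus_t)0xC0000034)

/* Same definition as the usual macro: non-negative means success. */
#define NT_SUCCESS(s) ((ntstatus_t)(s) >= 0)

typedef enum { OPEN_DEVICE, OPEN_FILE } open_kind_t;

/* No remaining path means the caller is opening the device itself. */
open_kind_t classify_open(const char *remaining_path)
{
    assert(remaining_path != NULL);
    return remaining_path[0] == '\0' ? OPEN_DEVICE : OPEN_FILE;
}

/* Returns 1 if the FILE_OBJECT ends up representing the open file,
 * 0 if it is cleaned up (failure, or a reparse that restarts the lookup). */
int file_object_survives(ntstatus_t status)
{
    return NT_SUCCESS(status) && status != STATUS_REPARSE;
}
```

This is also a handy thing to remember in a postCreate callback: checking only NT_SUCCESS on the create status is not enough to know that there is a usable FILE_OBJECT.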

This is a pretty high level view of the process but it should explain why some of the things we're going to talk about in future posts work the way they do.

Thursday, December 16, 2010

About IRP_MJ_CREATE and minifilter design considerations - Part I

This is the first in a series of posts where I'll try to address various common questions about IRP_MJ_CREATE. My plan is to address the following topics:

What exactly is it that IRP_MJ_CREATE creates? (a bit of rambling on one of my favorite topics, operating systems design)
Why is there no IRP_MJ_OPEN? Surely MS could afford one more IRP :)...
Flow of a file open request through the OS.
What is the difference between a stream and a file from an FS perspective?
What does STATUS_REPARSE do?
What is name tunneling? How does it affect creates?
How to open the same stream as an existing FILE_OBJECT in a name-safe way.
What are stream file objects and why are they necessary?
Various strategies to redirect a file open to a different file.
How to track a create when reparsing?

In order to address this properly, I'd like to explain some things about operating systems. This is a rather dry topic but in my opinion the things I'm going to talk about are fundamental for understanding not only how IRP_MJ_CREATE works, but also why it works the way it does.

There are many ways to define an operating system but for this topic I think that a very useful way to describe it is as a hardware abstraction layer. It is a library of functions combined with a machine abstraction. As such, OS code is pretty much dedicated to either "abstract stuff that people use a lot" (allocate memory, create a window, draw strings and so on) or "hardware interaction code" (talk to the disk, talk to the memory controller hardware, talk to the graphics hardware). So it should come as no surprise that the kernel part of the OS is designed around interaction with hardware (as opposed to the user mode part, which in general implements more abstract services).

File systems (and the whole file system stack including legacy filters and minifilters) are "higher level drivers" (since they don't usually talk to hardware directly). However, they must fit into the OS model which is built around hardware. This is why file systems still create device objects and why the name returned by FltGetFileNameInformation starts with "\Device\....".

One other very important concept that plays into why IRP_MJ_CREATE functions the way it does is that the OS itself is implemented as a set of "services". Each service has its own protocol, usually described by an API set (the memory manager has its own command set, the object manager has its own set and so does the IO manager). Most (if not all) of these protocols are stateful. The caller issues an "initialize" command (ExAllocatePool, ZwCreateFile, FltRegisterFilter) and gets back a more or less opaque handle (for ExAllocatePool, the pointer serves as the handle; ZwCreateFile -> an actual handle; FltRegisterFilter -> a PFLT_FILTER pointer and so on). The caller can then issue additional commands that require that handle to be passed in (ExFreePool, ZwReadFile, FltStartFiltering). For stateful protocols the service (or server) has a blob of data that describes the internal state of each object and based on that data it knows how to satisfy each request. The opaque handle is a key that helps the service find that data. For example, for ExAllocatePool the internal data blob is the nt!_POOL_HEADER, for ZwCreateFile the context is pretty much a set of granted access rights for that handle and a reference to the FILE_OBJECT, and for FltStartFiltering it is the FLT_FILTER structure. From this point on I'll call that blob of data a context (as in MM's context, IO manager's context, FltMgr's filter context). For services that already provide support for caller defined contexts (like FltMgr) I'll use the terms "internal context" and "user's context" to differentiate the two. The conclusion here is that any stateful protocol must have some context on the service (or server) side that the service can use to keep track of the state of communication with the client.
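Here is what a stateful protocol with an opaque handle looks like when boiled down to almost nothing. This is a generic sketch, not a model of any particular NT service: `svc_open`, `svc_advance`, `svc_close` and `session_context_t` are all made-up names. The "initialize" command creates the service-side context and returns a key to it, and every later command uses that key to find the state.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_SESSIONS     4
#define INVALID_SESSION  (-1)

/* Service-side context. The caller never sees this, only the handle. */
typedef struct {
    int in_use;
    int position;   /* state that changes as commands are issued */
} session_context_t;

static session_context_t g_sessions[MAX_SESSIONS];

/* "Initialize" command: creates a context and returns an opaque handle. */
int svc_open(void)
{
    int h;
    for (h = 0; h < MAX_SESSIONS; h++) {
        if (!g_sessions[h].in_use) {
            g_sessions[h].in_use = 1;
            g_sessions[h].position = 0;
            return h;
        }
    }
    return INVALID_SESSION;
}

/* Follow-up command: the handle is the key used to find the context.
 * Returns the new position, or -1 if the handle is not valid. */
int svc_advance(int handle, int count)
{
    assert(count >= 0);
    if (handle < 0 || handle >= MAX_SESSIONS || !g_sessions[handle].in_use)
        return -1;
    g_sessions[handle].position += count;
    return g_sessions[handle].position;
}

/* Teardown command: destroys the context. Returns 0 on success, -1 otherwise. */
int svc_close(int handle)
{
    if (handle < 0 || handle >= MAX_SESSIONS || !g_sessions[handle].in_use)
        return -1;
    g_sessions[handle].in_use = 0;
    return 0;
}
```

Two handles obtained from `svc_open` advance independently, and a handle stops working once it is closed. That is all "stateful" means here: the answer to a command depends on the context the handle points to.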

The important thing I wanted to get to is that sometimes an operation requires multiple OS components to work together to satisfy a user request, and so each component might need to create its own context. For example, a single ZwCreateFile call might need some of the following contexts to be created: a handle, a FILE_OBJECT, a FltMgr internal context, some minifilter contexts, one or more file system contexts and a couple of MM contexts (where all the other contexts will be stored).

So with all these things in place, we can start talking about IRP_MJ_CREATE. As I said above, the OS has an abstract interface which consists mainly of OBJECTs for various things. When someone needs to talk to a device (physical or a virtual device, like a file system; anything that can be represented internally by a DEVICE_OBJECT), the OS context is a FILE_OBJECT. So in other terms, the FILE_OBJECT simply represents the state associated with the OS communicating to a DEVICE_OBJECT. The "create" word in ZwCreateFile and IRP_MJ_CREATE simply refers to FILE_OBJECT itself. There is no IRP_MJ_OPEN because there is no way to open an existing FILE_OBJECT. In order to get a FILE_OBJECT one must either create it or already have a reference to it (pointer or handle) and must call either ObReferenceObject or ObReferenceObjectByHandle to get another reference to that FILE_OBJECT.

The next topic, which is the flow of a create operation through the OS, is pretty long so I'll save it for next week. In the meantime please feel free to let me know what other topics related to the IRP_MJ_CREATE path you'd like me to address.

Thursday, December 9, 2010

More on IRPs and IRP_CTRLs

Sometimes I see posts on discussion lists about how a callback is not being called for some operation that a minifilter registered for. In most (possibly all) cases it turns out that that's not what the problem is and that the callback is in fact called, it's just that the poster can't tell it happened. It's happened to me a couple of times, but since I have a lot of confidence in FltMgr (having worked on it and all) I start off with the assumption that it must be something I'm doing wrong.

However, I've been wondering why people seem so keen on assuming that their minifilter callbacks don't get called. And then I realized that it might have something to do with the fact that minifilters use a callback model whereas the NT IO model is call-through. I'll talk a bit about the call-through model and the limitations it has. I'll start with a brief refresher on the NT IO model and then explain the limitations and how the minifilter model tries to address them. Then I'll explain some of the downsides and how to work around them.

When an IO request (open a file, read or write and so on) reaches the IO manager, the information about the request is put in an IO request packet (IRP). Then the IO manager calls the driver that should process that IRP by calling IoCallDriver. There may be multiple drivers needed in order to complete a single operation. For example, when the user opens a remote file the IO request goes to a file system which then needs to send something to the network, so there are at least two drivers involved. One could design the OS so that the drivers could go back to the IO manager and let it dispatch the request to the appropriate driver again, or let the two drivers communicate directly. NT was designed to let the drivers communicate directly. Moreover, in many cases one request may pass through many drivers that make up an IO stack (like the file system stack or the storage stack or the network stack), where each driver performs a specific role. So the IRP is potentially modified by each driver and sent to the next driver by calling IoCallDriver.

This is a call-through model. In the debugger it can sometimes look like this (please note that the IRP model allows the request to be completely decoupled from the thread but in practice you still see a lot of cases where a lot of drivers simply call the next driver in the same thread):

1: kd> kn
 # ChildEBP RetAddr  
00 a204bb10 828734bc volmgr!VmReadWrite
01 a204bb28 963bc475 nt!IofCallDriver+0x63
02 a204bb34 963bc548 fvevol!FveRequestPassThrough+0x31
03 a204bb50 963bc759 fvevol!FveReadWrite+0x4e
04 a204bb80 963bc7a9 fvevol!FveFilterRundownReadWrite+0x197
05 a204bb90 828734bc fvevol!FveFilterRundownWrite+0x33
06 a204bba8 9639a76e nt!IofCallDriver+0x63
07 a204bc88 9639a8a5 rdyboost!SmdProcessReadWrite+0xa14
08 a204bca8 828734bc rdyboost!SmdDispatchReadWrite+0xcb
09 a204bcc0 965a0fd9 nt!IofCallDriver+0x63
0a a204bce8 965a12fd volsnap!VolsnapWriteFilter+0x265
0b a204bcf8 828734bc volsnap!VolSnapWrite+0x21
0c a204bd10 960b091c nt!IofCallDriver+0x63
0d a204bd1c 828a711e Ntfs!NtfsStorageDriverCallout+0x14
0e a204bd1c 828a7215 nt!KiSwapKernelStackAndExit+0x15a
0f 981c964c 828c711d nt!KiSwitchKernelStackAndCallout+0x31
10 981c96c0 960af939 nt!KeExpandKernelStackAndCalloutEx+0x29d
11 981c96ec 960b05a6 Ntfs!NtfsCallStorageDriver+0x2d
12 981c9730 960af0a0 Ntfs!NtfsMultipleAsync+0x4d
13 981c9860 960ae0a6 Ntfs!NtfsNonCachedIo+0x413
14 981c9978 960af85f Ntfs!NtfsCommonWrite+0x1ebd
15 981c99f0 828734bc Ntfs!NtfsFsdWrite+0x2e1
16 981c9a08 9605f20c nt!IofCallDriver+0x63
17 981c9a2c 9605f3cb fltmgr!FltpLegacyProcessingAfterPreCallbacksCompleted+0x2aa
18 981c9a64 828734bc fltmgr!FltpDispatch+0xc5
19 981c9a7c 82a74f6e nt!IofCallDriver+0x63
1a 981c9a9c 82a75822 nt!IopSynchronousServiceTail+0x1f8
1b 981c9b38 8287a44a nt!NtWriteFile+0x6e8
1c 981c9b38 828798b5 nt!KiFastCallEntry+0x12a
1d 981c9bd4 82a266a8 nt!ZwWriteFile+0x11

So here we can see how a write (ZwWriteFile) goes through FltMgr, NTFS, volsnap, rdyboost, fvevol and volmgr (where I set my breakpoint for this blog post).

One big problem with this approach is that the size of the kernel stack in NT is pretty small (it depends on the architecture and so on, but it's something like 12K or 20K) and so if there are enough drivers, each of them using some stack space, then it is possible to run out of stack. This in fact happens in some cases (AV filters were notorious for using a lot of stack) and the outcome is a bugcheck. Please note that in the example above, most filters were just letting the request pass through them, without necessarily doing anything to it. So they still use stack space even if they don't care about the operation at all…

Another problem with this approach is that it is almost impossible to unload a driver because very often each driver remembers which driver they need to send the IRP to next, so they are either referencing it (so it will never go away) or just using it without referencing it and so immediately after it goes away there is a bugcheck.

FltMgr's main goal was to increase system reliability (yeah, making file system filter development easier was just a secondary objective) and it tried to address these issues by making the minifilter model a callback model. This addresses both problems. Unloading a minifilter works because a filter no longer needs to know which filter to call next, so the only component that must reference a minifilter is FltMgr, which then allows a minifilter to go away by informing only FltMgr about it.

The way this takes care of stack usage is a bit more interesting. When the minifilter callback is done it returns to FltMgr a status that tells FltMgr whether the minifilter wants to be notified when the request completes or not (or a couple of other statuses), but that's it. The stack space associated with the call to the minifilter's callback (the stack frame) is released and can be reused. This is why in the stack above, the IRP simply goes from the IO manager to FltMgr and then to the file system. It doesn't matter how many minifilters were attached to the volume, they all use no stack space at all at this time.
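The difference in stack usage between the two models can be shown with a small simulation. This is a user-mode sketch where a counter stands in for stack frames; the function names are invented and each "driver" uses exactly one frame, which is obviously a simplification of what real drivers do.

```c
#include <assert.h>

static int g_depth;        /* current number of simulated stack frames */
static int g_depth_at_fs;  /* depth observed when the file system is reached */

static void file_system_dispatch(void)
{
    g_depth++;
    g_depth_at_fs = g_depth;
    g_depth--;
}

/* Call-through: each legacy filter calls the next driver from inside its
 * own dispatch routine, so its frame stays on the stack the whole time. */
static void legacy_filter_dispatch(int filters_below)
{
    g_depth++;
    if (filters_below > 0)
        legacy_filter_dispatch(filters_below - 1);
    else
        file_system_dispatch();
    g_depth--;
}

/* Callback: the pre-operation callback returns to the manager before the
 * request moves on, so its frame is gone by then. */
static void minifilter_pre_callback(void)
{
    g_depth++;
    /* ... filter work would happen here ... */
    g_depth--;
}

static void manager_dispatch(int minifilters)
{
    int i;
    g_depth++;
    for (i = 0; i < minifilters; i++)
        minifilter_pre_callback();
    file_system_dispatch();
    g_depth--;
}

/* Frames on the stack when the FS runs, with n legacy filters above it. */
int depth_with_call_through(int filters)
{
    assert(filters >= 1);
    g_depth = 0;
    legacy_filter_dispatch(filters - 1);
    return g_depth_at_fs;
}

/* Frames on the stack when the FS runs, with n minifilters above it. */
int depth_with_callbacks(int minifilters)
{
    assert(minifilters >= 0);
    g_depth = 0;
    manager_dispatch(minifilters);
    return g_depth_at_fs;
}
```

In this model the call-through depth grows with the number of filters (one frame per filter plus the file system), while the callback depth stays at two frames (the manager and the file system) no matter how many minifilters are attached.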

Now, let's look in more detail at filter manager's stack frame. There are no minifilter functions on the stack because they all returned nicely to FltMgr and no longer use any stack space. This is the most confusing thing about this, that the minifilters cannot be seen on the stack so it looks like they have never been called at all… However, now that we know that FltMgr must have called some minifilters, is there a way to see which minifilters were called and so on? In a previous post I explained that FltMgr has an internal structure that wraps the IRP called the IRP_CTRL (also known as a CALLBACK_DATA), and all the information about the request is stored in there. FltMgr clearly must remember the IRP_CTRL associated with this IRP someplace, but where?

1: kd> kbn
 # ChildEBP RetAddr  Args to Child              
...
16 981c9a08 9605f20c 93460958 94301bf8 00000000 nt!IofCallDriver+0x63
17 981c9a2c 9605f3cb 981c9a4c 93460958 00000000 fltmgr!FltpLegacyProcessingAfterPreCallbacksCompleted+0x2aa
18 981c9a64 828734bc 93460958 94301bf8 94301bf8 fltmgr!FltpDispatch+0xc5
19 981c9a7c 82a74f6e 93715f80 94301bf8 94301dac nt!IofCallDriver+0x63
...

Well, it turns out that there is another very useful structure called the IRP_CALL_CTRL, which is a structure that associates an IRP and an IRP_CTRL and other context that FltMgr keeps for the operation:

1: kd> dt 981c9a4c fltmgr!_IRP_CALL_CTRL
   +0x000 Volume           : 0x932f1008 _FLT_VOLUME
   +0x004 Irp              : 0x94301bf8 _IRP
   +0x008 IrpCtrl          : 0x93591de0 _IRP_CTRL
   +0x00c StartingCallbackNode : 0xffffffff _CALLBACK_NODE
   +0x010 OperationStatusCallbackListHead : _SINGLE_LIST_ENTRY
   +0x014 Flags            : 0x204 (No matching name)

From here we can see the IRP_CTRL pointer and call my favorite extension, !fltkd (I get a complaint on my current symbols about how the PVOID type is not defined, which I've edited out):

1: kd> !fltkd.irpctrl 0x93591de0

IRP_CTRL: 93591de0  WRITE (4) [00000001] Irp
Flags                    : [10000004] DontCopyParms FixedAlloc
Irp                      : 94301bf8 
DeviceObject             : 93460958 "\Device\HarddiskVolume2"
FileObject               : 93715f80 
CompletionNodeStack      : 93591e98   Size=5  Next=1
SyncEvent                : (93591df0)
InitiatingInstance       : 00000000 
Icc                      : 981c9a4c 
PendingCallbackNode      : ffffffff 
PendingCallbackContext   : 00000000 
PendingStatus            : 0x00000000 
CallbackData             : (93591e40)
 Flags                    : [00000001] Irp
 Thread                   : 93006020 
 Iopb                     : 93591e6c 
 RequestorMode            : [00] KernelMode
 IoStatus.Status          : 0x00000000 
 IoStatus.Information     : 00000000 
 TagData                  : 00000000 
 FilterContext[0]         : 00000000 
 FilterContext[1]         : 00000000 
 FilterContext[2]         : 00000000 
 FilterContext[3]         : 00000000 

   Cmd     IrpFl   OpFl  CmpFl  Instance FileObjt Completion-Context  Node Adr
--------- -------- ----- -----  -------- -------- ------------------  --------
 [0,0]    00000000  00   0000   00000000 00000000 00000000-00000000   93591fb8
     Args: 00000000 00000000 00000000 00000000 00000000 0000000000000000
 [0,0]    00000000  00   0000   00000000 00000000 00000000-00000000   93591f70
     Args: 00000000 00000000 00000000 00000000 00000000 0000000000000000
 [0,0]    00000000  00   0000   00000000 00000000 00000000-00000000   93591f28
     Args: 00000000 00000000 00000000 00000000 00000000 0000000000000000
 [0,0]    00000000  00   0000   00000000 00000000 00000000-00000000   93591ee0
     Args: 00000000 00000000 00000000 00000000 00000000 0000000000000000
 [4,0]    00060a01  00   0002   9341d918 93715f80 9608e55e-2662d614   93591e98
            ("FileInfo","FileInfo")  fileinfo!FIPostReadWriteCallback 
     Args: 00020000 00000000 003a0000 00000000 92fc6000 0000000000000000
Working IOPB:
>[4,0]    00060a01  00          9341d918 93715f80                     93591e6c
            ("FileInfo","FileInfo")  
     Args: 00020000 00000000 003a0000 00000000 92fc6000 0000000000000000

Here we can see what the minifilter stack looks like and that the FileInfo minifilter wanted a postOp callback for this operation. Another thing we can do is this (using the FLT_VOLUME pointer from the IRP_CALL_CTRL):

1: kd>  !fltkd.volume 0x932f1008

FLT_VOLUME: 932f1008 "\Device\HarddiskVolume2"
   FLT_OBJECT: 932f1008  [04000000] Volume
      RundownRef               : 0x00000074 (58)
      PointerCount             : 0x00000001 
      PrimaryLink              : [9334f404-932ad9b4] 
   Frame                    : 930adcc0 "Frame 0" 
   Flags                    : [00000064] SetupNotifyCalled EnableNameCaching FilterAttached
   FileSystemType           : [00000002] FLT_FSTYPE_NTFS
   VolumeLink               : [9334f404-932ad9b4] 
   DeviceObject             : 93460958 
   DiskDeviceObject         : 932b2320 
   FrameZeroVolume          : 932f1008 
   VolumeInNextFrame        : 00000000 
   Guid                     : "" 
   CDODeviceName            : "\Ntfs" 
   CDODriverName            : "\FileSystem\Ntfs" 
   TargetedOpenCount        : 55 
   Callbacks                : (932f109c)
   ContextLock              : (932f12f4)
   VolumeContexts           : (932f12f8)  Count=0
   StreamListCtrls          : (932f12fc)  rCount=2630 
   FileListCtrls            : (932f1340)  rCount=0 
   NameCacheCtrl            : (932f1388)
   InstanceList             : (932f1058)
      FLT_INSTANCE: 94114498 "luafv" "135000"
      FLT_INSTANCE: 9341d918 "FileInfo" "45000"

From here we can tell that there are in fact two minifilters attached to this volume, luafv and fileinfo. We knew about fileinfo from the IRP_CTRL, but what about luafv? Did it even get called? Well, unfortunately the only thing we can know for sure is that luafv was registered with fltmgr and attached to this volume. It might not have a callback registered for WRITEs, or that callback was called but it returned FLT_PREOP_SUCCESS_NO_CALLBACK, so fltmgr didn't use a completion node for it and there is no record of it… We can look at the filter and see the registered callbacks, but we might not be able to find a record of whether the callback was actually called.
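A small model shows why there is no trace of such a filter. The enum and structures below are made up for this sketch (they mirror the idea of the FLT_PREOP_* return values but are not the FltMgr definitions), and the behavior assigned to luafv in the usage example is purely an assumption, since we can't tell from the dump what it did.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Model outcomes for a filter's pre-operation stage. */
typedef enum {
    PREOP_NOT_REGISTERED,   /* no callback registered for this operation */
    PREOP_NO_CALLBACK,      /* called, but doesn't want a post-op callback */
    PREOP_WITH_CALLBACK     /* called, and wants a post-op callback */
} preop_result_t;

typedef struct {
    const char *name;
    preop_result_t result;
} filter_t;

#define MAX_NODES 8

/* Model of the completion node stack kept with the request. */
typedef struct {
    const char *filters[MAX_NODES];
    size_t count;
} completion_stack_t;

/* Walks the filters top to bottom and records a completion node only for
 * the ones that asked for a post-op callback. */
void run_pre_callbacks(const filter_t *filters, size_t n, completion_stack_t *stack)
{
    size_t i;
    assert(filters != NULL && stack != NULL);
    stack->count = 0;
    for (i = 0; i < n && stack->count < MAX_NODES; i++) {
        if (filters[i].result == PREOP_WITH_CALLBACK)
            stack->filters[stack->count++] = filters[i].name;
    }
}

/* Returns 1 if the stack holds evidence that the named filter was called. */
int has_record(const completion_stack_t *stack, const char *name)
{
    size_t i;
    assert(stack != NULL && name != NULL);
    for (i = 0; i < stack->count; i++) {
        if (strcmp(stack->filters[i], name) == 0) return 1;
    }
    return 0;
}
```

If you run this with a filter list where "luafv" is (hypothetically) PREOP_NO_CALLBACK and "FileInfo" is PREOP_WITH_CALLBACK, the stack ends up with a single node for FileInfo. From the stack alone, a luafv that was called and one that never registered look exactly the same.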
