Thursday, January 20, 2011

About IRP_MJ_CREATE and minifilter design considerations - Part VI

I'm pretty much done with what I wanted to cover about IRP_MJ_CREATE. I'd just like to go through a couple more things that I think are important before closing this topic.

FILE_DELETE_ON_CLOSE create option

This option sets a flag in a file system structure that is associated with the FILE_OBJECT itself and not with the stream. There is no way to query after the fact whether this flag was set. Once the FILE_OBJECT is cleaned up, the flag moves to the SCB (a per-stream structure), where it can be queried using IRP_MJ_QUERY_INFORMATION with FileStandardInformation. The same flag can be set on a stream by an IRP_MJ_SET_INFORMATION with the FileDispositionInformation information class. Please note that if there are no other FILE_OBJECTs for the same stream at the time the FILE_OBJECT that was created with FILE_DELETE_ON_CLOSE is closed, then the flag will be moved to the SCB and the stream will be deleted immediately, so there is no opportunity for a filter to query the flag or remove it.

Filters that want to be able to clear the "delete intent" from a file can remove the FILE_DELETE_ON_CLOSE flag from the CreateOptions in preCreate and then, in postCreate, set it on the stream with an IRP_MJ_SET_INFORMATION. This is not exactly the same as FILE_DELETE_ON_CLOSE, but it's a pretty good approximation. It also allows the delete-on-close flag in the SCB to be queried and possibly reset at any time.
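To make the flag handling concrete, here is a minimal user-mode sketch of the bookkeeping involved. It is not driver code. It assumes the create parameters carry the options as one packed 32-bit value (create options in the low 24 bits, disposition in the high 8 bits) and that FILE_DELETE_ON_CLOSE has the value 0x00001000, as in the WDK headers. The demo_ names and the demo_create_state structure are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Same value as FILE_DELETE_ON_CLOSE in the WDK headers. */
#define DEMO_FILE_DELETE_ON_CLOSE 0x00001000u

/* Hypothetical state a filter would carry from preCreate to postCreate. */
typedef struct demo_create_state {
    bool delete_on_close_requested;
} demo_create_state;

/*
 * Models the preCreate step: strip FILE_DELETE_ON_CLOSE from the packed
 * options value and remember that the caller asked for it. All other
 * bits, including the disposition in the high byte, are left untouched.
 */
uint32_t demo_strip_delete_on_close(uint32_t options, demo_create_state *state)
{
    assert(state != NULL);
    state->delete_on_close_requested =
        (options & DEMO_FILE_DELETE_ON_CLOSE) != 0;
    return options & ~DEMO_FILE_DELETE_ON_CLOSE;
}

/*
 * Models the postCreate decision: the disposition should be set on the
 * stream only if the create succeeded and the flag was stripped earlier.
 */
bool demo_should_set_disposition(const demo_create_state *state,
                                 bool create_succeeded)
{
    assert(state != NULL);
    return create_succeeded && state->delete_on_close_requested;
}
```

In a real minifilter the first step would operate on the create parameters in the callback data (and mark the callback data dirty after changing them), and the second step would issue the FileDispositionInformation set from postCreate, for example through FltSetInformationFile.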

STATUS_REPARSE in postCreate

A filter can return STATUS_REPARSE in postCreate. It can do so if the create failed or even if it was successful, provided that the filter takes care of undoing what was done in the file system (see FltCancelFileOpen and IoCancelFileOpen).

FltGetFileNameInformation behavior

FltGetFileNameInformation can be called during a create, both in preCreate and postCreate. Calling FltGetFileNameInformation might result in FltMgr actually opening the file if the file doesn't have a path (open by ID), but there should be no open of the actual file in any other case. If the caller is asking for a normalized path in preCreate, FltMgr will try to open the parent directory and enumerate its entries in order to get the long file name. If the file doesn't exist (if the IRP_MJ_CREATE is actually trying to create a file), then it is possible that even the normalized name contains a short name as the final component (for example, if a filter is trying to create "\Foo\Ba~1.txt" then the normalized path will have Ba~1 as the final component; everything else in the path should be normalized though). There is a really big performance hit associated with requesting a normalized name in preCreate, so it should be avoided if possible (this might not be possible in all cases, but perhaps the call can be moved to postCreate, or maybe the opened name will do). The perf hit is much smaller when getting a normalized path in postCreate, primarily because of the name cache.

Contexts in preCreate

Before the IRP_MJ_CREATE reaches the file system, the FILE_OBJECT is not associated with a file system stream, so any mechanism that requires the SCB will not function. For minifilters this includes file-related contexts (stream, streamhandle, file), the name cache (hence the perf penalty when getting a normalized name) and possibly other things. Please note that because of renames, even if a minifilter opens a file with the same name in preCreate and then lets the IRP_MJ_CREATE continue, there is no guarantee that both opens will reach the same stream. This is one reason security products should not attempt to scan files in preCreate: there is no way to guarantee that what they scanned will be the stream that the original IRP_MJ_CREATE ends up opening.

Opening a new FILE_OBJECT for an existing FILE_OBJECT

Sometimes a minifilter needs its own handle to the same stream that a user has opened through a FILE_OBJECT (FO1). Rather than getting the file name of the user's file and then calling FltCreateFile with that name, a minifilter can simply call create (IoCreateFile, ZwCreateFile) with an empty name and use a handle to FO1 as the RootDirectory handle when setting up the OBJECT_ATTRIBUTES structure. This results in an IRP_MJ_CREATE where FILE_OBJECT->FileName is empty and FILE_OBJECT->RelatedFileObject is FO1, and the file system will simply open a new handle to the same stream. This is a much better approach because it doesn't require using file names, so there is no hit associated with FltGetFileNameInformation, and it is not vulnerable to renames of the original file. Of course, the user's FILE_OBJECT must already be open.

Writing to read-only files

A pretty interesting behavior of file systems is that when an IRP_MJ_CREATE creates a read-only file, the handle associated with that IRP_MJ_CREATE can be used to write to the file. This is interesting because if a filter tries to open the same file the user has opened in postCreate, using the same parameters, it will not necessarily get the same rights: that depends on whether the file existed before that IRP_MJ_CREATE or not.

FileObject->FileName is not meaningful after a successful IRP_MJ_CREATE

Because FileObject->FileName is only a vehicle to pass the name information from the IO manager to the file system, once the IRP_MJ_CREATE actually reaches a file system and a stream is opened, it should be ignored. This is because once the FILE_OBJECT is associated with an SCB, the name of that SCB can immediately change (a rename on another FILE_OBJECT for that SCB) and the entity that knows the name of the SCB at all times is the file system, but there is no mechanism for a file system to go in and update all FILE_OBJECTs associated with an SCB when the name changes. As a side note, I still believe that the FILE_OBJECT structure would have been better off without a FileName member and that the FileName should have been a member of the IRP_MJ_CREATE.

FILE_OBJECT->RelatedFileObject is not recursive

In an IRP_MJ_CREATE if FILE_OBJECT->RelatedFileObject is not null, then that FILE_OBJECT (RFO) cannot also have a FILE_OBJECT->RelatedFileObject. However, since the RelatedFileObject has already been opened it means one cannot rely on its FileName member (see above) and so whether it had a RelatedFileObject or not is irrelevant.

SL_OPEN_TARGET_DIRECTORY in preCreate

SL_OPEN_TARGET_DIRECTORY means that this create is actually targeted at the parent directory of the FILE_OBJECT->FileName path (if FILE_OBJECT->FileName is "\foo\bar\baz" and SL_OPEN_TARGET_DIRECTORY is set, then the SCB that will be associated with this FILE_OBJECT is the one for "\foo\bar"). FltGetFileNameInformation in preCreate is aware of this and will actually return the name "\foo\bar". So if a minifilter needs to get the full path even when SL_OPEN_TARGET_DIRECTORY is set, it must remove this flag before calling FltGetFileNameInformation (and set it back before sending the IRP_MJ_CREATE down, of course).
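The clear, query and restore sequence described above can be sketched in plain user-mode C. This is only a model: demo_query_name is a hypothetical stand-in that mimics the described behavior of the name query (it returns the parent path when the flag is set), narrow strings replace UNICODE_STRING, and the demo_create_params structure is invented for the illustration. The flag value 0x04 matches SL_OPEN_TARGET_DIRECTORY in the WDK headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Same value as SL_OPEN_TARGET_DIRECTORY in the WDK headers. */
#define DEMO_SL_OPEN_TARGET_DIRECTORY 0x04u

/* Hypothetical stand-in for the relevant create parameters. */
typedef struct demo_create_params {
    uint8_t operation_flags;
    const char *file_name;
} demo_create_params;

/*
 * Stand-in for the name query. When the target-directory flag is set it
 * returns the parent of file_name, otherwise file_name itself.
 * Returns 0 on success, -1 if the output buffer is too small.
 */
int demo_query_name(const demo_create_params *p, char *out, size_t out_size)
{
    assert(p != NULL && p->file_name != NULL && out != NULL);
    size_t len = strlen(p->file_name);
    if (p->operation_flags & DEMO_SL_OPEN_TARGET_DIRECTORY) {
        const char *last = strrchr(p->file_name, '\\');
        if (last != NULL) {
            len = (size_t)(last - p->file_name);
            if (len == 0) {
                len = 1; /* the parent of "\baz" is the root "\" */
            }
        }
    }
    if (len + 1 > out_size) {
        return -1;
    }
    memcpy(out, p->file_name, len);
    out[len] = '\0';
    return 0;
}

/*
 * Gets the full path even when the flag is set: clear the flag, query,
 * then restore the original flags whether or not the query succeeded.
 */
int demo_query_full_name(demo_create_params *p, char *out, size_t out_size)
{
    assert(p != NULL);
    uint8_t saved = p->operation_flags;
    p->operation_flags &= (uint8_t)~DEMO_SL_OPEN_TARGET_DIRECTORY;
    int rc = demo_query_name(p, out, out_size);
    p->operation_flags = saved;
    return rc;
}
```

The point of the sketch is the unconditional restore: the flag has to be put back on every path, including the failure path, because the create still has to reach the file system with its original meaning.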

8 comments:

  1. I don't think I'm seeing a performance improvement (on XP) moving FltGetFileNameInformation from preCreate to postCreate. You say the "perf hit is much smaller when getting a normalized path in postCreate, primarily because of the cache" but the IRP hasn't returned back out to the FltMgr yet so it won't have been cached. Perhaps it will be cached now and re-used later, but I think the much bigger problem is the ill effects of accessing the file system during the IRP_MJ_CREATE which is *synchronized* unlike other IRPs, no?

  2. Well, FltMgr actually sees the IRP complete before calling any minifilter's postCreate callback and so it could do things after the FS completes the operation but before any minifilter sees it. However, that shouldn't matter anyway with respect to this issue since the name cache is stored in a context associated with the stream and so once the IRP_MJ_CREATE is completed by the file system any FltGetFileNameInformation() call can use the cache from the SCB.
    It is true that IRP_MJ_CREATE is synchronized by default (unlike other IRPs) but that shouldn't have an impact on performance.
    I don't know why you don't see a performance improvement; you'd have to tell me more about your architecture (normalized names are the heavy hitters here; if you're just getting opened names then the performance hit in preCreate is much smaller). It's also possible that moving the FltGetFileNameInformation() call from pre to postCreate requires some other changes that could degrade performance to the point where any benefits of the name cache would be cancelled.
    Finally, I would also test on Win7 since I know the name cache has been improved constantly (that is, both in Vista and Win7) and so it would be interesting to get that data point as well.

  3. Great help, thanks! I am monitoring normalized paths with FltGetFileNameInformation. I used KeQueryPerformanceCounter diffs to isolate FltGetFileNameInformation as the sole significantly expensive operation in my minifilter. I have a user mode program which just loops, creating writing and deleting uniquely named files in one thread and doing similar registry operations in another. My driver slows this program by a whopping factor of 100 files per second. My focus is on XP, but yes I am interested in comparing Vista/Win7 when I get a chance to do that.

  4. Microsoft has a very nice tool you can use for performance work, Xperf (http://msdn.microsoft.com/en-us/performance/cc825801).
    You should see an improvement in a real system from moving FltGetFileNameInformation(...FLT_FILE_NAME_NORMALIZED...) from preCreate to postCreate. If you are not then it's interesting to know why (feel free to shoot me an email if you'd prefer to talk about this offline).
    The biggest improvement you can make is to use FltGetFileNameInformation(...FLT_FILE_NAME_OPENED...), but whether you can use it depends on what your driver does. However, it's usually worth going after this particular change if it's not way too complicated.
    As I said, if you want to talk specifics feel free to shoot me an email.

  5. Hello Alex, if I'm interested in only the file extensions, is looking at the iopb->TargetFileObject->FileName in pre-create/post-create the right approach? Or is FltGetFileNameInformation the only way to obtain the file name? Thank you.

  6. Hello Alex,

    Q1. Can I allocate a file context in PreCreate?
    Q2. Is the FileObject->FileName meaningful in postCreate?

    Thank you.

  7. Hi Krish,

    No, the file name in preCreate might not have an extension. It might be empty (if the caller wants to reopen an existing stream) or it might be just the name of an alternate data stream... However, if you do find an extension there it's probably the right one so you could use it when you find it and when you don't find it you can call FltGetFileNameInformation.
    You can allocate a context in preCreate but you can't set it because there is nothing to attach it to (the FILE_OBJECT is not yet initialized and you don't attach it to the FILE_OBJECT but rather to the stream the FILE_OBJECT is pointing to...).
    The FILE_OBJECT->FileName is not guaranteed to be meaningful in postCreate so you can't rely on that...

  8. Thank you Alex, for explaining this.

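As an illustration of the extension question discussed in the comments above, here is a small user-mode sketch of the "use the extension when you find it, otherwise fall back" check. It is a simplification: it works on NUL-terminated narrow strings rather than a UNICODE_STRING, and demo_find_extension is a hypothetical helper, not a FltMgr routine. It treats an empty name and a name whose final component is only a stream name (":stream") as having no extension, which is the case where a driver would fall back to FltGetFileNameInformation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Looks for an extension in the final component of `name`, ignoring any
 * ":stream" suffix. On success returns true and points *ext at the first
 * character after the last dot, with *ext_len set to the extension
 * length. Returns false when no extension can be found in the name.
 */
bool demo_find_extension(const char *name, const char **ext, size_t *ext_len)
{
    assert(name != NULL && ext != NULL && ext_len != NULL);

    /* The final component starts after the last backslash, if any. */
    const char *component = strrchr(name, '\\');
    component = (component != NULL) ? component + 1 : name;

    /* The file name part ends at the first colon (start of a stream name). */
    size_t base_len = strcspn(component, ":");

    /* Find the last dot inside the file name part only. */
    const char *dot = NULL;
    for (size_t i = 0; i < base_len; i++) {
        if (component[i] == '.') {
            dot = component + i;
        }
    }

    /* No dot, or a trailing dot with nothing after it: no extension. */
    if (dot == NULL || dot + 1 == component + base_len) {
        return false;
    }

    *ext = dot + 1;
    *ext_len = (size_t)((component + base_len) - (dot + 1));
    return true;
}
```

A caller would try this on the name seen in preCreate and request the name from FltMgr only when it returns false. In a real minifilter, the parsed name information already exposes the extension after FltParseFileNameInformation, so a hand-written parser like this is mainly useful for the quick preCreate check.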