So what is a context? A context is a structure owned by some system component (in our case a filter, legacy or mini) that is associated with some other structure. In a very general way, a context is a "value" and the object that it is associated with is a "key". In general, contexts are necessary when the flow of execution is controlled by a component other than the one that implements the actual code (for example callbacks, services and library functions, where the code is provided by the library or service, but when the code is called depends on something else).
Because a context is simply the value in a key-value pair, anyone can implement a generic context mechanism by using hashes, and this allows great flexibility in what one can attach a context to. For example, one can associate a context with a thread or a logged-on user or even a sector on a volume if they feel so inclined.
One issue with this approach is how to know when the underlying object is released so that the context can be released as well. For example, suppose a context is associated with thread 128, then thread 128 terminates, and at some later point another thread is created with the same ID of 128. Clearly the context should have been released, since it no longer refers to the same underlying object, but unless the entity implementing the context is notified that thread 128 was terminated, it won't know to release it.
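To make the key-value idea and the stale-ID problem concrete, here is a minimal user-mode sketch in C. It is not kernel code and not any Filter Manager API; every name in it (ctx_table, ctx_set, ctx_get, ctx_object_gone) is hypothetical. It assumes the owner of the table is somehow told when the underlying object goes away, which is exactly the notification the paragraph above says is hard to get.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical names throughout: a user-mode illustration only. */
#define CTX_BUCKETS 64

typedef struct ctx_entry {
    uintptr_t key;            /* identifies the underlying object, e.g. a thread ID */
    void *value;              /* the context owned by our component */
    struct ctx_entry *next;   /* chaining for hash collisions */
} ctx_entry;

typedef struct ctx_table {
    ctx_entry *buckets[CTX_BUCKETS];
    void (*cleanup)(void *value);   /* optional: run when a context is released */
} ctx_table;

static size_t ctx_hash(uintptr_t key) { return (size_t)(key % CTX_BUCKETS); }

void ctx_init(ctx_table *t, void (*cleanup)(void *)) {
    assert(t != NULL);
    for (size_t i = 0; i < CTX_BUCKETS; i++) t->buckets[i] = NULL;
    t->cleanup = cleanup;
}

/* Associate value with key. Returns 0 on success, -1 if the key already
   has a context or if allocation fails. */
int ctx_set(ctx_table *t, uintptr_t key, void *value) {
    size_t b = ctx_hash(key);
    for (ctx_entry *e = t->buckets[b]; e; e = e->next)
        if (e->key == key) return -1;
    ctx_entry *e = malloc(sizeof *e);
    if (!e) return -1;
    e->key = key;
    e->value = value;
    e->next = t->buckets[b];
    t->buckets[b] = e;
    return 0;
}

/* Return the context for key, or NULL if there is none. */
void *ctx_get(const ctx_table *t, uintptr_t key) {
    for (ctx_entry *e = t->buckets[ctx_hash(key)]; e; e = e->next)
        if (e->key == key) return e->value;
    return NULL;
}

/* Must be called when the underlying object goes away; otherwise a later
   object that reuses the same key would inherit a stale context.
   Returns 1 if a context was released, 0 if there was none. */
int ctx_object_gone(ctx_table *t, uintptr_t key) {
    for (ctx_entry **pp = &t->buckets[ctx_hash(key)]; *pp; pp = &(*pp)->next) {
        if ((*pp)->key == key) {
            ctx_entry *dead = *pp;
            *pp = dead->next;
            if (t->cleanup) t->cleanup(dead->value);
            free(dead);
            return 1;
        }
    }
    return 0;
}
```

With this shape, the thread 128 scenario only works out if ctx_object_gone(&table, 128) runs before the ID is reused. Without that call, ctx_get(&table, 128) would happily return the old thread's context to the new thread.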
So returning to filters, filter manager offers support for the following types of contexts (at least, these are the ones that are typically interesting; the other contexts can usually be implemented fairly easily by legacy filters): Streams, StreamHandles and Files. Let's look at how each of these contexts can be implemented. These are just examples of how it could be done with little support from the OS, but it's definitely not the best way it can be done… I'll address that after this section.
Fortunately the nice folks at MS decided to offer some help to filter writers and developed some support APIs. They are covered in the MSDN pages "Tracking Per-Stream Context in a Legacy File System Filter Driver" and "Tracking Per-File Context in a Legacy File System Filter Driver". These APIs rely on the file system implementing support for the FSRTL_ADVANCED_FCB_HEADER structure. Please note that a file system is not required to implement this support, but if it doesn't then it won't work with Filter Manager. Anyway, these APIs allow any kernel component (filter or not) to associate a context with an SCB and to be notified when the SCB itself is torn down. Please note that the SCB might not be torn down immediately when the last FILE_OBJECT for it is closed, because some file systems implement SCB caching. The filter might be able to benefit from this: it can keep its context for as long as the SCB is cached, so if someone opens a new handle to the same stream, the filter's context is still there.
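The pattern behind these APIs can be sketched in user mode as follows. This is only a simulation of the idea (the object itself carries a list of contexts, and each context brings a callback that runs at teardown); it does not call or reproduce the real FsRtl routines. All names here (stream_header, stream_ctx, stream_insert_ctx, stream_lookup_ctx, stream_teardown, my_filter_ctx) are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* One context attached to a stream. Hypothetical layout. */
typedef struct stream_ctx {
    struct stream_ctx *next;
    const void *owner_id;                     /* identifies which component owns it */
    void (*free_cb)(struct stream_ctx *ctx);  /* run when the stream is torn down */
} stream_ctx;

/* Stand-in for the per-stream header maintained by the file system. */
typedef struct stream_header {
    stream_ctx *contexts;
} stream_header;

/* Attach a context to the stream. */
void stream_insert_ctx(stream_header *h, stream_ctx *c) {
    c->next = h->contexts;
    h->contexts = c;
}

/* Find the context a given owner attached, or NULL if there is none. */
stream_ctx *stream_lookup_ctx(const stream_header *h, const void *owner_id) {
    for (stream_ctx *c = h->contexts; c; c = c->next)
        if (c->owner_id == owner_id) return c;
    return NULL;
}

/* Called by the "file system" when it finally discards the stream. This is
   the notification a plain hash table keyed by object address never gets. */
void stream_teardown(stream_header *h) {
    while (h->contexts) {
        stream_ctx *c = h->contexts;
        h->contexts = c->next;
        if (c->free_cb) c->free_cb(c);
    }
}

/* An example owner: embeds stream_ctx as its first member so the callback
   can cast back to the full structure. */
typedef struct my_filter_ctx {
    stream_ctx base;
    int open_count;
    int *freed_flag;   /* only here so the sketch can observe the callback */
} my_filter_ctx;

void my_filter_free(stream_ctx *c) {
    my_filter_ctx *mine = (my_filter_ctx *)c;
    *mine->freed_flag = 1;
}
```

Note that nothing in this sketch ties the callback to the last handle being closed. The context lives until stream_teardown runs, which mirrors the SCB caching behaviour described above.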
There is another useful structure when implementing contexts, the RTL_GENERIC_TABLE. A generic table is an OS structure that can be used as a general-purpose lookup table, so that the filter doesn't need to implement its own. However, please note that it is implemented as a tree rather than a true hash, so if performance must be really good then a custom hash might still be necessary.
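To show what "a table that is really a tree" looks like from the caller's side, here is a small user-mode sketch: the caller supplies a comparison routine and the table keeps elements ordered by it. This is not the RTL_GENERIC_TABLE API. The names (tbl, tbl_insert, tbl_lookup, fo_ctx) are hypothetical, a plain unbalanced binary search tree stands in for whatever the real table does internally, and deletion is left out for brevity.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Caller-supplied ordering: <0, 0 or >0, like strcmp. */
typedef int (*tbl_compare)(const void *a, const void *b);

typedef struct tbl_node {
    struct tbl_node *left, *right;
    void *element;
} tbl_node;

typedef struct tbl {
    tbl_node *root;
    tbl_compare compare;
    size_t count;
} tbl;

void tbl_init(tbl *t, tbl_compare cmp) {
    t->root = NULL;
    t->compare = cmp;
    t->count = 0;
}

/* Insert element. Returns the stored element: the new one, or the existing
   one if an equal key is already present. Returns NULL if allocation fails. */
void *tbl_insert(tbl *t, void *element) {
    tbl_node **pp = &t->root;
    while (*pp) {
        int c = t->compare(element, (*pp)->element);
        if (c == 0) return (*pp)->element;
        pp = (c < 0) ? &(*pp)->left : &(*pp)->right;
    }
    tbl_node *n = malloc(sizeof *n);
    if (!n) return NULL;
    n->left = n->right = NULL;
    n->element = element;
    *pp = n;
    t->count++;
    return element;
}

/* Look up by key. Cost grows with tree depth, unlike a hash bucket probe. */
void *tbl_lookup(const tbl *t, const void *key) {
    const tbl_node *n = t->root;
    while (n) {
        int c = t->compare(key, n->element);
        if (c == 0) return n->element;
        n = (c < 0) ? n->left : n->right;
    }
    return NULL;
}

/* Example element: a context keyed by the address of a file object. */
typedef struct fo_ctx {
    const void *file_object;   /* the key */
    int flags;                 /* the payload */
} fo_ctx;

int fo_ctx_compare(const void *a, const void *b) {
    uintptr_t ka = (uintptr_t)((const fo_ctx *)a)->file_object;
    uintptr_t kb = (uintptr_t)((const fo_ctx *)b)->file_object;
    return (ka > kb) - (ka < kb);
}
```

A lookup here takes a probe element with only the key filled in, for example `fo_ctx probe = { some_file_object, 0 };` followed by `tbl_lookup(&table, &probe)`.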
To wrap it up, in order for a filter to implement a similar scheme to FltMgr's contexts it can use the following scheme:
- Use OS support for stream contexts (FsRtlInsertPerStreamContext, FsRtlLookupPerStreamContext and so on)
- Use OS support for file contexts (FsRtlInsertPerFileContext, FsRtlLookupPerFileContext and so on)
- Implement a hash for per-FILE_OBJECT contexts. Either use a straight hash or use a per-stream structure which includes a hash of the FILE_OBJECTs for that stream (which is useful because the number of entries in each hash is much smaller, so the RTL_GENERIC_TABLE might be a good fit).
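The last item, a per-stream structure that holds the per-FILE_OBJECT contexts for that stream, could look roughly like the user-mode sketch below. All names are hypothetical (per_stream_ctx, handle_ctx and the three functions), and a simple linked list stands in for the per-stream hash or generic table, on the assumption that a single stream rarely has many open FILE_OBJECTs at once.

```c
#include <assert.h>
#include <stdlib.h>

/* Context for one open FILE_OBJECT (represented here by an opaque pointer). */
typedef struct handle_ctx {
    struct handle_ctx *next;
    const void *file_object;   /* the key within this stream */
    unsigned access;           /* example payload */
} handle_ctx;

/* The filter's per-stream context; it owns the handle contexts for the stream. */
typedef struct per_stream_ctx {
    handle_ctx *handles;
    size_t handle_count;
} per_stream_ctx;

/* Return the handle context for file_object on this stream, or NULL. */
handle_ctx *handle_ctx_find(const per_stream_ctx *s, const void *file_object) {
    for (handle_ctx *h = s->handles; h; h = h->next)
        if (h->file_object == file_object) return h;
    return NULL;
}

/* Return the existing handle context, or create a zeroed one.
   Returns NULL only if allocation fails. */
handle_ctx *handle_ctx_get_or_create(per_stream_ctx *s, const void *file_object) {
    handle_ctx *h = handle_ctx_find(s, file_object);
    if (h) return h;
    h = calloc(1, sizeof *h);
    if (!h) return NULL;
    h->file_object = file_object;
    h->next = s->handles;
    s->handles = h;
    s->handle_count++;
    return h;
}

/* To be called when the FILE_OBJECT itself goes away.
   Returns 1 if a context was removed, 0 if none was found. */
int handle_ctx_remove(per_stream_ctx *s, const void *file_object) {
    for (handle_ctx **pp = &s->handles; *pp; pp = &(*pp)->next) {
        if ((*pp)->file_object == file_object) {
            handle_ctx *dead = *pp;
            *pp = dead->next;
            free(dead);
            s->handle_count--;
            return 1;
        }
    }
    return 0;
}
```

The lookup is two-level: first find the per-stream context (through the OS stream context support), then search only that stream's handles. A real filter would also need locking around the list, which this sketch leaves out.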
Finally, I'd like to point out that any filter (legacy or mini) that implements its own streams (that is, completes an IRP_MJ_CREATE itself and puts something in FILE_OBJECT->FsContext) should implement support for FSRTL_ADVANCED_FCB_HEADER; otherwise contexts won't work for those files and it might cause problems for other filters. This should be fairly easy to implement, though, by following the MSDN documentation.
Alex,
This is a good post, but I find missing the whole area generated by NTFS hard links, where the notion of a "link/name context" seems to be needed.
Some questions the [mini]filter driver developer needs to contemplate:
- Which name was changed in IRP_MJ_SET_INFORMATION/FileRenameInformation
- Which name was (maybe will have been!) deleted in IRP_MJ_SET_INFORMATION/FileDispositionInformation (and "friend" FILE_DELETE_ON_CLOSE)
The FileObject, FileObject->FsContext (aka FCB/SCB), model seems to be less than complete for these questions.
There seems to be nothing in the "legacy filter" world other than brute force [normalized] file name (string) comparison.
Then here also, filter manager seems [to me, at least] to come up short in terms of assistance for the mini-filter developer.
Your thoughts, as ever, much appreciated.
Best Wishes,
Lyndon
Hi Lyndon! Thanks for your comment. This is a very good point, I'll talk about the relation between a file, its links and streams in the next post then.
Don't forget the mess that is symbolic links and all the work that Win32 and the IoManager do to make them almost impossible to deal with in a name-aware filter (because they do not respect the reparse point rules)
Hi Rod. I can't think of any particular unpleasantness that symlinks introduce with regard to contexts. Or are you referring to them in general?
I for one am grateful that they needed to add ECPs to implement symlinks or else I'm not sure we would have had them even now.
Alex,
No, my objection to symlinks has nothing to do with contexts (my context dislikes are aimed at RDR, but that's another story).
But you appear to have hit one of my hot buttons.
The issue with symlinks is that IOCFSDH goes straight through them on the same volume. This allows shoddy code (the stuff that doesn't handle STATUS_REPARSE) to work just dandy, but it breaks code that wants to understand the name space.
An example (there are many more, I know because every namespace aware filter I have ever worked on has needed work because of this).
If I open /foo/bar/foo/jim and it says yes, then I have every right to assume that bar is a directory. So I may want to set up my data structures like that. I may even want to open bar FILE_DIRECTORY. I can be sure that I won't enter a circularity when traversing it because NTFS doesn't allow hard links to directories (I have worked on a filesystem that allowed that, don't ask).
But in fact bar is a symlink and IOCFSDH has *SILENTLY HANDLED IT FOR ME*, further there is no way to know that without traversing the path by hand and looking at the names.
Then a bit later I open /foo/fred/jim and that fails, but as far as the user is concerned there is no difference - fred and bar are both symbolic links but jim is off disk and I have to explain that they are different. How am I going to do that and pretend that I have a well engineered solution?
I'm sorry, but that is just shoddy engineering. I've been there and I can guess why: it was easier under time pressure to add a gross hack to one module than to fix a thousand others.
As far as a filesystem is concerned, an on-disk symbolic link and a mount point are the same. They should have the same code to handle them. They don't and that sucks.
It would have been *so* much better if IOCFSDH had been modified to return the reparse buffer and fail. Given that, I would have even been OK with a "go through symbolic links" option.
I'm not sure why ECPs and symlinks are related, but I'll guess that this allows known modules to work around these issues.