Thursday, February 3, 2011

More contexts: tracking hardlinks

In one of the comments to my previous post, Lyndon pointed out that there is not a lot of support from either the OS or FltMgr when it comes to tracking hardlinks. So I figured I'd explain why this is so complicated and what a filter would need to do to implement it. I'm not going to describe what hardlinks are or how they operate, focusing instead on what FltMgr does and what a filter might need to do as well.

However, there is one peculiarity of hardlinks that I'll keep referring to. Once a file is opened, the file system remembers which link was used to open it and returns that name when the file name is queried. If that link is renamed, the FS will of course return the new name.

So the problem with hardlinks is, like Lyndon pointed out, that the SCB model isn't granular enough. The SCB is associated with the stream and it doesn't really matter how the stream was opened (by which name), the SCB is the same. So a StreamContext is the same, regardless of how many hardlinks were used. On the other hand, StreamHandleContexts are too granular, in that they simply track the FILE_OBJECT and different opens even from the same link (using the same name for the file) will obviously get different FILE_OBJECTs and thus different StreamHandleContexts.

Filter manager doesn't offer an additional type of context. However, it does need to deal with hardlinks because it implements a name cache. The name cache is pretty simple to implement for files that only have one name: the name is stored in a structure associated with the stream. For hardlinks, however, the structure clearly needs to be different so that opens through the same name are cached properly. FltMgr solves this problem by not caching the file name in a structure associated with the SCB if the file has more than one link (as reported by the FileStandardInformation information class); instead, it caches the name per FILE_OBJECT.
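To make the caching rule concrete, here is a minimal user-mode sketch in C. It is not FltMgr's actual implementation: StreamEntry, OpenEntry, cache_name and lookup_name are hypothetical names invented for illustration, with StreamEntry standing in for the per-stream structure and OpenEntry standing in for a FILE_OBJECT. The only behavior it models is the one described above: one link means the name is cached on the stream, more than one link means it is cached per open.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_NAME 260

/* Hypothetical stand-in for the per-stream structure. */
typedef struct StreamEntry {
    unsigned number_of_links;   /* what FileStandardInformation would report */
    int has_cached_name;
    char cached_name[MAX_NAME];
} StreamEntry;

/* Hypothetical stand-in for one FILE_OBJECT (one open of the stream). */
typedef struct OpenEntry {
    StreamEntry *stream;
    int has_cached_name;
    char cached_name[MAX_NAME];
} OpenEntry;

/* Store a name, choosing the cache location from the link count. */
void cache_name(OpenEntry *open, const char *name)
{
    assert(open != NULL && open->stream != NULL && name != NULL);

    if (open->stream->number_of_links > 1) {
        /* Several links: the name depends on which link was opened. */
        strncpy(open->cached_name, name, MAX_NAME - 1);
        open->cached_name[MAX_NAME - 1] = '\0';
        open->has_cached_name = 1;
    } else {
        /* Single link: every open of the stream shares the same name. */
        strncpy(open->stream->cached_name, name, MAX_NAME - 1);
        open->stream->cached_name[MAX_NAME - 1] = '\0';
        open->stream->has_cached_name = 1;
    }
}

/* Return the cached name for this open, or NULL on a cache miss. */
const char *lookup_name(const OpenEntry *open)
{
    assert(open != NULL && open->stream != NULL);

    if (open->has_cached_name)
        return open->cached_name;
    /* A stream-level name is only trusted while there is a single link. */
    if (open->stream->number_of_links == 1 && open->stream->has_cached_name)
        return open->stream->cached_name;
    return NULL;
}
```

With two links on the stream, two opens through different names each get their own name back, while opens of a single-link file share one cached entry. The sketch ignores cache invalidation entirely.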

If a filter wanted to keep track of hardlinks it would need to, as Lyndon indicated in his comment, look at the name that it gets from the file system (the FileNameInformation class) and from that deduce which link was used. This is complicated because a link can be renamed at any time, so that must be taken into account. A possible implementation would need to keep some structure in a per-stream context that maps each FILE_OBJECT to a link (possibly introducing an artificial concept like linkID or linkGuid or something). In post-create it would map the newly opened FILE_OBJECT to the appropriate link, which requires looking at the link names using the FileHardLinkInformation class, while disabling renames for that stream so the names stay stable during the lookup.
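The bookkeeping described above might look roughly like the following user-mode sketch in C. Everything in it is an assumption made for illustration: StreamCtx, Link, OpenMap, track_open, rename_link and name_for_open are hypothetical names, the fixed-size arrays stand in for whatever a real per-stream context would allocate, and names are compared case-sensitively, which a real filter on Windows could not get away with. It also leaves out locking, which is where the "disable renames" requirement would come in.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_LINKS 8
#define MAX_OPENS 16
#define MAX_NAME  260

typedef struct Link {          /* one hardlink, with an artificial link ID */
    int id;
    char name[MAX_NAME];
} Link;

typedef struct OpenMap {       /* one FILE_OBJECT mapped to a link ID */
    const void *file_object;
    int link_id;
} OpenMap;

typedef struct StreamCtx {     /* hypothetical per-stream context */
    Link links[MAX_LINKS];
    size_t link_count;
    OpenMap opens[MAX_OPENS];
    size_t open_count;
    int next_link_id;
} StreamCtx;

static Link *find_link_by_name(StreamCtx *ctx, const char *name)
{
    for (size_t i = 0; i < ctx->link_count; i++)
        if (strcmp(ctx->links[i].name, name) == 0)
            return &ctx->links[i];
    return NULL;
}

/* Post-create step: map a new open to its link; returns link ID or -1. */
int track_open(StreamCtx *ctx, const void *file_object, const char *opened_name)
{
    assert(ctx != NULL && file_object != NULL && opened_name != NULL);
    if (ctx->open_count == MAX_OPENS || strlen(opened_name) >= MAX_NAME)
        return -1;

    Link *link = find_link_by_name(ctx, opened_name);
    if (link == NULL) {        /* first open through this link */
        if (ctx->link_count == MAX_LINKS)
            return -1;
        link = &ctx->links[ctx->link_count++];
        link->id = ++ctx->next_link_id;
        strcpy(link->name, opened_name);
    }
    ctx->opens[ctx->open_count].file_object = file_object;
    ctx->opens[ctx->open_count].link_id = link->id;
    ctx->open_count++;
    return link->id;
}

/* Rename step: the link keeps its ID, only the name changes. */
int rename_link(StreamCtx *ctx, const char *old_name, const char *new_name)
{
    assert(ctx != NULL && old_name != NULL && new_name != NULL);
    Link *link = find_link_by_name(ctx, old_name);
    if (link == NULL || strlen(new_name) >= MAX_NAME)
        return -1;
    strcpy(link->name, new_name);
    return 0;
}

/* Current name of the link a given open came through, or NULL. */
const char *name_for_open(const StreamCtx *ctx, const void *file_object)
{
    assert(ctx != NULL);
    for (size_t i = 0; i < ctx->open_count; i++) {
        if (ctx->opens[i].file_object != file_object)
            continue;
        for (size_t j = 0; j < ctx->link_count; j++)
            if (ctx->links[j].id == ctx->opens[i].link_id)
                return ctx->links[j].name;
    }
    return NULL;
}
```

Because opens refer to a link ID rather than a name, a rename only has to touch one Link entry and every FILE_OBJECT opened through that link sees the new name. That mirrors what the file system itself reports after a rename.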

I was planning on writing more on this topic and playing with hardlinks some more, but I'm busy at work and it'll have to wait for a future post.