Thursday, October 28, 2010

Useful Models - how choosing the right abstraction can help design and some useful abstractions for working with minifilters

The poll on the site indicated this was the topic most people were interested in, so here it is.

I find myself quite often in the position of trying to explain why something doesn't work the way someone expects it to. I guess this is due in large part to the fact that the work I do (storage and file systems) is something people interact with all the time but that in fact operates quite differently from the abstraction it presents to users. I've mentioned this in my other posts anyway…

So in order to explain why some architecture won't work, I try to find an analogy or a model that makes the problem immediately obvious. Some of these models are very dependent on the problem I'm dealing with, while others I keep reusing. Some of the models are obviously not practical, but they highlight certain features of the system. It would be nice if these models could be implemented as actual tools (like Driver Verifier), but the reality is that in some cases the benefits would not justify the effort of writing something like this… So I guess most of them will remain in the realm of thought experiments, but they can be useful nevertheless.

I'll go through a list of commonly asked questions and the models that I find help explain the problem. I'm sure most of the readers of this post could contribute their own examples so please do so through the comments.

Q: Why not send the file name directly to our minifilter from a service or some other user mode program?
A: It really depends on the other minifilters on the system. The model here is a minifilter that implements ALL of the namespace perfectly at its level, with file IDs and hardlinks and so on, and below itself keeps a flat structure where all streams are identified by GUIDs and there are no directories. If your minifilter happens to be below such a filter, then obviously the name of the file at your level (which is a GUID) has absolutely nothing to do with the name the user mode service sees (which can be a regular path). Now, it must be said that any minifilter that does anything like this to the namespace would be in the virtualization group, so if you are above the virtualization group you don't have this problem. But if you are IN or below the virtualization group, then you must take this into account.
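To make the model concrete, here is a toy Python sketch of it. This is only a simulation of the thought experiment, not minifilter code: the class names (FlatStore, VirtualizingFilter), the lookup_below helper and the sample path are all made up for illustration.

```python
import uuid


class FlatStore:
    """Bottom of the stack: no directories, streams are keyed only by GUID."""

    def __init__(self):
        self.streams = {}

    def create(self, guid, data):
        self.streams[guid] = data

    def read(self, name):
        # Raises KeyError if the name means nothing at this level.
        return self.streams[name]


class VirtualizingFilter:
    """Hypothetical filter that owns the entire namespace at its level."""

    def __init__(self, lower):
        self.lower = lower
        self.path_to_guid = {}

    def create(self, path, data):
        guid = str(uuid.uuid4())
        self.path_to_guid[path] = guid
        self.lower.create(guid, data)

    def read(self, path):
        return self.lower.read(self.path_to_guid[path])


store = FlatStore()
top = VirtualizingFilter(store)

# The name as the user mode service sees it (a made-up path).
user_path = r"\Users\me\report.txt"
top.create(user_path, b"hello")


def lookup_below(name):
    """What a filter sitting below the virtualizing filter can resolve."""
    try:
        return store.read(name)
    except KeyError:
        return None


print(top.read(user_path))                          # works above the virtualizing filter
print(lookup_below(user_path))                      # the path is meaningless below it
print(lookup_below(top.path_to_guid[user_path]))    # only the GUID works down there
```

The service can hand the path to anything above the virtualizing filter and get the file back, but the same string is useless to a filter below it, which only ever sees GUIDs.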

Q: Why not communicate with my minifilter through a private communication channel and have it open and read files on behalf of my service?
A: If you are in or below the virtualization group, see the example above. If you are below the AV group, then you should always think about malware. Let's say you do something very benign, like open your own file and read some configuration data (as opposed to opening and parsing or executing random user files). If there is a vulnerability in your parsing code, someone can write a file-based exploit targeting your product, and no AV will be able to see your accesses to the file and catch the exploit. Unfortunately, there isn't a good generic malware model, so you need to construct your own every time you need to explain why bypassing some security measure is not a good idea…

Q: Why not create a backup of a VHD file while the volume is mounted? (This is another way of saying "why not try to read the data on a mounted volume by directly accessing the sectors?") This question is not really related to file systems but to the storage stack. However, I find a lot of people are confused about this and keep trying to read mounted volumes.
A: The model I find helps is that of a volume with a file system on top that reads everything into memory at mount time, and from then on writes only the odd bytes (byte 1, 3, 5 and so on) of anything to disk and keeps the even bytes in a cache until it gets either a flush or a dismount. This makes it immediately visible what would happen if you tried to read the volume directly. Once I mention this, people immediately ask whether we could flush and then take a snapshot. But then I point out that immediately after the flush the file system might already have received more writes, of which only the odd bytes have reached the disk. So you need a way to guarantee that no more writes happen on the file system, and the only way to do that is to dismount it.
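The odd-bytes model is easy to play with in a few lines of Python. Again, this is just a sketch of the thought experiment: OddByteFileSystem and its methods are invented names, offsets are assumed to be zero-based, and the "disk" is a plain bytearray standing in for the sectors a raw reader would see.

```python
class OddByteFileSystem:
    """Toy file system: odd offsets hit the disk at once, even offsets wait in a cache."""

    def __init__(self, size):
        self.disk = bytearray(size)  # what someone reading raw sectors sees
        self.cache = {}              # even offset -> byte not yet on disk

    def write(self, offset, data):
        for i, value in enumerate(data):
            pos = offset + i
            if pos % 2 == 1:
                self.disk[pos] = value
            else:
                self.cache[pos] = value

    def flush(self):
        for pos, value in self.cache.items():
            self.disk[pos] = value
        self.cache.clear()

    def read(self, offset, length):
        """Read through the file system: cached bytes override the disk."""
        return bytes(self.cache.get(pos, self.disk[pos])
                     for pos in range(offset, offset + length))

    def raw_read(self, offset, length):
        """Read the sectors directly, bypassing the file system."""
        return bytes(self.disk[offset:offset + length])


fs = OddByteFileSystem(8)
fs.write(0, b"ABCDEFGH")
through_fs = fs.read(0, 8)          # the file system returns what was written
raw_before_flush = fs.raw_read(0, 8)  # the disk only has the odd offsets

fs.flush()
raw_after_flush = fs.raw_read(0, 8)   # consistent, but only for this instant

fs.write(0, b"abcd")                  # a write sneaks in right after the flush
raw_after_new_write = fs.raw_read(0, 8)  # a mix of old and new data
```

The last raw read is the interesting one: it returns neither the old contents nor the new ones, which is exactly what a backup taken from a mounted volume risks capturing.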

Probably the most powerful model, one that exposes a lot of issues with filters (not only file system filters, really any filter of any component), is the "filter attached on top of itself" model. This is important because in general anything you can do in your filter someone else can do in theirs. For example, let's say the discussion is whether creating a new FSCTL that is currently unused and sending it down the FS stack to your filter is a good idea (spoiler: it's not). In the general case this wouldn't work with your filter attached twice, since every such FSCTL will be captured by the top instance. This might not be an obvious problem (because depending on what the filter should do with the FSCTL, it might still work fine), but then consider that someone else can write a filter just like yours, using the same control code derived through the same mechanism, and then you can expect more serious problems. So in this particular case you would want to use a communication mechanism guaranteed to deliver messages directly to your filter, like a control device or (if using a minifilter) communication ports. The same applies to file names (what if there already is a file with that name?) and other named resources. Thinking about what would happen if your filter were attached on top of itself is always an interesting thought experiment, and highly recommended, since it will expose potential problems with your design. Once you know what the problems are, you can decide how likely they are to happen and whether you should address them.
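Here is a minimal Python sketch of the "attached on top of itself" experiment for the private control code case. Nothing here is a real driver API: PRIVATE_CODE is a made-up value, MyFilter and FileSystem are toy classes, and port_message is only a stand-in for a direct channel such as a control device or a communication port.

```python
PRIVATE_CODE = 0x9000  # hypothetical control code, not a real FSCTL value


class FileSystem:
    """Bottom of the stack: does not know the private code."""

    def handle(self, code, payload):
        return "invalid device request"


class MyFilter:
    """Toy filter that claims PRIVATE_CODE and passes everything else down."""

    def __init__(self, name, lower):
        self.name = name
        self.lower = lower
        self.received = []

    def handle(self, code, payload):
        if code == PRIVATE_CODE:
            # The request is completed here and never reaches anything below.
            self.received.append(payload)
            return f"handled by {self.name}"
        return self.lower.handle(code, payload)

    def port_message(self, payload):
        """Stand-in for a channel addressed to this specific instance."""
        self.received.append(payload)
        return f"handled by {self.name}"


# The same filter attached twice on one stack.
fs = FileSystem()
lower = MyFilter("lower", fs)
upper = MyFilter("upper", lower)

# Requests always enter at the top of the stack.
via_stack = upper.handle(PRIVATE_CODE, "meant for lower")
lower_saw_it = "meant for lower" in lower.received

# A direct channel does not depend on who sits above.
via_port = lower.port_message("meant for lower")

print(via_stack, lower_saw_it, via_port)
```

The message sent down the stack is swallowed by the upper instance and the lower one never sees it, while the directly addressed message arrives regardless of what is layered on top.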

I thought I had more models, and I should have done a better job of keeping track of them, but I can't remember any more right now. I will update the post when I do.