Thursday, October 6, 2011

File Deletion

I've finally managed to install the Win8 WDK preview and I had a look at what's new in the WDK. There are some new things for file system filter developers and I plan to write about them in the coming weeks.

One of the most obvious new things is a new sample, "Delete". The purpose of the sample is to output a message to the debugger when a file has been deleted from the file system. This sounds like an easy task but it really isn't. One of the reasons is that delete semantics at the file system level are rather different from what one expects. This is one of those cases in file system filter development where everyone has previous experience with the concept but the concept is very different from the implementation. For example, just consider the Recycle Bin. As you might expect, the Recycle Bin is not a file system concept but rather a Windows OS concept which is implemented on top of the file system. What actually happens when a file is deleted to the Recycle Bin is that the file is renamed into a special hidden folder on the volume so that the user can't see the file anymore. However, as far as the file system is concerned, the file is not deleted at all but instead it has a new name.

Things are even more interesting when looking at how file systems handle deletes. From the user perspective things are not very complicated: there is a Win32 DeleteFile() function that can be used to delete a file by name. However, as we've discussed before on this blog, there is quite a lot involved in resolving a file name to an actual file in the file system and as such it makes sense to keep all that complexity in one place, and that place is in the functions that deal with IRP_MJ_CREATE. This is why there is no "IRP_MJ_DELETE" request that takes a file path and deletes that file. Instead, the caller first opens the file by name using an IRP_MJ_CREATE. Once that is done the user has a handle to the file and they can delete it. This can be done by using the IRP_MJ_SET_INFORMATION request with the FileDispositionInformation information class, which has a corresponding structure that describes what information should be set:

typedef struct _FILE_DISPOSITION_INFORMATION {
  BOOLEAN DeleteFile;
} FILE_DISPOSITION_INFORMATION, *PFILE_DISPOSITION_INFORMATION;

According to the documentation, once a user has set this flag "the only legal operation by such a caller is to close the open file handle". However, it's possible that there are more handles open to the file when this request is sent to the file system, and whoever has those handles open is not aware that the file has been marked for deletion, so they will likely continue doing whatever it is they are doing. According to the documentation "A file marked for deletion is not actually deleted until all open handles for the file object have been closed and the link count for the file is zero". The documentation doesn't mention this but most (if not all) of the operations that happen on the other handles will work just fine, and once the last handle is closed the file is deleted from the file system (the space that was occupied by the file is reclaimed and the file data is lost). However, it is possible for any of the other handles to detect that the disposition has been set by issuing an IRP_MJ_QUERY_INFORMATION request with the FileStandardInformation information class, which returns the FILE_STANDARD_INFORMATION structure in which the DeletePending member will be set to TRUE if this or another handle has issued a delete request:

typedef struct _FILE_STANDARD_INFORMATION {
  LARGE_INTEGER AllocationSize;
  LARGE_INTEGER EndOfFile;
  ULONG         NumberOfLinks;
  BOOLEAN       DeletePending;
  BOOLEAN       Directory;
} FILE_STANDARD_INFORMATION, *PFILE_STANDARD_INFORMATION;

But there is an interesting twist to this. The "DeleteFile" member of the FILE_DISPOSITION_INFORMATION structure is a BOOLEAN. This is interesting because it seems to suggest it could be set to FALSE. If the delete disposition could only be set to TRUE then why even have a member in that structure, since simply issuing the request would indicate to the file system that the file needs to be deleted? As it happens, the delete disposition can also be reset by issuing the same IRP_MJ_SET_INFORMATION request with the FileDispositionInformation information class with the DeleteFile member set to FALSE. This means that the file will not be deleted from the file system once the final handle is closed, cancelling the previous request to delete the file. This call (to set DeleteFile to FALSE) will be successful regardless of whether the file had a delete disposition set or not. In fact, one can set and reset the disposition many times and whoever called last to set the disposition to either TRUE or FALSE will win.

So now let's talk about how this is implemented inside the file system. As you might have gathered from the above behavior, it looks like the delete disposition is a flag (which is why one can set it and clear it as many times as they want and the last one to change it wins). Also, since it's possible to set the delete disposition on one handle and read it on another handle, it must mean that this is a per-stream flag. And indeed, if we look at the FastFat sample in the WDK we can see that the function FatSetDispositionInfo() (under \src\filesys\fastfat\Win7\fileinfo.c) performs a bunch of checks to make sure it can delete the file and, if they all pass, it does this (which confirms this is a flag in the FCB):

        SetFlag( Fcb->FcbState, FCB_STATE_DELETE_ON_CLOSE );
        FileObject->DeletePending = TRUE;

By looking at the FastFat source code we can see where FCB_STATE_DELETE_ON_CLOSE is used and get a pretty clear picture of all the places where the fact that a file is about to be deleted matters. However, let's just look at what happens during IRP_MJ_CLEANUP processing (the FatCommonCleanup() function in \src\filesys\fastfat\Win7\cleanup.c). The flag is largely ignored except during the last IRP_MJ_CLEANUP for the FCB (the UncleanCount == 1 check), when the file will be deleted if possible by calling FatDeleteDirent(), which removes the corresponding directory entry.

There is yet another way to delete a file. IRP_MJ_CREATE allows a caller to open a file and specify that once the handle closes the file should be deleted. This is achieved through the FILE_DELETE_ON_CLOSE option. Looking at our FastFat source at the create path (starting from FatCommonCreate() in \src\filesys\fastfat\Win7\create.c) we can see that the flag is translated into a CCB flag, CCB_FLAG_DELETE_ON_CLOSE. The CCB is unique per FILE_OBJECT so basically the FILE_OBJECT remembers that it was opened with the FILE_DELETE_ON_CLOSE flag. The question is, where is the CCB_FLAG_DELETE_ON_CLOSE flag converted into FCB_STATE_DELETE_ON_CLOSE? A quick search shows that this happens in the IRP_MJ_CLEANUP path. This has a set of interesting implications. For example, since the FCB flag isn't set until then, an IRP_MJ_QUERY_INFORMATION request with the FileStandardInformation information class will not return the DeletePending flag. Also, trying to set the DeleteFile flag to FALSE will have no effect since the FILE_DISPOSITION_INFORMATION structure only affects the FCB_STATE_DELETE_ON_CLOSE flag and not the CCB one.
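
To make the interaction between the two flags easier to follow, here is a minimal user-mode sketch of the behavior described above. It is a toy model and not FastFat code: ModelFcb, ModelCcb and all the model_* functions are hypothetical names made up for illustration, and the model ignores all the checks the real file system performs before agreeing to delete anything.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: hypothetical names, not the real FastFat structures. */
typedef struct ModelFcb {
    bool delete_on_close;  /* stands in for FCB_STATE_DELETE_ON_CLOSE (per stream) */
    int  unclean_count;    /* handles that have not been through cleanup yet */
    bool exists;           /* false once the directory entry has been removed */
} ModelFcb;

typedef struct ModelCcb {
    ModelFcb *fcb;
    bool delete_on_close;  /* stands in for CCB_FLAG_DELETE_ON_CLOSE (per handle) */
} ModelCcb;

/* IRP_MJ_CREATE: FILE_DELETE_ON_CLOSE is remembered on the handle only. */
ModelCcb model_open(ModelFcb *fcb, bool file_delete_on_close)
{
    ModelCcb ccb = { fcb, file_delete_on_close };
    fcb->unclean_count++;
    return ccb;
}

/* FileDispositionInformation: only the per-stream flag changes, last caller wins. */
void model_set_disposition(ModelCcb *ccb, bool delete_file)
{
    ccb->fcb->delete_on_close = delete_file;
}

/* FileStandardInformation: DeletePending reflects only the per-stream flag. */
bool model_query_delete_pending(const ModelCcb *ccb)
{
    return ccb->fcb->delete_on_close;
}

/* IRP_MJ_CLEANUP: promote the handle flag, then delete only on the last cleanup. */
void model_cleanup(ModelCcb *ccb)
{
    ModelFcb *fcb = ccb->fcb;
    assert(fcb->unclean_count > 0);
    if (ccb->delete_on_close) {
        fcb->delete_on_close = true;
    }
    if (fcb->unclean_count == 1 && fcb->delete_on_close) {
        fcb->exists = false;  /* the real code removes the directory entry here */
    }
    fcb->unclean_count--;
}
```

In this model, a handle opened with FILE_DELETE_ON_CLOSE does not show up in model_query_delete_pending() on any handle until its own cleanup runs, and model_set_disposition(FALSE) before that point changes nothing, which is the behavior described in the paragraph above.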

So before this goes on for too long, I'll try to wrap this up by going over what this behavior means to filters. Please note that this behavior is the same for the NTFS file system, though according to this post on NTFSD this was not always the case. Anyway, here are the conclusions:

  • A file can be deleted by opening a handle and sending a FileDispositionInformation request with DeleteFile = TRUE.
  • Alternatively, a file can be marked for deletion when the handle is opened (atomically with the IRP_MJ_CREATE operation) by setting the FILE_DELETE_ON_CLOSE create option.
  • Anyone that has a handle to a file can check whether the delete disposition is set by querying for FILE_STANDARD_INFORMATION and checking the DeletePending flag.
  • If a file is opened with the FILE_DELETE_ON_CLOSE option then there is nothing a filter driver can do to undo that and clear it from the FILE_OBJECT. Moreover, if the filter driver didn't see the IRP_MJ_CREATE request then it will be impossible to determine whether it had the FILE_DELETE_ON_CLOSE option, and so it will be impossible to know if the file will be deleted when the handle is closed. However, a filter that sees the IRP_MJ_CREATE can remove the FILE_DELETE_ON_CLOSE option before sending the request down to the file system and then the filter can call FltSetInformationFile to set the delete disposition, which achieves behavior similar to what the user probably expects. Please note though that this is not identical to letting FILE_DELETE_ON_CLOSE through to the file system, and a filter implementing this approach might break some things that rely on that specific NTFS behavior (though no well-written code should rely on this particular implementation detail since it's not documented by Microsoft and so it could change in the future).
  • If a file is opened with FILE_DELETE_ON_CLOSE and, when that handle is closed, there is another handle open for the same stream, then the CCB flag will be promoted to the FCB flag during IRP_MJ_CLEANUP but the file will not be deleted yet. This means that the filter (or whoever has the other handle open) might be able to clear this flag if it wants in this case, thus preventing the file from actually being deleted.
  • An open that overwrites a file (IRP_MJ_CREATE with the create disposition FILE_SUPERSEDE, FILE_OVERWRITE or FILE_OVERWRITE_IF) can be considered a delete operation since the original file contents are lost, even though there will still be an entry in the file system with the same name.
  • A file can also be deleted by a rename operation (or a create hardlink operation), if the ReplaceIfExists member of the FILE_RENAME_INFORMATION structure is true. In this case the file will be removed without even being opened, similar in a way to the overwriting open case.
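
As a rough illustration of the create-time cases in the list above, here is a small sketch of how a filter might classify an IRP_MJ_CREATE before it reaches the file system. It relies on the layout of the create Options value, where the high 8 bits carry the create disposition and the low 24 bits carry the create options. The constants mirror the values in the WDK headers but are redefined here so the sketch builds on its own, and the helper names are hypothetical, not part of any WDK API.

```c
#include <stdbool.h>
#include <stdint.h>

/* Same values as in the WDK headers, redefined so this builds standalone. */
#define FILE_SUPERSEDE        0x00000000u
#define FILE_OVERWRITE        0x00000004u
#define FILE_OVERWRITE_IF     0x00000005u
#define FILE_DELETE_ON_CLOSE  0x00001000u

/* Hypothetical helpers: split the packed Options value of a create request. */
uint32_t create_disposition(uint32_t options)
{
    return (options >> 24) & 0xFFu;   /* high 8 bits: create disposition */
}

uint32_t create_options(uint32_t options)
{
    return options & 0x00FFFFFFu;     /* low 24 bits: create options */
}

/* True if a successful open may discard the existing file contents. */
bool create_may_destroy_data(uint32_t options)
{
    uint32_t disposition = create_disposition(options);
    return disposition == FILE_SUPERSEDE ||
           disposition == FILE_OVERWRITE ||
           disposition == FILE_OVERWRITE_IF;
}

/* True if the handle will carry the per-handle delete-on-close request. */
bool create_requests_delete_on_close(uint32_t options)
{
    return (create_options(options) & FILE_DELETE_ON_CLOSE) != 0;
}
```

A filter would feed this the Options value from the create parameters in its pre-create callback. Note that this only tells the filter what was requested; whether anything is actually lost depends on whether the create succeeds and whether the file already existed.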

NTFS makes things even more interesting because it has a larger feature set. In particular, hardlinks change the discussion a bit because if a file has multiple hardlinks then a delete removes only one of the links and the file still exists on the volume under the other links. NTFS also supports Alternate Data Streams, and streams can be deleted independently from the whole file, but if the main data stream is deleted then the whole file is deleted along with all the other streams. And finally, NTFS supports transactions, which means that even if a file is deleted in a transaction and the last IRP_MJ_CLEANUP finds the delete disposition set and deletes the file (or any stream is deleted, or a file is overwritten in a rename, and so on), the transaction might roll back and the file will need to be put back the way it was before the transaction started. We'll see how the Delete filter handles these cases in the next post.