Thursday, December 1, 2011

Name Normalization in Win8

It's interesting to note that one of the FltMgr features that a lot of minifilters use is the ability to have FltMgr generate normalized paths. This is not something trivial to implement and a lot of the code in FltMgr is dedicated to generating these names. It's also fairly expensive so FltMgr implements a name cache. However file systems don't implement any mechanism to get this information, even though looking at the Win7 WDK we can see some references that indicate that this has been in the works (search for FileNormalizedNameInformation; there are references to it both in wdm.h and ntddk.h):

typedef enum _FILE_INFORMATION_CLASS {
    FileDirectoryInformation         = 1,
    FileFullDirectoryInformation,   // 2
 ...
    FileNormalizedNameInformation,           // 48 <- this seems to be the information class one can use to request the name..
 ...
} FILE_INFORMATION_CLASS, *PFILE_INFORMATION_CLASS;

//
// This is also used for FileNormalizedNameInformation <- this indicates that the _FILE_NAME_INFORMATION structure can also be used for normalized names
//

typedef struct _FILE_NAME_INFORMATION {
    ULONG FileNameLength;
    WCHAR FileName[1];
} FILE_NAME_INFORMATION, *PFILE_NAME_INFORMATION;

Well, with the arrival of Win8 things are about to change and file systems and file system filters now have a way to query the file system directly for a normalized name. We can see some of the work that needs to happen by looking at the Win8 FastFat source, but the changes look pretty minimal. Also, there are no changes that I can see in the WDK headers from the Win7 WDK.

_Requires_lock_held_(_Global_critical_region_)    
NTSTATUS
FatCommonQueryInformation (
    IN PIRP_CONTEXT IrpContext,
    IN PIRP Irp
    )
{
...
            case FileNormalizedNameInformation:  <- FastFat will now support this request

                FatQueryNameInfo( IrpContext, Fcb, Ccb, TRUE, Buffer, &Length ); <- we can see a new boolean parameter for FatQueryNameInfo
                break;
}
...

VOID
FatQueryNameInfo (
...
    IN BOOLEAN Normalized,  <- new argument to tell FatQueryNameInfo whether it should return a normalized name or not
...
    )

/*++

Routine Description:

    This routine performs the query name information function for fat.

Arguments:

 ...
    Normalized - if true the caller wants a normalized name (w/out short names).
        This means we're servicing a FileNormalizedNameInformation query.

 ...

Return Value:

    None

--*/

{

 ...
    if (!Normalized &&
        (Fcb->LongName.Unicode.Name.Unicode.Buffer != NULL)) {

        if ((Ccb != NULL) &&
            FlagOn(Ccb->Flags, CCB_FLAG_OPENED_BY_SHORTNAME)) {

            TrimLength = Fcb->FinalNameLength;
        }
    }

 …
}

So now that we know that FastFat implements this, the next logical question is whether NTFS implements it as well and a small program I wrote confirms that it does and that it does indeed return a normalized path like one would expect.

The implications for file system developers are pretty clear, this is one more information class they need to implement and they can use FastFat as a sample of how that can be implemented. One thing I would add is that file systems might want to be extra careful about the performance of the implementation since it's likely that FltMgr's name generation and normalization code will use this information class and so it might be called pretty frequently.

Finally I'd like to talk about the implications about file system filter developers. There are a couple of scenarios that I think could be impacted by this.

  • Name providers must implement this information class. Actually all the filters that currently implement the FileNameInformation class should implement this new class as well (and if they do that they're most likely name providers.. I can't think of a case where a filter would need to implement that class and not be a name provider).
  • Filters that call FltGetFileNameInformation(..,FLT_FILE_NAME_NORMALIZED, ..) in preCreate might not see a significant performance improvement since in preCreate the file isn't opened and so FltMgr can't use this information class and while it might be able to use it for some parts of the path, a lot of the overhead is still there. So querying the normalized name in preCreate is likely to be just as bad as ever.
  • Filters that call FltGetFileNameInformation(..,FLT_FILE_NAME_NORMALIZED, ..) for open files might see some performance improvements since FltMgr should be able to leverage this information class a lot. Please note that I haven't verified that FltMgr actually does use this class, but it would make a lot of sense to use it.
  • Legacy filters would probably benefit the most from this new class but I don't expect that a lot of the legacy filters that are still around are in active development.