Thursday, December 1, 2011

Name Normalization in Win8

One of the FltMgr features that a lot of minifilters rely on is the ability to have FltMgr generate normalized paths. This is not trivial to implement, and a lot of the code in FltMgr is dedicated to generating these names. It's also fairly expensive, so FltMgr implements a name cache. File systems, however, haven't offered any mechanism to return this information directly, even though the Win7 WDK contains references indicating that this has been in the works (search for FileNormalizedNameInformation; it is referenced in both wdm.h and ntddk.h):

typedef enum _FILE_INFORMATION_CLASS {
    FileDirectoryInformation         = 1,
    FileFullDirectoryInformation,   // 2
 ...
    FileNormalizedNameInformation,           // 48 <- this seems to be the information class one can use to request the name..
 ...
} FILE_INFORMATION_CLASS, *PFILE_INFORMATION_CLASS;

//
// This is also used for FileNormalizedNameInformation <- this indicates that the _FILE_NAME_INFORMATION structure can also be used for normalized names
//

typedef struct _FILE_NAME_INFORMATION {
    ULONG FileNameLength;
    WCHAR FileName[1];
} FILE_NAME_INFORMATION, *PFILE_NAME_INFORMATION;
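
Since the structure ends in a one-element array, a caller has to treat it as a variable-length buffer: FileNameLength is a byte count and the name is not guaranteed to be NUL-terminated. Here is a minimal, portable sketch of that handling. The NAME_INFO_MODEL type and the name_info_* helpers are hypothetical stand-ins written for this illustration (uint32_t for ULONG, uint16_t for WCHAR), not WDK definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Portable stand-in for FILE_NAME_INFORMATION (hypothetical name). */
typedef struct {
    uint32_t FileNameLength;  /* length of the name in BYTES, not characters */
    uint16_t FileName[1];     /* first character; the rest follows in the same buffer */
} NAME_INFO_MODEL;

/* Bytes needed to hold the header plus a name of name_bytes bytes. */
static size_t name_info_buffer_size(size_t name_bytes)
{
    return offsetof(NAME_INFO_MODEL, FileName) + name_bytes;
}

/* Number of UTF-16 code units in the name. */
static size_t name_info_char_count(const NAME_INFO_MODEL *info)
{
    assert(info->FileNameLength % sizeof(uint16_t) == 0);
    return info->FileNameLength / sizeof(uint16_t);
}

/* Allocate and fill a buffer the way a query would return it (no terminator). */
static NAME_INFO_MODEL *name_info_alloc(const uint16_t *name, size_t chars)
{
    size_t bytes = chars * sizeof(uint16_t);
    NAME_INFO_MODEL *info = malloc(name_info_buffer_size(bytes));
    if (info == NULL) {
        return NULL;
    }
    info->FileNameLength = (uint32_t)bytes;
    memcpy((unsigned char *)info + offsetof(NAME_INFO_MODEL, FileName), name, bytes);
    return info;
}

/* Copy the name out and add the terminator ourselves.
   Returns 0 on success, -1 if dst cannot hold the name plus a NUL. */
static int name_info_copy_terminated(const NAME_INFO_MODEL *info,
                                     uint16_t *dst, size_t dst_chars)
{
    size_t n = name_info_char_count(info);
    if (dst_chars < n + 1) {
        return -1;
    }
    memcpy(dst, (const unsigned char *)info + offsetof(NAME_INFO_MODEL, FileName),
           n * sizeof(uint16_t));
    dst[n] = 0;
    return 0;
}
```

The same sizing rule applies whichever information class fills the buffer, since the normalized variant reuses this structure.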

Well, with the arrival of Win8 things are about to change: file systems and file system filters now have a way to query the file system directly for a normalized name. We can see some of the work involved by looking at the Win8 FastFat source, though the changes look pretty minimal. I also can't see any changes in the WDK headers compared to the Win7 WDK.

_Requires_lock_held_(_Global_critical_region_)    
NTSTATUS
FatCommonQueryInformation (
    IN PIRP_CONTEXT IrpContext,
    IN PIRP Irp
    )
{
...
            case FileNormalizedNameInformation:  // <- FastFat will now support this request

                FatQueryNameInfo( IrpContext, Fcb, Ccb, TRUE, Buffer, &Length ); // <- we can see a new boolean parameter for FatQueryNameInfo
                break;
}
...

VOID
FatQueryNameInfo (
...
    IN BOOLEAN Normalized,  // <- new argument to tell FatQueryNameInfo whether it should return a normalized name or not
...
    )

/*++

Routine Description:

    This routine performs the query name information function for fat.

Arguments:

 ...
    Normalized - if true the caller wants a normalized name (w/out short names).
        This means we're servicing a FileNormalizedNameInformation query.

 ...

Return Value:

    None

--*/

{

 ...
    if (!Normalized &&
        (Fcb->LongName.Unicode.Name.Unicode.Buffer != NULL)) {

        if ((Ccb != NULL) &&
            FlagOn(Ccb->Flags, CCB_FLAG_OPENED_BY_SHORTNAME)) {

            TrimLength = Fcb->FinalNameLength;
        }
    }

 …
}
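
To make the effect of the new flag concrete, here is a small user-mode model of the check above. It is only a sketch: FILE_MODEL and query_name_model are hypothetical names, the strings are plain char rather than Unicode, the LongName buffer check is left out, and the step that puts the short name in place of the trimmed component is my assumption about the part of the routine that is elided in the excerpt.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for the FCB/CCB state the excerpt reads. */
typedef struct {
    const char *full_long_name;   /* full path ending in the long name */
    size_t final_name_length;     /* length of the last component, in chars */
    const char *short_name;       /* 8.3 alias of the last component */
    int opened_by_short_name;     /* models CCB_FLAG_OPENED_BY_SHORTNAME */
} FILE_MODEL;

/* Builds the name a query would return.
   Returns 0 on success, -1 if out is too small. */
static int query_name_model(const FILE_MODEL *f, int normalized,
                            char *out, size_t out_size)
{
    size_t full_len = strlen(f->full_long_name);
    size_t trim = 0;

    /* Same shape as the excerpt: only a non-normalized query may trim. */
    if (!normalized && f->opened_by_short_name) {
        trim = f->final_name_length;
    }
    assert(trim <= full_len);

    size_t prefix = full_len - trim;
    size_t tail = (trim != 0) ? strlen(f->short_name) : 0;
    if (prefix + tail + 1 > out_size) {
        return -1;
    }

    memcpy(out, f->full_long_name, prefix);
    if (trim != 0) {
        /* Assumption: the short name replaces the trimmed long component. */
        memcpy(out + prefix, f->short_name, tail);
    }
    out[prefix + tail] = '\0';
    return 0;
}
```

With a file opened as \dir\LONGFI~1.TXT, the model returns the short-name path for a regular name query and \dir\LongFileName.txt when normalized is set, which matches the "w/out short names" wording in the routine's comment.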

So now that we know that FastFat implements this, the next logical question is whether NTFS implements it as well. A small program I wrote confirms that it does, and that it returns a normalized path just as one would expect.

The implications for file system developers are pretty clear: this is one more information class they need to implement, and they can use FastFat as a sample of how to do it. One thing I would add is that file systems might want to be extra careful about the performance of the implementation, since it's likely that FltMgr's name generation and normalization code will use this information class, so it might be called pretty frequently.

Finally, I'd like to talk about the implications for file system filter developers. There are a couple of scenarios that I think could be impacted by this.

  • Name providers must implement this information class. Actually, all the filters that currently implement the FileNameInformation class should implement this new class as well (and if they do that they're most likely name providers; I can't think of a case where a filter would need to implement that class and not be a name provider).
  • Filters that call FltGetFileNameInformation(..,FLT_FILE_NAME_NORMALIZED, ..) in preCreate might not see a significant performance improvement. In preCreate the file isn't opened yet, so FltMgr can't use this information class for it. It might be able to use it for some parts of the path, but a lot of the overhead is still there, so querying the normalized name in preCreate is likely to be just as bad as ever.
  • Filters that call FltGetFileNameInformation(..,FLT_FILE_NAME_NORMALIZED, ..) for open files might see some performance improvement, since FltMgr should be able to leverage this information class a lot. Please note that I haven't verified that FltMgr actually uses this class, but it would make a lot of sense for it to do so.
  • Legacy filters would probably benefit the most from this new class but I don't expect that a lot of the legacy filters that are still around are in active development.

2 comments:

  1. > since in preCreate the file isn't opened and so FltMgr can't use this information class

    Is the reason that, since the file is not opened, FltMgr cannot issue IRP_MJ_QUERY_INFORMATION/FileNormalizedNameInformation to the file system? Then how can FltMgr obtain the normalized file name in preCreate, even in a slow way? And doesn't the fact that the file is not opened in preCreate matter to FltGetFileNameInformation(..,FLT_FILE_NAME_OPENED, ..)?

    Is the same statement about performance (just as bad as ever) also true for FltGetDestinationFileNameInformation(..,FLT_FILE_NAME_NORMALIZED, ..) in preSetInformation (renaming)?

    1. The general approach (before FileNormalizedNameInformation) is to get a name (the opened name) and then normalize it, which involves opening the parent folder for any name that might be a short name and issuing a directory query on it. So in preCreate FltMgr might need to open the parent folder of the file that is actually being opened and then issue a directory query on it. Naturally, if the normalized name for that folder, or for other folders on the path to the file, is already cached, this whole process takes a lot less time.
      So yes, if the file isn't opened then FltMgr doesn't have an opened FILE_OBJECT to send the FileNormalizedNameInformation request on.
      FltGetDestinationFileNameInformation is similar to preCreate in that there is no FILE_OBJECT that represents the actual destination file. So FltMgr must build the name from the target directory (for which it can get the FileNormalizedNameInformation) and the name specified in the rename request. So performance for FltGetDestinationFileNameInformation might actually be better, but it's still not cheap.
