Thursday, November 4, 2010

ObQueryNameString can return names with a NULL Buffer ( and an example with SR.sys)

ObQueryNameString is a very useful API. It's used in a lot of places and is a pretty good choice if you want to find the name for an OB object. However, using it is not without pitfalls. At the moment the documentation page on MSDN has this to say in the Remarks section: "If the given object is unnamed, or if the object name was not successfully acquired, ObQueryNameString sets Name.Buffer to NULL and sets Name.Length and Name.MaximumLength to zero.". What is not clearly spelled in there is the fact that the return status in this case will be STATUS_SUCCESS.

So let's recap. Any app developer can call ObQueryNameString and get STATUS_SUCCESS, but the Name.Buffer will be NULL and they might not expect that. I've seen this issue over and over again. People get a reference to an object and they query the name, get a NULL buffer and then try to read/compare/do whatever with it and they get a visit from the bugcheck fairy. Please note that since the Length and MaximumLength are both 0, people would be safe using the Rtl functions since they tend to check these sort of things..

It is interesting to understand the reasons about how people can get bitten by this as well. The documentation specifies that "If the given object is unnamed, or if the object name was not successfully acquired...", which I guess for most people it translates into "if the name was not successfully acquired then I will some error NTSTATUS… if this object is unnamed then it's not clear what I get, maybe also some error code?..". So I suppose that people that create named objects that they own (or objects that the system creates and are guaranteed to be named) imagine that they can never get the NULL buffer and STATUS_SUCCESS. But any named object can become unnamed when it is deleted. After all, the namespace entry is simply an additional reference to the object and deleting a named object simply deletes that reference, but the object might still be kept around by other references. One easy way to see this is to follow the calls to IoCreateDevice. For example, for an unnamed device one can see this :

Imediately after a IoCreateDevice for an unnamed device:

3: kd> !devobj 93602e48  
Device object (93602e48) is for:
  \FileSystem\FltMgr DriverObject 92cb6660
Current Irp 00000000 RefCount 0 Type 00000003 Flags 00000080
DevExt 93602f00 DevObjExt 93602f30 
ExtensionFlags (0x00000800)  
                             Unknown flags 0x00000800
Device queue is not busy.

3: kd> !object 93602e48  
Object: 93602e48  Type: (922d6440) Device
    ObjectHeader: 93602e30 (new version)
    HandleCount: 0  PointerCount: 1

And immediately after a named device:
2: kd> !devobj 930e0628  
Device object (930e0628) is for:
 FltMgr \FileSystem\FltMgr DriverObject 92f691e8
Current Irp 00000000 RefCount 0 Type 00000008 Flags 000000c0
Dacl 96fd2eec DevExt 00000000 DevObjExt 930e06e0 
ExtensionFlags (0x00000800)  
                             Unknown flags 0x00000800
Device queue is not busy.

2: kd> !object 930e0628
Object: 930e0628  Type: (922d7508) Device
    ObjectHeader: 930e0610 (new version)
    HandleCount: 0  PointerCount: 2
    Directory Object: 96e61948  Name: FltMgr

Please notice how the pointer count is different. Once the named device is deleted (IoDeleteDevice), the reference from the OB namespace is removed (and the object's name in the OB header is changed) and then, when the reference count eventually reaches 0, the object is freed. However, it anyone calls ObQueryNameString on one of those references, they will get the NULL Name.Buffer...

So it is perfectly possible for a driver that is working with an object that it knows it must be named to actually get in the window between when the object is removed from the OB namespace and the time the final reference is released (the driver will of course have a reference of its own in order to be able to access the object… ). So that this means is that calling ObQueryNameString might return STATUS_SUCCESS and a NULL Name.Buffer even for a named object.

I've recently had the pleasure to debug an issue with SR.sys and my virtual volume drive on XP SP3. I will share it since it was somewhat interesting and it points out to this specific issue. This is what the stack looks like:

1: kd> lm v m sr
start    end        module name
f8489000 f849af00   sr         (pdb symbols)          d:\symbols\sr.pdb\9D5432B7234C4CD2A8F6275B9D9AF41F1\sr.pdb
    Loaded symbol image file: sr.sys
    Image path: sr.sys
    Image name: sr.sys
    Timestamp:        Sun Apr 13 11:36:50 2008 (480252C2)
    CheckSum:         00012604
    ImageSize:        00011F00
    Translations:     0000.04b0 0000.04e4 0409.04b0 0409.04e4
The problem in SR is this one:
sr!SrGetObjectName+0xd4:
f849105c ff15c49b48f8    call    dword ptr [sr!_imp__ObQueryNameString (f8489bc4)]  <- call ObQueryNameString
f8491062 3bc3            cmp     eax,ebx   <- check for STATUS_SUCCESS
f8491064 894514          mov     dword ptr [ebp+14h],eax  <- save the status...
f8491067 7c24            jl      sr!SrGetObjectName+0x105 (f849108d)
f8491069 0fb707          movzx   eax,word ptr [edi]   <-  this is the Length member of the UNICODE_STRING for the name
f849106c 8b4f04          mov     ecx,dword ptr [edi+4]  <- this is the Buffer member of the UNICODE_STRING..
f849106f d1e8            shr     eax,1   <- calculate the number of characters instead of the number of bytes
f8491071 66897702        mov     word ptr [edi+2],si   <- write some value in MaximumLength… 
f8491075 66891c41        mov     word ptr [ecx+eax*2],bx  <-   write in the buffer a 0 (basically, make sure the string is NULL terminated).. But ECX can be NULL
The stack when I hit this problem looks like this:
1: kd> kbn
 # ChildEBP RetAddr  Args to Child              
00 f80b7944 f849440d 00000000 81dc0a18 e10eac08 sr!SrGetObjectName+0xed
01 f80b7990 f848ecf2 81dc0a18 8239a818 f80b79c0 sr!SrCreateAttachmentDevice+0x99
02 f80b79c4 f848ee0f 8239a818 8239a8d0 81fb4d48 sr!SrFsControlMount+0x2e
03 f80b79e0 804ef18f 8239a8d0 81fb4c90 81fb4c90 sr!SrFsControl+0x4b
04 f80b79f0 80581bc7 00000000 81dc0a18 806e6a4c nt!IopfCallDriver+0x31
05 f80b7a40 804f53d6 c000014f f80b7b00 00000000 nt!IopMountVolume+0x1b9
06 f80b7a70 80582bc0 81e1f268 81dc0a18 f80b7ba4 nt!IopCheckVpbMounted+0x5e
07 f80b7b60 805bf444 81dc0a18 00000000 81fc6600 nt!IopParseDevice+0x3d8
08 f80b7bd8 805bb9d0 00000000 f80b7c18 00000040 nt!ObpLookupObjectName+0x53c
09 f80b7c2c 80576033 00000000 00000000 00000001 nt!ObOpenObjectByName+0xea
0a f80b7ca8 805769aa 009bef80 00100001 009bef24 nt!IopCreateFile+0x407
0b f80b7d04 8057a1a9 009bef80 00100001 009bef24 nt!IoCreateFile+0x8e
0c f80b7d44 8054161c 009bef80 00100001 009bef24 nt!NtOpenFile+0x27
0d f80b7d44 7c90e4f4 009bef80 00100001 009bef24 nt!KiFastCallEntry+0xfc
0e 009beef4 7c90d58c 7c80ec86 009bef80 00100001 ntdll!KiFastSystemCallRet
0f 009beef8 7c80ec86 009bef80 00100001 009bef24 ntdll!NtOpenFile+0xc
10 009bf1f0 7c80ef87 01be31e8 00000000 01be7bf0 kernel32!FindFirstFileExW+0x1a7
11 009bf210 751b1e05 01be31e8 01be7bf0 751a2a04 kernel32!FindFirstFileW+0x16
12 009bf240 751aad1f 009bf714 00000001 000e1358 srsvc!Delnode_Recurse+0x12e
13 009bfb34 751abd1f 009bfd54 7c97b440 7c97b420 srsvc!CEventHandler::OnFirstWrite_Notification+0x3cd
14 009bff60 7c927ba5 00000000 0000006a 000e5f40 srsvc!IoCompletionCallback+0x17a
15 009bff74 7c927b7c 751abba5 00000000 0000006a ntdll!RtlpApcCallout+0x11
16 009bffb4 7c80b713 00000000 00000000 00000000 ntdll!RtlpWorkerThread+0x87
17 009bffec 00000000 7c910230 00000000 00000000 kernel32!BaseThreadStart+0x37
So as you can see, in the mount path SR.sys is trying to create their device to attach to the volume and while doing that it tries to get the name for this device:
1: kd> !devobj 81dc0a18 
Device object (81dc0a18) is for:
  \Driver\IvmVhd DriverObject 81fad590
Current Irp 00000000 RefCount 1 Type 00000007 Flags 00000050
Vpb 81ea0f10 Dacl e1f17924 DevExt 81dc0ad0 DevObjExt 81dc0c30 
ExtensionFlags (0x00000002)  DOE_DELETE_PENDING
Device queue is not busy.
This happens to be my virtual volume device, which as you can tell from the DOE_DELETE_PENDING flag, is about to be torn down. So what this all looks like is this:

1. Something is trying to open a file, see the IopCreateFile (frame 0xb)

2. Io manager, while trying to send the IRP_MJ_CREATE irp (frame 7) wants to make sure the volume is mounted. Please note that at this point the volume device is still in the OB namespace, since otherwise the ObpLookupObjectName call (frame 8) would not have been able to reference it… So at this point IO manager resolved the name to a device object and it now has a reference to the device...

3. IopCheckVpbMounted (frame 6) finds the volume is not mounted (since I dismount it before tearing it down) so it tries to mount it…

4. SR.sys gets the mount request and is trying to build a device to attach to the newly mounted volume (in case the mount succeeds). This is pretty standard stuff for a legacy filter… Anyway, in doing so it calls ObQueryNameString which no longer finds a name for the device and returns a NULL buffer. SR checks for NT_SUCCESS but doesn't check the buffer to make sure it's not null (or even the length which is 0) and it blindly tries to make sure the string is NULL terminated (which is also pointless since ObQueryNameString documentation mentions that "The object name (when present) includes a NULL-terminator and all path separators "\" in the name.")… bugcheck.

What my driver did was simply call IoDeleteDevice somewhere between frame 8 and frame 0.

I'm willing to bet that not checking for the null Name.Buffer is a pretty common mistake. For example, there is some code posted on a blog that looks like this:

Status = ObQueryNameString(FileObject->Vpb->RealDevice,OBI,Returned,&Returned ); 

if(NT_SUCCESS(Status)) { 
if(Root) { 
wcscat(OBI->Name.Buffer,L"\\"); 
OBI->Name.Length+=sizeof(WCHAR); 
}