CVE-2021-1647: Windows Defender mpengine remote code execution
Maddie Stone, Project Zero
The Basics
Disclosure or Patch Date: 12 January 2021
Product: Microsoft Windows Defender
Advisory: https://msrc.microsoft.com/update-guide/vulnerability/CVE-2021-1647
Affected Versions: Version 1.1.17600.5 and previous
First Patched Version: Version 1.1.17700.4
Issue/Bug Report: N/A
Patch CL: N/A
Bug-Introducing CL: N/A
Reporter(s): Anonymous
The Code
Proof-of-concept:
Exploit sample: 6e1e9fa0334d8f1f5d0e3a160ba65441f0656d1f1c99f8a9f1ae4b1b1bf7d788
Did you have access to the exploit sample when doing the analysis? Yes
The Vulnerability
Bug class: Heap buffer overflow
Vulnerability details:
There is a heap buffer overflow when Windows Defender (mpengine.dll) processes the section table while unpacking an ASProtect-packed executable. Each section entry has two values: the virtual address and the size of the section. The code in CAsprotectDLLAndVersion::RetrieveVersionInfoAndCreateObjects only checks whether the next section entry's address is lower than the previous one, not whether they are equal. This means that with a section table such as the one used in this exploit sample, [ (0,0), (0,0), (0x2000,0), (0x2000,0x3000) ], 0 bytes are allocated for the section at address 0x2000, but when the code sees the next entry at 0x2000, it simply skips over it without exiting or updating the size of the section. 0x3000 bytes are then copied to that section during decompression, leading to the heap buffer overflow.
if ( next_sect_addr > sect_addr ) // current va is greater than prev (not also eq)
{
    sect_addr = next_sect_addr;
    sect_sz = (next_sect_sz + 0xFFF) & 0xFFFFF000;
}
// if next_sect_addr <= sect_addr we continue on to next entry in the table
[...]
new_sect_alloc = operator new[](sect_sz + sect_addr); // allocate new section
[...]
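To make the arithmetic concrete, here is a minimal Python model of the comparison and allocation shown in the excerpt above. It is a sketch under assumptions, not the real mpengine.dll logic: the loop structure, the initial values of 0 for sect_addr and sect_sz, and the helper names compute_alloc_size and overflow_bytes are all hypothetical. Only the comparison, the rounding, and the sect_sz + sect_addr allocation size come from the excerpt.

```python
# Model of the vulnerable section-table walk (assumed loop shape, assumed
# initial state of 0/0). Each entry is (virtual_address, size).
PAGE_ROUND_MASK = 0xFFFFF000

SAMPLE_TABLE = [(0, 0), (0, 0), (0x2000, 0), (0x2000, 0x3000)]


def compute_alloc_size(section_table):
    """Return the size the excerpt would pass to operator new[]."""
    sect_addr = 0
    sect_sz = 0
    for next_sect_addr, next_sect_sz in section_table:
        # Strictly greater: an entry with an equal address is silently
        # skipped, so its size is never taken into account.
        if next_sect_addr > sect_addr:
            sect_addr = next_sect_addr
            sect_sz = (next_sect_sz + 0xFFF) & PAGE_ROUND_MASK
    return sect_sz + sect_addr


def overflow_bytes(section_table):
    """How far the furthest section write reaches past the allocation."""
    alloc = compute_alloc_size(section_table)
    furthest_end = max(addr + size for addr, size in section_table)
    return max(0, furthest_end - alloc)


print(hex(compute_alloc_size(SAMPLE_TABLE)))
print(hex(overflow_bytes(SAMPLE_TABLE)))
```

Under these assumptions, the sample table yields an allocation of 0x2000 bytes, while the skipped entry later causes 0x3000 bytes to be written starting at offset 0x2000, so the whole copy lands outside the buffer.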
Patch analysis: There are quite a few changes to the function CAsprotectDLLAndVersion::RetrieveVersionInfoAndCreateObjects between version 1.1.17600.5 (vulnerable) and 1.1.17700.4 (patched). The directly related change was to add an else branch to the comparison so that if any entry in the section array has an address less than or equal to the previous entry, the code will error out and exit rather than continuing to decompress.
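The effect of the added else branch can be sketched in Python as follows. This is an illustrative model only: the function name alloc_size_patched, the MalformedSectionTable exception, and the assumption that the first entry seeds the running address and size are all hypothetical, since the patched binary's exact control flow is not shown here.

```python
# Model of the patched check: any entry whose address is less than or equal
# to the previous one aborts unpacking instead of being skipped.
# Assumption: the first entry initializes the running state.
PAGE_ROUND_MASK = 0xFFFFF000


class MalformedSectionTable(ValueError):
    """Raised when section addresses are not strictly increasing."""


def alloc_size_patched(section_table):
    if not section_table:
        raise MalformedSectionTable("empty section table")
    entries = iter(section_table)
    sect_addr, first_sz = next(entries)
    sect_sz = (first_sz + 0xFFF) & PAGE_ROUND_MASK
    for next_sect_addr, next_sect_sz in entries:
        if next_sect_addr > sect_addr:
            sect_addr = next_sect_addr
            sect_sz = (next_sect_sz + 0xFFF) & PAGE_ROUND_MASK
        else:
            # The new else branch: error out rather than continue.
            raise MalformedSectionTable(
                f"address {next_sect_addr:#x} <= previous {sect_addr:#x}"
            )
    return sect_sz + sect_addr


exploit_table = [(0, 0), (0, 0), (0x2000, 0), (0x2000, 0x3000)]
try:
    alloc_size_patched(exploit_table)
except MalformedSectionTable as err:
    print("rejected:", err)
```

In this model the exploit sample's table is rejected at its second entry, because (0,0) repeats the address of the first entry.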
Thoughts on how this vuln might have been found (fuzzing, code auditing, variant analysis, etc.):
It seems possible that this vulnerability was found through fuzzing or manual code review. If the ASProtect unpacking code was included from an external library, that would have made the process of finding this vulnerability even more straightforward for both fuzzing & review.
(Historical/present/future) context of bug:
The Exploit
(The terms exploit primitive, exploit strategy, exploit technique, and exploit flow are defined here.)
Exploit strategy (or strategies):
- The heap buffer overflow is used to overwrite the data in an object stored as the first field in the lfind_switch object, which is allocated in the lfind_switch::switch_out function.
- The two fields that were overwritten in the object pointed to by the lfind_switch object are used as indices in lfind_switch::switch_in. Because there is no bounds checking on these indices, another out-of-bounds write can occur.
- The out-of-bounds write from the previous step performs an or operation on the field in the VMM_context_t struct (the virtual memory manager within Windows Defender) that stores the length of a table that tracks the virtual mapped pages. This field usually equals the number of pages mapped * 2. By performing the 'or' operations, the value in that field is increased (for example, from 0x0000000C to 0x0003030C). Once it is increased, it allows for an additional out-of-bounds read & write, used for modifying the memory management struct to allow for arbitrary r/w.
The second step of overwriting the lfind_switch struct is likely done because the VMM_context_t struct is very far from the buffer that is originally overflowed (0x3C0000+ in my test). Overwriting this amount of memory would likely make the exploit less stable.
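The length-field corruption described in the third step is simple bit arithmetic, sketched below in Python. The two values 0x0000000C and 0x0003030C come from the example above; the mask is an assumption derived from them (the bits that differ), since the actual operands of the 'or' operations are not given, and the helper name or_corrupt is hypothetical.

```python
# An OR can only set bits, so it can only keep or increase an unsigned value.
ORIGINAL_LEN = 0x0000000C   # observed before corruption
CORRUPTED_LEN = 0x0003030C  # observed after corruption

# Assumed combined mask: the bits that must have been set by the 'or'
# operations, however many operations it actually took.
ILLUSTRATIVE_MASK = ORIGINAL_LEN ^ CORRUPTED_LEN


def or_corrupt(length_field, or_mask):
    """Apply an OR to a 32-bit length field."""
    return (length_field | or_mask) & 0xFFFFFFFF


pages_mapped = ORIGINAL_LEN // 2  # the field is usually pages mapped * 2
new_len = or_corrupt(ORIGINAL_LEN, ILLUSTRATIVE_MASK)
print(pages_mapped, hex(ILLUSTRATIVE_MASK), hex(new_len))
```

With these values, a table sized for 6 mapped pages (12 entries) is treated as having 0x3030C entries, which is what opens up the further out-of-bounds read & write.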
Exploit flow:
The exploit uses "primitive bootstrapping": the original buffer overflow is used to cause two additional out-of-bounds writes, which ultimately yield arbitrary read/write.
Known cases of the same exploit flow: Unknown.
Part of an exploit chain? Unknown.
The Next Steps
Variant analysis
Areas/approach for variant analysis (and why):
- Review ASProtect unpacker for additional parsing bugs.
- Review and/or fuzz other unpacking code for parsing and memory issues.
Found variants: N/A
Structural improvements
What are structural improvements such as ways to kill the bug class, prevent the introduction of this vulnerability, mitigate the exploit flow, make this type of vulnerability harder to exploit, etc.?
Ideas to kill the bug class:
- Building mpengine.dll with ASAN enabled should allow for this bug class to be caught.
- Rust. A memory safe language could potentially protect against these types of memory corruption vulnerabilities.
Ideas to mitigate the exploit flow:
- If possible, add bounds checking anywhere indices are used. For example, is there a way to add bounds checks where indices are used in lfind_switch::switch_in? Doing so might have prevented the second out-of-bounds write, which allowed this exploit to modify the VMM_context_t structure. This would depend on the attacker not being able to overwrite the bounds.
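As a rough illustration of the suggested mitigation, the Python sketch below validates an index before using it for an 'or' write. Everything here is hypothetical (the checked_or name, the list standing in for the table); it does not reflect how lfind_switch::switch_in is actually implemented, and it only helps if the length being checked against is stored where the attacker cannot overwrite it.

```python
def checked_or(table, index, mask):
    """OR mask into table[index] only if index is within bounds.

    The explicit check also rejects negative indices, which Python would
    otherwise accept silently and wrap around to the end of the list.
    """
    if not 0 <= index < len(table):
        raise IndexError(f"index {index} outside table of {len(table)} entries")
    table[index] |= mask


table = [0] * 4
checked_or(table, 1, 0x300)        # in bounds: applied
try:
    checked_or(table, 0x3C0000, 0x300)  # attacker-controlled index: rejected
except IndexError as err:
    print("blocked:", err)
```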
Other potential improvements:
- It appears that by default the Windows Defender emulator runs outside of a sandbox. In 2018, an article announced that Windows Defender Antivirus can now run in a sandbox. The article states that when sandboxing is enabled, you will see a content process, MsMpEngCp.exe, running in addition to MsMpEng.exe. By default, on Windows 10 machines, I only see MsMpEng.exe running as SYSTEM. Sandboxing the anti-malware emulator by default would make this vulnerability more difficult to exploit, because a sandbox escape would then be required in addition to this vulnerability.
- Open sourcing unpackers could allow more folks to find issues in this code, which could potentially detect issues like this more readily.
- It did not appear that this code had been extensively fuzzed. If this is the case, incorporating fuzz-testing into the software development lifecycle could help catch these types of issues.
0-day detection methods
What are potential detection methods for similar 0-days? Meaning are there any ideas of how this exploit or similar exploits could be detected as a 0-day?
- Detecting these types of 0-days will be difficult due to the sample simply dropping a new file with the characteristics to trigger the vulnerability, such as a section table that includes the same virtual address twice. The exploit method also did not require anything that especially stands out.
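One narrow heuristic suggested by the characteristic mentioned above is to flag section tables in which the same virtual address appears more than once. The Python sketch below is only an illustration with a hypothetical helper name and an assumed (virtual_address, size) entry layout; it says nothing about how to extract the table from a packed file, and it may produce false positives on benign samples.

```python
from collections import Counter


def repeated_virtual_addresses(section_table):
    """Return the sorted virtual addresses that occur more than once."""
    counts = Counter(addr for addr, _size in section_table)
    return sorted(addr for addr, n in counts.items() if n > 1)


sample_table = [(0, 0), (0, 0), (0x2000, 0), (0x2000, 0x3000)]
print([hex(a) for a in repeated_virtual_addresses(sample_table)])
```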
Other References
- February 2021: 浅析 CVE-2021-1647 的漏洞利用技巧 ("Analysis of CVE-2021-1647 vulnerability exploitation techniques") by Threatbook