# CVE-2022-22706 / CVE-2021-39793: Mali GPU driver makes read-only imported pages host-writable
*Jann Horn*

## The Basics

**Disclosure or Patch Date:** March 7, 2022

**Product:** Arm Mali GPU driver for Linux/Android

**Advisory:**

 - from Arm (upstream): https://developer.arm.com/Arm%20Security%20Center/Mali%20GPU%20Driver%20Vulnerabilities
 - from Google Pixel: https://source.android.com/security/bulletin/pixel/2022-03-01#pixel

**Affected Versions:** see Arm advisory (note that the affected version range
for the Bifrost version of the related CVE-2021-28664 seems to be off-by-one)

**First Patched Version:**

 - for Arm: see Arm advisory
 - for Pixel: patch level 2022-03-05

**Issue/Bug Report:** N/A

**Patch CL:** https://android.googlesource.com/kernel/google-modules/gpu/+/5381ff7b4106b277ff207396e293ede2bf959f0c%5E%21/

**Bug-Introducing CL:** N/A, Arm usually only publishes driver versions as tarballs

**Reporter(s):** unknown

## The Code

**Proof-of-concept:**

**Exploit sample:** N/A

**Did you have access to the exploit sample when doing the analysis?** no

## The Vulnerability

**Bug class:** Broken access control logic

**Vulnerability details:**

The out-of-tree Mali driver allows userspace to create GPU memory objects
from host-virtual memory areas using the memory type
`KBASE_MEM_TYPE_IMPORTED_USER_BUF`, which grabs page references using
`pin_user_pages_remote()` (or `get_user_pages_remote()` on older kernels).
I think this is somewhat frowned upon in upstream GPU drivers nowadays; for
comparison, the upstream Intel GPU driver `i915` has a similar mechanism under
the name `userptr`, but the function `i915_gem_userptr_ioctl` implementing this
interface has the following comment on top of it:

https://elixir.bootlin.com/linux/v5.18.14/source/drivers/gpu/drm/i915/gem/i915_gem_userptr.c#L477
```
 * Also note, that the object created here is not currently a "first class"
 * object, in that several ioctls are banned. These are the CPU access
 * ioctls: mmap(), pwrite and pread. In practice, you are expected to use
 * direct access via your pointer rather than use those ioctls. Another
 * restriction is that we do not allow userptr surfaces to be pinned to the
 * hardware and so we reject any attempt to create a framebuffer out of a
 * userptr.
 *
 * If you think this is a good interface to use to pass GPU memory between
 * drivers, please use dma-buf instead. In fact, wherever possible use
 * dma-buf instead.
```

Unlike i915, the Mali driver lets host userspace create a GPU memory object
from a userspace memory area and then also access this object from host
userspace through the driver.

The driver uses flags on the GPU memory object to track access permissions:

 - `KBASE_REG_GPU_RD` and `KBASE_REG_GPU_WR` for read / write access from jobs
   running on the GPU through GPU-virtual addresses; this mainly works by
   controlling the `ENTRY_ACCESS_RW` and `ENTRY_ACCESS_RO` bits in the GPU page
   tables
 - `KBASE_REG_CPU_RD` and `KBASE_REG_CPU_WR` for read / write access from
   host kernel code (on behalf of host userspace) and host userspace;
   these flags affect VMA permission flags in the host kernel (which control
   permission bits in host page tables) and are also used for explicit
   permission checks in kernel code

However, in vulnerable versions of the driver, `kbase_jd_user_buf_pin_pages()`
only checks the `KBASE_REG_GPU_WR` flag to determine whether
`pin_user_pages_remote()` should request write access, and wrongly ignores
the `KBASE_REG_CPU_WR` flag. The fix is essentially (with lots of duplicate
changes to handle different kernel versions):
```diff
@@ -4556,65 +4557,62 @@ int kbase_jd_user_buf_pin_pages(struct kbase_context *kctx,
                struct kbase_va_region *reg)
 {
        struct kbase_mem_phy_alloc *alloc = reg->gpu_alloc;
        struct page **pages = alloc->imported.user_buf.pages;
        unsigned long address = alloc->imported.user_buf.address;
        struct mm_struct *mm = alloc->imported.user_buf.mm;
        long pinned_pages;
        long i;
+       int write;
 
        if (WARN_ON(alloc->type != KBASE_MEM_TYPE_IMPORTED_USER_BUF))
                return -EINVAL;
 
        if (alloc->nents) {
                if (WARN_ON(alloc->nents != alloc->imported.user_buf.nr_pages))
                        return -EINVAL;
                else
                        return 0;
        }
 
        if (WARN_ON(reg->gpu_alloc->imported.user_buf.mm != current->mm))
                return -EINVAL;
 
+       write = reg->flags & (KBASE_REG_CPU_WR | KBASE_REG_GPU_WR);
+
[...]
        pinned_pages = pin_user_pages_remote(
                mm, address, alloc->imported.user_buf.nr_pages,
-               reg->flags & KBASE_REG_GPU_WR ? FOLL_WRITE : 0, pages, NULL,
-               NULL);
+               write ? FOLL_WRITE : 0, pages, NULL, NULL);
[...]
```
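
The effect of this change can be seen in a small userspace model. This is only a sketch: the `MODEL_*` bit values and the `model_*` function names are hypothetical and not taken from the driver, which defines its own `KBASE_REG_*` constants.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit values for this model only; the real KBASE_REG_*
 * constants live in the driver headers and have different values. */
#define MODEL_REG_CPU_WR (1u << 0)
#define MODEL_REG_GPU_WR (1u << 1)

/* Vulnerable decision: only GPU write access leads to FOLL_WRITE. */
bool model_wants_write_vulnerable(uint32_t reg_flags)
{
	return (reg_flags & MODEL_REG_GPU_WR) != 0;
}

/* Patched decision: CPU or GPU write access leads to FOLL_WRITE. */
bool model_wants_write_patched(uint32_t reg_flags)
{
	return (reg_flags & (MODEL_REG_CPU_WR | MODEL_REG_GPU_WR)) != 0;
}

/* Walk all four flag combinations and check where the two versions differ. */
void model_check_write_decision(void)
{
	for (uint32_t flags = 0; flags < 4; flags++) {
		bool differs = model_wants_write_vulnerable(flags) !=
			       model_wants_write_patched(flags);

		/* They disagree only for "CPU-writable, not GPU-writable". */
		assert(differs == (flags == MODEL_REG_CPU_WR));
	}
}
```

Calling `model_check_write_decision()` confirms that, in this model, the only flag combination affected by the patch is a region that is CPU-writable but not GPU-writable, which is exactly the combination the attack relies on.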


So in a vulnerable version, an attacker can write into pages that they are only
allowed to read (for example, pages of a shared library) as follows:

 - Map some page from a shared library as read-only
 - Create a Mali `KBASE_MEM_TYPE_IMPORTED_USER_BUF` with `KBASE_REG_CPU_WR` but
   without `KBASE_REG_GPU_WR` from the victim page mapping; this involves
   creating a host-side VMA for the Mali memory object.
   The buffer has to be created in a way that doesn't set
   `KBASE_REG_SHARE_BOTH`.
 - Trigger `kbase_jd_user_buf_pin_pages()` on this memory object (either via
   `KBASE_IOCTL_KCPU_QUEUE_ENQUEUE` with a `BASE_KCPU_COMMAND_TYPE_MAP_IMPORT`
   command, or by submitting an atom with `BASE_JD_REQ_EXTERNAL_RESOURCES`) to
   execute the incorrect `pin_user_pages_remote()` call, which pins the
   victim pages without requesting write access
 - Write into the Mali memory object from host userspace
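
The steps above can be modeled in plain userspace C without touching the driver. This is a deliberately simplified sketch with hypothetical names. It assumes that a pin requesting write access on a non-writable VMA fails with `-EFAULT` (as `get_user_pages()` does when `FOLL_FORCE` is not set), and it ignores copy-on-write and everything else the real memory-management code does.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for a page-cache page and the VMA mapping it. */
struct model_page {
	unsigned char byte;
};

struct model_vma {
	struct model_page *page;
	bool vm_write; /* models VM_WRITE on the victim mapping */
};

/* Simplified pin: write requests on read-only VMAs are refused. */
int model_pin_page(const struct model_vma *vma, bool want_write,
		   struct model_page **out)
{
	if (want_write && !vma->vm_write)
		return -EFAULT;
	*out = vma->page;
	return 0;
}

/* Import the victim mapping as CPU-writable and write through the import.
 * pin_requests_write models whether the pin site passed FOLL_WRITE. */
int model_import_and_write(const struct model_vma *victim,
			   bool pin_requests_write, unsigned char value)
{
	struct model_page *pinned = NULL;
	int err = model_pin_page(victim, pin_requests_write, &pinned);

	if (err)
		return err;
	pinned->byte = value; /* CPU write via the driver's own mapping */
	return 0;
}

/* Patched pin site vs. vulnerable pin site on a read-only library page. */
void model_check_attack(void)
{
	struct model_page lib_page = { .byte = 0x11 };
	struct model_vma ro_map = { .page = &lib_page, .vm_write = false };

	/* Patched: the pin requests write access, so it is refused. */
	assert(model_import_and_write(&ro_map, true, 0x42) == -EFAULT);
	assert(lib_page.byte == 0x11);

	/* Vulnerable: the pin is read-only, yet the CPU write goes through. */
	assert(model_import_and_write(&ro_map, false, 0x42) == 0);
	assert(lib_page.byte == 0x42);
}
```

The point of the model is that the permission check happens only at pin time. Once a page reference has been obtained without write access, nothing later stops a write through the driver's own CPU mapping.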

**Patch analysis:**

The patch addresses the remaining site that was missed in
the CVE-2021-28664 fix (see below).
At this point, I see no remaining places in the driver that look up page
pointers with access flags that don't match the corresponding Mali memory
object.

**Thoughts on how this vuln might have been found:**

This vulnerability is a straightforward variant of a previous Mali bug,
CVE-2021-28664, which was fixed as follows around 10 months earlier
(from the diff between Mali Bifrost r29p0 and r30p0):
```diff
 static struct kbase_va_region *kbase_mem_from_user_buffer(
                struct kbase_context *kctx, unsigned long address,
                unsigned long size, u64 *va_pages, u64 *flags)
 {
[...]
+       int write;
[...]
+       write = reg->flags & (KBASE_REG_CPU_WR | KBASE_REG_GPU_WR);
+
 #if KERNEL_VERSION(4, 6, 0) > LINUX_VERSION_CODE
        faulted_pages = get_user_pages(current, current->mm, address, *va_pages,
 #if KERNEL_VERSION(4, 4, 168) <= LINUX_VERSION_CODE && \
 KERNEL_VERSION(4, 5, 0) > LINUX_VERSION_CODE
-                       reg->flags & KBASE_REG_CPU_WR ? FOLL_WRITE : 0,
-                       pages, NULL);
+                       write ? FOLL_WRITE : 0, pages, NULL);
 #else
-                       reg->flags & KBASE_REG_CPU_WR, 0, pages, NULL);
+                       write, 0, pages, NULL);
 #endif
 #elif KERNEL_VERSION(4, 9, 0) > LINUX_VERSION_CODE
        faulted_pages = get_user_pages(address, *va_pages,
-                       reg->flags & KBASE_REG_CPU_WR, 0, pages, NULL);
+                       write, 0, pages, NULL);
 #else
        faulted_pages = get_user_pages(address, *va_pages,
-                       reg->flags & KBASE_REG_CPU_WR ? FOLL_WRITE : 0,
-                       pages, NULL);
+                       write ? FOLL_WRITE : 0, pages, NULL);
 #endif
```

This is very similar to the patch linked above: essentially, this was a bug in
duplicated code, and only one instance of it was patched.
Both copies of the code call `get_user_pages()` (or a variant of it) to grab
page references for a `KBASE_MEM_TYPE_IMPORTED_USER_BUF` memory object, and
each of them checked only one of the two write flags. As the removed lines in
the two diffs show, the copy fixed for CVE-2021-28664 checked only
`KBASE_REG_CPU_WR` (ignoring `KBASE_REG_GPU_WR`), while the copy fixed here
checked only `KBASE_REG_GPU_WR` (ignoring `KBASE_REG_CPU_WR`). Apart from that,
the only difference between them is that one copy is for the case where pages
are pinned directly at object creation, while the other copy handles the case
where pages are pinned at a later point.
Which one of these codepaths is used depends on the `KBASE_REG_SHARE_BOTH` flag.
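
The relationship between the two bugs can be sketched as a small model. The names and bit values are hypothetical. The mapping of `KBASE_REG_SHARE_BOTH` to the creation-time path is an assumption inferred from the attack description (which requires the flag to be unset to reach the deferred path), and the pre-fix checks are taken from the removed lines in the two diffs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit values for this model only. */
#define MODEL_REG_CPU_WR     (1u << 0)
#define MODEL_REG_GPU_WR     (1u << 1)
#define MODEL_REG_SHARE_BOTH (1u << 2)

enum model_pin_site {
	MODEL_PIN_AT_CREATION, /* models kbase_mem_from_user_buffer() */
	MODEL_PIN_DEFERRED,    /* models kbase_jd_user_buf_pin_pages() */
};

/* Assumption: SHARE_BOTH selects pinning at object creation. */
enum model_pin_site model_select_site(uint32_t reg_flags)
{
	return (reg_flags & MODEL_REG_SHARE_BOTH) ? MODEL_PIN_AT_CREATION
						  : MODEL_PIN_DEFERRED;
}

/* Which write flags does a pin site look at before requesting FOLL_WRITE? */
bool model_site_wants_write(enum model_pin_site site, uint32_t reg_flags,
			    bool site_is_patched)
{
	uint32_t checked;

	if (site_is_patched)
		checked = MODEL_REG_CPU_WR | MODEL_REG_GPU_WR;
	else if (site == MODEL_PIN_AT_CREATION)
		checked = MODEL_REG_CPU_WR; /* pre-fix: GPU_WR ignored */
	else
		checked = MODEL_REG_GPU_WR; /* pre-fix: CPU_WR ignored */
	return (reg_flags & checked) != 0;
}

/* True when the region is writable by someone, but the pin site that
 * handles it would not request write access. */
bool model_under_requests(uint32_t reg_flags, bool creation_patched,
			  bool deferred_patched)
{
	enum model_pin_site site = model_select_site(reg_flags);
	bool patched = (site == MODEL_PIN_AT_CREATION) ? creation_patched
						       : deferred_patched;
	bool writable =
		(reg_flags & (MODEL_REG_CPU_WR | MODEL_REG_GPU_WR)) != 0;

	return writable && !model_site_wants_write(site, reg_flags, patched);
}

void model_check_variants(void)
{
	/* After only the first fix, the deferred path still under-requests. */
	assert(model_under_requests(MODEL_REG_CPU_WR, true, false));
	/* The creation-time path is fine once patched. */
	assert(!model_under_requests(MODEL_REG_GPU_WR | MODEL_REG_SHARE_BOTH,
				     true, false));
	/* With both sites patched, no flag combination under-requests. */
	for (uint32_t flags = 0; flags < 8; flags++)
		assert(!model_under_requests(flags, true, true));
}
```

In this model, patching one site leaves exactly one reachable mismatch, which is why diffing the first fix against the remaining pin sites is enough to find the variant.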

It seems likely that an attacker could have discovered this issue by looking at
the fix for CVE-2021-28664 and searching for other `get_user_pages()` callers
in the Mali driver.

There has also been at least one very similar issue in an upstream graphics
driver: https://git.kernel.org/linus/cd5297b0855f

**(Historical/present/future) context of bug:**

See previous section. Additionally:

Looking through the list of public Mali bugs for issues described as
_"Mali GPU Kernel Driver elevates CPU RO pages to writable"_, there is a third
bug CVE-2021-44828 with this description. This bug doesn't involve
`get_user_pages()`, but it does again involve a missing check for the
`KBASE_REG_CPU_WR` flag.

Several functions across the driver (`kbase_kcpu_jit_allocate_process()`,
`kbasep_write_soft_event_status()` and `kbase_jit_allocate_process()`) write to
Mali memory objects on behalf of the user. Instead of writing directly to the
corresponding userspace-virtual addresses, they mapped the corresponding page
into kernel-virtual memory using `kbase_vmap()` and then wrote to this
kernel-virtual address.
The bug was that there was no check to ensure that the Mali memory object was
actually marked as writable using `KBASE_REG_CPU_WR`.
This was addressed by instead using `kbase_vmap_prot()`, which performs the
necessary access check.
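
The difference between the unchecked and checked mapping helpers can be sketched as follows. This is a hypothetical userspace model, not the driver's code: the `model_*` names, the bit values, and the convention that the checked helper takes a mask of required permission flags are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit values for this model only. */
#define MODEL_REG_CPU_RD (1u << 0)
#define MODEL_REG_CPU_WR (1u << 1)

struct model_region {
	uint32_t flags;
	unsigned char backing[16];
};

/* Unchecked mapping: models how kbase_vmap() was used before the fix. */
unsigned char *model_vmap(struct model_region *reg)
{
	return reg->backing;
}

/* Checked mapping: models kbase_vmap_prot(); fails unless the region
 * carries every permission flag in prot_request. */
unsigned char *model_vmap_prot(struct model_region *reg, uint32_t prot_request)
{
	if ((reg->flags & prot_request) != prot_request)
		return NULL;
	return reg->backing;
}

/* Kernel-side write on behalf of the user; returns true if it happened. */
bool model_write_status(struct model_region *reg, unsigned char status,
			bool use_checked_map)
{
	unsigned char *kaddr = use_checked_map
				       ? model_vmap_prot(reg, MODEL_REG_CPU_WR)
				       : model_vmap(reg);

	if (!kaddr)
		return false;
	kaddr[0] = status;
	return true;
}

void model_check_vmap(void)
{
	struct model_region ro = { .flags = MODEL_REG_CPU_RD };

	/* Unchecked path: the write lands in a CPU-read-only region. */
	assert(model_write_status(&ro, 1, false));
	assert(ro.backing[0] == 1);

	/* Checked path: the write is refused and the data is unchanged. */
	assert(!model_write_status(&ro, 2, true));
	assert(ro.backing[0] == 1);
}
```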

## The Exploit

(The terms *exploit primitive*, *exploit strategy*, *exploit technique*, and *exploit flow* are [defined here](https://googleprojectzero.blogspot.com/2020/06/a-survey-of-recent-ios-kernel-exploits.html).)

**Exploit strategy (or strategies):** 

**Exploit flow:** 

**Known cases of the same exploit flow:**

**Part of an exploit chain?**

## The Next Steps

### Variant analysis

**Areas/approach for variant analysis (and why):**

 - Audit permission flag checks in Mali and other GPU drivers for memory
   imported via `get_user_pages()`. (**TODO**)

**Found variants:**

### Structural improvements

What are structural improvements such as ways to kill the bug class, prevent the introduction of this vulnerability, mitigate the exploit flow, make this type of vulnerability harder to exploit, etc.?

**Ideas to kill the bug class:**

 - Maybe get rid of the `get_user_pages`-based interface if it's unnecessary,
   since having `KBASE_MEM_TYPE_IMPORTED_USER_BUF` makes the impact of these
   types of bugs much worse?

**Ideas to mitigate the exploit flow:**

**Other potential improvements:**

### 0-day detection methods

What are potential detection methods for similar 0-days? Meaning are there any ideas of how this exploit or similar exploits could be detected **as a 0-day**?

## Other References 
