CVE-2022-22706 / CVE-2021-39793: Mali GPU driver makes read-only imported pages host-writable
Jann Horn
The Basics
Disclosure or Patch Date: March 7, 2022
Product: Arm Mali GPU driver for Linux/Android
Advisory:
- from Arm (upstream): https://developer.arm.com/Arm%20Security%20Center/Mali%20GPU%20Driver%20Vulnerabilities
- from Google Pixel: https://source.android.com/security/bulletin/pixel/2022-03-01#pixel
Affected Versions: see Arm advisory (note that the affected version range for the Bifrost version of the related CVE-2021-28664 seems to be off-by-one)
First Patched Version:
- for Arm: see Arm advisory
- for Pixel: patch level 2022-03-05
Issue/Bug Report: N/A
Bug-Introducing CL: N/A, Arm usually only publishes driver versions as tarballs
Reporter(s): unknown
The Code
Proof-of-concept:
Exploit sample: N/A
Did you have access to the exploit sample when doing the analysis? no
The Vulnerability
Bug class: Broken access control logic
Vulnerability details:
The out-of-tree Mali driver allows userspace to create GPU memory objects from host-virtual memory areas using the memory type KBASE_MEM_TYPE_IMPORTED_USER_BUF, which grabs page references using pin_user_pages_remote() (or get_user_pages_remote() on older kernels).
I think this is somewhat frowned upon in upstream GPU drivers nowadays; for comparison, the upstream Intel GPU driver i915 has a similar mechanism under the name userptr, but the function i915_gem_userptr_ioctl implementing this interface has the following comment on top of it:
https://elixir.bootlin.com/linux/v5.18.14/source/drivers/gpu/drm/i915/gem/i915_gem_userptr.c#L477
* Also note, that the object created here is not currently a "first class"
* object, in that several ioctls are banned. These are the CPU access
* ioctls: mmap(), pwrite and pread. In practice, you are expected to use
* direct access via your pointer rather than use those ioctls. Another
* restriction is that we do not allow userptr surfaces to be pinned to the
* hardware and so we reject any attempt to create a framebuffer out of a
* userptr.
*
* If you think this is a good interface to use to pass GPU memory between
* drivers, please use dma-buf instead. In fact, wherever possible use
* dma-buf instead.
Unlike i915, the Mali driver makes it possible for host userspace to create a GPU memory object from a userspace memory area and then also access this object from host userspace.
The driver uses flags on the GPU memory object to track access permissions:
- KBASE_REG_GPU_RD and KBASE_REG_GPU_WR for read / write access from jobs running on the GPU through GPU-virtual addresses; this mainly works by controlling the ENTRY_ACCESS_RW and ENTRY_ACCESS_RO bits in the GPU page tables
- KBASE_REG_CPU_RD and KBASE_REG_CPU_WR for read / write access from host kernel code (on behalf of host userspace) and host userspace; these flags affect VMA permission flags in the host kernel (which control permission bits in host page tables) and are also used for explicit permission checks in kernel code
However, in vulnerable versions of the driver, kbase_jd_user_buf_pin_pages() only checks the KBASE_REG_GPU_WR flag to determine whether pin_user_pages_remote() should request write access, and wrongly ignores the KBASE_REG_CPU_WR flag. The fix is essentially (with lots of duplicate changes to handle different kernel versions):
@@ -4556,65 +4557,62 @@ int kbase_jd_user_buf_pin_pages(struct kbase_context *kctx,
 		struct kbase_va_region *reg)
 {
 	struct kbase_mem_phy_alloc *alloc = reg->gpu_alloc;
 	struct page **pages = alloc->imported.user_buf.pages;
 	unsigned long address = alloc->imported.user_buf.address;
 	struct mm_struct *mm = alloc->imported.user_buf.mm;
 	long pinned_pages;
 	long i;
+	int write;
 	if (WARN_ON(alloc->type != KBASE_MEM_TYPE_IMPORTED_USER_BUF))
 		return -EINVAL;
 	if (alloc->nents) {
 		if (WARN_ON(alloc->nents != alloc->imported.user_buf.nr_pages))
 			return -EINVAL;
 		else
 			return 0;
 	}
 	if (WARN_ON(reg->gpu_alloc->imported.user_buf.mm != current->mm))
 		return -EINVAL;
+	write = reg->flags & (KBASE_REG_CPU_WR | KBASE_REG_GPU_WR);
+
[...]
 	pinned_pages = pin_user_pages_remote(
 		mm, address, alloc->imported.user_buf.nr_pages,
-		reg->flags & KBASE_REG_GPU_WR ? FOLL_WRITE : 0, pages, NULL,
-		NULL);
+		write ? FOLL_WRITE : 0, pages, NULL, NULL);
[...]
So in a vulnerable version, an attacker can write into read-only pages from shared libraries and such, as follows:
- Map some page from a shared library as read-only
- Create a Mali KBASE_MEM_TYPE_IMPORTED_USER_BUF with KBASE_REG_CPU_WR but without KBASE_REG_GPU_WR from the victim page mapping; this involves creating a host-side VMA for the Mali memory object. The buffer has to be created in a way that doesn't set KBASE_REG_SHARE_BOTH.
- Trigger kbase_jd_user_buf_pin_pages() on this memory object (either via KBASE_IOCTL_KCPU_QUEUE_ENQUEUE with a BASE_KCPU_COMMAND_TYPE_MAP_IMPORT command, or by submitting an atom with BASE_JD_REQ_EXTERNAL_RESOURCES) to execute the incorrect get_user_pages() call
- Write into the Mali memory object from host userspace
Patch analysis:
The patch addresses the remaining site that was missed in the CVE-2021-28664 fix (see below). At this point, I see no remaining places in the driver that look up page pointers with access flags that don't match the corresponding Mali memory object.
Thoughts on how this vuln might have been found:
This vulnerability is a straightforward variant of a previous Mali bug, CVE-2021-28664, which was fixed as follows around 10 months earlier (from the diff between Mali Bifrost r29p0 and r30p0):
 static struct kbase_va_region *kbase_mem_from_user_buffer(
 		struct kbase_context *kctx, unsigned long address,
 		unsigned long size, u64 *va_pages, u64 *flags)
 {
[...]
+	int write;
[...]
+	write = reg->flags & (KBASE_REG_CPU_WR | KBASE_REG_GPU_WR);
+
 #if KERNEL_VERSION(4, 6, 0) > LINUX_VERSION_CODE
 	faulted_pages = get_user_pages(current, current->mm, address, *va_pages,
 #if KERNEL_VERSION(4, 4, 168) <= LINUX_VERSION_CODE && \
 KERNEL_VERSION(4, 5, 0) > LINUX_VERSION_CODE
-			reg->flags & KBASE_REG_CPU_WR ? FOLL_WRITE : 0,
-			pages, NULL);
+			write ? FOLL_WRITE : 0, pages, NULL);
 #else
-			reg->flags & KBASE_REG_CPU_WR, 0, pages, NULL);
+			write, 0, pages, NULL);
 #endif
 #elif KERNEL_VERSION(4, 9, 0) > LINUX_VERSION_CODE
 	faulted_pages = get_user_pages(address, *va_pages,
-			reg->flags & KBASE_REG_CPU_WR, 0, pages, NULL);
+			write, 0, pages, NULL);
 #else
 	faulted_pages = get_user_pages(address, *va_pages,
-			reg->flags & KBASE_REG_CPU_WR ? FOLL_WRITE : 0,
-			pages, NULL);
+			write ? FOLL_WRITE : 0, pages, NULL);
 #endif
This is very similar to the patch shown above - essentially, this was a bug in duplicated code, and only one instance of it was patched.
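Both copies of the code call get_user_pages() to grab page references for a KBASE_MEM_TYPE_IMPORTED_USER_BUF memory object, and both of them wrongly looked at only one of the two write flags: going by the diffs, the copy fixed for CVE-2021-28664 only checked KBASE_REG_CPU_WR and ignored KBASE_REG_GPU_WR, while the copy fixed for this bug only checked KBASE_REG_GPU_WR and ignored KBASE_REG_CPU_WR. The only other difference between them is that one copy is for the case where pages are pinned directly at object creation, while the other copy handles the case where pages are pinned at a later point.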
Which one of these codepaths is used depends on the KBASE_REG_SHARE_BOTH flag.
It seems likely that an attacker could have discovered this issue by looking at the fix for CVE-2021-28664 and searching for other get_user_pages() callers in the Mali driver.
There has also been at least one very similar issue in an upstream graphics driver: https://git.kernel.org/linus/cd5297b0855f
(Historical/present/future) context of bug:
See previous section. Additionally:
Looking through the list of public Mali bugs for issues described as "Mali GPU Kernel Driver elevates CPU RO pages to writable", there is a third bug CVE-2021-44828 with this description. This bug doesn't involve get_user_pages(), but it does again involve a missing check for the KBASE_REG_CPU_WR flag.
Various functions across the driver (kbase_kcpu_jit_allocate_process(), kbasep_write_soft_event_status() and kbase_jit_allocate_process()) would write to Mali memory objects on behalf of the user, but instead of doing this by directly writing to the corresponding userspace-virtual addresses, they mapped the corresponding page into kernel-virtual memory using kbase_vmap(), then wrote to this kernel-virtual address.
The bug was that there was no check to ensure that the Mali memory object was actually marked as writable using KBASE_REG_CPU_WR.
This was addressed by instead using kbase_vmap_prot(), which performs the necessary access check.
The Exploit
(The terms exploit primitive, exploit strategy, exploit technique, and exploit flow are defined here.)
Exploit strategy (or strategies):
Exploit flow:
Known cases of the same exploit flow:
Part of an exploit chain?
The Next Steps
Variant analysis
Areas/approach for variant analysis (and why):
- Audit permission flag checks in Mali and other GPU drivers for memory imported via get_user_pages(). (TODO)
Found variants:
Structural improvements
What are structural improvements such as ways to kill the bug class, prevent the introduction of this vulnerability, mitigate the exploit flow, make this type of vulnerability harder to exploit, etc.?
Ideas to kill the bug class:
- Maybe get rid of the get_user_pages-based interface if it's unnecessary, since having KBASE_MEM_TYPE_IMPORTED_USER_BUF makes the impact of these types of bugs much worse?
Ideas to mitigate the exploit flow:
Other potential improvements:
0-day detection methods
What are potential detection methods for similar 0-days? Meaning are there any ideas of how this exploit or similar exploits could be detected as a 0-day?