How we fought bad apps and developers in 2020

April 21, 2021

Posted by Krish Vitaldevara, Director of Product Management Trust & Safety, Google Play

Providing safe experiences to billions of users and millions of Android developers has been one of the highest priorities for Google Play for many years. Last year we introduced new policies, improved our systems, and further optimized our processes to better protect our users, assist good developers and strengthen our guard against bad apps and developers. Additionally, in 2020, Google Play Protect scanned over 100B installed apps each day for malware across billions of devices.

Users come to Google Play to find helpful, reliable apps on everything from COVID-19 vaccine information to new forms of entertainment, grocery delivery, communication and more.

As such, we introduced a series of policies and new developer support to continue to elevate information quality on the platform and reduce the risk of user harm from misinformation.

COVID-19 apps requirements: To ensure public safety, information integrity and privacy, we introduced specific requirements for COVID-19 apps. Under these requirements, apps related to sensitive use cases, such as those providing testing information, must be endorsed by either official governmental entities or healthcare organizations and must meet a high standard for user data privacy.
News policy: To promote transparency in news publishing, we introduced minimum requirements that apps must meet in order for developers to declare their app as a “News” app on Google Play. These guidelines help promote user transparency and developer accountability by providing users with relevant information about the app.
Election support: We created teams and processes across Google Play focused on elections to provide additional support and adapt to the changing landscape. This includes support for government agencies, specially trained app reviewers, and a safety team to address election threats and abuse.

Our core efforts around identifying and mitigating bad apps and developers continued to evolve to address new adversarial behaviors and forms of abuse. Our machine-learning detection capabilities and enhanced app review processes prevented over 962k policy-violating app submissions from getting published to Google Play. We also banned 119k malicious and spammy developer accounts. Additionally, we significantly increased our focus on SDK enforcement, as we've found these violations have an outsized impact on security and user data privacy.

Last year, we continued to reduce developer access to sensitive permissions. In February, we announced a new background location policy to ensure that apps requesting this permission need the data in order to provide clear user benefit. As a result of the new policy, developers now have to demonstrate that benefit and prominently tell users about it or face possible removal from Google Play. We've begun enforcement on apps not meeting new policy guidelines and will provide an update on the usage of this permission in a future blog post.

We've also continued to invest in protecting kids and helping parents find great content. In 2020 we launched a new kids tab filled with “Teacher approved” apps. To evaluate apps, we teamed with academic experts and teachers across the country, including our lead advisors, Joe Blatt (Harvard Graduate School of Education) and Dr. Sandra Calvert (Georgetown University).

As we continue to invest in protecting people from apps with harmful content, malicious behaviors, or threats to user privacy, we are equally motivated to provide trusted experiences to Play developers. For example, we’ve improved our process for providing relevant information about enforcement actions we’ve taken, resulting in a significant reduction in appeals and increased developer satisfaction. We will continue to enhance the speed and quality of our communications to developers, and continue listening to feedback about how we can further engage and elevate trusted developers. Android developers can expect to see more on this front in the coming year.

Our global teams of product managers, engineers, policy experts, and operations leaders are more excited than ever to advance the safety of the platform and forge a sustaining trust with our users. We look forward to building an even better Google Play experience.

A New Standard for Mobile App Security

April 15, 2021

Posted by Brooke Davis and Eugene Liderman, Android Security and Privacy Team

With all of the challenges from this past year, users have become increasingly dependent on their mobile devices to create fitness routines, stay connected with loved ones, work remotely, and order things like groceries with ease. According to eMarketer, in 2020 users spent over three and a half hours per day using mobile apps. With so much time spent on mobile devices, ensuring the safety of mobile apps is more important than ever. Despite the importance of digital security, there isn’t a consistent industry standard for assessing mobile apps. Existing guidelines tend to be either too lightweight or too onerous for the average developer, and lack a compliance arm. That’s why we're excited to share ioXt’s announcement of a new Mobile Application Profile which provides a set of security and privacy requirements with defined acceptance criteria which developers can certify their apps against.

Over 20 industry stakeholders, including Google, Amazon, and a number of certified labs such as NCC Group and Dekra, as well as automated mobile app security testing vendors like NowSecure, collaborated to develop this new security standard for mobile apps. We’ve seen early interest from Internet of Things (IoT) and virtual private network (VPN) developers; however, the standard is appropriate for any cloud-connected service such as social, messaging, fitness, or productivity apps.

The Internet of Secure Things Alliance (ioXt) manages a security compliance assessment program for connected devices. ioXt has over 300 members across various industries, including Google, Amazon, Facebook, T-Mobile, Comcast, Zigbee Alliance, Z-Wave Alliance, Legrand, Resideo, Schneider Electric, and many others. With so many companies involved, ioXt covers a wide range of device types, including smart lighting, smart speakers, and webcams, and since most smart devices are managed through apps, they have expanded coverage to include mobile apps with the launch of this profile.

The ioXt Mobile Application Profile provides a minimum set of commercial best practices for all cloud connected apps running on mobile devices. This security baseline helps mitigate against common threats and reduces the probability of significant vulnerabilities. The profile leverages existing standards and principles set forth by OWASP MASVS and the VPN Trust Initiative, and allows developers to differentiate security capabilities around cryptography, authentication, network security, and vulnerability disclosure program quality. The profile also provides a framework to evaluate app category specific requirements which may be applied based on the features contained in the app. For example, an IoT app only needs to certify under the Mobile Application profile, whereas a VPN app must comply with the Mobile Application profile, plus the VPN extension.

Certification allows developers to demonstrate product safety and we’re excited about the opportunity for this standard to push the industry forward. We observed that app developers were very quick to resolve any issues that were identified during their blackbox evaluations against this new standard, oftentimes with turnarounds in a matter of days. At launch, the following apps have been certified: Comcast, ExpressVPN, GreenMAX, Hubspace, McAfee Innovations, NordVPN, OpenVPN for Android, Private Internet Access, VPN Private, as well as the Google One app, including VPN by Google One.

We look forward to seeing adoption of the standard grow over time and for those app developers that are already investing in security best practices to be able to highlight their efforts. The standard also serves as a guiding light to inspire more developers to invest in mobile app security. If you are interested in learning more about the ioXt Alliance and how to get your app certified, visit https://compliance.ioxtalliance.org/sign-up and check out Android’s guidelines for building secure apps.

Rust in the Linux kernel

April 14, 2021

Posted by Wedson Almeida Filho, Android Team

In our previous post, we announced that Android now supports the Rust programming language for developing the OS itself. Related to this, we are also participating in the effort to evaluate the use of Rust as a supported language for developing the Linux kernel. In this post, we discuss some technical aspects of this work using a few simple examples.

C has been the language of choice for writing kernels for almost half a century because it offers the level of control and predictable performance required by such a critical component. Density of memory safety bugs in the Linux kernel is generally quite low due to high code quality, high standards of code review, and carefully implemented safeguards. However, memory safety bugs do still regularly occur. On Android, vulnerabilities in the kernel are generally considered high-severity because they can result in a security model bypass due to the privileged mode that the kernel runs in.

We feel that Rust is now ready to join C as a practical language for implementing the kernel. It can help us reduce the number of potential bugs and security vulnerabilities in privileged code while playing nicely with the core kernel and preserving its performance characteristics.

Supporting Rust

We developed an initial prototype of the Binder driver to allow us to make meaningful comparisons between the safety and performance characteristics of the existing C version and its Rust counterpart. The Linux kernel has over 30 million lines of code, so naturally our goal is not to convert it all to Rust but rather to allow new code to be written in Rust. We believe this incremental approach allows us to benefit from the kernel’s existing high-performance implementation while providing kernel developers with new tools to improve memory safety and maintain performance going forward.

We joined the Rust for Linux organization, where the community had already done and continues to do great work toward adding Rust support to the Linux kernel build system. We also need designs that allow code in the two languages to interact with each other: we're particularly interested in safe, zero-cost abstractions that allow Rust code to use kernel functionality written in C, and how to implement functionality in idiomatic Rust that can be called seamlessly from the C portions of the kernel.

Since Rust is a new language for the kernel, we also have the opportunity to enforce best practices in terms of documentation and uniformity. For example, we have specific machine-checked requirements around the usage of unsafe code: for every unsafe function, the developer must document the requirements that need to be satisfied by callers to ensure that its usage is safe; additionally, for every call to unsafe functions (or usage of unsafe constructs like dereferencing a raw pointer), the developer must document the justification for why it is safe to do so.
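
The two documentation requirements can be illustrated with a minimal sketch in ordinary user-space Rust. The function names below are hypothetical and are not part of the kernel crate; the sketch only shows the shape of a documented unsafe function and a justified unsafe call.

```rust
/// Reads the `u32` stored at `ptr`.
///
/// # Safety
///
/// Callers must ensure that `ptr` is non-null, properly aligned, and points
/// to an initialized `u32` that stays valid for the duration of the call.
unsafe fn read_raw(ptr: *const u32) -> u32 {
    // SAFETY: the caller guarantees the requirements listed above.
    unsafe { *ptr }
}

/// Safe wrapper: the reference proves validity, so callers need no `unsafe`.
fn read_value(value: &u32) -> u32 {
    // SAFETY: a reference is always non-null, aligned, and points to
    // initialized memory that outlives this call.
    unsafe { read_raw(value as *const u32) }
}
```

The `# Safety` section states what callers must guarantee, and each `// SAFETY:` comment records why a particular use meets those guarantees.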

Just as important as safety, Rust support needs to be convenient and helpful for developers to use. Let’s get into a few examples of how Rust can assist kernel developers in writing drivers that are safe and correct.

Example driver

We'll use an implementation of a semaphore character device. Each device has a current value; writes of n bytes result in the device value being incremented by n; reads decrement the value by 1 unless the value is 0, in which case they will block until they can decrement the count without going below 0.

Suppose semaphore is a file representing our device. We can interact with it from the shell as follows:

> cat semaphore

When semaphore is a newly initialized device, the command above will block because the device's current value is 0. It will be unblocked if we run the following command from another shell because it increments the value by 1, which allows the original read to complete:

> echo -n a > semaphore

We could also increment the count by more than 1 if we write more data, for example:

> echo -n abc > semaphore

increments the count by 3, so the next 3 reads won't block.

To allow us to show a few more aspects of Rust, we'll add the following features to our driver: remember what the maximum value was throughout the lifetime of a device, and remember how many reads each file issued on the device.
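
As a rough illustration of these semantics, here is a user-space model of the device state. It is a sketch only: the `Model` type and its method names are hypothetical, blocking is represented by returning `false`, and the per-file read count is left out.

```rust
/// User-space model of the device state (names are hypothetical).
#[derive(Debug, Default)]
struct Model {
    /// Current value of the semaphore.
    count: usize,
    /// Largest value seen over the lifetime of the device.
    max_seen: usize,
}

impl Model {
    /// A write of `n` bytes adds `n` to the count and updates the maximum.
    fn write(&mut self, n: usize) {
        self.count = self.count.saturating_add(n);
        self.max_seen = self.max_seen.max(self.count);
    }

    /// A read consumes one unit. Returns `false` in the case where the
    /// real device would block (the count is 0).
    fn try_read(&mut self) -> bool {
        if self.count == 0 {
            return false;
        }
        self.count -= 1;
        true
    }
}
```

Writing three bytes to a fresh model allows exactly three reads to succeed before the next one would block, matching the shell session above.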

We'll now show how such a driver would be implemented in Rust, contrasting it with a C implementation. Note, however, that this work is at an early stage, so everything shown here is subject to change. The aspect we'd like to emphasize is how Rust can assist the developer: at compile time it allows us to eliminate or greatly reduce the chances of introducing whole classes of bugs, while remaining flexible and having minimal overhead.

Character devices

A developer needs to do the following to implement a driver for a new character device in Rust:

Implement the FileOperations trait: all associated functions are optional, so the developer only needs to implement the relevant ones for their scenario. They relate to the fields in C's struct file_operations.
Implement the FileOpener trait: it is a type-safe equivalent to C's open field of struct file_operations.
Register the new device type with the kernel: this lets the kernel know what functions need to be called in response to files of this new type being operated on.
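
The idea of a trait whose functions are all optional can be sketched in plain Rust with default method bodies. This is a simplified stand-in, not the kernel crate's API: `FileOps`, `ZeroDevice`, and the `Error` enum are hypothetical names chosen for illustration.

```rust
/// Hypothetical error type standing in for kernel error codes.
#[derive(Debug, PartialEq)]
enum Error {
    EINVAL,
}

/// Simplified stand-in for a trait whose functions are all optional:
/// each one has a default body that reports "not supported".
trait FileOps {
    fn read(&self, _buf: &mut Vec<u8>) -> Result<usize, Error> {
        Err(Error::EINVAL)
    }

    fn write(&self, _data: &[u8]) -> Result<usize, Error> {
        Err(Error::EINVAL)
    }
}

/// A device that only supports reads; `write` keeps the default body.
struct ZeroDevice;

impl FileOps for ZeroDevice {
    fn read(&self, buf: &mut Vec<u8>) -> Result<usize, Error> {
        buf.push(0);
        Ok(1)
    }
}
```

Only the functions relevant to the device are written out; everything else falls back to the default.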

The following listings outline how the first two steps of our example compare in Rust (first) and C (second):

impl FileOpener<Arc<Semaphore>> for FileState {
    fn open(
        shared: &Arc<Semaphore>
    ) -> KernelResult<Box<Self>> {
        [...]
    }
}
 
impl FileOperations for FileState {
    type Wrapper = Box<Self>;
 
    fn read(
        &self,
        _: &File,
        data: &mut UserSlicePtrWriter,
        offset: u64
    ) -> KernelResult<usize> {
        [...]
    }
 
    fn write(
        &self,
        data: &mut UserSlicePtrReader,
        _offset: u64
    ) -> KernelResult<usize> {
        [...]
    }
 
    fn ioctl(
        &self,
        file: &File,
        cmd: &mut IoctlCommand
    ) -> KernelResult<i32> {
        [...]
    }
 
    fn release(_obj: Box<Self>, _file: &File) {
        [...]
    }
 
    declare_file_operations!(read, write, ioctl);
}

static 
int semaphore_open(struct inode *nodp,
                   struct file *filp)
{
    struct semaphore_state *shared =
        container_of(filp->private_data,
                     struct semaphore_state,
                     miscdev);
    [...]
}
 
static
ssize_t semaphore_write(struct file *filp,
                        const char __user *buffer,
                        size_t count, loff_t *ppos)
{
    struct file_state *state = filp->private_data;
    [...]
}
 
static
ssize_t semaphore_read(struct file *filp,
                       char __user *buffer,
                       size_t count, loff_t *ppos)
{
    struct file_state *state = filp->private_data;
    [...]
}
 
static
long semaphore_ioctl(struct file *filp,
                     unsigned int cmd,
                     unsigned long arg)
{
    struct file_state *state = filp->private_data;
    [...]
}
 
static
int semaphore_release(struct inode *nodp,
                      struct file *filp)
{
    struct file_state *state = filp->private_data;
    [...]
}
 
static const struct file_operations semaphore_fops = {
        .owner = THIS_MODULE,
        .open = semaphore_open,
        .read = semaphore_read,
        .write = semaphore_write,
        .compat_ioctl = semaphore_ioctl,
        .release = semaphore_release,
};

Character devices in Rust benefit from a number of safety features:

Per-file state lifetime management: FileOpener::open returns an object whose lifetime is owned by the caller from then on. Any object that implements the PointerWrapper trait can be returned, and we provide implementations for Box<T> and Arc<T>, so developers that use Rust's idiomatic heap-allocated or reference-counted pointers have no additional requirements.

All associated functions in FileOperations receive non-mutable references to self (more about this below), except the release function, which is the last function to be called and receives the plain object back (and its ownership with it). The release implementation can then defer the object destruction by transferring its ownership elsewhere, or destroy it then; in the case of a reference-counted object, 'destruction' means decrementing the reference count (and actual object destruction if the count goes to zero).

That is, we use Rust's ownership discipline when interacting with C code by handing the C portion ownership of a Rust object, allowing it to call functions implemented in Rust, then eventually giving ownership back. So as long as the C code is correct, the lifetime of Rust file objects works seamlessly as well, with the compiler enforcing correct lifetime management on the Rust side. For example, open cannot return stack-allocated pointers or heap-allocated objects containing pointers to the stack, and ioctl/read/write cannot free (or modify without synchronization) the contents of the object stored in filp->private_data.
Non-mutable references: the associated functions called between open and release all receive non-mutable references to self because they can be called concurrently by multiple threads and Rust aliasing rules prohibit more than one mutable reference to an object at any given time.

If a developer needs to modify some state (and they generally do), they can do so via interior mutability: mutable state can be wrapped in a Mutex<T> or SpinLock<T> (or atomics) and safely modified through them.

This prevents, at compile-time, bugs where a developer fails to acquire the appropriate lock when accessing a field (the field is inaccessible), or when a developer fails to wrap a field with a lock (the field is read-only).
Per-device state: when file instances need to share per-device state, which is a very common occurrence in drivers, they can do so safely in Rust. When a device is registered, a typed object can be provided and a non-mutable reference to it is provided when FileOpener::open is called. In our example, the shared object is wrapped in Arc<T>, so files can safely clone and hold on to a reference to it.

The reason FileOpener is its own trait (as opposed to, for example, open being part of the FileOperations trait) is to allow a single file implementation to be registered in different ways.

This eliminates opportunities for developers to get the wrong data when trying to retrieve shared state. For example, in C when a miscdevice is registered, a pointer to it is available in filp->private_data; when a cdev is registered, a pointer to it is available in inode->i_cdev. These structs are usually embedded in an outer struct that contains the shared state, so developers usually use the container_of macro to recover the shared state. Rust encapsulates all of this and the potentially troublesome pointer casts in a safe abstraction.
Static typing: we take advantage of Rust's support for generics to implement all of the above functions and types with static types. So there are no opportunities for a developer to convert an untyped variable or field to the wrong type. The C code in the listing above has casts from an essentially untyped (void *) pointer to the desired type at the start of each function: this is likely to work fine when first written, but may lead to bugs as the code evolves and assumptions change. Rust would catch any such mistakes at compile time.
File operations: as we mentioned before, a developer needs to implement the FileOperations trait to customize the behavior of their device. They do this with a block starting with impl FileOperations for Device, where Device is the type implementing the file behavior (FileState in our example). Once inside this block, tools know that only a limited number of functions can be defined, so they can automatically insert the prototypes. (Personally, I use neovim and the rust-analyzer LSP server.)

While we use this trait in Rust, the C portion of the kernel still requires an instance of struct file_operations. The kernel crate automatically generates one from the trait implementation (and optionally the declare_file_operations macro): although it has code to generate the correct struct, it is all const, so evaluated at compile-time with zero runtime cost.
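
The ownership hand-off described under per-file state lifetime management can be sketched in user-space Rust with Box::into_raw and Box::from_raw. This is an assumption-laden illustration, not the kernel crate's PointerWrapper implementation: the free functions below are hypothetical, and the "C side" is simply whoever holds the opaque pointer.

```rust
use std::ffi::c_void;
use std::sync::Arc;

/// Per-file state; `shared` stands in for the per-device object.
struct FileState {
    shared: Arc<String>,
}

/// "open": allocate the state and give the C side an opaque pointer to it.
fn open(shared: &Arc<String>) -> *mut c_void {
    let state = Box::new(FileState { shared: shared.clone() });
    Box::into_raw(state) as *mut c_void
}

/// "read"/"ioctl": borrow the state; ownership stays with the C side.
///
/// # Safety
///
/// `ptr` must come from `open` and must not yet have been passed to `release`.
unsafe fn device_name(ptr: *mut c_void) -> String {
    // SAFETY: per the contract above, `ptr` points to a live `FileState`.
    let state = unsafe { &*(ptr as *const FileState) };
    state.shared.as_str().to_owned()
}

/// "release": take ownership back; dropping the box frees the state.
///
/// # Safety
///
/// `ptr` must come from `open`, and this must be the last use of it.
unsafe fn release(ptr: *mut c_void) {
    // SAFETY: per the contract above, we are the sole owner of the allocation.
    drop(unsafe { Box::from_raw(ptr as *mut FileState) });
}
```

Dropping the box in `release` also drops the cloned Arc, so the reference count on the shared object falls back without any explicit bookkeeping.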

Ioctl handling

For a driver to provide a custom ioctl handler, it needs to implement the ioctl function that is part of the FileOperations trait, as shown in the Rust listing below (the C equivalent follows it).

fn ioctl(
    &self,
    file: &File,
    cmd: &mut IoctlCommand
) -> KernelResult<i32> {
    cmd.dispatch(self, file)
}
 
impl IoctlHandler for FileState {
    fn read(
        &self,
        _file: &File,
        cmd: u32,
        writer: &mut UserSlicePtrWriter
    ) -> KernelResult<i32> {
        match cmd {
            IOCTL_GET_READ_COUNT => {
                writer.write(
                    &self
                    .read_count
                    .load(Ordering::Relaxed))?;
                Ok(0)
            }
            _ => Err(Error::EINVAL),
        }
    }
 
    fn write(
        &self,
        _file: &File,
        cmd: u32,
        reader: &mut UserSlicePtrReader
    ) -> KernelResult<i32> {
        match cmd {
            IOCTL_SET_READ_COUNT => {
                self
                .read_count
                .store(reader.read()?,
                       Ordering::Relaxed);
                Ok(0)
            }
            _ => Err(Error::EINVAL),
        }
    }
}

#define IOCTL_GET_READ_COUNT _IOR('c', 1, u64)
#define IOCTL_SET_READ_COUNT _IOW('c', 1, u64)
 
static
long semaphore_ioctl(struct file *filp,
                     unsigned int cmd,
                     unsigned long arg)
{
    struct file_state *state = filp->private_data;
    void __user *buffer = (void __user *)arg;
    u64 value;
 
    switch (cmd) {
    case IOCTL_GET_READ_COUNT:
        value = atomic64_read(&state->read_count);
        if (copy_to_user(buffer, &value, sizeof(value)))
            return -EFAULT;
        return 0;
    case IOCTL_SET_READ_COUNT:
        if (copy_from_user(&value, buffer, sizeof(value)))
            return -EFAULT;
        atomic64_set(&state->read_count, value);
        return 0;
    default:
        return -EINVAL;
    }
}

Ioctl commands are standardized such that, given a command, we know whether a user buffer is provided, its intended use (read, write, both, none), and its size. In Rust, we provide a dispatcher (accessible by calling cmd.dispatch) that uses this information to automatically create user memory access helpers and pass them to the caller.
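
To make the encoding concrete, the sketch below packs and unpacks command numbers, assuming the asm-generic layout (8 bits of number, 8 bits of type, 14 bits of size, 2 bits of direction); some architectures use a different layout. The helper names `ioc`, `ioc_dir`, and `ioc_size` are hypothetical and only mirror the C macros _IOC, _IOC_DIR, and _IOC_SIZE.

```rust
use std::mem::size_of;

/// Direction values in the asm-generic ioctl encoding.
const IOC_WRITE: u32 = 1; // user space writes, the driver reads the buffer
const IOC_READ: u32 = 2; // user space reads, the driver writes the buffer

/// Packs the fields the way `_IOC(dir, type, nr, size)` does.
const fn ioc(dir: u32, ty: u8, nr: u8, size: u32) -> u32 {
    (dir << 30) | (size << 16) | ((ty as u32) << 8) | (nr as u32)
}

/// Direction field: bits 30 and 31.
const fn ioc_dir(cmd: u32) -> u32 {
    cmd >> 30
}

/// Size field: bits 16 to 29.
const fn ioc_size(cmd: u32) -> usize {
    ((cmd >> 16) & 0x3fff) as usize
}

/// Counterparts of `_IOR('c', 1, u64)` and `_IOW('c', 1, u64)`.
const IOCTL_GET_READ_COUNT: u32 = ioc(IOC_READ, b'c', 1, size_of::<u64>() as u32);
const IOCTL_SET_READ_COUNT: u32 = ioc(IOC_WRITE, b'c', 1, size_of::<u64>() as u32);
```

A dispatcher can recover the direction and buffer size from the command alone, which is the information it needs to build a reader or writer of the right length.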

A driver is not required to use this though. If, for example, it doesn't use the standard ioctl encoding, Rust offers the flexibility of simply calling cmd.raw to extract the raw arguments and using them to handle the ioctl (potentially with unsafe code, which will need to be justified).

However, if a driver implementation does use the standard dispatcher, it will benefit from not having to implement any unsafe code, and:

The pointer to user memory is never a native pointer, so the developer cannot accidentally dereference it.
The types that allow the driver to read from user space only allow data to be read once, so we eliminate the risk of time-of-check to time-of-use (TOCTOU) bugs because when a driver needs to access data twice, it needs to copy it to kernel memory, where an attacker is not allowed to modify it. Excluding unsafe blocks, there is no way to introduce this class of bugs in Rust.
No accidental overflow of the user buffer: we'll never read or write past the end of the user buffer because this is enforced automatically based on the size encoded in the ioctl command. In our example above, the implementation of IOCTL_GET_READ_COUNT only has access to an instance of UserSlicePtrWriter, which limits the number of writable bytes to sizeof(u64) as encoded in the ioctl command.
No mixing of reads and writes: we'll never write buffers for ioctls that are only meant to read and never read buffers for ioctls that are only meant to write. This is enforced by read and write handlers only getting instances of UserSlicePtrWriter and UserSlicePtrReader respectively.
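
The read-once and bounds-checking properties can be modeled in user space. In this sketch the "user buffer" is an ordinary byte slice and `OnceReader` is a hypothetical type, not the kernel's UserSlicePtrReader; it only shows how an API can make re-reading and over-reading impossible without unsafe code.

```rust
/// Hypothetical error type standing in for kernel error codes.
#[derive(Debug, PartialEq)]
enum Error {
    EFAULT,
}

/// Reader over a bounded buffer. It never exposes a pointer, and every
/// read advances past the bytes it returned.
struct OnceReader<'a> {
    remaining: &'a [u8],
}

impl<'a> OnceReader<'a> {
    fn new(buf: &'a [u8]) -> Self {
        OnceReader { remaining: buf }
    }

    /// Number of bytes that can still be read.
    fn len(&self) -> usize {
        self.remaining.len()
    }

    /// Copies the next 8 bytes out and advances, so the same bytes cannot
    /// be read a second time. Fails instead of reading past the end.
    fn read_u64(&mut self) -> Result<u64, Error> {
        let buf: &'a [u8] = self.remaining;
        if buf.len() < 8 {
            return Err(Error::EFAULT);
        }
        let (head, tail) = buf.split_at(8);
        self.remaining = tail;
        let mut bytes = [0u8; 8];
        bytes.copy_from_slice(head);
        Ok(u64::from_ne_bytes(bytes))
    }
}
```

A handler that needs a value twice has to keep the copy returned by `read_u64`, which lives in its own memory and cannot change underneath it.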

All of the above could potentially also be done in C, but it's very easy for developers to (likely unintentionally) break contracts that lead to unsafety; Rust requires unsafe blocks for this, which should only be used in rare cases and brings additional scrutiny. Additionally, Rust offers the following:

The types used to read and write user memory do not implement the Send and Sync traits, which means that they (and pointers to them) are not safe to be used in another thread context. In Rust, if a driver developer attempted to write code that passed one of these objects to another thread (where it wouldn't be safe to use them because it isn't necessarily in the right memory manager context), they would get a compilation error.
When calling IoctlCommand::dispatch, one might understandably think that we need dynamic dispatching to reach the actual handler implementation (which would incur additional cost in comparison to C), but we don't. Our usage of generics will lead the compiler to monomorphize the function, which will result in static function calls that can even be inlined if the optimizer so chooses.
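
The difference between static and dynamic dispatch can be shown with a small, self-contained sketch. The `Handler` trait and both dispatch functions are hypothetical names for illustration; they are not the kernel crate's IoctlHandler or IoctlCommand::dispatch.

```rust
/// Hypothetical handler trait.
trait Handler {
    fn handle(&self, cmd: u32) -> i32;
}

struct Doubler;

impl Handler for Doubler {
    fn handle(&self, cmd: u32) -> i32 {
        (cmd * 2) as i32
    }
}

/// Generic version: the compiler emits one copy per concrete `H`, so the
/// call to `handle` is a direct call that the optimizer may inline.
fn dispatch_static<H: Handler>(handler: &H, cmd: u32) -> i32 {
    handler.handle(cmd)
}

/// Trait-object version: the call goes through a vtable at run time.
fn dispatch_dyn(handler: &dyn Handler, cmd: u32) -> i32 {
    handler.handle(cmd)
}
```

Both functions return the same result; they differ only in how the call to `handle` is resolved.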

Locking and condition variables

We allow developers to use mutexes and spinlocks to provide interior mutability. In our example, we use a mutex to protect mutable data; in the listings below (Rust first, then C) we show the data structures we use, and how we implement a wait until the count is nonzero so that we can satisfy a read:

struct SemaphoreInner {
    count: usize,
    max_seen: usize,
}
 
struct Semaphore {
    changed: CondVar,
    inner: Mutex<SemaphoreInner>,
}
 
struct FileState {
    read_count: AtomicU64,
    shared: Arc<Semaphore>,
}

struct semaphore_state {
    struct kref ref;
    struct miscdevice miscdev;
    wait_queue_head_t changed;
    struct mutex mutex;
    size_t count;
    size_t max_seen;
};
 
struct file_state {
    atomic64_t read_count;
    struct semaphore_state *shared;
};

fn consume(&self) -> KernelResult {
    let mut inner = self.shared.inner.lock();
    while inner.count == 0 {
        if self.shared.changed.wait(&mut inner) {
            return Err(Error::EINTR);
        }
    }
    inner.count -= 1;
    Ok(())
}

static int semaphore_consume(
    struct semaphore_state *state)
{
    DEFINE_WAIT(wait);
 
    mutex_lock(&state->mutex);
    while (state->count == 0) {
        prepare_to_wait(&state->changed, &wait,
                        TASK_INTERRUPTIBLE);
        mutex_unlock(&state->mutex);
        schedule();
        finish_wait(&state->changed, &wait);
        if (signal_pending(current))
            return -EINTR;
        mutex_lock(&state->mutex);
    }
 
    state->count--;
    mutex_unlock(&state->mutex);
 
    return 0;
}

We note that such waits are not uncommon in the existing C code: for example, a pipe waiting for a "partner" to write, a unix-domain socket waiting for data, an inode search waiting for completion of a delete, or a user-mode helper waiting for state change.

The following are benefits from the Rust implementation:

The Semaphore::inner field is only accessible when the lock is held, through the guard returned by the lock function. So developers cannot accidentally read or write protected data without locking it first. In the C example above, count and max_seen in semaphore_state are protected by mutex, but there is no enforcement that the lock is held while they're accessed.
Resource Acquisition Is Initialization (RAII): the lock is unlocked automatically when the guard (inner in this case) goes out of scope. This ensures that locks are always unlocked: if the developer needs to keep a lock locked, they can keep the guard alive, for example, by returning the guard itself; conversely, if they need to unlock before the end of the scope, they can explicitly do it by calling the drop function.
Developers can use any lock that implements the Lock trait, which includes Mutex and SpinLock, at no additional runtime cost when compared to a C implementation. Other synchronization constructs, including condition variables, also work transparently and with zero additional run-time cost.
Rust implements condition variables using kernel wait queues. This allows developers to benefit from atomic release of the lock and putting the thread to sleep without having to reason about low-level kernel scheduler functions. In the C example above, semaphore_consume is a mix of semaphore logic and subtle Linux scheduling: for example, the code is incorrect if mutex_unlock is called before prepare_to_wait because it may result in a wake up being missed.
No unsynchronized access: as we mentioned before, variables shared by multiple threads/CPUs must be read-only, with interior mutability being the solution for cases when mutability is needed. In addition to the example with locks above, the ioctl example in the previous section also has an example of using an atomic variable; Rust also requires developers to specify how memory is to be synchronized by atomic accesses. In the C part of the example, we happen to use atomic64_t, but the compiler won't alert a developer to this need.
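
The same locking pattern can be tried out in user space with the standard library's Mutex and Condvar. This is an analogue under stated assumptions, not the kernel crate's types: std's `wait` takes and returns the guard, lock poisoning is unwrapped, and interruption by signals (the EINTR path) is not modeled.

```rust
use std::sync::{Condvar, Mutex};

struct Inner {
    count: usize,
    max_seen: usize,
}

struct Semaphore {
    changed: Condvar,
    inner: Mutex<Inner>,
}

impl Semaphore {
    fn new() -> Self {
        Semaphore {
            changed: Condvar::new(),
            inner: Mutex::new(Inner { count: 0, max_seen: 0 }),
        }
    }

    /// Blocks until the count is nonzero, then decrements it.
    fn consume(&self) {
        // The protected fields are reachable only through the guard.
        let mut inner = self.inner.lock().unwrap();
        while inner.count == 0 {
            // Atomically releases the lock and sleeps; reacquires on wake-up.
            inner = self.changed.wait(inner).unwrap();
        }
        inner.count -= 1;
        // The guard goes out of scope here, which unlocks the mutex.
    }

    /// Adds `n` to the count and wakes all waiters.
    fn add(&self, n: usize) {
        {
            let mut inner = self.inner.lock().unwrap();
            inner.count = inner.count.saturating_add(n);
            if inner.count > inner.max_seen {
                inner.max_seen = inner.count;
            }
        } // the lock is released at the end of this block
        self.changed.notify_all();
    }
}
```

One thread can call `consume` while another calls `add`; whichever runs first, the waiter completes once the count becomes nonzero.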

Error handling and control flow

In the listings below, we show how read, write, and open are implemented in our example driver; each Rust version is followed by its C counterpart:

fn read(
    &self,
    _: &File,
    data: &mut UserSlicePtrWriter,
    offset: u64
) -> KernelResult<usize> {
    if data.is_empty() || offset > 0 {
        return Ok(0);
    }
 
    self.consume()?;
    data.write_slice(&[0u8; 1])?;
    self.read_count.fetch_add(1, Ordering::Relaxed);
    Ok(1)
}

static
ssize_t semaphore_read(struct file *filp,
                       char __user *buffer,
                       size_t count, loff_t *ppos)
{
    struct file_state *state = filp->private_data;
    char c = 0;
    int ret;
 
    if (count == 0 || *ppos > 0)
        return 0;
 
    ret = semaphore_consume(state->shared);
    if (ret)
        return ret;
 
    if (copy_to_user(buffer, &c, sizeof(c)))
        return -EFAULT;
 
    atomic64_add(1, &state->read_count);
    *ppos += 1;
    return 1;
}

fn write(
    &self,
    data: &mut UserSlicePtrReader,
    _offset: u64
) -> KernelResult<usize> {
    {
        let mut inner = self.shared.inner.lock();
        inner.count = inner.count.saturating_add(data.len());
        if inner.count > inner.max_seen {
            inner.max_seen = inner.count;
        }
    }
 
    self.shared.changed.notify_all();
    Ok(data.len())
}

static
ssize_t semaphore_write(struct file *filp,
                        const char __user *buffer,
                        size_t count, loff_t *ppos)
{
    struct file_state *state = filp->private_data;
    struct semaphore_state *shared = state->shared;
 
    mutex_lock(&shared->mutex);
    shared->count += count;
    if (shared->count < count)
        shared->count = SIZE_MAX;
 
    if (shared->count > shared->max_seen)
        shared->max_seen = shared->count;
 
    mutex_unlock(&shared->mutex);
 
    wake_up_all(&shared->changed);
    return count;
}

fn open(
    shared: &Arc<Semaphore>
) -> KernelResult<Box<Self>> {
    Ok(Box::try_new(Self {
        read_count: AtomicU64::new(0),
        shared: shared.clone(),
    })?)
}

static 
int semaphore_open(struct inode *nodp,
                   struct file *filp)
{
    struct semaphore_state *shared =
        container_of(filp->private_data,
                     struct semaphore_state,
                     miscdev);
    struct file_state *state;
 
    state = kzalloc(sizeof(*state), GFP_KERNEL);
    if (!state)
        return -ENOMEM;
 
    kref_get(&shared->ref);
    state->shared = shared;
    atomic64_set(&state->read_count, 0);
 
    filp->private_data = state;
 
    return 0;
}

They illustrate other benefits brought by Rust:

The ? operator: it is used by the Rust open and read implementations to do error handling implicitly; the developer can focus on the semaphore logic, the resulting code being quite small and readable. The C versions have error-handling noise that can make them less readable.
Required initialization: Rust requires all fields of a struct to be initialized on construction, so the developer can never accidentally fail to initialize a field; C offers no such facility. In our open example above, the developer of the C version could easily fail to call kref_get (even though all fields would have been initialized); in Rust, the user is required to call clone (which increments the ref count), otherwise they get a compilation error.
RAII scoping: the Rust write implementation uses a statement block to control when inner goes out of scope and therefore the lock is released.
Integer overflow behavior: Rust encourages developers to always consider how overflows should be handled. In our write example, we want a saturating addition so that we don't end up with a zero value when adding to our semaphore. In C, we need to check for overflows manually; the compiler offers no additional support.
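The points above can also be tried outside the kernel. The following is a minimal userspace sketch, assuming std::sync::Mutex and Condvar as stand-ins for the kernel abstractions; the names Semaphore, add, try_consume, and consume_two are hypothetical and are not part of the driver shown above.

```rust
use std::sync::{Arc, Condvar, Mutex};

/// Shared state, loosely mirroring the `count` and `max_seen` fields above.
struct Inner {
    count: usize,
    max_seen: usize,
}

/// Hypothetical userspace stand-in for the driver's shared semaphore.
pub struct Semaphore {
    inner: Mutex<Inner>,
    changed: Condvar,
}

impl Semaphore {
    /// Every field must be given a value here, or the code does not compile.
    pub fn new() -> Arc<Self> {
        Arc::new(Self {
            inner: Mutex::new(Inner { count: 0, max_seen: 0 }),
            changed: Condvar::new(),
        })
    }

    /// Adds `n` to the count, saturating instead of wrapping around.
    pub fn add(&self, n: usize) {
        {
            let mut inner = self.inner.lock().unwrap();
            inner.count = inner.count.saturating_add(n);
            if inner.count > inner.max_seen {
                inner.max_seen = inner.count;
            }
        } // The guard `inner` is dropped here, which releases the lock.
        self.changed.notify_all();
    }

    /// Takes one unit if available; returns an error instead of blocking.
    pub fn try_consume(&self) -> Result<(), &'static str> {
        let mut inner = self.inner.lock().map_err(|_| "lock poisoned")?;
        if inner.count == 0 {
            return Err("would block");
        }
        inner.count -= 1;
        Ok(())
    }

    pub fn count(&self) -> usize {
        self.inner.lock().unwrap().count
    }

    pub fn max_seen(&self) -> usize {
        self.inner.lock().unwrap().max_seen
    }
}

/// Consumes two units; `?` forwards the first failure to the caller.
pub fn consume_two(sem: &Semaphore) -> Result<(), &'static str> {
    sem.try_consume()?;
    sem.try_consume()?;
    Ok(())
}
```

Calling add(usize::MAX) followed by add(5) leaves the count at usize::MAX rather than wrapping to a small value, and consume_two returns early with an error as soon as one unit is unavailable.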

What's next

The examples above are only a small part of the whole project. We hope they give readers a glimpse of the kinds of benefits that Rust brings. At the moment we have nearly all generic kernel functionality needed by Binder neatly wrapped in safe Rust abstractions, so we are in the process of gathering feedback from the broader Linux kernel community with the intent of upstreaming the existing Rust support.

We also continue to make progress on our Binder prototype, implement additional abstractions, and smooth out some rough edges. This is an exciting time and a rare opportunity to potentially influence how the Linux kernel is developed, as well as inform the evolution of the Rust language. We invite those interested to join us in Rust for Linux and attend our planned talk at Linux Plumbers Conference 2021!

Thanks Nick Desaulniers, Kees Cook, and Adrian Taylor for contributions to this post. Special thanks to Jeff Vander Stoep for contributions and editing, and to Greg Kroah-Hartman for reviewing and contributing to the code examples.

Rust in the Android platform

April 6, 2021

Posted by Jeff Vander Stoep and Stephen Hines, Android Team

Correctness of code in the Android platform is a top priority for the security, stability, and quality of each Android release. Memory safety bugs in C and C++ continue to be the most-difficult-to-address source of incorrectness. We invest a great deal of effort and resources into detecting, fixing, and mitigating this class of bugs, and these efforts are effective in preventing a large number of bugs from making it into Android releases. Yet in spite of these efforts, memory safety bugs continue to be a top contributor of stability issues, and consistently represent ~70% of Android’s high severity security vulnerabilities.

In addition to ongoing and upcoming efforts to improve detection of memory bugs, we are ramping up efforts to prevent them in the first place. Memory-safe languages are the most cost-effective means for preventing memory bugs. In addition to memory-safe languages like Kotlin and Java, we’re excited to announce that the Android Open Source Project (AOSP) now supports the Rust programming language for developing the OS itself.

Systems programming

Managed languages like Java and Kotlin are the best option for Android app development. These languages are designed for ease of use, portability, and safety. The Android Runtime (ART) manages memory on behalf of the developer. The Android OS uses Java extensively, effectively protecting large portions of the Android platform from memory bugs. Unfortunately, for the lower layers of the OS, Java and Kotlin are not an option.

Lower levels of the OS require systems programming languages like C, C++, and Rust. These languages are designed with control and predictability as goals. They provide access to low level system resources and hardware. They are light on resources and have more predictable performance characteristics.

For C and C++, the developer is responsible for managing memory lifetime. Unfortunately, it's easy to make mistakes when doing this, especially in complex and multithreaded codebases.

Rust provides memory safety guarantees by using a combination of compile-time checks to enforce object lifetime/ownership and runtime checks to ensure that memory accesses are valid. This safety is achieved while providing equivalent performance to C and C++.
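As a small illustration of the two kinds of checks, here is a sketch using only the Rust standard library; the function names byte_at, consume, and demo are made up for this example.

```rust
/// Returns the byte at `index`, or `None` if `index` is out of range.
/// `slice::get` performs the bounds check at run time.
pub fn byte_at(buf: &[u8], index: usize) -> Option<u8> {
    buf.get(index).copied()
}

/// Takes ownership of `buf`; the caller can no longer use it afterwards.
pub fn consume(buf: Vec<u8>) -> usize {
    buf.len()
} // `buf` goes out of scope here and its memory is freed exactly once.

pub fn demo() -> (Option<u8>, Option<u8>, usize) {
    let buf = vec![10u8, 20, 30];

    // Run-time check: an out-of-range index yields `None` instead of
    // reading past the end of the allocation. (Plain indexing, `buf[3]`,
    // would panic rather than read invalid memory.)
    let in_range = byte_at(&buf, 1);
    let out_of_range = byte_at(&buf, 3);

    // Compile-time check: ownership of `buf` moves into `consume`.
    let len = consume(buf);
    // Uncommenting the next line is rejected by the compiler (use after move):
    // let _ = buf.len();

    (in_range, out_of_range, len)
}
```

Here demo() returns (Some(20), None, 3): the in-range read succeeds, the out-of-range read is caught at run time, and the use-after-move never makes it past the compiler.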

The limits of sandboxing

C and C++ don’t provide these same safety guarantees and therefore require robust isolation. All Android processes are sandboxed, and we follow the Rule of 2 to decide whether functionality necessitates additional isolation and deprivileging. The Rule of 2 is simple: of three options (processing untrustworthy input, using an unsafe language such as C/C++, and running without a strict sandbox), developers may select only two.

For Android, this means that if code is written in C/C++ and parses untrustworthy input, it should be contained within a tightly constrained and unprivileged sandbox. While adherence to the Rule of 2 has been effective in reducing the severity and reachability of security vulnerabilities, it does come with limitations. Sandboxing is expensive: the new processes it requires consume additional overhead and introduce latency due to IPC and additional memory usage. Sandboxing doesn’t eliminate vulnerabilities from the code and its efficacy is reduced by high bug density, allowing attackers to chain multiple vulnerabilities together.
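The rule can be written down as a simple check. This sketch assumes the three options are untrustworthy input, an unsafe implementation language, and the absence of a sandbox, as the paragraph above implies; the Component type and its field names are purely illustrative.

```rust
/// The three properties the Rule of 2 considers (names are illustrative).
#[derive(Clone, Copy, Debug)]
pub struct Component {
    pub parses_untrustworthy_input: bool,
    pub written_in_unsafe_language: bool, // e.g. C or C++
    pub runs_without_sandbox: bool,
}

impl Component {
    /// True when at most two of the three risky properties hold.
    pub fn follows_rule_of_2(&self) -> bool {
        let risky = [
            self.parses_untrustworthy_input,
            self.written_in_unsafe_language,
            self.runs_without_sandbox,
        ];
        risky.iter().filter(|&&p| p).count() <= 2
    }
}
```

Under these assumptions, a C++ parser for untrustworthy input passes the check only when it is sandboxed, while the same parser written in a memory-safe language passes even without a sandbox.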

Memory-safe languages like Rust help us overcome these limitations in two ways:

Lowers the density of bugs within our code, which increases the effectiveness of our current sandboxing.
Reduces our sandboxing needs, allowing introduction of new features that are both safer and lighter on resources.

But what about all that existing C++?

Of course, introducing a new programming language does nothing to address bugs in our existing C/C++ code. Even if we redirected the efforts of every software engineer on the Android team, rewriting tens of millions of lines of code is simply not feasible.

An analysis of the age of memory safety bugs in Android (measured from when they were first introduced) demonstrates why our memory-safe language efforts are best focused on new development and not on rewriting mature C/C++ code. Most of our memory bugs occur in new or recently modified code, with about 50% being less than a year old.

The comparative rarity of older memory bugs may come as a surprise to some, but we’ve found that old code is not where we most urgently need improvement. Software bugs are found and fixed over time, so we would expect the number of bugs in code that is being maintained but not actively developed to go down over time. Just as reducing the number and density of bugs improves the effectiveness of sandboxing, it also improves the effectiveness of bug detection.

Limitations of detection

Bug detection via robust testing, sanitization, and fuzzing is crucial for improving the quality and correctness of all software, including software written in Rust. A key limitation for the most effective memory safety detection techniques is that the erroneous state must actually be triggered in instrumented code in order to be detected. Even in code bases with excellent test/fuzz coverage, this results in a lot of bugs going undetected.

Another limitation is that bug detection is scaling faster than bug fixing. In some projects, bugs that are being detected are not always getting fixed. Bug fixing is a long and costly process.

Each of these steps is costly, and missing any one of them can result in the bug going unpatched for some or all users. For complex C/C++ code bases, often there are only a handful of people capable of developing and reviewing the fix, and even with a high amount of effort spent on fixing bugs, sometimes the fixes are incorrect.

Bug detection is most effective when bugs are relatively rare and dangerous bugs can be given the urgency and priority that they merit. Our ability to reap the benefits of improvements in bug detection requires that we prioritize preventing the introduction of new bugs.

Prioritizing prevention

Rust modernizes a range of other language aspects, which results in improved correctness of code:

Memory safety - enforces memory safety through a combination of compiler and run-time checks.
Data concurrency - prevents data races. The ease with which this allows users to write efficient, thread-safe code has given rise to Rust’s Fearless Concurrency slogan.
More expressive type system - helps prevent logical programming bugs (e.g. newtype wrappers, enum variants with contents).
References and variables are immutable by default - assist the developer in following the security principle of least privilege, marking a reference or variable mutable only when they actually intend it to be so. While C++ has const, it tends to be used infrequently and inconsistently. In comparison, the Rust compiler assists in avoiding stray mutability annotations by offering warnings for mutable values which are never mutated.
Better error handling in standard libraries - wrap potentially failing calls in Result, which causes the compiler to require that users check for failures even for functions which do not return a needed value. This protects against bugs like the Rage Against the Cage vulnerability which resulted from an unhandled error. By making it easy to propagate errors via the ? operator and optimizing Result for low overhead, Rust encourages users to write their fallible functions in the same style and receive the same protection.
Initialization - requires that all variables be initialized before use. Uninitialized memory vulnerabilities have historically been the root cause of 3-5% of security vulnerabilities on Android. In Android 11, we started auto initializing memory in C/C++ to reduce this problem. However, initializing to zero is not always safe, particularly for things like return values, where this could become a new source of faulty error handling. Rust requires every variable be initialized to a legal member of its type before use, avoiding the issue of unintentionally initializing to an unsafe value. Similar to Clang for C/C++, the Rust compiler is aware of the initialization requirement, and avoids any potential performance overhead of double initialization.
Safer integer handling - Overflow sanitization is on for Rust debug builds by default, encouraging programmers to specify a wrapping_add if they truly intend a calculation to overflow or saturating_add if they don’t. We intend to enable overflow sanitization for all builds in Android. Further, all integer type conversions are explicit casts: developers cannot accidentally cast during a function call, when assigning to a variable, or when attempting to do arithmetic with other types.
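Several of the items above can be seen together in a few lines of standard-library Rust. The sketch below is only an illustration; UserId, Request, ParseError, and the functions are hypothetical names, not Android code.

```rust
use std::convert::TryFrom;

/// Newtype wrapper: a `UserId` cannot be confused with any other `u32`.
#[derive(Debug, PartialEq, Clone, Copy)]
pub struct UserId(pub u32);

pub fn is_root(id: UserId) -> bool {
    id.0 == 0
}

/// Enum variants with contents: each variant carries only the data valid for it.
#[derive(Debug, PartialEq)]
pub enum Request {
    Read { len: usize },
    Write(Vec<u8>),
    Close,
}

/// The match must cover every variant, or the code does not compile.
pub fn payload_len(req: &Request) -> usize {
    match req {
        Request::Read { len } => *len,
        Request::Write(bytes) => bytes.len(),
        Request::Close => 0,
    }
}

#[derive(Debug, PartialEq)]
pub enum ParseError {
    Empty,
    NotANumber,
    TooLarge,
}

/// Parses a decimal length that must fit in a `u16`.
pub fn parse_len(text: &str) -> Result<u16, ParseError> {
    if text.is_empty() {
        return Err(ParseError::Empty);
    }
    let wide: u64 = text.parse().map_err(|_| ParseError::NotANumber)?;
    // Narrowing is explicit and fallible; there is no silent truncation.
    u16::try_from(wide).map_err(|_| ParseError::TooLarge)
}

/// `?` propagates any parse failure; the sum saturates instead of wrapping.
pub fn total_len(a: &str, b: &str) -> Result<u16, ParseError> {
    let x = parse_len(a)?;
    let y = parse_len(b)?;
    Ok(x.saturating_add(y))
}

/// The same addition with three explicitly chosen overflow behaviors.
pub fn add_variants(a: u8, b: u8) -> (u8, u8, Option<u8>) {
    (a.wrapping_add(b), a.saturating_add(b), a.checked_add(b))
}
```

For example, add_variants(250, 10) returns (4, 255, None), which makes the chosen overflow behavior visible at each call site, and a caller of total_len has to handle the Err case before it can use the length.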

Where we go from here

Adding a new language to the Android platform is a large undertaking. There are toolchains and dependencies that need to be maintained, test infrastructure and tooling that must be updated, and developers that need to be trained. For the past 18 months we have been adding Rust support to the Android Open Source Project, and we have a few early adopter projects that we will be sharing in the coming months. Scaling this to more of the OS is a multi-year project. Stay tuned, we will be posting more updates on this blog.

Java is a registered trademark of Oracle and/or its affiliates.

Thanks Matthew Maurer, Bram Bonne, and Lars Bergstrom for contributions to this post. Special thanks to our colleagues, Adrian Taylor for his insight into the age of memory vulnerabilities, and to Chris Palmer for his work on “The Rule of 2” and “The limits of Sandboxing”.
