Tao

Rust Ownership

Ownership is a core concept in the Rust programming language that addresses many of the memory safety issues found in systems programming languages. Together with the borrowing rules, which allow either one mutable reference or any number of immutable references to a value at a given time, it lets the Rust compiler catch and reject data races and other sources of undefined behavior at compile time.

Definition of Rust Ownership: every value has exactly one owner at any moment. When the owner goes out of scope, the value is dropped and its memory is released.

  • Each value in Rust has a variable that’s called its owner.
  • There can only be one owner at a time, meaning no two variables can own the same value simultaneously. Thus, on variable assignment, parameter passing, or return from a function, the old owner transfers ownership to the new owner so that there is still a single owner.
  • When the owner goes out of scope, the value will be dropped, and the memory is released.

rust

// 1. Ownership uniqueness
let x = String::from("hello"); // x owns the "hello" string

// 2. Ownership transfer
let y = x; // ownership moves from x to y
// x is now invalid

// 3. Borrowing
let z = &y; // z borrows an immutable reference to y
println!("{}", z); // prints "hello"

let mut w = String::from("world");
{
    let p = &mut w; // p borrows a mutable reference to w
    p.push_str("!"); // modifies the borrowed value
}
println!("{}", w); // prints "world!"
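
The third rule (the value is dropped when its owner goes out of scope) can be observed by implementing Drop on a small type. The following is a minimal sketch rather than anything from the original course: Tracked and drop_order are hypothetical names, and RefCell (covered later in this article) is used only to record the order of events.

```rust
use std::cell::RefCell;

// `Tracked` is a hypothetical type used only to observe when a value is dropped.
struct Tracked<'a> {
    name: &'static str,
    log: &'a RefCell<Vec<String>>,
}

impl Drop for Tracked<'_> {
    fn drop(&mut self) {
        // Runs automatically when the owner goes out of scope.
        self.log.borrow_mut().push(format!("drop {}", self.name));
    }
}

// Returns the events recorded while owners enter and leave nested scopes.
fn drop_order() -> Vec<String> {
    let log = RefCell::new(Vec::new());
    {
        let _a = Tracked { name: "a", log: &log };
        {
            let _b = Tracked { name: "b", log: &log };
            log.borrow_mut().push("inner scope ends".to_string());
        } // `_b` is dropped here
        log.borrow_mut().push("outer scope ends".to_string());
    } // `_a` is dropped here
    log.into_inner()
}

fn main() {
    for event in drop_order() {
        println!("{}", event);
    }
}
```

Each value is released at the closing brace of the scope that owns it, without any explicit free call.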

If a value's type implements the Copy trait, assigning or passing it uses copy semantics: the value is copied bit for bit (a shallow copy), producing a new value, and the original remains usable.

Consider the following code (sourced from Geek Time’s first Rust course):

rust

fn is_copy<T: Copy>() {}

fn types_impl_copy_trait() {
    is_copy::<bool>();
    is_copy::<char>();

    // all iXX and uXX, usize/isize, fXX implement Copy trait
    is_copy::<i8>();
    is_copy::<u64>();
    is_copy::<i64>();
    is_copy::<usize>();

    // function (actually a pointer) is Copy
    is_copy::<fn()>();

    // raw pointer is Copy
    is_copy::<*const String>();
    is_copy::<*mut String>();

    // immutable reference is Copy
    is_copy::<&[Vec<u8>]>();
    is_copy::<&String>();

    // array/tuple with values which is Copy is Copy
    is_copy::<[u8; 4]>();
    is_copy::<(&str, &str)>();
}

fn types_not_impl_copy_trait() {
    // Every call in this function is rejected by the compiler, because the
    // type argument does not satisfy the bound on is_copy.

    // unsized types are not Copy
    is_copy::<str>();
    is_copy::<[u8]>();

    // types that own heap data are not Copy
    is_copy::<Vec<u8>>();
    is_copy::<String>();

    // mutable reference is not Copy
    is_copy::<&mut String>();

    // an array or tuple containing non-Copy values is not Copy
    is_copy::<[Vec<u8>; 4]>();
    is_copy::<(String, u32)>();
}

fn main() {
    types_impl_copy_trait();
    types_not_impl_copy_trait();
}

From the above code, we can determine which data types implement the Copy trait by default. Summary:

  • Primitive types, including functions, immutable references, and raw pointers, implement Copy.
  • Arrays and tuples implement Copy if their contained data types also implement Copy.
  • Mutable references do not implement Copy.
  • Unsized types (str, [u8]) and types that own heap data (Vec<u8>, String) do not implement Copy.

Copy is a shallow copy: the value is automatically copied bit for bit on assignment or parameter passing. Clone is explicit: the clone method must be called, and for types that own heap data, such as String or Vec, it performs a deep copy.

Copy Trait:

  • The Copy trait is used for simple scalar types stored on the stack, such as i32, f64, and composite types entirely composed of Copy data.
  • Types implementing the Copy trait undergo a bitwise memory copy during assignment or function passing, a cheap operation.
  • For Copy types, ownership is not transferred, and both the original and new values can be used simultaneously.
  • Copy can only be implemented for types whose fields are all Copy and that do not implement Drop, that is, types that need no memory allocation or resource management, because copying must be free of side effects.

Clone Trait:

  • The Clone trait is what heap-allocated data structures such as String and Vec, and other types that own resources, use for duplication. Every Copy type also implements Clone, because Clone is a supertrait of Copy.
  • For types that own heap data, clone allocates new memory and copies the data into it (a deep copy).
  • Cloning such a type produces a fully independent new value, with both the original and the new value owning their own resources. Reference-counted pointers such as Rc are the exception: their clone only increments the count.
  • Clone can be implemented for any type that can provide the required copying logic.

Example:

rust

// Copy example
let x = 42;
let y = x; // Directly copies x's value
println!("x = {}, y = {}", x, y); // Outputs "x = 42, y = 42"

// Clone example
let s1 = String::from("hello");
let s2 = s1.clone(); // Clones s1's content to a new String instance
println!("s1 = {}, s2 = {}", s1, s2); // Outputs "s1 = hello, s2 = hello"

In the first example, x is an integer that implements the Copy trait. Thus, when x is assigned to y, it simply copies x’s memory value without any resource allocation or ownership transfer.

In the second example, s1 is a String, whose contents are stored on the heap. Calling the clone() method allocates new memory on the heap and deep-copies s1’s content into it. This way, s1 and s2 have completely independent memory spaces, and modifying one does not affect the other.
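
The independence of the two strings can be checked by modifying only the clone. This is a small sketch; clone_then_modify is a hypothetical helper name introduced for illustration.

```rust
// Clones a string, mutates only the clone, and returns both for comparison.
fn clone_then_modify() -> (String, String) {
    let s1 = String::from("hello");
    let mut s2 = s1.clone(); // new heap buffer with the same contents
    s2.push_str(", world"); // changes only s2's buffer
    (s1, s2)
}

fn main() {
    let (s1, s2) = clone_then_modify();
    println!("s1 = {}, s2 = {}", s1, s2); // s1 is still "hello"
}
```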

Generally, Copy is used for simple value semantics, while Clone is used for complex data structures and resource copying. Copy incurs less performance overhead but has strict limitations; Clone is more general but also more costly.

In Rust, Copy is a special marker trait with no methods. The compiler provides it for primitive types, while user-defined types must opt in with #[derive(Clone, Copy)], which is accepted only if every field is Copy. Clone, on the other hand, is implemented manually or derived with #[derive(Clone)].
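
For user-defined types, both traits are usually opted into with derive. The sketch below uses two hypothetical types, Point and Label, to show which derive is accepted in each case; it is an illustration, not code from the course cited above.

```rust
// All fields are Copy, so Copy can be derived. Clone is derived as well,
// because Clone is a supertrait of Copy.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Point {
    x: i32,
    y: i32,
}

// Contains a String, so only Clone can be derived; adding Copy here would
// fail to compile.
#[derive(Debug, Clone, PartialEq)]
struct Label {
    text: String,
    at: Point,
}

fn copy_and_clone() -> (Point, Point, Label, Label) {
    let p1 = Point { x: 1, y: 2 };
    let p2 = p1; // implicit bitwise copy; p1 stays usable

    let l1 = Label { text: String::from("origin"), at: p1 };
    let l2 = l1.clone(); // explicit clone; a plain `let l2 = l1;` would move l1

    (p1, p2, l1, l2)
}

fn main() {
    let (p1, p2, l1, l2) = copy_and_clone();
    println!("{:?} {:?} {:?} {:?}", p1, p2, l1, l2);
}
```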

Overall, when copying values, you should prefer Copy. If a type cannot implement Copy (e.g., it contains heap-allocated data), use Clone. In performance-sensitive scenarios, avoid unnecessary cloning operations.

For types that do not implement Copy, assigning or passing a value causes a move, transferring ownership. Once ownership is transferred, the previous variable can no longer access the value.

Specifically, a move occurs in the following situations:

  1. Variable Binding

rust

let x = String::from("hello");
let y = x; // x's value is moved to y, x no longer owns the string

In this example, x’s value (a String instance) is moved to y because String does not implement Copy: it owns a heap buffer, and that buffer must have exactly one owner. After the value is moved to y, x loses ownership and can no longer be used.
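
What "can no longer be used" means in practice is a compile error. The sketch below keeps the offending line commented out so that the rest compiles; move_then_clone is a hypothetical name.

```rust
fn move_then_clone() -> (String, String) {
    let x = String::from("hello");
    let y = x; // ownership moves from x to y

    // println!("{}", x); // would not compile: error[E0382]: borrow of moved value: `x`

    // To end up with two usable values, clone explicitly instead of moving.
    let z = y.clone();
    (y, z)
}

fn main() {
    let (y, z) = move_then_clone();
    println!("y = {}, z = {}", y, z);
}
```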

  2. Passing as a Function Parameter

rust

let s = String::from("hello");
take_ownership(s); // s's value is moved into the function

When a value is passed as a parameter to a function, its ownership moves into the function. When the function returns, the value is dropped, unless the function moves it elsewhere (for example, by returning it).
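
The snippet above does not define take_ownership. A possible definition is sketched below, next to two alternatives that let the caller keep the value. All three helpers and their signatures are assumptions made for illustration.

```rust
// Takes ownership: the String is dropped when this function returns.
fn take_ownership(s: String) -> usize {
    s.len()
}

// Borrows instead: the caller keeps ownership.
fn borrow_only(s: &str) -> usize {
    s.len()
}

// Takes ownership and hands it back to the caller through the return value.
fn take_and_give_back(mut s: String) -> String {
    s.push('!');
    s
}

fn demo() -> (usize, usize, String) {
    let a = String::from("hello");
    let n1 = take_ownership(a); // `a` is moved; using it afterwards is a compile error

    let b = String::from("hello");
    let n2 = borrow_only(&b); // `b` is only borrowed and remains usable
    let b = take_and_give_back(b); // moved in, then moved back out
    (n1, n2, b)
}

fn main() {
    println!("{:?}", demo());
}
```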

  3. Returning a Value

rust

fn create_string() -> String {
    let s = String::from("hello");
    s // s's ownership moves to the function's return value
}

The function’s return value also involves ownership transfer. In the example above, s’s ownership moves to the return value.
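
Seen from the caller's side, the returned String simply gets a new owner. In this short sketch, caller is a hypothetical function added to show that the value outlives create_string.

```rust
fn create_string() -> String {
    let s = String::from("hello");
    s // moved out to the caller, so it is not dropped here
}

fn caller() -> String {
    let mut owned = create_string(); // `owned` is now the single owner
    owned.push_str(" world"); // the value is still alive and can be modified
    owned
}

fn main() {
    println!("{}", caller());
}
```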

Move semantics ensure that a value is owned by only one variable at any time. The value is freed exactly once, and a moved-from variable cannot be used, which prevents double frees and use-after-free errors.

Borrow semantics allow a value to be used in other contexts without transferring its ownership.

Rust enforces strict rules for mutable references:

  • Only one active mutable reference to a value is allowed at a time. A reference is active from the point where it is created until its last use; after that the compiler no longer counts it, even if the variable is still in scope.
  • An active mutable reference (write) and active immutable references (read) to the same value are mutually exclusive.
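
A short sketch of how these rules play out: an immutable borrow that is no longer used does not block a later mutable borrow. The function name borrow_scopes is hypothetical, and the rejected ordering is left in comments.

```rust
fn borrow_scopes() -> Vec<i32> {
    let mut data = vec![1, 2, 3];

    let first = &data[0]; // immutable borrow starts
    let copy_of_first = *first; // last use of `first`, so the borrow ends here

    let m = &mut data; // allowed: no immutable borrow is still in use
    m.push(copy_of_first + 10);

    // The following ordering would be rejected by the compiler:
    // let r = &data;
    // let m = &mut data; // error[E0502]: cannot borrow `data` as mutable
    // println!("{:?}", r); // because `r` is still used after `m` is created

    data
}

fn main() {
    println!("{:?}", borrow_scopes());
}
```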

Immutable borrowing allows multiple immutable references to refer to the same value simultaneously. These references can only read the value, not modify it.

rust

let x = 5;
let y = &x; // Create an immutable reference
let z = &x; // Multiple immutable references are allowed

println!("{} {} {}", x, y, z); // 5 5 5

Mutable borrowing creates a mutable reference through which the referenced value can be both read and written. However, only one mutable reference to a value can be active at a time.

rust

let mut x = 5;
let y = &mut x; // Create a mutable reference
*y += 1; // Modify x's value through the mutable reference

println!("{}", x); // 6
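
Mutable borrows are most often seen as function parameters. In the sketch below, add_one is a hypothetical helper; each call borrows x mutably only for the duration of the call, so repeated calls do not conflict.

```rust
// Hypothetical helper: mutates the caller's value through a mutable borrow.
fn add_one(n: &mut i32) {
    *n += 1;
}

fn mutate_through_borrow() -> i32 {
    let mut x = 5;
    add_one(&mut x); // the mutable borrow lasts only for this call
    add_one(&mut x); // so a second borrow afterwards is fine
    x
}

fn main() {
    println!("{}", mutate_through_borrow());
}
```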

How does Rust handle scenarios like implementing a doubly linked list, a DAG, or multiple threads accessing the same memory? For these special scenarios, Rust provides smart pointers such as Rc and Arc, together with interior-mutability types such as RefCell and Cell.

Smart pointers are data structures that encapsulate pointers and add additional metadata and functionality. They use raw pointers underneath but provide higher-level, safer abstractions, helping to manage resources automatically and prevent common memory safety issues. Rust’s standard library offers several commonly used ones.

  1. Box: Box is a smart pointer that allocates memory on the heap. It owns the data, and when the Box goes out of scope, it automatically frees the heap memory it wraps. Box is typically used for storing data whose size cannot be determined at compile time or to avoid excessive data copying.

rust

let x = Box::new(5);
println!("x = {}", x); // prints "x = 5"; use *x to get the inner i32
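
A common case of a size that cannot be determined at compile time is a recursive type. The sketch below uses a hypothetical List type: without Box, List would contain itself directly and have no finite size.

```rust
// A hypothetical singly linked list; Box gives the recursive field a fixed size.
enum List {
    Cons(i32, Box<List>),
    Nil,
}

use List::{Cons, Nil};

// Walks the list recursively and adds up its values.
fn sum(list: &List) -> i32 {
    match list {
        Cons(value, rest) => value + sum(rest),
        Nil => 0,
    }
}

fn build_and_sum() -> i32 {
    let list = Cons(1, Box::new(Cons(2, Box::new(Cons(3, Box::new(Nil))))));
    sum(&list)
} // `list` goes out of scope here and every boxed node is freed

fn main() {
    println!("sum = {}", build_and_sum());
}
```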

  2. Rc and Arc: Rc (Reference Counted) and Arc (Atomic Reference Counted) provide shared ownership functionality. They track the number of owners of a resource through reference counting and automatically free the resource when there are no owners. Rc is used in single-threaded contexts, while Arc is used in multi-threaded contexts (atomic operations are more expensive).

rust

use std::rc::Rc;

let x = Rc::new(vec![1, 2, 3]);
let y = x.clone(); // Increases reference count
println!("x = {:?}", x);
println!("y = {:?}", y);
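
The reference count itself can be inspected with Rc::strong_count. The sketch below (count_owners is a hypothetical name) records the count before a clone, after it, and after the clone is dropped, and checks that both handles point to the same allocation.

```rust
use std::rc::Rc;

fn count_owners() -> (usize, usize, usize, bool) {
    let x = Rc::new(vec![1, 2, 3]);
    let before = Rc::strong_count(&x);

    let y = Rc::clone(&x); // copies the pointer, not the Vec
    let shared = Rc::ptr_eq(&x, &y); // both handles refer to one allocation
    let during = Rc::strong_count(&x);

    drop(y); // one owner fewer; the Vec stays alive because x still owns it
    let after = Rc::strong_count(&x);

    (before, during, after, shared)
}

fn main() {
    println!("{:?}", count_owners());
}
```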

  3. RefCell and Rc<RefCell>: RefCell provides interior mutability, allowing for mutable access to data even when the RefCell itself is immutable. It performs borrow checks at runtime rather than compile time. Typically used with Rc to enable mutable borrowing in multiple places.

rust

use std::rc::Rc;
use std::cell::RefCell;

let x = Rc::new(RefCell::new(vec![1, 2, 3]));
x.borrow_mut().push(4); // Mutably borrow the data in x
println!("x = {:?}", x);
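
The runtime borrow check can be made visible with try_borrow_mut, which returns an error instead of panicking when the value is already borrowed. This is a sketch under the assumption that two Rc handles share one RefCell; shared_mutation is a hypothetical name.

```rust
use std::cell::RefCell;
use std::rc::Rc;

fn shared_mutation() -> (Vec<i32>, bool) {
    let a = Rc::new(RefCell::new(vec![1, 2, 3]));
    let b = Rc::clone(&a); // second handle to the same RefCell

    a.borrow_mut().push(4); // the temporary borrow ends with the statement
    b.borrow_mut().push(5); // so mutating through the other handle is fine

    // While one mutable borrow is alive, a second one is refused at runtime.
    let guard = a.borrow_mut();
    let second_refused = b.try_borrow_mut().is_err();
    drop(guard);

    let snapshot = a.borrow().clone();
    (snapshot, second_refused)
}

fn main() {
    println!("{:?}", shared_mutation());
}
```

Calling borrow_mut instead of try_borrow_mut in the conflicting position would panic, which is the cost of moving the check from compile time to runtime.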

  4. Mutex and Arc<Mutex>: Mutex (Mutual Exclusion) is a concurrency primitive for thread-safe programming. It provides a mechanism for mutually exclusive access to shared data. Typically used with Arc to safely share and modify data across multiple threads.

rust

use std::sync::{Arc, Mutex};
use std::thread;

let data = Arc::new(Mutex::new(vec![1, 2, 3]));
let mut children = vec![];

for _ in 0..3 {
    let data_clone = Arc::clone(&data);
    children.push(thread::spawn(move || {
        let mut value = data_clone.lock().unwrap();
        value.push(4);
    }));
}

for child in children {
    child.join().unwrap();
}

println!("data = {:?}", data);

These smart pointers provide automatic resource management and, in the case of Arc and Mutex, thread safety, while still adhering to Rust’s ownership and borrowing rules, ensuring memory safety.
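
Returning to the doubly linked list mentioned earlier, one possible shape is sketched below. It goes slightly beyond the types discussed above by using std::rc::Weak for the back-pointer, so that two nodes do not keep each other alive through a reference cycle. Node and link_two_nodes are hypothetical names, and this is only a two-node illustration, not a complete list implementation.

```rust
use std::cell::RefCell;
use std::rc::{Rc, Weak};

// Hypothetical node type for a doubly linked list.
struct Node {
    value: i32,
    next: Option<Rc<RefCell<Node>>>,   // strong: a node owns its successor
    prev: Option<Weak<RefCell<Node>>>, // weak: avoids an Rc reference cycle
}

// Returns (value reached going forward, value reached going backward,
// strong count of the first node, strong count of the second node).
fn link_two_nodes() -> (i32, i32, usize, usize) {
    let first = Rc::new(RefCell::new(Node { value: 1, next: None, prev: None }));
    let second = Rc::new(RefCell::new(Node { value: 2, next: None, prev: None }));

    first.borrow_mut().next = Some(Rc::clone(&second));
    second.borrow_mut().prev = Some(Rc::downgrade(&first));

    // The weak back-pointer does not count as an owner of `first`.
    let strong_first = Rc::strong_count(&first);
    let strong_second = Rc::strong_count(&second);

    // Walk forward from the first node.
    let next = first.borrow().next.clone().unwrap();
    let forward = next.borrow().value;

    // Walk backward from the second node; upgrade() yields None if the
    // target has already been dropped.
    let prev = second
        .borrow()
        .prev
        .as_ref()
        .and_then(|weak| weak.upgrade())
        .unwrap();
    let backward = prev.borrow().value;

    (forward, backward, strong_first, strong_second)
}

fn main() {
    println!("{:?}", link_two_nodes());
}
```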