mirror of
https://github.com/rust-lang-cn/book-cn.git
synced 2025-01-26 09:18:42 +08:00
1072 lines
34 KiB
Markdown
|
# Ownership
|
|||
|
|
|||
|
Rust’s central feature is called ‘ownership’. It is a feature that is
|
|||
|
straightforward to explain, but has deep implications for the rest of the
|
|||
|
language.
|
|||
|
|
|||
|
Rust is committed to both safety and speed. One of the key tools for balancing
|
|||
|
between them is “zero-cost abstractions”: the various abstractions in Rust do
|
|||
|
not pose a global performance penalty. The ownership system is a prime example
|
|||
|
of a zero-cost abstraction. All of the analysis we’ll talk about in this guide
|
|||
|
is done at compile time. You do not pay any run-time cost for any of these
|
|||
|
features.
|
|||
|
|
|||
|
However, this system does have a certain cost: a learning curve. Many new
Rustaceans experience something we like to call ‘fighting with the borrow
checker’, where the Rust compiler refuses to compile a program that the author
thinks is valid. This can happen because the programmer isn’t used to thinking
carefully about ownership, or is thinking about it differently from the way
that Rust does. You will probably experience something similar at first. There
is good news, however: more experienced Rust developers report that once they
work with the rules of the ownership system for a period of time, they fight
the borrow checker less and less. Keep at it!
|
|||
|
|
|||
|
This chapter will give you a foundation for understanding the rest of the
|
|||
|
language. To do so, we’re going to learn through examples, focusing on a very
|
|||
|
common data structure: strings.
|
|||
|
|
|||
|
## Variable binding scope
|
|||
|
|
|||
|
Let’s take a step back and look at the very basics again. Now that we’re past
|
|||
|
basic syntax, we won’t include all of the `fn main() {` stuff in examples, so
|
|||
|
if you’re following along, you will have to put them inside of a `main()`
|
|||
|
function. This lets our examples be a bit more concise, letting us focus on the
|
|||
|
actual details, rather than boilerplate.
|
|||
|
|
|||
|
Anyway, here it is:
|
|||
|
|
|||
|
```rust
|
|||
|
let s = "hello";
|
|||
|
```
|
|||
|
|
|||
|
This variable binding refers to a string literal. It’s valid from the point at
|
|||
|
which it’s declared, until the end of the current _scope_. That is:
|
|||
|
|
|||
|
```rust
|
|||
|
{ // s is not valid here, it’s not yet in scope
|
|||
|
let s = "hello"; // s is valid from this point forward
|
|||
|
|
|||
|
// do stuff with s
|
|||
|
} // this scope is now over, and s is no longer valid
|
|||
|
```
|
|||
|
|
|||
|
In other words, there are two important points in time here:
|
|||
|
|
|||
|
- When `s` comes ‘into scope’, it is valid.
|
|||
|
- It remains so until it ‘goes out of scope’.
|
|||
|
|
|||
|
At this point, things are similar to other programming languages. Let’s build
|
|||
|
on top of this understanding by introducing a new type: `String`.
|
|||
|
|
|||
|
## Strings
|
|||
|
|
|||
|
String literals are convenient, but they aren’t the only way that you use strings.
|
|||
|
For one thing, they’re immutable. For another, not every string is literal:
|
|||
|
what about taking user input and storing it in a string?
|
|||
|
|
|||
|
For this, Rust has a second string type, `String`. You can create a `String` from
|
|||
|
a string literal using the `from` function:
|
|||
|
|
|||
|
```rust
|
|||
|
let s = String::from("hello");
|
|||
|
```
|
|||
|
|
|||
|
We haven’t seen the double colon (`::`) syntax yet. It is a kind of scope
|
|||
|
operator, allowing us to namespace this particular `from()` function under the
|
|||
|
`String` type itself, rather than using some sort of name like `string_from()`.
|
|||
|
We’ll discuss this syntax more in the “Method Syntax” and “Modules” chapters.
|
|||
|
|
|||
|
This kind of string can be mutated:
|
|||
|
|
|||
|
```rust
|
|||
|
let mut s = String::from("hello");
|
|||
|
|
|||
|
s.push_str(", world!");
|
|||
|
```
|
|||
|
|
|||
|
## Memory and allocation
|
|||
|
|
|||
|
So, what’s the difference here? Why can `String` be mutated, but literals
|
|||
|
cannot? The difference comes down to how these two types deal with memory.
|
|||
|
|
|||
|
In the case of a string literal, because we know the contents of the string at
compile time, we can hard-code the text of the string directly into the final
executable. This means that string literals are quite fast and efficient. But
these properties only come from their immutability; we can’t put an
arbitrary-sized blob of memory into the binary for each string!
|
|||
|
|
|||
|
With `String`, to support a mutable, growable string, we need to allocate an
|
|||
|
unknown amount of memory to hold the contents. This means two things:
|
|||
|
|
|||
|
1. The memory must be requested from the operating system at runtime.
|
|||
|
2. We need a way of giving this memory back to the operating system when we’re
|
|||
|
done with our `String`.
|
|||
|
|
|||
|
That first part is done by us: when we call `String::from()`, its
|
|||
|
implementation requests the memory it needs. This is pretty much universal in
|
|||
|
programming languages.
|
|||
|
|
|||
|
The second case, however, is different. In languages with a garbage collector
|
|||
|
(‘GC’), the GC handles that second case, and we, as the programmer, don’t need
|
|||
|
to think about it. Without GC, it’s the programmer’s responsibility to identify
|
|||
|
when memory is no longer being used, and explicitly return it, just as it was
|
|||
|
requested. Doing this correctly has historically been a difficult problem. If
|
|||
|
we forget, we will waste memory. If we do it too early, we will have an invalid
|
|||
|
variable. If we do it twice, that’s a bug too. We need to pair exactly one
|
|||
|
`allocate()` with exactly one `free()`.
|
|||
|
|
|||
|
Rust takes a different path. Remember our example? Here’s a version with
|
|||
|
`String`:
|
|||
|
|
|||
|
```rust
|
|||
|
{
|
|||
|
let s = String::from("hello"); // s is valid from this point forward
|
|||
|
|
|||
|
// do stuff with s
|
|||
|
} // this scope is now over, and s is no longer valid
|
|||
|
```
|
|||
|
|
|||
|
We have a natural point at which we can return the memory our `String` needs back
|
|||
|
to the operating system: when it goes out of scope! When a variable goes out of
|
|||
|
scope, a special function is called. This function is called `drop()`, and it
|
|||
|
is where the author of `String` can put the code to return the memory.
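
To make the timing of `drop()` visible, here is a minimal sketch. It uses a hypothetical `Noisy` type (not part of the standard library) that records its name in a shared log when it is dropped. The `Drop` trait, `RefCell`, and the `'a` lifetime annotation have not been introduced yet, so treat them as scaffolding for the demonstration rather than something to understand fully right now.

```rust
use std::cell::RefCell;

// Hypothetical type for illustration only: it writes its name into a shared
// log at the moment it is dropped.
struct Noisy<'a> {
    name: &'static str,
    log: &'a RefCell<Vec<&'static str>>,
}

impl<'a> Drop for Noisy<'a> {
    // Rust calls this automatically when a Noisy value goes out of scope.
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.name);
    }
}

fn record_drops() -> Vec<&'static str> {
    let log = RefCell::new(Vec::new());
    {
        let _outer = Noisy { name: "outer", log: &log };
        {
            let _inner = Noisy { name: "inner", log: &log };
        } // _inner goes out of scope here, so its drop() runs now.
        log.borrow_mut().push("inner scope ended");
    } // _outer goes out of scope here, so its drop() runs now.
    log.into_inner()
}

fn main() {
    println!("{:?}", record_drops());
}
```

The log shows that each value is cleaned up at the closing brace of the scope that owns it, with no explicit call from us.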
|
|||
|
|
|||
|
> Aside: This pattern is sometimes called “Resource Acquisition Is
> Initialization” in C++, or “RAII” for short. While they are very similar,
> Rust’s take on this concept has a number of differences, and so we don’t tend
> to use the same term. If you’re familiar with this idea, keep in mind that it
> is _roughly_ similar in Rust, but not identical.
|
|||
|
|
|||
|
This pattern has a profound impact on the way that Rust code is written. It may
|
|||
|
seem obvious right now, but things can get tricky in more advanced situations!
|
|||
|
Let’s go over the first one of those right now.
|
|||
|
|
|||
|
## Move
|
|||
|
|
|||
|
What would you expect this code to do?
|
|||
|
|
|||
|
```rust
|
|||
|
let x = 5;
|
|||
|
let y = x;
|
|||
|
```
|
|||
|
|
|||
|
You might say “Make a copy of `5`.” That’d be correct! We now have two
|
|||
|
bindings, `x` and `y`, and both equal `5`.
|
|||
|
|
|||
|
Now let’s look at `String`. What would you expect this code to do?
|
|||
|
|
|||
|
```rust
|
|||
|
let s1 = String::from("hello");
|
|||
|
let s2 = s1;
|
|||
|
```
|
|||
|
|
|||
|
You might say “copy the `String`!” This is both correct and incorrect at the
|
|||
|
same time. It does a _shallow_ copy of the `String`. What’s that mean? Well,
|
|||
|
let’s take a look at what `String` looks like under the covers:
|
|||
|
|
|||
|
<img alt="string" src="img/foo1.png" class="center" />
|
|||
|
|
|||
|
A `String` is made up of three parts: a pointer to the memory that holds the
|
|||
|
contents of the string, a length, and a capacity. The length is how much memory
|
|||
|
the `String` is currently using. The capacity is the total amount of memory the
|
|||
|
`String` has gotten from the operating system. The difference between length
|
|||
|
and capacity matters, but not in this context, so don’t worry about it too much
|
|||
|
if it doesn’t make sense, and just ignore the capacity.
|
|||
|
|
|||
|
> We’ve talked about two kinds of composite types: arrays and tuples. `String`
|
|||
|
> is a third type: a `struct`, which we will cover the details of in the next
|
|||
|
> chapter of the book. For now, thinking about `String` as a tuple is close
|
|||
|
> enough.
|
|||
|
|
|||
|
When we assign `s1` to `s2`, the `String` itself is copied. But not all kinds
|
|||
|
of copying are the same. Many people draw distinctions between ‘shallow
|
|||
|
copying’ and ‘deep copying’. We don’t use these terms in Rust. We instead say
|
|||
|
that something is ‘moved’ or ‘cloned’. Assignment in Rust causes a ‘move’. In
|
|||
|
other words, it looks like this:
|
|||
|
|
|||
|
<img alt="s1 and s2" src="img/foo2.png" class="center" />
|
|||
|
|
|||
|
_Not_ this:
|
|||
|
|
|||
|
<img alt="s1 and s2 to two places" src="img/foo4.png" class="center" />
|
|||
|
|
|||
|
When moving, Rust makes a copy of the data structure itself: the contents of
`s1` are copied. But if `s1` contains a pointer to other data, as it does in
this case, Rust will not copy the data that the pointer refers to.
|
|||
|
|
|||
|
There’s a problem here! Both `data` pointers are pointing to the same place.
|
|||
|
Why is this a problem? Well, when `s2` goes out of scope, it will free the
|
|||
|
memory that `data` points to. And then `s1` goes out of scope, and it will
|
|||
|
_also_ try to free the memory that `data` points to! That’s bad.
|
|||
|
|
|||
|
So what’s the solution? Here, we stand at a crossroads. There are a few
options. One would be to declare that assignment will also copy out any data.
This works, but is inefficient: what if our `String` contained a novel? Also,
it only works for memory. What if, instead of a `String`, we had a
`TcpConnection`? Opening and closing a network connection is very similar to
allocating and freeing memory. The solution that we could use there is to allow
the programmer to hook into the assignment, similar to `drop()`, and write code
to fix things up. That would work, but now an `=` can run arbitrary code.
That’s also not good, and it doesn’t solve our efficiency concerns either.
|
|||
|
|
|||
|
Let’s take a step back: the root of the problem is that `s1` and `s2` both
think that they have control of the memory, and therefore need to free it.
Instead of trying to copy the allocated memory, we could say that `s1` is no
longer valid, and therefore doesn’t need to free anything. This is in fact the
choice that Rust makes. Check out what happens when you try to use `s1`
after `s2` is created:
|
|||
|
|
|||
|
```rust,ignore
|
|||
|
let s1 = String::from("hello");
|
|||
|
let s2 = s1;
|
|||
|
|
|||
|
println!("{}", s1);
|
|||
|
```
|
|||
|
|
|||
|
You’ll get an error like this:
|
|||
|
|
|||
|
```text
|
|||
|
5:22 error: use of moved value: `s1` [E0382]
|
|||
|
println!("{}", s1);
|
|||
|
^~
|
|||
|
5:24 note: in this expansion of println! (defined in <std macros>)
|
|||
|
3:11 note: `s1` moved here because it has type `collections::string::String`, which is moved by default
|
|||
|
let s2 = s1;
|
|||
|
^~
|
|||
|
```
|
|||
|
|
|||
|
We say that `s1` was _moved_ into `s2`. When a value moves, its data is copied,
|
|||
|
but the original variable binding is no longer usable. That solves our problem:
|
|||
|
|
|||
|
<img alt="s1 and s2 to the same place" src="img/foo3.png" class="center" />
|
|||
|
|
|||
|
With only `s2` valid, when it goes out of scope, it will free the memory, and we’re done!
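
One way to convince yourself that a move copies only the pointer, length, and capacity is to compare the address of the heap buffer before and after the assignment. The sketch below does that with `as_ptr()`. The function name `move_keeps_buffer` is made up for this illustration.

```rust
// Hypothetical helper: returns true if the heap buffer has the same address
// before and after the String is moved.
fn move_keeps_buffer() -> bool {
    let s1 = String::from("hello");
    let before = s1.as_ptr(); // address of the heap data owned by s1

    let s2 = s1; // move: s1 is no longer usable from here on

    let after = s2.as_ptr(); // address of the heap data now owned by s2
    before == after
}

fn main() {
    println!("same heap buffer after the move: {}", move_keeps_buffer());
}
```

Only the small `String` structure changed hands; the text itself stayed where it was, and `s2` is now the only binding responsible for freeing it.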
|
|||
|
|
|||
|
## Ownership Rules
|
|||
|
|
|||
|
This leads us to the Ownership Rules:
|
|||
|
|
|||
|
> 1. Each value in Rust has a variable binding that’s called its ‘owner’.
|
|||
|
> 2. There can only be one owner at a time.
|
|||
|
> 3. When the owner goes out of scope, the value will be `drop()`ped.
|
|||
|
|
|||
|
Furthermore, there’s a design choice that’s implied by this: Rust will never
|
|||
|
automatically create ‘deep’ copies of your data. Any automatic copying must be
|
|||
|
inexpensive.
|
|||
|
|
|||
|
## Clone
|
|||
|
|
|||
|
But what if we _do_ want to deeply copy the `String`’s data, and not just the
|
|||
|
`String` itself? There’s a common method for that: `clone()`. Here’s an example
|
|||
|
of `clone()` in action:
|
|||
|
|
|||
|
```rust
|
|||
|
let s1 = String::from("hello");
|
|||
|
let s2 = s1.clone();
|
|||
|
|
|||
|
println!("{}", s1);
|
|||
|
```
|
|||
|
|
|||
|
This will work just fine. Remember our diagram from before? In this case,
|
|||
|
it _is_ doing this:
|
|||
|
|
|||
|
<img alt="s1 and s2 to two places" src="img/foo4.png" class="center" />
|
|||
|
|
|||
|
When you see a call to `clone()`, you know that some arbitrary code is being
|
|||
|
executed, which may be expensive. It’s a visual indicator that something
|
|||
|
different is going on here.
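
As a small sketch of why the two copies are independent, the example below clones a `String` and then changes only the clone. The helper name `clone_is_independent` is an assumption made for the example.

```rust
// Hypothetical helper: clones a String, mutates the clone, and returns both
// so the caller can see that the original is untouched.
fn clone_is_independent() -> (String, String) {
    let s1 = String::from("hello");
    let mut s2 = s1.clone(); // deep copy: s2 gets its own heap buffer

    s2.push_str(", world!"); // only s2's buffer changes

    (s1, s2)
}

fn main() {
    let (original, copy) = clone_is_independent();
    println!("original = {}, copy = {}", original, copy);
}
```

Because each `String` owns its own buffer, each one frees its own memory when it goes out of scope, and there is no double free.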
|
|||
|
|
|||
|
## Copy
|
|||
|
|
|||
|
There’s one last wrinkle that we haven’t talked about yet. This code works:
|
|||
|
|
|||
|
```rust
|
|||
|
let x = 5;
|
|||
|
let y = x;
|
|||
|
|
|||
|
println!("{}", x);
|
|||
|
```
|
|||
|
|
|||
|
But why? We don’t have a call to `clone()`. Why didn’t `x` get moved into `y`?
|
|||
|
|
|||
|
For types that do not have any kind of complex storage requirements, like
|
|||
|
integers, typing `clone()` is busy work. There’s no reason we would ever want
|
|||
|
to prevent `x` from being valid here, as there’s no situation in which it’s
|
|||
|
incorrect. In other words, there’s no difference between deep and shallow
|
|||
|
copying here, so calling `clone()` wouldn’t do anything differently from the
|
|||
|
usual shallow copying.
|
|||
|
|
|||
|
Rust has a special annotation that you can place on types, called `Copy`. If
|
|||
|
a type is `Copy`, an older binding is still usable after assignment. Integers
|
|||
|
are an example of such a type; most of the primitive types are `Copy`.
|
|||
|
|
|||
|
While we haven’t talked about how to mark a type as `Copy` yet, you might ask
|
|||
|
yourself “what happens if we made `String` `Copy`?” The answer is, you cannot.
|
|||
|
Remember `drop()`? Rust will not let you make something `Copy` if it has
|
|||
|
implemented `drop()`. If you need to do something special when the value goes
|
|||
|
out of scope, being `Copy` will be an error.
|
|||
|
|
|||
|
So what types are `Copy`? You can check the documentation for the given type to
be sure, but as a rule of thumb, any group of simple scalar values can be
`Copy`, while nothing that requires allocation or is some form of resource is
`Copy`. And you can’t get it wrong: the compiler will throw an error if you try
to use a value after it has been moved, as we saw above.
|
|||
|
|
|||
|
Here are some types that you’ve seen so far that are `Copy`:
|
|||
|
|
|||
|
* All of the integer types, like `u32`.
|
|||
|
* The booleans, `true` and `false`.
|
|||
|
* All of the floating point types, like `f64`.
|
|||
|
* Tuples, but only if they contain types which are also `Copy`. `(i32, i32)`
|
|||
|
is `Copy`, but `(i32, String)` is not!
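
The tuple rule in the last bullet can be sketched as follows. Both helper functions are hypothetical names chosen for the example; the point is only which binding remains usable after the assignment.

```rust
// (i32, i32) is Copy, so the parameter is still usable after assignment.
fn duplicate_pair(pair: (i32, i32)) -> ((i32, i32), (i32, i32)) {
    let copy = pair; // a copy is made; `pair` stays valid
    (pair, copy)
}

// (i32, String) is not Copy, because String is not Copy.
fn rename(labeled: (i32, String)) -> (i32, String) {
    let mut moved = labeled; // this is a move; `labeled` is no longer usable
    moved.1.push_str("!");
    // Using `labeled` here would be a compile-time error.
    moved
}

fn main() {
    println!("{:?}", duplicate_pair((1, 2)));
    println!("{:?}", rename((1, String::from("one"))));
}
```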
|
|||
|
|
|||
|
## Ownership and functions
|
|||
|
|
|||
|
Passing a value to a function has semantics similar to assigning it:
|
|||
|
|
|||
|
```rust
|
|||
|
fn main() {
|
|||
|
let s = String::from("hello");
|
|||
|
|
|||
|
takes_ownership(s);
|
|||
|
|
|||
|
let x = 5;
|
|||
|
|
|||
|
makes_copy(x);
|
|||
|
}
|
|||
|
|
|||
|
fn takes_ownership(some_string: String) {
|
|||
|
println!("{}", some_string);
|
|||
|
}
|
|||
|
|
|||
|
fn makes_copy(some_integer: i32) {
|
|||
|
println!("{}", some_integer);
|
|||
|
}
|
|||
|
```
|
|||
|
|
|||
|
Passing a binding to a function will move or copy, just like assignment. Here’s
|
|||
|
the same example, but with some annotations showing where things go into and
|
|||
|
out of scope:
|
|||
|
|
|||
|
```rust
|
|||
|
fn main() {
|
|||
|
let s = String::from("hello"); // s goes into scope.
|
|||
|
|
|||
|
takes_ownership(s); // s moves into the function...
|
|||
|
// ... and so is no longer valid here.
|
|||
|
let x = 5; // x goes into scope.
|
|||
|
|
|||
|
makes_copy(x); // x would move into the function,
|
|||
|
// but i32 is Copy, so it’s okay to still
|
|||
|
// use x afterward.
|
|||
|
|
|||
|
} // Here, x goes out of scope, then s. But since s was moved, nothing special
|
|||
|
// happens.
|
|||
|
|
|||
|
fn takes_ownership(some_string: String) { // some_string comes into scope.
|
|||
|
println!("{}", some_string);
|
|||
|
} // Here, some_string goes out of scope and `drop()` is called. The backing
|
|||
|
// memory is freed.
|
|||
|
|
|||
|
fn makes_copy(some_integer: i32) { // some_integer comes into scope.
|
|||
|
println!("{}", some_integer);
|
|||
|
} // Here, some_integer goes out of scope. Nothing special happens.
|
|||
|
```
|
|||
|
|
|||
|
Remember: If we tried to use `s` after the call to `takes_ownership()`, Rust
|
|||
|
would throw a compile-time error! These static checks protect us from mistakes.
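
If the caller still needs the value afterward, one option with the tools covered so far is to pass a clone. This is a sketch only: here `takes_ownership` is changed to return the length so there is something to observe, which is an assumption of the example and not how the function is defined above.

```rust
// Variant of takes_ownership for illustration: it consumes the String and
// returns its length.
fn takes_ownership(some_string: String) -> usize {
    some_string.len()
} // some_string goes out of scope here and its memory is freed.

fn main() {
    let s = String::from("hello");

    // The clone is moved into the function; s itself is untouched.
    let len = takes_ownership(s.clone());
    println!("'{}' has length {}", s, len);

    // Now s itself is moved into the function.
    takes_ownership(s);
    // println!("{}", s); // would not compile: s has been moved
}
```

Cloning works, but it pays for a second allocation just to read the data.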
|
|||
|
|
|||
|
Returning values can also transfer ownership:
|
|||
|
|
|||
|
```rust
|
|||
|
fn main() {
|
|||
|
let s1 = gives_ownership();
|
|||
|
|
|||
|
let s2 = String::from("hello");
|
|||
|
|
|||
|
let s3 = takes_and_gives_back(s2);
|
|||
|
}
|
|||
|
|
|||
|
fn gives_ownership() -> String {
|
|||
|
let some_string = String::from("hello");
|
|||
|
|
|||
|
some_string
|
|||
|
}
|
|||
|
|
|||
|
fn takes_and_gives_back(a_string: String) -> String {
|
|||
|
|
|||
|
a_string
|
|||
|
}
|
|||
|
```
|
|||
|
|
|||
|
With similar annotations:
|
|||
|
|
|||
|
```rust
|
|||
|
fn main() {
|
|||
|
let s1 = gives_ownership(); // gives_ownership moves its return
|
|||
|
// value into s1.
|
|||
|
|
|||
|
let s2 = String::from("hello"); // s2 comes into scope
|
|||
|
|
|||
|
let s3 = takes_and_gives_back(s2); // s2 is moved into
|
|||
|
// takes_and_gives_back, which also
|
|||
|
// moves its return value into s3.
|
|||
|
} // Here, s3 goes out of scope, and is dropped. s2 goes out of scope, but was
|
|||
|
// moved, so nothing happens. s1 goes out of scope, and is dropped.
|
|||
|
|
|||
|
fn gives_ownership() -> String { // gives_ownership will move its
|
|||
|
// return value into the function
|
|||
|
// that calls it.
|
|||
|
|
|||
|
let some_string = String::from("hello"); // some_string comes into scope.
|
|||
|
|
|||
|
some_string // some_string is returned, and
|
|||
|
// moves out to the calling
|
|||
|
// function.
|
|||
|
}
|
|||
|
|
|||
|
// takes_and_gives_back will both take a String and return one
|
|||
|
fn takes_and_gives_back(a_string: String) -> String { // a_string comes into scope
|
|||
|
|
|||
|
a_string // a_string is returned, and moves out to the calling function
|
|||
|
}
|
|||
|
```
|
|||
|
|
|||
|
It’s the same pattern, every time: assigning something moves it, and when an
|
|||
|
owner goes out of scope, if it hasn’t been moved, it will `drop()`.
|
|||
|
|
|||
|
This might seem a bit tedious, and it is. What if I want to let a function use
a value, but not take ownership? It’s quite annoying that anything I pass in
also needs to be passed back. Look at this function:
|
|||
|
|
|||
|
```rust
|
|||
|
fn main() {
|
|||
|
let s1 = String::from("hello");
|
|||
|
|
|||
|
let (s2, len) = calculate_length(s1);
|
|||
|
|
|||
|
println!("The length of '{}' is {}.", s2, len);
|
|||
|
}
|
|||
|
|
|||
|
fn calculate_length(s: String) -> (String, usize) {
|
|||
|
let length = s.len(); // len() returns the length of a String.
|
|||
|
|
|||
|
(s, length)
|
|||
|
}
|
|||
|
```
|
|||
|
|
|||
|
This is too much ceremony: we have to use a tuple to give back the `String` as
|
|||
|
well as the length. It’s a lot of work for a pattern that should be common.
|
|||
|
|
|||
|
Luckily for us, Rust has a feature for exactly this, and it’s what the next section is about.
|
|||
|
|
|||
|
|
|||
|
# References and Borrowing
|
|||
|
|
|||
|
At the end of the last section, we had some example Rust that wasn’t very
|
|||
|
good. Here it is again:
|
|||
|
|
|||
|
```rust
|
|||
|
fn main() {
|
|||
|
let s1 = String::from("hello");
|
|||
|
|
|||
|
let (s2, len) = calculate_length(s1);
|
|||
|
|
|||
|
println!("The length of '{}' is {}.", s2, len);
|
|||
|
}
|
|||
|
|
|||
|
fn calculate_length(s: String) -> (String, usize) {
|
|||
|
let length = s.len(); // len() returns the length of a String.
|
|||
|
|
|||
|
(s, length)
|
|||
|
}
|
|||
|
```
|
|||
|
|
|||
|
The issue here is that we have to return the `String` back to the calling
|
|||
|
function so that it could still use it.
|
|||
|
|
|||
|
There is a better way. It looks like this:
|
|||
|
|
|||
|
```rust
|
|||
|
fn main() {
|
|||
|
let s1 = String::from("hello");
|
|||
|
|
|||
|
let len = calculate_length(&s1);
|
|||
|
|
|||
|
println!("The length of '{}' is {}.", s1, len);
|
|||
|
}
|
|||
|
|
|||
|
fn calculate_length(s: &String) -> usize {
|
|||
|
let length = s.len();
|
|||
|
|
|||
|
length
|
|||
|
}
|
|||
|
```
|
|||
|
|
|||
|
First, you’ll notice all of the tuple stuff is gone. Next, notice that we pass
`&s1` into `calculate_length()`. And in its definition, we take `&String`
rather than `String`.
|
|||
|
|
|||
|
These ampersands are called ‘references’, and they allow you to refer to some value
|
|||
|
without taking ownership of it. Here’s a diagram:
|
|||
|
|
|||
|
DIAGRAM GOES HERE of a &String pointing at a String, with (ptr, len, capacity)
|
|||
|
|
|||
|
Let’s take a closer look at the function call here:
|
|||
|
|
|||
|
```rust
|
|||
|
# fn calculate_length(s: &String) -> usize {
|
|||
|
# let length = s.len();
|
|||
|
#
|
|||
|
# length
|
|||
|
# }
|
|||
|
let s1 = String::from("hello");
|
|||
|
|
|||
|
let len = calculate_length(&s1);
|
|||
|
```
|
|||
|
|
|||
|
The `&s1` syntax lets us create a reference from `s1`. This reference _refers_
|
|||
|
to the value of `s1`, but does not own it. Because it does not own it, the
|
|||
|
value it points to will not be dropped when the reference goes out of scope.
|
|||
|
|
|||
|
Likewise, the signature of the function uses `&` to indicate that it takes
a reference as an argument. Let’s add some explanatory annotations:
|
|||
|
|
|||
|
```rust
|
|||
|
fn calculate_length(s: &String) -> usize { // s is a reference to a String
|
|||
|
let length = s.len();
|
|||
|
|
|||
|
length
|
|||
|
} // Here, s goes out of scope. But since it does not have ownership of what
|
|||
|
// it refers to, nothing happens.
|
|||
|
```
|
|||
|
|
|||
|
It’s the same process as before, except that because we don’t have ownership,
|
|||
|
we don’t drop what a reference points to when the reference goes out of scope.
|
|||
|
This lets us write functions which take references as arguments instead of the
|
|||
|
values themselves, so that we won’t need to return them to give back ownership.
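
As a short sketch of what this buys us, the same `String` can be handed to several functions in a row and is still usable afterward. `count_spaces` is a hypothetical helper added for the example; `calculate_length` is the function from above.

```rust
fn calculate_length(s: &String) -> usize {
    s.len()
}

// Hypothetical second reader of the same String.
fn count_spaces(s: &String) -> usize {
    s.chars().filter(|c| *c == ' ').count()
}

fn main() {
    let s1 = String::from("hello big world");

    // Each call only borrows s1, so ownership never leaves main.
    let len = calculate_length(&s1);
    let spaces = count_spaces(&s1);

    println!("'{}' has length {} and {} spaces.", s1, len, spaces);
}
```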
|
|||
|
|
|||
|
There’s another word for what references do, and that’s ‘borrowing’. Just like
|
|||
|
with real life, if I own something, you can borrow it from me. When you’re done,
|
|||
|
you have to give it back.
|
|||
|
|
|||
|
Speaking of which, what if you try to modify something you borrow from me? Try
|
|||
|
this code out. Spoiler alert: it doesn’t work:
|
|||
|
|
|||
|
```rust,ignore
|
|||
|
fn main() {
|
|||
|
let s = String::from("hello");
|
|||
|
|
|||
|
change(&s);
|
|||
|
}
|
|||
|
|
|||
|
fn change(some_string: &String) {
|
|||
|
some_string.push_str(", world"); // push_str() appends a literal to a String
|
|||
|
}
|
|||
|
```
|
|||
|
|
|||
|
Here’s the error:
|
|||
|
|
|||
|
```text
|
|||
|
8:16 error: cannot borrow immutable borrowed content `*some_string` as mutable
|
|||
|
some_string.push_str(", world"); // push_str() appends a literal to a String
|
|||
|
^~~~~~~~~~~
|
|||
|
```
|
|||
|
|
|||
|
Just like bindings are immutable by default, so are references. We’re not allowed
|
|||
|
to modify something we have a reference to.
|
|||
|
|
|||
|
## Mutable references
|
|||
|
|
|||
|
We can fix this bug! Just a small tweak:
|
|||
|
|
|||
|
```rust
|
|||
|
fn main() {
|
|||
|
let mut s = String::from("hello");
|
|||
|
|
|||
|
change(&mut s);
|
|||
|
}
|
|||
|
|
|||
|
fn change(some_string: &mut String) {
|
|||
|
some_string.push_str(", world"); // push_str() appends a literal to a String
|
|||
|
}
|
|||
|
```
|
|||
|
|
|||
|
First, we had to change `s` to be `mut`. Then, we had to create a mutable reference
|
|||
|
with `&mut s` and accept a mutable reference with `some_string: &mut String`.
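
Here is a small sketch showing that the change made through the mutable reference is visible to the owner afterward. The wrapper `greet_twice` is a made-up name for the example; `change` is the function from above.

```rust
fn change(some_string: &mut String) {
    some_string.push_str(", world");
}

// Hypothetical wrapper so the result can be inspected.
fn greet_twice() -> String {
    let mut s = String::from("hello");

    // Each mutable borrow lasts only for the duration of the call,
    // so the two calls do not conflict with each other.
    change(&mut s);
    change(&mut s);

    s
}

fn main() {
    println!("{}", greet_twice());
}
```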
|
|||
|
|
|||
|
Mutable references have one big restriction, though. This code fails:
|
|||
|
|
|||
|
```rust,ignore
|
|||
|
let mut s = String::from("hello");
|
|||
|
|
|||
|
let r1 = &mut s;
|
|||
|
let r2 = &mut s;
|
|||
|
```
|
|||
|
|
|||
|
Here’s the error:
|
|||
|
|
|||
|
```text
|
|||
|
5:20 error: cannot borrow `s` as mutable more than once at a time [E0499]
|
|||
|
let r2 = &mut s;
|
|||
|
^
|
|||
|
4:20 note: previous borrow of `s` occurs here; the mutable borrow prevents
|
|||
|
subsequent moves, borrows, or modification of `s` until the borrow
|
|||
|
ends
|
|||
|
let r1 = &mut s;
|
|||
|
^
|
|||
|
7:2 note: previous borrow ends here
|
|||
|
fn main() {
|
|||
|
|
|||
|
}
|
|||
|
^
|
|||
|
```
|
|||
|
|
|||
|
The error is what it says on the tin: you cannot borrow something more than
|
|||
|
once at a time in a mutable fashion. This restriction allows for mutation, but
|
|||
|
in a very controlled fashion. It is something that new Rustaceans struggle
|
|||
|
with, because most languages let you mutate whenever you’d like.
|
|||
|
|
|||
|
As always, we can use `{}`s to create a new scope, allowing for multiple mutable
|
|||
|
references. Just not _simultaneous_ ones:
|
|||
|
|
|||
|
```rust
|
|||
|
let mut s = String::from("hello");
|
|||
|
|
|||
|
{
|
|||
|
let r1 = &mut s;
|
|||
|
|
|||
|
} // r1 goes out of scope here, so we can make a new reference with no problems.
|
|||
|
|
|||
|
let r2 = &mut s;
|
|||
|
```
|
|||
|
|
|||
|
There is a similar rule for combining the two kinds of references. This code errors:
|
|||
|
|
|||
|
```rust,ignore
|
|||
|
let mut s = String::from("hello");
|
|||
|
|
|||
|
let r1 = &s; // no problem
|
|||
|
let r2 = &s; // no problem
|
|||
|
let r3 = &mut s; // BIG PROBLEM
|
|||
|
```
|
|||
|
|
|||
|
Here’s the error:
|
|||
|
|
|||
|
```text
|
|||
|
6:20 error: cannot borrow `s` as mutable because it is also borrowed as
|
|||
|
immutable [E0502]
|
|||
|
let r3 = &mut s; // BIG PROBLEM
|
|||
|
^
|
|||
|
4:16 note: previous borrow of `s` occurs here; the immutable borrow
|
|||
|
prevents subsequent moves or mutable borrows of `s` until the
|
|||
|
borrow ends
|
|||
|
let r1 = &s; // no problem
|
|||
|
^
|
|||
|
8:2 note: previous borrow ends here
|
|||
|
fn main() {
|
|||
|
|
|||
|
}
|
|||
|
^
|
|||
|
```
|
|||
|
|
|||
|
Whew! We _also_ cannot have a mutable reference while we have an immutable one.
|
|||
|
Users of an immutable reference don’t expect the values to suddenly change out
|
|||
|
from under them! Multiple immutable references are okay, however.
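
One way to satisfy both rules is to keep the readers and the writer in separate scopes, as in the sketch below. The function name `read_then_write` is an assumption for the example. Note that newer compilers may also end a borrow at its last use rather than at the closing brace, so some snippets shown as failing may be accepted when a reference is never used again. The explicit blocks here behave the same either way.

```rust
// Hypothetical helper: reads through two immutable references, then writes
// through one mutable reference once the readers are gone.
fn read_then_write() -> (usize, String) {
    let mut s = String::from("hello");

    let len_before = {
        let r1 = &s; // any number of immutable references is fine
        let r2 = &s;
        assert_eq!(r1, r2); // both readers see the same data
        r1.len()
    }; // r1 and r2 go out of scope here

    {
        let r3 = &mut s; // now a single mutable reference is allowed
        r3.push_str(", world");
    } // r3 goes out of scope here

    (len_before, s)
}

fn main() {
    let (len_before, s) = read_then_write();
    println!("length was {}, string is now '{}'", len_before, s);
}
```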
|
|||
|
|
|||
|
## Dangling references
|
|||
|
|
|||
|
In languages with pointers, it’s easy to create a “dangling pointer” by freeing
|
|||
|
some memory while keeping around a pointer to that memory. In Rust, by
|
|||
|
contrast, the compiler guarantees that references will never be dangling: if we
|
|||
|
have a reference to something, the compiler will ensure that it will not go
|
|||
|
out of scope before the reference does.
|
|||
|
|
|||
|
Let’s try to create a dangling reference:
|
|||
|
|
|||
|
```rust,ignore
|
|||
|
fn main() {
|
|||
|
let reference_to_nothing = dangle();
|
|||
|
}
|
|||
|
|
|||
|
fn dangle() -> &String {
|
|||
|
let s = String::from("hello");
|
|||
|
|
|||
|
&s
|
|||
|
}
|
|||
|
```
|
|||
|
|
|||
|
Here’s the error:
|
|||
|
|
|||
|
```text
|
|||
|
error: missing lifetime specifier [E0106]
|
|||
|
fn dangle() -> &String {
|
|||
|
^~~~~~~
|
|||
|
help: this function’s return type contains a borrowed value, but there is no
|
|||
|
value for it to be borrowed from
|
|||
|
help: consider giving it a ‘static lifetime
|
|||
|
```
|
|||
|
|
|||
|
This error message refers to a feature we haven’t learned about yet,
|
|||
|
‘lifetimes’. The message does contain the key to why this code is a problem,
|
|||
|
though:
|
|||
|
|
|||
|
```text
|
|||
|
this function’s return type contains a borrowed value, but there is no value
|
|||
|
for it to be borrowed from
|
|||
|
```
|
|||
|
|
|||
|
Let’s examine exactly what happens with `dangle()`:
|
|||
|
|
|||
|
```rust,ignore
|
|||
|
fn dangle() -> &String { // dangle returns a reference to a String
|
|||
|
|
|||
|
let s = String::from("hello"); // s is a new String
|
|||
|
|
|||
|
&s // we return a reference to the String, s
|
|||
|
} // Here, s goes out of scope, and is dropped. Its memory goes away.
|
|||
|
// Danger!
|
|||
|
```
|
|||
|
|
|||
|
Because `s` is created inside of `dangle()`, when the code of `dangle()` is
|
|||
|
finished, it will be deallocated. But we tried to return a reference to it.
|
|||
|
That means this reference would be pointing to an invalid `String`! That’s
|
|||
|
no good. Rust won’t let us do this.
|
|||
|
|
|||
|
The correct code here is to return the `String` directly:
|
|||
|
|
|||
|
```rust
|
|||
|
fn no_dangle() -> String {
|
|||
|
let s = String::from("hello");
|
|||
|
|
|||
|
s
|
|||
|
}
|
|||
|
```
|
|||
|
|
|||
|
This works, no problem. Ownership is moved out, and nothing is deallocated.
|
|||
|
|
|||
|
## The Rules of References
|
|||
|
|
|||
|
Here’s a recap of what we’ve talked about. The Rules of References:
|
|||
|
|
|||
|
1. At any given time, you may have _either_, but not both of:
    1. One mutable reference.
    2. Any number of immutable references.
2. References must always be valid.
|
|||
|
|
|||
|
While these rules are not complicated on their own, they can be tricky when
|
|||
|
applied to real code.
|
|||
|
|
|||
|
# Slices
|
|||
|
|
|||
|
So far, we’ve talked about types that have ownership, like `String`, and ones
|
|||
|
that don’t, like `&String`. There is a second kind of type which does not have
|
|||
|
ownership: slices. Slices let you reference a contiguous sequence of elements
|
|||
|
in a collection, rather than the whole collection itself.
|
|||
|
|
|||
|
Here’s a small programming problem: write a function which takes a string,
|
|||
|
and returns the first word you find. If we don’t find a space in the string,
|
|||
|
then the whole string is a word, so the whole thing should be returned.
|
|||
|
|
|||
|
Let’s think about the signature of this function:
|
|||
|
|
|||
|
```rust,ignore
|
|||
|
fn first_word(s: &String) -> ?
|
|||
|
```
|
|||
|
|
|||
|
This function, `first_word`, takes a `&String` as an argument. We don’t want
|
|||
|
ownership, so this is fine. But what should we return? We don’t really have a
|
|||
|
way to talk about _part_ of a string. We could return the index of the end of
|
|||
|
the word, though. Let’s try that:
|
|||
|
|
|||
|
```rust
|
|||
|
fn first_word(s: &String) -> usize {
|
|||
|
let bytes = s.as_bytes();
|
|||
|
|
|||
|
for (i, &byte) in bytes.iter().enumerate() {
|
|||
|
if byte == 32 {
|
|||
|
return i;
|
|||
|
}
|
|||
|
}
|
|||
|
|
|||
|
s.len()
|
|||
|
}
|
|||
|
```
|
|||
|
|
|||
|
Let’s break that down a bit:
|
|||
|
|
|||
|
```rust
fn first_word(s: &String) -> usize {

    // Since we need to go through the String element by element, and
    // check if a value is a space, we will convert our String to an
    // array of bytes, using the `.as_bytes()` method.
    let bytes = s.as_bytes();

    // We discussed using the iter() method with for in Chapter 3.7. Here,
    // we’re adding another method: enumerate(). While iter() returns each
    // element, enumerate() wraps the result of iter() and returns each
    // element as part of a tuple instead. The first element of the tuple is
    // the index, and the second element is a reference to the element
    // itself. This is a bit nicer than calculating the index ourselves.
    //
    // Since it’s a tuple, we can use patterns, just like elsewhere in Rust.
    // So we match against the tuple with i for the index, and &byte for
    // the byte itself.
    for (i, &byte) in bytes.iter().enumerate() {

        // 32 is the byte value of a space in UTF-8
        if byte == 32 {

            // We found a space! Return this position.
            return i;
        }
    }

    // If we got here, we didn’t find a space, so this whole thing must be a
    // word. So return the length.
    s.len()
}
```

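For comparison, the same search can be sketched more compactly with the standard `Iterator::position` method and the byte literal `b' '` in place of the number 32. The name `first_word_end` is hypothetical; the behavior is intended to match `first_word` above.

```rust
// Hypothetical variant of `first_word`, built on `Iterator::position`.
fn first_word_end(s: &String) -> usize {
    s.as_bytes()
        .iter()
        // `position` returns Some(index) of the first byte equal to a space.
        .position(|&byte| byte == b' ')
        // No space found: the whole string is one word.
        .unwrap_or(s.len())
}
```

For `"hello world"` this returns 5, and for `"hello"` (no space) it also returns 5, the length of the string.
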
This works, but there’s a problem. We’re returning a `usize` on its own, but
it’s only a meaningful number in the context of the `&String` itself. In other
words, because it’s a separate value from the `String`, there’s no guarantee
that it will still be valid in the future. Consider this:

```rust
# fn first_word(s: &String) -> usize {
#     let bytes = s.as_bytes();
#
#     for (i, &byte) in bytes.iter().enumerate() {
#         if byte == 32 {
#             return i;
#         }
#     }
#
#     s.len()
# }

fn main() {
    let mut s = String::from("hello world");

    let word = first_word(&s);

    s.clear(); // This empties the String, making it equal to "".

    // word is now totally invalid! There’s no more word here.
}
```

This is bad! It gets even worse if we want to write a `second_word()`
function. Its signature would have to look like this:

```rust,ignore
fn second_word(s: &String) -> (usize, usize) {
```

Now we’re tracking both a start _and_ an ending index. Even more chances for
things to go wrong. We now have three unrelated variable bindings floating
around which need to be kept in sync.

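To make the problem concrete, here is one way such an index-based `second_word` might be written. This body is an assumption made for illustration (the text above gives only the signature), and `stale_indices` is a hypothetical helper that shows the indices outliving the data they describe.

```rust
// Hypothetical index-based `second_word`: returns (start, end) byte offsets.
fn second_word(s: &String) -> (usize, usize) {
    let bytes = s.as_bytes();

    // The second word starts just after the first space, if there is one.
    let start = match bytes.iter().position(|&byte| byte == 32) {
        Some(i) => i + 1,
        None => return (s.len(), s.len()), // No second word at all.
    };

    // It ends at the next space, or at the end of the string.
    let end = match bytes[start..].iter().position(|&byte| byte == 32) {
        Some(i) => start + i,
        None => s.len(),
    };

    (start, end)
}

// Hypothetical helper: the indices survive even after the String is emptied.
fn stale_indices() -> (usize, usize, usize) {
    let mut s = String::from("hello wide world");
    let (start, end) = second_word(&s);
    s.clear();
    (start, end, s.len())
}
```

`stale_indices()` returns `(6, 10, 0)`: the two offsets still claim a word at bytes 6 to 10 of a string that is now empty, and nothing in the types ties them together.
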
Luckily, Rust has a solution to this problem: string slices.

## String slices

A string slice looks like this:

```rust
let s = String::from("hello world");

let hello = &s[0..5];
let world = &s[6..11];
```

This looks just like taking a reference to the whole `String`, but with the
extra `[0..5]` bit. Instead of being a reference to the entire `String`,
it’s a reference to an internal position in the `String`, and it also keeps
track of the number of elements that it refers to. In other words,
it looks like this:

DIAGRAM GOES HERE of s, hello, and world

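Until the diagram is available, the "position plus length" idea can be checked in code. The sketch below uses a hypothetical helper, `offset_and_len`, and assumes the slice passed in was taken from `s`; it compares raw addresses only to report where the slice begins.

```rust
// Hypothetical helper: reports where `slice` starts inside `s`, and its length.
// Assumes `slice` was created by slicing `s`.
fn offset_and_len(s: &String, slice: &str) -> (usize, usize) {
    // `as_ptr()` gives the address of the first byte each value refers to.
    let offset = slice.as_ptr() as usize - s.as_ptr() as usize;
    (offset, slice.len())
}
```

With `let s = String::from("hello world");`, calling `offset_and_len(&s, &s[6..11])` returns `(6, 5)`: the slice points six bytes into the `String`'s data and covers five bytes.
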
With Rust’s `..` syntax, if you want to start at zero, you can drop the zero.
In other words, these are equal:

```rust
let s = String::from("hello");

let slice = &s[0..2];
let slice = &s[..2];
```

By the same token, if you want the slice to run through the last element,
you can drop the trailing number. In other words, these are equal:

```rust
let s = String::from("hello");

let len = s.len();

let slice = &s[1..len];
let slice = &s[1..];
```

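One caveat worth keeping in mind: the numbers in a string slice range are byte offsets, and indexing with `[..]` panics if an offset falls in the middle of a multi-byte character. As a hedged sketch, the standard `str::get` method offers a non-panicking alternative; the wrapper name `prefix` is hypothetical.

```rust
// Hypothetical helper: a prefix slice that never panics.
fn prefix(s: &str, end: usize) -> Option<&str> {
    // `str::get` returns None when the range is out of bounds or does not
    // fall on character boundaries, instead of panicking.
    s.get(..end)
}
```

`prefix("hello", 2)` gives `Some("he")`, while `prefix("h\u{e9}llo", 2)` gives `None`, because the `é` occupies bytes 1 and 2 and the range would cut it in half.
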
With this in mind, let’s rewrite `first_word()` to return a slice:

```rust
fn first_word(s: &String) -> &str {
    let bytes = s.as_bytes();

    for (i, &byte) in bytes.iter().enumerate() {
        if byte == 32 {
            return &s[0..i];
        }
    }

    &s[..]
}
```

Now, we have a single value, the `&str`. It contains both elements that we care
about: a reference to the starting point, and the number of elements.
This would also work for a `second_word()`:

```rust,ignore
fn second_word(s: &String) -> &str {
```

Same deal. We now have a straightforward API that’s much harder to mess up.

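The text gives only the signature of the slice-returning `second_word()`. One possible body, offered as a sketch rather than a definitive implementation, is shown here; returning an empty slice when there is no second word is an assumption of this example.

```rust
// Hypothetical slice-returning `second_word`; returns "" if there is none.
fn second_word(s: &String) -> &str {
    let bytes = s.as_bytes();

    // Find the first space; the second word begins right after it.
    let start = match bytes.iter().position(|&byte| byte == 32) {
        Some(i) => i + 1,
        None => return &s[s.len()..], // Empty slice at the end of `s`.
    };

    // The second word ends at the next space, or at the end of the string.
    for (i, &byte) in bytes[start..].iter().enumerate() {
        if byte == 32 {
            return &s[start..start + i];
        }
    }

    &s[start..]
}
```

For `"hello wide world"` this returns `"wide"`. The caller receives one value tied to `s`, rather than two loose indices.
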
But what about our error condition from before? Slices also fix that. Using
the slice version of `first_word()` will produce a compile-time error:

```rust,ignore
# fn first_word(s: &String) -> &str {
#     let bytes = s.as_bytes();
#
#     for (i, &byte) in bytes.iter().enumerate() {
#         if byte == 32 {
#             return &s[0..i];
#         }
#     }
#
#     &s[..]
# }
fn main() {
    let mut s = String::from("hello world");

    let word = first_word(&s);

    s.clear(); // Error!
}
```

Here’s the error:

```text
17:6 error: cannot borrow `s` as mutable because it is also borrowed as
immutable [E0502]
s.clear(); // Error!
^
15:29 note: previous borrow of `s` occurs here; the immutable borrow prevents
subsequent moves or mutable borrows of `s` until the borrow ends
let word = first_word(&s);
^
18:2 note: previous borrow ends here
fn main() {

}
^
```

Remember the borrowing rules? If we have an immutable reference to something,
we cannot also take a mutable reference. Since `clear()` needs to truncate the
`String`, it tries to take a mutable reference, which fails. Not only has Rust
made our API easier to use, but it’s also eliminated an entire class of errors
at compile time!

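One way to keep the borrow checker happy, sketched under the assumption that we only need the word’s contents, is to copy the word into an owned `String` before calling `clear()`. The helper name `take_first_word` is hypothetical. Note also that newer compilers end a borrow at its last use, so they report this error only when `word` is actually used after `s.clear()`.

```rust
fn first_word(s: &String) -> &str {
    let bytes = s.as_bytes();

    for (i, &byte) in bytes.iter().enumerate() {
        if byte == 32 {
            return &s[0..i];
        }
    }

    &s[..]
}

// Hypothetical helper: copies out the first word, then empties `s`.
fn take_first_word(s: &mut String) -> String {
    // `to_string()` copies the bytes, so `word` no longer borrows from `s`.
    let word = first_word(s).to_string();

    // No immutable borrow is alive here, so the mutable borrow succeeds.
    s.clear();

    word
}
```

After `let mut s = String::from("hello world");`, calling `take_first_word(&mut s)` returns `"hello"` and leaves `s` empty. The copy costs an allocation, which is the price of keeping the word around after the original is cleared.
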
### String literals are slices

Remember how we talked about string literals being stored inside of the binary
itself? Now that we know about slices, we can properly understand string
literals.

```rust
let s = "Hello, world!";
```

The type of `s` here is `&str`: it’s a slice, pointing to that specific point
of the binary. This is also why string literals are immutable; `&str` is an
immutable reference.

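As a quick check of that claim, the sketch below writes the type annotation out explicitly and slices the literal again. The function name `literal_demo` is hypothetical and exists only to wrap the example.

```rust
// Hypothetical wrapper showing that a literal is already a `&str`.
fn literal_demo() -> (usize, bool) {
    // The annotation compiles because the literal's type is `&str`.
    let s: &str = "Hello, world!";

    // Slicing a `&str` gives another `&str` into the same data.
    let hello: &str = &s[0..5];

    (hello.len(), hello == "Hello")
}
```

`literal_demo()` returns `(5, true)`. No `String` is created anywhere: both `s` and `hello` refer to bytes stored in the binary.
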
## String slices as arguments

Knowing that you can take slices of both literals and `String`s leads us to
one more improvement on `first_word()`, and that’s its signature:

```rust,ignore
fn first_word(s: &String) -> &str {
```

A more experienced Rustacean would write this one instead:

```rust,ignore
fn first_word(s: &str) -> &str {
```

Why is this? Well, we only ever read from `s`. We can take
a string slice that’s the full length of a `String`, so we haven’t lost
the ability to talk about full `String`s. Additionally, we can take
string slices of string literals too, so this function is more general,
with no loss of functionality:

```rust
# fn first_word(s: &str) -> &str {
#     let bytes = s.as_bytes();
#
#     for (i, &byte) in bytes.iter().enumerate() {
#         if byte == 32 {
#             return &s[0..i];
#         }
#     }
#
#     &s[..]
# }
fn main() {
    let s = String::from("hello world");
    let word = first_word(&s[..]);

    let s = "hello world";
    let word = first_word(&s[..]);

    let word = first_word(s); // since literals are &strs, this works too!
}
```

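A related convenience: when a function expects `&str`, Rust converts a `&String` argument automatically (deref coercion), so writing `&s[..]` is optional. The sketch below shows this; `first_word_of_owned` is a hypothetical wrapper added only to make the coercion visible.

```rust
fn first_word(s: &str) -> &str {
    let bytes = s.as_bytes();

    for (i, &byte) in bytes.iter().enumerate() {
        if byte == 32 {
            return &s[0..i];
        }
    }

    &s[..]
}

// Hypothetical wrapper: takes a `&String` and passes it straight through.
fn first_word_of_owned(s: &String) -> &str {
    // `s` is a `&String`; it is coerced to `&str` at this call.
    first_word(s)
}
```

With `let s = String::from("hello world");`, both `first_word(&s)` and `first_word_of_owned(&s)` return `"hello"`, the same result as `first_word(&s[..])`.
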
## Other slices

String slices, as you might imagine, are specific to strings. But there’s a more
general slice type, too. Consider arrays:

```rust
let a = [1, 2, 3, 4, 5];
```

Just like we may want to refer to a part of a string, we may want to refer to
part of an array:

```rust
let a = [1, 2, 3, 4, 5];

let slice = &a[1..3];
```

This slice has the type `&[i32]`. It works exactly the same way as string slices
do, storing a reference to the first element and a length. You’ll use this kind
of slice for all sorts of other collections. We’ll discuss these other slices
in detail when we talk about vectors, in Chapter 9.1.

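As with `&str`, taking a `&[i32]` parameter lets one function accept part of an array, a whole array, or part of another collection. Here is a minimal sketch; the function name `sum` is hypothetical.

```rust
// Hypothetical helper: adds up any contiguous run of i32 values.
fn sum(values: &[i32]) -> i32 {
    let mut total = 0;

    // `&value` copies each i32 out of the reference the iterator yields.
    for &value in values.iter() {
        total += value;
    }

    total
}
```

With `let a = [1, 2, 3, 4, 5];`, `sum(&a[1..3])` is 5 and `sum(&a)` is 15. The same function also accepts a slice of a vector, such as `sum(&v[1..])`.
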