Mirror of https://github.com/rust-lang-cn/book-cn.git (synced 2025-01-23 23:50:25 +08:00)

Addressing steve nits

parent 701855e165
commit d26cdc7508

@@ -49,9 +49,10 @@ change, we can store data on the heap instead. The heap is less organized: when
 we put data on the heap, we ask for some amount of space. The operating system
 finds an empty spot somewhere in the heap that is big enough, marks it as being
 in use, and returns to us a pointer to that location. This process is called
-*allocating on the heap*. Since the pointer is a known, fixed size, we can
-store the pointer on the stack, but when we want the actual data, we have to
-follow the pointer.
+*allocating on the heap*, and sometimes we just say "allocating" for short.
+Pushing values onto the stack is not considered allocating. Since the pointer
+is a known, fixed size, we can store the pointer on the stack, but when we want
+the actual data, we have to follow the pointer.
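
The point about fixed-size pointers can be sketched with `Box`, which is assumed here as the simplest standard-library way to put a single value on the heap. The helper name `boxed_sum` is hypothetical and exists only for illustration.

```rust
use std::mem::size_of;

// Hypothetical helper: stores both inputs on the heap, then reads them back.
fn boxed_sum(a: i32, b: i32) -> i32 {
    let x: Box<i32> = Box::new(a); // `a` now lives on the heap; `x` is a pointer on the stack
    let y: Box<i32> = Box::new(b);
    *x + *y // dereferencing follows each pointer to the heap data
}

fn main() {
    // The pointer has the same size no matter how large the pointed-to value is.
    assert_eq!(size_of::<Box<i32>>(), size_of::<usize>());
    assert_eq!(size_of::<Box<[u8; 4096]>>(), size_of::<usize>());
    println!("{}", boxed_sum(2, 3));
}
```

Both boxes are freed automatically when `boxed_sum` returns.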

 Think of being seated at a restaurant. When you enter, you say how many people
 are in your group, and the staff finds an empty table that would fit everyone
@@ -82,14 +83,14 @@ examples that will illustrate the rules:

 > 1. Each value in Rust has a variable binding that’s called its *owner*.
 > 2. There can only be one owner at a time.
-> 3. When the owner goes out of scope, the value will be `drop()`ped.
+> 3. When the owner goes out of scope, the value will be dropped.

 ### Variable Binding Scope

 We've walked through an example of a Rust program already in the tutorial
 chapter. Now that we’re past basic syntax, we won’t include all of the `fn
 main() {` stuff in examples, so if you’re following along, you will have to put
-the following examples inside of a `main()` function yourself. This lets our
+the following examples inside of a `main` function yourself. This lets our
 examples be a bit more concise, letting us focus on the actual details rather
 than boilerplate.

@@ -135,11 +136,10 @@ data types provided by the standard library and that you create. We'll go into
 more depth about `String` specifically in Chapter XX.

 We've already seen string literals, where a string value is hard-coded into our
-program. String literals are on the stack, because our source code is actually
-the first thing that goes onto our stack. String literals are convenient, but
-they aren’t always suitable for every situation you want to use text. For one
-thing, they’re immutable. For another, not every string value can be known when
-we write our code: what if we want to take user input and store it?
+program. String literals are convenient, but they aren’t always suitable for
+every situation you want to use text. For one thing, they’re immutable. For
+another, not every string value can be known when we write our code: what if we
+want to take user input and store it?

 For things like this, Rust has a second string type, `String`. This type is
 allocated on the heap, and as such, is able to store an amount of text that is
@@ -151,8 +151,8 @@ let s = String::from("hello");
 ```

 The double colon (`::`) is an operator that allows us to namespace this
-particular `from()` function under the `String` type itself, rather than using
-some sort of name like `string_from()`. We’ll discuss this syntax more in the
+particular `from` function under the `String` type itself, rather than using
+some sort of name like `string_from`. We’ll discuss this syntax more in the
 “Method Syntax” and “Modules” chapters.

 This kind of string *can* be mutated:
@@ -170,11 +170,11 @@ cannot? The difference comes down to how these two types deal with memory.
 ### Memory and Allocation

 In the case of a string literal, because we know the contents at compile time,
-the text is hard-coded directly into the final executable and stored with the
-code on the stack. This makes string literals quite fast and efficient. But
-these properties only come from its immutability. Unfortunately, we can’t put a
-blob of memory into the binary for each piece of text whose size is unknown at
-compile time and whose size might change over the course of running the program.
+the text is hard-coded directly into the final executable. This makes string
+literals quite fast and efficient. But these properties only come from its
+immutability. Unfortunately, we can’t put a blob of memory into the binary for
+each piece of text whose size is unknown at compile time and whose size might
+change over the course of running the program.

 With the `String` type, in order to support a mutable, growable piece of text,
 we need to allocate an amount of memory on the heap, unknown at compile time,
@@ -184,7 +184,7 @@ to hold the contents. This means two things:
 2. We need a way of giving this memory back to the operating system when we’re
 done with our `String`.

-That first part is done by us: when we call `String::from()`, its
+That first part is done by us: when we call `String::from`, its
 implementation requests the memory it needs. This is pretty much universal in
 programming languages.

@@ -196,7 +196,7 @@ call code to explicitly return it, just as we did to request it. Doing this
 correctly has historically been a difficult problem in programming. If we
 forget, we will waste memory. If we do it too early, we will have an invalid
 variable. If we do it twice, that’s a bug too. We need to pair exactly one
-`allocate()` with exactly one `free()`.
+`allocate` with exactly one `free`.

 Rust takes a different path: the memory is automatically returned once the
 binding to it goes out of scope. Here’s a version of our scope example from
@@ -213,11 +213,11 @@ earlier using `String`:
 There is a natural point at which we can return the memory our `String` needs
 back to the operating system: when `s` goes out of scope. When a variable
 binding goes out of scope, Rust calls a special function for us. This function
-is called `drop()`, and it is where the author of `String` can put the code to
-return the memory. Rust calls `drop()` automatically at the closing `}`.
+is called `drop`, and it is where the author of `String` can put the code to
+return the memory. Rust calls `drop` automatically at the closing `}`.
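
To make the timing of `drop` visible, here is a minimal sketch using a hypothetical `Tracker` type (not part of the standard library or the book) whose `Drop` implementation flips a flag. It assumes nothing beyond `std::cell::Cell`.

```rust
use std::cell::Cell;

// Hypothetical type that records when it is dropped.
struct Tracker<'a> {
    dropped: &'a Cell<bool>,
}

impl<'a> Drop for Tracker<'a> {
    fn drop(&mut self) {
        self.dropped.set(true);
    }
}

// Returns (flag while the binding is in scope, flag after the closing brace).
fn observe_drop() -> (bool, bool) {
    let flag = Cell::new(false);
    let inside;
    {
        let _t = Tracker { dropped: &flag };
        inside = flag.get(); // `_t` is still alive, so the flag is unset
    } // `_t` goes out of scope here and Rust calls `drop`
    (inside, flag.get())
}

fn main() {
    assert_eq!(observe_drop(), (false, true));
}
```

`String` relies on the same hook to return its heap buffer.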

-> Note: This pattern is sometimes called “Resource Acquisition Is
-> Initialization” in C++, or “RAII” for short. While they are very similar,
+> Note: This pattern is sometimes called *Resource Acquisition Is
+> Initialization* in C++, or RAII for short. While they are very similar,
 > Rust’s take on this concept has a number of differences, so we don’t tend
 > to use the same term. If you’re familiar with this idea, keep in mind that it
 > is _roughly_ similar in Rust, but not identical.
@@ -239,7 +239,7 @@ let y = x;

 We can probably guess what this is doing based on our experience with other
 languages: “Bind the value `5` to `x`, then make a copy of the value in `x` and
-bind it to `y`”. We now have two bindings, `x` and `y`, and both equal `5`.
+bind it to `y`.” We now have two bindings, `x` and `y`, and both equal `5`.
 This is indeed what is happening since integers are simple values with a known,
 fixed size, and these two `5` values are pushed onto the stack.

@@ -252,10 +252,7 @@ let s2 = s1;

 This looks very similar to the previous code, so we might assume that the way
 it works would be the same: that the second line would make a copy of the value
-in `s1` and bind it to `s2`. This isn't quite what happens: `String` values are
-stored on the heap, so Rust's ownership rules apply here so that Rust ensures
-we don't have any of the bugs we mentioned before that are common around
-cleaning up memory.
+in `s1` and bind it to `s2`. This isn't quite what happens.

 To explain this more thoroughly, let’s take a look at what `String` looks like
 under the covers in Figure 4-1. A `String` is made up of three parts, shown on
@@ -292,10 +289,10 @@ Figure 4-3: Another possibility for what `s2 = s1` might do, if Rust chose to
 copy heap data as well.

 Earlier, we said that when a binding goes out of scope, Rust will automatically
-call the `drop()` function and clean up the heap memory for that binding. But
+call the `drop` function and clean up the heap memory for that binding. But
 in figure 4-2, we see both data pointers pointing to the same location. This is
 a problem: when `s2` and `s1` go out of scope, they will both try to free the
-same memory. This is known as a "double free" error and is one of the memory
+same memory. This is known as a *double free* error and is one of the memory
 safety bugs we mentioned before. Freeing memory twice can lead to memory
 corruption, which can potentially lead to security vulnerabilities.
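
In practice Rust avoids the double free by treating `let s2 = s1;` as a move: only `s2` owns the heap data afterwards, so only one `drop` runs. The sketch below illustrates this; the function name `move_string` is hypothetical.

```rust
// Hypothetical helper showing that ownership transfers on assignment.
fn move_string() -> String {
    let s1 = String::from("hello");
    let s2 = s1; // ownership of the heap data moves from `s1` to `s2`
    // println!("{}", s1); // uncommenting this fails to compile: `s1` was moved
    s2 // the single owner is handed to the caller
}

fn main() {
    println!("{}", move_string());
}
```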

@@ -344,14 +341,14 @@ copying can be assumed to be inexpensive.
 #### Ways Bindings and Data Interact: Clone

 If we _do_ want to deeply copy the `String`’s data and not just the `String`
-itself, there’s a common method for that: `clone()`. We will discuss methods in
+itself, there’s a common method for that: `clone`. We will discuss methods in
 the section on [`structs` in Chapter XX][structs]<!-- ignore -->, but they’re a
 common enough feature in many programming languages that you have probably seen
 them before.

 [structs]: ch05-01-structs.html

-Here’s an example of the `clone()` method in action:
+Here’s an example of the `clone` method in action:

 ```rust
 let s1 = String::from("hello");
@@ -363,7 +360,7 @@ println!("s1 = {}, s2 = {}", s1, s2);
 This will work just fine, and this is how you can explicitly get the behavior
 we showed in Figure 4-3, where the heap data *does* get copied.

-When you see a call to `clone()`, you know that some arbitrary code is being
+When you see a call to `clone`, you know that some arbitrary code is being
 executed, and that code may be expensive. It’s a visual indicator that something
 different is going on here.
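
One way to check that `clone` really copies the heap data is to compare the two buffers' addresses. This is a small sketch; the helper name `clone_copies_heap_data` is hypothetical.

```rust
// Hypothetical check: equal contents, but two distinct heap buffers.
fn clone_copies_heap_data() -> bool {
    let s1 = String::from("hello");
    let s2 = s1.clone();
    // `as_ptr` returns the address of each string's heap buffer.
    s1 == s2 && s1.as_ptr() != s2.as_ptr()
}

fn main() {
    assert!(clone_copies_heap_data());
}
```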

@@ -380,21 +377,20 @@ println!("x = {}, y = {}", x, y);
 ```

 This seems to contradict what we just learned: we don't have a call to
-`clone()`, but `x` is still valid, and wasn't moved into `y`.
+`clone`, but `x` is still valid, and wasn't moved into `y`.

 This is because types like integers that have a known size at compile time are
-stored entirely on the stack, do not ask for heap memory from the operating
-system, and therefore do not need to be `drop()`ped when they go out of scope.
+stored entirely on the stack, so copies of the actual values are quick to make.
 That means there's no reason we would want to prevent `x` from being valid
 after we create the binding `y`. In other words, there’s no difference between
-deep and shallow copying here, so calling `clone()` wouldn’t do anything
+deep and shallow copying here, so calling `clone` wouldn’t do anything
 differently from the usual shallow copying and we can leave it out.

 Rust has a special annotation called the `Copy` trait that we can place on
 types like these (we'll talk more about traits in Chapter XX). If a type has
 the `Copy` trait, an older binding is still usable after assignment. Rust will
 not let us annotate a type with the `Copy` trait if the type, or any of its
-parts, has implemented `drop()`. If the type needs something special to happen
+parts, has implemented `drop`. If the type needs something special to happen
 when the value goes out of scope and we add the `Copy` annotation to that type,
 we will get a compile-time error.
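
As a rough illustration of the `Copy` annotation on a user-defined type, consider the sketch below. The `Point` struct and `copy_point` function are hypothetical names chosen for this example.

```rust
// Hypothetical stack-only type. `Copy` requires `Clone` to be implemented as well.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Point {
    x: i32,
    y: i32,
}

// Adding `impl Drop for Point { ... }` alongside the `Copy` derive above
// would be rejected by the compiler.

fn copy_point() -> (Point, Point) {
    let a = Point { x: 1, y: 2 };
    let b = a; // `a` is copied, not moved
    (a, b) // so `a` is still usable here
}

fn main() {
    let (a, b) = copy_point();
    println!("{:?} {:?}", a, b);
}
```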

@@ -435,7 +431,7 @@ fn main() {

 fn takes_ownership(some_string: String) { // some_string comes into scope.
     println!("{}", some_string);
-} // Here, some_string goes out of scope and `drop()` is called. The backing
+} // Here, some_string goes out of scope and `drop` is called. The backing
   // memory is freed.

 fn makes_copy(some_integer: i32) { // some_integer comes into scope.
@@ -443,7 +439,7 @@ fn makes_copy(some_integer: i32) { // some_integer comes into scope.
 } // Here, some_integer goes out of scope. Nothing special happens.
 ```

-If we tried to use `s` after the call to `takes_ownership()`, Rust
+If we tried to use `s` after the call to `takes_ownership`, Rust
 would throw a compile-time error. These static checks protect us from mistakes.
 Try adding code to `main` that uses `s` and `x` to see where you can use them
 and where the ownership rules prevent you from doing so.
@@ -488,7 +484,7 @@ fn takes_and_gives_back(a_string: String) -> String { // a_string comes into sco
 It’s the same pattern, every time: assigning a value to another binding moves
 it, and when heap data values' bindings go out of scope, if the data hasn’t
 been moved to be owned by another binding, the value will be cleaned up by
-`drop()`.
+`drop`.

 Taking ownership then returning ownership with every function is a bit tedious.
 What if we want to let a function use a value but not take ownership? It’s

@@ -2,8 +2,8 @@

 The issue with the tuple code at the end of the last section is that we have to
 return the `String` back to the calling function so that we can still use the
-`String` after the call to `calculate_length()`, since the `String` was moved
-into `calculate_length()`.
+`String` after the call to `calculate_length`, since the `String` was moved
+into `calculate_length`.

 Here is how you would use a function without taking ownership of it using
 *references:*
@@ -28,7 +28,7 @@ fn calculate_length(s: &String) -> usize {

 First, you’ll notice all of the tuple stuff in the binding declaration and the
 function return value is gone. Next, note that we pass `&s1` into
-`calculate_length()`, and in its definition, we take `&String` rather than
+`calculate_length`, and in its definition, we take `&String` rather than
 `String`.

 These `&`s are *references*, and they allow you to refer to some value
@@ -159,10 +159,11 @@ something that new Rustaceans struggle with, because most languages let you
 mutate whenever you’d like. The benefit of having this restriction is that Rust
 can prevent data races at compile time. A *data race* is a particular type of
 race condition where two or more pointers access the same data at the same
-time, and at least one of the pointers is being used to write to the data. Data
-races cause unpredictable behavior and can be difficult to diagnose and fix
-when trying to track them down at runtime; Rust prevents this problem from
-happening since it won't even compile code with data races!
+time, at least one of the pointers is being used to write to the data, and
+there's no mechanism being used to synchronize access to the data. Data races
+cause undefined behavior and can be difficult to diagnose and fix when trying
+to track them down at runtime; Rust prevents this problem from happening since
+it won't even compile code with data races!

 As always, we can use `{}`s to create a new scope, allowing for multiple mutable
 references, just not _simultaneous_ ones:
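
A minimal sketch of that scoping pattern follows. The function name `append_twice` is hypothetical, and the example only assumes `String::push_str` and `String::push`.

```rust
// Hypothetical helper: two mutable references to the same String,
// one after the other rather than at the same time.
fn append_twice() -> String {
    let mut s = String::from("hello");
    {
        let r1 = &mut s;
        r1.push_str(", world");
    } // `r1` goes out of scope here, ending the first mutable borrow

    let r2 = &mut s; // a second mutable borrow is now fine
    r2.push('!');
    s
}

fn main() {
    println!("{}", append_twice());
}
```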
@@ -217,11 +218,13 @@ to track down why sometimes your data isn't what you thought it should be.

 ### Dangling References

-In languages with pointers, it's easy to make the error of creating a “dangling
-pointer” by freeing some memory while keeping around a pointer to that memory.
-In Rust, by contrast, the compiler guarantees that references will never be
-dangling: if we have a reference to some data, the compiler will ensure that
-the data will not go out of scope before the reference to the data does.
+In languages with pointers, it's easy to make the error of creating a *dangling
+pointer*, a pointer referencing a location in memory that may have been given
+to someone else, by freeing some memory while keeping around a pointer to that
+memory. In Rust, by contrast, the compiler guarantees that references will
+never be dangling: if we have a reference to some data, the compiler will
+ensure that the data will not go out of scope before the reference to the data
+does.

 Let’s try to create a dangling reference:

@@ -262,7 +265,7 @@ a problem: `this function’s return type contains a borrowed value, but there i
 no value for it to be borrowed from`.

 Let’s have a closer look at exactly what's happening at each stage of our
-`dangle()` code:
+`dangle` code:

 ```rust,ignore
 fn dangle() -> &String { // dangle returns a reference to a String
@@ -274,7 +277,7 @@ fn dangle() -> &String { // dangle returns a reference to a String
   // Danger!
 ```

-Because `s` is created inside of `dangle()`, when the code of `dangle()` is
+Because `s` is created inside of `dangle`, when the code of `dangle` is
 finished, it will be deallocated. But we tried to return a reference to it.
 That means this reference would be pointing to an invalid `String`! That’s
 no good. Rust won’t let us do this.
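
One way around the problem is to return the `String` itself instead of a reference, so that ownership moves out to the caller. The sketch below assumes a hypothetical function name, `no_dangle`.

```rust
// Hypothetical fix: return the owned String rather than a reference to it.
fn no_dangle() -> String {
    let s = String::from("hello");
    s // ownership moves to the caller, so nothing is deallocated here
}

fn main() {
    let owned = no_dangle();
    println!("{}", owned);
}
```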

@@ -44,15 +44,15 @@ let bytes = s.as_bytes();

 Since we need to go through the String element by element and
 check if a value is a space, we will convert our String to an
-array of bytes using the `.as_bytes()` method.
+array of bytes using the `as_bytes` method.

 ```rust,ignore
 for (i, &item) in bytes.iter().enumerate() {
 ```

 We will be discussing iterators in more detail in Chapter XX, but for now, know
-that `iter()` is a method that returns each element in a collection, and
-`enumerate()` modifies the result of `iter()` and returns each element as part
+that `iter` is a method that returns each element in a collection, and
+`enumerate` modifies the result of `iter` and returns each element as part
 of a tuple instead, where the first element of the tuple is the index, and the
 second element is a reference to the element itself. This is a bit nicer than
 calculating the index ourselves.
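
Putting `as_bytes`, `iter`, and `enumerate` together, a self-contained sketch might look like the following. The name `first_space_index` is hypothetical, and falling back to the string's length when no space is found is an assumption made for this example.

```rust
// Hypothetical helper: index of the first space, or the string's length if none.
fn first_space_index(s: &str) -> usize {
    let bytes = s.as_bytes();

    // `enumerate` yields (index, &byte); the `&item` pattern removes the reference.
    for (i, &item) in bytes.iter().enumerate() {
        if item == b' ' {
            return i;
        }
    }

    s.len()
}

fn main() {
    println!("{}", first_space_index("hello world"));
}
```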
@@ -78,7 +78,7 @@ string, but there’s a problem. We’re returning a `usize` on its own, but it
 only a meaningful number in the context of the `&String`. In other words,
 because it’s a separate value from the `String`, there’s no guarantee that it
 will still be valid in the future. Consider this program that uses this
-`first_word()` function:
+`first_word` function:

 Filename: src/main.rs

@@ -113,7 +113,7 @@ so `word` still contains the value `5`. We could use that `5` with `s` to try
 to extract the first word out, but this would be a bug since the contents of
 `s` have changed since we saved `5` in `word`.

-This is bad! It’s even worse if we wanted to write a `second_word()`
+This is bad! It’s even worse if we wanted to write a `second_word`
 function. Its signature would have to look like this:

 ```rust,ignore
@@ -188,7 +188,7 @@ let slice = &s[0..len];
 let slice = &s[..];
 ```

-With this in mind, let’s re-write `first_word()` to return a slice. The type
+With this in mind, let’s re-write `first_word` to return a slice. The type
 that signifies "string slice" is written as `&str`:

 Filename: src/main.rs
@@ -212,11 +212,11 @@ for the first occurrence of a space. When we find a space, we return a string
 slice using the start of the string and the index of the space as the starting
 and ending indices.

-Now when we call `first_word()`, we get back a single value that is tied to the
+Now when we call `first_word`, we get back a single value that is tied to the
 underlying data. The value is made up of a reference to the starting point of
 the slice and the number of elements in the slice.

-Returning a slice would also work for a `second_word()` function:
+Returning a slice would also work for a `second_word` function:

 ```rust,ignore
 fn second_word(s: &String) -> &str {
@@ -225,10 +225,10 @@ fn second_word(s: &String) -> &str {
 We now have a straightforward API that’s much harder to mess up. Remember our
 bug from before, when we got the first word but then cleared the string so that
 our first word was invalid? That code was logically incorrect but didn't show
-any immediate errors-- the problems would show up later, if we kept trying to
+any immediate errors. The problems would show up later, if we kept trying to
 use the first word index with an emptied string. Slices make this bug
 impossible, and let us know we have a problem with our code much sooner. Using
-the slice version of `first_word()` will throw a compile time error:
+the slice version of `first_word` will throw a compile time error:

 Filename: src/main.rs

@@ -261,7 +261,7 @@ fn main() {
 ```

 Remember from the borrowing rules that if we have an immutable reference to
-something, we cannot also take a mutable reference. Since `clear()` needs to
+something, we cannot also take a mutable reference. Since `clear` needs to
 truncate the `String`, it tries to take a mutable reference, which fails. Not
 only has Rust made our API easier to use, but it’s also eliminated an entire
 class of errors at compile time!
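
For contrast, here is a sketch of an ordering the borrow checker does accept: finish using the slice first, then call `clear`. The `first_word` body is a plausible reconstruction based on the description in this section, and `word_len_then_clear` is a hypothetical name.

```rust
// Assumed shape of the slice-returning `first_word` described in this section.
fn first_word(s: &String) -> &str {
    for (i, &item) in s.as_bytes().iter().enumerate() {
        if item == b' ' {
            return &s[..i];
        }
    }
    &s[..]
}

// Hypothetical helper: uses the slice, lets the borrow end, then mutates.
fn word_len_then_clear() -> usize {
    let mut s = String::from("hello world");
    let len = {
        let word = first_word(&s); // immutable borrow of `s` starts here
        word.len()
    }; // `word` is gone, so the immutable borrow has ended
    s.clear(); // the mutable borrow is now allowed
    len + s.len() // `s` is empty at this point
}

fn main() {
    println!("{}", word_len_then_clear());
}
```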
@@ -283,7 +283,7 @@ immutable reference.
 #### String Slices as Arguments

 Knowing that you can take slices of both literals and `String`s leads us to
-one more improvement on `first_word()`, and that’s its signature:
+one more improvement on `first_word`, and that’s its signature:

 ```rust,ignore
 fn first_word(s: &String) -> &str {