Edits to the intro and Closures section

Carol (Nichols || Goulding) 2017-01-17 22:11:06 -05:00
parent 7ce1c8fc72
commit 7cb1782fc8


@ -1,216 +1,341 @@
# Functional Language features in Rust - Iterators and Closures
As a language, Rust takes influence from a lot of places. One of those places
is functional programming. We won't use this chapter to debate what exactly
'functional programming' is, but instead, show off some features of Rust that
are similar to many languages that are referred to as functional.
Rust's design has taken inspiration from a lot of previous work. One of Rust's
influences is functional programming, where functions are values that can be
used as arguments or return values to other functions, assigned to variables,
and so forth. We're going to sidestep the issue of what, exactly, functional
programming is or is not, and instead show off some features of Rust that
are similar to features in many languages referred to as functional.
More specifically, we're going to cover:
* Closures, a function-like construct you can store in a binding.
* Iterators, a way of processing series of elements.
* Using these features to improve upon the project from the last chapter.
* A bit about these features' performance. Spoiler alert: they're faster than
you might think!
* *Closures*, a function-like construct you can store in a variable.
* *Iterators*, a way of processing series of elements.
* How to use these features to improve upon the project from the last chapter.
* The performance of these features. Spoiler alert: they're faster than you
might think!
This is not a complete list of Rust's influence from the functional style;
This is not a complete list of Rust's influence from the functional style:
pattern matching, enums, and many other features are too. But mastering
closures and iterators will help you write idiomatic, fast Rust code.
closures and iterators are an important part of writing idiomatic, fast Rust
code.
## Closures
Rust gives you the ability to define *closures*, which are sort of like
functions. Instead of giving you a technical definition, let's dive into
what closures look like, syntactically:
Rust gives you the ability to define *closures*, which are similar to
functions. Instead of starting with a technical definition, let's see what
closures look like, syntactically, and then we'll return to defining what they
are. Listing 13-1 shows a small closure whose definition is assigned to the
variable `add_one`, which we can then use to call the closure:
<figure>
<span class="filename">Filename: src/main.rs</span>
```rust
fn main() {
let add_one = |x| x + 1;
let five = add_one(4);
assert_eq!(5, five);
}
```
<figcaption>
Listing 13-1: A closure that takes one parameter and adds one to it, assigned to
the variable `add_one`
</figcaption>
</figure>
The closure definition, on the first line, shows that the closure takes one
parameter named `x`. Parameters to closures go in between vertical pipes (`|`).
This is a minimal closure with only one expression as its body. Listing 13-2 has
a closure with a bit more complexity:
<figure>
<span class="filename">Filename: src/main.rs</span>
```rust
fn main() {
let calculate = |a, b| {
let mut result = a * 2;
result += b;
result
};
assert_eq!(7, calculate(2, 3)); // 2 * 2 + 3 == 7
assert_eq!(13, calculate(4, 5)); // 4 * 2 + 5 == 13
}
```
<figcaption>
Listing 13-2: A closure with two parameters and multiple expressions in its body
</figcaption>
</figure>
We can use curly brackets to define a closure body with more than one
expression.
You'll notice a few things about closures that are different from functions
defined with the `fn` keyword. The first difference is that we did not need to
annotate the types of the parameters the closure takes or the value it returns.
We can choose to add type annotations if we want; Listing 13-3 shows the
closure from Listing 13-1 with annotations for the parameter's and return
value's types:
<figure>
<span class="filename">Filename: src/main.rs</span>
```rust
fn main() {
let add_one = |x: i32| -> i32 { x + 1 };
assert_eq!(2, add_one(1));
}
```
<figcaption>
Listing 13-3: A closure definition with optional parameter and return value
type annotations
</figcaption>
</figure>
The syntax of closures and functions looks more similar with type annotations.
Let's compare the different ways we can specify closures with the syntax for
defining a function more directly. We've added some spaces here to line up the
relevant parts:
```rust
fn add_one_v1 (x: i32) -> i32 { x + 1 } // a function
let add_one_v2 = |x: i32| -> i32 { x + 1 }; // the full syntax for a closure
let add_one_v3 = |x: i32| { x + 1 }; // a closure eliding types
let add_one_v4 = |x: i32| x + 1 ; // without braces
```
The reason type annotations are not required for defining a closure but are
required for defining a function is that functions are part of an explicit
interface exposed to your users, so defining this interface rigidly is
important for ensuring that everyone agrees on what types of values a function
uses and returns. Closures aren't used in an exposed interface like this,
though: they're stored in bindings and called directly. Being forced to
annotate the types would be a significant ergonomic loss for little advantage.
Closure definitions do have one type inferred for each of their parameters and
for their return value. For instance, if we call the closure without type
annotations from Listing 13-1 using an `i8`, we'll get an error if we then try
to call the same closure with an `i32`:
<span class="filename">Filename: src/main.rs</span>
```rust
let add_one = |x| x + 1;
let five = add_one(4);
let five = add_one(4i8);
assert_eq!(5i8, five);
assert_eq!(5, five);
let three = add_one(2i32);
```
The first line defines a closure, and binds it to the `add_one` variable. The
arguments to the closure go in between the pipes (`|`).
The compiler gives us this error:
This is a simple closure with only one expression as its body. Let's see one
that's got a bit more going on:
```rust
let calculate = |a, b| {
let mut result = a * 2;
result += b;
result
};
assert_eq!(7, calculate(2, 3)); // 2 * 2 + 3 == 7
assert_eq!(13, calculate(4, 5)); // 4 * 2 + 5 == 13
```text
error[E0308]: mismatched types
-->
|
7 | let three = add_one(2i32);
| ^^^^ expected i8, found i32
```
We can use `{}`s to give a closure a body with more than one expression.
Because closures are called directly, their types can be inferred reliably, so
it would be tedious if we were required to annotate their types.
You'll notice a few things about closures that are a bit different from
functions defined with `fn`. The first is that we did not need to annotate the
types of arguments the closure takes or the values it returns. We can:
Another reason to have a different syntax from functions for closures is that
they have different behavior than functions: closures possess an *environment*.
```rust
let plus_one = |x: i32| -> i32 { x + 1 };
assert_eq!(2, plus_one(1));
```
But we don't have to. Why is this? Functions are part of an explicit interface
that's exposed to your users, so defining this interface rigidly is helpful.
But closures aren't used like this: they're stored in bindings and called
directly. Being forced to annotate the types would be a significant ergonomic
loss for little advantage.
The syntax is similar, but a bit different. Let's compare more directly. We've
added some spaces here to line up the relevant parts:
```rust
fn plus_one_v1 (x: i32) -> i32 { x + 1 } // a function
let plus_one_v2 = |x: i32| -> i32 { x + 1 }; // the full syntax for a closure
let plus_one_v3 = |x: i32| { x + 1 }; // a closure eliding types
let plus_one_v4 = |x: i32| x + 1 ; // without braces
```
Small differences, but they're similar. Why come up with a different syntax
for closures? There's one additional way in which they're different from
functions: they possess an 'environment'.
## Closures and their environment
### Closures Can Reference Their Environment
We've learned that functions can only use variables that are in scope, either
by being static or being declared as parameters. But closures can do more.
They can access variables from their enclosing scope. Like this:
by being static or being declared as parameters. Closures can do more: they're
allowed to access variables from their enclosing scope. Listing 13-4 has an
example of a closure in the variable `equal_to_x` that uses the variable `x`
from the closure's surrounding environment:
<figure>
<span class="filename">Filename: src/main.rs</span>
```rust
fn main() {
let x = 4;
let equal_to_x = |z| z == x;
let y = 4;
assert!(equal_to_x(y));
}
```
Here, even though `x` is not an argument to `equal_to_x`, it's able to
refer to it, as it's a variable defined in the same scope. We couldn't
do this with a `fn`. If we tried...
<figcaption>
Listing 13-4: Example of a closure that refers to a variable in its enclosing
scope
</figcaption>
</figure>
Here, even though `x` is not one of the parameters of `equal_to_x`, the
`equal_to_x` closure is allowed to use `x`, since `x` is a variable defined in
the same scope in which `equal_to_x` is defined. We aren't allowed to do the same
thing that Listing 13-4 does with functions; let's see what happens if we try:
<span class="filename">Filename: src/main.rs</span>
```rust,ignore
fn main() {
let x = 4;
fn equal_to_x(z) { z == x }
fn equal_to_x(z: i32) -> bool { z == x }
let y = 4;
assert!(equal_to_x(y));
}
```
We'd get an error:
We get an error:
```text
error: can't capture dynamic environment in a fn item; use the || { ... }
closure form instead [E0434]
z == x
^
error[E0434]: can't capture dynamic environment in a fn item; use the || { ... }
closure form instead
-->
|
4 | fn equal_to_x(z: i32) -> bool { z == x }
| ^
```
This only works with closures! This property is also subject to all of the
usual rules around ownership and borrowing. Because closures attempt to infer
the types of their arguments, they also have to infer how they're borrowed.
They'll do that from how they are used. Consider this example:
The compiler even reminds us that this only works with closures!
### Closures, Ownership, and Borrowing
The property of being allowed to use variables from the surrounding scope is
also subject to all of the usual rules around ownership and borrowing. Since
closures attempt to infer the types of their parameters, they also infer how
those parameters are borrowed. Closures make that inference by looking at how
they are used. Consider the example in Listing 13-5 that has functions that
borrow immutably, borrow mutably, and move their parameters, then closures that
reference values from their environment and call each of the functions. We'll
see how this affects inference of when a value is borrowed:
<figure>
<span class="filename">Filename: src/main.rs</span>
```rust
#[derive(Debug)]
struct Foo;
fn borrow(f: &Foo) {
println!("Took foo by reference.");
fn borrows(f: &Foo) {
println!("Took {:?} by reference.", f);
}
fn borrow_mut(f: &mut Foo) {
println!("Took foo by mutable reference.");
fn borrows_mut(f: &mut Foo) {
println!("Took {:?} by mutable reference.", f);
}
fn moves(f: Foo) {
println!("Took foo by ownership.");
println!("Took ownership of {:?}.", f);
}
let f = Foo;
let borrows = |f| borrow(f);
borrows(&f);
fn main() {
let f1 = Foo;
let closure_that_borrows = |x| borrows(x);
closure_that_borrows(&f1);
let mut f = Foo;
let borrows_mut = |f| borrow_mut(f);
borrows_mut(&mut f);
let mut f2 = Foo;
let closure_that_borrows_mut = |y| borrows_mut(y);
closure_that_borrows_mut(&mut f2);
let f = Foo;
let moves = |f| moves(f);
moves(f);
let f3 = Foo;
let closure_that_moves = |z| moves(z);
closure_that_moves(f3);
}
```
Here, Rust is able to look at how we use `f` inside of each closure. If we pass
it to a function that takes `&Foo`, then the type of `f` must be `&Foo`. If we
pass it to a function that takes `&mut Foo`, then the type of `f` must be `&mut Foo`.
And so on.
<figcaption>
### The `move` keyword
Listing 13-5: Closures that borrow, borrow mutably, and take ownership of their
parameters, which is inferred from how the closure body uses the parameters
Rust will allow you to override this inference with the `move` keyword. This
will cause all variables to be taken by ownership, instead of whatever they
were inferred as. Consider this:
</figcaption>
</figure>
Here, Rust is able to look at how we use the parameters of each closure inside
their bodies. If the closure passes its parameter to a function that takes
`&Foo`, then the type of the parameter must be `&Foo`. If it passes the
parameter to a function that takes `&mut Foo`, then the type of the parameter must
be `&mut Foo`, and so on. If we try to use `f3` after the call to
`closure_that_moves` in the last line of `main`, we'll get a compiler error
since ownership of `f3` was transferred to `closure_that_moves`, which
transferred ownership to the function `moves`.
### Overriding Inferred Borrowing with the `move` Keyword
Rust will allow you to override the borrowing inference by using the `move`
keyword. This will cause all of the variables the closure captures from its
environment to be taken by ownership, instead of whatever they were inferred
as. Consider this example:
```rust
let mut num = 5;
let mut num = 4;
{
let mut add_num = |x| num += x;
add_num(5);
add_num(6);
}
assert_eq!(10, num);
```
So in this case, our closure took a mutable reference to `num`, and then when
we called `add_num`, it mutated the underlying value, as we'd expect. We also
needed to declare `add_num` as `mut` too, because we're mutating its
environment.
In this case, the `add_num` closure took a mutable reference to `num`, then
when we called `add_num`, it mutated the underlying value. In the last line,
`num` contains 10, as we'd expect. We also needed to declare `add_num` itself
as `mut` too, because we're mutating its environment.
If we change to a `move` closure, it's different:
If we change the definition of `add_num` to a `move` closure, the behavior is
different:
```rust
let mut num = 5;
let mut num = 4;
{
let mut add_num = move |x| num += x;
add_num(5);
add_num(6);
}
assert_eq!(5, num);
assert_eq!(4, num);
```
We only get `5`. Rather than taking a mutable borrow out on our `num`, we took
ownership of a copy.
In the last line, `num` now contains 4: `add_num` took ownership of a copy of
`num`, rather than mutably borrowing `num`.
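To make the copy visible, here is a small sketch in which the `move` closure also returns its own copy of `num` after each call. The helper name `move_copy_demo` and the closure's return value are additions for illustration and are not part of the listing above:

```rust
// Hypothetical helper for illustration: returns what the closure sees after
// each of two calls, together with the value of the outer `num`.
fn move_copy_demo() -> (i32, i32, i32) {
    let mut num = 4;

    // `move` copies `num` into the closure because `i32` is a `Copy` type.
    let mut add_num = move |x: i32| {
        num += x; // mutates the closure's own copy, not the outer `num`
        num
    };

    let first = add_num(6); // the closure's copy is now 10
    let second = add_num(1); // the closure's copy is now 11

    // The outer `num` was never modified.
    (first, second, num)
}

fn main() {
    assert_eq!((10, 11, 4), move_copy_demo());
}
```

The closure keeps its copy between calls, which is why `add_num` still has to be declared `mut` even though the outer `num` stays at 4.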
One of the most common places you'll see the `move` keyword used is with threads.
We'll talk more about that in Chapter XX.
One of the most common places you'll see the `move` keyword used is with
threads, since it's important that one thread is no longer allowed to use a
value once the value has been transferred to another thread through a closure
in order to prevent data races. We'll talk more about that in Chapter XX.
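As a preview, the following is a minimal sketch of a `move` closure handed to `std::thread::spawn`. The function name `sum_in_thread` is hypothetical and chosen only for this illustration:

```rust
use std::thread;

// Hypothetical helper: sums a vector on a separate thread.
fn sum_in_thread(values: Vec<i32>) -> i32 {
    // `move` transfers ownership of `values` into the closure, so the
    // spawned thread owns the data and the calling thread can no longer
    // use `values` after this line.
    let handle = thread::spawn(move || values.iter().sum::<i32>());

    // Wait for the thread to finish and take its result.
    handle.join().expect("the spawned thread panicked")
}

fn main() {
    assert_eq!(6, sum_in_thread(vec![1, 2, 3]));
}
```

Without `move`, the closure would try to borrow `values`, and the compiler would reject the code because the spawned thread might outlive that borrow.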
### Closures, ownership, and borrowing
### Closures and Lifetimes
Remember Listing 10.8 from Chapter 10.3? It looked like this:
Remember Listing 10-8 from the Lifetime Syntax section of Chapter 10? It looked
like this:
```rust,ignore
{
@ -225,27 +350,38 @@ Remember Listing 10.8 from Chapter 10.3? It looked like this:
}
```
This example doesn't compile, because `x` doesn't have a long enough lifetime.
Because closures may borrow variables from their enclosing scope, we can
construct a similar example with closures. It also won't compile:
This example doesn't compile since `x` doesn't have a long enough lifetime.
Because closures may borrow variables from their enclosing scope, we can
construct a similar example with a closure that borrows `x` and tries to return
that borrowed value. The code in Listing 13-6 also won't compile:
<figure>
```rust,ignore
{
let closure;
{
let x = 4;
closure = || x ; // A closure that takes no arguments and returns x.
}
}
```
<figcaption>
Listing 13-6: A closure that tries to return a borrowed value that does not live
long enough
</figcaption>
</figure>
We get an error because `x` does not live long enough:
```text
error: `x` does not live long enough
--> <anon>:8:22
-->
|
8 | closure = || x ; // A closure that takes no arguments and returns x.
| -- ^ does not live long enough
@ -257,57 +393,73 @@ error: `x` does not live long enough
| - borrowed value needs to live until here
```
In this instance, we can use the `move` keyword from the last section
to have the closure take `x` by value, and since `x` is a number, it
is `Copy` and therefore will be copied:
To fix the error in the code in Listing 13-6, we can use the `move` keyword
from the last section to make the closure take ownership of `x`. Because `x` is
a number, it is a `Copy` type and therefore will be copied into the closure.
The code in Listing 13-7 will compile:
<figure>
```rust
{
let closure;
{
let mut x = 4;
closure = move || x ; // A closure that takes no arguments and returns x.
x = 5;
assert_eq!(closure(), 4);
assert_eq!(closure(), 4);
}
}
```
Even though we modified `x`, since `closure` now has its own version, our
changes to `x` won't change the version of `x` that's in the closure.
<figcaption>
Rust doesn't provide a way to say "these variables must be captured like this";
it's either all by inference, or all by value. However, you can accomplish
this sort of goal by combining `move` with some extra bindings:
Listing 13-7: Moving a value into the closure to fix the lifetime error
</figcaption>
</figure>
Even though we modified `x` between the closure definition and `assert_eq!`,
since `closure` now has its own version, the changes to `x` won't change the
version of `x` that's in the closure.
Rust doesn't provide a way to say that some values a closure uses should be
borrowed and some should be moved; it's either all by inference or all moved by
adding the `move` keyword. However, we can accomplish the goal of borrowing
some values and taking ownership of others by combining `move` with some extra
bindings. Consider this example where we want to borrow `s1` but take ownership
of `s2`:
```rust
let s1 = String::from("hello");
let s2 = String::from("hello");
// We want to capture s1 by reference, but s2 by value. What to do? First, make
// an extra binding for s1:
let s2 = String::from("goodbye");
let r = &s1;
// Then make it a `move` closure:
let calculation = move || {
// ... and use them inside. That's the trick: r is captured, but it's a
// reference; so we've effectively taken s1 by reference and s2 by move.
r;
s2;
};
println!("Can still use s1 here but not s2: {}", s1);
```
### Accepting closures as arguments with the `Fn` traits
We've declared `calculation` to `move` all the values it references. Before
defining `calculation`, we declare a new variable `r` that borrows `s1`. Then
in the body of the `calculation` closure, we use `r` instead of using `s1`
directly. The closure takes ownership of `r`, but `r` is a reference, so the
closure hasn't taken ownership of `s1` even though `calculation` uses `move`.
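The same technique can be sketched in a form that produces a checkable result. The function name `borrow_one_move_other` and the length calculation are assumptions made for this illustration only:

```rust
// Hypothetical helper: borrows `s1`, moves `s2`, and hands `s1` back to
// show that it is still owned by the caller's scope afterwards.
fn borrow_one_move_other() -> (usize, String) {
    let s1 = String::from("hello");
    let s2 = String::from("goodbye");

    let total = {
        // Extra binding: `r` is a reference to `s1`.
        let r = &s1;

        // `move` takes ownership of `r` (a reference) and of `s2` (a String).
        let calculation = move || r.len() + s2.len();
        calculation()
    };

    // `s1` was only borrowed, so it can still be used and returned here.
    (total, s1)
}

fn main() {
    let (total, s1) = borrow_one_move_other();
    assert_eq!(12, total);
    assert_eq!("hello", s1);
}
```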
### Closures as Function Parameters Using the `Fn` Traits
While we can bind closures to variables, that's not the most useful thing we
can do with them. We can also take closures as arguments to functions. We can
do that with the `Fn` traits. Like this:
can do with them. We can also define functions that have closures as parameters
by using the `Fn` traits. Here's an example of a function named `call_with_one`
whose signature has a closure as a parameter:
```rust
fn call_with_one<F>(some_closure: F) -> i32
@ -321,40 +473,33 @@ let answer = call_with_one(|x| x + 2);
assert_eq!(3, answer);
```
We pass our closure, `|x| x + 2`, to `call_with_one`. It does what it suggests:
it calls the closure, giving it `1` as an argument.
We pass the closure `|x| x + 2` to `call_with_one`, and `call_with_one` calls
that closure with `1` as an argument. The return value of the call to
`some_closure` is then returned from `call_with_one`.
Let's examine the signature of `call_with_one` in more depth:
The signature of `call_with_one` is using the `where` syntax discussed in the
Traits section of Chapter 10. The `some_closure` parameter has the generic type
`F`, which in the `where` clause is defined as having the trait bound
`Fn(i32) -> i32`. The `Fn` trait represents a closure, and we can add types to
the `Fn` trait to represent a specific type of closure. In this case, our
closure has a parameter of type `i32` and returns an `i32`, so the generic bound
we specify is `Fn(i32) -> i32`.
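Put together as one runnable program, the function and a call to it look roughly like this. The second call, which captures a variable named `offset`, is an addition for illustration:

```rust
fn call_with_one<F>(some_closure: F) -> i32
    where F: Fn(i32) -> i32 {

    // Call whatever closure we were given, passing 1 as its argument.
    some_closure(1)
}

fn main() {
    let answer = call_with_one(|x| x + 2);
    assert_eq!(3, answer);

    // A closure that borrows from its environment also satisfies `Fn`.
    let offset = 10;
    assert_eq!(11, call_with_one(|x| x + offset));
}
```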
```rust
fn call_with_one<F>(some_closure: F) -> i32
# where F: Fn(i32) -> i32 {
# some_closure(1) }
```
Specifying a function signature that contains a closure requires the use of
generics and trait bounds. Each closure has a unique type, so we can't write
the type of a closure directly; we have to use generics.
We take one parameter, and it has the type `F`. We also return an `i32`. This
part isn't interesting. The next part is:
`Fn` isn't the only trait bound available for specifying closures, however.
There are three: `Fn`, `FnMut`, and `FnOnce`. This continues the pattern of
threes we've seen elsewhere in Rust: borrowing, borrowing mutably, and
ownership. Using `Fn` specifies that the closure used may only borrow values in
its environment. To specify a closure that mutates the environment, use
`FnMut`, and if the closure takes ownership of the environment, `FnOnce`. Most
of the time, you can start with `Fn`, and the compiler will tell you if you
need `FnMut` or `FnOnce` based on the closure values passed into the function.
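The three bounds can be compared side by side in a short sketch. The function names `call_fn`, `call_fn_mut`, `call_fn_once`, and `fn_traits_demo` are hypothetical and exist only for this illustration:

```rust
// Accepts closures that only read from their environment.
fn call_fn<F: Fn(i32) -> i32>(f: F) -> i32 {
    f(1)
}

// Accepts closures that mutate their environment; the parameter itself
// must be `mut` so that it can be called.
fn call_fn_mut<F: FnMut(i32)>(mut f: F) {
    f(1);
    f(2);
}

// Accepts closures that consume captured values; such a closure can be
// called at most once.
fn call_fn_once<F: FnOnce() -> String>(f: F) -> String {
    f()
}

fn fn_traits_demo() -> (i32, i32, String) {
    let offset = 10;
    let borrowed = call_fn(|x| x + offset); // borrows `offset`

    let mut total = 0;
    call_fn_mut(|x| total += x); // mutably borrows `total`

    let greeting = String::from("hello");
    let moved = call_fn_once(move || greeting); // gives `greeting` away

    (borrowed, total, moved)
}

fn main() {
    assert_eq!((11, 3, String::from("hello")), fn_traits_demo());
}
```

A closure that only borrows also satisfies `FnMut` and `FnOnce`, so `FnOnce` is the least restrictive bound for callers and `Fn` the most restrictive.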
```rust
# fn call_with_one<F>(some_closure: F) -> i32
where F: Fn(i32) -> i32 {
# some_closure(1) }
```
The `Fn` trait represents a closure. We can use it as a bound for our generic
type. In this case, our closure takes an `i32` as an argument and returns an
`i32`, and so the generic bound we use is `Fn(i32) -> i32`.
Why a trait? Well, each closure has a unique type. Because of this, we can't
write the type of a closure directly; we have to use generics.
`Fn` isn't the only trait, however; there are three: `Fn`, `FnMut`, and
`FnOnce`. This continues the pattern of threes we've seen elsewhere in Rust:
by owner, by reference, and by mutable reference. By using `Fn`, you may only
refer to things in its environment by reference. If you mutate the environment,
you must use `FnMut`, and if you take ownership of the environment, `FnOnce`.
Most of the time, you can write `Fn`, and then the compiler will tell you if
you need `FnMut` or `FnOnce`.
To illustrate a situation where it's useful for a function to have a parameter
that's a closure, let's move on to our next topic: iterators.
## Iterators
@ -687,7 +832,7 @@ let sum: u64 = (1..).zip(2..)
.filter(|&x| x < 100)
.take(5)
.sum();
assert_eq!(35, sum);
```
@ -819,7 +964,7 @@ We can write this code like this instead:
fn grep<'a>(search: &str, contents: &'a str) -> Vec<&'a str> {
contents.lines()
.filter(|line| line.contains(search))
.collect()
.collect()
}
```