Start of generics chapter

Steve Klabnik 2016-09-06 17:52:42 -04:00 committed by Carol (Nichols || Goulding)
parent dfe2d91dbb
commit 479b254803
3 changed files with 604 additions and 2 deletions


@ -0,0 +1,85 @@
# Generics
One of the core tools a programming language gives you is the ability to deal
effectively with duplication. Different kinds of duplication are dealt with in
different ways. Consider a small program that finds the largest number in a list:
```rust
let numbers = vec![34, 50, 25, 100, 65];

let mut largest = numbers[0];

for number in numbers {
    // Keep the new number only when it beats the current largest.
    if number > largest {
        largest = number;
    }
}

println!("The largest number is {}", largest);
```
If we needed to find the largest number twice, we could duplicate our code:
```rust
let numbers = vec![34, 50, 25, 100, 65];

let mut largest = numbers[0];

for number in numbers {
    if number > largest {
        largest = number;
    }
}

println!("The largest number is {}", largest);

// The same logic again, copied for a second list.
let numbers = vec![102, 34, 6000, 89, 54, 2, 43, 8];

let mut largest = numbers[0];

for number in numbers {
    if number > largest {
        largest = number;
    }
}

println!("The largest number is {}", largest);
```
However, this is tedious and error-prone. Rust, like many languages, gives us a
way to deal with this duplication by creating an abstraction. In this case, the
answer is functions:
```rust
fn largest(numbers: Vec<i32>) {
    let mut largest = numbers[0];

    for number in numbers {
        if number > largest {
            largest = number;
        }
    }

    println!("The largest number is {}", largest);
}

let numbers = vec![34, 50, 25, 100, 65];
largest(numbers);

let numbers = vec![102, 34, 6000, 89, 54, 2, 43, 8];
largest(numbers);
```
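As a possible refinement, here's a sketch of `largest` that borrows the list
and hands the result back instead of printing it. The `&[i32]` parameter and
the `i32` return type are assumptions made for this illustration; they aren't
part of the example above.

```rust
// Borrows the list rather than taking ownership, and returns the result so
// the caller decides what to do with it. Like the version above, this
// panics if the list is empty, because of the `numbers[0]` index.
fn largest(numbers: &[i32]) -> i32 {
    let mut largest = numbers[0];

    for &number in numbers {
        if number > largest {
            largest = number;
        }
    }

    largest
}

fn main() {
    let numbers = vec![34, 50, 25, 100, 65];
    println!("The largest number is {}", largest(&numbers));

    let numbers = vec![102, 34, 6000, 89, 54, 2, 43, 8];
    println!("The largest number is {}", largest(&numbers));
}
```

Because the function only borrows `numbers`, the vector is still usable after
the call.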
But functions alone can't remove every kind of duplication. For example, our
`largest` function only works for vectors of `i32`. What if we wanted to find
the largest number in a list of floats? Or the largest element of some sort of
custom `struct` or `enum`? With regular functions, we'd have to write a nearly
identical copy for each type.
To solve these kinds of problems, Rust provides a feature called *generics*. In the
same way that functions allow us to abstract over common code, generics allow us to
abstract over types. This ability gives us tremendous power to write code that works
in a large number of situations. First, we'll examine the syntax of generics. Then,
we'll talk about another feature that's used to augment generics: traits. Finally,
we'll discuss one of Rust's most distinctive uses of generics: lifetimes.


@ -1,7 +1,294 @@
# Generics
We've already hinted at generics previously in the book, but never dug into
what exactly they are. You can always recognize generics by the way they fit
into Rust's syntax: any time you see angle brackets, `<>`, you're dealing with
generics.
The types we've seen before like `Vec<i32>`? That's employing generics. The
proper name for vectors is `Vec<T>`. That `T` is called a *type parameter*, and
it serves a similar function to parameters to functions: you give it some kind
of value, and that determines how it works. In the same way that a function
like `foo(x: i32)` can be called with `foo(5)`, a `Vec<T>` can be created with
a specific type, like `Vec<i32>`.
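To make that concrete, here's a small sketch that fills in `T` with three
different types. The variable names are just for illustration.

```rust
fn main() {
    // The same `Vec<T>` definition, with `T` filled in three different ways.
    let integers: Vec<i32> = vec![1, 2, 3];
    let floats: Vec<f64> = vec![1.0, 2.5];
    let words: Vec<String> = vec![String::from("hello")];

    println!(
        "{} integers, {} floats, {} words",
        integers.len(),
        floats.len(),
        words.len()
    );
}
```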
## Generic data types
Let's dive into generic data types in a bit more detail. We previously learned
about the `Option<T>` type, but we never examined its definition. Let's try to
imagine how we'd write it. First, let's consider an option of a number:
```rust
enum OptionalNumber {
    Some(i32),
    None,
}

let number = OptionalNumber::Some(5);
let no_number = OptionalNumber::None;
```
This works just fine for `i32`s. But what if we also wanted to store `f64`s?
Or `String`s? We would have to write this:
```rust
enum OptionalFloatingPointNumber {
    Some(f64),
    None,
}

let number = OptionalFloatingPointNumber::Some(5.0);
let no_number = OptionalFloatingPointNumber::None;
```
The name is a bit long to drive the point home. With our current knowledge, we
would have to write a unique type for every single kind of option. In other
words, the idea of "an optional value" is a higher-order concept than any
specific type. We want it to work for any type at all.
We can do that with generics. In fact, that's how the actual option type works
in Rust. Let's check out its definition:
```rust
enum Option<T> {
    Some(T),
    None,
}
```
There are those angle brackets. If we were to read this definition aloud, we'd
say "`Option` is an `enum` with one type parameter, `T`. It has two variants:
`Some`, which has a value with type `T`, and `None`, which has no value." A
bit of a mouthful! But this will work with any type:
```rust
let integer = Option::Some(5);
let float = Option::Some(5.0);
```
We've left in the `Option` bit for consistency with the previous examples, but
since `Option<T>` is in the prelude, it's not needed:
```rust
let integer = Some(5);
let float = Some(5.0);
```
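One detail worth sketching here: `Some(5)` gives Rust a value to infer `T`
from, but `None` on its own doesn't, so it needs a type annotation. The
`no_number` name below is just for illustration.

```rust
fn main() {
    // `T` is inferred from the value inside `Some`.
    let integer = Some(5); // an `Option<i32>`
    let float = Some(5.0); // an `Option<f64>`

    // `None` holds no value, so there's nothing to infer `T` from.
    // The annotation tells Rust which `Option<T>` we mean.
    let no_number: Option<i32> = None;

    println!("{:?} {:?} {:?}", integer, float, no_number);
}
```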
So, what's up with this syntax? Let's compare our two non-generic `enum`s side by
side:
```text
enum OptionalNumber {   enum OptionalFloatingPointNumber {
    Some(i32),              Some(f64),
    None,                   None,
}                       }
```
We have one line that's very close, but different: the `Some` bit. The only
difference is the type of the data, `i32` and `f64`. Just like we can
parameterize arguments to a function by choosing a name, we can parameterize
the type by choosing a name. In this case, `T`. We could choose any identifier
here, but traditionally, type parameters follow the same style as types
themselves: CamelCase. In addition, they tend to be short, often one letter.
`T` is the traditional choice, short for 'type'. So let's do that:
```text
enum OptionalNumber {   enum OptionalFloatingPointNumber {
    Some(T),                Some(T),
    None,                   None,
}                       }
```
We've replaced `i32` and `f64` with `T`. There's one problem, though: we've
*used* `T`, but not defined it. This would be similar to using an argument to
a function without declaring it. We need to tell Rust that we've introduced a
generic parameter. We can do that with the angle brackets. Let's try it:
```text
enum OptionalNumber<T> {   enum OptionalFloatingPointNumber<T> {
    Some(T),                   Some(T),
    None,                      None,
}                          }
```
The `<>`s indicate a list of type parameters, just like `()` indicates a
list of value parameters. Now, the only difference between our two `enum`s
is the name. And since we've made them generic, they're not specific to numbers
or floating point numbers. So let's give them the same name:
```text
enum Option<T> {   enum Option<T> {
    Some(T),           Some(T),
    None,              None,
}                  }
```
Now they're identical! We've made our type fully generic. Understanding this
process is important, because the compiler actually does the exact opposite of
this when compiling your code. This is called *monomorphization*, and it's why
Rust's generics are extremely efficient. Consider this code:
```rust
// This is in the standard library, but we're including it to make the example
// a bit more obvious.
enum Option<T> {
    Some(T),
    None,
}

let integer = Some(5);
let float = Some(5.0);
```
When Rust compiles this code, it will perform monomorphization. What this means
is that the compiler will see that we've used two kinds of `Option<T>`: one
where `T` is `i32`, and one where `T` is `f64`. As such, it will expand the
generic definition of `Option<T>` into `Option<i32>` and `Option<f64>`, and
replace the calls with the specific versions. Like this:
```rust
enum OptionInteger {
    Some(i32),
    None,
}

enum OptionFloat {
    Some(f64),
    None,
}

let integer = OptionInteger::Some(5);
let float = OptionFloat::Some(5.0);
```
In other words, we can write the non-duplicated form, but Rust will act as
though we wrote the specific type out in each instance. This means that we
pay no runtime cost for using generics; it's just like we copy/pasted
each particular definition.
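One way to observe this, sketched under the assumption that we only want to
look at sizes: the hypothetical `option_size` helper below is generic, and
each call can report a different size because each concrete `Option<T>` has
its own layout. The exact numbers depend on the compiler and platform, so the
sketch doesn't promise any.

```rust
use std::mem::size_of;

// Hypothetical helper for this sketch: reports how many bytes one
// `Option<T>` occupies. The argument is only used to infer `T`.
fn option_size<T>(_value: &Option<T>) -> usize {
    size_of::<Option<T>>()
}

fn main() {
    let integer = Some(5);
    let float = Some(5.0);

    // Conceptually, the compiler generates one copy of `option_size` for
    // `i32` and another for `f64`; each knows its own concrete type.
    println!("Option<i32> takes {} bytes", option_size(&integer));
    println!("Option<f64> takes {} bytes", option_size(&float));
}
```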
In a similar fashion, we can use `<>`s with structs as well:
```rust
struct Point<T> {
    x: T,
    y: T,
}

let integer = Point { x: 5, y: 10 };
let float = Point { x: 1.0, y: 4.0 };
```
The process is the same: add a `<T>` by the name, then put `T` in where the
type name goes.
If we need multiple type parameters, we can use a comma. Consider a universe in
which `x` and `y` need different types:
```rust
struct Point<X, Y> {
    x: X,
    y: Y,
}
```
Now `x` will have the type of `X`, and `y` will have the type of `Y`. We can
make `enum`s with multiple type parameters as well. Remember `Result<T, E>`
from the error handling chapter? Here's its definition:
```rust
enum Result<T, E> {
    Ok(T),
    Err(E),
}
```
Each variant stores a different kind of information, and they're both generic.
You can have as many type parameters as you'd like. Similarly to parameters of
values, if you have a lot of them, it can get quite confusing, so try to keep
the number of them small if you can.
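Here's a short sketch that uses both. The `divide` function is hypothetical,
made up for this example; it uses the standard library's `Result<T, E>` with
`T` as `i32` and `E` as `String`.

```rust
struct Point<X, Y> {
    x: X,
    y: Y,
}

// Hypothetical helper: returns `Ok` with the quotient, or `Err` with a
// message when the denominator is zero.
fn divide(numerator: i32, denominator: i32) -> Result<i32, String> {
    if denominator == 0 {
        Err(String::from("cannot divide by zero"))
    } else {
        Ok(numerator / denominator)
    }
}

fn main() {
    // Here `X` is `i32` and `Y` is `f64`.
    let mixed = Point { x: 5, y: 4.0 };
    println!("x = {}, y = {}", mixed.x, mixed.y);

    match divide(10, 2) {
        Ok(value) => println!("result: {}", value),
        Err(message) => println!("error: {}", message),
    }
}
```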
## Generic functions and methods
In a similar way to data structures, we can use the `<>` syntax to write
functions:
```rust
fn generic_function<T>(argument: T) {
    // code goes here
}
```
and methods:
```rust
struct Foo;

impl Foo {
    // `&self` makes this a method rather than an associated function.
    fn method<T>(&self, argument: T) {
        // code goes here
    }
}
```
It's the same process as before. Say we had these two functions:
```text
fn takes_integer(argument: i32) {   fn takes_float(argument: f64) {
    // code goes here                   // code goes here
}                                   }
```
We'd replace their parameter with `T`:
```text
fn takes_integer(argument: T) {   fn takes_float(argument: T) {
    // code goes here                 // code goes here
}                                 }
```
Add the `T` parameter to the type parameter list:
```text
fn takes_integer<T>(argument: T) {   fn takes_float<T>(argument: T) {
    // code goes here                    // code goes here
}                                    }
```
And then rename them to be the same:
```text
fn takes<T>(argument: T) {   fn takes<T>(argument: T) {
    // code goes here            // code goes here
}                            }
```
Now they're the same!
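Calling a generic function looks just like calling any other function. In
this sketch, `takes` and `method` return their argument unchanged; that
return value is an assumption added here so there's something to observe, and
the `::<i32>` form is one way to spell out `T` explicitly when you'd rather
not rely on inference.

```rust
// Returns its argument unchanged. The return value is an addition for this
// sketch; the functions in the text have empty bodies.
fn takes<T>(argument: T) -> T {
    argument
}

struct Foo;

impl Foo {
    fn method<T>(&self, argument: T) -> T {
        argument
    }
}

fn main() {
    // `T` is inferred from each argument.
    let integer = takes(5);
    let float = takes(2.5);

    // `T` can also be written out explicitly.
    let explicit = takes::<i32>(7);

    let foo = Foo;
    let text = foo.method("hello");

    println!("{} {} {} {}", integer, float, explicit, text);
}
```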
There's one problem though. We've got some function _definitions_ that work,
but if we try to do something with our argument, we'll get an error. To see
what we mean here, try out this function:
```rust,ignore
fn print<T>(argument: T) {
    println!("Got an argument: {}", argument);
}
```
You'll get an error that looks like this:
```text
error[E0277]: the trait bound `T: std::fmt::Display` is not satisfied
 --> <anon>:3:37
  |
3 |     println!("Got an argument: {}", argument);
  |                                     ^^^^^^^^ trait `T: std::fmt::Display` not satisfied
  |
  = help: consider adding a `where T: std::fmt::Display` bound
  = note: required by `std::fmt::Display::fmt`

error: aborting due to previous error(s)
```
This error mentions something we haven't learned about yet: traits. In the next
section, we'll figure out how to make this compile.


@ -1 +1,231 @@
# Traits
At the end of the last section, we had this code:
```rust,ignore
fn print<T>(argument: T) {
    println!("Got an argument: {}", argument);
}
```
Which gave this error:
```text
error[E0277]: the trait bound `T: std::fmt::Display` is not satisfied
 --> <anon>:3:37
  |
3 |     println!("Got an argument: {}", argument);
  |                                     ^^^^^^^^ trait `T: std::fmt::Display` not satisfied
  |
  = help: consider adding a `where T: std::fmt::Display` bound
  = note: required by `std::fmt::Display::fmt`

error: aborting due to previous error(s)
```
The error message here refers to a *trait bound*. What's up with that?
Rust has a feature called *traits*. Traits are similar to a feature often
called 'interfaces' in other languages, with some differences. Traits let us
do another kind of abstraction: they let us abstract over a group of methods.
Here's a trait:
```rust
trait Printable {
    fn print(&self);
}
```
We declare a trait with the `trait` keyword, and then the trait's name. In this
case, our trait will describe types which can be printed. Inside of some curly
braces, we declare a method signature, but instead of providing an
implementation, we use a semicolon. A trait can also have multiple methods:
```rust
trait Printable {
    fn print(&self);
    fn print_debug(&self);
}
```
Once we have a trait, we can use the `impl` keyword to implement that trait
for a type. It works like this:
```rust
struct Point {
    x: i32,
    y: i32,
}

trait Printable {
    fn print(&self);
}

impl Printable for Point {
    fn print(&self) {
        println!("I'm a Point! I have an x of {} and a y of {}.", self.x, self.y);
    }
}
```
In the same way `impl` let us define methods, we've also defined methods that
pertain to our trait. We can call methods that our trait has defined just like
we called other methods:
```rust
# struct Point {
#     x: i32,
#     y: i32,
# }
#
# trait Printable {
#     fn print(&self);
# }
#
# impl Printable for Point {
#     fn print(&self) {
#         println!("I'm a Point! I have an x of {} and a y of {}.", self.x, self.y);
#     }
# }
#
let p = Point { x: 1, y: 10 };
p.print();
```
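Since a trait describes a group of methods rather than one type, more than
one type can implement it. Here's a sketch with a second, hypothetical type,
`Circle`. The `describe` method is also an addition for this example; it
returns the text so the two implementations are easy to compare.

```rust
struct Point {
    x: i32,
    y: i32,
}

// Hypothetical second type, added for this sketch.
struct Circle {
    radius: f64,
}

trait Printable {
    // Hypothetical extra method: returns the text instead of printing it.
    fn describe(&self) -> String;
    fn print(&self);
}

impl Printable for Point {
    fn describe(&self) -> String {
        format!("I'm a Point! I have an x of {} and a y of {}.", self.x, self.y)
    }

    fn print(&self) {
        println!("{}", self.describe());
    }
}

impl Printable for Circle {
    fn describe(&self) -> String {
        format!("I'm a Circle! I have a radius of {}.", self.radius)
    }

    fn print(&self) {
        println!("{}", self.describe());
    }
}

fn main() {
    // Each type gets its own implementation of the same methods.
    Point { x: 1, y: 10 }.print();
    Circle { radius: 2.5 }.print();
}
```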
There's a twist, though. We can only do this if our trait is in scope. For example,
if we had our trait in a module:
```rust
mod point {
    pub struct Point {
        pub x: i32,
        pub y: i32,
    }

    pub trait Printable {
        fn print(&self);
    }

    impl Printable for Point {
        fn print(&self) {
            println!("I'm a Point! I have an x of {} and a y of {}.", self.x, self.y);
        }
    }
}

// Without this line, we'd get an error:
use point::Printable;

fn main() {
    let p = point::Point { x: 1, y: 10 };
    p.print();
}
```
You'll notice we also had to make everything `pub`, as per the privacy rules we
talked about in Chapter 7.
Why do we need the trait in scope? Imagine we had two traits with the same
method definition, and our `Point` struct implemented both. We wouldn't know
which method we were trying to call. `use` makes it explicit.
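To illustrate that situation, here's a sketch with two hypothetical traits,
`Printable` and `Debuggable`, that both declare a `describe` method. With
both in scope, a plain `p.describe()` is ambiguous, so the sketch names the
trait in each call instead.

```rust
struct Point {
    x: i32,
    y: i32,
}

// Two hypothetical traits that happen to share a method name.
trait Printable {
    fn describe(&self) -> String;
}

trait Debuggable {
    fn describe(&self) -> String;
}

impl Printable for Point {
    fn describe(&self) -> String {
        format!("({}, {})", self.x, self.y)
    }
}

impl Debuggable for Point {
    fn describe(&self) -> String {
        format!("Point {{ x: {}, y: {} }}", self.x, self.y)
    }
}

fn main() {
    let p = Point { x: 1, y: 10 };

    // `p.describe()` would be rejected here, because both traits are in
    // scope and Rust can't tell which method we mean. Naming the trait
    // picks one.
    println!("{}", Printable::describe(&p));
    println!("{}", Debuggable::describe(&p));
}
```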
## Trait bounds
We previously knew how to define methods, so what makes traits special? Well,
imagine we had a function that wanted to call `print` for any type that supports
printing. We could write it like this:
```rust,ignore
fn print<T>(value: T) {
    value.print();
}
```
But we have a problem. What happens if we tried to pass something to `print`
that did not implement the `print` method? Because of this, Rust won't let the
above code compile.
Let's take a step back and think about what we've written. There's a mismatch:
above, we said "a function that wanted to call `print` for any type that
supports printing", but what we said in our code was "for any type `T`, any type
at all." So how do we say "for any type `T` that implements `Printable`"? Like this:
```rust
trait Printable {
    fn print(&self);
}

fn print<T: Printable>(value: T) {
    value.print();
}
```
The `T: Printable` syntax says, "the type parameter `T` represents any type
that implements the `Printable` trait." This full example will work just
fine:
```rust
struct Point {
    x: i32,
    y: i32,
}

trait Printable {
    fn print(&self);
}

impl Printable for Point {
    fn print(&self) {
        println!("I'm a Point! I have an x of {} and a y of {}.", self.x, self.y);
    }
}

fn print<T: Printable>(value: T) {
    value.print();
}

let p = Point { x: 0, y: 10 };
print(p);
```
Traits are an extremely useful feature of Rust. You'll almost never see generic
functions without an accompanying trait bound. There are many traits in the
standard library, and they're used for many, many different things. For
example, our `Printable` trait is similar to one of those traits, `Display`.
And in fact, that's how `println!` decides how to format things with `{}`. The
`Display` trait has a `fmt` method that determines how to format something.
Here's our original example, but fixed:
```rust
use std::fmt::Display;

fn print<T: Display>(argument: T) {
    println!("Got an argument: {}", argument);
}
```
Now that we've said "for any type that implements `Display`," this works well.
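Our own types can implement `Display` too, which lets them work with any
function bounded by it. The sketch below assumes a `(x, y)` output format for
`Point`, and uses a hypothetical `describe` function that returns the
formatted text rather than printing it.

```rust
use std::fmt;
use std::fmt::Display;

struct Point {
    x: i32,
    y: i32,
}

impl Display for Point {
    // `fmt` decides what `{}` produces for a `Point`.
    // The `(x, y)` format is just a choice made for this sketch.
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "({}, {})", self.x, self.y)
    }
}

// Hypothetical variant of `print` that returns the text instead.
fn describe<T: Display>(argument: T) -> String {
    format!("Got an argument: {}", argument)
}

fn main() {
    // Built-in types and our own type all satisfy the same bound.
    println!("{}", describe(5));
    println!("{}", describe("hello"));
    println!("{}", describe(Point { x: 1, y: 10 }));
}
```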
## Where syntax
When bounds start getting complicated, there is another syntax that's a bit
cleaner: `where`. And in fact, our original error referred to it. It looks
like this:
```rust
use std::fmt::Display;

fn print<T>(argument: T) where T: Display {
    println!("Got an argument: {}", argument);
}
```
Instead of the `T: Display` going inside the angle brackets, it goes after a
`where`, placed at the end of the function signature. This can make complex
signatures easier to read.
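As a sketch of where this pays off, here's one way the `largest` function
from the start of the chapter could be made generic. The particular bounds
are assumptions chosen for this example: `PartialOrd` for the `>` comparison,
`Copy` so items can be copied out of the slice, and `Display` for printing.
The `+` joins several bounds on the same type parameter.

```rust
use std::fmt::Display;

// Finds, prints, and returns the largest item in a non-empty slice.
// Panics on an empty slice because of the `list[0]` index.
fn largest<T>(list: &[T]) -> T
where
    T: PartialOrd + Copy + Display,
{
    let mut largest = list[0];

    for &item in list {
        if item > largest {
            largest = item;
        }
    }

    println!("The largest item is {}", largest);
    largest
}

fn main() {
    // One definition, used with three different element types.
    largest(&[34, 50, 25, 100, 65]);
    largest(&[1.5, 0.25, 3.75]);
    largest(&['y', 'm', 'a', 'q']);
}
```

With three bounds, the `where` clause keeps the function's name and
parameters on one uncluttered line.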