diff --git a/src/ch10-00-generics.md b/src/ch10-00-generics.md index fcc8be2..3c36e33 100644 --- a/src/ch10-00-generics.md +++ b/src/ch10-00-generics.md @@ -97,7 +97,13 @@ let numbers = vec![102, 34, 6000, 89, 54, 2, 43, 8]; largest(numbers); ``` -But functions aren't the only way to eliminate duplication. For example, our +The function takes an argument, `numbers`, which represents any concrete +`Vec` that we might pass into the function. The code in the function +definition operates on the `numbers` representation of any `Vec`. When +we call the `largest` function, the code actually runs on the specific values +that we pass in. + +Functions aren't the only way to eliminate duplication. For example, our `largest` function only works for vectors of `i32`. What if we wanted to find the largest number in a list of floats? Or the largest value in some sort of custom `struct` or `enum`? We can't solve those kinds of duplication with diff --git a/src/ch10-01-syntax.md b/src/ch10-01-syntax.md index e2ca69b..6909892 100644 --- a/src/ch10-01-syntax.md +++ b/src/ch10-01-syntax.md @@ -1,23 +1,32 @@ ## Generics Syntax -We've already hinted at generics previously, but never dug into what exactly -they are. You can always recognize when generics are used by the way that they -fit into Rust's syntax: Any time you see angle brackets, `<>`, you're dealing -with generics. +We've already hinted at the concept of generics in previous chapters, but we +never dug into what exactly they are or how to use them. In places where we +specify a type, like function signatures or structs, instead we can use +*generics*. Generics are stand-ins that represent more than one particular +concept. In this section, we're going to cover generic *data types*. -The types we've seen before like `Vec`? That's employing generics. The -proper type for vectors is `Vec`. That `T` is called a *type parameter*, and -it serves a similar function as parameters to functions: you fill in the -parameter with a concrete type, and that determines how the overall type works. -In the same way that a function like `foo(x: i32)` can be called with a -specific value such as `foo(5)`, a `Vec` can be created with a specific -type, like `Vec`. +You can recognize when any kind of generics are used by the way that they fit +into Rust's syntax: any time you see angle brackets, `<>`, you're dealing with +generics. Types we've seen before, like in Chapter 8 where we discussed vectors +with types like `Vec`, employ generics. The type that the standard library +defines for vectors is `Vec`. That `T` is called a *type parameter*, and it +serves a similar function as parameters to functions: you fill in the parameter +with a concrete type, and that determines how the overall type works. In the +same way that a function like `foo(x: i32)` can be called with a specific value +such as `foo(5)`, a `Vec` can be created with a specific type, like +`Vec`. -### Generic Data Types +### Duplicated Enum Definitions -Let's dive into generic data types in a bit more detail. We previously learned -about the `Option` type, but we never examined its definition. Let's try to -imagine how we'd write it. First, let's consider an `Option` of only a number: +Let's dive into generic data types in more detail. We learned about how to use +the `Option` enum in Chapter 6, but we never examined its definition. Let's +try to imagine how we'd write it! We'll start from duplicated code like we did +in the "Removing Duplication by Extracting a Function" section. This time, +we'll remove the duplication by extracting a generic data type instead of +extracting a function, but the mechanics of doing the extraction will be +similar. First, let's consider an `Option` enum with a `Some` variant that can +only hold an `i32`. We'll call this enum `OptionalNumber`: ```rust enum OptionalNumber { @@ -29,8 +38,10 @@ let number = OptionalNumber::Some(5); let no_number = OptionalNumber::None; ``` -This works just fine for `i32`s. But what if we also wanted to store `f64`s? -Or `String`s? We would have to add code like this for each type we wanted: +This works just fine for `i32`s. But what if we also wanted to store `f64`s? We +would have to duplicate code to define a separate `Option` enum type for each +type we wanted to be able to hold in the `Some` variants. For example, here is +how we could define and use `OptionalFloatingPointNumber`: ```rust enum OptionalFloatingPointNumber { @@ -42,13 +53,17 @@ let number = OptionalFloatingPointNumber::Some(5.0); let no_number = OptionalFloatingPointNumber::None; ``` -We've made the enum's name a bit long in order to drive the point home. With our -current knowledge, we would have to write a unique type for every single kind -of option. In other words, the idea of "an optional value" is a higher-order -concept than one specific type. We want it to work for any type at all. +We've made the enum's name a bit long in order to drive the point home. With +what we currently know how to do in Rust, we would have to write a unique type +for every single kind of value we wanted to have either `Some` or `None` of. In +other words, the idea of "an optional value" is a higher-order concept than one +specific type. We want it to work for any type at all. -We can do that with generics. In fact, that's how the actual `Option` type works -in Rust. Let's check out its definition: +### Generic Type Parameters + +We can define an `Option` type that works for any type by using generics. In +fact, that's how the actual `Option` type works in Rust. Let's check out its +definition: ```rust enum Option { @@ -57,26 +72,36 @@ enum Option { } ``` -There's those angle brackets. If we were to read this definition aloud, we'd -say "`Option` is an `enum` with one type parameter, `T`. It has two variants: -`Some`, which has a value with type `T`, and `None`, which has no value." A -bit of a mouthful! But this will work with any type: +The angle brackets (`<>`) go after the enum name and before the curly brace in +the enum definition. Inside the brackets is the declaration of the generic type +name `T`. The defintion of the enum's variants can then use `T` to stand in for +any type. In this case, the definition `Some(T)` says that the `Some` variant +can hold a value of any type. If we were to read this definition aloud, we'd +say, "`Option` is an `enum` with one type parameter, `T`. It has two variants: +`Some`, which has a value with type `T`, and `None`, which has no value." A bit +of a mouthful! + +This definition allows using `Option` with any type, like these uses of +`Option` with an `i32` and with an `f64`: ```rust let integer = Option::Some(5); let float = Option::Some(5.0); ``` -We've left in the `Option::` bit for consistency with the previous examples, but -since `use Option::*` is in the prelude, it's not needed: +We've left in the `Option::` namespace for consistency with the previous +examples, but since `use Option::*` is in the prelude, it's not needed. Usually +using `Option` looks like this: ```rust let integer = Some(5); let float = Some(5.0); ``` -So, what's up with this syntax. Let's compare our two non-generic `enum`s side -by side: +### Removing Duplication by Extracting a Generic Data Type + +Let's return to our two non-generic `enum`s to see how to get from duplicated +types to the generic type. Here are the definitions side-by-side: ```text enum OptionalNumber { enum OptionalFloatingPointNumber { @@ -85,13 +110,17 @@ enum OptionalNumber { enum OptionalFloatingPointNumber { } } ``` -We have one line that's very close, but different: the `Some` bit. The only -difference is the type of the data, `i32` and `f64`. Just like we can -parameterize arguments to a function by choosing a name, we can parameterize -the type by choosing a name, in this case, `T`. We could choose any identifier -here, but traditionally, type parameters follow the same style as types -themselves: CamelCase. In addition, they tend to be short, often one letter. -`T` is the traditional choice, short for 'type'. So let's do that: +Aside from the names, we have one line where the two definitions are very +close, but still different: the line with the `Some` definitions. The only +difference is the type of the data in that variant, `i32` and `f64`. + +Just like we can parameterize arguments to a function by choosing a name, we +can parameterize the type by choosing a name. In this case, we've chosen the +name `T`. We could choose any identifier here, but Rust style has type +parameters follow the same style as types themselves: CamelCase. In addition, +they tend to be short, often one letter. `T` is the traditional default choice, +short for 'type'. Let's use that name in our `Some` variant definitions where +the `i32` and `f64` types were: ```text enum OptionalNumber { enum OptionalFloatingPointNumber { @@ -100,10 +129,10 @@ enum OptionalNumber { enum OptionalFloatingPointNumber { } } ``` -We've replaced `i32` and `f64` with `T`. There's one problem, though: we've -*used* `T`, but not defined it. This would be similar to using an argument to -a function without declaring it. We need to tell Rust that we've introduced a -generic parameter. We can do that with the angle brackets; let's try it: +There's one problem, though: we've *used* `T`, but not defined it. This would +be similar to using an argument to a function in the body without declaring it +in the signature. We need to tell Rust that we've introduced a generic +parameter. The syntax to do that is the angle brackets, like this: ```text enum OptionalNumber { enum OptionalFloatingPointNumber { @@ -112,10 +141,11 @@ enum OptionalNumber { enum OptionalFloatingPointNumber { } } ``` -The `<>`s indicate a list of type parameters, just like `()` indicates a -list of value parameters. Now, the only difference between our two `enum`s -is the name. And since we've made them generic, they're not specific to numbers -or floating point numbers. So let's give them the same name: +The `<>`s after the enum name indicate a list of type parameters, just like +`()` after a function name indicates a list of value parameters. Now the only +difference between our two `enum`s is the name. Since we've made them generic, +they're not specific to integers or floating point numbers anymore, so they can +have the same name: ```text enum Option { enum Option { @@ -124,13 +154,19 @@ enum Option { enum Option { } } ``` -Now they're identical! We've made our type fully generic. Understanding this -process is important, because the compiler actually does the exact opposite of -this when compiling your code. Taking code that is generic over some type and -generating code that is specific for the concrete types that are used with the -generic code is called *monomorphization*, and it's why Rust's generics are -extremely efficient. Consider this code that uses the standard library's -`Option`: +Now they're identical! We've made our type fully generic. When you recognize +situations with almost-duplicate types like this, you can follow this process +to reduce duplication using generics. + +### Monomorphization at Compile Time + +Understanding this refactoring process is also useful in understanding how +generics work behind the scenes: the compiler does the exact opposite of this +process when compiling your code. *Monomorphization* means taking code that is +generic over some type and generating code that is specific for the concrete +types that are used with the generic code. Monomorphization is why Rust's +generics are extremely efficient. Consider this code that uses the standard +library's `Option`: ```rust let integer = Some(5); @@ -138,10 +174,12 @@ let float = Some(5.0); ``` When Rust compiles this code, it will perform monomorphization. What this means -is that the compiler will see that we've used two kinds of `Option`: one -where `T` is `i32`, and one where `T` is `f64`. As such, it will expand the -generic definition of `Option` into `Option` and `Option`, and -replace the calls with the specific versions. Like this: +is the compiler will see that we've used two kinds of `Option`: one where +`T` is `i32`, and one where `T` is `f64`. As such, it will expand the generic +definition of `Option` into `OptionInteger` and `OptionFloat`, +thereby replacing the generic definition with the specific ones. The more +specific version looks like the duplicated code we started with at the +beginning of this section: ```rust enum OptionInteger { @@ -158,12 +196,17 @@ let integer = OptionInteger::Some(5); let float = OptionFloat::Some(5.0); ``` -In other words, we can write the non-duplicated form, but Rust will act as -though we wrote the specific type out in each instance. This means we pay no -runtime cost for using generics; it's just like we copy/pasted each particular -definition. +In other words, we can write the non-duplicated form that uses generics in our +code, but Rust will compile that into code that acts as though we wrote the +specific type out in each instance. This means we pay no runtime cost for using +generics; it's just like we duplicated each particular definition. -In a similar fashion, we can use `<>`s with structs as well: +### Generic Structs + +In a similar fashion as we did with enums, we can use `<>`s with structs as +well in order to define structs that could hold data values of any type. +Listing 10-2 shows the definition and use of a `Point` struct that could hold +`x` and `y` coordinate values that are any type: ```rust struct Point { @@ -175,22 +218,65 @@ let integer = Point { x: 5, y: 10 }; let float = Point { x: 1.0, y: 4.0 }; ``` -The process is the same: add a `` by the name, then put `T` in where the -type name goes. + +Listing 10-2: A `Point` struct that holds `x` and `y` values of type `T` + -If we need multiple type parameters, we can use a comma. Consider a universe in -which `x` and `y` need different types: +The syntax is the same with structs: add a `` after the name of the struct, +then use `T` in the definition where you want to use that generic type instead +of a specific type. + +### Multiple Type Parameters + +Note that in the `Point` definition in Listing 10-2, we've used the same `T` +parameter for both fields. This means `x` and `y` must always be values of the +same type. Trying to instantiate a `Point` that uses an `i32` for `x` and an +`f64` for `y`, like this: + +```rust,ignore +let p = Point { x: 5, y: 20.0 }; +``` + +results in a compile-time error that indicates the type of `y` must match the +type of `x`: + +```bash +error[E0308]: mismatched types + | +7 | let p = Point { x: 5, y: 20.0 }; + | ^^^^ expected integral variable, found floating-point variable + | + = note: expected type `{integer}` + = note: found type `{float}` +``` + +If we need to be able to have fields with generic but different types, we can +declare multiple type parameters within the angle brackets, separated by a +comma. Listing 10-3 shows how to define a `Point` that can have different types +for `x` and `y`: ```rust struct Point { x: X, y: Y, } + +let integer = Point { x: 5, y: 10 }; +let float = Point { x: 1.0, y: 4.0 }; +let p = Point { x: 5, y: 20.0 }; ``` -Now `x` will have the type of `X`, and `y` will have the type of `Y`. We can -make `enum`s with multiple type parameters as well. Remember `Result` -from the error handling chapter? Here's its definition: + +Listing 10-2: A `Point` struct that holds an `x` value of type `X` and a `y` +value of type `Y` + + +Now `x` will have the type of `X`, and `y` will have the type of `Y`, and we +can instantiate a `Point` with an `i32` for `x` and an `f64` for `y`. + +We can make `enum`s with multiple type parameters as well. Recall the enum +`Result` from Chapter 9 that we used for recoverable errors. Here's its +definition: ```rust enum Result { @@ -200,14 +286,17 @@ enum Result { ``` Each variant stores a different kind of information, and they're both generic. + You can have as many type parameters as you'd like. Similarly to parameters of -values, if you have a lot of them, it can get quite confusing, so try to keep -the number of them small if you can. +values in function signatures, if you have a lot of parameters, the code can +get quite confusing, so try to keep the number of parameters defined in any one +type small if you can. ### Generic Functions and Methods -In a similar way to data structures, we can use the `<>` syntax to write -functions: +In a similar way to data structures, we can use the `<>` syntax in function or +method definitions. The angle brackets for type parameters go after the +function or method name and before the argument list in parentheses: ```rust fn generic_function(argument: T) { @@ -215,20 +304,10 @@ fn generic_function(argument: T) { } ``` -and methods: - -```rust -struct Foo; - -impl Foo { - fn method(argument: T) { - // code goes here - } -} -``` - -We can use the same process to refactor duplicated specific code into code that -uses generics. If we had these two functions: +We can use the same process that we used to refactor duplicated type +definitions using generics to refactor duplicated function definitions using +generics. Consider these two side-by-side function signatures that differ in +the type of `argument`: ```text fn takes_integer(argument: i32) { fn takes_float(argument: f64) { @@ -236,35 +315,26 @@ fn takes_integer(argument: i32) { fn takes_float(argument: f64) { } } ``` -We'd replace their parameter with `T`: +We can add a type parameter list that declares the generic type `T` after the function names, then use `T` where the specific `i32` and `f64` types were: ```text -fn takes_integer(argument: T) { fn takes_float(argument: T) { +fn takes_integer(argument: T) { fn takes_float(argument: T) { // code goes here // code goes here } } ``` -Add the `T` parameter to the type parameter list: +At this point, only the names differ, so we could unify the two functions into one: -```text -fn takes_integer(argument: T) { fn takes_float(argument: T) { - // code goes here // code goes here -} } +```rust,ignore +fn takes(argument: T) { + // code goes here +} ``` -And then rename them: - -```text -fn takes(argument: T) { fn takes(argument: T) { - // code goes here // code goes here -} } -``` - -Now they're the same! - There's one problem though. We've got some function _definitions_ that work, -but if we try to use our argument in the function body, we'll get an error. To -see what we mean here, try compiling this function: +but if we try to use `argument` in code in the function body, we'll get an +error. For example, this function definition tries to print out the value of +`argument` in its body: ```rust,ignore fn print(argument: T) { @@ -272,9 +342,9 @@ fn print(argument: T) { } ``` -You'll get an error that looks like this: +Compiling this definition results in an error: -```text +```bash error[E0277]: the trait bound `T: std::fmt::Display` is not satisfied --> :3:37 | @@ -288,4 +358,4 @@ error: aborting due to previous error(s) ``` This error mentions something we haven't learned about yet: traits. In the next -section, we'll figure out how to make this compile. +section, we'll learn how to make this compile.