mirror of
https://github.com/rust-lang-cn/book-cn.git
synced 2025-01-23 23:50:25 +08:00
More edits and some TODOs for strings
parent 00e5b22d9e
commit ccd4dcf178
@@ -250,6 +250,8 @@ should `answer` be `208`? `208` is not a valid character on its own, though.
 Plus, for latin letters, this would not return the answer most people would
 expect: `&"hello"[0]` would then return `104`, not `h`.
 
+### Bytes, and Scalar Values, and Grapheme Clusters! Oh my!
+
 This leads to another point about UTF-8: there are really three relevant ways
 to look at strings, from Rust's perspective: bytes, scalar values, and grapheme
 clusters. If we look at the string "नमस्ते", it is ultimately stored as a `Vec`
@@ -281,10 +283,11 @@ really complicated yet?
 All of these problems mean that Rust does not implement `[]` for `String`, so
 we cannot directly do this.
 
-However.
+## Slicing Strings
 
-Sometimes, indexing the bytes of a string is useful. So while you can't use `[]`
-with a single number, you _can_ use `[]` with a range:
+However, indexing the bytes of a string is very useful. While you can't use
+`[]` with a single number, you _can_ use `[]` with a range to create a string
+slice from particular bytes:
 
 ```rust
 let hello = "Здравствуйте";
@@ -294,8 +297,19 @@ let s = &hello[0..4];
 
 Here, `s` will be a `&str` that contains the first four bytes of the string.
 Earlier, we mentioned that each of these characters was two bytes, so that means
-that `s` will be 'Зд'.
+that `s` will be "Зд".
 
 What would happen if we did `&hello[0..1]`? We said each of these characters
-required two bytes. The answer: it will panic, in the same way that accessing
-an invalid index in a vector does.
+required two bytes. The answer: it will panic at runtime, in the same way that
+accessing an invalid index in a vector does:
+
+```bash
+thread 'main' panicked at 'index 0 and/or 1 in `Здравствуйте` do not lie on
+character boundary', ../src/libcore/str/mod.rs:1694
+```
+
+## Methods for Iterating Over Strings
+
+TODO: Add examples of using `bytes`, `chars`, since those are recommended?
+
+TODO: summary that ties this to the next chapter
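
Note on the first TODO: the following is only a rough sketch of what a `bytes`/`chars` example could look like, reusing the `hello` string from the slicing section. It is not part of this commit. The helper name `safe_slice` is hypothetical; it wraps the standard `str::get` method, which returns `None` where range indexing with `[]` would panic.

```rust
// Hypothetical helper (not from the commit): slice by byte range without
// panicking. `str::get` returns `None` when the range is out of bounds or
// does not fall on UTF-8 character boundaries.
fn safe_slice(s: &str, start: usize, end: usize) -> Option<&str> {
    s.get(start..end)
}

fn main() {
    let hello = "Здравствуйте";

    // Bytes 0..4 cover the first two letters, each of which is two bytes.
    assert_eq!(safe_slice(hello, 0, 4), Some("Зд"));

    // Byte 1 falls in the middle of 'З', so `&hello[0..1]` would panic;
    // `get` reports the problem as `None` instead.
    assert!(!hello.is_char_boundary(1));
    assert_eq!(safe_slice(hello, 0, 1), None);

    // `chars` iterates over Unicode scalar values.
    for c in hello.chars().take(2) {
        println!("{}", c);
    }

    // `bytes` iterates over the raw UTF-8 bytes.
    for b in hello.bytes().take(4) {
        println!("{}", b);
    }
}
```

Being explicit about `chars` versus `bytes` makes the reader choose which view of the string they want, which is the point the slicing section builds toward.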