|
| constexpr size_t | num_bytes (char8_t c) noexcept |
| | Returns the expected number of bytes in a UTF-8 byte sequence, determined by inspecting its first byte.
|
| constexpr char32_t | append (char32_t c, char8_t b) noexcept |
| | Appends the continuation byte b to c while converting UTF-8 to UTF-32.
|
| constexpr char32_t | first (char32_t c, char32_t num) noexcept |
| | Get relevant bits of first UTF-8 byte c of a multi-byte sequence consisting of num bytes.
|
| constexpr char32_t | min_code_point (size_t num) noexcept |
| | Minimum Unicode scalar value representable in a UTF-8 sequence of num bytes.
|
| constexpr bool | is_scalar_value (char32_t c) noexcept |
| | Is c a valid Unicode scalar value?
|
| constexpr char8_t | is_valid234 (char8_t c) noexcept |
| | Is c a valid 2nd, 3rd, or 4th byte of a UTF-8 byte sequence?
|
| char32_t | decode (std::istream &is) |
| | Decodes the next UTF-8 sequence from is into a single char32_t.
|
| bool | encode (std::ostream &os, char32_t c32) |
| | Encodes c32 as UTF-8 and writes the resulting bytes to os.
|
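The helper functions listed above can be combined by hand to decode a single sequence. The following is a minimal sketch and not the library's implementation: the function bodies are assumptions derived from the UTF-8 bit layout, everything lives in a hypothetical `sketch` namespace, and `unsigned char` stands in for `char8_t` so that the block also builds before C++20.

```cpp
#include <cassert>
#include <cstddef>

namespace sketch { // hypothetical namespace; bodies are assumptions, not the library's code

// Sequence length announced by the leading byte; 0 marks an invalid leading byte (assumed convention).
constexpr std::size_t num_bytes(unsigned char c) noexcept {
    return (c & 0b1000'0000) == 0b0000'0000 ? 1   // 0xxxxxxx
         : (c & 0b1110'0000) == 0b1100'0000 ? 2   // 110xxxxx
         : (c & 0b1111'0000) == 0b1110'0000 ? 3   // 1110xxxx
         : (c & 0b1111'1000) == 0b1111'0000 ? 4   // 11110xxx
         : 0;
}

// Keep only the payload bits of the leading byte of a num-byte sequence (num in 2..4).
constexpr char32_t first(char32_t c, char32_t num) noexcept {
    return c & (0b0001'1111u >> (num - 2u));
}

// Shift c left by six bits and append the six payload bits of continuation byte b.
constexpr char32_t append(char32_t c, unsigned char b) noexcept {
    return (c << 6) | (b & 0b0011'1111u);
}

// Smallest code point that genuinely needs num bytes; anything below it is overlong.
constexpr char32_t min_code_point(std::size_t num) noexcept {
    return num == 2 ? 0x80 : num == 3 ? 0x800 : num == 4 ? 0x10000 : 0;
}

// Scalar values are all code points up to U+10FFFF except the surrogate range.
constexpr bool is_scalar_value(char32_t c) noexcept {
    return c <= 0xD7FF || (c >= 0xE000 && c <= 0x10FFFF);
}

// Continuation bytes have the form 10xxxxxx.
constexpr bool is_valid234(unsigned char c) noexcept {
    return (c & 0b1100'0000) == 0b1000'0000;
}

// Worked example: U+20AC (euro sign) is encoded as E2 82 AC.
static_assert(num_bytes(0xE2) == 3, "E2 announces three bytes");
static_assert(append(append(first(0xE2, 3), 0x82), 0xAC) == 0x20AC, "payload bits reassemble U+20AC");
// Overlong example: C0 80 would decode to 0, which is below min_code_point(2).
static_assert(append(first(0xC0, 2), 0x80) < min_code_point(2), "C0 80 is overlong");

} // namespace sketch
```

Splitting the work into small `constexpr` pieces like this lets each validity rule (continuation-byte shape, overlong check, scalar-value check) be tested in isolation at compile time.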
Safe char32_t-style wrappers for <cctype> functions:
Like all other functions from <cctype>, the behavior of std::isalnum is undefined if the argument's value is neither representable as unsigned char nor equal to EOF.
|
| bool | isalnum (char32_t c) noexcept |
| bool | isalpha (char32_t c) noexcept |
| bool | isblank (char32_t c) noexcept |
| bool | iscntrl (char32_t c) noexcept |
| bool | isdigit (char32_t c) noexcept |
| bool | isgraph (char32_t c) noexcept |
| bool | islower (char32_t c) noexcept |
| bool | isprint (char32_t c) noexcept |
| bool | ispunct (char32_t c) noexcept |
| bool | isspace (char32_t c) noexcept |
| bool | isupper (char32_t c) noexcept |
| bool | isxdigit (char32_t c) noexcept |
| bool | isascii (char32_t c) noexcept |
| char32_t | tolower (char32_t c) noexcept |
| char32_t | toupper (char32_t c) noexcept |
| constexpr bool | isrange (char32_t c, char32_t begin, char32_t finis) noexcept |
| | Is c within [begin, finis]?
|
| constexpr auto | isrange (char32_t begin, char32_t finis) noexcept |
| constexpr bool | isodigit (char32_t c) noexcept |
| | Is octal digit?
|
| constexpr bool | isbdigit (char32_t c) noexcept |
| | Is binary digit?
|
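To illustrate why such wrappers are needed, here is a minimal sketch of how a guarded wrapper might look. It is not the library's code: the `sketch` namespace is hypothetical, and returning `false` (or the unchanged value for `tolower`) for code points that do not fit in `unsigned char` is an assumed policy. The two-argument `isrange` is assumed to return a predicate, since the listing gives no description for it.

```cpp
#include <cassert>
#include <cctype>

namespace sketch { // hypothetical namespace; the out-of-range policy is an assumption

// Forward to <cctype> only when c is representable as unsigned char (8-bit char assumed).
inline bool isalnum(char32_t c) noexcept {
    return c <= 0xFF && std::isalnum(static_cast<int>(c)) != 0;
}

// Code points outside the unsigned char range are returned unchanged.
inline char32_t tolower(char32_t c) noexcept {
    return c <= 0xFF ? static_cast<char32_t>(std::tolower(static_cast<int>(c))) : c;
}

// Is c within the closed interval [begin, finis]?
constexpr bool isrange(char32_t c, char32_t begin, char32_t finis) noexcept {
    return begin <= c && c <= finis;
}

// Assumed meaning of the two-argument overload: build a reusable range predicate.
// (The listed function is constexpr; plain inline is used here to stay C++14-friendly.)
inline auto isrange(char32_t begin, char32_t finis) noexcept {
    return [=](char32_t c) { return begin <= c && c <= finis; };
}

constexpr bool isodigit(char32_t c) noexcept { return isrange(c, U'0', U'7'); }
constexpr bool isbdigit(char32_t c) noexcept { return isrange(c, U'0', U'1'); }

} // namespace sketch
```

With a guard like this, a lexer can pass any decoded code point (for example U+4E2D) to a classification function without risking undefined behavior; the call simply reports that the code point is not in the class.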
Build a predicate that checks whether a code point matches any of the given values.
|
| bool | _any (char32_t c, char32_t d) |
| template<class... T> |
| bool | _any (char32_t c, char32_t d, T... args) |
| template<class... T> |
| auto | any (T... args) |
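The signatures above suggest a recursive variadic helper wrapped in a lambda. The following sketch shows one way this could be written; the bodies are assumptions reconstructed from the signatures, and the `sketch` namespace is hypothetical.

```cpp
#include <cassert>

namespace sketch { // hypothetical namespace; bodies are assumptions based on the signatures

// Base case: compare c against a single candidate.
inline bool _any(char32_t c, char32_t d) { return c == d; }

// Recursive case: compare against d, then against the remaining candidates.
template<class... T>
inline bool _any(char32_t c, char32_t d, T... args) {
    return c == d || _any(c, args...);
}

// Capture the candidates by value and return a predicate over a single code point.
template<class... T>
auto any(T... args) {
    return [=](char32_t c) { return _any(c, args...); };
}

} // namespace sketch
```

A predicate built this way, for example `auto is_sign = sketch::any(U'+', U'-');`, can be stored and reused wherever a `bool(char32_t)` callable is expected.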
UTF-8 helpers for decoding byte streams, encoding char32_t values, and running ASCII-style character classification on char32_t.
The central entry points are decode and encode. Decoding returns sentinel values such as EoF and Invalid instead of throwing.
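As an illustration of the encoding direction, the sketch below writes a code point to a `std::ostream` as one to four UTF-8 bytes. It is not the library's implementation: the `sketch` namespace is hypothetical, and returning `false` without writing anything for non-scalar values is an assumed interpretation of the `bool` return type.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

namespace sketch { // hypothetical namespace; the failure policy is an assumption

inline bool encode(std::ostream& os, char32_t c32) {
    // Reject surrogates and values beyond U+10FFFF (assumed meaning of a false result).
    if (c32 > 0x10FFFF || (c32 >= 0xD800 && c32 <= 0xDFFF)) return false;

    auto put = [&os](char32_t b) { os.put(static_cast<char>(static_cast<unsigned char>(b))); };

    if (c32 <= 0x7F) {                       // 0xxxxxxx
        put(c32);
    } else if (c32 <= 0x7FF) {               // 110xxxxx 10xxxxxx
        put(0xC0 | (c32 >> 6));
        put(0x80 | (c32 & 0x3F));
    } else if (c32 <= 0xFFFF) {              // 1110xxxx 10xxxxxx 10xxxxxx
        put(0xE0 | (c32 >> 12));
        put(0x80 | ((c32 >> 6) & 0x3F));
        put(0x80 | (c32 & 0x3F));
    } else {                                 // 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
        put(0xF0 | (c32 >> 18));
        put(0x80 | ((c32 >> 12) & 0x3F));
        put(0x80 | ((c32 >> 6) & 0x3F));
        put(0x80 | (c32 & 0x3F));
    }
    return true;
}

} // namespace sketch
```

For example, encoding U+20AC into a `std::ostringstream` yields the three bytes E2 82 AC, while a surrogate such as U+D800 leaves the stream untouched.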
| inline char32_t | fe::utf8::decode (std::istream &is) |
Decodes the next UTF-8 sequence from is into a single char32_t.
Returns EoF when the stream is exhausted and Invalid for malformed, overlong, surrogate, or otherwise non-scalar encodings.
Definition at line 64 of file utf8.h.
References append(), EoF, first(), Invalid, is_scalar_value(), is_valid234(), min_code_point(), and num_bytes().
Referenced by fe::Lexer< K, S >::next().
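The decoding steps described above can be sketched as follows. This is a self-contained approximation rather than the code in utf8.h: the `sketch` namespace, the numeric values chosen for `EoF` and `Invalid`, and the decision to report a truncated sequence as `Invalid` are all assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

namespace sketch { // hypothetical namespace

constexpr char32_t EoF     = 0xFFFFFFFF; // assumed sentinel value
constexpr char32_t Invalid = 0xFFFFFFFE; // assumed sentinel value

inline char32_t decode(std::istream& is) {
    const int eof = std::istream::traits_type::eof();
    int lead = is.get();
    if (lead == eof) return EoF;

    unsigned b0 = static_cast<unsigned>(lead); // get() yields 0..255 for real bytes
    std::size_t num = b0 < 0x80            ? 1
                    : (b0 & 0xE0) == 0xC0  ? 2
                    : (b0 & 0xF0) == 0xE0  ? 3
                    : (b0 & 0xF8) == 0xF0  ? 4
                    : 0;
    if (num == 0) return Invalid;              // stray continuation byte or invalid lead
    if (num == 1) return b0;                   // plain ASCII

    char32_t c = b0 & (0x1Fu >> (num - 2));    // payload bits of the leading byte
    for (std::size_t i = 1; i < num; ++i) {
        int next = is.get();
        if (next == eof) return Invalid;       // truncated sequence (assumed policy)
        if ((next & 0xC0) != 0x80) return Invalid; // not of the form 10xxxxxx
        c = (c << 6) | (next & 0x3F);
    }

    static const char32_t min_cp[] = {0, 0, 0x80, 0x800, 0x10000};
    if (c < min_cp[num]) return Invalid;       // overlong encoding
    if (c > 0x10FFFF || (c >= 0xD800 && c <= 0xDFFF)) return Invalid; // not a scalar value
    return c;
}

} // namespace sketch
```

A caller such as a lexer would typically loop on `decode` until it sees `EoF`, and treat `Invalid` as a diagnostic rather than an exception, which matches the sentinel-based error reporting described for this function.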