pstore2
utf.cpp File Reference

Implementation of functions for processing UTF-8 strings.

#include "pstore/support/utf.hpp"
#include <algorithm>
#include <cctype>
#include <cstring>
#include <type_traits>
#include "pstore/support/assert.hpp"

Functions

auto pstore::utf::slice (gsl::czstring str, std::ptrdiff_t start, std::ptrdiff_t end) -> std::pair< std::ptrdiff_t, std::ptrdiff_t >
 Converts the code-point indices start and end to byte offsets in the buffer at str.
 
auto pstore::utf::length (char const *str, std::size_t nbytes) -> std::size_t
 Returns the number of UTF-8 code points in the buffer given by a start address and length.
 
auto pstore::utf::length (gsl::czstring str) -> std::size_t
 Returns the number of UTF-8 code points in the null-terminated buffer at str.
 
auto pstore::utf::length (std::string const &str) -> std::size_t
 
auto pstore::utf::index (gsl::czstring str, std::size_t pos) -> gsl::czstring
 Returns a pointer to the beginning of the pos'th UTF-8 code point in the buffer at str, or nullptr if str is nullptr or pos is too large.
 

Detailed Description

Implementation of functions for processing UTF-8 strings.

Function Documentation

◆ index()

auto pstore::utf::index ( gsl::czstring  str,
std::size_t  pos 
) -> gsl::czstring

Returns a pointer to the beginning of the pos'th UTF-8 code point in the null-terminated buffer at str, or nullptr if str is nullptr or pos is too large.
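The lookup described above can be illustrated with a minimal sketch. This is not the pstore implementation: the names is_continuation_byte and code_point_at are hypothetical, and the sketch assumes well-formed UTF-8, a zero-based pos, and that a pos equal to the number of code points counts as too large.

```cpp
#include <cstddef>

// Hypothetical helper (not part of pstore): true if c is a UTF-8
// continuation byte, that is, one with the bit pattern 10xxxxxx.
inline bool is_continuation_byte (char c) {
    return (static_cast<unsigned char> (c) & 0xC0U) == 0x80U;
}

// Sketch of a code-point lookup under the assumptions stated above.
// Returns a pointer to the first byte of the pos'th code point, or
// nullptr if str is null or the string has too few code points.
inline char const * code_point_at (char const * str, std::size_t pos) {
    if (str == nullptr) {
        return nullptr;
    }
    for (; *str != '\0'; ++str) {
        if (!is_continuation_byte (*str)) {
            // This byte starts a new code point.
            if (pos == 0) {
                return str;
            }
            --pos;
        }
    }
    return nullptr; // Ran out of string before reaching pos.
}
```

For the string "h\xC3\xA9llo" (the word "héllo"), position 2 in this sketch yields a pointer three bytes past the start, because the code point at position 1 occupies two bytes.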

◆ length()

auto pstore::utf::length ( char const *  str,
std::size_t  nbytes 
) -> std::size_t

Returns the number of UTF-8 code points in the buffer given by a start address and length.

Parameters
str     The buffer start address.
nbytes  The number of bytes in the buffer.
Returns
The number of UTF-8 code points in the buffer given by 'str' and 'nbytes'.
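One common way to count code points, shown here as a sketch rather than as the pstore implementation, is to count every byte that is not a UTF-8 continuation byte. The names is_continuation_byte and count_code_points are hypothetical, and the sketch assumes the buffer holds well-formed UTF-8.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not part of pstore): true if c is a UTF-8
// continuation byte, that is, one with the bit pattern 10xxxxxx.
inline bool is_continuation_byte (char c) {
    return (static_cast<unsigned char> (c) & 0xC0U) == 0x80U;
}

// Sketch of a code-point counter for a buffer of nbytes bytes.
// Every byte that is not a continuation byte starts a code point,
// so counting those bytes gives the number of code points in
// well-formed input.
inline std::size_t count_code_points (char const * str, std::size_t nbytes) {
    assert (str != nullptr || nbytes == 0);
    std::size_t count = 0;
    for (std::size_t i = 0; i < nbytes; ++i) {
        if (!is_continuation_byte (str[i])) {
            ++count;
        }
    }
    return count;
}
```

With this sketch, the six-byte buffer "h\xC3\xA9llo" contains five code points, since the two bytes 0xC3 0xA9 encode a single character.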

◆ slice()

auto pstore::utf::slice ( gsl::czstring  str,
std::ptrdiff_t  start,
std::ptrdiff_t  end 
) -> std::pair<std::ptrdiff_t, std::ptrdiff_t>

Converts the code-point indices start and end to byte offsets in the buffer at str.

Parameters
str    A UTF-8 encoded character string.
start  The code-point index of the start of a character range within the string 'str'.
end    The code-point index of the end of a character range within the string 'str'.
Returns
A pair containing the byte offset of the start UTF-8 code unit and the byte offset of the end UTF-8 code unit. Either value is -1 if the corresponding index is out of range.
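The conversion from code-point indices to byte offsets can be sketched as follows. This is an illustration under stated assumptions, not the pstore implementation: the names is_continuation_byte, byte_offset_of, and code_point_range_to_bytes are hypothetical, the input is assumed to be well-formed and null-terminated, indices are zero-based, and both negative indices and an index equal to the number of code points are treated as out of range.

```cpp
#include <cstddef>
#include <utility>

// Hypothetical helper (not part of pstore): true if c is a UTF-8
// continuation byte, that is, one with the bit pattern 10xxxxxx.
inline bool is_continuation_byte (char c) {
    return (static_cast<unsigned char> (c) & 0xC0U) == 0x80U;
}

// Hypothetical helper: byte offset of the code point with index
// cp_index, or -1 if str is null or the index is out of range.
inline std::ptrdiff_t byte_offset_of (char const * str, std::ptrdiff_t cp_index) {
    if (str == nullptr || cp_index < 0) {
        return -1;
    }
    std::ptrdiff_t seen = 0; // Code points passed so far.
    for (std::ptrdiff_t offset = 0; str[offset] != '\0'; ++offset) {
        if (!is_continuation_byte (str[offset])) {
            if (seen == cp_index) {
                return offset;
            }
            ++seen;
        }
    }
    return -1;
}

// Sketch of the index-to-offset conversion: each index is resolved
// independently, so one may be valid while the other is -1.
inline std::pair<std::ptrdiff_t, std::ptrdiff_t>
code_point_range_to_bytes (char const * str, std::ptrdiff_t start, std::ptrdiff_t end) {
    return {byte_offset_of (str, start), byte_offset_of (str, end)};
}
```

For "h\xC3\xA9llo", the code-point range (1, 3) maps to the byte offsets (1, 4) in this sketch, while the range (0, 9) maps to (0, -1) because the string has only five code points.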