
File include/brisk/core/Encoding.hpp


UTFPolicy enum

enum class UTFPolicy

Enum class representing policies for handling invalid UTF characters.

SkipInvalid enumerator (UTFPolicy::SkipInvalid)

Skip invalid characters.

ReplaceInvalid enumerator (UTFPolicy::ReplaceInvalid)

Replace invalid characters with a replacement character.

Default enumerator (UTFPolicy::Default)

Default policy for handling invalid characters.


replacementChar variable

constexpr inline char32_t replacementChar = U'\U0000FFFD'

The replacement character used for invalid UTF characters.


utf8_bom variable

extern U8StringView utf8_bom

UTF-8 Byte Order Mark (BOM) as a string view.


utf16_bom variable

extern U16StringView utf16_bom

UTF-16 Byte Order Mark (BOM) as a string view.


utf32_bom variable

extern U32StringView utf32_bom

UTF-32 Byte Order Mark (BOM) as a string view.


utfSkipBom function

template <typename OutChar, typename InChar>
std::basic_string_view<OutChar>
utfSkipBom(std::basic_string_view<InChar> text)

Skips BOM in a UTF encoded text.
Param text A string view of the input text.
Returns A string view of the text without BOM.


utf8SkipBom function

inline U8StringView utf8SkipBom(U8StringView text)

Skips BOM in UTF-8 encoded text.
Param text A string view of the input text.
Returns A string view of the text without BOM.
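
To make the BOM-skipping behaviour concrete, here is a minimal standalone sketch. It is not Brisk's implementation: the name `skipUtf8BomSketch` is hypothetical, and it uses plain `std::string_view` instead of `U8StringView`. It assumes that "skipping" means dropping a leading EF BB BF byte sequence and leaving all other input untouched.

```cpp
#include <string_view>

// Hypothetical stand-in for illustration only (not part of Brisk).
// Drops a leading UTF-8 BOM (bytes EF BB BF) if present; otherwise
// returns the view unchanged. No bytes are copied.
inline std::string_view skipUtf8BomSketch(std::string_view text) {
    constexpr std::string_view bom = "\xEF\xBB\xBF";
    if (text.substr(0, bom.size()) == bom)
        text.remove_prefix(bom.size());
    return text;
}
```

Because the result is a view into the caller's buffer, the original text must outlive it.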


utf16SkipBom function

inline U16StringView utf16SkipBom(U16StringView text)

Skips BOM in UTF-16 encoded text.
Param text A string view of the input text.
Returns A string view of the text without BOM.


utf32SkipBom function

inline U32StringView utf32SkipBom(U32StringView text)

Skips BOM in UTF-32 encoded text.
Param text A string view of the input text.
Returns A string view of the text without BOM.


utfToUtf function

template <typename OutChar, typename InChar>
std::basic_string<OutChar>
utfToUtf(std::basic_string_view<InChar> text,
         UTFPolicy policy)

Converts text from one UTF encoding to another.
Param text A string view of the input text.
Param policy The policy to handle invalid characters.
Returns A string of the text in the target UTF encoding.


toUtf8 function

template <typename InChar>
inline U8String
toUtf8(std::basic_string_view<InChar> text,
       UTFPolicy policy = UTFPolicy::Default)

Converts text from any UTF encoding (selected by InChar) to UTF-8.
Param text A string view of the input text.
Param policy The policy to handle invalid characters.
Returns A UTF-8 encoded string.


toUtf16 function

template <typename InChar>
inline U16String
toUtf16(std::basic_string_view<InChar> text,
        UTFPolicy policy = UTFPolicy::Default)

Converts text from any UTF encoding (selected by InChar) to UTF-16.
Param text A string view of the input text.
Param policy The policy to handle invalid characters.
Returns A UTF-16 encoded string.


toUtf32 function

template <typename InChar>
inline U32String
toUtf32(std::basic_string_view<InChar> text,
        UTFPolicy policy = UTFPolicy::Default)

Converts text from any UTF encoding (selected by InChar) to UTF-32.
Param text A string view of the input text.
Param policy The policy to handle invalid characters.
Returns A UTF-32 encoded string.


utf8ToUtf16 function

inline std::u16string
utf8ToUtf16(U8StringView text,
            UTFPolicy policy = UTFPolicy::Default)

Converts UTF-8 text to UTF-16.
Param text A UTF-8 encoded string view.
Param policy The policy to handle invalid characters.
Returns A UTF-16 encoded string.


utf8ToUtf32 function

inline std::u32string
utf8ToUtf32(U8StringView text,
            UTFPolicy policy = UTFPolicy::Default)

Converts UTF-8 text to UTF-32.
Param text A UTF-8 encoded string view.
Param policy The policy to handle invalid characters.
Returns A UTF-32 encoded string.


utf16ToUtf8 function

inline std::string
utf16ToUtf8(U16StringView text,
            UTFPolicy policy = UTFPolicy::Default)

Converts UTF-16 text to UTF-8.
Param text A UTF-16 encoded string view.
Param policy The policy to handle invalid characters.
Returns A UTF-8 encoded string.


utf16ToUtf32 function

inline std::u32string
utf16ToUtf32(U16StringView text,
             UTFPolicy policy = UTFPolicy::Default)

Converts UTF-16 text to UTF-32.
Param text A UTF-16 encoded string view.
Param policy The policy to handle invalid characters.
Returns A UTF-32 encoded string.


utf32ToUtf8 function

inline std::string
utf32ToUtf8(U32StringView text,
            UTFPolicy policy = UTFPolicy::Default)

Converts UTF-32 text to UTF-8.
Param text A UTF-32 encoded string view.
Param policy The policy to handle invalid characters.
Returns A UTF-8 encoded string.


utf32ToUtf16 function

inline std::u16string
utf32ToUtf16(U32StringView text,
             UTFPolicy policy = UTFPolicy::Default)

Converts UTF-32 text to UTF-16.
Param text A UTF-32 encoded string view.
Param policy The policy to handle invalid characters.
Returns A UTF-16 encoded string.
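
The UTF-32 to UTF-16 direction can be sketched in a few lines, since it only requires splitting codepoints above U+FFFF into surrogate pairs. The function below is a hypothetical illustration (`utf32ToUtf16Sketch` is not a Brisk name). It assumes the input is already valid, so it takes no policy argument.

```cpp
#include <string>
#include <string_view>

// Hypothetical illustration (not part of Brisk): converts valid UTF-32
// to UTF-16. Codepoints above U+FFFF become a high/low surrogate pair.
// Assumption: the input holds no surrogates and no values above U+10FFFF.
inline std::u16string utf32ToUtf16Sketch(std::u32string_view text) {
    std::u16string result;
    result.reserve(text.size());
    for (char32_t ch : text) {
        if (ch < 0x10000) {
            result.push_back(static_cast<char16_t>(ch));
        } else {
            const char32_t offset = ch - 0x10000;
            result.push_back(static_cast<char16_t>(0xD800 + (offset >> 10)));
            result.push_back(static_cast<char16_t>(0xDC00 + (offset & 0x3FF)));
        }
    }
    return result;
}
```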


wcsToUtf8 function

inline string
wcsToUtf8(WStringView text,
          UTFPolicy policy = UTFPolicy::Default)

Converts wide character string (wchar_t) to UTF-8.
Param text A wide character string view.
Param policy The policy to handle invalid characters.
Returns A UTF-8 encoded string.


utf8ToWcs function

inline wstring
utf8ToWcs(U8StringView text,
          UTFPolicy policy = UTFPolicy::Default)

Converts UTF-8 text to wide character string (wchar_t).
Param text A UTF-8 encoded string view.
Param policy The policy to handle invalid characters.
Returns A wide character string.


wcsToUtf32 function

inline u32string
wcsToUtf32(WStringView text,
           UTFPolicy policy = UTFPolicy::Default)

Converts wide character string (wchar_t) to UTF-32.
Param text A wide character string view.
Param policy The policy to handle invalid characters.
Returns A UTF-32 encoded string.


utf32ToWcs function

inline wstring
utf32ToWcs(U32StringView text,
           UTFPolicy policy = UTFPolicy::Default)

Converts UTF-32 text to wide character string (wchar_t).
Param text A UTF-32 encoded string view.
Param policy The policy to handle invalid characters.
Returns A wide character string.


utfCodepoints function

template <typename Char>
size_t utfCodepoints(std::basic_string_view<Char> text,
                     UTFPolicy policy = UTFPolicy::Default)

Counts the number of UTF codepoints in the text.
Param text A string view of the input text.
Param policy The policy to handle invalid characters.
Returns The number of UTF codepoints in the text.


utf8Codepoints function

inline size_t
utf8Codepoints(U8StringView text,
               UTFPolicy policy = UTFPolicy::Default)

Counts the number of codepoints in UTF-8 encoded text.
Param text A UTF-8 encoded string view.
Param policy The policy to handle invalid characters.
Returns The number of codepoints in the text.
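
A codepoint count differs from a byte count whenever the text contains non-ASCII characters. The sketch below shows one common way to count codepoints in UTF-8. It is an assumption-laden illustration, not Brisk's code: `countUtf8CodepointsSketch` is a hypothetical name, and the function assumes well-formed input, so it has no equivalent of the `policy` parameter.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical helper (not part of Brisk): counts codepoints in
// well-formed UTF-8 by counting every byte that is not a continuation
// byte (bit pattern 10xxxxxx). Malformed input is not detected.
inline std::size_t countUtf8CodepointsSketch(std::string_view text) {
    std::size_t count = 0;
    for (unsigned char byte : text) {
        if ((byte & 0xC0) != 0x80)
            ++count;
    }
    return count;
}
```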


utf16Codepoints function

inline size_t
utf16Codepoints(U16StringView text,
                UTFPolicy policy = UTFPolicy::Default)

Counts the number of codepoints in UTF-16 encoded text.
Param text A UTF-16 encoded string view.
Param policy The policy to handle invalid characters.
Returns The number of codepoints in the text.


utf32Codepoints function

inline size_t
utf32Codepoints(U32StringView text,
                UTFPolicy policy = UTFPolicy::Default)

Counts the number of codepoints in UTF-32 encoded text.
Param text A UTF-32 encoded string view.
Param policy The policy to handle invalid characters.
Returns The number of codepoints in the text.


utfCleanup function

template <typename Char>
std::basic_string<Char>
utfCleanup(std::basic_string_view<Char> text,
           UTFPolicy policy = UTFPolicy::Default)

Cleans up invalid UTF characters in the text.
Param text A string view of the input text.
Param policy The policy to handle invalid characters.
Returns A cleaned up string with invalid UTF characters handled according to the policy.


utf8Cleanup function

inline string
utf8Cleanup(U8StringView text,
            UTFPolicy policy = UTFPolicy::Default)

Cleans up invalid UTF-8 characters in the text.
Param text A UTF-8 encoded string view.
Param policy The policy to handle invalid characters.
Returns A cleaned up UTF-8 encoded string.


utf16Cleanup function

inline u16string
utf16Cleanup(U16StringView text,
             UTFPolicy policy = UTFPolicy::Default)

Cleans up invalid UTF-16 characters in the text.
Param text A UTF-16 encoded string view.
Param policy The policy to handle invalid characters.
Returns A cleaned up UTF-16 encoded string.


utf32Cleanup function

inline u32string
utf32Cleanup(U32StringView text,
             UTFPolicy policy = UTFPolicy::Default)

Cleans up invalid UTF-32 characters in the text.
Param text A UTF-32 encoded string view.
Param policy The policy to handle invalid characters.
Returns A cleaned up UTF-32 encoded string.
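
The difference between skipping and replacing invalid characters is easiest to see on UTF-32, where each element is one codepoint. The following sketch is hypothetical (`PolicySketch` and `cleanupUtf32Sketch` are not Brisk names). It assumes that "invalid" means a surrogate value or a value above U+10FFFF, which may not match the library's exact rules.

```cpp
#include <string>
#include <string_view>

// Hypothetical mirror of the two explicit policies (not Brisk's enum).
enum class PolicySketch { SkipInvalid, ReplaceInvalid };

// Hypothetical illustration: copies valid codepoints and either drops
// invalid ones or substitutes U+FFFD, depending on the policy.
// Assumption: invalid means a surrogate (D800..DFFF) or a value > 10FFFF.
inline std::u32string cleanupUtf32Sketch(std::u32string_view text,
                                         PolicySketch policy) {
    constexpr char32_t replacement = U'\uFFFD';
    std::u32string result;
    result.reserve(text.size());
    for (char32_t ch : text) {
        const bool valid = ch <= 0x10FFFF && !(ch >= 0xD800 && ch <= 0xDFFF);
        if (valid)
            result.push_back(ch);
        else if (policy == PolicySketch::ReplaceInvalid)
            result.push_back(replacement);
    }
    return result;
}
```

Skipping shortens the output, while replacing keeps one output codepoint per input codepoint, which can matter when positions in the cleaned text are later mapped back to the source.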


UTFValidation enum

enum class UTFValidation

Enum class representing the types of UTF validation results.

Valid enumerator (UTFValidation::Valid)

The UTF sequence is valid.

Invalid enumerator (UTFValidation::Invalid)

The UTF sequence is invalid.

Overlong enumerator (UTFValidation::Overlong)

The UTF sequence is overlong.

Truncated enumerator (UTFValidation::Truncated)

The UTF sequence is truncated.


isAscii function

bool isAscii(U8StringView text)

Checks if the text contains only ASCII characters.
Param text A UTF-8 encoded string view.
Returns True if the text is ASCII, otherwise false.
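
An ASCII check on UTF-8 text reduces to testing that no byte has its high bit set. The sketch below illustrates that idea with a hypothetical function name (`isAsciiSketch`). It is not taken from Brisk, and it treats an empty string as ASCII by assumption.

```cpp
#include <algorithm>
#include <string_view>

// Hypothetical illustration (not part of Brisk): text is ASCII when
// every byte is below 0x80. An empty view is treated as ASCII.
inline bool isAsciiSketch(std::string_view text) {
    return std::all_of(text.begin(), text.end(),
                       [](unsigned char byte) { return byte < 0x80; });
}
```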


utfValidate function

template <typename Char>
UTFValidation utfValidate(std::basic_string_view<Char> text)

Validates a UTF encoded text.
Param text A string view of the input text.
Returns The validation result of the UTF text.


utf8Validate function

inline UTFValidation utf8Validate(U8StringView text)

Validates UTF-8 encoded text.
Param text A UTF-8 encoded string view.
Returns The validation result of the UTF-8 text.


utf16Validate function

inline UTFValidation utf16Validate(U16StringView text)

Validates UTF-16 encoded text.
Param text A UTF-16 encoded string view.
Returns The validation result of the UTF-16 text.
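
For UTF-16, validation mostly comes down to checking surrogate pairing. The sketch below shows one possible classification. The enum `ValidationSketch` and the function `validateUtf16Sketch` are hypothetical. Which result Brisk reports for each malformed case is not stated in this reference, so the mapping used here (an unpaired high surrogate at the end is "truncated", any other unpaired surrogate is "invalid") is an assumption.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical result type (not Brisk's UTFValidation).
enum class ValidationSketch { Valid, Invalid, Truncated };

// Hypothetical illustration: checks that every high surrogate is
// followed by a low surrogate and that no low surrogate stands alone.
inline ValidationSketch validateUtf16Sketch(std::u16string_view text) {
    for (std::size_t i = 0; i < text.size(); ++i) {
        const char16_t unit = text[i];
        if (unit >= 0xDC00 && unit <= 0xDFFF)
            return ValidationSketch::Invalid; // low surrogate with no lead
        if (unit >= 0xD800 && unit <= 0xDBFF) {
            if (i + 1 == text.size())
                return ValidationSketch::Truncated; // pair cut off at the end
            const char16_t next = text[i + 1];
            if (next < 0xDC00 || next > 0xDFFF)
                return ValidationSketch::Invalid; // lead without a trail
            ++i; // skip the trail unit of a well-formed pair
        }
    }
    return ValidationSketch::Valid;
}
```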


utf32Validate function

inline UTFValidation utf32Validate(U32StringView text)

Validates UTF-32 encoded text.
Param text A UTF-32 encoded string view.
Returns The validation result of the UTF-32 text.


(unnamed) enum

enum : char32_t

Enum values representing special UTF codepoints for error handling.

UtfInvalid enumerator ((unnamed))

Represents an invalid UTF codepoint.

UtfOverlong enumerator ((unnamed))

Represents an overlong UTF codepoint.

UtfTruncated enumerator ((unnamed))

Represents a truncated UTF codepoint.


utfTransform function

template <typename Char>
std::basic_string<Char>
utfTransform(std::basic_string_view<Char> text,
             const function<char32_t(char32_t)> &fn,
             UTFPolicy policy = UTFPolicy::Default)

Transforms UTF text using a provided function.
Param text A string view of the input text.
Param fn A function to transform each UTF codepoint.
Param policy The policy to handle invalid characters.
Returns A transformed string.


asciiTransform function

string
asciiTransform(AsciiStringView text,
               const function<char32_t(char32_t)> &fn)

Transforms ASCII text using a provided function.
Param text An ASCII string view.
Param fn A function to transform each UTF codepoint.
Returns A transformed ASCII string.


utf8Transform function

inline string
utf8Transform(U8StringView text,
              const function<char32_t(char32_t)> &fn,
              UTFPolicy policy = UTFPolicy::Default)

Transforms UTF-8 text using a provided function.
Param text A UTF-8 encoded string view.
Param fn A function to transform each UTF codepoint.
Param policy The policy to handle invalid characters.
Returns A transformed UTF-8 encoded string.


utf16Transform function

inline u16string
utf16Transform(U16StringView text,
               const function<char32_t(char32_t)> &fn,
               UTFPolicy policy = UTFPolicy::Default)

Transforms UTF-16 text using a provided function.
Param text A UTF-16 encoded string view.
Param fn A function to transform each UTF codepoint.
Param policy The policy to handle invalid characters.
Returns A transformed UTF-16 encoded string.


utf32Transform function

inline u32string
utf32Transform(U32StringView text,
               const function<char32_t(char32_t)> &fn,
               UTFPolicy policy = UTFPolicy::Default)

Transforms UTF-32 text using a provided function.
Param text A UTF-32 encoded string view.
Param fn A function to transform each UTF codepoint.
Param policy The policy to handle invalid characters.
Returns A transformed UTF-32 encoded string.
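
The transform functions take a callback that maps one codepoint to another. A minimal standalone sketch of that pattern for UTF-32 follows. The name `transformUtf32Sketch` is hypothetical, and the sketch omits the `policy` parameter by assuming the input is valid.

```cpp
#include <functional>
#include <string>
#include <string_view>

// Hypothetical illustration (not part of Brisk): applies fn to every
// codepoint and collects the results. Input validity is assumed.
inline std::u32string
transformUtf32Sketch(std::u32string_view text,
                     const std::function<char32_t(char32_t)>& fn) {
    std::u32string result;
    result.reserve(text.size());
    for (char32_t ch : text)
        result.push_back(fn(ch));
    return result;
}
```

For example, passing a callback that maps `U'b'` to `U'B'` and returns every other codepoint unchanged turns `U"abc"` into `U"aBc"`.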


utfRead function

char32_t utfRead(const char *&text, const char *end)

Reads a UTF codepoint from a UTF-8 text range.
Param text Pointer to the current position in the text.
Param end Pointer to the end of the text.
Returns The UTF codepoint read from the text.
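
The pointer-by-reference signature suggests that the read position advances as a codepoint is consumed. The sketch below shows how such a reader could work for UTF-8. It is a hypothetical illustration: `readUtf8Sketch` and `kDecodeError` are invented names, a single sentinel stands in for the separate invalid, overlong, and truncated markers (whose values are not given here), and the position of `text` after an error is an assumption of this sketch.

```cpp
// Hypothetical sentinel (not a Brisk value): returned for any malformed,
// overlong, or truncated sequence.
constexpr char32_t kDecodeError = 0xFFFFFFFFu;

// Hypothetical illustration: decodes one codepoint from [text, end) and
// advances text past the bytes it consumed. On error, text is left just
// after the last byte that was accepted.
inline char32_t readUtf8Sketch(const char*& text, const char* end) {
    if (text == end)
        return kDecodeError;
    const unsigned char lead = static_cast<unsigned char>(*text++);
    if (lead < 0x80)
        return lead; // single-byte (ASCII) codepoint
    int extra = 0;        // number of continuation bytes expected
    char32_t cp = 0;      // codepoint accumulated so far
    char32_t minimum = 0; // smallest value allowed for this length
    if ((lead & 0xE0) == 0xC0)      { extra = 1; cp = lead & 0x1F; minimum = 0x80; }
    else if ((lead & 0xF0) == 0xE0) { extra = 2; cp = lead & 0x0F; minimum = 0x800; }
    else if ((lead & 0xF8) == 0xF0) { extra = 3; cp = lead & 0x07; minimum = 0x10000; }
    else return kDecodeError; // stray continuation byte or invalid lead
    for (int i = 0; i < extra; ++i) {
        if (text == end)
            return kDecodeError; // sequence cut off by the end of input
        const unsigned char next = static_cast<unsigned char>(*text);
        if ((next & 0xC0) != 0x80)
            return kDecodeError; // expected a continuation byte
        ++text;
        cp = (cp << 6) | (next & 0x3F);
    }
    if (cp < minimum || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        return kDecodeError; // overlong form, out of range, or surrogate
    return cp;
}
```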


char32_t utfRead(const char16_t *&text, const char16_t *end)

Reads a UTF codepoint from a UTF-16 text range.
Param text Pointer to the current position in the text.
Param end Pointer to the end of the text.
Returns The UTF codepoint read from the text.


char32_t utfRead(const char32_t *&text, const char32_t *end)

Reads a UTF codepoint from a UTF-32 text range.
Param text Pointer to the current position in the text.
Param end Pointer to the end of the text.
Returns The UTF codepoint read from the text.


char32_t utfRead(const wchar_t *&text, const wchar_t *end)

Reads a UTF codepoint from a wide character text range.
Param text Pointer to the current position in the text.
Param end Pointer to the end of the text.
Returns The UTF codepoint read from the text.


utfWrite function

void utfWrite(char *&text, char *end, char32_t ch)

Writes a UTF codepoint to a UTF-8 text range.
Param text Pointer to the current position in the text.
Param end Pointer to the end of the text.
Param ch The UTF codepoint to write.
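
The write side mirrors the reader: the output pointer advances by the number of code units written. The sketch below encodes one codepoint as UTF-8 into a bounded buffer. It is hypothetical and differs from the signature above in one respect: `writeUtf8Sketch` returns `bool` to report whether anything was written, because this reference does not say how the real function handles a full buffer or an invalid codepoint.

```cpp
// Hypothetical illustration (not part of Brisk): encodes ch as UTF-8
// into [text, end) and advances text. Returns false and writes nothing
// if ch is not a valid codepoint or the buffer is too small.
inline bool writeUtf8Sketch(char*& text, char* end, char32_t ch) {
    if (ch > 0x10FFFF || (ch >= 0xD800 && ch <= 0xDFFF))
        return false;
    const int length = ch < 0x80 ? 1 : ch < 0x800 ? 2 : ch < 0x10000 ? 3 : 4;
    if (end - text < length)
        return false;
    switch (length) {
    case 1:
        *text++ = static_cast<char>(ch);
        break;
    case 2:
        *text++ = static_cast<char>(0xC0 | (ch >> 6));
        *text++ = static_cast<char>(0x80 | (ch & 0x3F));
        break;
    case 3:
        *text++ = static_cast<char>(0xE0 | (ch >> 12));
        *text++ = static_cast<char>(0x80 | ((ch >> 6) & 0x3F));
        *text++ = static_cast<char>(0x80 | (ch & 0x3F));
        break;
    default:
        *text++ = static_cast<char>(0xF0 | (ch >> 18));
        *text++ = static_cast<char>(0x80 | ((ch >> 12) & 0x3F));
        *text++ = static_cast<char>(0x80 | ((ch >> 6) & 0x3F));
        *text++ = static_cast<char>(0x80 | (ch & 0x3F));
        break;
    }
    return true;
}
```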


void utfWrite(char16_t *&text, char16_t *end, char32_t ch)

Writes a UTF codepoint to a UTF-16 text range.
Param text Pointer to the current position in the text.
Param end Pointer to the end of the text.
Param ch The UTF codepoint to write.


void utfWrite(char32_t *&text, char32_t *end, char32_t ch)

Writes a UTF codepoint to a UTF-32 text range.
Param text Pointer to the current position in the text.
Param end Pointer to the end of the text.
Param ch The UTF codepoint to write.


void utfWrite(wchar_t *&text, wchar_t *end, char32_t ch)

Writes a UTF codepoint to a wide character text range.
Param text Pointer to the current position in the text.
Param end Pointer to the end of the text.
Param ch The UTF codepoint to write.


UtfIterator class

template <typename InChar> UtfIterator

Struct representing a UTF iterator for iterating over UTF text.
Template param InChar The character type of the input text.

end_iterator class (UtfIterator::end_iterator)

end_iterator

iterator class (UtfIterator::iterator)

iterator

utfIterate function

template <typename InChar>
UtfIterator<InChar>
utfIterate(std::basic_string_view<InChar> text,
           UTFPolicy policy = UTFPolicy::Default)

Creates a UTF iterator for iterating over UTF text.
Param text A string view of the input text.
Param policy The policy to handle invalid characters.
Returns A UTF iterator for the given text.
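
The nested `iterator` and `end_iterator` types suggest a range that works with a range-based `for` loop and a sentinel end. The sketch below shows that shape for UTF-16 input. Everything in it is hypothetical (`Utf16RangeSketch` is not a Brisk type), it assumes well-formed input instead of applying a policy, and it needs C++17, where `begin()` and `end()` may return different types.

```cpp
#include <string_view>

// Hypothetical illustration (not part of Brisk): a range over UTF-16
// text that yields one char32_t codepoint per step.
// Assumption: the input is well-formed, so surrogates come in pairs.
struct Utf16RangeSketch {
    std::u16string_view text;

    struct end_iterator {};

    struct iterator {
        std::u16string_view rest; // units not yet consumed

        // True when the next codepoint occupies two code units.
        bool atPair() const {
            return rest[0] >= 0xD800 && rest[0] <= 0xDBFF && rest.size() > 1;
        }
        char32_t operator*() const {
            if (atPair())
                return static_cast<char32_t>(
                    0x10000 + ((rest[0] - 0xD800) << 10) + (rest[1] - 0xDC00));
            return rest[0];
        }
        iterator& operator++() {
            rest.remove_prefix(atPair() ? 2 : 1);
            return *this;
        }
        bool operator!=(end_iterator) const { return !rest.empty(); }
    };

    iterator begin() const { return {text}; }
    end_iterator end() const { return {}; }
};
```

A loop such as `for (char32_t cp : Utf16RangeSketch{u"..."})` then visits codepoints, not code units.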


utf8Iterate function

inline UtfIterator<U8Char>
utf8Iterate(U8StringView text,
            UTFPolicy policy = UTFPolicy::Default)

Creates a UTF-8 iterator for iterating over UTF-8 text.
Param text A UTF-8 encoded string view.
Param policy The policy to handle invalid characters.
Returns A UTF-8 iterator for the given text.


utf16Iterate function

inline UtfIterator<char16_t>
utf16Iterate(U16StringView text,
             UTFPolicy policy = UTFPolicy::Default)

Creates a UTF-16 iterator for iterating over UTF-16 text.
Param text A UTF-16 encoded string view.
Param policy The policy to handle invalid characters.
Returns A UTF-16 iterator for the given text.


utf32Iterate function

inline UtfIterator<char32_t>
utf32Iterate(U32StringView text,
             UTFPolicy policy = UTFPolicy::Default)

Creates a UTF-32 iterator for iterating over UTF-32 text.
Param text A UTF-32 encoded string view.
Param policy The policy to handle invalid characters.
Returns A UTF-32 iterator for the given text.


UTFNormalization enum

enum class UTFNormalization

Enum class representing the types of UTF normalization.

Compose enumerator (UTFNormalization::Compose)

Compose normalization.

Decompose enumerator (UTFNormalization::Decompose)

Decompose normalization.

Compat enumerator (UTFNormalization::Compat)

Compatibility normalization.

NFC enumerator (UTFNormalization::NFC)

Compose normalization (alias for Compose).

NFD enumerator (UTFNormalization::NFD)

Decompose normalization (alias for Decompose).

NFKC enumerator (UTFNormalization::NFKC)

Compatibility and Compose normalization (Unicode Normalization Form KC).

NFKD enumerator (UTFNormalization::NFKD)

Compatibility and Decompose normalization (Unicode Normalization Form KD).


utfNormalize function

template <typename Char>
std::basic_string<Char>
utfNormalize(std::basic_string_view<Char> text,
             UTFNormalization normalization,
             UTFPolicy policy = UTFPolicy::Default)

Normalizes UTF text according to the specified normalization type.
Param text A string view of the input text.
Param normalization The normalization type to apply.
Param policy The policy to handle invalid characters.
Returns A normalized string.


utf8Normalize function

inline U8String
utf8Normalize(U8StringView text,
              UTFNormalization normalization,
              UTFPolicy policy = UTFPolicy::Default)

Normalizes UTF-8 text according to the specified normalization type.
Param text A UTF-8 encoded string view.
Param normalization The normalization type to apply.
Param policy The policy to handle invalid characters.
Returns A normalized UTF-8 encoded string.


utf16Normalize function

inline U16String
utf16Normalize(U16StringView text,
               UTFNormalization normalization,
               UTFPolicy policy = UTFPolicy::Default)

Normalizes UTF-16 text according to the specified normalization type.
Param text A UTF-16 encoded string view.
Param normalization The normalization type to apply.
Param policy The policy to handle invalid characters.
Returns A normalized UTF-16 encoded string.


utf32Normalize function

inline U32String
utf32Normalize(U32StringView text,
               UTFNormalization normalization,
               UTFPolicy policy = UTFPolicy::Default)

Normalizes UTF-32 text according to the specified normalization type.
Param text A UTF-32 encoded string view.
Param normalization The normalization type to apply.
Param policy The policy to handle invalid characters.
Returns A normalized UTF-32 encoded string.


toJson function (std::toJson)

bool toJson(Brisk::Json &j, const std::u32string &s)

Serializes a UTF-32 string to JSON.
Param j The JSON object to serialize to.
Param s The UTF-32 string to serialize.
Returns True if serialization was successful, otherwise false.


bool toJson(Brisk::Json &j, const std::u16string &s)

Serializes a UTF-16 string to JSON.
Param j The JSON object to serialize to.
Param s The UTF-16 string to serialize.
Returns True if serialization was successful, otherwise false.


bool toJson(Brisk::Json &j, const std::wstring &s)

Serializes a wide string to JSON.
Param j The JSON object to serialize to.
Param s The wide string to serialize.
Returns True if serialization was successful, otherwise false.


fromJson function (std::fromJson)

bool fromJson(const Brisk::Json &j, std::u32string &s)

Deserializes a UTF-32 string from JSON.
Param j The JSON object to deserialize from.
Param s The UTF-32 string to deserialize into.
Returns True if deserialization was successful, otherwise false.


bool fromJson(const Brisk::Json &j, std::u16string &s)

Deserializes a UTF-16 string from JSON.
Param j The JSON object to deserialize from.
Param s The UTF-16 string to deserialize into.
Returns True if deserialization was successful, otherwise false.


bool fromJson(const Brisk::Json &j, std::wstring &s)

Deserializes a wide string from JSON.
Param j The JSON object to deserialize from.
Param s The wide string to deserialize into.
Returns True if deserialization was successful, otherwise false.


Auto-generated from sources, Revision , https://github.com/brisklib/brisk/blob//include/brisk/