Windwalker\String\Utf8String

class Utf8String

String handling class for UTF-8 data. Wraps the phputf8 library. All functions assume the validity of UTF-8 strings.

This class is based on the Joomla String package.

Methods

static boolean

is_ascii(string $str)

Tests whether a string contains only 7-bit ASCII bytes.

static mixed

strpos(string $str, string $search, integer $offset = false)

UTF-8 aware alternative to strpos.

static mixed

strrpos(string $str, string $search, integer $offset)

UTF-8 aware alternative to strrpos. Finds the position of the last occurrence of a string.

static mixed

substr(string $str, integer $offset, integer $length = false)

UTF-8 aware alternative to substr. Returns part of a string given a character offset (and optionally a length).

static mixed

strtolower(string $str)

UTF-8 aware alternative to strtolower.

static mixed

strtoupper(string $str)

UTF-8 aware alternative to strtoupper. Makes a string uppercase. Note: The concept of a character's "case" only exists in some alphabets, such as Latin, Greek, Cyrillic, Armenian and archaic Georgian; it does not exist in the Chinese script, for example. See Unicode Standard Annex #21: Case Mappings.

static integer

strlen(string $str)

UTF-8 aware alternative to strlen.

static string

str_ireplace(string $search, string $replace, string $str, integer $count = null)

UTF-8 aware alternative to str_ireplace. Case-insensitive version of str_replace.

static array

str_split(string $str, integer $split_len = 1)

UTF-8 aware alternative to str_split. Converts a string to an array.

static integer

strcasecmp(string $str1, string $str2, mixed $locale = false)

UTF-8/LOCALE aware alternative to strcasecmp. A case-insensitive string comparison.

static integer

strcmp(string $str1, string $str2, mixed $locale = false)

UTF-8/LOCALE aware alternative to strcmp. A case-sensitive string comparison.

static integer

strcspn(string $str, string $mask, integer $start = null, integer $length = null)

UTF-8 aware alternative to strcspn. Finds the length of the initial segment not matching the mask.

static string

stristr(string $str, string $search)

UTF-8 aware alternative to stristr. Returns all of haystack from the first occurrence of needle to the end.

static string

strrev(string $str)

UTF-8 aware alternative to strrev. Reverses a string.

static integer

strspn(string $str, string $mask, integer $start = null, integer $length = null)

UTF-8 aware alternative to strspn. Finds the length of the initial segment matching the mask.

static string

substr_replace(string $str, string $repl, integer $start, integer $length = null)

UTF-8 aware substr_replace. Replaces text within a portion of a string.

static string

ltrim(string $str, string $charlist = null)

UTF-8 aware replacement for ltrim()

static string

rtrim(string $str, string $charlist = null)

UTF-8 aware replacement for rtrim(). Strips whitespace (or other characters) from the end of a string. You only need to use this if you are supplying the optional charlist argument and it contains UTF-8 characters. Otherwise rtrim will work normally on a UTF-8 string.

static string

trim(string $str, string $charlist = null)

UTF-8 aware replacement for trim(). Strips whitespace (or other characters) from the beginning and end of a string. Note: you only need to use this if you are supplying the optional charlist argument and it contains UTF-8 characters. Otherwise trim will work normally on a UTF-8 string.

static string

ucfirst(string $str, string $delimiter = null, string $newDelimiter = null)

UTF-8 aware alternative to ucfirst. Makes a string's first character uppercase, or the first character of every word uppercase.

static string

ucwords(string $str)

UTF-8 aware alternative to ucwords. Uppercases the first character of each word in a string.

static mixed

transcode(string $source, string $from_encoding, string $to_encoding)

Transcode a string.

static boolean

valid(string $str)

Tests a string as to whether it's valid UTF-8 and supported by the Unicode standard.

static boolean

compliant(string $str)

Tests whether a string complies as UTF-8. This will be much faster than utf8_is_valid but will pass five and six octet UTF-8 sequences, which are not supported by Unicode and so cannot be displayed correctly in a browser. In other words, it is not as strict as utf8_is_valid, but it is faster. If you use it to validate user input, you run the risk that attackers will be able to inject 5 and 6 byte sequences (which may or may not be a significant risk, depending on what you are doing).

static string

unicode_to_utf8(string $str)

Converts Unicode sequences to UTF-8 string

static string

unicode_to_utf16(string $str)

Converts Unicode sequences to UTF-16 string

Details

at line 82
`static boolean is_ascii(string $str)`

Tests whether a string contains only 7-bit ASCII bytes.

You might use this to check whether a string needs handling as UTF-8 at all, potentially gaining performance by using the native PHP equivalent when the string is plain ASCII, e.g.:

```php
if (Utf8String::is_ascii($someString)) {
    // It's just ASCII - use the native PHP version
    $someString = strtolower($someString);
} else {
    $someString = Utf8String::strtolower($someString);
}
```

Parameters

string	$str	The string to test.

Return Value

boolean

True if the string is all ASCII
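
As a rough illustration of what such a test involves, the sketch below checks for any byte above 0x7F with a regular expression. The function name `sketch_is_ascii` is hypothetical and this is not necessarily how the class implements `is_ascii`.

```php
<?php
// Hypothetical helper for illustration only, not the library's code.
function sketch_is_ascii(string $str): bool
{
    // Without the /u modifier the pattern works on bytes: any byte outside
    // 0x00-0x7F means the string is not pure 7-bit ASCII.
    return preg_match('/[^\x00-\x7F]/', $str) !== 1;
}

var_dump(sketch_is_ascii('hello'));  // bool(true)
var_dump(sketch_is_ascii('héllo'));  // bool(false)
```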

at line 102
`static mixed strpos(string $str, string $search, integer $offset = false)`

UTF-8 aware alternative to strpos.

Find position of first occurrence of a string.

Parameters

string	$str	String being examined
string	$search	String being searched for
integer	$offset	Optional, specifies the position from which the search should be performed

Return Value

mixed

Number of characters before the first match or FALSE on failure
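
The difference between a byte offset and a character offset can be seen with PHP's own functions. This sketch assumes the mbstring extension is available and uses `mb_strpos` as a stand-in for the UTF-8 aware method; it does not call the class itself.

```php
<?php
// "ï" occupies two bytes in UTF-8, so byte and character offsets diverge after it.
$str = 'naïve café';

var_dump(strpos($str, 'café'));                 // int(7): bytes before the match
var_dump(mb_strpos($str, 'café', 0, 'UTF-8'));  // int(6): characters before the match
```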

at line 125
`static mixed strrpos(string $str, string $search, integer $offset)`

UTF-8 aware alternative to strrpos. Finds the position of the last occurrence of a string.

Parameters

string	$str	String being examined.
string	$search	String being searched for.
integer	$offset	Offset from the left of the string.

Return Value

mixed

Number of characters before the last match or false on failure

at line 143
`static mixed substr(string $str, integer $offset, integer $length = false)`

UTF-8 aware alternative to substr. Returns part of a string given a character offset (and optionally a length).

Parameters

string	$str	String being processed
integer	$offset	Number of UTF-8 characters offset (from left)
integer	$length	Optional length in UTF-8 characters from offset

Return Value

mixed

string or FALSE if failure
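
To show why character offsets matter here, the following sketch compares a byte-based slice with a character-based one. It assumes the mbstring extension and uses `mb_substr` as a stand-in for the UTF-8 aware method.

```php
<?php
$str = 'Ünïcödé';

// Byte-based: starts in the middle of "Ü" and ends in the middle of "ï".
$byteSlice = substr($str, 1, 3);
var_dump(mb_check_encoding($byteSlice, 'UTF-8'));  // bool(false)

// Character-based: skips one character, then takes three.
$charSlice = mb_substr($str, 1, 3, 'UTF-8');
echo $charSlice, "\n";  // nïc
```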

at line 169
`static mixed strtolower(string $str)`

UTF-8 aware alternative to strtolower.

Makes a string lowercase. Note: The concept of a character's "case" only exists in some alphabets, such as Latin, Greek, Cyrillic, Armenian and archaic Georgian; it does not exist in the Chinese script, for example. See Unicode Standard Annex #21: Case Mappings.

Parameters

string	$str	String being processed

Return Value

mixed

Either the string in lowercase, or FALSE if the UTF-8 is invalid

at line 189
`static mixed strtoupper(string $str)`

Parameters

string	$str	String being processed

Return Value

mixed

Either the string in uppercase, or FALSE if the UTF-8 is invalid

at line 206
`static integer strlen(string $str)`

UTF-8 aware alternative to strlen.

Returns the number of characters in the string (NOT THE NUMBER OF BYTES).

Parameters

string	$str	UTF-8 string.

Return Value

integer

Number of UTF-8 characters in string.
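
A short illustration of the byte/character distinction, assuming the mbstring extension is available; `mb_strlen` is used here only as a stand-in for the UTF-8 aware method.

```php
<?php
// "ü" is a single character but takes two bytes in UTF-8.
$str = 'Zürich';

var_dump(strlen($str));              // int(7): bytes
var_dump(mb_strlen($str, 'UTF-8'));  // int(6): characters
```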

at line 225
`static string str_ireplace(string $search, string $replace, string $str, integer $count = null)`

UTF-8 aware alternative to str_ireplace. Case-insensitive version of str_replace.

Parameters

string	$search	String to search for
string	$replace	String to replace each match with
string	$str	String to search in
integer	$count	Optional count value to be passed by reference

Return Value

string

UTF-8 String

at line 252
`static array str_split(string $str, integer $split_len = 1)`

UTF-8 aware alternative to str_split. Converts a string to an array.

Parameters

string	$str	UTF-8 encoded string to process
integer	$split_len	Number of characters to split the string by

Return Value

array
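
One way to split on characters rather than bytes is sketched below. The helper name `sketch_str_split` is hypothetical, and the approach (an empty pattern with the PCRE `/u` modifier, then regrouping) is an assumption for illustration, not the class's actual implementation.

```php
<?php
// Hypothetical helper for illustration only, not the library's code.
function sketch_str_split(string $str, int $splitLen = 1): array
{
    // With /u, the empty pattern splits between code points instead of bytes.
    $chars = preg_split('//u', $str, -1, PREG_SPLIT_NO_EMPTY);

    // preg_split returns false when the input is not valid UTF-8.
    if ($chars === false) {
        return [];
    }

    if ($splitLen <= 1) {
        return $chars;
    }

    // Regroup the single characters into chunks of $splitLen characters.
    return array_map(
        function (array $chunk): string {
            return implode('', $chunk);
        },
        array_chunk($chars, $splitLen)
    );
}

print_r(sketch_str_split('añbc', 2));  // two elements: "añ" and "bc"
```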

at line 277
`static integer strcasecmp(string $str1, string $str2, mixed $locale = false)`

UTF-8/LOCALE aware alternative to strcasecmp. A case-insensitive string comparison.

Parameters

string	$str1	string 1 to compare
string	$str2	string 2 to compare
mixed	$locale	The locale used by strcoll or false to use classical comparison

Return Value

integer

< 0 if str1 is less than str2; > 0 if str1 is greater than str2, and 0 if they are equal.

at line 333
`static integer strcmp(string $str1, string $str2, mixed $locale = false)`

UTF-8/LOCALE aware alternative to strcmp. A case-sensitive string comparison.

Parameters

string	$str1	string 1 to compare
string	$str2	string 2 to compare
mixed	$locale	The locale used by strcoll or false to use classical comparison

Return Value

integer

< 0 if str1 is less than str2; > 0 if str1 is greater than str2, and 0 if they are equal.

at line 385
`static integer strcspn(string $str, string $mask, integer $start = null, integer $length = null)`

UTF-8 aware alternative to strcspn. Finds the length of the initial segment not matching the mask.

Parameters

string	$str	The string to process
string	$mask	The mask
integer	$start	Optional starting character position (in characters)
integer	$length	Optional length

Return Value

integer

The length of the initial segment of $str which does not contain any of the characters in $mask

at line 419
`static string stristr(string $str, string $search)`

UTF-8 aware alternative to stristr. Returns all of haystack from the first occurrence of needle to the end.

Finds the first occurrence of a string using a case-insensitive comparison: needle and haystack are both examined in a case-insensitive manner.

Parameters

string	$str	The haystack
string	$search	The needle

Return Value

string

The substring

at line 440
`static string strrev(string $str)`

UTF-8 aware alternative to strrev. Reverses a string.

Parameters

string	$str	String to be reversed

Return Value

string

The string in reverse character order
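
Reversing by character rather than by byte could look like the sketch below. The helper name `sketch_strrev` is hypothetical and the approach is an assumption, not the class's own code. Note that it reverses code points, so a base letter followed by a combining mark would end up in the opposite order.

```php
<?php
// Hypothetical helper for illustration only, not the library's code.
function sketch_strrev(string $str): string
{
    // Split into code points (the /u modifier keeps multi-byte characters whole).
    $chars = preg_split('//u', $str, -1, PREG_SPLIT_NO_EMPTY);

    // preg_split returns false on invalid UTF-8; return the input unchanged then.
    if ($chars === false) {
        return $str;
    }

    return implode('', array_reverse($chars));
}

echo sketch_strrev('añb'), "\n";  // bña
```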

at line 464
`static integer strspn(string $str, string $mask, integer $start = null, integer $length = null)`

UTF-8 aware alternative to strspn. Finds the length of the initial segment matching the mask.

Parameters

string	$str	The haystack
string	$mask	The mask
integer	$start	Start optional
integer	$length	Length optional

Return Value

integer

at line 498
`static string substr_replace(string $str, string $repl, integer $start, integer $length = null)`

UTF-8 aware substr_replace. Replaces text within a portion of a string.

Parameters

string	$str	The haystack
string	$repl	The replacement string
integer	$start	Start
integer	$length	Length (optional)

Return Value

string

at line 525
`static string ltrim(string $str, string $charlist = null)`

UTF-8 aware replacement for ltrim().

Strips whitespace (or other characters) from the beginning of a string. You only need to use this if you are supplying the optional charlist argument and it contains UTF-8 characters. Otherwise ltrim will work normally on a UTF-8 string.

Parameters

string	$str	The string to be trimmed
string	$charlist	The optional charlist of additional characters to trim

Return Value

string

The trimmed string
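
The reason a multi-byte charlist needs special care is that the native `ltrim` treats the charlist as a set of bytes. The sketch below contrasts that with a character-aware approach; `sketch_ltrim` is a hypothetical helper and only one possible way to do it, not the class's implementation.

```php
<?php
// Hypothetical helper for illustration only, not the library's code.
function sketch_ltrim(string $str, string $charlist): string
{
    if ($charlist === '') {
        return $str;
    }

    // Quote the characters so they are taken literally inside the character class.
    $class = preg_quote($charlist, '/');

    // /u makes the class match whole UTF-8 characters instead of single bytes.
    // preg_replace returns null on invalid UTF-8; fall back to the input then.
    return preg_replace('/^[' . $class . ']+/u', '', $str) ?? $str;
}

// "ñ" (C3 B1) and "ó" (C3 B3) share their first byte, so the native ltrim
// strips half of "ñ" and leaves an invalid byte sequence behind.
var_dump(ltrim('ñu', 'ó') === "\xB1u");   // bool(true)
var_dump(sketch_ltrim('ñu', 'ó'));        // unchanged: "ñu"
var_dump(sketch_ltrim('óóñu', 'ó'));      // "ñu"
```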

at line 560
`static string rtrim(string $str, string $charlist = null)`

Parameters

string	$str	The string to be trimmed
string	$charlist	The optional charlist of additional characters to trim

Return Value

string

The trimmed string

at line 595
`static string trim(string $str, string $charlist = null)`

Parameters

string	$str	The string to be trimmed
string	$charlist	The optional charlist of additional characters to trim

Return Value

string

The trimmed string

at line 630
`static string ucfirst(string $str, string $delimiter = null, string $newDelimiter = null)`

UTF-8 aware alternative to ucfirst. Makes a string's first character uppercase, or the first character of every word uppercase.

Parameters

string	$str	String to be processed
string	$delimiter	The words delimiter (null means do not split the string)
string	$newDelimiter	The new words delimiter (null means equal to $delimiter)

Return Value

string

If $delimiter is null, returns the string with its first character in upper case (if applicable). Otherwise, treats the string as words separated by $delimiter, applies ucfirst to each word, and returns the words joined with the new delimiter.
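
The delimiter behaviour described in the return value can be sketched as follows. The helper `sketch_ucfirst` is hypothetical, assumes the mbstring extension and a non-empty delimiter, and is meant only to illustrate the described behaviour rather than reproduce the class's code.

```php
<?php
// Hypothetical helper for illustration only, not the library's code.
function sketch_ucfirst(string $str, ?string $delimiter = null, ?string $newDelimiter = null): string
{
    // Uppercase the first character of one word, leaving the rest untouched.
    $upperFirst = function (string $word): string {
        $first = mb_substr($word, 0, 1, 'UTF-8');
        $rest  = mb_substr($word, 1, null, 'UTF-8');

        return mb_strtoupper($first, 'UTF-8') . $rest;
    };

    // No delimiter: treat the whole string as a single word.
    if ($delimiter === null) {
        return $upperFirst($str);
    }

    // Otherwise split into words, capitalise each, and rejoin with the
    // new delimiter (or the original one when none is given).
    $words = explode($delimiter, $str);

    return implode($newDelimiter ?? $delimiter, array_map($upperFirst, $words));
}

echo sketch_ucfirst('élan vital'), "\n";            // Élan vital
echo sketch_ucfirst('élan vital', ' '), "\n";       // Élan Vital
echo sketch_ucfirst('élan_vital', '_', ' '), "\n";  // Élan Vital
```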

at line 661
`static string ucwords(string $str)`

UTF-8 aware alternative to ucwords. Uppercases the first character of each word in a string.

Parameters

string	$str	String to be processed

Return Value

string

String with first char of each word uppercase

at line 684
`static mixed transcode(string $source, string $from_encoding, string $to_encoding)`

Transcode a string.

Parameters

string	$source	The string to transcode.
string	$from_encoding	The source encoding.
string	$to_encoding	The target encoding.

Return Value

mixed

The transcoded string, or null if the source was not a string.
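
A minimal sketch of the documented contract (null for non-string input, otherwise a converted string), assuming the iconv extension is available. The helper name `sketch_transcode` is hypothetical, and the class may well handle encodings and failures differently.

```php
<?php
// Hypothetical helper for illustration only, not the library's code.
function sketch_transcode($source, string $fromEncoding, string $toEncoding)
{
    // Mirror the documented behaviour: non-string input yields null.
    if (!is_string($source)) {
        return null;
    }

    // iconv returns the converted string, or false when conversion fails.
    return iconv($fromEncoding, $toEncoding, $source);
}

// "é" is the single byte E9 in ISO-8859-1 and the two bytes C3 A9 in UTF-8.
var_dump(sketch_transcode("caf\xE9", 'ISO-8859-1', 'UTF-8') === 'café');  // bool(true)
var_dump(sketch_transcode(42, 'ISO-8859-1', 'UTF-8'));                    // NULL
```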

at line 718
`static boolean valid(string $str)`

Tests a string as to whether it's valid UTF-8 and supported by the Unicode standard.

Note: this function has been modified to simply return true or false.

Parameters

string	$str	UTF-8 encoded string.

Return Value

boolean

true if valid

at line 744
`static boolean compliant(string $str)`

Parameters

string	$str	UTF-8 string to check

Return Value

boolean

TRUE if string is valid UTF-8
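
The reference for this method points at the PCRE pattern modifiers page, which suggests a check built on the `/u` modifier. The sketch below shows that general idea under that assumption; `sketch_compliant` is a hypothetical helper, and how strictly PCRE treats overlong or 5 and 6 octet sequences may depend on the PCRE version in use.

```php
<?php
// Hypothetical helper for illustration only, not the library's code.
function sketch_compliant(string $str): bool
{
    // With /u, PCRE checks the subject as UTF-8 before matching. The empty
    // pattern always matches, so anything other than 1 signals a decoding error.
    return preg_match('//u', $str) === 1;
}

var_dump(sketch_compliant('héllo'));     // bool(true)
var_dump(sketch_compliant("\xC3\x28"));  // bool(false): lead byte without a continuation byte
```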

at line 760
`static string unicode_to_utf8(string $str)`

Converts Unicode sequences to UTF-8 string

Parameters

string	$str	Unicode string to convert

Return Value

string

UTF-8 string

at line 786
`static string unicode_to_utf16(string $str)`

Converts Unicode sequences to UTF-16 string

Parameters

string	$str	Unicode string to convert

Return Value

string

UTF-16 string

See also

strcasecmp
http://www.php.net/strcasecmp
http://www.php.net/strcoll
http://www.php.net/setlocale

strcmp
http://www.php.net/strcmp
http://www.php.net/strcoll
http://www.php.net/setlocale

compliant
valid
http://www.php.net/manual/en/reference.pcre.pattern.modifiers.php#54805
