Skip to main content
Version: 2.0.0

Class: StringUtils

Utility class for string operations. This class cannot be instantiated and all methods are static.

Abstract

Methods

fromBytes()

@staticmethod fromBytes(bytes: List[int], charset: str = None) -> str

Converts a list of bytes to a string using the specified charset.

Parameters

  • bytes: List[int]

    The list of bytes to convert.

  • charset: str, optional

    The charset to use for conversion. Defaults to UTF-8 if not specified.

    CharsetDescription
    US-ASCIISeven-bit ASCII, a.k.a. ISO646-US, a.k.a. the Basic Latin block of the Unicode character set
    ISO-8859-1ISO Latin Alphabet No. 1, a.k.a. ISO-LATIN-1
    UTF-8Eight-bit UCS Transformation Format
    UTF-16BESixteen-bit UCS Transformation Format, big-endian byte order
    UTF-16LESixteen-bit UCS Transformation Format, little-endian byte order
    UTF-16Sixteen-bit UCS Transformation Format, byte order identified by an optional byte-order mark
    The UTF-8charset is specified by RFC 2279; the transformation format upon which it is based is specified in Amendment 2 of ISO 10646-1 and is also described in the Unicode Standard.

    The UTF-16 charsets are specified by RFC 2781; the transformation formats upon which they are based are specified in Amendment 1 of ISO 10646-1 and are also described in the Unicode Standard.

    The UTF-16 charsets use sixteen-bit quantities and are therefore sensitive to byte order. In these encodings the byte order of a stream may be indicated by an initial byte-order mark represented by the Unicode character '\uFEFF'. Byte-order marks are handled as follows:

    When decoding, the UTF-16BE and UTF-16LE charsets interpret the initial byte-order marks as a ZERO-WIDTH NON-BREAKING SPACE; when encoding, they do not write byte-order marks.

    When decoding, the UTF-16 charset interprets the byte-order mark at the beginning of the input stream to indicate the byte-order of the stream but defaults to big-endian if there is no byte-order mark; when encoding, it uses big-endian byte order and writes a big-endian byte-order mark.

    In any case, byte order marks occurring after the first element of an input sequence are not omitted since the same code is used to represent ZERO-WIDTH NON-BREAKING SPACE. Every instance of the Java virtual machine has a default charset, which may or may not be one of the standard charsets. The default charset is determined during virtual-machine startup and typically depends upon the locale and charset being used by the underlying operating system.

    The StandardCharsets class defines constants for each of the standard charsets.

Returns

str - The resulting string.

Example

# Convert bytes to string using default UTF-8 charset
hello_array = [72, 101, 108, 108, 111]
result = StringUtils.fromBytes(hello_array) # Assuming some_byte_content is a list of bytes. Could be a bytes object or any other byte sequence.

# Convert bytes to string using a specific charset
result_with_charset = StringUtils.fromBytes(hello_array, "US-ASCII")
print(result_with_charset) # Output: "Hello"

toBytes()

@staticmethod toBytes(value: str, charset: str = None) -> List[int]

Converts a string to an array of bytes using the specified charset.

Parameters

  • value: str

    The string to convert.

  • charset: str, optional

    The charset to use for conversion. Defaults to UTF-8 if not specified.

    CharsetDescription
    US-ASCIISeven-bit ASCII, a.k.a. ISO646-US, a.k.a. the Basic Latin block of the Unicode character set
    ISO-8859-1ISO Latin Alphabet No. 1, a.k.a. ISO-LATIN-1
    UTF-8Eight-bit UCS Transformation Format
    UTF-16BESixteen-bit UCS Transformation Format, big-endian byte order
    UTF-16LESixteen-bit UCS Transformation Format, little-endian byte order
    UTF-16Sixteen-bit UCS Transformation Format, byte order identified by an optional byte-order mark
    The UTF-8charset is specified by RFC 2279; the transformation format upon which it is based is specified in Amendment 2 of ISO 10646-1 and is also described in the Unicode Standard.

    The UTF-16 charsets are specified by RFC 2781; the transformation formats upon which they are based are specified in Amendment 1 of ISO 10646-1 and are also described in the Unicode Standard.

    The UTF-16 charsets use sixteen-bit quantities and are therefore sensitive to byte order. In these encodings the byte order of a stream may be indicated by an initial byte-order mark represented by the Unicode character '\uFEFF'. Byte-order marks are handled as follows:

    When decoding, the UTF-16BE and UTF-16LE charsets interpret the initial byte-order marks as a ZERO-WIDTH NON-BREAKING SPACE; when encoding, they do not write byte-order marks.

    When decoding, the UTF-16 charset interprets the byte-order mark at the beginning of the input stream to indicate the byte-order of the stream but defaults to big-endian if there is no byte-order mark; when encoding, it uses big-endian byte order and writes a big-endian byte-order mark.

    In any case, byte order marks occurring after the first element of an input sequence are not omitted since the same code is used to represent ZERO-WIDTH NON-BREAKING SPACE. Every instance of the Java virtual machine has a default charset, which may or may not be one of the standard charsets. The default charset is determined during virtual-machine startup and typically depends upon the locale and charset being used by the underlying operating system.

    The StandardCharsets class defines constants for each of the standard charsets.

Returns

List[int] - The resulting array of bytes.

Example

# Convert string to bytes using default UTF-8 charset
hello_string = "Hello"
result = StringUtils.toBytes(hello_string) # Assuming someStringContent is a string.

# Convert string to bytes using a specific charset
result_with_charset = StringUtils.toBytes(hello_string, "US-ASCII")
print(result_with_charset) # Output: [72, 101, 108, 108, 111]

isNullOrBlank()

@staticmethod isNullOrBlank(value: str) -> bool

Checks if a string is None or contains only whitespace characters.

Parameters

  • value: str

    The string to check.

Returns

bool - True if the string is None or contains only whitespace; otherwise, False.

Example

# Returns True
StringUtils.isNullOrBlank(None)
StringUtils.isNullOrBlank("")
StringUtils.isNullOrBlank(" ")

# Returns False
StringUtils.isNullOrBlank("Hello")
StringUtils.isNullOrBlank(" Hello ")

isNullOrEmpty()

@staticmethod isNullOrEmpty(value: str) -> bool

Checks if a string is None or empty.

Parameters

  • value: str

    The string to check.

Returns

bool - True if the string is None or empty; otherwise, False.

Example

# Returns True
StringUtils.isNullOrEmpty(None)
StringUtils.isNullOrEmpty("")

# Returns False
StringUtils.isNullOrEmpty("Hello")
StringUtils.isNullOrEmpty(" ")

isNumeric()

@staticmethod isNumeric(value: str) -> bool

Checks if a string contains only numeric characters.

Parameters

  • value: str

    The string to check.

Returns

bool - True if the string contains only numeric characters; otherwise, False.

Example

# Returns True
StringUtils.isNumeric("123")
StringUtils.isNumeric("0")

# Returns False
StringUtils.isNumeric("12.3")
StringUtils.isNumeric("abc")
StringUtils.isNumeric("1a2b3c")