Qucs-S S-parameter Viewer & RF Synthesis Tools
pip._vendor.chardet.utf1632prober.UTF1632Prober Class Reference

Public Member Functions

None __init__ (self)
 
None reset (self)
 
str charset_name (self)
 
str language (self)
 
float approx_32bit_chars (self)
 
float approx_16bit_chars (self)
 
bool is_likely_utf32be (self)
 
bool is_likely_utf32le (self)
 
bool is_likely_utf16be (self)
 
bool is_likely_utf16le (self)
 
None validate_utf32_characters (self, List[int] quad)
 
None validate_utf16_characters (self, List[int] pair)
 
ProbingState feed (self, Union[bytes, bytearray] byte_str)
 
ProbingState state (self)
 
float get_confidence (self)
 

Public Attributes

 position
 
 zeros_at_mod
 
 nonzeros_at_mod
 
 quad
 
 invalid_utf16be
 
 invalid_utf16le
 
 invalid_utf32be
 
 invalid_utf32le
 
 first_half_surrogate_pair_detected_16be
 
 first_half_surrogate_pair_detected_16le
 
- Public Attributes inherited from pip._vendor.chardet.charsetprober.CharSetProber
 active
 
 lang_filter
 
 logger
 

Static Public Attributes

int MIN_CHARS_FOR_DETECTION = 20
 
float EXPECTED_RATIO = 0.94
 
- Static Public Attributes inherited from pip._vendor.chardet.charsetprober.CharSetProber
float SHORTCUT_THRESHOLD = 0.95
 

Protected Attributes

 _state
 
- Protected Attributes inherited from pip._vendor.chardet.charsetprober.CharSetProber
 _state
 

Additional Inherited Members

- Static Public Member Functions inherited from pip._vendor.chardet.charsetprober.CharSetProber
bytes filter_high_byte_only (Union[bytes, bytearray] buf)
 
bytearray filter_international_words (Union[bytes, bytearray] buf)
 
bytes remove_xml_tags (Union[bytes, bytearray] buf)
 

Detailed Description

This class looks for occurrences of zero bytes and infers from their positions
whether the file is UTF-16 or UTF-32 (little-endian or big-endian).
For instance, files that look like ( \0 \0 \0 [nonzero] )+
are likely to be UTF-32BE. Files that look like ( \0 [nonzero] )+
may be guessed to be UTF-16BE, with the byte order reversed for the little-endian variants.
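
The zero-byte heuristic described above can be illustrated with a minimal standalone sketch. It is not the implementation of this class: the function names `zero_byte_profile` and `guess_utf_family` and the `threshold` parameter are hypothetical, and the default of 0.94 only echoes the EXPECTED_RATIO value listed on this page without claiming the class applies it the same way.

```python
from typing import List, Optional


def zero_byte_profile(data: bytes) -> List[float]:
    """Return the fraction of zero bytes seen at each offset modulo 4."""
    zeros = [0, 0, 0, 0]
    totals = [0, 0, 0, 0]
    for index, byte in enumerate(data):
        totals[index % 4] += 1
        if byte == 0:
            zeros[index % 4] += 1
    return [z / t if t else 0.0 for z, t in zip(zeros, totals)]


def guess_utf_family(data: bytes, threshold: float = 0.94) -> Optional[str]:
    """Guess a UTF-16/32 variant from where zero bytes occur (hypothetical helper).

    Returns None when no pattern matches. The heuristic only suits text that is
    mostly ASCII or Latin-1, where the high-order bytes are zero.
    """
    if len(data) < 4:
        return None
    profile = zero_byte_profile(data)
    mostly_zero = [ratio >= threshold for ratio in profile]
    mostly_nonzero = [ratio <= 1 - threshold for ratio in profile]
    # True means "expect a zero byte at this offset modulo 4".
    patterns = {
        "UTF-32BE": (True, True, True, False),
        "UTF-32LE": (False, True, True, True),
        "UTF-16BE": (True, False, True, False),
        "UTF-16LE": (False, True, False, True),
    }
    for name, pattern in patterns.items():
        if all(
            mostly_zero[i] if want_zero else mostly_nonzero[i]
            for i, want_zero in enumerate(pattern)
        ):
            return name
    return None


sample = "A short line of plain ASCII text."
print(guess_utf_family(sample.encode("utf-32-be")))
print(guess_utf_family(sample.encode("utf-16-le")))
```

Text with many non-Latin characters has few zero bytes, so a sketch like this returns None for it; the real prober also tracks invalid sequences, as the attribute list above suggests.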

Constructor & Destructor Documentation

◆ __init__()

None pip._vendor.chardet.utf1632prober.UTF1632Prober.__init__ (   self)

Member Function Documentation

◆ charset_name()

str pip._vendor.chardet.utf1632prober.UTF1632Prober.charset_name (   self)

◆ feed()

ProbingState pip._vendor.chardet.utf1632prober.UTF1632Prober.feed (   self,
Union[bytes, bytearray]  byte_str 
)

◆ get_confidence()

float pip._vendor.chardet.utf1632prober.UTF1632Prober.get_confidence (   self)

◆ language()

str pip._vendor.chardet.utf1632prober.UTF1632Prober.language (   self)

◆ reset()

None pip._vendor.chardet.utf1632prober.UTF1632Prober.reset (   self)

◆ state()

ProbingState pip._vendor.chardet.utf1632prober.UTF1632Prober.state (   self)

◆ validate_utf16_characters()

None pip._vendor.chardet.utf1632prober.UTF1632Prober.validate_utf16_characters (   self,
List[int]  pair 
)
Validate whether the pair of bytes is valid UTF-16.

UTF-16 code units are valid in the range 0x0000 - 0xFFFF excluding 0xD800 - 0xDFFF,
with an exception for surrogate pairs, which must be a code unit in the range
0xD800-0xDBFF followed by one in the range 0xDC00-0xDFFF.

https://en.wikipedia.org/wiki/UTF-16
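
As an illustration of the surrogate rule, here is a minimal sketch that checks a sequence of 16-bit code units. It is an assumption-laden stand-in, not this method: the real method receives one pair of bytes at a time and records its findings in instance attributes, whereas the hypothetical `is_valid_utf16_units` below takes already-decoded code units and returns a boolean.

```python
from typing import Iterable


def is_valid_utf16_units(units: Iterable[int]) -> bool:
    """Check that surrogates in a sequence of UTF-16 code units are well paired."""
    expecting_low = False  # True right after a high surrogate (0xD800-0xDBFF)
    for unit in units:
        if not 0 <= unit <= 0xFFFF:
            return False  # not a 16-bit code unit at all
        if expecting_low:
            if not 0xDC00 <= unit <= 0xDFFF:
                return False  # high surrogate not followed by a low surrogate
            expecting_low = False
        elif 0xD800 <= unit <= 0xDBFF:
            expecting_low = True
        elif 0xDC00 <= unit <= 0xDFFF:
            return False  # low surrogate with no preceding high surrogate
    # A trailing high surrogate without its partner is also invalid.
    return not expecting_low


print(is_valid_utf16_units([0x0041]))          # plain BMP character
print(is_valid_utf16_units([0xD83D, 0xDE00]))  # well-formed surrogate pair
print(is_valid_utf16_units([0xDE00]))          # lone low surrogate
```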

◆ validate_utf32_characters()

None pip._vendor.chardet.utf1632prober.UTF1632Prober.validate_utf32_characters (   self,
List[int]  quad 
)
Validate if the quad of bytes is valid UTF-32.

UTF-32 is valid in the range 0x00000000 - 0x0010FFFF
excluding 0x0000D800 - 0x0000DFFF

https://en.wikipedia.org/wiki/UTF-32
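
The range check can be sketched for a single four-byte group as follows. This is a hedged illustration rather than the method itself: the function name `is_valid_utf32_quad` and the `big_endian` flag are hypothetical, and the real method returns None and presumably updates the `invalid_utf32be` / `invalid_utf32le` attributes listed above instead of returning a result.

```python
from typing import Sequence


def is_valid_utf32_quad(quad: Sequence[int], big_endian: bool = True) -> bool:
    """Return True if four bytes encode a valid Unicode scalar value in UTF-32."""
    if len(quad) != 4:
        raise ValueError("a UTF-32 code unit is exactly four bytes")
    # bytes() raises ValueError if any element is outside 0..255.
    value = int.from_bytes(bytes(quad), "big" if big_endian else "little")
    in_range = value <= 0x10FFFF
    is_surrogate = 0xD800 <= value <= 0xDFFF
    return in_range and not is_surrogate


print(is_valid_utf32_quad([0x00, 0x00, 0x00, 0x41]))                    # U+0041
print(is_valid_utf32_quad([0x00, 0x00, 0xD8, 0x00]))                    # surrogate
print(is_valid_utf32_quad([0x41, 0x00, 0x00, 0x00], big_endian=False))  # U+0041, LE
```

Interpreting the same four bytes under both byte orders, as in the last two calls, is one way a detector could rule out an endianness: a quad that is valid little-endian is often far out of range when read big-endian.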

The documentation for this class was generated from the following file: