Qucs-S S-parameter Viewer & RF Synthesis Tools
pip._vendor.chardet.universaldetector.UniversalDetector Class Reference

Public Member Functions

None __init__ (self, LanguageFilter lang_filter=LanguageFilter.ALL, bool should_rename_legacy=False)
 
int input_state (self)
 
bool has_win_bytes (self)
 
List[CharSetProber] charset_probers (self)
 
None reset (self)
 
None feed (self, Union[bytes, bytearray] byte_str)
 
ResultDict close (self)
 

Public Attributes

 done
 
 lang_filter
 
 logger
 
 should_rename_legacy
 
 result
 
 MINIMUM_THRESHOLD
 

Static Public Attributes

float MINIMUM_THRESHOLD = 0.20
 
 HIGH_BYTE_DETECTOR = re.compile(b"[\x80-\xFF]")
 
 ESC_DETECTOR = re.compile(b"(\033|~{)")
 
 WIN_BYTE_DETECTOR = re.compile(b"[\x80-\x9F]")
 
dict ISO_WIN_MAP
 
dict LEGACY_MAP
 

Protected Attributes

 _got_data
 
 _input_state
 
 _last_char
 
 _has_win_bytes
 
 _utf1632_prober
 
 _esc_charset_prober
 
 _charset_probers
 

Detailed Description

The ``UniversalDetector`` class underlies the ``chardet.detect`` function
and coordinates all of the different charset probers.

To get a ``dict`` containing an encoding and its confidence, you can simply
run:

.. code::

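        # pip's vendored copy: pip._vendor.chardet.universaldetector
        from chardet.universaldetector import UniversalDetector

        u = UniversalDetector()
        u.feed(some_bytes)  # some_bytes: the bytes or bytearray to analyze
        u.close()
        detected = u.result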

Member Function Documentation

◆ close()

ResultDict pip._vendor.chardet.universaldetector.UniversalDetector.close (self)
Stop analyzing the current document and come up with a final
prediction.

:returns:  The ``result`` attribute, a ``dict`` with the keys
           ``encoding``, ``confidence``, and ``language``.
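
A minimal sketch of consuming the value returned by ``close``. The helper ``decode_with_guess`` and its ``fallback`` parameter are hypothetical names, not part of chardet. The sketch assumes the standalone ``chardet`` package is importable (pip's vendored copy would be imported from ``pip._vendor.chardet.universaldetector`` instead) and that ``encoding`` may be ``None`` when the detector has no guess.

```python
try:
    # Assumption: the standalone chardet package is installed.
    from chardet.universaldetector import UniversalDetector
except ImportError:  # keeps the sketch importable without chardet
    UniversalDetector = None


def decode_with_guess(data, fallback="utf-8"):
    """Decode `data` with the detector's guess, or `fallback` if it has none."""
    detector = UniversalDetector()
    detector.feed(data)
    result = detector.close()  # dict with encoding, confidence, language
    encoding = result["encoding"] or fallback
    # errors="replace" avoids an exception when the guess is wrong.
    return data.decode(encoding, errors="replace"), result
```

Returning the raw ``result`` alongside the decoded text lets the caller inspect ``confidence`` before trusting the guess.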

◆ feed()

None pip._vendor.chardet.universaldetector.UniversalDetector.feed (self, Union[bytes, bytearray] byte_str)
Takes a chunk of a document and feeds it through all of the relevant
charset probers.

After calling ``feed``, you can check the value of the ``done``
attribute to see if you need to continue feeding the
``UniversalDetector`` more data, or if it has made a prediction
(in the ``result`` attribute).

.. note::
   You should always call ``close`` when you're done feeding in your
   document if ``done`` is not already ``True``.
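
The feed, check ``done``, then ``close`` pattern described above can be sketched as follows. ``detect_chunks`` is a hypothetical helper name, and the import assumes the standalone ``chardet`` package rather than pip's vendored copy.

```python
try:
    # Assumption: the standalone chardet package is installed.
    from chardet.universaldetector import UniversalDetector
except ImportError:  # keeps the sketch importable without chardet
    UniversalDetector = None


def detect_chunks(chunks):
    """Feed an iterable of byte chunks, stopping early once `done` is set."""
    detector = UniversalDetector()
    for chunk in chunks:
        detector.feed(chunk)
        if detector.done:  # a prediction is already in detector.result
            break
    # Calling close() here is fine whether or not `done` was set; it
    # finalizes the prediction when the detector had not already decided.
    return detector.close()
```

For a file opened in binary mode, the chunks could be produced with ``iter(lambda: f.read(4096), b"")``; the chunk size is an arbitrary choice.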

◆ reset()

None pip._vendor.chardet.universaldetector.UniversalDetector.reset (self)
Reset the UniversalDetector and all of its probers back to their
initial states.  This is called by ``__init__``, so you only need to
call this directly in between analyses of different documents.
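
As an illustration of reusing one detector across documents, the sketch below calls ``reset`` before each analysis. ``detect_many`` is a hypothetical helper name, and the import assumes the standalone ``chardet`` package.

```python
try:
    # Assumption: the standalone chardet package is installed.
    from chardet.universaldetector import UniversalDetector
except ImportError:  # keeps the sketch importable without chardet
    UniversalDetector = None


def detect_many(documents):
    """Run one detector over several byte strings, resetting in between."""
    detector = UniversalDetector()  # __init__ already calls reset()
    results = []
    for data in documents:
        detector.reset()  # redundant on the first pass, required afterwards
        detector.feed(data)
        # Copy the dict so later analyses cannot affect stored results.
        results.append(dict(detector.close()))
    return results
```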

Member Data Documentation

◆ ISO_WIN_MAP

dict pip._vendor.chardet.universaldetector.UniversalDetector.ISO_WIN_MAP
static
Initial value:
= {
"iso-8859-1": "Windows-1252",
"iso-8859-2": "Windows-1250",
"iso-8859-5": "Windows-1251",
"iso-8859-6": "Windows-1256",
"iso-8859-7": "Windows-1253",
"iso-8859-8": "Windows-1255",
"iso-8859-9": "Windows-1254",
"iso-8859-13": "Windows-1257",
}
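
This page does not show where ``ISO_WIN_MAP`` is consulted. The sketch below assumes it is applied only when the guessed name starts with ``iso-8859`` and the input contained bytes matched by ``WIN_BYTE_DETECTOR`` (0x80 to 0x9F), which are C1 control codes in the ISO-8859 family while the Windows code pages assign printable characters to most of them. ``prefer_windows_name`` is a hypothetical standalone function; the map and the pattern are copied from this page.

```python
import re

# Copied from the static attributes documented on this page.
WIN_BYTE_DETECTOR = re.compile(b"[\x80-\x9F]")
ISO_WIN_MAP = {
    "iso-8859-1": "Windows-1252",
    "iso-8859-2": "Windows-1250",
    "iso-8859-5": "Windows-1251",
    "iso-8859-6": "Windows-1256",
    "iso-8859-7": "Windows-1253",
    "iso-8859-8": "Windows-1255",
    "iso-8859-9": "Windows-1254",
    "iso-8859-13": "Windows-1257",
}


def prefer_windows_name(charset_name, data):
    """Swap an ISO-8859 guess for its Windows counterpart if 0x80-0x9F occur."""
    lower = charset_name.lower()
    if lower.startswith("iso-8859") and WIN_BYTE_DETECTOR.search(data):
        # Names without an entry in the map are returned unchanged.
        return ISO_WIN_MAP.get(lower, charset_name)
    return charset_name
```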

◆ LEGACY_MAP

dict pip._vendor.chardet.universaldetector.UniversalDetector.LEGACY_MAP
static
Initial value:
= {
"ascii": "Windows-1252",
"iso-8859-1": "Windows-1252",
"tis-620": "ISO-8859-11",
"iso-8859-9": "Windows-1254",
"gb2312": "GB18030",
"euc-kr": "CP949",
"utf-16le": "UTF-16",
}
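
A minimal sketch of how ``LEGACY_MAP`` could be applied, assuming it is a case-insensitive lookup performed only when ``should_rename_legacy`` is true; that logic is not shown on this page. ``rename_legacy`` is a hypothetical standalone function, and the map is copied from the initial value above.

```python
# Copied from the initial value documented on this page.
LEGACY_MAP = {
    "ascii": "Windows-1252",
    "iso-8859-1": "Windows-1252",
    "tis-620": "ISO-8859-11",
    "iso-8859-9": "Windows-1254",
    "gb2312": "GB18030",
    "euc-kr": "CP949",
    "utf-16le": "UTF-16",
}


def rename_legacy(charset_name, should_rename_legacy=False):
    """Return the mapped replacement for a legacy name, if renaming is on."""
    if not should_rename_legacy:
        return charset_name
    # `or ""` tolerates a None guess; unmapped names pass through unchanged.
    return LEGACY_MAP.get((charset_name or "").lower(), charset_name)
```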

The documentation for this class was generated from the following file: