Qucs-S S-parameter Viewer & RF Synthesis Tools
bs4.dammit.EntitySubstitution Class Reference

Public Member Functions

str quoted_attribute_value (cls, str value)
 
str substitute_xml (cls, str value, bool make_quoted_attribute=False)
 
str substitute_xml_containing_entities (cls, str value, bool make_quoted_attribute=False)
 
str substitute_html (cls, str s)
 
str substitute_html5 (cls, str s)
 
str substitute_html5_raw (cls, str s)
 

Public Attributes

 CHARACTER_TO_HTML_ENTITY
 
 HTML_ENTITY_TO_CHARACTER
 
 CHARACTER_TO_HTML_ENTITY_RE
 
 CHARACTER_TO_HTML_ENTITY_WITH_AMPERSAND_RE
 

Static Public Attributes

Dict HTML_ENTITY_TO_CHARACTER [str, str]
 
Dict CHARACTER_TO_HTML_ENTITY [str, str]
 
Pattern CHARACTER_TO_HTML_ENTITY_RE [str]
 
Pattern CHARACTER_TO_HTML_ENTITY_WITH_AMPERSAND_RE [str]
 
dict CHARACTER_TO_XML_ENTITY
 
 ANY_ENTITY_RE = re.compile("&(#\\d+|#x[0-9a-fA-F]+|\\w+);", re.I)
 
Pattern BARE_AMPERSAND_OR_BRACKET
 
Pattern AMPERSAND_OR_BRACKET = re.compile("([<>&])")
 

Protected Member Functions

None _populate_class_variables (cls)
 
str _substitute_html_entity (cls, re.Match matchobj)
 
str _substitute_xml_entity (cls, re.Match matchobj)
 
str _escape_entity_name (cls, re.Match matchobj)
 
str _escape_unrecognized_entity_name (cls, re.Match matchobj)
 

Protected Attributes

 _substitute_html_entity
 

Detailed Description

The ability to substitute XML or HTML entities for certain characters.

Member Function Documentation

◆ _populate_class_variables()

None bs4.dammit.EntitySubstitution._populate_class_variables (   cls)
protected
Initialize variables used by this class to manage the plethora of
HTML5 named entities.

This function sets the following class variables:

CHARACTER_TO_HTML_ENTITY: A mapping of Unicode strings like "⦨" to
entity names like "angmsdaa". When a single Unicode string has
multiple entity names, we try to choose the most commonly-used
name.

HTML_ENTITY_TO_CHARACTER: A mapping of entity names like "angmsdaa" to
Unicode strings like "⦨".

CHARACTER_TO_HTML_ENTITY_RE: A regular expression matching (almost) any
Unicode string that corresponds to an HTML5 named entity.

CHARACTER_TO_HTML_ENTITY_WITH_AMPERSAND_RE: A very similar
regular expression to CHARACTER_TO_HTML_ENTITY_RE, but which
also matches unescaped ampersands. This is used by the 'html'
formatter to provide backwards-compatibility, even though the HTML5
spec allows most ampersands to go unescaped.
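
The two mappings described above can be approximated with the standard library alone. The following is a minimal sketch, not the library's implementation: it reads Python's `html.entities.html5` table, and the tie-breaking rule for characters with several names (prefer an HTML 4 name, then the shortest name) is an assumption standing in for "the most commonly-used name". The function name `build_entity_maps` is hypothetical.

```python
from html.entities import html5, name2codepoint


def build_entity_maps():
    """Return (character -> entity name, entity name -> character)."""
    name_to_char = {}
    char_to_names = {}
    for key, char in html5.items():
        # html5 also lists legacy names without a trailing semicolon;
        # keep only the canonical "name;" form.
        if not key.endswith(";"):
            continue
        name = key[:-1]
        name_to_char[name] = char
        char_to_names.setdefault(char, []).append(name)

    char_to_name = {}
    for char, names in char_to_names.items():
        # Assumed heuristic: HTML 4 names first, then shortest, then
        # alphabetical, so the choice is deterministic.
        char_to_name[char] = min(
            names, key=lambda n: (n not in name2codepoint, len(n), n)
        )
    return char_to_name, name_to_char


char_to_name, name_to_char = build_entity_maps()
print(name_to_char["eacute"])   # the character for &eacute;
print(char_to_name["&"])        # the name chosen for the ampersand
```

With this heuristic the ampersand maps to "amp" rather than "AMP", because "amp" is also an HTML 4 name.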

◆ _substitute_html_entity()

str bs4.dammit.EntitySubstitution._substitute_html_entity (   cls,
re.Match  matchobj 
)
protected
Used with a regular expression to substitute the
appropriate HTML entity for a special character string.

◆ _substitute_xml_entity()

str bs4.dammit.EntitySubstitution._substitute_xml_entity (   cls,
re.Match  matchobj 
)
protected
Used with a regular expression to substitute the
appropriate XML entity for a special character string.

◆ quoted_attribute_value()

str bs4.dammit.EntitySubstitution.quoted_attribute_value (   cls,
str  value 
)
Make a value into a quoted XML attribute, possibly escaping it.

 Most strings will be quoted using double quotes.

  Bob's Bar -> "Bob's Bar"

 If a string contains double quotes, it will be quoted using
 single quotes.

  Welcome to "my bar" -> 'Welcome to "my bar"'

 If a string contains both single and double quotes, the
 double quotes will be escaped, and the string will be quoted
 using double quotes.

  Welcome to "Bob's Bar" -> "Welcome to &quot;Bob's Bar&quot;"

:param value: The XML attribute value to quote
:return: The quoted value
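
The three quoting rules above can be written out as a short standalone function. This is a sketch of the rules as described, not a copy of the library's code, and the name `quote_attribute_value` is hypothetical.

```python
def quote_attribute_value(value: str) -> str:
    """Quote an attribute value following the three rules above."""
    quote = '"'
    if '"' in value:
        if "'" in value:
            # Both kinds of quote are present: escape the double
            # quotes and keep double quotes as the delimiter.
            value = value.replace('"', "&quot;")
        else:
            # Only double quotes are present: delimit with single quotes.
            quote = "'"
    return quote + value + quote


print(quote_attribute_value("Bob's Bar"))
print(quote_attribute_value('Welcome to "my bar"'))
print(quote_attribute_value('Welcome to "Bob\'s Bar"'))
```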

◆ substitute_html()

str bs4.dammit.EntitySubstitution.substitute_html (   cls,
str  s 
)
Replace certain Unicode characters with named HTML entities.

This differs from ``data.encode(encoding, 'xmlcharrefreplace')``
in that the goal is to make the result more readable (to those
with ASCII displays) rather than to recover from
errors. There's absolutely nothing wrong with a UTF-8 string
containing a LATIN SMALL LETTER E WITH ACUTE, but replacing that
character with "&eacute;" will make it more readable to some
people.

:param s: The string to be modified.
:return: The string with some Unicode characters replaced with
   HTML entities.
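
To make the contrast with 'xmlcharrefreplace' concrete, the sketch below compares numeric character references with named entities using only the standard library. It is an illustration under stated assumptions, not this method's implementation: the helper `name_non_ascii` is hypothetical, it uses the HTML 4 table `html.entities.codepoint2name`, and it deliberately leaves ASCII characters (including ampersands and angle brackets) untouched.

```python
from html.entities import codepoint2name


def name_non_ascii(s: str) -> str:
    """Replace non-ASCII characters that have an HTML 4 entity name."""
    out = []
    for ch in s:
        name = codepoint2name.get(ord(ch))
        if ord(ch) > 127 and name is not None:
            out.append("&" + name + ";")
        else:
            out.append(ch)
    return "".join(out)


text = "caf\u00e9"

# Error-recovery style: numeric character reference.
numeric = text.encode("ascii", "xmlcharrefreplace").decode("ascii")
print(numeric)               # caf&#233;

# Readability style: named entity.
print(name_non_ascii(text))  # caf&eacute;
```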

◆ substitute_html5()

str bs4.dammit.EntitySubstitution.substitute_html5 (   cls,
str  s 
)
Replace certain Unicode characters with named HTML entities
using HTML5 rules.

Specifically, this method is much less aggressive about
escaping ampersands than substitute_html. Only ambiguous
ampersands are escaped, per the HTML5 standard:

"An ambiguous ampersand is a U+0026 AMPERSAND character (&)
that is followed by one or more ASCII alphanumerics, followed
by a U+003B SEMICOLON character (;), where these characters do
not match any of the names given in the named character
references section."

Unlike substitute_html5_raw, this method assumes HTML entities
were converted to Unicode characters on the way in, as
Beautiful Soup does. By the time Beautiful Soup does its work,
the only ambiguous ampersands that need to be escaped are the
ones that were escaped in the original markup when mentioning
HTML entities.

:param s: The string to be modified.
:return: The string with some Unicode characters replaced with
   HTML entities.
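
The quoted definition of an ambiguous ampersand can be checked mechanically. The sketch below applies that definition literally, using Python's `html.entities.html5` table as the list of named character references. It illustrates the definition only and is not a description of how this method is implemented; the name `escape_ambiguous_ampersands` is hypothetical.

```python
import re
from html.entities import html5

# "&", one or more ASCII alphanumerics, then ";".
CANDIDATE = re.compile(r"&([0-9A-Za-z]+);")


def escape_ambiguous_ampersands(text: str) -> str:
    """Escape "&" only where it starts an unrecognized "&name;"."""
    def replace(match):
        if match.group(1) + ";" in html5:
            # A real named character reference: leave it alone.
            return match.group(0)
        # Looks like a reference but is not one: the ampersand is ambiguous.
        return "&amp;" + match.group(0)[1:]

    return CANDIDATE.sub(replace, text)


print(escape_ambiguous_ampersands("AT&T &eacute; &foo;"))
# AT&T &eacute; &amp;foo;
```

The ampersand in "AT&T" is left alone because it is not followed by alphanumerics and a semicolon, so it is not ambiguous under the definition.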

◆ substitute_html5_raw()

str bs4.dammit.EntitySubstitution.substitute_html5_raw (   cls,
str  s 
)
Replace certain Unicode characters with named HTML entities
using HTML5 rules.

substitute_html5_raw is similar to substitute_html5 but it is
designed for standalone use (whereas substitute_html5 is
designed for use with Beautiful Soup).

:param s: The string to be modified.
:return: The string with some Unicode characters replaced with
   HTML entities.

◆ substitute_xml()

str bs4.dammit.EntitySubstitution.substitute_xml (   cls,
str  value,
bool   make_quoted_attribute = False 
)
Replace special XML characters with named XML entities.

The less-than sign will become &lt;, the greater-than sign
will become &gt;, and any ampersands will become &amp;. If you
want ampersands that seem to be part of an entity definition
to be left alone, use `substitute_xml_containing_entities`
instead.

:param value: A string to be substituted.

:param make_quoted_attribute: If True, then the string will be
 quoted, as befits an attribute value.

:return: A version of ``value`` with special characters replaced
 with named entities.
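
A minimal sketch of the substitution described above, assuming the AMPERSAND_OR_BRACKET pattern and the "&", "<" and ">" entries of CHARACTER_TO_XML_ENTITY listed on this page. The function name `escape_xml` is hypothetical, and quoting via make_quoted_attribute is left out.

```python
import re

# Subset of CHARACTER_TO_XML_ENTITY needed for this pattern.
XML_ENTITIES = {"&": "amp", "<": "lt", ">": "gt"}
AMPERSAND_OR_BRACKET = re.compile("([<>&])")


def escape_xml(value: str) -> str:
    """Replace every <, > and & with its named XML entity."""
    return AMPERSAND_OR_BRACKET.sub(
        lambda m: "&" + XML_ENTITIES[m.group(0)] + ";", value
    )


print(escape_xml("AT&T <b> &amp;"))
# AT&amp;T &lt;b&gt; &amp;amp;
```

Note that the existing "&amp;" is escaped a second time, which is the case the other method is meant to avoid.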

◆ substitute_xml_containing_entities()

str bs4.dammit.EntitySubstitution.substitute_xml_containing_entities (   cls,
str  value,
bool   make_quoted_attribute = False 
)
Substitute XML entities for special XML characters.

:param value: A string to be substituted. The less-than sign will
  become &lt;, the greater-than sign will become &gt;, and any
  ampersands that are not part of an entity definition will
  become &amp;.

:param make_quoted_attribute: If True, then the string will be
 quoted, as befits an attribute value.
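
The difference from plain substitution comes from the negative lookahead in BARE_AMPERSAND_OR_BRACKET, whose initial value is listed on this page. The following sketch reuses that pattern; the function name `escape_xml_keeping_entities` is hypothetical, and quoting via make_quoted_attribute is left out.

```python
import re

XML_ENTITIES = {"&": "amp", "<": "lt", ">": "gt"}

# An angle bracket, or an ampersand that does NOT begin something
# shaped like an entity (&#123; / &#x1F; / &name;).
BARE_AMPERSAND_OR_BRACKET = re.compile(
    r"([<>]|&(?!#\d+;|#x[0-9a-fA-F]+;|\w+;))"
)


def escape_xml_keeping_entities(value: str) -> str:
    """Escape <, > and bare ampersands; leave existing entities alone."""
    return BARE_AMPERSAND_OR_BRACKET.sub(
        lambda m: "&" + XML_ENTITIES[m.group(0)] + ";", value
    )


print(escape_xml_keeping_entities("AT&T <b> &amp; &#233;"))
# AT&amp;T &lt;b&gt; &amp; &#233;
```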

Member Data Documentation

◆ BARE_AMPERSAND_OR_BRACKET

Pattern bs4.dammit.EntitySubstitution.BARE_AMPERSAND_OR_BRACKET
static
Initial value:
= re.compile(
"([<>]|" "&(?!#\\d+;|#x[0-9a-fA-F]+;|\\w+;)" ")"
)

◆ CHARACTER_TO_XML_ENTITY

dict bs4.dammit.EntitySubstitution.CHARACTER_TO_XML_ENTITY
static
Initial value:
= {
"'": "apos",
'"': "quot",
"&": "amp",
"<": "lt",
">": "gt",
}

The documentation for this class was generated from the following file: