ubit
ubit::UXmlParser Class Reference

XML parser.

#include <uxmlparser.hpp>

Inheritance diagram for ubit::UXmlParser:
ubit::UHtmlParser

Classes

struct  ParseError
 

Public Member Functions

 UXmlParser ()
 creates a new XML parser.
 
void addGrammar (const UXmlGrammar &)
 adds a grammar to the parser.
 
UXmlDocument * read (const UStr &path)
 reads and parses an XML file and returns the corresponding XML tree.
 
UXmlDocument * readBuffer (const UStr &docname, const UStr &buffer)
 synonym for parse().
 
UXmlDocument * parse (const UStr &docname, const UStr &buffer)
 reads and parses an XML buffer and returns the corresponding XML tree.
 
int getStatus ()
 returns the reading/parsing status.
 
UErrorHandler * getErrorHandler ()
 returns the current error handler.
 
void setErrorHandler (UErrorHandler *eh)
 changes the error handler (UAppli default handler used if argument is null).
 
void setPermissive (bool b)
 parses documents in permissive mode (default is false).
 
void setCollapseSpaces (bool b)
 collapses whitespace (including tabs and newlines) in elements (default is false).
 

Protected Member Functions

void readElement (UElem *parent)
 
void readText (UElem *parent)
 
bool readXMLDeclaration ()
 
void readXMLInstruction (UElem *parent)
 
void readSGMLData (UElem *parent)
 
void skipSpaces ()
 
UChar readCharEntityReference ()
 
bool readName (UStr &)
 
bool readQuotedValue (UStr &, UChar quoting_char)
 
bool readUnquotedValue (UStr &)
 
bool readNameValuePair (UStr &name, UStr &value)
 
UElem * readElementStartTag (UStr &elem_name, int &stat)
 
int readElementEndTag (const UStr &elem_name)
 
void error (const char *msg, const UChar *line)
 
void error (const char *msg_start, const UStr &name, const char *msg_end, const UChar *line)
 
void unexpected (const char *msg, const UChar *line)
 

Detailed Description

XML parser.

See also
UHtmlParser to parse HTML code; see setPermissive() and setCollapseSpaces() for more details.

Member Function Documentation

§ addGrammar()

void ubit::UXmlParser::addGrammar ( const UXmlGrammar & g)

adds a grammar to the parser.

note that order matters: an element or attribute name is searched for in the first grammar, then in the second one, and so on.
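
The lookup order described above can be illustrated with a small standalone sketch. It does not use the ubit classes: `Grammar` and `lookup` are hypothetical names, and a grammar is modelled as a plain map from a name to a label saying which definition would be used.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a grammar: maps an element or attribute name to a
// label describing the definition it supplies. Not the ubit UXmlGrammar class.
using Grammar = std::map<std::string, std::string>;

// Searches the grammars in the order they were added and returns the entry of
// the first grammar that knows the name, or an empty string if none does.
std::string lookup(const std::vector<Grammar>& grammars, const std::string& name) {
    for (const Grammar& g : grammars) {
        auto it = g.find(name);
        if (it != g.end()) return it->second;
    }
    return "";
}
```

With two grammars that both define the same name, the entry from the grammar added first is the one returned, so a more specific grammar would presumably be added before a more general one.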

§ parse()

UXmlDocument * ubit::UXmlParser::parse ( const UStr & docname,
const UStr & buffer 
)

reads and parses an XML buffer and returns the corresponding XML tree.

'buffer' contains the text to parse and 'docname' the name of this document.

§ read()

UXmlDocument * ubit::UXmlParser::read ( const UStr & path)

reads and parses an XML file and returns the corresponding XML tree.

'path' is the qualified filename of the XML document.

§ setCollapseSpaces()

void ubit::UXmlParser::setCollapseSpaces ( bool  b)
inline

collapses whitespace (including tabs and newlines) in elements (default is false).

should be set to false when parsing actual XML code, and to true for HTML code. Note that whitespace is never collapsed for elements which UElemClass
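
As an illustration of what collapsing means here, the following standalone sketch reduces each run of spaces, tabs and newlines to a single space. `collapseWhitespace` is a hypothetical helper, not part of ubit, and its treatment of leading and trailing whitespace is an assumption since the documentation does not specify it.

```cpp
#include <cctype>
#include <string>

// Replaces every run of spaces, tabs and newlines with a single space.
// Assumption: leading and trailing runs are also reduced to one space rather
// than removed; the documentation does not say which the parser does.
std::string collapseWhitespace(const std::string& text) {
    std::string out;
    bool in_run = false;  // true while we are inside a run of whitespace
    for (unsigned char c : text) {
        if (std::isspace(c)) {
            if (!in_run) out += ' ';
            in_run = true;
        } else {
            out += static_cast<char>(c);
            in_run = false;
        }
    }
    return out;
}
```

This matches the way HTML text is usually rendered, which is why the option is recommended for HTML and left off for XML, where whitespace may be significant.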

§ setPermissive()

void ubit::UXmlParser::setPermissive ( bool  b)
inline

parses documents in permissive mode (default is false).

allows for constructs which are invalid in XML code but OK in HTML code:

  • attribute values don't need to be quoted; missing values are also accepted.
  • closing tags can be omitted for EMPTY_ELEMENTs (eg. ). EMPTY_ELEMENT is a mode of UElemClass (see also UClass::getMode()).
  • the textual content of DONT_PARSE_CONTENT elements is not parsed and their comments are stored as a text element (eg. <style> <script>)
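
The first relaxation in the list above (unquoted and missing attribute values) can be sketched as a standalone function. This is only a model of the strict versus permissive distinction, not the parser's actual code: `Attr` and `parseAttr` are hypothetical names, and the set of characters allowed in a name is a simplifying assumption.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical result of parsing one attribute; not a ubit type.
struct Attr {
    bool ok = false;     // true if the attribute was accepted
    std::string name;
    std::string value;
};

// Parses a single attribute such as name="value" or name='value'.
// When permissive is true, name=value (unquoted) and a bare name with no
// value are accepted as well.
Attr parseAttr(const std::string& s, bool permissive) {
    Attr a;
    std::size_t i = 0;
    // Simplified name syntax (assumption): letters, digits, '-', '_' and ':'.
    while (i < s.size() &&
           (std::isalnum(static_cast<unsigned char>(s[i])) ||
            s[i] == '-' || s[i] == '_' || s[i] == ':')) {
        a.name += s[i++];
    }
    if (a.name.empty()) return a;
    if (i == s.size()) {          // bare name: value is missing
        a.ok = permissive;
        return a;
    }
    if (s[i] != '=') return a;
    ++i;
    if (i < s.size() && (s[i] == '"' || s[i] == '\'')) {
        const char quote = s[i++];
        const std::size_t end = s.find(quote, i);
        // Reject an unterminated value or anything after the closing quote.
        if (end == std::string::npos || end + 1 != s.size()) return a;
        a.value = s.substr(i, end - i);
        a.ok = true;
        return a;
    }
    if (!permissive) return a;    // unquoted value is invalid in strict mode
    for (; i < s.size(); ++i) {
        if (std::isspace(static_cast<unsigned char>(s[i]))) return a;
        a.value += s[i];
    }
    a.ok = true;
    return a;
}
```

In this model `width="10"` is accepted in both modes, while `width=10` and a bare `checked` are accepted only when `permissive` is true.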

The documentation for this class was generated from the following files: