XmHTMLParser provides an Object capable of parsing HTML 3.2 text. It offers HTML 3.2 verification, repair of non-conforming HTML documents, and incremental document parsing. Through its callback resources, an XmHTMLParser object can serve as the basis for a fully interactive HTML 3.2 parser application.
Include File: | <Xm/Parser.h> |
Class Name: | XmHTMLParser |
Class Hierarchy: | Object->XmHTMLParser |
Class Pointer: | xmHTMLParserObjectClass |
Functions/Macros: | XmCreateHTMLParser, XmHTMLParser... routines, XmIsHTMLParser. |
XmNmimeType | String | text/html | CSG |
XmNparserIsProgressive | Boolean | False | CSG |
XmNretainSource | Boolean | False | CSG |
XmNstrictHTMLChecking | Boolean | False | CSG |
XmNuserData | Pointer | NULL | CSG |
XmNdocumentCallback | XmCR_HTML_DOCUMENT | XmHTMLDocumentCallbackStruct |
XmNmodifyVerifyCallback | XmCR_HTML_MODIFYING_TEXT_VALUE | XmHTMLVerifyCallbackStruct |
XmNparserCallback | XmCR_HTML_PARSER | XmHTMLParserCallbackStruct |
All callback resources also reference XmAnyCallbackStruct.
XmNdocumentCallback is activated when XmHTMLParser has finished parsing a document and before XmHTMLParserSetString or XmHTMLParserUpdateSource returns.
XmNmodifyVerifyCallback is activated when XmHTMLParser is about to insert or remove text in or from the current source text.
XmNparserCallback is activated when XmHTMLParser encounters an HTML element that is in error. XmHTMLParser detects unknown, unbalanced, badly placed, and unterminated HTML elements, as well as HTML 3.2 violations.
typedef struct {
    int reason;         /* the reason the callback was called */
    XEvent *event;      /* always NULL */
    Boolean html32;     /* True when document was HTML 3.2 conforming */
    Boolean verified;   /* True when document has been verified */
    Boolean balanced;   /* True when parser tree is balanced */
    Boolean terminated; /* True if parser is terminated prematurely */
    int pass_level;     /* current parser level count. */
    Boolean redo;       /* See below */
}XmHTMLDocumentCallbackStruct;
When no XmNdocumentCallback callback resource is installed, XmHTML will make at most two passes on the current document. See the Parser Description document for more information.
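The pass-limiting idea can be sketched in plain C. This is only an illustration under stated assumptions: the description of the redo field is not preserved in this document, so the sketch assumes that setting redo to True asks the parser for another pass. DocumentInfo is a local stand-in for the fields of XmHTMLDocumentCallbackStruct, and MAX_PASSES and document_cb are hypothetical names, not part of XmHTMLParser.

```c
#include <stdbool.h>

/* Local stand-in for the XmHTMLDocumentCallbackStruct fields used here.
 * A real application would use the XmHTML headers and Boolean instead. */
typedef struct {
    bool balanced;   /* true when the parser tree is balanced */
    bool verified;   /* true when the document has been verified */
    int  pass_level; /* current parser level count */
    bool redo;       /* ASSUMPTION: true requests another pass */
} DocumentInfo;

/* Hypothetical application-chosen upper bound on parser passes. */
#define MAX_PASSES 4

/* Request another pass only while the tree is still unbalanced and the
 * application limit has not been reached; otherwise accept the result. */
void document_cb(DocumentInfo *info)
{
    info->redo = !info->balanced && info->pass_level < MAX_PASSES;
}
```

Bounding the pass count in the callback keeps a badly damaged document from being reparsed indefinitely.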
typedef struct{
    int reason;           /* the reason the callback was called */
    XEvent *event;        /* always NULL */
    Boolean doit;         /* unused */
    int action;           /* type of modification */
    int line_no;          /* current line number in input text */
    int start_pos;        /* start of text to change */
    int end_pos;          /* end of text to change */
    XmHTMLTextBlock text; /* describes text to remove or insert */
}XmHTMLVerifyCallbackStruct, *XmHTMLVerifyPtr;
typedef struct{
    String ptr; /* pointer to text to remove/insert */
    int len;    /* length of this text */
}XmHTMLTextBlockRec, *XmHTMLTextBlock;
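Because a text block carries an explicit length, a callback that wants an ordinary C string may prefer to copy the text rather than rely on termination. The sketch below assumes that ptr may point into a larger buffer and is not necessarily NUL-terminated at len; the document does not say either way. TextBlockRec mirrors XmHTMLTextBlockRec locally (String is char * in Xt), and text_block_dup is a hypothetical helper, not an XmHTMLParser routine.

```c
#include <stdlib.h>
#include <string.h>

/* Local mirror of XmHTMLTextBlockRec so the sketch compiles without Xt. */
typedef struct {
    char *ptr; /* pointer to text to remove/insert */
    int   len; /* length of this text */
} TextBlockRec;

/* Return a malloc'ed, NUL-terminated copy of the block's text.
 * Returns NULL on invalid input or allocation failure.
 * The caller owns the result and must free() it. */
char *text_block_dup(const TextBlockRec *block)
{
    char *copy;

    if (block == NULL || block->ptr == NULL || block->len < 0)
        return NULL;

    copy = malloc((size_t)block->len + 1);
    if (copy == NULL)
        return NULL;

    memcpy(copy, block->ptr, (size_t)block->len);
    copy[block->len] = '\0';
    return copy;
}
```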
typedef struct{
    int reason;               /* the reason the callback was called */
    XEvent *event;            /* always NULL */
    int errno;                /* total error count up to now */
    int line_no;              /* current line number in input text */
    int start_pos;            /* start of text in error */
    int end_pos;              /* end of text in error */
    parserError error;        /* type of error */
    unsigned char action;     /* suggested correction action */
    XmHTMLTextBlock repair;   /* proposed element to insert */
    XmHTMLTextBlock current;  /* current element */
    XmHTMLTextBlock offender; /* offending element */
}XmHTMLParserCallbackStruct, *XmHTMLParserPtr;
Error | Description | Default action | current/repair |
HTML_BAD | An element is completely out of order and the internal autocorrection routines cannot find a proper place for this element. | HTML_REMOVE | Yes/No |
HTML_CLOSE_BLOCK | A closing block level element is encountered while it was never opened. | HTML_REMOVE | No/Yes |
HTML_INTERNAL | An internal error was encountered. | HTML_TERMINATE | No/No |
HTML_NOTIFY | Notification of insertion of an optional opening/closing element. | HTML_INSERT | No/Yes |
HTML_OPEN_BLOCK | A new block-level element is encountered while a previous block element is still open. | HTML_INSERT | Yes/Yes |
HTML_OPEN_ELEMENT | An unbalanced terminator is encountered. | HTML_SWITCH | Yes/Yes |
HTML_VIOLATION | An HTML 3.2 violation was encountered. | HTML_INSERT, HTML_KEEP or HTML_REMOVE | Yes/Dynamic |
HTML_UNKNOWN_ELEMENT | An unknown element was encountered. | HTML_REMOVE | No/No |
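A callback that inspects the current and repair text blocks may want to know in advance whether they can be expected for a given error. The sketch below transcribes the last column of the table into a lookup, reading it as "is the field filled in", which is an interpretation rather than something the table states outright. The KIND_* enumerators, FieldState, and both helper functions are hypothetical stand-ins; the real parserError values come from the XmHTML headers.

```c
/* Stand-in enumerators, one per row of the error table (values are arbitrary). */
typedef enum {
    KIND_BAD, KIND_CLOSE_BLOCK, KIND_INTERNAL, KIND_NOTIFY,
    KIND_OPEN_BLOCK, KIND_OPEN_ELEMENT, KIND_VIOLATION,
    KIND_UNKNOWN_ELEMENT, KIND_COUNT
} ErrorKind;

typedef enum { FIELD_NO, FIELD_YES, FIELD_DYNAMIC } FieldState;

typedef struct {
    FieldState current; /* first half of the current/repair column */
    FieldState repair;  /* second half of the current/repair column */
} ErrorRow;

/* Transcription of the current/repair column of the table. */
static const ErrorRow error_rows[KIND_COUNT] = {
    [KIND_BAD]             = { FIELD_YES, FIELD_NO },
    [KIND_CLOSE_BLOCK]     = { FIELD_NO,  FIELD_YES },
    [KIND_INTERNAL]        = { FIELD_NO,  FIELD_NO },
    [KIND_NOTIFY]          = { FIELD_NO,  FIELD_YES },
    [KIND_OPEN_BLOCK]      = { FIELD_YES, FIELD_YES },
    [KIND_OPEN_ELEMENT]    = { FIELD_YES, FIELD_YES },
    [KIND_VIOLATION]       = { FIELD_YES, FIELD_DYNAMIC },
    [KIND_UNKNOWN_ELEMENT] = { FIELD_NO,  FIELD_NO }
};

/* Nonzero only when the table lists "Yes" for current. */
int current_is_guaranteed(ErrorKind kind)
{
    return (int)kind >= 0 && kind < KIND_COUNT
        && error_rows[kind].current == FIELD_YES;
}

/* Nonzero only when the table lists "Yes" for repair.
 * "Dynamic" (HTML_VIOLATION) is treated as not guaranteed. */
int repair_is_guaranteed(ErrorKind kind)
{
    return (int)kind >= 0 && kind < KIND_COUNT
        && error_rows[kind].repair == FIELD_YES;
}
```

Treating "Dynamic" as not guaranteed means a callback still checks the repair pointer for NULL before using it on an HTML_VIOLATION error.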
The HTML_VIOLATION error is a special case. When XmHTMLParser can find a suitable element that will cause the offending element to be no longer in violation of the HTML 3.2 standard, it will propose to insert this new element. When it can't find one, the default action depends on the value of the XmNstrictHTMLChecking resource. When this resource is set to
HTML_ALIAS | Replace | HTML_UNKNOWN_ELEMENT |
HTML_IGNORE | Ignore this error, proceed as if nothing happened | HTML_BAD, HTML_INTERNAL |
HTML_INSERT | Insert | HTML_CLOSE_BLOCK, HTML_NOTIFY, HTML_OPEN_BLOCK, HTML_VIOLATION |
HTML_KEEP | Keep | HTML_CLOSE_BLOCK, HTML_OPEN_BLOCK, HTML_VIOLATION |
HTML_REMOVE | Remove | All errors |
HTML_SWITCH | Switch | HTML_OPEN_ELEMENT |
HTML_TERMINATE | Terminate parser | All errors |
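An application that lets a user pick the correction action could check the choice against the table above before storing it in the callback structure. The following is a minimal sketch of such a check, using a bitmask per action. The ERRBIT_* and ACT_* names and their values are hypothetical stand-ins for the library's constants, and action_allowed is not an XmHTMLParser routine.

```c
/* One bit per error type; values are arbitrary stand-ins. */
enum {
    ERRBIT_BAD             = 1 << 0,
    ERRBIT_CLOSE_BLOCK     = 1 << 1,
    ERRBIT_INTERNAL        = 1 << 2,
    ERRBIT_NOTIFY          = 1 << 3,
    ERRBIT_OPEN_BLOCK      = 1 << 4,
    ERRBIT_OPEN_ELEMENT    = 1 << 5,
    ERRBIT_VIOLATION       = 1 << 6,
    ERRBIT_UNKNOWN_ELEMENT = 1 << 7,
    ERRBIT_ALL             = 0xFF
};

/* Stand-ins for the action constants, in table order. */
typedef enum {
    ACT_ALIAS, ACT_IGNORE, ACT_INSERT, ACT_KEEP,
    ACT_REMOVE, ACT_SWITCH, ACT_TERMINATE, ACT_COUNT
} Action;

/* For each action, the set of errors the table lists as valid. */
static const unsigned allowed_errors[ACT_COUNT] = {
    [ACT_ALIAS]     = ERRBIT_UNKNOWN_ELEMENT,
    [ACT_IGNORE]    = ERRBIT_BAD | ERRBIT_INTERNAL,
    [ACT_INSERT]    = ERRBIT_CLOSE_BLOCK | ERRBIT_NOTIFY
                    | ERRBIT_OPEN_BLOCK | ERRBIT_VIOLATION,
    [ACT_KEEP]      = ERRBIT_CLOSE_BLOCK | ERRBIT_OPEN_BLOCK
                    | ERRBIT_VIOLATION,
    [ACT_REMOVE]    = ERRBIT_ALL,
    [ACT_SWITCH]    = ERRBIT_OPEN_ELEMENT,
    [ACT_TERMINATE] = ERRBIT_ALL
};

/* Nonzero when the table permits `action` for the single error bit `err`. */
int action_allowed(Action action, unsigned err)
{
    if ((int)action < 0 || action >= ACT_COUNT)
        return 0;
    return (allowed_errors[action] & err) != 0;
}
```

A callback could fall back to the parser's suggested action whenever action_allowed returns zero for the requested one.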
XmNdestroyCallback | Object |
©Copyright 1996-1997 by Ripley Software Development
Last update: September 19, 1997 by Koen