gate.html
Class HtmlDocumentHandler

java.lang.Object
  |
  +--javax.swing.text.html.HTMLEditorKit.ParserCallback
        |
        +--gate.html.HtmlDocumentHandler

public class HtmlDocumentHandler
extends HTMLEditorKit.ParserCallback

Implements the behaviour of the HTML reader. The HTML parser calls the methods of an object of this class as parsing events occur. The idea is to parse the HTML document and construct GATE annotation objects. This class also replaces the content of the GATE document with new content containing only the text from the HTML document.
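
The callback mechanism described above can be illustrated with the standard Swing parser. The following is a minimal sketch, not the GATE implementation: TextOnlyCallbackDemo, TextCollector and extractText are hypothetical names introduced here, and the sketch only keeps the character data, without creating any annotations.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

// Hypothetical demo class; not part of GATE.
class TextOnlyCallbackDemo {

    /** Hypothetical callback that keeps only the text reported by the parser. */
    static class TextCollector extends HTMLEditorKit.ParserCallback {
        final StringBuilder text = new StringBuilder();

        @Override
        public void handleText(char[] data, int pos) {
            // Called by the parser for each run of character data.
            text.append(data).append(' ');
        }
    }

    /** Parses the given HTML and returns the collected text, markup removed. */
    static String extractText(String html) {
        TextCollector collector = new TextCollector();
        try {
            // The parser drives the callback; 'true' ignores charset declarations.
            new ParserDelegator().parse(new StringReader(html), collector, true);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return collector.text.toString().trim();
    }

    public static void main(String[] args) {
        System.out.println(
            extractText("<html><body><p>Hello <b>GATE</b></p></body></html>"));
    }
}
```

HtmlDocumentHandler is passed to the parser in the same position as the collector here, but it additionally takes a GATE document and a markup elements map in its constructor.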


Inner Class Summary
(package private)  class HtmlDocumentHandler.CustomObject
          The objects belonging to this class are used inside the stack.
 
Field Summary
private  AnnotationSet basicAS
           
private  LinkedList colector
           
protected  long customObjectsId
           
private static boolean DEBUG
          Debug flag
private  Document doc
           
private  int elements
           
(package private) static int ELEMENTS_RATE
          This method verifies if data contained by the CustomObject can be used to create a GATE annotation.
private  Map markupElementsMap
           
protected  List myStatusListeners
           
private  Stack stack
           
private  StringBuffer tmpDocContent
           
 
Fields inherited from class javax.swing.text.html.HTMLEditorKit.ParserCallback
IMPLIED
 
Constructor Summary
HtmlDocumentHandler(Document aDocument, Map aMarkupElementsMap)
          Constructor initialises all the private member data.
HtmlDocumentHandler(Document aDocument, Map aMarkupElementsMap, AnnotationSet anAnnotationSet)
          Constructor initialises all the private member data.
 
Method Summary
 void addStatusListener(StatusListener listener)
           
protected  void customizeAppearanceOfDocumentWithEndTag(HTML.Tag t)
          This method analyses the tag t and adds some \n characters and spaces to tmpDocContent. The reason is that the final document needs a readable form.
protected  void customizeAppearanceOfDocumentWithSimpleTag(HTML.Tag t)
          This method analyses the tag t and adds some \n characters and spaces to tmpDocContent. The reason is that the final document needs a readable form.
protected  void customizeAppearanceOfDocumentWithStartTag(HTML.Tag t)
          This method analyses the tag t and adds some \n characters and spaces to tmpDocContent. The reason is that the final document needs a readable form.
protected  void fireStatusChangedEvent(String text)
           
 void flush()
          This method is called once, when the HTML parser reaches the end of its input stream, in order to notify the ParserCallback that there is nothing more to parse.
 void handleComment(char[] text, int pos)
          This method is called when the HTML parser encounters a comment.
 void handleEndTag(HTML.Tag t, int pos)
          This method is called when the HTML parser encounters the end of a tag, which means that the tag is paired with a beginning tag.
 void handleError(String errorMsg, int pos)
          This method is called when the HTML parser encounters an error. It is up to the programmer whether to deal with that error.
 void handleSimpleTag(HTML.Tag t, MutableAttributeSet a, int pos)
          This method is called when the HTML parser encounters an empty tag.
 void handleStartTag(HTML.Tag t, MutableAttributeSet a, int pos)
          This method is called when the HTML parser encounters the beginning of a tag, which means that the tag is paired with an end tag and is not an empty one.
 void handleText(char[] text, int pos)
          This method is called when the HTML parser encounters text (PCDATA).
 void removeStatusListener(StatusListener listener)
           
 
Methods inherited from class javax.swing.text.html.HTMLEditorKit.ParserCallback
handleEndOfLineString
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, registerNatives, toString, wait, wait, wait
 

Field Detail

DEBUG

private static final boolean DEBUG
Debug flag

ELEMENTS_RATE

static final int ELEMENTS_RATE
This method verifies if data contained by the CustomObject can be used to create a GATE annotation.

markupElementsMap

private Map markupElementsMap

tmpDocContent

private StringBuffer tmpDocContent

stack

private Stack stack

doc

private Document doc

basicAS

private AnnotationSet basicAS

myStatusListeners

protected List myStatusListeners

elements

private int elements

customObjectsId

protected long customObjectsId

colector

private LinkedList colector
Constructor Detail

HtmlDocumentHandler

public HtmlDocumentHandler(Document aDocument,
                           Map aMarkupElementsMap)
Constructor initialises all the private member data. This will use the default annotation set taken from the GATE document.
Parameters:
aDocument - The GATE document that will be processed
aMarkupElementsMap - The map containing the elements that will be transformed into annotations

HtmlDocumentHandler

public HtmlDocumentHandler(Document aDocument,
                           Map aMarkupElementsMap,
                           AnnotationSet anAnnotationSet)
Constructor initialises all the private member data.
Parameters:
aDocument - The GATE document that will be processed
aMarkupElementsMap - The map containing the elements that will be transformed into annotations
anAnnotationSet - The annotation set that will contain the annotations resulting from the processing of the GATE document
Method Detail

handleStartTag

public void handleStartTag(HTML.Tag t,
                           MutableAttributeSet a,
                           int pos)
This method is called when the HTML parser encounters the beginning of a tag, which means that the tag is paired with an end tag and is not an empty one.
Overrides:
handleStartTag in class HTMLEditorKit.ParserCallback

handleEndTag

public void handleEndTag(HTML.Tag t,
                         int pos)
This method is called when the HTML parser encounters the end of a tag, which means that the tag is paired with a beginning tag.
Overrides:
handleEndTag in class HTMLEditorKit.ParserCallback

handleSimpleTag

public void handleSimpleTag(HTML.Tag t,
                            MutableAttributeSet a,
                            int pos)
This method is called when the HTML parser encounters an empty tag.
Overrides:
handleSimpleTag in class HTMLEditorKit.ParserCallback

handleText

public void handleText(char[] text,
                       int pos)
This method is called when the HTML parser encounters text (PCDATA).
Overrides:
handleText in class HTMLEditorKit.ParserCallback
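
The four handlers above, together with the stack and tmpDocContent fields, suggest a stack-based pairing of start and end tags against offsets in the extracted text. The following is a minimal sketch under that assumption and is not taken from the GATE source: TagSpanSketch and Span are hypothetical names, and Span is only a stand-in for whatever CustomObject and the GATE annotations actually store.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.swing.text.MutableAttributeSet;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

// Hypothetical sketch; not the GATE implementation.
class TagSpanSketch extends HTMLEditorKit.ParserCallback {

    /** Hypothetical stand-in for an annotation: element name plus text offsets. */
    static final class Span {
        final String name;
        final int start;
        int end;

        Span(String name, int start) {
            this.name = name;
            this.start = start;
            this.end = start;
        }
    }

    private final StringBuilder content = new StringBuilder(); // text seen so far
    private final Deque<Span> open = new ArrayDeque<>();       // unclosed elements
    final List<Span> spans = new ArrayList<>();                // completed spans

    @Override
    public void handleStartTag(HTML.Tag t, MutableAttributeSet a, int pos) {
        // The span starts at the current length of the extracted text.
        open.push(new Span(t.toString(), content.length()));
    }

    @Override
    public void handleEndTag(HTML.Tag t, int pos) {
        if (open.isEmpty()) {
            return; // unmatched end tag: ignored in this sketch
        }
        Span s = open.pop();
        s.end = content.length();
        spans.add(s);
    }

    @Override
    public void handleSimpleTag(HTML.Tag t, MutableAttributeSet a, int pos) {
        // An empty tag covers no text, so start and end coincide.
        spans.add(new Span(t.toString(), content.length()));
    }

    @Override
    public void handleText(char[] text, int pos) {
        content.append(text);
    }

    String content() {
        return content.toString();
    }

    /** Convenience entry point: parses the HTML and returns the filled handler. */
    static TagSpanSketch parse(String html) {
        TagSpanSketch handler = new TagSpanSketch();
        try {
            new ParserDelegator().parse(new StringReader(html), handler, true);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return handler;
    }
}
```

Offsets are measured against the extracted text rather than the pos argument, because pos refers to the original HTML source and would no longer match once the markup is removed.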

customizeAppearanceOfDocumentWithSimpleTag

protected void customizeAppearanceOfDocumentWithSimpleTag(HTML.Tag t)
This method analyses the tag t and adds some \n characters and spaces to tmpDocContent. The reason is that the final document needs a readable form. This method modifies the content of tmpDocContent.
Parameters:
t - the HTML tag encountered by the HTML parser

customizeAppearanceOfDocumentWithStartTag

protected void customizeAppearanceOfDocumentWithStartTag(HTML.Tag t)
This method analyses the tag t and adds some \n characters and spaces to tmpDocContent. The reason is that the final document needs a readable form. This method modifies the content of tmpDocContent.
Parameters:
t - the HTML tag encountered by the HTML parser

customizeAppearanceOfDocumentWithEndTag

protected void customizeAppearanceOfDocumentWithEndTag(HTML.Tag t)
This method analyses the tag t and adds some \n characters and spaces to tmpDocContent. The reason is that the final document needs a readable form. This method modifies the content of tmpDocContent.
Parameters:
t - the HTML tag encountered by the HTML parser
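
The documentation does not say which tags trigger which characters. The following is a minimal sketch of the idea only, with all rules chosen as assumptions: a line break for BR, a two-space indent for LI, and a line break after any block-level end tag. ReadableTextSketch and its method names are hypothetical.

```java
import javax.swing.text.html.HTML;

// Hypothetical sketch; the tag rules below are assumptions, not GATE's rules.
class ReadableTextSketch {

    /** Empty tags: a BR becomes a line break in the extracted text. */
    static void onSimpleTag(HTML.Tag t, StringBuilder out) {
        if (t == HTML.Tag.BR) {
            out.append('\n');
        }
    }

    /** Start tags: list items are indented by two spaces. */
    static void onStartTag(HTML.Tag t, StringBuilder out) {
        if (t == HTML.Tag.LI) {
            out.append("  ");
        }
    }

    /** End tags: a block-level element ends its line, without doubling breaks. */
    static void onEndTag(HTML.Tag t, StringBuilder out) {
        boolean endsWithNewline =
            out.length() > 0 && out.charAt(out.length() - 1) == '\n';
        if (t.isBlock() && out.length() > 0 && !endsWithNewline) {
            out.append('\n');
        }
    }
}
```

Because these characters are added to the extracted text, any offsets recorded for annotations have to be computed after they are appended.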

handleError

public void handleError(String errorMsg,
                        int pos)
This method is called when the HTML parser encounters an error. It is up to the programmer whether to deal with that error.
Overrides:
handleError in class HTMLEditorKit.ParserCallback

flush

public void flush()
           throws BadLocationException
This method is called once, when the HTML parser reaches the end of its input stream, in order to notify the ParserCallback that there is nothing more to parse.
Overrides:
flush in class HTMLEditorKit.ParserCallback

handleComment

public void handleComment(char[] text,
                          int pos)
This method is called when the HTML parser encounters a comment.
Overrides:
handleComment in class HTMLEditorKit.ParserCallback

addStatusListener

public void addStatusListener(StatusListener listener)

removeStatusListener

public void removeStatusListener(StatusListener listener)

fireStatusChangedEvent

protected void fireStatusChangedEvent(String text)
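
The three status methods carry no description. They look like a conventional listener registration pattern, which the following minimal sketch illustrates. The StatusListener interface declared here is a hypothetical stand-in with an assumed single method statusChanged(String), and StatusSupportSketch is a hypothetical name; neither is taken from the GATE source.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical sketch of listener registration and notification.
class StatusSupportSketch {

    /** Hypothetical listener shape; the real GATE interface may differ. */
    interface StatusListener {
        void statusChanged(String text);
    }

    // A copy-on-write list lets listeners be removed during notification.
    private final List<StatusListener> listeners = new CopyOnWriteArrayList<>();

    public void addStatusListener(StatusListener listener) {
        listeners.add(listener);
    }

    public void removeStatusListener(StatusListener listener) {
        listeners.remove(listener);
    }

    /** Delivers the status text to every registered listener. */
    protected void fireStatusChangedEvent(String text) {
        for (StatusListener listener : listeners) {
            listener.statusChanged(text);
        }
    }
}
```

A caller would register a listener before parsing starts and remove it once flush() has been called.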