java.lang.Object | +--gate.sgml.Sgml2Xml
Not so fast... This class is not really an SGML-to-XML converter. It takes an SGML document and tries to prepare it for an XML parser. For a true conversion we need a Java SGML parser... If you know of one, let me know. What does it do:
Field Summary | |
private int |
attrEnd
|
private int |
attrStart
|
private int |
charPos
|
private int |
closePos
|
private static boolean |
DEBUG
Debug flag |
private List |
dubiousElements
|
private String |
elemName
|
private int |
elemNameEnd
|
private int |
elemNameStart
|
private char |
endPair
|
private char |
m_currChar
|
private int |
m_currState
|
private int |
m_cursor
|
private Document |
m_doc
|
private StringBuffer |
m_modifier
|
private Stack |
stack
|
Constructor Summary | |
Sgml2Xml(Document doc)
The other constructor; takes the Gate document that will be transformed to XML |
|
Sgml2Xml(String SgmlDoc)
The constructor initialises some member fields |
Method Summary | |
String |
convert()
This method is responsible for the document conversion |
private void |
doState1(char currChar)
It analyses the char that was read in state 1. If it finds '<' it goes to state 2. Otherwise it stays in state 1 and keeps track of the text that is not white space. |
private void |
doState10(char currChar)
Any char -> state 4. '=' -> state 6. Stays here while reading WS. |
private void |
doState11(char currChar)
We are preparing to read the end definition of an element. Stays in this state while reading WS. |
private void |
doState12(char currChar)
Here we read the element's name (this is an end tag). Stays here while reading a char. |
private void |
doState13(char currChar)
If '>' -> state 1. Stays here while reading WS. |
private void |
doState2(char currChar)
We came from state 1 and just read '<'. If currChar == '/' -> state 11. If it is a non-white-space char -> state 3. Stays in state 2 while there are only white spaces. |
private void |
doState3(char currChar)
We have just read the first char of the element's name and now analyse the next char. |
private void |
doState4(char currChar)
We have read the name of the element and prepare for '>' or attributes. '>' -> state 1. Any char != white space -> state 5. |
private void |
doState5(char currChar)
'=' -> state 6. '>' -> state 4 (we didn't read an attribute but a value of the defaultAtt). WS (white space): we don't know yet whether we read an attribute or the value of the defaultAttr -> state 10. This state modifies the content of m_modifier ... |
private void |
doState6(char currChar)
If we read ' or " we have to be prepared to read everything until the next ' or ". If we read a char -> state 8. Stays here while reading WS. |
private void |
doState7(char currChar)
If we find the matching ' or " -> state 9. Otherwise reads everything and stays in state 7. If in state 7 we read '>', a " is automatically added at the end and we go to state 1. |
private void |
doState8(char currChar)
If '>' -> state 1. If WS -> state 9. Otherwise stays in state 8 and reads the attribute's value. |
private void |
doState9(char currChar)
Here we prepare to read another (attribute, value) pair (any char -> state 5). If '>', we have just read a beginning tag -> state 1. Stays here while reading WS. |
private boolean |
isWhiteSpace(char c)
Tests if c is a white space char |
private void |
makeFinalModifications(CustomObject aCustomObject)
This method is called after the entire SGML document has been read. It resolves the dubious elements this way: 1. |
private void |
performActionWithEndElem(String elemName)
This is the action performed when an end tag is read. |
private void |
performFinalAction(String elemName,
int pos)
This is the action performed when we have finished reading the entire tag. The tag is put onto the stack and is considered empty by default. |
private char |
read()
This method reads a char and increments the m_cursor |
private boolean |
thereAreCharsToBeProcessed()
This method tests whether there are more chars to be read. It returns false when there are no more chars to be read. |
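The tag-reading states summarised above (1, 2, 3 and 11, 12, 13) can be illustrated with a small, self-contained sketch. It is not the code of Sgml2Xml: the class name TagStateSketch, the events list and the "start:"/"end:" labels are hypothetical, the transitions out of states 3 and 12 on white space or '>' are assumptions (the summary does not spell them out), and state 4 is reduced to skipping everything up to '>'.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of the tag-reading states; not the gate.sgml.Sgml2Xml source. */
public class TagStateSketch {
    private final String input;
    private int cursor = 0;   // plays the role of m_cursor
    private int state = 1;    // plays the role of m_currState
    private final StringBuilder name = new StringBuilder();
    private final List<String> events = new ArrayList<>();

    public TagStateSketch(String input) {
        this.input = input;
    }

    private boolean thereAreCharsToBeProcessed() {
        return cursor < input.length();
    }

    /** Reads a char and increments the cursor. */
    private char read() {
        return input.charAt(cursor++);
    }

    private static boolean isWhiteSpace(char c) {
        return c == ' ' || c == '\t' || c == '\n' || c == '\r';
    }

    private void startName(char c) {
        name.setLength(0);
        name.append(c);
    }

    private void emit(String kind) {
        events.add(kind + ":" + name);
    }

    /** Runs the state machine and returns the start and end tags it saw, in order. */
    public List<String> scan() {
        while (thereAreCharsToBeProcessed()) {
            char c = read();
            switch (state) {
                case 1:  // plain text; '<' opens a tag
                    if (c == '<') state = 2;
                    break;
                case 2:  // just read '<'
                    if (c == '/') state = 11;
                    else if (!isWhiteSpace(c)) { startName(c); state = 3; }
                    break;
                case 3:  // reading a start tag name (exit conditions are assumed)
                    if (c == '>') { emit("start"); state = 1; }
                    else if (isWhiteSpace(c)) { emit("start"); state = 4; }
                    else name.append(c);
                    break;
                case 4:  // simplified: attributes are skipped up to '>'
                    if (c == '>') state = 1;
                    break;
                case 11: // just read "</"; skip white space before the name
                    if (c == '>') state = 1;
                    else if (!isWhiteSpace(c)) { startName(c); state = 12; }
                    break;
                case 12: // reading an end tag name (exit conditions are assumed)
                    if (c == '>') { emit("end"); state = 1; }
                    else if (isWhiteSpace(c)) { emit("end"); state = 13; }
                    else name.append(c);
                    break;
                case 13: // white space after the end tag name
                    if (c == '>') state = 1;
                    break;
                default:
                    throw new IllegalStateException("unknown state " + state);
            }
        }
        return events;
    }

    public static void main(String[] args) {
        System.out.println(new TagStateSketch("<p>Hi < b class=x>there</b ></ p>").scan());
    }
}
```

Keeping one case per state mirrors the doStateN naming of the class, which makes it easy to compare a transition in the sketch with the corresponding row of the summary.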
Methods inherited from class java.lang.Object |
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Field Detail |
private static final boolean DEBUG
private Document m_doc
private StringBuffer m_modifier
private Stack stack
private List dubiousElements
private int m_cursor
private int m_currState
private char m_currChar
private int charPos
private String elemName
private int elemNameStart
private int elemNameEnd
private int closePos
private int attrStart
private int attrEnd
private char endPair
Constructor Detail |
public Sgml2Xml(String SgmlDoc)
SgmlDoc
- the content of the Sgml document that will be modified
public Sgml2Xml(Document doc)
doc
- The Gate document that will be transformed to XML
Method Detail |
private void doState1(char currChar)
private void doState2(char currChar)
private void doState3(char currChar)
private void doState4(char currChar)
private void doState5(char currChar)
private void doState6(char currChar)
private void doState7(char currChar)
private void doState8(char currChar)
private void doState9(char currChar)
private void doState10(char currChar)
private void doState11(char currChar)
private void doState12(char currChar)
private void doState13(char currChar)
public String convert() throws IOException, MalformedURLException
IOException
MalformedURLException
private boolean thereAreCharsToBeProcessed()
private char read()
private void performFinalAction(String elemName, int pos)
private void performActionWithEndElem(String elemName)
elemName
- the name of the end tag that was read
private void makeFinalModifications(CustomObject aCustomObject)
private boolean isWhiteSpace(char c)
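The attribute-value states (6, 7 and 8) are the part of the preparation that an XML parser is most sensitive to, so a rough sketch may help. It is only an illustration under stated assumptions, not the Sgml2Xml implementation: the class AttrQuoteSketch and the method quoteAttributes are hypothetical names, the sketch assumes that unquoted values (state 8) get wrapped in double quotes, it closes an unterminated quoted value with the same quote char that opened it (the summary mentions adding a "), and it sends state 10 directly to state 5 instead of passing through state 4.

```java
/** Hypothetical sketch of attribute-value quoting; not the gate.sgml.Sgml2Xml source. */
public final class AttrQuoteSketch {

    private static boolean isWhiteSpace(char c) {
        return c == ' ' || c == '\t' || c == '\n' || c == '\r';
    }

    /** Takes one start tag (from '<' to '>') and returns it with every value quoted. */
    public static String quoteAttributes(String tag) {
        StringBuilder out = new StringBuilder();
        int state = 3;      // start inside the element name
        char endPair = 0;   // the quote char that must close the current value
        for (int i = 0; i < tag.length(); i++) {
            char c = tag.charAt(i);
            switch (state) {
                case 3:  // element name
                    out.append(c);
                    if (c == '>') state = 1;
                    else if (isWhiteSpace(c)) state = 9;
                    break;
                case 9:  // waiting for another attribute
                    out.append(c);
                    if (c == '>') state = 1;
                    else if (!isWhiteSpace(c)) state = 5;
                    break;
                case 5:  // attribute name
                    out.append(c);
                    if (c == '=') state = 6;
                    else if (c == '>') state = 1;
                    else if (isWhiteSpace(c)) state = 10;
                    break;
                case 10: // white space after a name: '=' or a new attribute may follow
                    out.append(c);
                    if (c == '=') state = 6;
                    else if (c == '>') state = 1;
                    else if (!isWhiteSpace(c)) state = 5;
                    break;
                case 6:  // just read '='
                    if (c == '\'' || c == '"') { endPair = c; out.append(c); state = 7; }
                    else if (c == '>') { out.append(c); state = 1; }
                    else if (isWhiteSpace(c)) out.append(c);
                    else { out.append('"').append(c); state = 8; } // assumed: open a quote
                    break;
                case 7:  // inside a quoted value
                    if (c == endPair) { out.append(c); state = 9; }
                    else if (c == '>') { out.append(endPair).append(c); state = 1; }
                    else out.append(c);
                    break;
                case 8:  // inside an unquoted value
                    if (c == '>') { out.append('"').append(c); state = 1; }
                    else if (isWhiteSpace(c)) { out.append('"').append(c); state = 9; }
                    else out.append(c);
                    break;
                default: // state 1: the tag is finished, copy the rest unchanged
                    out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(quoteAttributes("<a href=foo.html title='x y' id=\"z>"));
    }
}
```

The sketch does not escape quote chars that occur inside an unquoted value; a real preparation step would have to decide how to treat them.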