gate.corpora
Class RepositioningInfo

java.lang.Object
  |
  +--java.util.AbstractCollection
        |
        +--java.util.AbstractList
              |
              +--java.util.ArrayList
                    |
                    +--gate.corpora.RepositioningInfo
All Implemented Interfaces:
Cloneable, Collection, List, Serializable

public class RepositioningInfo
extends ArrayList

RepositioningInfo keeps information about the correspondence of positions between the original and the extracted document content. With this information, the class can compute the correspondence either strictly (returning -1 where there is no correspondence) or in "flow" mode (returning the nearest computable position).

See Also:
Serialized Form

Inner Class Summary
 class RepositioningInfo.PositionInfo
          Inner class that simply holds position information.
 
Field Summary
private static boolean DEBUG
          Debug flag
(package private) static long serialVersionUID
          Freeze the serialization UID.
 
Fields inherited from class java.util.ArrayList
elementData, size
 
Fields inherited from class java.util.AbstractList
modCount
 
Constructor Summary
RepositioningInfo()
          Default constructor
 
Method Summary
 void addPositionInfo(long origPos, long origLength, long currPos, long currLength)
          Create a new position information record.
 void correctInformation(long originalPos, long origLen, long newLen)
          Correct the RepositioningInfo structure for shrink/expand changes.
 void correctInformationOriginalMove(long originalPos, long moveLen)
          Correct the original position information in the records.
 long getExtractedPos(long absPos)
          Compute the position in the extracted content from a position in the original content.
 long getExtractedPosFlow(long absPos)
          Not finished yet
 int getIndexByOriginalPosition(long absPos)
          Return the index of the position info record containing absPos. If there is no such record, return -1.
 int getIndexByOriginalPositionFlow(long absPos)
          Return the index of the position info record containing absPos, or the index of the record before this position.
 long getOriginalPos(long relPos)
           
 long getOriginalPos(long relPos, boolean afterChar)
          Compute the position in the original content from a position in the extracted content.
 long getOriginalPosFlow(long relPos)
          Not finished yet
 
Methods inherited from class java.util.ArrayList
add, add, addAll, addAll, clear, clone, contains, ensureCapacity, get, indexOf, isEmpty, lastIndexOf, RangeCheck, readObject, remove, removeRange, set, size, toArray, toArray, trimToSize, writeObject
 
Methods inherited from class java.util.AbstractList
equals, hashCode, iterator, listIterator, listIterator, subList
 
Methods inherited from class java.util.AbstractCollection
containsAll, remove, removeAll, retainAll, toString
 
Methods inherited from class java.lang.Object
finalize, getClass, notify, notifyAll, registerNatives, wait, wait, wait
 
Methods inherited from interface java.util.List
containsAll, equals, hashCode, iterator, listIterator, listIterator, remove, removeAll, retainAll, subList
 

Field Detail

serialVersionUID

static final long serialVersionUID
Freeze the serialization UID.

DEBUG

private static final boolean DEBUG
Debug flag

Constructor Detail

RepositioningInfo

public RepositioningInfo()
Default constructor

Method Detail

addPositionInfo

public void addPositionInfo(long origPos,
                            long origLength,
                            long currPos,
                            long currLength)
Create a new position information record.

getExtractedPos

public long getExtractedPos(long absPos)
Compute the position in the extracted content from a position in the original content. If there is no correspondence, return -1.
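
The strict mapping described above can be illustrated with a minimal, self-contained sketch. It is not GATE's implementation: the class StrictMappingSketch, its Rec record and the method extractedPos are hypothetical names, records are assumed to cover half-open ranges of the original content, and clamping the offset inside a record whose two lengths differ is a choice made only for this sketch.

```java
import java.util.List;

public class StrictMappingSketch {
    /** Hypothetical stand-in for a position record (not GATE's PositionInfo). */
    static final class Rec {
        final long origPos, origLength, currPos, currLength;
        Rec(long origPos, long origLength, long currPos, long currLength) {
            this.origPos = origPos;
            this.origLength = origLength;
            this.currPos = currPos;
            this.currLength = currLength;
        }
    }

    /**
     * Maps a position in the original content to the extracted content.
     * Returns -1 when no record covers absPos (the strict behaviour).
     */
    static long extractedPos(List<Rec> recs, long absPos) {
        for (Rec r : recs) {
            long offset = absPos - r.origPos;
            // Assumption: a record covers [origPos, origPos + origLength).
            if (offset >= 0 && offset < r.origLength) {
                // Assumption: in a record whose lengths differ, the offset
                // is clamped to the extracted length of that record.
                return r.currPos + Math.min(offset, r.currLength);
            }
        }
        return -1;
    }
}
```

With records (0, 5, 0, 5), (5, 4, 5, 1) and (9, 11, 6, 11), original position 10 falls in the third record and maps to 6 + 1 = 7, while position 25 lies outside every record and yields -1.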

getOriginalPos

public long getOriginalPos(long relPos)

getOriginalPos

public long getOriginalPos(long relPos,
                           boolean afterChar)
Compute the position in the original content from a position in the extracted content. If there is no correspondence, return -1.

getExtractedPosFlow

public long getExtractedPosFlow(long absPos)
Not finished yet

getOriginalPosFlow

public long getOriginalPosFlow(long relPos)
Not finished yet

getIndexByOriginalPosition

public int getIndexByOriginalPosition(long absPos)
Return the index of the position info record containing absPos. If there is no such record, return -1.

getIndexByOriginalPositionFlow

public int getIndexByOriginalPositionFlow(long absPos)
Return the index of the position info record containing absPos, or the index of the record before this position. The result is -1 if the position is before the first record. The result is size() if the position is after the last record.
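
The difference between the strict and the "flow" index lookup can be shown with a small sketch. The names PositionLookupSketch, Rec, indexStrict and indexFlow are hypothetical, and the sketch assumes that records are sorted by original position, do not overlap, and cover half-open ranges. It follows the return values documented above but is not taken from the GATE source.

```java
import java.util.List;

public class PositionLookupSketch {
    /** Hypothetical stand-in for a position record (not GATE's PositionInfo). */
    static final class Rec {
        final long origPos, origLength, currPos, currLength;
        Rec(long origPos, long origLength, long currPos, long currLength) {
            this.origPos = origPos;
            this.origLength = origLength;
            this.currPos = currPos;
            this.currLength = currLength;
        }
    }

    /** Strict lookup: index of the record containing absPos, else -1. */
    static int indexStrict(List<Rec> recs, long absPos) {
        for (int i = 0; i < recs.size(); i++) {
            Rec r = recs.get(i);
            if (absPos >= r.origPos && absPos < r.origPos + r.origLength) {
                return i;
            }
        }
        return -1;
    }

    /**
     * Flow lookup: index of the record containing absPos or of the record
     * before it; -1 before the first record, size() after the last one.
     */
    static int indexFlow(List<Rec> recs, long absPos) {
        // Assumption: an empty list is treated like "before the first record".
        if (recs.isEmpty() || absPos < recs.get(0).origPos) {
            return -1;
        }
        Rec last = recs.get(recs.size() - 1);
        if (absPos >= last.origPos + last.origLength) {
            return recs.size();
        }
        int result = -1;
        for (int i = 0; i < recs.size(); i++) {
            if (recs.get(i).origPos > absPos) {
                break; // absPos lies before this record
            }
            result = i; // last record that starts at or before absPos
        }
        return result;
    }
}
```

For records covering original ranges [2, 5) and [10, 15), position 7 falls in the gap: the strict lookup gives -1, while the flow lookup gives 0, the index of the record before the gap.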

correctInformation

public void correctInformation(long originalPos,
                               long origLen,
                               long newLen)
Correct the RepositioningInfo structure for shrink/expand changes.
Normally the text pieces have the same size in both the original text and the extracted text, but in some cases there are nonlinear substitutions. For example, the sequence "&lt;" is converted to "<".
The correction splits the corresponding PositionInfo structure into 3 new records: before the correction, the correction record, and after the correction. The front and end records have the same form as the original record (m_origLength == m_currLength), while the middle record has different values because of the shrink/expand change. All records after this middle record should be corrected by the difference between these values.
All m_currPos values after the current information record should be corrected by (origLen - newLen), i.e. m_currPos -= origLen - newLen;
Parameters:
originalPos - Position of changed text in the original content.
origLen - Length of the changed piece of text in the original content.
newLen - Length of the new piece of text substituting for the original piece.
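
The three-way split and the shift of later records can be sketched as follows. This is an illustration under stated assumptions, not the GATE code: ShrinkCorrectionSketch, Rec and correct are hypothetical names, the changed text is assumed to lie entirely inside one record whose original and extracted lengths are equal, and the sketch always emits three records even when the front or end piece is empty.

```java
import java.util.List;

public class ShrinkCorrectionSketch {
    /** Hypothetical mutable stand-in for a position record. */
    static final class Rec {
        long origPos, origLength, currPos, currLength;
        Rec(long origPos, long origLength, long currPos, long currLength) {
            this.origPos = origPos;
            this.origLength = origLength;
            this.currPos = currPos;
            this.currLength = currLength;
        }
    }

    /** Splits the record containing the change and shifts later records. */
    static void correct(List<Rec> recs, long originalPos, long origLen, long newLen) {
        for (int i = 0; i < recs.size(); i++) {
            Rec r = recs.get(i);
            boolean inside = originalPos >= r.origPos
                    && originalPos + origLen <= r.origPos + r.origLength;
            if (!inside) {
                continue;
            }
            long frontLen = originalPos - r.origPos;
            long endLen = r.origLength - frontLen - origLen;
            // Front and end keep equal original and extracted lengths.
            Rec front = new Rec(r.origPos, frontLen, r.currPos, frontLen);
            // The middle record carries the shrink/expand change.
            Rec middle = new Rec(originalPos, origLen, r.currPos + frontLen, newLen);
            Rec end = new Rec(originalPos + origLen, endLen,
                    r.currPos + frontLen + newLen, endLen);
            recs.set(i, front);
            recs.add(i + 1, middle);
            recs.add(i + 2, end);
            // Every record after the split moves by the length difference.
            for (int j = i + 3; j < recs.size(); j++) {
                recs.get(j).currPos -= origLen - newLen;
            }
            return;
        }
    }
}
```

Starting from records (0, 20, 0, 20) and (30, 10, 20, 10), replacing 4 original characters at position 5 with 1 character gives (0, 5, 0, 5), (5, 4, 5, 1) and (9, 11, 6, 11), and the last record's m_currPos moves from 20 to 17.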

correctInformationOriginalMove

public void correctInformationOriginalMove(long originalPos,
                                           long moveLen)
Correct the original position information in the records when some text is shrunk or expanded by the parser. This method is used to correct for the substitution of "\r\n" with "\n".
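
One plausible reading of this correction is sketched below. The page does not spell out the exact behaviour, so everything here is an assumption: OriginalMoveSketch, Rec and moveOriginal are hypothetical names, records starting at or after originalPos are shifted by moveLen, and a record that straddles originalPos grows its original length by moveLen. The extracted positions are left untouched.

```java
import java.util.List;

public class OriginalMoveSketch {
    /** Hypothetical mutable stand-in for a position record. */
    static final class Rec {
        long origPos, origLength, currPos, currLength;
        Rec(long origPos, long origLength, long currPos, long currLength) {
            this.origPos = origPos;
            this.origLength = origLength;
            this.currPos = currPos;
            this.currLength = currLength;
        }
    }

    /** Adjusts only the original-side positions; extracted side is unchanged. */
    static void moveOriginal(List<Rec> recs, long originalPos, long moveLen) {
        for (Rec r : recs) {
            if (r.origPos >= originalPos) {
                // Assumption: records at or after the position are shifted.
                r.origPos += moveLen;
            } else if (originalPos < r.origPos + r.origLength) {
                // Assumption: a record straddling the position absorbs the
                // extra (or missing) original characters in its length.
                r.origLength += moveLen;
            }
        }
    }
}
```

For example, if the parser saw "\n" where the original had "\r\n" at position 4, calling moveOriginal(recs, 4, 1) on records (0, 10, 0, 10) and (20, 5, 10, 5) lengthens the first record's original span to 11 and moves the second record's original position to 21.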