java.lang.Object
  |
  +--gate.util.AbstractFeatureBearer
        |
        +--gate.creole.AbstractResource
              |
              +--gate.creole.AbstractProcessingResource
                    |
                    +--gate.creole.gazetteer.DefaultGazetteer
This component is responsible for list lookup. The implementation is based on finite state machines. The phrases to be recognised should be listed in a set of files, one for each type of occurrence. The gazetteer is built with the information from a file that contains the set of lists (which are files as well) and the associated type for each list. The file defining the set of lists should have the following syntax: each list definition should be written on its own line and should contain the file name (required), the major type (required), the minor type (optional) and the language(s) (optional), separated by ":". The following is an example of a valid definition:
personmale.lst:person:male:english
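As an illustration of the definition syntax, the following is a minimal sketch of how one such line could be split into its parts. It assumes the four colon-separated fields in the example are the file name, major type, minor type and language, and that only the first two are mandatory. The class ListDefinition and its parse method are hypothetical names invented for this sketch; they are not part of the GATE API.

```java
/** Hypothetical holder for one parsed line of a lists definition file. */
final class ListDefinition {
    final String fileName;   // e.g. "personmale.lst"
    final String majorType;  // e.g. "person"
    final String minorType;  // null when the field is absent or empty
    final String language;   // null when the field is absent or empty

    private ListDefinition(String fileName, String majorType,
                           String minorType, String language) {
        this.fileName = fileName;
        this.majorType = majorType;
        this.minorType = minorType;
        this.language = language;
    }

    /**
     * Splits one definition line on ':'.
     * Assumption: the file name and the major type must both be present.
     */
    static ListDefinition parse(String line) {
        String[] parts = line.trim().split(":");
        if (parts.length < 2 || parts[0].isEmpty() || parts[1].isEmpty()) {
            throw new IllegalArgumentException(
                "Expected at least fileName:majorType, got: " + line);
        }
        String minor = parts.length > 2 && !parts[2].isEmpty() ? parts[2] : null;
        String lang = parts.length > 3 && !parts[3].isEmpty() ? parts[3] : null;
        return new ListDefinition(parts[0], parts[1], minor, lang);
    }

    public static void main(String[] args) {
        ListDefinition d = parse("personmale.lst:person:male:english");
        System.out.println(d.fileName + " " + d.majorType + " "
            + d.minorType + " " + d.language);
    }
}
```

The sketch rejects lines with fewer than two fields instead of guessing defaults; how the real readList method treats malformed lines is not stated on this page.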
Each list file named in the lists definition file is just a list containing
one entry per line.
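Since each list file holds one entry per line and the class exposes an encoding property, reading a list might look roughly like the sketch below. ListFileReader and readEntries are hypothetical names, and skipping blank lines is an assumption of this sketch, not documented behaviour.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: reads the phrases of one list file. */
final class ListFileReader {

    /**
     * Returns one phrase per non-blank line, decoded with the named encoding.
     * Assumption: surrounding whitespace and blank lines are ignored.
     */
    static List<String> readEntries(InputStream in, String encoding) {
        List<String> entries = new ArrayList<>();
        Charset charset = Charset.forName(encoding);
        try (BufferedReader reader =
                 new BufferedReader(new InputStreamReader(in, charset))) {
            String line;
            while ((line = reader.readLine()) != null) {
                String phrase = line.trim();
                if (!phrase.isEmpty()) {
                    entries.add(phrase);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return entries;
    }

    public static void main(String[] args) {
        // An in-memory stand-in for a list file such as personmale.lst.
        byte[] data = "John\nPeter\n".getBytes(StandardCharsets.UTF_8);
        System.out.println(readEntries(new ByteArrayInputStream(data), "UTF-8"));
    }
}
```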
When this gazetteer is run over some input text (a GATE document), it
generates annotations of type Lookup with the attributes specified in
the definition file.
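To make the finite-state idea concrete, here is a small character-level sketch: phrases are added as paths through a state graph, and running it over a text reports the longest phrase starting at each position. Everything in it (PhraseFsm, State, Match, the longest-match policy, the absence of word-boundary checks) is an assumption made for illustration and is not taken from the DefaultGazetteer source.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical phrase matcher backed by a character-level state machine. */
final class PhraseFsm {

    /** One state: outgoing transitions plus the type of a phrase ending here. */
    private static final class State {
        final Map<Character, State> next = new HashMap<>();
        String majorType; // null unless some phrase ends in this state
    }

    /** A recognised span, standing in for a Lookup annotation. */
    static final class Match {
        final int start;
        final int end; // exclusive
        final String majorType;

        Match(int start, int end, String majorType) {
            this.start = start;
            this.end = end;
            this.majorType = majorType;
        }
    }

    private final State initial = new State();
    private final boolean caseSensitive;

    PhraseFsm(boolean caseSensitive) {
        this.caseSensitive = caseSensitive;
    }

    private char fold(char c) {
        return caseSensitive ? c : Character.toLowerCase(c);
    }

    /** Adds one phrase by creating (or reusing) one state per character. */
    void addPhrase(String phrase, String majorType) {
        State s = initial;
        for (int i = 0; i < phrase.length(); i++) {
            s = s.next.computeIfAbsent(fold(phrase.charAt(i)), k -> new State());
        }
        s.majorType = majorType;
    }

    /** Scans the text left to right, keeping the longest match at each start. */
    List<Match> run(String text) {
        List<Match> matches = new ArrayList<>();
        int pos = 0;
        while (pos < text.length()) {
            State s = initial;
            int bestEnd = -1;
            String bestType = null;
            for (int i = pos; i < text.length(); i++) {
                s = s.next.get(fold(text.charAt(i)));
                if (s == null) {
                    break; // no transition: stop following this path
                }
                if (s.majorType != null) {
                    bestEnd = i + 1;
                    bestType = s.majorType;
                }
            }
            if (bestEnd > 0) {
                matches.add(new Match(pos, bestEnd, bestType));
                pos = bestEnd; // resume after the recognised phrase
            } else {
                pos++;
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        PhraseFsm fsm = new PhraseFsm(false);
        fsm.addPhrase("New York", "location");
        for (Match m : fsm.run("I love new york city")) {
            System.out.println(m.start + "-" + m.end + " " + m.majorType);
        }
    }
}
```

Because the sketch does no word-boundary checking, a phrase can match inside a longer word; a production gazetteer would normally guard against that.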
Field Summary
protected String | annotationSetName | Used to store the annotation set currently being used for the newly generated annotations
private Boolean | caseSensitive | Should this gazetteer be case sensitive.
private static boolean | DEBUG | Debug flag
protected Document | document | Used to store the document currently being parsed
private String | encoding |
protected FeatureMap | features |
(package private) Set | fsmStates | A set containing all the states of the FSM backing the gazetteer
(package private) FSMState | initialState | The initial state of the FSM that backs this gazetteer
private URL | listsURL | The value of this property is the URL that will be used for reading the lists that define this Gazetteer
private Vector | progressListeners |
private Vector | statusListeners |
Fields inherited from class gate.creole.AbstractProcessingResource |
executionException |
Fields inherited from class gate.creole.AbstractResource |
serialVersionUID |
Constructor Summary
DefaultGazetteer() | Build a gazetteer using the default lists from the GATE resources (see init())
Method Summary
void | addLookup(String text, Lookup lookup) | Adds one phrase to the list of phrases recognised by this gazetteer
void | addProgressListener(ProgressListener l) |
void | addStatusListener(StatusListener l) |
protected void | fireProcessFinished() |
protected void | fireProgressChanged(int e) |
protected void | fireStatusChanged(String e) |
Boolean | getCaseSensitive() |
String | getEncoding() |
FeatureMap | getFeatures() | Get the feature set
String | getFSMgml() | Returns a string representation of the deterministic FSM graph using GML.
URL | getListsURL() |
Resource | init() | Does the actual loading and parsing of the lists.
(package private) void | readList(String listDesc, boolean add) | Reads one list (one file) of phrases
void | removeLookup(String text, Lookup lookup) | Removes one phrase from the list of phrases recognised by this gazetteer
void | removeProgressListener(ProgressListener l) |
void | removeStatusListener(StatusListener l) |
void | reset() | Resets this resource preparing it for a new run
void | run() | This method runs the gazetteer.
void | setAnnotationSetName(String newAnnotationSetName) | Sets the AnnotationSet that will be used at the next run for the newly produced annotations.
void | setCaseSensitive(Boolean newCaseSensitive) |
void | setDocument(Document newDocument) | Sets the document to be processed by the next run
void | setEncoding(String newEncoding) |
void | setFeatures(FeatureMap features) | Set the feature set
void | setListsURL(URL newListsURL) |
Methods inherited from class gate.creole.AbstractProcessingResource |
check, reInit |
Methods inherited from class gate.creole.AbstractResource |
getName, setName |
Methods inherited from class java.lang.Object |
|
Methods inherited from interface gate.ProcessingResource |
check, reInit |
Methods inherited from interface gate.util.FeatureBearer |
getName, setName |
Field Detail |
private static final boolean DEBUG
FSMState initialState
Set fsmStates
protected FeatureMap features
protected Document document
protected String annotationSetName
private transient Vector progressListeners
private transient Vector statusListeners
private String encoding
private URL listsURL
private Boolean caseSensitive
Constructor Detail |
public DefaultGazetteer()
Method Detail |
public Resource init() throws ResourceInstantiationException
Specified by: init in interface Resource
Overrides: init in class AbstractProcessingResource
public void reset()
void readList(String listDesc, boolean add) throws FileNotFoundException, IOException, GazetteerException
Parameters:
listDesc - the line from the definition file
add -
public void addLookup(String text, Lookup lookup)
Parameters:
text - the phrase to be added
lookup - the description of the annotation to be added when this phrase is recognised
public void removeLookup(String text, Lookup lookup)
Parameters:
text - the phrase to be removed
lookup - the description of the annotation associated to this phrase
public String getFSMgml()
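For readers unfamiliar with GML (Graph Modelling Language), the sketch below shows what a GML rendering of a small state graph can look like. The exact layout produced by getFSMgml() is not documented on this page, so the node and edge attributes chosen here are assumptions, and GmlSketch, Edge and toGml are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical GML writer for a graph of numbered states. */
final class GmlSketch {

    /** A transition between two states, labelled with one character. */
    static final class Edge {
        final int source;
        final int target;
        final char label;

        Edge(int source, int target, char label) {
            this.source = source;
            this.target = target;
            this.label = label;
        }
    }

    /** Emits one node entry per state and one edge entry per transition. */
    static String toGml(int stateCount, List<Edge> edges) {
        StringBuilder sb = new StringBuilder("graph [\n  directed 1\n");
        for (int id = 0; id < stateCount; id++) {
            sb.append("  node [ id ").append(id)
              .append(" label \"").append(id).append("\" ]\n");
        }
        for (Edge e : edges) {
            // Note: a '"' label would need escaping; the sketch ignores that case.
            sb.append("  edge [ source ").append(e.source)
              .append(" target ").append(e.target)
              .append(" label \"").append(e.label).append("\" ]\n");
        }
        return sb.append("]\n").toString();
    }

    public static void main(String[] args) {
        List<Edge> edges = new ArrayList<>();
        edges.add(new Edge(0, 1, 'a'));
        edges.add(new Edge(1, 2, 'b'));
        System.out.print(toGml(3, edges));
    }
}
```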
public FeatureMap getFeatures()
Description copied from interface: FeatureBearer
Specified by: getFeatures in interface FeatureBearer
Overrides: getFeatures in class AbstractFeatureBearer
public void setFeatures(FeatureMap features)
Description copied from interface: FeatureBearer
Specified by: setFeatures in interface FeatureBearer
Overrides: setFeatures in class AbstractFeatureBearer
public void run()
Specified by: run in interface Runnable
Overrides: run in class AbstractProcessingResource
public void setDocument(Document newDocument)
public void setAnnotationSetName(String newAnnotationSetName)
public void removeProgressListener(ProgressListener l)
public void addProgressListener(ProgressListener l)
protected void fireProgressChanged(int e)
protected void fireProcessFinished()
public void removeStatusListener(StatusListener l)
public void addStatusListener(StatusListener l)
protected void fireStatusChanged(String e)
public void setEncoding(String newEncoding)
public String getEncoding()
public void setListsURL(URL newListsURL)
public URL getListsURL()
public void setCaseSensitive(Boolean newCaseSensitive)
public Boolean getCaseSensitive()
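The add/remove/fire methods listed above follow the usual listener pattern. The sketch below shows that pattern in isolation, under stated assumptions: SketchProgressListener and ProgressSupport are hypothetical stand-ins, not GATE's ProgressListener, and the sketch stores listeners in a CopyOnWriteArrayList where the class above uses a Vector field.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

/** Hypothetical listener interface used only by this sketch. */
interface SketchProgressListener {
    void progressChanged(int percent);
    void processFinished();
}

/** Hypothetical helper that registers listeners and notifies them. */
final class ProgressSupport {
    // Copy-on-write storage lets listeners be removed while an event is firing.
    private final List<SketchProgressListener> listeners =
        new CopyOnWriteArrayList<>();

    void addProgressListener(SketchProgressListener l) {
        listeners.add(l);
    }

    void removeProgressListener(SketchProgressListener l) {
        listeners.remove(l);
    }

    /** Notifies every registered listener of the new progress value. */
    void fireProgressChanged(int percent) {
        for (SketchProgressListener l : listeners) {
            l.progressChanged(percent);
        }
    }

    /** Notifies every registered listener that processing has ended. */
    void fireProcessFinished() {
        for (SketchProgressListener l : listeners) {
            l.processFinished();
        }
    }

    public static void main(String[] args) {
        ProgressSupport support = new ProgressSupport();
        support.addProgressListener(new SketchProgressListener() {
            public void progressChanged(int percent) {
                System.out.println("progress: " + percent);
            }
            public void processFinished() {
                System.out.println("finished");
            }
        });
        support.fireProgressChanged(50);
        support.fireProcessFinished();
    }
}
```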