java.lang.Object
  java.util.Dictionary<K,V>
    java.util.Hashtable<String,Integer>
      hultig.sumo.HNgram
public class HNgram
extends Hashtable<String,Integer>
An efficient representation of a large set of n-grams. Based on a hash table (the class extends java.util.Hashtable), it associates each n-gram with an integer: the frequency of that n-gram.
(9:37:45 13 April 2009)
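The mapping described above can be illustrated with a minimal sketch. It is not the HNgram source: the class name NgramTableSketch, the space-joined key format, and the convention of returning 0 for an unseen n-gram are all assumptions made for illustration.

```java
import java.util.Hashtable;

/** Hypothetical stand-in for illustration only; this is not the HNgram source. */
final class NgramTableSketch {
    // Same shape as the documented superclass: n-gram string -> frequency.
    private final Hashtable<String, Integer> table = new Hashtable<>();

    /** Adds one occurrence of the given n-gram. */
    void count(String ngram) {
        table.merge(ngram, 1, Integer::sum);
    }

    /** Returns the stored frequency, or 0 for an unseen n-gram (assumed convention). */
    int freq(String ngram) {
        return table.getOrDefault(ngram, 0);
    }

    /** Joins the words with single spaces (assumed key format) and looks the key up. */
    int freq(String[] words) {
        return freq(String.join(" ", words));
    }

    public static void main(String[] args) {
        NgramTableSketch t = new NgramTableSketch();
        t.count("DT NN VB");
        t.count("DT NN VB");
        t.count("NN VB DT");
        System.out.println(t.freq("DT NN VB"));                      // prints 2
        System.out.println(t.freq(new String[] {"NN", "VB", "DT"})); // prints 1
        System.out.println(t.freq("VB VB VB"));                      // prints 0
    }
}
```

Keying the table by the whole n-gram string keeps lookups to a single hash probe, which is presumably why the class builds on Hashtable<String,Integer>.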
Constructor Summary | |
---|---|
HNgram() | |
HNgram(String fname) | Creates this object and loads the n-gram table from a given file. |
HNgram(String fname, int n) | Creates this object and loads the n-gram table from a given file. |
Method Summary | | |
---|---|---|
void | countNGram(String sngram) | |
int | exclude(String pattern) | Removes from this table all n-grams that match a given string pattern (regular expression). |
int | freq(String ngram) | Returns the frequency of a given n-gram. |
int | freq(String[] v) | Returns the frequency of a given n-gram, given as an array of strings. |
long | getSum() | Gives the sum of the frequencies of all n-grams stored in this table, a value needed for n-gram probability estimation. |
static void | main(String[] args) | Used for demonstration. |
double | prob(String ngram) | The estimated probability of a given n-gram, for the data in this table. |
double | prob(String[] v) | The estimated probability of a given n-gram, for the data in this table. |
double | probabilidade(String sws) | Computes the log-likelihood of a given word sequence, based on the n-gram model stored in this object. |
double | probability(String sws) | Computes the likelihood of a given word sequence, based on the n-gram model stored in this object. |
void | set(FileIN f) | Loads the n-gram table from a given file. |
static boolean | test201110191137() | |
Methods inherited from class java.util.Hashtable |
---|
clear, clone, contains, containsKey, containsValue, elements, entrySet, equals, get, hashCode, isEmpty, keys, keySet, put, putAll, rehash, remove, size, toString, values |
Methods inherited from class java.lang.Object |
---|
finalize, getClass, notify, notifyAll, wait, wait, wait |
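Constructor Detail |
---|
public HNgram()
public HNgram(String fname)
Creates this object and loads the n-gram table from a given file. See the set(FileIN) method.
Parameters:
fname - The name of the text file to be processed.
public HNgram(String fname, int n)
Creates this object and loads the n-gram table from a given file. See the HNgram(String) constructor.
Parameters:
fname - The name of the text file to be processed.
n - The n-gram dimensionality value.
Method Detail |
---|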
public final void set(FileIN f)
Loads the n-gram table from a given file.
Parameters:
f - Represents the n-gram table file to be processed.
public void countNGram(String sngram)
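public int exclude(String pattern)
Removes from this table all n-grams that match a given string pattern (regular expression).
Parameters:
pattern - The regular expression.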
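As a rough sketch of regex-based removal on a Hashtable<String,Integer>, the helper below is hypothetical and not taken from HNgram. Two points are assumptions: that the pattern must match the whole n-gram string, and that the int result is the number of removed entries (the documentation does not say what exclude returns).

```java
import java.util.Hashtable;
import java.util.regex.Pattern;

/** Hypothetical helper for illustration only; this is not the HNgram source. */
final class ExcludeSketch {
    /**
     * Removes every entry whose key fully matches the regular expression and
     * returns how many entries were removed. Full-string matching and the
     * meaning of the return value are both assumptions.
     */
    static int exclude(Hashtable<String, Integer> table, String regex) {
        Pattern p = Pattern.compile(regex);
        int before = table.size();
        // keySet() is a live view, so removing keys here removes the entries.
        table.keySet().removeIf(key -> p.matcher(key).matches());
        return before - table.size();
    }

    public static void main(String[] args) {
        Hashtable<String, Integer> table = new Hashtable<>();
        table.put("DT NN", 3);
        table.put("NN VB", 2);
        table.put("VB DT", 1);
        int removed = exclude(table, "NN .*"); // drop n-grams that start with the tag NN
        System.out.println(removed);           // prints 1
        System.out.println(table.size());      // prints 2
    }
}
```

Compiling the pattern once, outside the loop over keys, avoids recompiling it for every n-gram in a large table.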
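public int freq(String ngram)
Returns the frequency of a given n-gram.
Parameters:
ngram - The indicated n-gram.
public int freq(String[] v)
Returns the frequency of a given n-gram, given as an array of strings.
Parameters:
v - The array containing the n-gram word sequence.
public double prob(String ngram)
The estimated probability of a given n-gram, for the data in this table.
Parameters:
ngram - The indicated n-gram.
public double prob(String[] v)
The estimated probability of a given n-gram, for the data in this table.
Parameters:
v - The n-gram representation.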
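public double probabilidade(String sws)
Computes the log-likelihood of a given word sequence, based on the n-gram model stored in this object.
Parameters:
sws - The word sequence.
Returns:
The log-likelihood, a value in the ]-∞, 0] interval.
public double probability(String sws)
Computes the likelihood of a given word sequence, based on the n-gram model stored in this object.
Parameters:
sws - The word sequence.
Returns:
The likelihood, a value in the [0,1] interval.
public long getSum()
Gives the sum of the frequencies of all n-grams stored in this table, a value needed for n-gram probability estimation.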
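The summary describes getSum() as the value needed for n-gram probability estimation. One common estimator that fits this description is the relative frequency, freq / sum. The sketch below assumes that estimator; the documentation does not confirm that prob(String) is computed this way, and the names ProbSketch, sum, and prob are hypothetical.

```java
import java.util.Hashtable;

/** Hypothetical helper for illustration only; this is not the HNgram source. */
final class ProbSketch {
    /** Sum of all stored frequencies, accumulated in a long to avoid int overflow. */
    static long sum(Hashtable<String, Integer> table) {
        long total = 0;
        for (int f : table.values()) {
            total += f;
        }
        return total;
    }

    /** Relative-frequency estimate freq / sum (assumed estimator); 0.0 for an empty table. */
    static double prob(Hashtable<String, Integer> table, String ngram) {
        long total = sum(table);
        if (total == 0) {
            return 0.0;
        }
        return table.getOrDefault(ngram, 0) / (double) total;
    }

    public static void main(String[] args) {
        Hashtable<String, Integer> table = new Hashtable<>();
        table.put("DT NN", 3);
        table.put("NN VB", 1);
        System.out.println(sum(table));            // prints 4
        System.out.println(prob(table, "DT NN"));  // prints 0.75
        System.out.println(prob(table, "VB VB"));  // prints 0.0
    }
}
```

This sketch recomputes the sum on every call for simplicity; a table with many entries would more sensibly cache the total and update it whenever a count changes.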
public static void main(String[] args)
Used for demonstration. It creates an instance of this class by loading a table for a 4-gram model of part-of-speech tags, then calculates the likelihood of tag sequences.
Parameters:
args - No arguments are expected.
public static boolean test201110191137()
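The documented ranges for probability(String), [0,1], and probabilidade(String), ]-∞, 0], are what one gets when a sequence score is a product of n-gram probabilities and its logarithm. The sketch below shows one way to produce such scores: slide a window of n words over the sequence, sum the logs of the relative frequencies, and exponentiate for the plain likelihood. This scoring scheme, the class SequenceLikelihoodSketch, and all of its method names are assumptions; the actual HNgram computation is not documented here.

```java
import java.util.Arrays;
import java.util.Hashtable;

/** Hypothetical helper for illustration only; this is not the HNgram source. */
final class SequenceLikelihoodSketch {
    /** Relative frequency of one n-gram; 0.0 for an empty table or an unseen n-gram. */
    static double prob(Hashtable<String, Integer> table, long total, String ngram) {
        return total == 0 ? 0.0 : table.getOrDefault(ngram, 0) / (double) total;
    }

    /**
     * Sums log-probabilities over every window of n consecutive words
     * (assumed scoring scheme). The result lies in ]-inf, 0]; an unseen
     * n-gram yields negative infinity, and a sequence shorter than n yields 0.
     */
    static double logLikelihood(Hashtable<String, Integer> table, String sequence, int n) {
        String[] words = sequence.trim().split("\\s+");
        long total = 0;
        for (int f : table.values()) {
            total += f;
        }
        double logSum = 0.0;
        for (int i = 0; i + n <= words.length; i++) {
            String ngram = String.join(" ", Arrays.copyOfRange(words, i, i + n));
            logSum += Math.log(prob(table, total, ngram));
        }
        return logSum;
    }

    /** Likelihood in [0,1], recovered from the log-likelihood. */
    static double likelihood(Hashtable<String, Integer> table, String sequence, int n) {
        return Math.exp(logLikelihood(table, sequence, n));
    }

    public static void main(String[] args) {
        Hashtable<String, Integer> bigrams = new Hashtable<>();
        bigrams.put("DT NN", 1);
        bigrams.put("NN VB", 1);
        System.out.println(logLikelihood(bigrams, "DT NN VB", 2));
        System.out.println(likelihood(bigrams, "DT NN VB", 2));
        System.out.println(likelihood(bigrams, "VB VB VB", 2)); // unseen bigram: prints 0.0
    }
}
```

Working in log space avoids floating-point underflow on long sequences, which is a plausible reason for the class to offer a log-likelihood variant alongside the plain likelihood.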