java.lang.Object
  java.util.Dictionary<K,V>
    java.util.Hashtable<String,Integer>
      hultig.sumo.HNgram
public class HNgram
An efficient representation of a large set of n-grams. Based on a
hash table (Hashtable<String,Integer>), it associates each n-gram
with an integer: the frequency of that n-gram.
(9:37:45 13 April 2009)
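As a rough illustration of the idea described above, a frequency table can be built directly on `Hashtable<String,Integer>`. The sketch below is not the HNgram source: the class name `NgramTableSketch`, the `count` method, the space-joined key format, and the relative-frequency estimate in `prob` are all assumptions made for this example.

```java
import java.util.Hashtable;

// Hypothetical stand-in for illustration only; not the actual HNgram class.
class NgramTableSketch extends Hashtable<String, Integer> {
    private static final long serialVersionUID = 1L;

    // Adds one occurrence of an n-gram (assumed key format: words joined by one space).
    void count(String ngram) {
        merge(ngram, 1, Integer::sum);
    }

    // Frequency of an n-gram; 0 when it has never been counted.
    int freq(String ngram) {
        Integer f = get(ngram);
        return f == null ? 0 : f;
    }

    // Same lookup, with the n-gram given as an array of words.
    int freq(String[] words) {
        return freq(String.join(" ", words));
    }

    // Sum of all stored frequencies, recomputed on each call.
    long getSum() {
        long sum = 0;
        for (int f : values()) {
            sum += f;
        }
        return sum;
    }

    // Assumed estimate: relative frequency, freq / sum (0.0 for an empty table).
    double prob(String ngram) {
        long sum = getSum();
        return sum == 0 ? 0.0 : (double) freq(ngram) / sum;
    }

    public static void main(String[] args) {
        NgramTableSketch table = new NgramTableSketch();
        table.count("the cat");
        table.count("the cat");
        table.count("a dog");
        table.count("the dog");
        System.out.println("freq(the cat) = " + table.freq("the cat"));
        System.out.println("sum           = " + table.getSum());
        System.out.println("prob(the cat) = " + table.prob("the cat"));
    }
}
```

Recomputing the sum on each call keeps the sketch correct even when entries are changed through the inherited `put` or `remove`; a cached total would be faster but would need to be kept in sync.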
| Constructor Summary | |
|---|---|
| HNgram() | |
| HNgram(String fname) | Creates this object and loads the n-gram table from a given file. |
| HNgram(String fname, int n) | Creates this object and loads the n-gram table from a given file. |
| Method Summary | |
|---|---|
| void | countNGram(String sngram) |
| int | exclude(String pattern): Removes from this table all n-grams that match a given string pattern (regular expression). |
| int | freq(String ngram): Returns the frequency of a given n-gram. |
| int | freq(String[] v): Returns the frequency of a given n-gram, given as an array of strings. |
| long | getSum(): Returns the sum of the frequencies of all n-grams stored in this table, a value needed for n-gram probability estimation. |
| static void | main(String[] args): The main method is used for demonstration. |
| double | prob(String ngram): The estimated probability of a given n-gram, for the data in this table. |
| double | prob(String[] v): The estimated probability of a given n-gram, for the data in this table. |
| double | probabilidade(String sws): Computes the log-likelihood of a given word sequence, based on the n-gram model stored in this object. |
| double | probability(String sws): Computes the likelihood of a given word sequence, based on the n-gram model stored in this object. |
| void | set(FileIN f): Loads the n-gram table from a given file. |
| static boolean | test201110191137() |
| Methods inherited from class java.util.Hashtable |
|---|
clear, clone, contains, containsKey, containsValue, elements, entrySet, equals, get, hashCode, isEmpty, keys, keySet, put, putAll, rehash, remove, size, toString, values |
| Methods inherited from class java.lang.Object |
|---|
finalize, getClass, notify, notifyAll, wait, wait, wait |
| Constructor Detail |
|---|
public HNgram()
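public HNgram(String fname)
Creates this object and loads the n-gram table from a given file; see the set(FileIN) method.
fname - The name of the text file to be processed.
public HNgram(String fname, int n)
Creates this object and loads the n-gram table from a given file; see the HNgram(String) constructor.
fname - The name of the text file to be processed.
n - The n-gram dimensionality value.

| Method Detail |
|---|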
public final void set(FileIN f)
f - Represents the n-gram table file to be processed.
public void countNGram(String sngram)
public int exclude(String pattern)
pattern - The regular expression.
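The page does not say what the `int` returned by `exclude` means or whether the pattern must match the whole n-gram. A minimal sketch, assuming the return value is the number of entries removed and that the pattern must match the entire key; `ExcludeSketch` and its static `exclude` helper are hypothetical names, not part of HNgram:

```java
import java.util.Hashtable;
import java.util.Iterator;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not the actual HNgram.exclude.
class ExcludeSketch {

    // Removes every entry whose key fully matches the regular expression.
    // Assumption: the returned int is the number of removed entries.
    static int exclude(Map<String, Integer> table, String pattern) {
        Pattern compiled = Pattern.compile(pattern);
        int removed = 0;
        // Removing through the iterator avoids ConcurrentModificationException.
        Iterator<String> keys = table.keySet().iterator();
        while (keys.hasNext()) {
            if (compiled.matcher(keys.next()).matches()) {
                keys.remove();
                removed++;
            }
        }
        return removed;
    }

    public static void main(String[] args) {
        Map<String, Integer> table = new Hashtable<>();
        table.put("the cat", 3);
        table.put("a 1990 model", 1);
        table.put("in 2009", 2);
        // Drop every n-gram that contains at least one digit.
        int removed = exclude(table, ".*\\d.*");
        System.out.println("removed = " + removed + ", remaining = " + table.keySet());
    }
}
```

If partial matches are wanted instead, `matcher(...).find()` can replace `matches()`.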
public int freq(String ngram)
ngram - The indicated n-gram.
public int freq(String[] v)
v - The array containing the n-gram word sequence.
public double prob(String ngram)
ngram - The indicated n-gram.
public double prob(String[] v)
v - The n-gram representation.
public double probabilidade(String sws)
sws - The word sequence.
Returns: a log-likelihood value in the ]-∞, 0] interval.
public double probability(String sws)
sws - The word sequence.
Returns: a likelihood value in the [0,1] interval.
public long getSum()
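The relationship between probabilidade (log-likelihood), probability (likelihood), and the total given by getSum() can be sketched as follows. This page does not document how HNgram actually scores a sequence, so the scheme here is an assumption: slide a window of n words over the sequence, estimate each n-gram by its relative frequency (count divided by the sum of all counts), and add the logarithms. `SequenceLikelihoodSketch` and its methods are hypothetical names.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical scoring scheme for illustration only; not the actual HNgram code.
class SequenceLikelihoodSketch {

    // Sum of log relative frequencies over all n-word windows of the sequence.
    // The result lies in ]-inf, 0]; an unseen n-gram gives negative infinity.
    static double logLikelihood(Map<String, Integer> counts, int n, String sws) {
        long sum = 0;
        for (int f : counts.values()) {
            sum += f;
        }
        String[] words = sws.trim().split("\\s+");
        double ll = 0.0;
        for (int i = 0; i + n <= words.length; i++) {
            String ngram = String.join(" ", Arrays.copyOfRange(words, i, i + n));
            int f = counts.getOrDefault(ngram, 0);
            if (f == 0) {
                return Double.NEGATIVE_INFINITY; // no smoothing in this sketch
            }
            ll += Math.log((double) f / sum);
        }
        return ll;
    }

    // Likelihood in [0, 1], obtained by exponentiating the log-likelihood.
    static double likelihood(Map<String, Integer> counts, int n, String sws) {
        return Math.exp(logLikelihood(counts, n, sws));
    }

    public static void main(String[] args) {
        // Toy bigram table over part-of-speech tags (made-up counts).
        Map<String, Integer> counts = new HashMap<>();
        counts.put("DT NN", 2);
        counts.put("NN VB", 1);
        counts.put("VB DT", 1);
        System.out.println(logLikelihood(counts, 2, "DT NN VB"));
        System.out.println(likelihood(counts, 2, "DT NN VB"));
    }
}
```

Working in log space avoids numeric underflow on long sequences, which is one common reason to expose both a log-likelihood and a likelihood method.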
public static void main(String[] args)
The main method is used for demonstration. It creates an instance
of this class by loading a given table of a 4-gram model of
part-of-speech tags. Afterwards, the likelihood of a tag sequence
is calculated.
args - No arguments are expected.
public static boolean test201110191137()