hultig.sumo
Class ChunkType

java.lang.Object
  extended by hultig.sumo.ChunkType

public class ChunkType
extends Object

Represents a chunk type produced by a shallow parser. Chunk tags follow the Penn Treebank tag set. Only six tags are currently defined for sentence chunks: NP, VP, PP, PRT, ADVP, and UNDEFINED. More chunk tags may be added in the future.

University of Beira Interior (UBI)
Centre For Human Language Technology and Bioinformatics (HULTIG)


Field Summary
static int ADVP
          The code representing an Adverb Phrase.
static String[] DOMAIN
          The set of string tags used for labeling chunks.
static int NP
          The code representing a Noun Phrase.
static int PP
          The code representing a Prepositional Phrase.
static int PRT
          The code representing a particle.
static int UND
          Internal representation of an undefined chunk.
static int VP
          The code representing a Verb Phrase.
 
Constructor Summary
ChunkType()
           
 
Method Summary
static String cod2str(int cod)
          Converts a tag code, as defined by the static fields of this class, into its corresponding tag string.
static void main(String[] args)
          This main method lists the set of chunk tags and their corresponding codes.
static int str2cod(String scod)
          Returns the tag code for a given chunk tag string, using the internal codes defined by the static int fields of this class.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

UND

public static final int UND
Internal representation of an undefined chunk.

See Also:
Constant Field Values

NP

public static final int NP
The code representing a Noun Phrase.

See Also:
Constant Field Values

VP

public static final int VP
The code representing a Verb Phrase.

See Also:
Constant Field Values

PP

public static final int PP
The code representing a Prepositional Phrase.

See Also:
Constant Field Values

PRT

public static final int PRT
The code representing a particle. Category for words that should be tagged RP.

See Also:
Constant Field Values

ADVP

public static final int ADVP
The code representing an Adverb Phrase.

See Also:
Constant Field Values

DOMAIN

public static final String[] DOMAIN
The set of string tags used for labeling chunks.

Constructor Detail

ChunkType

public ChunkType()
Method Detail

str2cod

public static int str2cod(String scod)
Returns the tag code for a given chunk tag string, using the internal codes defined by the static int fields of this class.

Parameters:
scod - The chunk tag string.
Returns:
The code of the given chunk tag, or the UND code (undefined/unknown) if the string is not recognized.
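
The lookup described above can be sketched as follows. This is a minimal stand-in, not the actual hultig.sumo.ChunkType source: the class name ChunkTagLookupSketch is hypothetical, and both the numeric values of the constants and the layout of DOMAIN (each tag stored at the index equal to its code) are assumptions, since this page does not list them. Exact, case-sensitive matching is also an assumption.

```java
// Hypothetical stand-in for illustration only; not the real ChunkType class.
class ChunkTagLookupSketch {
    // ASSUMPTION: the real constant values are not given in this documentation.
    static final int UND = 0, NP = 1, VP = 2, PP = 3, PRT = 4, ADVP = 5;

    // ASSUMPTION: each tag string sits at the index equal to its code.
    static final String[] DOMAIN = {"UNDEFINED", "NP", "VP", "PP", "PRT", "ADVP"};

    // Returns the code for a tag string, or UND when the string is not recognized.
    static int str2cod(String scod) {
        if (scod == null) {
            return UND;
        }
        for (int cod = 0; cod < DOMAIN.length; cod++) {
            if (DOMAIN[cod].equals(scod)) { // exact, case-sensitive match (assumed)
                return cod;
            }
        }
        return UND;
    }

    public static void main(String[] args) {
        System.out.println(str2cod("VP") == VP);
        System.out.println(str2cod("XYZ") == UND);
    }
}
```

Returning UND instead of throwing keeps callers free of exception handling when a parser emits a tag outside the six supported ones.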

cod2str

public static String cod2str(int cod)
Converts a tag code, as defined by the static fields of this class, into its corresponding tag string.

Parameters:
cod - The chunk tag code.
Returns:
The chunk tag. If the code is unknown, the "UNDEFINED" string is returned.
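
The reverse conversion might look like the sketch below. As before, this is an illustration under stated assumptions rather than the library's code: the class name ChunkTagNameSketch is hypothetical, and the DOMAIN ordering (tag strings indexed by their code, with "UNDEFINED" first) is assumed. Only the "UNDEFINED" fallback for unknown codes comes from the documentation above.

```java
// Hypothetical stand-in for illustration only; not the real ChunkType class.
class ChunkTagNameSketch {
    // ASSUMPTION: tags are indexed by their code; the real ordering is not documented here.
    static final String[] DOMAIN = {"UNDEFINED", "NP", "VP", "PP", "PRT", "ADVP"};

    // Returns the tag string for a code, or "UNDEFINED" when the code is out of range.
    static String cod2str(int cod) {
        if (cod < 0 || cod >= DOMAIN.length) {
            return "UNDEFINED";
        }
        return DOMAIN[cod];
    }

    public static void main(String[] args) {
        // List every assumed code with its tag, then probe one unknown code.
        for (int cod = 0; cod < DOMAIN.length; cod++) {
            System.out.println(cod + " -> " + cod2str(cod));
        }
        System.out.println(cod2str(42));
    }
}
```

The bounds check is what makes the method total: any int, including negative values, maps to some tag string.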

main

public static void main(String[] args)
This main method lists the set of chunk tags and their corresponding codes. It also tests the conversion methods.

Parameters:
args - No arguments are expected.