hultig.align
Class NWunsch

java.lang.Object
  extended by hultig.align.NWunsch

public class NWunsch
extends Object

This class contains an implementation of the Needleman-Wunsch algorithm for global alignment of sequence pairs (the whole sequences are aligned). The algorithm has been used for DNA sequence alignment in genetics, and is adapted here to align the words of sentence pairs.

University of Beira Interior (UBI)
Centre For Human Language Technology and Bioinformatics (HULTIG)
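
To illustrate the kind of word-level global alignment described above, here is a minimal, self-contained sketch. It is not the NWunsch source: the class name WordAlignSketch, the gap marker "__" and the scoring constants are assumptions made for illustration (the gap penalty of 3 and the score of 10 for identical words are read off the sample matrix shown under printMatrixLatex; the mismatch score is purely hypothetical).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Illustrative Needleman-Wunsch word aligner; not the hultig.align.NWunsch code. */
class WordAlignSketch {
    static final double GAP = -3.0;       // assumed gap penalty
    static final double MATCH = 10.0;     // assumed score for identical words
    static final double MISMATCH = -5.0;  // hypothetical score for different words
    static final String GAP_MARK = "__";  // assumed marker for an inserted gap

    static double sim(String a, String b) {
        return a.equals(b) ? MATCH : MISMATCH;
    }

    /** Returns two equally long rows: the aligned words of a and of b. */
    static String[][] align(String[] a, String[] b) {
        int n = a.length, m = b.length;
        double[][] s = new double[n + 1][m + 1];
        // Border cells: aligning a prefix against nothing costs one gap per word.
        for (int i = 1; i <= n; i++) s[i][0] = i * GAP;
        for (int j = 1; j <= m; j++) s[0][j] = j * GAP;
        // Each inner cell takes the best of: pair the two words, or skip one of them.
        for (int i = 1; i <= n; i++) {
            for (int j = 1; j <= m; j++) {
                s[i][j] = Math.max(s[i - 1][j - 1] + sim(a[i - 1], b[j - 1]),
                          Math.max(s[i - 1][j] + GAP, s[i][j - 1] + GAP));
            }
        }
        // Traceback from the bottom-right corner, re-deriving which move was taken.
        List<String> rowA = new ArrayList<>();
        List<String> rowB = new ArrayList<>();
        int i = n, j = m;
        while (i > 0 || j > 0) {
            if (i > 0 && j > 0 && s[i][j] == s[i - 1][j - 1] + sim(a[i - 1], b[j - 1])) {
                rowA.add(a[--i]);
                rowB.add(b[--j]);
            } else if (i > 0 && s[i][j] == s[i - 1][j] + GAP) {
                rowA.add(a[--i]);
                rowB.add(GAP_MARK);
            } else {
                rowA.add(GAP_MARK);
                rowB.add(b[--j]);
            }
        }
        Collections.reverse(rowA);
        Collections.reverse(rowB);
        return new String[][] { rowA.toArray(new String[0]), rowB.toArray(new String[0]) };
    }

    public static void main(String[] args) {
        String[][] out = align("the storm was intense".split(" "),
                               "the storm was very intense".split(" "));
        System.out.println(String.join(" ", out[0]));
        System.out.println(String.join(" ", out[1]));
    }
}
```

The exact floating-point comparisons in the traceback are safe here only because all scores are small integers; with graded similarity values, storing the chosen move in a separate pointer matrix would be the more robust choice.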


Constructor Summary
NWunsch(int[] sa, int[] sb)
           
NWunsch(Sentence sa, Sentence sb)
           
NWunsch(String sa, String sb)
           
 
Method Summary
 void buildAlignment()
           
 void buildMatrix()
          Compute the similarity matrix, using a dynamic programming strategy.
 String codes2str(int[] vs)
           
static void computeAlignOut(Text txt, int idpar)
           
 String[] getAlignmentH()
           
 CorpusIndex getDict()
           
static void main(String[] args)
           
static void mainX(String[] args)
           
static void mainX1(String[] args)
           
 void printAlignmentH(int idpar)
           
 void printAlignmentV()
           
 void printMatrix()
           
 void printMatrixLatex()
          Print the alignment matrix to be used in LaTeX.
 void printvectors()
           
static boolean processParaFile(String filename)
          Processes a paraphrase file.
 void setDic(CorpusIndex dict)
           
 double similarity(int i, int j)
          Defines the distance metric between "symbols" of the alphabet.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

NWunsch

public NWunsch(int[] sa,
               int[] sb)

NWunsch

public NWunsch(String sa,
               String sb)

NWunsch

public NWunsch(Sentence sa,
               Sentence sb)
Method Detail

setDic

public void setDic(CorpusIndex dict)

getDict

public CorpusIndex getDict()

codes2str

public String codes2str(int[] vs)

similarity

public double similarity(int i,
                         int j)
Defines the distance metric between "symbols" of the alphabet.

Parameters:
i - int
j - int
Returns:
double
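
As a rough illustration of a symbol-level score of this kind, the sketch below compares two integer word codes. It assumes the arguments are symbol codes (the int[] constructor and codes2str suggest integer-coded words, but this page does not confirm it), and the class name and both constants are hypothetical. The sample matrix under printMatrixLatex contains fractional values, so the actual metric is presumably graded rather than binary as here.

```java
/** Hypothetical binary symbol score; not the NWunsch.similarity implementation. */
class SymbolSimilaritySketch {
    static final double MATCH = 10.0;     // assumed score for identical symbols
    static final double MISMATCH = -5.0;  // hypothetical score for different symbols

    /** Scores two integer word codes: equal codes match, anything else mismatches. */
    static double similarity(int codeA, int codeB) {
        return codeA == codeB ? MATCH : MISMATCH;
    }

    public static void main(String[] args) {
        System.out.println(similarity(7, 7)); // same code
        System.out.println(similarity(7, 8)); // different codes
    }
}
```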

buildMatrix

public void buildMatrix()
Compute the similarity matrix, using a dynamic programming strategy. The mutex matrix will be generated.
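
The dynamic programming fill can be sketched as follows. This is an assumption-laden illustration, not the buildMatrix source: the class and method names are hypothetical, the gap penalty (3) and identical-word score (10) are inferred from the sample matrix under printMatrixLatex, and the mismatch score is invented. With these values the sketch reproduces the top-left corner of that sample (0, -3, -6 along the border; 10, 7, 4 in the "Gilbert" row).

```java
/** Illustrative score-matrix fill for global alignment; not the NWunsch.buildMatrix code. */
class NwMatrixSketch {
    static final double GAP = -3.0;       // assumed, matches the sample border values
    static final double MATCH = 10.0;     // assumed, matches the sample diagonal
    static final double MISMATCH = -5.0;  // hypothetical

    static double score(String a, String b) {
        return a.equals(b) ? MATCH : MISMATCH;
    }

    /** Cell [i][j] holds the best score for aligning a[0..i) with b[0..j). */
    static double[][] fill(String[] a, String[] b) {
        double[][] m = new double[a.length + 1][b.length + 1];
        for (int i = 1; i <= a.length; i++) m[i][0] = i * GAP;
        for (int j = 1; j <= b.length; j++) m[0][j] = j * GAP;
        for (int i = 1; i <= a.length; i++) {
            for (int j = 1; j <= b.length; j++) {
                double diag = m[i - 1][j - 1] + score(a[i - 1], b[j - 1]); // pair the words
                double up = m[i - 1][j] + GAP;                             // skip a word of a
                double left = m[i][j - 1] + GAP;                           // skip a word of b
                m[i][j] = Math.max(diag, Math.max(up, left));
            }
        }
        return m;
    }

    public static void main(String[] args) {
        String[] words = {"Gilbert", "was", "the"};
        for (double[] row : fill(words, words)) {
            System.out.println(java.util.Arrays.toString(row));
        }
    }
}
```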


buildAlignment

public void buildAlignment()

getAlignmentH

public String[] getAlignmentH()

printAlignmentH

public void printAlignmentH(int idpar)

printAlignmentV

public void printAlignmentV()

printvectors

public void printvectors()

printMatrix

public void printMatrix()

printMatrixLatex

public void printMatrixLatex()
Print the alignment matrix to be used in LaTeX. Created on 10 June 2010. For example:

    \tiny
    \begin{equation*}\label{MATRIX:Fexample}
    \left(
    \begin{array}{rrrrrrrrrrrrr}
              & \_\_  & Gilbert & was  & the  & most  & intense & storm & on    & record & in    & the   & west  \\
    \_\_      & 0.0   & -3.0    & -6.0 & -9.0 & -12.0 & -15.0   & -18.0 & -21.0 & -24.0  & -27.0 & -30.0 & -33.0 \\
    Gilbert   & -3.0  & 10.0    & 7.0  & 4.0  & 1.0   & -2.0    & -5.0  & -8.0  & -11.0  & -14.0 & -17.0 & -20.0 \\
    was       & -6.0  & 7.0     & 20.0 & 17.0 & 14.0  & 11.0    & 8.0   & 5.0   & 2.0    & -1.0  & -4.0  & -7.0  \\
    the       & -9.0  & 4.0     & 17.0 & 30.0 & 27.0  & 24.0    & 21.0  & 18.0  & 15.0   & 12.0  & 9.0   & 6.0   \\
    most      & -12.0 & 1.0     & 14.0 & 27.0 & 40.0  & 37.0    & 34.0  & 31.0  & 28.0   & 25.0  & 22.0  & 19.0  \\
    intense   & -15.0 & -2.0    & 11.0 & 24.0 & 37.0  & 50.0    & 47.0  & 44.0  & 41.0   & 38.0  & 35.0  & 32.0  \\
    hurricane & -18.0 & -5.0    & 8.0  & 21.0 & 34.0  & 47.0    & 44.0  & 41.0  & 38.0   & 35.0  & 32.0  & 29.0  \\
    ever      & -21.0 & -8.0    & 5.0  & 18.0 & 31.0  & 44.0    & 41.0  & 38.0  & 35.0   & 36.0  & 33.0  & 30.0  \\
    recorded  & -24.0 & -11.0   & 2.0  & 15.0 & 28.0  & 41.0    & 38.0  & 35.0  & 35.4   & 33.0  & 30.0  & 27.0  \\
    in        & -27.0 & -14.0   & -1.0 & 12.0 & 25.0  & 38.0    & 35.0  & 32.0  & 32.4   & 45.4  & 42.4  & 39.4  \\
    western   & -30.0 & -17.0   & -4.0 & 9.0  & 22.0  & 35.0    & 32.0  & 29.0  & 29.4   & 42.4  & 39.4  & 36.4  \\
    hemisfere & -33.0 & -20.0   & -7.0 & 6.0  & 19.0  & 32.0    & 29.0  & 26.0  & 26.4   & 39.4  & 37.2  & 34.2
    \end{array}
    \right)
    \end{equation*}
    \normalsize
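
One possible way to produce output of this shape is sketched below. It is not the printMatrixLatex source: the class name, method name and signature are hypothetical, and only the \left( ... \right) array portion is generated, leaving the surrounding equation* environment and font-size commands to the caller.

```java
import java.util.Locale;

/** Illustrative LaTeX array writer for a labelled score matrix; names are hypothetical. */
class LatexMatrixSketch {
    /** Escapes underscores so labels such as "__" are valid in LaTeX math mode. */
    static String escape(String label) {
        return label.replace("_", "\\_");
    }

    static String toLatex(String[] rowLabels, String[] colLabels, double[][] m) {
        StringBuilder sb = new StringBuilder("\\left(\n\\begin{array}{");
        // One right-aligned column for the row labels plus one per matrix column.
        for (int c = 0; c <= colLabels.length; c++) sb.append('r');
        sb.append("}\n");
        // Header row: empty corner cell followed by the column labels.
        for (String label : colLabels) sb.append(" & ").append(escape(label));
        sb.append(" \\\\\n");
        for (int i = 0; i < m.length; i++) {
            sb.append(escape(rowLabels[i]));
            for (double v : m[i]) {
                // Locale.ROOT keeps the decimal point independent of the platform locale.
                sb.append(" & ").append(String.format(Locale.ROOT, "%.1f", v));
            }
            // Every row except the last ends with a LaTeX row separator.
            sb.append(i < m.length - 1 ? " \\\\\n" : "\n");
        }
        return sb.append("\\end{array}\n\\right)").toString();
    }

    public static void main(String[] args) {
        String[] labels = {"__", "Gilbert"};
        System.out.println(toLatex(labels, labels, new double[][] {{0, -3}, {-3, 10}}));
    }
}
```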


processParaFile

public static boolean processParaFile(String filename)
Processes a paraphrase file. The output is an equivalent set of aligned paraphrases.

Parameters:
filename - String
Returns:
boolean

computeAlignOut

public static void computeAlignOut(Text txt,
                                   int idpar)

main

public static void main(String[] args)

mainX1

public static void mainX1(String[] args)

mainX

public static void mainX(String[] args)