MEI dissertation proposal
Title: Study on efficient implementations of statistical Correlation of Relevant Expressions (RE) metrics in natural language documents
Proposer(s): Joaquim Francisco Ferreira da Silva
Victor Manuel Alves Duarte
Credits: 42 ECTS
Scientific area: Information Systems Technology
Preferred start: Any semester
URL:
Preliminary work is already under way, carried out by the following student(s):
João Simões
Brief description: The combinatorial explosion of RE pairs that occur in corpora requires efficient computational methods for evaluating the correlation metrics that allow semantic bridges to be established between documents. These bridges are useful in document search, since they allow a search that is not confined to the explicit occurrence of REs in each document.
The dissertation has two aims: on the one hand, to evaluate which statistical correlation metrics are most adequate for this problem and, on the other hand, to identify alternatives for their efficient implementation using parallel computing methods.
Following this study, the work should develop a parallel prototype that, given a large collection of REs, produces a list of RE pairs ordered according to the chosen statistical correlation criterion. Finally, the execution performance of the prototype should be evaluated against a sequential implementation.
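As an illustration only, the kind of prototype described above could be sketched as follows. The proposal does not fix a correlation metric, so the phi coefficient (Pearson correlation of two binary per-document occurrence indicators) is assumed here purely as an example. The toy data and the names `OCCURRENCES`, `phi`, `score_chunk` and `rank_pairs` are hypothetical. The pairs are split into chunks that can be scored independently, either sequentially or in a process pool.

```python
from concurrent.futures import ProcessPoolExecutor
from itertools import combinations
from math import sqrt

# Hypothetical toy input: each RE maps to the set of document ids containing it.
OCCURRENCES = {
    "climate change": {0, 1, 2, 5},
    "greenhouse gases": {0, 1, 2},
    "stock market": {3, 4},
    "interest rates": {3, 4, 5},
}
N_DOCS = 6  # assumed corpus size for the toy data


def phi(docs_a, docs_b, n_docs):
    """Phi coefficient of two binary occurrence indicators (assumed metric)."""
    n_a, n_b = len(docs_a), len(docs_b)
    n_ab = len(docs_a & docs_b)  # documents where both REs occur
    denom = sqrt(n_a * (n_docs - n_a) * n_b * (n_docs - n_b))
    if denom == 0:
        return 0.0  # an RE in no document or in all documents carries no signal
    return (n_docs * n_ab - n_a * n_b) / denom


def score_chunk(task):
    """Score one chunk of RE pairs; top-level so a process pool can pickle it."""
    pairs, occurrences, n_docs = task
    return [(phi(occurrences[a], occurrences[b], n_docs), a, b) for a, b in pairs]


def rank_pairs(occurrences, n_docs, workers=1, chunk_size=2):
    """Return (score, re_a, re_b) tuples ordered by decreasing correlation."""
    pairs = list(combinations(sorted(occurrences), 2))
    chunks = [pairs[i:i + chunk_size] for i in range(0, len(pairs), chunk_size)]
    tasks = [(chunk, occurrences, n_docs) for chunk in chunks]
    if workers > 1:
        # Parallel path: each chunk is scored in a separate worker process.
        with ProcessPoolExecutor(max_workers=workers) as pool:
            results = list(pool.map(score_chunk, tasks))
    else:
        # Sequential baseline, used for the performance comparison.
        results = [score_chunk(task) for task in tasks]
    scored = [item for chunk_result in results for item in chunk_result]
    # Ties are broken alphabetically so the ordering is deterministic.
    return sorted(scored, key=lambda t: (-t[0], t[1], t[2]))


if __name__ == "__main__":
    sequential = rank_pairs(OCCURRENCES, N_DOCS, workers=1)
    parallel = rank_pairs(OCCURRENCES, N_DOCS, workers=2)
    print("same ranking:", sequential == parallel)
    for score, a, b in parallel:
        print(f"{score:+.3f}  {a} / {b}")
```

With n REs there are n(n-1)/2 pairs, so for a large collection the list of pairs itself would not fit in memory as it does here. A real implementation would likely generate the pairs lazily and keep only the top-scoring ones, but that choice is outside the scope of this sketch.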
Remarks: To carry out this work, the student must have a good background in Probability and Statistics, as well as in Parallel Computing.