MEI dissertation proposal
Title: Automatic Detection of Unlikely Combinations of Categorical Variables in Big Data context
Proposer(s): Joaquim Francisco Ferreira da Silva
Maria Celeste Rodrigues Jacinto
Credits: 42 ECTS
Scientific area: Information Systems Technology
Preferred start: Any semester
Preliminary work by the student(s) already under way:
Brief description: This dissertation draws on Data Mining and Machine Learning. The data come from the Industrial Engineering area, specifically occupational safety and accident prevention.
Occupational accident declarations are characterized by 22 categorical variables, each with several possible discrete values. The number of possible accident patterns therefore grows combinatorially and is very large. This poses a computational challenge for Machine Learning systems, which must learn both the patterns that characterize families of accidents and the strength of the associations between the subsets of variables that make up those patterns.
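To give a sense of the scale, suppose variable i can take k_i distinct values. The symbols k_i and N are notation introduced here for illustration; the proposal does not state the actual cardinalities. The number of distinct value combinations is then bounded below as follows:

```latex
N = \prod_{i=1}^{22} k_i \;\ge\; 2^{22} = 4\,194\,304
\qquad \text{(assuming every } k_i \ge 2\text{)}
```

Even in the most conservative case, where every variable is binary, there are already more than four million possible patterns. With several values per variable the count is far larger.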
After this learning phase, the system to be developed in this dissertation should be able to detect automatically whether a new accident declaration is atypical or unlikely, given the knowledge acquired during training. Flagging such declarations is expected to improve the quality of accident data.
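The train-then-detect idea can be sketched in a few lines of Python. This is only a minimal illustration, not the method the dissertation will adopt: it assumes that pairwise co-occurrence frequency is an adequate proxy for association strength, and the function names, the `min_support` threshold, and the toy variables (`activity`, `injury`) are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def fit_pair_counts(records):
    """Training phase: count how often each pair of (variable, value) items co-occurs."""
    pair_counts = Counter()
    for rec in records:
        items = sorted(rec.items())  # fixed order so each pair has one canonical key
        pair_counts.update(combinations(items, 2))
    return pair_counts, len(records)


def unlikely_pairs(record, pair_counts, n_records, min_support=0.05):
    """Detection phase: return the pairs in `record` seen in fewer than
    `min_support` of the training records (the threshold is an assumed value)."""
    items = sorted(record.items())
    return [
        pair
        for pair in combinations(items, 2)
        if pair_counts[pair] / n_records < min_support
    ]


# Toy training set with two hypothetical variables (not real accident data).
training = (
    [{"activity": "welding", "injury": "burn"}] * 6
    + [{"activity": "driving", "injury": "fracture"}] * 4
)
pair_counts, n_records = fit_pair_counts(training)

typical = {"activity": "welding", "injury": "burn"}
atypical = {"activity": "welding", "injury": "fracture"}

flags_typical = unlikely_pairs(typical, pair_counts, n_records)
flags_atypical = unlikely_pairs(atypical, pair_counts, n_records)
```

A declaration with a non-empty list of flagged pairs would be a candidate for manual review. With 22 variables there are 231 pairs per declaration, so a pairwise approach stays tractable, but it cannot capture associations among larger subsets of variables, which the proposal also has in view.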