Analyzing chemical safety reports by text mining techniques


Jarque Villanueva, Pol

For years, the chemical industry has been obliged to report the accidents that occur in its plants. However, these accidents are documented as unstructured reports, from which it is difficult to extract meaningful information. This project focuses on the use of text mining techniques to systematize and make sense of these reports.
The accident reports analyzed in this project were extracted from eMARS (the Major Accident Reporting System), a European Union database. eMARS records the major accidents that have occurred in chemical industry plants subject to the Seveso Directive.
Throughout the project, several aspects of the accidents were analyzed from the extracted information. The main ones are the identification of the substances involved, the classification of accidents according to a taxonomy, the unsupervised classification of accidents based on the text of their reports, and a quantitative and qualitative analysis of the human-related consequences.
The analysis of the reports of the 1,076 accidents contained in the eMARS database shows that these documents can be analyzed by applying text mining. A total of 404 different chemical compounds were identified in the database; chlorine gas is the most frequent, present in 60 accidents. The classification of accidents according to the selected taxonomy shows that the most frequent main event is escape, present in 530 accidents. The most frequent causes of accidents are those related to the plant and equipment, present in 519 accidents. In the unsupervised classification, the accidents were grouped into 10 clusters. Finally, regarding the human-related consequences, at least one injury or death occurred in 423 accidents.
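As an illustration of the substance identification step, the following is a minimal sketch of dictionary-based matching that counts the number of reports mentioning each compound. It is not the method used in the project: the dictionary `SUBSTANCES`, the helper names `substances_in` and `accident_counts`, and the sample reports are all hypothetical and serve only to show the idea.

```python
import re
from collections import Counter

# Hypothetical dictionary: canonical name -> synonyms (illustrative only,
# not the compound list used in the project).
SUBSTANCES = {
    "chlorine": ["chlorine", "cl2"],
    "ammonia": ["ammonia", "nh3"],
}


def substances_in(report: str) -> set:
    """Return the canonical names of dictionary substances mentioned in a report."""
    text = report.lower()
    found = set()
    for name, synonyms in SUBSTANCES.items():
        # Whole-word matching, so "chlorine" is not found inside "dichlorine".
        if any(re.search(rf"\b{re.escape(s)}\b", text) for s in synonyms):
            found.add(name)
    return found


def accident_counts(reports: list) -> Counter:
    """Count, for each substance, the reports that mention it at least once."""
    counts = Counter()
    for report in reports:
        counts.update(substances_in(report))
    return counts


# Invented sample texts, not taken from eMARS.
reports = [
    "A chlorine leak occurred during unloading; Cl2 was detected downwind.",
    "Release of ammonia from a refrigeration unit.",
    "Chlorine gas escaped from a corroded valve.",
]
print(accident_counts(reports).most_common())
```

Each report contributes at most once per substance, so the counts correspond to the number of accidents in which a compound is present rather than the number of times it is mentioned.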



Cuadros Margarit, Jordi


IQS SE - Master’s Degree in Chemical Engineering