Automated Chemical Ontology Expansion using Deep Learning


Ontologies provide a shared vocabulary and semantic resource for a domain. Manual construction enables them to achieve high quality and capture subtle semantic nuances, which is essential for wide acceptance and applicability across a community. However, manual curation does not scale to large domains. I will present a methodology for automatic ontology extension based on deep learning using ontology annotations, and show how we apply this methodology to the ChEBI ontology, a prominent reference ontology for life sciences chemistry. We trained a Transformer-based deep learning architecture on the chemical structures of the ontology’s leaf nodes, so that the system learns to predict membership in multiple mid-level ontology classes as a multi-class classification task. Additionally, I will illustrate how visualizing the model’s attention weights can help to explain the results by providing insight into how the model made its decisions.
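
To make the training setup concrete, the following is a minimal sketch of how leaf-node structures might be paired with class-membership targets before any model is trained. It is not the implementation used in this work. The tiny hierarchy, the choice of target classes, and the helper names (`ancestors`, `label_vector`, `build_dataset`) are all hypothetical, and the hierarchy is a simplified illustration rather than actual ChEBI content. The sketch also assumes that a structure may belong to several target classes at once, so each target is encoded as a vector with one 0/1 entry per class.

```python
# Toy is_a hierarchy: child -> list of direct parents.
# Simplified for illustration; NOT actual ChEBI content.
PARENTS = {
    "ethanol": ["alcohol"],
    "acetic acid": ["carboxylic acid"],
    "lactic acid": ["alcohol", "carboxylic acid"],
    "alcohol": ["organic hydroxy compound"],
    "carboxylic acid": ["organic acid"],
    "organic hydroxy compound": [],
    "organic acid": [],
}

# Leaf nodes annotated with a chemical structure (SMILES strings).
STRUCTURES = {
    "ethanol": "CCO",
    "acetic acid": "CC(O)=O",
    "lactic acid": "CC(O)C(O)=O",
}

# Hypothetical choice of mid-level classes used as prediction targets.
TARGET_CLASSES = [
    "alcohol",
    "carboxylic acid",
    "organic hydroxy compound",
    "organic acid",
]


def ancestors(node, parents):
    """Return every class reachable from `node` by following is_a links."""
    seen = set()
    stack = list(parents.get(node, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, []))
    return seen


def label_vector(leaf, parents, target_classes):
    """Encode membership of `leaf` in each target class as 0 or 1."""
    reachable = ancestors(leaf, parents)
    return [1 if cls in reachable else 0 for cls in target_classes]


def build_dataset(structures, parents, target_classes):
    """Pair each leaf structure with its class-membership vector."""
    return [
        (smiles, label_vector(name, parents, target_classes))
        for name, smiles in structures.items()
    ]


if __name__ == "__main__":
    for smiles, labels in build_dataset(STRUCTURES, PARENTS, TARGET_CLASSES):
        print(smiles, labels)
```

In a setup like this, the structure strings would be tokenized and passed to the model, and the 0/1 vectors would serve as training targets. Class membership is derived from the transitive closure of the is_a relation, so a leaf counts as a member of every ancestor class, not only of its direct parents.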