Ontological Representation and Machine Learning Prediction of Drugs for COVID-19 Treatment – UROP Summer Symposium 2021

Zalan Shah

Pronouns: He/Him/His

UROP Fellowship: Engineering
Research Mentor(s): Yongqun Oliver He, PhD
Research Mentor Institution/Department: Michigan Medicine, Center for Computational Medicine and Bioinformatics

Presentation Date: Wednesday, August 4th
Session: Session 2 (4pm-4:50pm EDT)
Breakout Room: Room 1
Presenter: 5

Background: SARS-CoV-2 is the human coronavirus that causes COVID-19; it mutates rapidly and has spread throughout the world. While COVID-19 vaccines have drastically reduced illness, new variants of the virus continue to emerge and reduce vaccine effectiveness. Given the continued spread of the disease, effective drugs for treating COVID-19 are urgently needed; however, no highly effective drug for COVID-19 has yet been approved for public use. Drug repurposing is a strategy for discovering new uses among the thousands of drugs already approved for other illnesses, and it can be applied to find effective COVID-19 treatments. This study aims to analyze drugs and their effects on the human body in order to predict effective drugs for COVID-19 using machine learning algorithms.

Methods: Ontology is a branch of AI dedicated to the human- and computer-understandable representation of entities and the relations among them in a specific domain. The Coronavirus Infectious Disease Ontology (CIDO) is a community-based ontology that represents various coronavirus-associated topics such as COVID-19 drugs, vaccines, and their mechanisms. DrugBank is a database that includes many drugs reported to be effective for COVID-19 treatment in clinical trials or experimental studies. In our study, we used CIDO to logically represent the drugs reported in DrugBank and their drug targets, and applied this information to further drug prediction. OPA2Vec, an existing ontology-based vector generation algorithm, is used to transform ontology data and other associated data into vectors, which are then used for t-SNE and neural network analysis. We hypothesize that this CIDO representation will enhance machine learning performance in COVID-19 drug prediction.
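
As a rough illustration of why a vector format is useful, the sketch below ranks drugs by how close their vectors are to a query drug. It is not the study's pipeline: cosine similarity is swapped in here as a simple stand-in for the t-SNE and neural network analysis, and the drug names (drug_A, drug_B, drug_C) and their 3-dimensional vectors are hypothetical toy values, not OPA2Vec output.

```python
import math


def cosine_similarity(u, v):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # a zero vector has no direction, so treat it as dissimilar
    return dot / (norm_u * norm_v)


def rank_by_similarity(query, embeddings):
    """Sort (name, score) pairs from most to least similar to the query vector."""
    scores = [(name, cosine_similarity(query, vec)) for name, vec in embeddings.items()]
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Hypothetical toy vectors; real embeddings would have many more dimensions.
embeddings = {
    "drug_A": [1.0, 0.0, 0.0],
    "drug_B": [0.9, 0.1, 0.0],
    "drug_C": [0.0, 1.0, 0.0],
}

# Compare drug_A against every other drug in the toy set.
others = {name: vec for name, vec in embeddings.items() if name != "drug_A"}
ranking = rank_by_similarity(embeddings["drug_A"], others)
```

In this toy set, drug_B ranks above drug_C because its vector points in nearly the same direction as drug_A's.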

Results: We have found 65 novel treatments for COVID-19, 51 of which were not previously in CIDO. These drugs, along with the proteins they target and their functions, have been identified from DrugBank. The information on the associated drug chemicals, drugs, and protein targets has been mapped to the corresponding IDs in the Chemical Entities of Biological Interest (ChEBI), the Drug Ontology (DrON), and the Protein Ontology (PRO), respectively, linking each entity to the relevant information in those ontologies. The processed data were then represented in CIDO. The OPA2Vec and convolutional neural network (CNN) methods have been tested, and we are currently processing our data for further OPA2Vec and CNN studies.
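
The mapping step can be pictured with a minimal sketch, assuming one record per drug that carries its ChEBI, DrON, and PRO cross-references. The class name, field names, relation names, and every identifier below are hypothetical placeholders chosen for illustration; they are not actual CIDO properties or real ontology IDs.

```python
from dataclasses import dataclass, field


@dataclass
class DrugRecord:
    """One drug with its cross-references (all field names are hypothetical)."""
    name: str
    chebi_id: str  # ID of the drug chemical in ChEBI
    dron_id: str   # ID of the drug in DrON
    targets: dict = field(default_factory=dict)  # protein name -> PRO ID


def to_triples(record):
    """Flatten a record into (subject, relation, object) triples."""
    triples = [
        (record.name, "has_chemical_id", record.chebi_id),
        (record.name, "has_drug_id", record.dron_id),
    ]
    # One triple per protein target, in a stable order.
    for pro_id in sorted(record.targets.values()):
        triples.append((record.name, "has_target", pro_id))
    return triples


# Placeholder identifiers only; these do not refer to real ontology entries.
example = DrugRecord(
    name="example_drug",
    chebi_id="CHEBI:0000000",
    dron_id="DRON:00000000",
    targets={"example_protein": "PR:000000000"},
)
triples = to_triples(example)
```

Keeping the cross-references in one record makes it straightforward to emit them as triples, which is the general shape an ontology representation takes.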

Conclusion: CIDO has been used for the logical representation of COVID-19 drugs and their targets. Ontology-based machine learning is an emerging approach that can be applied to predict COVID-19 drugs.

Authors: Zalan Shah, Yongqun “Oliver” He, PhD
Research Method: Computer Programming
