Automated pipeline combining multiple approaches to annotate cell types – UROP Spring Symposium 2022

Neha Kankanala

Research Mentor(s): Matthew Patrick
Research Mentor School/College/Department: Department of Dermatology / Medicine
Presentation Date: April 20
Presentation Type: Oral5
Session: Session 6 – 4:40pm – 5:30 pm
Room: Breakout room 1
Authors: Neha Kankanala
Presenter: 2

Abstract

Single-cell genomics is an emerging field. Applying computer science to medicine and biology in this way lets scientists quickly analyze the composition of a cell and look for patterns linking a cell's structure and the genes it expresses to the diseases it may be associated with. Several methods implemented in the R programming language output a label for a cell based on the genes it expresses, but these labels are not always accurate, so it is important to test the methods against existing data sets. I analyzed three semi-supervised cell annotation methods in R (AUCell, SCINA, and GSVA) and tested their accuracy by comparing their output to a manually labeled data set using Python and Excel. From these tests, I concluded that combining all three methods yields more accurate labels for each cell and that computational labeling of cells is a good time-saving option for scientists. Because cell annotation methods exist across several programming languages, scientists have a fast way to classify cell data, allowing them to focus on drawing connections between these cells and disease rather than spending excessive time annotating and labeling cells by hand.
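The comparison described above can be sketched in Python. This is a minimal illustration, not the pipeline used in the project: the abstract does not say how the three methods were combined, so a simple majority vote is assumed here, and the cell-type labels, the `majority_label`, `accuracy`, and `evaluate` helpers, and the toy data are all hypothetical. It assumes each R method's per-cell labels have already been exported (for example to a spreadsheet) and loaded as lists in the same cell order as the manual labels.

```python
from collections import Counter


def majority_label(labels, fallback="unknown"):
    """Return the label at least two methods agree on, else the fallback.

    Majority voting is an assumption; the abstract does not state the
    combination rule that was actually used.
    """
    label, count = Counter(labels).most_common(1)[0]
    return label if count >= 2 else fallback


def accuracy(predicted, manual):
    """Fraction of cells whose predicted label matches the manual label."""
    if len(predicted) != len(manual):
        raise ValueError("predicted and manual must cover the same cells")
    if not manual:
        return 0.0
    matches = sum(p == m for p, m in zip(predicted, manual))
    return matches / len(manual)


def evaluate(per_method, manual):
    """Score each method and the majority-vote combination against manual labels."""
    scores = {name: accuracy(labels, manual) for name, labels in per_method.items()}
    # zip(*...) walks the methods in parallel, one tuple of labels per cell.
    combined = [majority_label(cell) for cell in zip(*per_method.values())]
    scores["combined"] = accuracy(combined, manual)
    return scores


# Hypothetical toy data for four cells; not results from the study.
manual_labels = ["T cell", "keratinocyte", "fibroblast", "T cell"]
method_labels = {
    "AUCell": ["T cell", "keratinocyte", "T cell", "T cell"],
    "SCINA": ["T cell", "fibroblast", "fibroblast", "T cell"],
    "GSVA": ["keratinocyte", "keratinocyte", "fibroblast", "T cell"],
}

if __name__ == "__main__":
    for name, score in evaluate(method_labels, manual_labels).items():
        print(f"{name}: {score:.2f}")
```

In this toy case each method mislabels a different cell, so the vote recovers the manual label every time. When the methods share the same errors, voting would not help, which is why the comparison against a manually labeled data set matters.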

Biomedical Sciences, Interdisciplinary
