Also available at: https://digitalcommons.njit.edu/dissertations/1716/

The New Jersey Institute of Technology's
Electronic Theses & Dissertations Project

Title: Model-based deep autoencoders for clustering single-cell RNA sequencing data with side information
Author: Lin, Xiang
View Online: njit-etd2023-057
(xvi, 121 pages ~ 7.0 MB pdf)
Department: Department of Computer Science
Degree: Doctor of Philosophy
Program: Computer Science
Document Type: Dissertation
Advisory Committee: Wei, Zhi (Committee chair)
Koutis, Ioannis (Committee member)
Guo, Wenge (Committee member)
Wang, Junwen (Committee member)
Gao, Nan (Committee member)
Ma, Yao (Committee member)
Date: 2023-12
Keywords: Autoencoder
Clustering
Deep learning
Multimodality
Semi-supervised learning
Single-cell
Availability: Unrestricted
Abstract:

Clustering analysis has been conducted extensively in single-cell RNA sequencing (scRNA-seq) studies. scRNA-seq can profile the activity of tens of thousands of genes within a single cell, and a typical scRNA-seq experiment captures thousands or tens of thousands of cells simultaneously. Biologists cluster these cells to explore and elucidate cell types and subtypes. Numerous methods have been designed for clustering scRNA-seq data. Yet single-cell technologies have developed so quickly in the past few years that existing methods have not kept up with these changes and fail to reach their full potential. For instance, besides profiling the transcriptional expression levels of genes, recent single-cell technologies can capture auxiliary information at the single-cell level, such as protein expression (multi-omics scRNA-seq) and the spatial location of cells (spatially resolved scRNA-seq). Most existing clustering methods for scRNA-seq are unsupervised and fail to exploit this side information to improve clustering performance.

This dissertation focuses on developing novel computational methods for clustering scRNA-seq data. The basic models are built on a deep autoencoder (AE) framework coupled with a zero-inflated negative binomial (ZINB) loss, which characterizes the zero-inflated and over-dispersed scRNA-seq count data. To integrate multi-omics scRNA-seq data, a multimodal autoencoder (MAE) is employed. It applies one encoder to the multimodal inputs and two decoders, one for reconstructing each omics modality. This model is named scMDC (Single-Cell Multi-omics Deep Clustering). In addition, cells in spatial proximity are expected to be of the same cell type. To exploit the cellular spatial information available in spatially resolved scRNA-seq (sp-scRNA-seq) data, a novel model, DSSC (Deep Spatial-constrained Single-cell Clustering), is developed. DSSC integrates the spatial information of cells into the clustering process in two steps: 1) the spatial information is encoded using a graph neural network model; 2) cell-to-cell constraints are built from the spatial expression patterns of marker genes and added to the model to guide the clustering process. DSSC is the first model that can use information from both the spatial coordinates and the marker genes to guide cell/spot clustering. For both scMDC and DSSC, a clustering loss is optimized on the bottleneck layer of the autoencoder jointly with the learning of the feature representation. Extensive experiments on both simulated and real datasets demonstrate that scMDC and DSSC boost clustering performance significantly while requiring no extra time or space during training. These models hold great promise as valuable tools for harnessing the full potential of state-of-the-art single-cell data.
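
As an illustration of the ZINB loss mentioned in the abstract, the following is a minimal sketch of the negative log-likelihood of a single count under a zero-inflated negative binomial distribution. It is not taken from the dissertation: the function names `nb_log_pmf` and `zinb_nll` and the parameter names `mu` (mean), `theta` (dispersion), and `pi` (zero-inflation probability) are assumptions chosen for this example, and the standard mean/dispersion parameterization is assumed. In an autoencoder setting these parameters would typically be produced by the decoder for each gene and cell; here they are passed in directly.

```python
import math


def nb_log_pmf(x, mu, theta):
    """Log-probability of count x under a negative binomial
    with mean mu > 0 and dispersion theta > 0 (assumed parameterization)."""
    return (
        math.lgamma(x + theta) - math.lgamma(theta) - math.lgamma(x + 1)
        + theta * math.log(theta / (theta + mu))
        + x * math.log(mu / (theta + mu))
    )


def zinb_nll(x, mu, theta, pi):
    """Negative log-likelihood of count x under a ZINB distribution.

    With probability pi the count is a structural zero; otherwise it is
    drawn from the negative binomial component.
    """
    nb_prob = math.exp(nb_log_pmf(x, mu, theta))
    if x == 0:
        # A zero can come from the inflation component or from the NB itself.
        prob = pi + (1.0 - pi) * nb_prob
    else:
        # A positive count can only come from the NB component.
        prob = (1.0 - pi) * nb_prob
    return -math.log(prob)


# Hypothetical parameter values, for illustration only.
loss_zero = zinb_nll(0, mu=2.0, theta=1.5, pi=0.3)
loss_five = zinb_nll(5, mu=2.0, theta=1.5, pi=0.3)
```

Setting `pi` to 0 reduces the expression to the ordinary negative binomial likelihood, which is one way to see how the zero-inflation term accounts for the excess zeros in count data.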


If you have any questions please contact the ETD Team, libetd@njit.edu.

 
