Articles via Databases
Articles via Journals
Online Catalog
E-books
Research & Information Literacy
Interlibrary loan
Theses & Dissertations
Collections
Policies
Services
About / Contact Us
Administration
Littman Architecture Library
This site will be removed in January 2019, please change your bookmarks.
This page will redirect to https://digitalcommons.njit.edu/dissertations/169 in 5 seconds

The New Jersey Institute of Technology's
Electronic Theses & Dissertations Project

Title: Computational methods for the analysis of next generation sequencing data
Author: Wang, Wei
View Online: njit-etd2014-057
(xvi, 93 pages ~ 4.2 MB pdf)
Department: Department of Computer Science
Degree: Doctor of Philosophy
Program: Computer Science
Document Type: Dissertation
Advisory Committee: Wei, Zhi (Committee chair)
Wang, Jason T. L. (Committee member)
Roshan, Usman W. (Committee member)
Oria, Vincent (Committee member)
Zhao, Zhigen (Committee member)
Date: 2014-05
Keywords: Next generation sequencing
Change-point model
Bioinformatics
Variant detection
Statistical modeliing
Alternate polyadenlyation
Availability: Unrestricted
Abstract:

Recently, next generation sequencing (NGS) technology has emerged as a powerful approach and dramatically transformed biomedical research in an unprecedented scale. NGS is expected to replace the traditional hybridization-based microarray technology because of its affordable cost and high digital resolution. Although NGS has significantly extended the ability to study the human genome and to better understand the biology of genomes, the new technology has required profound changes to the data analysis. There is a substantial need for computational methods that allow a convenient analysis of these overwhelmingly high-throughput data sets and address an increasing number of compelling biological questions which are now approachable by NGS technology.

This dissertation focuses on the development of computational methods for NGS data analyses. First, two methods are developed and implemented for detecting variants in analysis of individual or pooled DNA sequencing data. SNVer formulates variant calling as a hypothesis testing problem and employs a binomial-binomial model to test the significance of observed allele frequency by taking account of sequencing error. SNVerGUI is a GUI-based desktop tool that is built upon the SNVer model to facilitate the main users of NGS data, such as biologists, geneticists and clinicians who often lack of the programming expertise. Second, collapsing singletons strategy is explored for associating rare variants in a DNA sequencing study. Specifically, a gene-based genome-wide scan based on singleton collapsing is performed to analyze a whole genome sequencing data set, suggesting that collapsing singletons may boost signals for association studies of rare variants in sequencing study. Third, two approaches are proposed to address the 3’UTR switching problem. PolyASeeker is a novel bioinformatics pipeline for identifying polyadenylation cleavage sites from RNA sequencing data, which helps to enhance the knowledge of alternative polyadenylation mechanisms and their roles in gene regulation. A change-point model based on a likelihood ratio test is also proposed to solve such problem in analysis of RNA sequencing data. To date, this is the first method for detecting 3’UTR switching without relying on any prior knowledge of polyadenylation cleavage sites.


If you have any questions please contact the ETD Team, libetd@njit.edu.

 
ETD Information
Digital Commons @ NJIT
Theses and DIssertations
ETD Policies & Procedures
ETD FAQ's
ETD home

Request a Scan
NDLTD

NJIT's ETD project was given an ACRL/NJ Technology Innovation Honorable Mention Award in spring 2003