This page will redirect to https://digitalcommons.njit.edu/dissertations/1475/ in 5 seconds

The New Jersey Institute of Technology's
Electronic Theses & Dissertations Project

Title: Hybrid deep neural networks for mining heterogeneous data
Author: Hou, Xiurui
View Online: njit-etd2020-030
(xii, 66 pages ~ 2.2 MB pdf)
Department: Department of Computer Science
Degree: Doctor of Philosophy
Program: Computer Science
Document Type: Dissertation
Advisory Committee: Wei, Zhi (Committee chair)
Calvin, James M. (Committee member)
Basu Roy, Senjuti (Committee member)
Wang, Antai (Committee member)
Guo, Wenge (Committee member)
Date: 2020-08
Keywords: Data mining
Hybrid deep neural network
Machine learning
Availability: Unrestricted
Abstract:

In the era of big data, the rapidly growing flood of data represents an immense opportunity. New computational methods are needed to fully leverage the potential of massive structured and unstructured data. However, decision-makers are often confronted with multiple heterogeneous data sources. The heterogeneity includes different data types, different granularities, and different dimensions, posing a fundamental challenge in many applications. This dissertation focuses on designing hybrid deep neural networks for modeling various kinds of data heterogeneity.

The first part of this dissertation concerns modeling diverse data types, the first kind of data heterogeneity. Specifically, image data and heterogeneous metadata are modeled jointly. Detecting Copy Number Variations (CNVs) in genetic studies is used as a motivating example. A CNN-DNN blended neural network is proposed to authenticate CNV calls made by current state-of-the-art CNV detection algorithms. It uses hybrid deep neural networks to leverage both scatter-plot image signals and heterogeneous numerical metadata, improving CNV calling and review efficiency.
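
As a rough illustration of what blending a CNN with a DNN can look like, the following minimal sketch runs an image through a small convolutional branch, runs numerical metadata through a dense branch, and concatenates the two feature vectors before a logistic output. It is not the dissertation's architecture: the layer sizes, the fusion by concatenation, the random weights, and all names (`cnn_dnn_forward`, `random_params`, and so on) are assumptions made for illustration only.

```python
import math
import random

def conv2d_valid(image, kernel):
    """Valid (no padding) 2-D cross-correlation of one channel with one kernel."""
    kh, kw = len(kernel), len(kernel[0])
    out_h, out_w = len(image) - kh + 1, len(image[0]) - kw + 1
    return [[sum(image[i + a][j + b] * kernel[a][b]
                 for a in range(kh) for b in range(kw))
             for j in range(out_w)]
            for i in range(out_h)]

def relu(x):
    return max(0.0, x)

def dense(inputs, weights, bias):
    """Fully connected layer: one weight row per output unit."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, bias)]

def cnn_dnn_forward(image, meta, params):
    # Image branch: one feature per kernel (convolution, ReLU, global max pooling).
    image_feats = [max(relu(v) for row in conv2d_valid(image, k) for v in row)
                   for k in params["kernels"]]
    # Metadata branch: a single hidden dense layer with ReLU.
    meta_feats = [relu(v) for v in dense(meta, params["meta_w"], params["meta_b"])]
    # Fusion (assumed): concatenate both feature vectors, then a logistic unit.
    fused = image_feats + meta_feats
    logit = dense(fused, [params["out_w"]], [params["out_b"]])[0]
    return 1.0 / (1.0 + math.exp(-logit))

def random_params(n_kernels, ksize, n_meta, n_hidden, seed=0):
    """Hypothetical untrained weights, only so that the sketch can run."""
    rng = random.Random(seed)

    def u():
        return rng.uniform(-0.5, 0.5)

    return {
        "kernels": [[[u() for _ in range(ksize)] for _ in range(ksize)]
                    for _ in range(n_kernels)],
        "meta_w": [[u() for _ in range(n_meta)] for _ in range(n_hidden)],
        "meta_b": [0.0] * n_hidden,
        "out_w": [u() for _ in range(n_kernels + n_hidden)],
        "out_b": 0.0,
    }

# Toy inputs: an 8x8 "scatter plot" image and four numerical metadata values.
params = random_params(n_kernels=2, ksize=3, n_meta=4, n_hidden=3)
image = [[((i * j) % 5) / 4.0 for j in range(8)] for i in range(8)]
meta = [0.2, 1.5, -0.3, 0.7]
prob = cnn_dnn_forward(image, meta, params)
print(f"score for the candidate call: {prob:.3f}")
```

With untrained weights the score carries no meaning; the point is only that both input types contribute to a single prediction.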

The second part of this dissertation deals with data of various frequencies or scales in time-series analysis, the second kind of data heterogeneity. The stock return forecasting problem in finance is used as a motivating example. A hybrid framework of Long Short-Term Memory and Deep Neural Network (LSTM-DNN) is developed to enrich the time-series forecasting task with static fundamental information. The proposed framework is not limited to stock return forecasting; it applies to any time-series-based prediction task.
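
One way to picture an LSTM-DNN hybrid is sketched below: a single-layer LSTM summarizes the time series in its final hidden state, a dense layer encodes the static features, and the two are concatenated before a linear forecasting head. This is a minimal sketch under stated assumptions, not the dissertation's model: the fusion by concatenation, the layer sizes, the random weights, and every name here (`lstm_dnn_forward`, `init`, and so on) are hypothetical.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def matvec(matrix, vector):
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

def layer(weights_and_bias, inputs, activation):
    """Affine map followed by an element-wise activation."""
    weights, bias = weights_and_bias
    return [activation(v + b) for v, b in zip(matvec(weights, inputs), bias)]

def lstm_last_hidden(sequence, params, hidden_size):
    """Run a single-layer LSTM over `sequence` and return its final hidden state."""
    h = [0.0] * hidden_size
    c = [0.0] * hidden_size
    for x_t in sequence:
        z = list(x_t) + h  # each gate sees the current input and previous hidden state
        i = layer(params["input"], z, sigmoid)
        f = layer(params["forget"], z, sigmoid)
        o = layer(params["output"], z, sigmoid)
        g = layer(params["candidate"], z, math.tanh)
        c = [f_k * c_k + i_k * g_k for f_k, c_k, i_k, g_k in zip(f, c, i, g)]
        h = [o_k * math.tanh(c_k) for o_k, c_k in zip(o, c)]
    return h

def lstm_dnn_forward(sequence, static, params, hidden_size):
    h_last = lstm_last_hidden(sequence, params, hidden_size)              # temporal branch
    static_hidden = layer(params["static"], static, lambda v: max(0.0, v))  # static branch
    fused = h_last + static_hidden                                        # assumed fusion
    return layer(params["head"], fused, lambda v: v)[0]                   # linear forecast

def init(rows, cols, rng):
    """Hypothetical untrained weights (matrix, zero bias), only so the sketch runs."""
    return ([[rng.uniform(-0.3, 0.3) for _ in range(cols)] for _ in range(rows)],
            [0.0] * rows)

rng = random.Random(0)
n_in, n_hidden, n_static, n_static_hidden = 2, 4, 3, 2
params = {name: init(n_hidden, n_in + n_hidden, rng)
          for name in ("input", "forget", "output", "candidate")}
params["static"] = init(n_static_hidden, n_static, rng)
params["head"] = init(1, n_hidden + n_static_hidden, rng)

# Toy inputs: ten time steps of two high-frequency signals plus three static values.
sequence = [[0.01 * t, -0.02 * t] for t in range(10)]
static = [1.2, 0.4, -0.7]
forecast = lstm_dnn_forward(sequence, static, params, n_hidden)
print(f"forecast: {forecast:.4f}")
```

The static branch runs once per sample while the LSTM runs once per time step, which is what lets inputs observed at different frequencies share one model.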

The third part of this dissertation extends the LSTM-DNN framework to account for both temporal and spatial dependency among variables, which is common in many applications. For example, it is known that stock prices of related firms tend to fluctuate together. Such coherent price changes among related stocks are referred to as spatial dependency. In this part, a Variational Autoencoder (VAE) is first used to recover the latent graphical dependency structure among variables. Then a hybrid deep neural network combining a Graph Convolutional Network and a Long Short-Term Memory network (GCN-LSTM) is developed to model both the graph-structured spatial dependency and the temporal dependency of variables at different scales.
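
The spatial half of such a model can be illustrated with the widely used graph convolution rule H' = ReLU(Â H W), where Â is the adjacency matrix with self-loops added and symmetrically normalized by node degree. The sketch below applies that rule to each time step of a toy three-stock example. It makes several assumptions: the adjacency matrix is given by hand (in the dissertation the structure is recovered by a VAE), the weights are fixed, and the function names are hypothetical. Whether the dissertation uses exactly this propagation rule is not stated in the abstract.

```python
import math

def normalized_adjacency(adj):
    """Add self-loops, then scale entry (i, j) by 1 / sqrt(deg_i * deg_j)."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def gcn_layer(adj_norm, features, weights):
    """One graph convolution: mix neighbor features, project, apply ReLU."""
    mixed = matmul(matmul(adj_norm, features), weights)
    return [[max(0.0, v) for v in row] for row in mixed]

# Assumed dependency graph: stocks 0 and 1 are related, stock 2 is isolated.
adjacency = [[0.0, 1.0, 0.0],
             [1.0, 0.0, 0.0],
             [0.0, 0.0, 0.0]]
adj_norm = normalized_adjacency(adjacency)

# One feature (a return) per stock at each of two time steps.
snapshots = [
    [[0.02], [0.04], [-0.01]],
    [[0.01], [0.03], [0.05]],
]
weights = [[1.0]]  # identity projection, chosen so the mixing is easy to read

# Apply the same graph convolution at every time step; a recurrent layer
# such as an LSTM would then consume this sequence of node embeddings.
embedded = [gcn_layer(adj_norm, x_t, weights) for x_t in snapshots]
for t, nodes in enumerate(embedded):
    print(t, nodes)
```

In this toy graph the two connected stocks end up with the same averaged feature at each step, while the isolated stock keeps its own value (clipped at zero by the ReLU).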

Extensive experiments are conducted to demonstrate the effectiveness of the proposed neural networks on three representative real-world problems. The proposed frameworks can also be applied to other areas with similarly heterogeneous inputs.


If you have any questions please contact the ETD Team, libetd@njit.edu.

 

NJIT's ETD project was given an ACRL/NJ Technology Innovation Honorable Mention Award in spring 2003