Spatial and temporal dependencies are ubiquitous in data from numerous domains, and the popularity of spatial and temporal data mining has grown with the increasing prevalence of massive data. Spatial and temporal attributes not only provide complementary perspectives but also pose new challenges for representing these dependencies and integrating them into the learning procedure. In this dissertation, spatial and temporal dependencies are explored in three categories: sample-wise, feature-wise, and target-wise. A family of novel methodologies is developed accordingly to represent the dependencies in each scenario.
First, sample-wise dependencies among discrete, continuous, and repeated observations are studied using illustrative examples in urban computing and video clicks. Specifically, discrete Markov random field models and time-aware latent hierarchical models are developed to capture the underlying spatiotemporal interactions among different spots. In addition, an item-specific effect-aware method is proposed to model the consistent effects present in repeated observational records. Second, feature-wise spatiotemporal interactions are investigated within a deep learning framework, with applications to genomic sequences and audience logs. For spatial dependency among homogeneous features (e.g., genomic sequences), a customized convolutional neural network is used to capture the underlying motifs formed by spatial interactions. To better characterize spatiotemporal interactions among heterogeneous features, a blended learning scheme is established to track how the patterns involved evolve. For both kinds of feature-wise dependency, a saliency-map-based context analysis protocol is introduced to interpret and visualize how spatiotemporal attributes are associated with target responses. Lastly, this dissertation addresses the temporal dependence among target variables, with applications to competing risks in financial loans. A hierarchical grading framework is proposed to integrate two loan risks both qualitatively and quantitatively based on temporal constraints. The framework is then decomposed into multiple binary classification sub-problems. All of the proposed methods are evaluated through systematic experiments on synthetic data and real-world data repositories in various scenarios. The empirical results demonstrate favorable performance across these scenarios.
Taken together, this dissertation examines spatiotemporal data from three perspectives and develops practical, feasible schemes for representing spatial and temporal mechanisms.