
The New Jersey Institute of Technology's
Electronic Theses & Dissertations Project

Title: Parameter estimation and inference of spatial autoregressive model by stochastic gradient descent
Author: Luan, Gan
View Online: njit-etd2021-072
(xii, 100 pages ~ 1.0 MB pdf)
Department: Department of Mathematical Sciences
Degree: Doctor of Philosophy
Program: Mathematical Sciences
Document Type: Dissertation
Advisory Committee: Loh, Ji Meng (Committee chair)
Dhar, Sunil Kumar (Committee member)
Guo, Wenge (Committee member)
Subramanian, Sundarraman (Committee member)
Fang, Yixin (Committee member)
Date: 2021-12
Keywords: Bootstrap resampling
Computational complexity
Dependent data
Large-scale data
Machine learning
Online learning
Availability: Unrestricted
Abstract:

Stochastic gradient descent (SGD) is a popular iterative method for model parameter estimation in large-scale data and online learning settings because it passes through the data only once. While SGD has been well studied for independent data, its application to spatially correlated data remains largely unexplored. This dissertation develops SGD-based parameter estimation and statistical inference algorithms for the spatial autoregressive (SAR) model, a common model for spatial lattice data.
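
For orientation, the SAR model named above is usually written in the following textbook form. This is a sketch in standard notation, which is an assumption here: the dissertation's own symbols and parameterization may differ. Here $W$ is a known spatial weights matrix, $\rho$ is the spatial parameter, $\beta$ holds the covariate coefficients, and the errors are taken to be Gaussian.

```latex
% Standard SAR mean regression model (notation assumed, not taken from the dissertation)
y = \rho W y + X\beta + \varepsilon,
\qquad \varepsilon \sim N(0, \sigma^2 I_n)

% Corresponding Gaussian log-likelihood; the \log|I_n - \rho W| term is what
% distinguishes it from ordinary least squares
\ell(\rho, \beta, \sigma^2)
  = -\frac{n}{2}\log(2\pi\sigma^2)
    + \log\lvert I_n - \rho W \rvert
    - \frac{1}{2\sigma^2}\,\bigl\lVert (I_n - \rho W)\,y - X\beta \bigr\rVert^2
```

Setting $\rho = 0$ removes the spatial term and recovers the ordinary linear regression model with independent errors.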

This research contains three parts. (I) The first part concerns SGD estimation and inference for the SAR mean regression model. A new SGD algorithm based on the maximum likelihood estimator (MLE) is proposed to accommodate the spatial correlation in the SAR model. A statistical inference algorithm is also proposed, based on the online bootstrap resampling procedure (Fang et al., 2018). Asymptotic properties of the estimators are then developed, and their finite-sample properties are investigated by simulation. The SGD-based parameter estimation procedures are shown to be more than 40 times faster than MLE for the settings examined. The SGD estimators for all parameters are close to the true values. The empirical coverages of confidence intervals (CIs) are at the nominal levels for the coefficients of the covariates but not for the spatial parameter. Two methods are proposed to improve the empirical coverage of the CI for the spatial parameter. (II) The second part concerns the SAR quantile regression model. SGD algorithms based on one-stage quantile regression (1SQR) and two-stage quantile regression (2SQR) are developed for parameter estimation and statistical inference. Simulation results show that the SGD estimator based on 2SQR is unbiased while that based on 1SQR is biased. Also, the empirical coverages of CIs constructed using SGD based on 2SQR are all at the nominal levels. (III) In the last part, this research analyzes a real dataset on charges for medical services provided by physicians and healthcare professionals. Both SAR mean regression and quantile regression models are fitted to study the effect of location and other characteristics of medical facilities on medical prices. Modeling results show that the spatial correlation parameter is significantly different from 0 (the 95% CI is (-0.27, -0.23) for the mean regression), suggesting spatial correlation of medical charges. The models also find that charges depend on the total number of services provided yearly, gender of the provider, facility type, and whether the provider is in a metropolitan area.
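
The general idea behind one-pass SGD with online bootstrap inference, as mentioned in part (I), can be illustrated with a minimal sketch. This is not the dissertation's SAR algorithm: it assumes independent data and a plain least-squares model, uses averaged SGD, and perturbs each bootstrap copy with random Exp(1) weights (mean 1, variance 1). The function names `simulate_stream` and `sgd_online_bootstrap`, the step-size schedule, and all default values are hypothetical choices made for this illustration.

```python
import random
import statistics


def simulate_stream(n, seed=0):
    """Yield (x, y) pairs from y = 1 + 2*u + noise, with x = (1, u). Toy data only."""
    rng = random.Random(seed)
    for _ in range(n):
        u = rng.gauss(0.0, 1.0)
        y = 1.0 + 2.0 * u + rng.gauss(0.0, 1.0)
        yield (1.0, u), y


def sgd_online_bootstrap(stream, dim, n_boot=50, lr0=0.2, decay=0.6, seed=0):
    """Single pass of averaged SGD for least squares with perturbed bootstrap copies.

    Row 0 is the unperturbed estimate; rows 1..n_boot are bootstrap copies whose
    gradients are multiplied by random Exp(1) weights (an assumption of this sketch).
    """
    rng = random.Random(seed)
    theta = [[0.0] * dim for _ in range(n_boot + 1)]  # current iterates
    avg = [[0.0] * dim for _ in range(n_boot + 1)]    # running averages of iterates
    for n, (x, y) in enumerate(stream, start=1):
        lr = lr0 * n ** (-decay)  # decaying step size
        for b in range(n_boot + 1):
            w = 1.0 if b == 0 else rng.expovariate(1.0)
            resid = sum(t * xi for t, xi in zip(theta[b], x)) - y
            for j in range(dim):
                theta[b][j] -= lr * w * resid * x[j]
                avg[b][j] += (theta[b][j] - avg[b][j]) / n
    est = avg[0]
    # Spread of the bootstrap copies serves as a standard-error estimate.
    se = [statistics.stdev([avg[b][j] for b in range(1, n_boot + 1)])
          for j in range(dim)]
    ci = [(e - 1.96 * s, e + 1.96 * s) for e, s in zip(est, se)]
    return est, ci


if __name__ == "__main__":
    est, ci = sgd_online_bootstrap(simulate_stream(3000, seed=1), dim=2, seed=2)
    print("estimates:", est)
    print("approximate 95% CIs:", ci)
```

Each observation is used once and then discarded, and the bootstrap copies are updated in the same pass, so no data need to be stored or revisited. Extending this to the SAR model would require handling the spatial dependence in the gradient, which this sketch does not attempt.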


