This dissertation develops a parcel-based spatial land use change prediction model by coupling machine learning and interpretation algorithms such as cellular automata (CA) and decision trees (DT). A CA is a collection of cells that evolves through a number of discrete time steps according to a set of transition rules based on the state of each cell and the characteristics of its neighboring cells. A DT is a data mining and machine learning tool that extracts the patterns of a decision process from observed cell behaviors and the factors affecting them. In this dissertation, CA is used to predict the future land use status of cadastral parcels based on transition rules that DT derives from a set of identified land use change driving factors. Although CA and DT have been applied separately in various land use change models in the literature, no studies have attempted to integrate them. The DT-based CA model developed in this dissertation represents the first such integration in land use change modeling. The coupled model can handle a large set of driving factors and avoids subjective bias when deriving the transition rules. It uses the cadastral parcel as the unit of analysis, which has practical policy implications because the responses of land use to various policies usually take place at the parcel level. Because parcels vary in size and shape, using them as the unit of analysis makes it difficult to apply CA, which was originally designed to handle regular grid cells. This dissertation improves the treatment of irregular cells in CA-based land use change models in the literature by defining a cell's neighborhood as a fixed-distance buffer along the parcel boundary.
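The buffer-based neighborhood and the synchronous CA update described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the dissertation's implementation: parcels are simplified to axis-aligned rectangles, the neighborhood distribution is computed by parcel count rather than by area, and the names `Parcel`, `gap`, `neighbors`, `neighborhood_shares`, `step` and `demo_rule` (including its 0.5 threshold) are hypothetical.

```python
from collections import Counter
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Parcel:
    """Hypothetical parcel: an axis-aligned rectangle (feet) with one land use."""
    pid: str
    xmin: float
    ymin: float
    xmax: float
    ymax: float
    land_use: str


def gap(a, b):
    """Shortest distance between two rectangles (0 if they touch or overlap)."""
    dx = max(a.xmin - b.xmax, b.xmin - a.xmax, 0.0)
    dy = max(a.ymin - b.ymax, b.ymin - a.ymax, 0.0)
    return (dx * dx + dy * dy) ** 0.5


def neighbors(parcel, parcels, buffer_ft):
    """Parcels that fall within a fixed-distance buffer of the parcel boundary."""
    return [p for p in parcels
            if p.pid != parcel.pid and gap(parcel, p) <= buffer_ft]


def neighborhood_shares(parcel, parcels, buffer_ft):
    """Share of each land use among neighboring parcels (by count, an assumption)."""
    counts = Counter(p.land_use for p in neighbors(parcel, parcels, buffer_ft))
    total = sum(counts.values())
    return {use: n / total for use, n in counts.items()} if total else {}


def step(parcels, buffer_ft, rule):
    """One synchronous CA step: every parcel is updated from the same old state."""
    return [replace(p, land_use=rule(p, neighborhood_shares(p, parcels, buffer_ft)))
            for p in parcels]


def demo_rule(parcel, shares):
    """Hypothetical transition rule, standing in for DT-derived rules."""
    if parcel.land_use == "agriculture" and shares.get("urban", 0.0) >= 0.5:
        return "urban"
    return parcel.land_use


demo_parcels = [
    Parcel("A", 0, 0, 100, 100, "urban"),
    Parcel("B", 150, 0, 300, 100, "agriculture"),   # 50 ft from A
    Parcel("C", 1000, 0, 1200, 100, "forest"),      # 700 ft from B
]
next_state = step(demo_parcels, 475.0, demo_rule)
```

With real cadastral data, the rectangle distance would be replaced by a polygon buffer operation in a GIS, and the rule function by the transition rules extracted with DT.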
The DT-based CA model was developed and validated in Hunterdon County, New Jersey. Data on historical land uses and various land use change driving factors for Hunterdon County were collected and processed using a Geographic Information System (GIS). Specifically, the county land use maps for 1986, 1995 and 2002 were overlaid with a parcel map to create parcel-based land use maps. The single land use assigned to each parcel is based on a classification scheme developed through literature review and empirical testing in the study area. The possible land use states considered for each parcel are agriculture, barren land, forest, urban, water and wetlands, following the land use/land cover classification of the New Jersey Department of Environmental Protection. The identified driving factors for the future status of a parcel include the present land use type, the number of soil restrictions to urban development, the size of the parcel, the amount of wetlands within the parcel, the distribution of land uses in the neighborhood of the parcel, and the distances to the nearest streams, urban centers and major roads. A set of transition rules describing the land use change processes during the period 1986-1995 was developed using the DT software J48 Classifier. The derived transition rules were applied to the 1995 land use data in a CA model implemented in Agent Analyst/RePast (Recursive Porous Agent Simulation Toolkit) to predict the spatial land use pattern in 2004, which was then validated against the actual land use map for 2002. The DT-based CA model had an overall accuracy of 84.46 percent in terms of the number of parcels and 80.92 percent in terms of total acreage in predicting land use changes. The model shows a much higher capacity for predicting quantitative changes than locational changes in land use.
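The two accuracy measures reported above, one by number of parcels and one by acreage, can be illustrated with a short sketch. The function name `overall_accuracy` and the four toy records are hypothetical and are not data from the study; the sketch only shows how the two measures can diverge when misclassified parcels differ in size.

```python
def overall_accuracy(records):
    """Return (accuracy by parcel count, accuracy by acreage).

    records: iterable of (predicted_use, actual_use, acres) tuples.
    """
    n = n_ok = 0
    acres_total = acres_ok = 0.0
    for predicted, actual, acres in records:
        n += 1
        acres_total += acres
        if predicted == actual:
            n_ok += 1
            acres_ok += acres
    return n_ok / n, acres_ok / acres_total


# Toy records (illustrative only): one 8-acre parcel is misclassified.
toy_records = [
    ("urban", "urban", 2.0),
    ("forest", "forest", 10.0),
    ("urban", "agriculture", 8.0),
    ("agriculture", "agriculture", 20.0),
]
by_count, by_acres = overall_accuracy(toy_records)
```

In this toy case three of four parcels are correct (0.75 by count), while 32 of 40 acres are correct (0.80 by acreage).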
The validated model was applied to simulate the 2011 land use patterns in Hunterdon County based on its actual land uses in 2002 under both "business as usual" and policy scenarios. The simulation results show that successfully implementing current land use policies such as down zoning and open space and farmland preservation would protect a total of 7,053 acres (741 acres of wetlands, 3,034 acres of agricultural land, 250 acres of barren land, and 3,028 acres of forest) from future urban development in Hunterdon County during the period 2002-2011. The neighborhood of a parcel was defined by a 475-foot buffer along the parcel boundary in this study. Sensitivity analyses using two additional neighborhoods (237- and 712-foot buffers) indicate that neighborhood size has an insignificant impact on the model outputs in this application.
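As a quick arithmetic check, the acreage breakdown reported above can be summed against the stated total. The figures are taken from the text; the variable names are hypothetical.

```python
# Acres protected from urban development under the policy scenario, 2002-2011,
# by original land use (figures as reported in the text above).
protected_acres = {
    "wetlands": 741,
    "agriculture": 3034,
    "barren land": 250,
    "forest": 3028,
}
total_protected = sum(protected_acres.values())
```

The components sum to the reported total of 7,053 acres.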