::install_github("LukeCe/spflow", force = TRUE) devtools
In-class Exercise 5
Load the necessary packages
Code chunk below is used to install the latest spflow
package.
Next, we will load spflow and other R packages into R environment.
::p_load(tmap,sf,spdep,sp,spflow,Matrix,reshape2,knitr,tidyverse) pacman
Data Preparation
Before we can calibrate the Spatial Econometric Interaction Models by using spflow package, three data sets are required. They are:
- a spatial weight
- a tibble data.frame which consists of the origins, destination, flows and distances between the origins and destination, and
- a tibble data.frame which consists the explanatory variables
Building the geographical area
For the purpose of this study, URA Master Planning 2019 Planning Subzone GIS data will be used.
In the code chunk below, MPSZ-2019 shapefile will be imported into R environment as a sf tibble data frame callled mpsz.
<- st_read(dsn = "data/geospatial",
mpsz layer = "MPSZ-2019") %>%
st_transform(crs = 3414)
<- st_read(dsn = "data/geospatial",
busstop layer = "BusStop") %>%
st_transform(crs = 3414)
In this study, our analysis will be focused on planing subzone with busstop. In view of this, the code chunk below will be used to perform Point-in-Polygon count analysis.
$`BUSSTOP_COUNT` <- lengths(st_intersects(mpsz, busstop)) mpsz
<- mpsz %>%
mpsz_busstop filter(BUSSTOP_COUNT >0)
mpsz_busstop
Preparing the Spatial Weights
There are three different matices that can be used to describe the connectivity between planning subzone. They are: continguity, fixed distance and adaptive distance.
Code chunk below will be used to compute the three spatial weights at one go.
<- suppressWarnings({
centriods st_point_on_surface(st_geometry(mpsz_busstop))}
)
<- list(
mpsz_nb "by_contiguity" = poly2nb(mpsz_busstop),
"by_distance" = dnearneigh(centriods,
d1=0, d2=5000),
"by_knn" = knn2nb(knearneigh(centriods, 3)) # find the 3 nearest neighbours
)
mpsz_nb
Preparing The Flow Data
In this section, we will prepare the flow data at the planning subzone level as shown in the screenshot below.
<- read_rds("data/rds/odbus6_9.rds") odbus6_9
<- st_intersection(busstop, mpsz) %>%
busstop_mpsz select(BUS_STOP_N, SUBZONE_C) %>%
st_drop_geometry()
Before continuing, it is a good practice for us to check for duplicate records.
<- od_data %>%
duplicate group_by_all() %>%
filter(n()>1)%>%
ungroup()
If duplicated records are found, the code chunk below will be used to retain the unique records.
<- unique(od_data) od_data
It will be a good practice to confirm if the duplicating records issue has been addressed fully.
[To be continued]
Lets retrieve by using the code chunk below
<- read_rds("data/rds/mpsz_nb.rds")
mpsz_nb <- read_rds("data/rds/mpsz_flow.rds")
mpsz_flow <- read_rds("data/rds/mpsz_var.rds") mpsz_var
Creating spflow_network-class
objects
spflow_network-class
is an S4 class that contains all information on a spatial network which is composed by a set of nodes that are linked by some neighborhood relation. It can be created by using spflow_netwrork()
of spflow package.
For our model, we choose the contiguity based neighbourhood structure.
<- spflow_network(
mpsz_net id_net = "sg",
node_neighbourhood =
nb2mat(mpsz_nb$by_contiguity),
node_data = mpsz_var,
node_key_column = "SZ_CODE")
mpsz_net
Creating spflow_network-class
object
spflow_network-class
object is an S4 class which holds information on origin-destination (OD) pairs. Each OD pair is composed of two nodes, each belonging to one network. All origin nodes must belonging to the same origin network should be contained in one spflow_network-class
onject and likewise for the destination.
<- spflow_network_pair(
mpsz_net_pairs id_orig_net = "sg",
id_dest_net = "sg",
pair_data = mpsz_flow,
orig_key_column = "ORIGIN_SZ",
dest_key_column = "DESTIN_SZ")
mpsz_net_pairs
Creating sp_multi_network-class
object
The sp_multi_network-class
combines information on the nodes and the node-pairs and also ensures that both data sources are consistent. For example, if some of the origins in the sp_network_pair-class are not identified with the nodes in the sp_network_nodes-class an error will be raised.
<- spflow_network_multi(mpsz_net,
mpsz_multi_net
mpsz_net_pairs) mpsz_multi_net
<- log(1 + TRIPS)~
cor_formula +
BUSSTOP_COUNT +
AGE7_12 +
AGE13_24 +
AGE25_64 +
SCHOOL_COUNT +
BUSINESS_COUNT +
RETAILS_COUNT +
FINSERV_COUNT P_(log(DISTANCE +1))
<- pair_cor(
cor_mat
mpsz_multi_net,spflow_formula = cor_formula,
add_lags_x = FALSE)
colnames(cor_mat) <- paste0( #label the variables
substr(
colnames(cor_mat),1,3),"...")
cor_image(cor_mat) # to construct the correlation matrix
The Base Model
<- sp_flow(
base_model spflow_formula = log(1 + TRIPS) ~
O_(BUSSTOP_COUNT +
+
AGE25_64) D_(SCHOOL_COUNT +
+
BUSINESS_COUNT +
RETAILS_COUNT +
FINSERV_COUNT) P_(log(DISTANCE +1)),
spflow_networks = mpsz_multi_net)
base_model
In the code chunk below, spflow_moran_plots()
is used.
<- par(mfrow = c(1.3),
old_par mar = c(2,2,2,2))
spflow_moran_plots(base_model)
par(old_par)
Next, pair_cor()
can be used to inspect the relationship of the residual and the explanatory variables by using the code chunk below.
<- pair_cor(base_model)
corr_residual colnames(corr_residual) <- substr(colnames(corr_residual),1,3)
cor_image(corr_residual)
Working with Model Control
<- log(1+TRIPS)~
spflow_formula O_(BUSSTOP_COUNT +
+
AGE25_64) D_(SCHOOL_COUNT +
+
BUSINESS_COUNT +
RETAILS_COUNT +
FINSERV_COUNT) P_(log(DISTANCE + 1))
<- spflow_control(
model_control estimation_method = "mle",
model = "model_8")
<- spflow(
mle_model8
spflow_formula,spflow_networks = mpsz_multi_net,
estimation_control = model_control)
mle_model8
<- par(nfrow = c(1,3),
old_par mar = c(2,2,2,2))
spflow_moran_plots(mle_model8)
par(old_par)