Arguments
x
An object of class scan to be plotted.
...
Additional graphical parameters passed to the plot function.
ccol
Fill color of the plotted points. Default is NULL, indicating red for the most
likely cluster, and col = 3, 4, ..., up to the remaining number of clusters.
cpch
Plotting character to use for points in each cluster. Default is NULL, indicating
pch = 20 for the most likely cluster and then pch = 2, 3, ..., up to the remaining
number of clusters.
add
A logical indicating whether results should be drawn on an existing map.
usemap
Logical indicating whether the maps::map function should be used to create a
plot background for the coordinates. Default is FALSE. Use TRUE if you have
longitude/latitude coordinates.
mapargs
A list of arguments for the map function.
See Also
map
Examples
data(nydf)
coords = with(nydf, cbind(longitude, latitude))
out = scan.test(coords = coords, cases = floor(nydf$cases),
pop = nydf$pop, nsim = 49,
lonlat = TRUE, alpha = 0.12,
parallel = FALSE)
## plot output for new york state
# specify desired argument values
mapargs = list(database = "state", region = "new york",
xlim = range(out$coords[,1]), ylim = range(out$coords[,2]))
# needed for "state" database (unless you execute library(maps))
data(stateMapEnv, package = "maps")
plot(out, usemap = TRUE, mapargs = mapargs)
plot.tango
Plots an object of class tango.
Description
Plots results of tango.test. If Monte Carlo simulation was not used to produce x, then a density
plot of the (approximate) null distribution of tstat.chisq is produced, along with a vertical line
for the observed tstat. If a Monte Carlo test was used to produce x, then a scatterplot of
gof.sim versus sa.sim is drawn, with the observed values gof and sa overlaid for comparison.
Usage
## S3 method for class 'tango'
plot(x, ..., obs.list = list(pch = 20), sim.list = list(pch = 2))
Arguments
x
An object of class tango to be plotted.
...
Additional graphical parameters passed to the plot function.
obs.list
A list containing arguments for the points function, which is used to plot the
gof and sa components, when appropriate.
sim.list
A list containing arguments for the points function, which is used to plot the
gof.sim and sa.sim components, when appropriate.
See Also
tango.test
Examples
data(nydf)
coords = as.matrix(nydf[,c("x", "y")])
w = dweights(coords, kappa = 1)
x1 = tango.test(nydf$cases, nydf$pop, w)
plot(x1)
x2 = tango.test(nydf$cases, nydf$pop, w, nsim = 49)
plot(x2)
scan.stat
Scan Statistic
Description
scan.stat calculates the scan statistic for various distributions.
Usage
scan.stat(yin, ein, eout, ty, type = "poisson")
Arguments
yin
The sum of the response values inside the window. Generally, the sum of the
cases.
ein
The expected value of the response in the window. Generally, the estimated
overall risk for all regions combined, multiplied by the population size of the
window.
eout
The expected value of the response outside the window.
ty
The sum of all responses in the study area. Generally, the total number of cases.
type
The type of scan statistic to implement. Currently, only "poisson" is implemented.
Value
A vector of scan statistics.
Author(s)
Joshua French
References
Kulldorff, M. (1997) A spatial scan statistic. Communications in Statistics – Theory and Methods
26, 1481-1496.
Examples
# statistic for most likely cluster of New York leukemia data
scan.stat(106, 62.13, 552 - 62.13, 552)
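As a rough illustration of what the Poisson statistic measures, the sketch below computes the Poisson log-likelihood ratio of Kulldorff (1997) in base R. The helper name poisson_llr is hypothetical, and it is an assumption (not confirmed by this manual) that scan.stat returns the same quantity; treat it as a sketch of the idea rather than the package's implementation.

```r
# Hypothetical helper (not part of the package): Kulldorff's Poisson
# log-likelihood ratio for one or more candidate windows.
poisson_llr <- function(yin, ein, eout, ty) {
  yout <- ty - yin                                  # responses outside the window
  # y * log(y / e), with the convention that the term is 0 when y = 0
  term <- function(y, e) ifelse(y > 0, y * log(y / e), 0)
  llr <- term(yin, ein) + term(yout, eout)
  # only windows with elevated risk inside count as potential clusters
  llr[yin / ein <= yout / eout] <- 0
  llr
}

# same inputs as the scan.stat example above
poisson_llr(106, 62.13, 552 - 62.13, 552)
```

The arguments are vectorized, so several candidate windows can be evaluated in one call by passing vectors for yin, ein, and eout.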
scan.test
Spatial Scan Test
Description
scan.test performs the spatial scan test of Kulldorff (1997).
Usage
scan.test(coords, cases, pop, ex = sum(cases)/sum(pop) * pop, nsim = 499,
alpha = 0.1, nreport = nsim + 1, ubpop = 0.5, lonlat = FALSE,
parallel = TRUE, type = "poisson")
Arguments
coords
An n × 2 matrix of centroid coordinates for the regions.
cases
The number of cases observed in each region.
pop
The population size associated with each region.
ex
The expected number of cases for each region. The default is calculated under
the constant risk hypothesis.
nsim
The number of simulations from which to compute the p-value.
alpha
The significance level to determine whether a cluster is significant. Default is
0.10.
nreport
The frequency with which to report simulation progress. The default is nsim + 1,
meaning no progress will be displayed.
ubpop
The upper bound of the proportion of the total population to consider for a cluster.
lonlat
The default is FALSE, which specifies that Euclidean distance should be used. If
lonlat is TRUE, then the great circle distance is used to calculate the intercentroid
distance.
parallel
A logical indicating whether the test should be parallelized using the
parallel::mclapply function. Default is TRUE. If TRUE, no progress will be reported.
type
The type of scan statistic to implement. Default is "poisson". Only "poisson"
is currently implemented.
Details
The test is performed using the spatial scan test based on the Poisson test statistic and a fixed number
of cases. Candidate zones are circular and extend from the observed data locations. The clusters
returned are non-overlapping, ordered from most significant to least significant. The first cluster is
the most likely to be a cluster. If no significant clusters are found, then the most likely cluster is
returned (along with a warning).
Value
Returns a list of length two of class scan. The first element (clusters) is a list containing the
significant, non-overlapping clusters, and has the following components:
locids
The location ids of regions in a significant cluster.
coords
The centroid of the significant clusters.
r
The radius of the cluster (the largest intercentroid distance for regions in the
cluster).
pop
The total population of the regions in the cluster.
cases
The observed number of cases in the cluster.
expected
The expected number of cases in the cluster.
smr
Standardized mortality ratio (observed/expected) in the cluster.
rr
Relative risk in the cluster.
loglikrat
The log-likelihood ratio for the cluster (i.e., the log of the test statistic).
pvalue
The p-value of the test statistic associated with the cluster.
The second element of the list is the centroid coordinates. This is needed for plotting purposes.
Author(s)
Joshua French
References
Waller, L.A. and Gotway, C.A. (2005). Applied Spatial Statistics for Public Health Data. Hoboken,
NJ: Wiley.
Kulldorff, M. (1997) A spatial scan statistic. Communications in Statistics – Theory and Methods
26, 1481-1496.
See Also
scan.stat, plot.scan, uls.test, flex.test, dmst.test, bn.test
Examples
data(nydf)
coords = with(nydf, cbind(longitude, latitude))
out = scan.test(coords = coords, cases = floor(nydf$cases),
pop = nydf$pop, nsim = 49,
alpha = 0.12, lonlat = TRUE)
## plot output for new york state
# specify desired argument values
mapargs = list(database = "state", region = "new york",
xlim = range(out$coords[,1]), ylim = range(out$coords[,2]))
# needed for "state" database (unless you execute library(maps))
data(stateMapEnv, package = "maps")
plot(out, usemap = TRUE, mapargs = mapargs)
# a second example to match the results of Waller and Gotway (2005)
# in chapter 7 of their book (pp. 220-221).
# Note that the 'longitude' and 'latitude' used by them have
# been switched. When giving their input to SaTScan, the coords
# were given in the order 'longitude' and 'latitude'.
# However, the SaTScan program takes coordinates in the order
# 'latitude' and 'longitude', so the results are slightly different
# from the example above.
coords = with(nydf, cbind(y, x))
out2 = scan.test(coords = coords, cases = floor(nydf$cases),
pop = nydf$pop, nsim = 49,
alpha = 0.5, lonlat = TRUE)
# the cases observed for the clusters in Waller and Gotway: 117, 47, 44
# the second set of results match
c(out2$clusters[[1]]$cases, out2$clusters[[2]]$cases, out2$clusters[[3]]$cases)
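The default for ex and the smr component described above can be reproduced by hand. The base R sketch below uses made-up counts (not the nydf data) and a hypothetical two-region zone, purely to show the arithmetic behind the constant risk hypothesis.

```r
# toy data: five regions (made-up numbers, for illustration only)
cases <- c(4, 9, 2, 7, 3)
pop   <- c(1000, 1500, 800, 1200, 500)

# expected counts under constant risk: overall rate times regional population
ex <- sum(cases) / sum(pop) * pop

# a hypothetical candidate cluster made of regions 2 and 4
zone <- c(2, 4)
observed <- sum(cases[zone])   # observed cases in the zone
expected <- sum(ex[zone])      # expected cases in the zone
smr <- observed / expected     # standardized mortality ratio
smr
```

By construction the expected counts sum to the total number of observed cases, which is why the test is described as conditioning on a fixed number of cases.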
scan.zones
Determine zones for spatial scan test
Description
scan.zones determines the unique zones to consider for the spatial scan test of Kulldorff (1997).
Usage
scan.zones(coords, pop, ubpop = 0.5, lonlat = FALSE)
Arguments
coords
An n × 2 matrix of centroid coordinates for the regions.
pop
The population size associated with each region.
ubpop
The upper bound of the proportion of the total population to consider for a cluster.
lonlat
The default is FALSE, which specifies that Euclidean distance should be used. If
lonlat is TRUE, then the great circle distance is used to calculate the intercentroid
distance.
Value
Returns a list of zones to consider for clustering. Each element of the list contains a vector with the
location ids of the regions in that zone.
Author(s)
Joshua French
References
Kulldorff, M. (1997) A spatial scan statistic. Communications in Statistics – Theory and Methods
26, 1481-1496.
Examples
data(nydf)
coords = cbind(nydf$longitude, nydf$latitude)
scan.zones(coords = coords, pop = nydf$pop, ubpop = 0.1, lonlat = TRUE)
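To make the idea of circular candidate zones concrete, here is a minimal base R sketch that grows a zone around each centroid by adding nearest neighbours until the population bound is reached. The helper circular_zones is hypothetical and is not the package's algorithm: it uses Euclidean distance only, breaks distance ties by index order, and assumes the bound is inclusive.

```r
# Hypothetical helper (not the package's algorithm): circular candidate zones
# built from each centroid's nearest neighbours, using Euclidean distance.
circular_zones <- function(coords, pop, ubpop = 0.5) {
  d <- as.matrix(dist(coords))            # intercentroid distances
  max_pop <- ubpop * sum(pop)             # assumed inclusive population bound
  zones <- list()
  for (i in seq_len(nrow(coords))) {
    nn <- order(d[i, ])                   # region i first, then nearest neighbours
    keep <- cumsum(pop[nn]) <= max_pop    # grow the circle until the bound is hit
    for (k in which(keep)) {
      zones[[length(zones) + 1]] <- sort(nn[seq_len(k)])
    }
  }
  unique(zones)                           # drop zones reached from several centres
}

# four regions on a line, equal populations (made-up data)
coords <- cbind(c(0, 1, 2, 10), c(0, 0, 0, 0))
pop <- c(10, 10, 10, 10)
zones <- circular_zones(coords, pop, ubpop = 0.5)
length(zones)
```

Each element of the result is a vector of location ids, mirroring the structure described under Value.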
tango.stat
Tango’s statistic
Description
tango.stat computes Tango’s index (Tango, 1995), including both the goodness-of-fit and spatial
autocorrelation components. See Waller and Gotway (2005).
Usage
tango.stat(cases, pop, w)
Arguments
cases
The number of cases observed in each region.
pop
The population size associated with each region.
w
An n × n weights matrix.
Value
Returns a list with the test statistic (tstat), the goodness-of-fit component (gof), and the spatial
autocorrelation component (sa).
Author(s)
Joshua French
References
Tango, T. (1995) A class of tests for detecting "general" and "focused" clustering of rare diseases.
Statistics in Medicine 14, 2323-2334.
Waller, L.A. and Gotway, C.A. (2005). Applied Spatial Statistics for Public Health Data. Hoboken,
NJ: Wiley.
Examples
data(nydf)
coords = as.matrix(nydf[,c("longitude", "latitude")])
w = dweights(coords, kappa = 1, type = "tango")
tango.stat(nydf$cases, nydf$pop, w)
tango.test
Tango’s cluster detection test
Description
tango.test performs a test for clustering proposed by Tango (1995). The test uses Tango’s chi-square
approximation for significance testing by default, but also uses Monte Carlo simulation when
nsim > 0.
Usage
tango.test(cases, pop, w, nsim = 0)
Arguments
cases
The number of cases observed in each region.
pop
The population size associated with each region.
w
An n × n weights matrix.
nsim
The number of simulations for which to perform a Monte Carlo test of significance.
Counts are simulated according to a multinomial distribution with
sum(cases) total cases and class probabilities pop/sum(pop).
Details
The dweights function can be used to construct a weights matrix w using the method of Tango
(1995), Rogerson (1999), or a basic style.
Value
Returns a list of class tango with elements:
tstat
Tango’s index
tstat.chisq
The approximately chi-squared statistic proposed by Tango that is derived from tstat
dfc
The degrees of freedom of tstat.chisq
pvalue.chisq
The p-value associated with tstat.chisq
tstat.sim
The vector of test statistics from the simulated data if nsim > 0
pvalue.sim
The p-value associated with the Monte Carlo test of significance when nsim > 0
Additionally, the goodness-of-fit (gof) and spatial autocorrelation (sa) components of Tango’s
index are provided (and for the simulated data sets also, if appropriate).
Author(s)
Joshua French
References
Tango, T. (1995) A class of tests for detecting "general" and "focused" clustering of rare diseases.
Statistics in Medicine 14, 2323-2334.
Rogerson, P. (1999) The Detection of Clusters Using A Spatial Version of the Chi-Square Goodness-
of-fit Test. Geographical Analysis 31, 130-147.
Tango, T. (2010) Statistical Methods for Disease Clustering. Springer.
Waller, L.A. and Gotway, C.A. (2005). Applied Spatial Statistics for Public Health Data. Hoboken,
NJ: Wiley.
See Also
dweights
Examples
data(nydf)
coords = as.matrix(nydf[,c("x", "y")])
w = dweights(coords, kappa = 1)
results = tango.test(nydf$cases, nydf$pop, w, nsim = 49)
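The null simulation described for nsim (multinomial counts with sum(cases) total cases and class probabilities pop/sum(pop)) can be sketched with base R's rmultinom. The data below are made up, and this only illustrates how such data sets could be drawn; it is not taken from the package source.

```r
set.seed(1)                               # for reproducibility of the sketch
cases <- c(4, 9, 2, 7, 3)                 # made-up counts
pop   <- c(1000, 1500, 800, 1200, 500)    # made-up populations
nsim  <- 49

# each column is one simulated data set with sum(cases) total cases,
# allocated to regions in proportion to their population
sims <- rmultinom(nsim, size = sum(cases), prob = pop / sum(pop))
dim(sims)                                 # one row per region, one column per simulation
```

A statistic computed on each column gives the simulated reference distribution against which the observed statistic is compared.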
uls.test
Upper Level Set Spatial Scan Test
Description
uls.test performs the Upper Level Set (ULS) spatial scan test of Patil and Taillie (2004).
Usage
uls.test(coords, cases, pop, w, ex = sum(cases)/sum(pop) * pop, nsim = 499,
alpha = 0.1, nreport = nsim + 1, ubpop = 0.5, lonlat = FALSE,
parallel = TRUE)
Arguments
coords
An n × 2 matrix of centroid coordinates for the regions.
cases
The number of cases observed in each region.
pop
The population size associated with each region.
w
A binary spatial adjacency matrix.
ex
The expected number of cases for each region. The default is calculated under
the constant risk hypothesis.
nsim
The number of simulations from which to compute the p-value.
alpha
The significance level to determine whether a cluster is significant. Default is
0.10.
nreport
The frequency with which to report simulation progress. The default is nsim + 1,
meaning no progress will be displayed.
ubpop
The upper bound of the proportion of the total population to consider for a cluster.
lonlat
The default is FALSE, which specifies that Euclidean distance should be used. If
lonlat is TRUE, then the great circle distance is used to calculate the intercentroid
distance.
parallel
A logical indicating whether the test should be parallelized using the
parallel::mclapply function. Default is TRUE. If TRUE, no progress will be reported.
Details
The test is performed using the spatial scan test based on the Poisson test statistic and a fixed number
of cases. The windows are based on the Upper Level Sets proposed by Patil and Taillie (2004). The
clusters returned are non-overlapping, ordered from most significant to least significant. The first
cluster is the most likely to be a cluster. If no significant clusters are found, then the most likely
cluster is returned (along with a warning).
Value
Returns a list of length two of class scan. The first element (clusters) is a list containing the
significant, non-overlapping clusters, and has the following components:
locids
The location ids of regions in a significant cluster.
pop
The total population in the cluster window.
cases
The observed number of cases in the cluster window.
expected
The expected number of cases in the cluster window.
smr
Standardized mortality ratio (observed/expected) in the cluster window.
rr
Relative risk in the cluster window.
loglikrat
The log-likelihood ratio for the cluster window (i.e., the log of the test statistic).
pvalue
The p-value of the test statistic associated with the cluster window.
The second element of the list is the centroid coordinates. This is needed for plotting purposes.
Author(s)
Joshua French
References
Waller, L.A. and Gotway, C.A. (2005). Applied Spatial Statistics for Public Health Data. Hoboken,
NJ: Wiley.
Kulldorff, M. (1997) A spatial scan statistic. Communications in Statistics – Theory and Methods
26, 1481-1496.
See Also
scan.stat, plot.scan, scan.test, flex.test, dmst.test, bn.test
Examples
data(nydf)
data(nyw)
coords = with(nydf, cbind(longitude, latitude))
out = uls.test(coords = coords, cases = floor(nydf$cases),
pop = nydf$pop, w = nyw,
alpha = 0.12, lonlat = TRUE,
nsim = 10, ubpop = 0.1)
## plot output for new york state
# specify desired argument values
mapargs = list(database = "state", region = "new york",
xlim = range(out$coords[,1]), ylim = range(out$coords[,2]))
# needed for "state" database (unless you execute library(maps))
data(stateMapEnv, package = "maps")
plot(out, usemap = TRUE, mapargs = mapargs)
data(nypoly)
library(sp)
plot(nypoly, col = color.clusters(out))
uls.zones
Determine sequence of ULS zones.
Description
uls.zones determines the unique zones obtained by implementing the ULS (Upper Level Set)
method of Patil and Taillie (2004).
Usage
uls.zones(cases, pop, w, ubpop = 0.5)
Arguments
cases
The number of cases observed in each region.
pop
The population size associated with each region.
w
A binary spatial adjacency matrix.
ubpop
The upper bound of the proportion of the total population to consider for a cluster.
Details
The zones returned must have a total population less than ubpop * the total population of all regions
in the study area.
Value
Returns a list of zones to consider for clustering. Each element of the list contains a vector with the
location ids of the regions in that zone.
Author(s)
Joshua French
References
Patil, G. P., and Taillie, C. (2004). Upper level set scan statistic for detecting arbitrarily shaped
hotspots. Environmental and Ecological Statistics, 11(2), 183-197.
Examples
data(nydf)
data(nyw)
uls.zones(cases = nydf$cases, pop = nydf$population, w = nyw)
Index
bn.test
casewin
color.clusters
dmst.test
dmst.zones
dweights
flex.test
flex.zones
map
mlf.test
mlf.zones
nnpop
nydf
nypoly
nyw
plot.scan
plot.tango
points
scan.stat
scan.test
scan.zones
tango.stat
tango.test
uls.test
uls.zones