mlf.test
Maxima Likelihood First Scan Test
Description
mlf.test implements the Maxima Likelihood First scan test of Yao et al. (2011), which is a special
case of the Dynamic Minimum Spanning Tree of Assuncao et al. (2006). The test first finds the single
region that maximizes the likelihood ratio test statistic. Starting with this single region as the current
zone, new candidate zones are constructed by combining the current zone with the connected region
that maximizes the likelihood ratio test statistic. This procedure is repeated until the population upper
bound is reached.
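The greedy growth described above can be sketched in base R as follows. This is an illustrative sketch only, not the package's implementation: poisson_llr and grow_zone are hypothetical names, the statistic is assumed to be the Poisson log-likelihood ratio of the spatial scan test with sum(ex) equal to sum(cases), and only the population bound (ubpop) is enforced, not the distance bound (ubd).

```r
# Poisson log-likelihood ratio for a zone with yin observed and ein expected
# cases out of ty total cases (assumed form; zero unless yin exceeds ein).
poisson_llr <- function(yin, ein, ty) {
  if (yin <= ein) return(0)
  yout <- ty - yin
  eout <- ty - ein
  llr <- yin * log(yin / ein)
  if (yout > 0) llr <- llr + yout * log(yout / eout)
  llr
}

# Hypothetical helper: grow one zone greedily from the best single region.
grow_zone <- function(cases, pop, w, ex = sum(cases) / sum(pop) * pop,
                      ubpop = 0.5) {
  ty <- sum(cases)
  max_pop <- ubpop * sum(pop)
  stopifnot(any(pop <= max_pop))
  # Statistic for each single region; regions that are too large are excluded.
  stat1 <- mapply(poisson_llr, cases, ex, MoreArgs = list(ty = ty))
  stat1[pop > max_pop] <- -Inf
  zone <- which.max(stat1)
  zones <- list(zone)
  stats <- stat1[zone]
  repeat {
    # Regions adjacent to the current zone but not yet part of it.
    nbrs <- setdiff(which(colSums(w[zone, , drop = FALSE]) > 0), zone)
    # Drop neighbors that would push the zone past the population bound.
    nbrs <- nbrs[sum(pop[zone]) + pop[nbrs] <= max_pop]
    if (length(nbrs) == 0) break
    # Statistic of each candidate zone (current zone plus one neighbor).
    cand <- sapply(nbrs, function(j) {
      poisson_llr(sum(cases[c(zone, j)]), sum(ex[c(zone, j)]), ty)
    })
    zone <- c(zone, nbrs[which.max(cand)])
    zones[[length(zones) + 1]] <- zone
    stats <- c(stats, max(cand))
  }
  # Report the zone along the growth path with the largest statistic.
  best <- which.max(stats)
  list(zone = zones[[best]], stat = unname(stats[best]), path = zones)
}

# Toy data: five regions on a line, each adjacent to its neighbors.
cases <- c(2, 10, 9, 1, 3)
pop <- rep(100, 5)
w <- matrix(0, 5, 5)
for (i in 1:4) {
  w[i, i + 1] <- 1
  w[i + 1, i] <- 1
}
res <- grow_zone(cases, pop, w, ubpop = 0.5)
res$zone  # regions in the best zone along the path
```

In this toy example the zone starts at region 2 and absorbs region 3, after which the population bound stops further growth.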
Usage
mlf.test(coords, cases, pop, w, ex = sum(cases)/sum(pop) * pop, nsim = 499,
         alpha = 0.1, nreport = nsim + 1, ubpop = 0.5, ubd = 0.5,
         lonlat = FALSE, parallel = TRUE)
Arguments
coords
An n × 2 matrix of centroid coordinates for the regions.
cases
The number of cases in each region.
pop
The population size of each region.
w
The binary spatial adjacency matrix.
ex
The expected number of cases for each region. The default is calculated under
the constant risk hypothesis.
nsim
The number of simulations from which to compute p-value.
alpha
The significance level to determine whether a cluster is significant. Default is
0.1.
nreport
The frequency with which to report simulation progress. The default is
nsim + 1, meaning no progress will be displayed.
ubpop
The upper bound of the proportion of the total population to consider for a
cluster.
ubd
The upper bound for the proportion of the maximum intercentroid distance to
allow for the maximum size of a zone.
lonlat
If lonlat is TRUE, then the great circle distance is used to calculate the
intercentroid distance. The default is FALSE, which specifies that Euclidean
distance should be used.
parallel
A logical indicating whether the test should be parallelized using the
parallel::mclapply function.
Default is TRUE. If TRUE, no progress will be reported.
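As a small illustration of the arguments above, the following base R sketch computes the default ex from the Usage section and the bounds implied by ubpop and ubd. The data are made up, and max_pop and max_dist are hypothetical names for one reading of the two bounds; this is not the package's internal code.

```r
# Made-up data for three regions.
cases <- c(4, 6, 10)
pop <- c(100, 200, 700)
coords <- cbind(c(0, 3, 0), c(0, 0, 4))

# Default expected counts under constant risk:
# the overall rate times each region's population.
ex <- sum(cases) / sum(pop) * pop

# Population bound: a zone may hold at most this much population.
ubpop <- 0.5
max_pop <- ubpop * sum(pop)

# Distance bound: a proportion of the largest Euclidean
# intercentroid distance (the lonlat = FALSE case).
ubd <- 0.5
d <- as.matrix(dist(coords))
max_dist <- ubd * max(d)
```

Because ex is a rescaling of pop, it sums to the total number of observed cases.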
Details
Only a single cluster is ever returned because the algorithm only constructs a single sequence of
starting zones, and overlapping zones are not returned. Only the zone that maximizes the likelihood
ratio test statistic is returned.
Value
Returns a list of length two of class scan. The first element (clusters) is a list containing the
significant, non-overlapping clusters, and has the following components:
locids
The location ids of regions in a significant cluster.
pop
The total population in the cluster window.
cases
The observed number of cases in the cluster window.
expected
The expected number of cases in the cluster window.
smr
Standardized mortality ratio (observed/expected) in the cluster window.
rr
Relative risk in the cluster window.
loglikrat
The log-likelihood ratio for the cluster window (i.e., the log of the test statistic).
pvalue
The p-value of the test statistic associated with the cluster window.
w
The adjacency matrix of the cluster.
r
The maximum radius of the cluster (in terms of intercentroid distance from the
starting region).
The second element of the list is the centroid coordinates. This is needed for plotting purposes.
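To make the summary components concrete, the sketch below computes pop, cases, expected, and smr for a zone from made-up data in base R. The relative risk formula (rate inside the zone divided by rate outside it) is an assumption about how rr is defined; the source does not state it.

```r
# Made-up data for five regions; the zone consists of regions 2 and 3.
cases <- c(2, 10, 9, 1, 3)
pop <- rep(100, 5)
ex <- sum(cases) / sum(pop) * pop
locids <- c(2, 3)

zone_pop <- sum(pop[locids])         # total population in the zone
zone_cases <- sum(cases[locids])     # observed cases in the zone
zone_expected <- sum(ex[locids])     # expected cases in the zone
smr <- zone_cases / zone_expected    # observed / expected

# Assumed definition of relative risk: case rate inside the zone
# divided by the case rate outside the zone.
rr <- (zone_cases / zone_pop) /
  ((sum(cases) - zone_cases) / (sum(pop) - zone_pop))
```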
Author(s)
Joshua French
References
Yao, Z., Tang, J., & Zhan, F. B. (2011). Detection of arbitrarily-shaped clusters using a
neighbor-expanding approach: A case study on murine typhus in South Texas. International Journal
of Health Geographics, 10(1), 1.
Assuncao, R. M., Costa, M. A., Tavares, A., & Neto, S. J. F. (2006). Fast detection of arbitrarily
shaped disease clusters. Statistics in Medicine, 25, 723-742.
Examples
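# Load the New York example data and adjacency matrix
data(nydf)
data(nyw)
coords = with(nydf, cbind(longitude, latitude))
# nsim is kept small so that the example runs quickly
out = mlf.test(coords = coords, cases = floor(nydf$cases),
               pop = nydf$pop, w = nyw,
               alpha = 0.12, lonlat = TRUE,
               nsim = 10, ubpop = 0.1, ubd = 0.5)
# Plot the regions, colored by detected cluster
data(nypoly)
library(sp)
plot(nypoly, col = color.clusters(out))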
mlf.zones
Determine the candidate zone using the maxima likelihood first
algorithm of Yao et al. (2011).
Description
mlf.zones determines the most likely cluster zone obtained by implementing the maxima
likelihood first scan method of Yao et al. (2011). Note that this is really just a special case of the
dynamic minimum spanning tree (DMST) algorithm of Assuncao et al. (2006).
Usage
mlf.zones(coords, cases, pop, w, ex = sum(cases)/sum(pop) * pop,
          ubpop = 0.5, ubd = 1, lonlat = FALSE, parallel = TRUE,
          type = "pruned")
Arguments
coords
An n × 2 matrix of centroid coordinates for the regions.
cases
The number of cases observed in each region.
pop
The population size associated with each region.
w
A binary spatial adjacency matrix.
ex
The expected number of cases for each region. The default is calculated under
the constant risk hypothesis.
ubpop
The upper bound of the proportion of the total population to consider for a
cluster.
ubd
The upper bound for the radius of a cluster. This should be a proportion in (0, 1].
The value is the proportion of the maximum intercentroid distance between any
two locations in coords. See Details.
lonlat
The default is FALSE, which specifies that Euclidean distance should be used. If
lonlat is TRUE, then the great circle distance is used to calculate the
intercentroid distance.
parallel
A logical indicating whether the test should be parallelized using the
parallel::mclapply function. Default is TRUE. If TRUE, no progress will be reported.
type
One of "maxonly", "pruned", or "all". Specifying "maxonly" returns only the
maximum test statistic across all candidate zones, "pruned" returns information
for the zone with the largest test statistic, while "all" returns information for
all candidate zones. Default is "pruned".
Details
Each step of the mlf scan test seeks to maximize the likelihood ratio test statistic used in the original
spatial scan test (Kulldorff 1997). The first zone considered is the region that maximizes this
likelihood ratio test statistic, provided that no more than ubpop proportion of the total population is in