Package ‘smerc’ April 20, 2018

plot(nypoly, col = color.clusters(out))

## End(Not run)

dmst.zones

Determine zones using the dynamic minimum spanning tree scan test of Assuncao et al. (2006)

Description

dmst.zones determines the zones that produce the largest test statistic using a greedy algorithm. Specifically, starting individually with each region as a starting zone, new (connected) regions are added to the current zone in the order that results in the largest likelihood ratio test statistic. This is used to implement the dynamic minimum spanning tree (dmst) scan test of Assuncao et al. (2006).
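The greedy growth described above can be sketched in base R as follows. This is only an illustrative sketch, not the package's implementation: the helper names scan_stat and grow_zone, the max_size stopping rule, and the toy data are all assumptions, and the statistic is the usual Poisson likelihood ratio form for a scan test.

```r
# Poisson likelihood ratio statistic for a zone with `yin` observed and
# `ein` expected cases, given `ty` total cases. (Hypothetical helper.)
scan_stat <- function(yin, ein, ty) {
  yout <- ty - yin
  eout <- ty - ein
  # no evidence of elevated risk inside the zone (or nothing outside it)
  if (eout <= 0 || yin / ein <= yout / eout) return(0)
  stat <- yin * log(yin / ein)
  if (yout > 0) stat <- stat + yout * log(yout / eout)
  stat
}

# Grow a zone greedily from region `start`: at each step, add the connected
# region that yields the largest statistic. `max_size` is a simple stand-in
# for the population and distance bounds. (Hypothetical helper.)
grow_zone <- function(start, cases, ex, w, max_size = length(cases)) {
  ty <- sum(cases)
  zone <- start
  stats <- scan_stat(cases[start], ex[start], ty)
  while (length(zone) < max_size) {
    # candidates: regions adjacent to the zone but not yet part of it
    nb <- setdiff(which(colSums(w[zone, , drop = FALSE]) > 0), zone)
    if (length(nb) == 0) break
    cand <- sapply(nb, function(j) {
      idx <- c(zone, j)
      scan_stat(sum(cases[idx]), sum(ex[idx]), ty)
    })
    best <- which.max(cand)
    zone <- c(zone, nb[best])
    stats <- c(stats, cand[best])
  }
  list(zone = zone, stats = stats)
}

# Toy data: five regions on a line, with excess cases in regions 2 and 3.
cases <- c(2, 9, 8, 1, 2)
pop <- rep(100, 5)
ex <- sum(cases) / sum(pop) * pop
w <- matrix(0, 5, 5)
w[cbind(1:4, 2:5)] <- 1
w <- w + t(w)
out <- grow_zone(2, cases, ex, w, max_size = 3)
out$zone
```

Starting from region 2, the sketch first absorbs region 3 (the other high-count region) before considering its remaining neighbors.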

Usage

dmst.zones(coords, cases, pop, w, ex = sum(cases)/sum(pop) * pop,
  ubpop = 0.5, ubd = 1, lonlat = FALSE, parallel = FALSE,
  maxonly = FALSE)

Arguments

coords

An n × 2 matrix of centroid coordinates for the regions.

cases

The number of cases observed in each region.

pop

The population size associated with each region.

w

A binary spatial adjacency matrix.

ex

The expected number of cases for each region. The default is calculated under the constant risk hypothesis.

ubpop

The upper bound on the proportion of the total population to consider for a cluster.

ubd

The upper bound on the radius of a cluster. This should be a proportion in (0, 1]. The value is the proportion of the maximum intercentroid distance between any two locations in coords. See Details.

lonlat

The default is FALSE, which specifies that Euclidean distance should be used. If lonlat is TRUE, then the great circle distance is used to calculate the intercentroid distance.

parallel

A logical indicating whether the test should be parallelized using the parallel::mclapply function. Default is FALSE. If TRUE, no progress will be reported.

maxonly

A logical value indicating whether to return only the maximum test statistic across all candidate zones. Default is FALSE.
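As a small illustration of the default for ex, the expression from the Usage section multiplies the overall case rate by each region's population. The toy values below are made up for the example.

```r
# Hypothetical toy data for three regions.
cases <- c(4, 10, 6)
pop <- c(1000, 2000, 1000)

# Default from the Usage section: overall rate times regional population.
ex <- sum(cases) / sum(pop) * pop
ex
```

By construction, the expected counts sum to the total number of observed cases.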

Details

The test is performed using the spatial scan test based on the Poisson test statistic and a fixed number of cases. The first cluster returned is the most likely cluster. If no significant clusters are found, then the most likely cluster is returned (along with a warning).

Every zone considered must have a total population less than ubpop * sum(pop). Additionally, the maximum intercentroid distance for the regions within a zone must be no more than ubd * the maximum intercentroid distance across all regions.
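The two bounds above can be checked for a single candidate zone with a few lines of base R. This is a minimal sketch under stated assumptions: zone_ok is a hypothetical helper (not a package function), the data are invented, and only Euclidean distance is handled.

```r
# Check whether a candidate zone (a vector of region indices) satisfies the
# population and distance bounds. (Hypothetical helper, Euclidean only.)
zone_ok <- function(zone, coords, pop, ubpop = 0.5, ubd = 1) {
  d <- as.matrix(dist(coords))  # all intercentroid distances
  pop_ok <- sum(pop[zone]) < ubpop * sum(pop)
  dist_ok <- max(d[zone, zone]) <= ubd * max(d)
  pop_ok && dist_ok
}

coords <- cbind(x = c(0, 1, 2, 10), y = c(0, 0, 0, 0))
pop <- c(100, 100, 100, 200)

ok_small <- zone_ok(c(1, 2), coords, pop)     # compact, small population
ok_pop <- zone_ok(c(1, 2, 3), coords, pop)    # exceeds the population bound
ok_far <- zone_ok(c(1, 4), coords, pop, ubpop = 1, ubd = 0.5)  # too spread out
```

Here only the first zone passes; the second holds 300 of 500 people, and the third spans the full maximum distance while ubd allows only half of it.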

Value

Returns a list of zones to consider for clustering that includes the location id of each zone and the associated test statistic, number of cases, expected number of cases, and the population in the zone. If maxonly = TRUE, then only the maximum test statistic across all of these zones is returned.

Author(s)

Joshua French

References

Assuncao, R.M., Costa, M.A., Tavares, A. and Neto, S.J.F. (2006). Fast detection of arbitrarily shaped disease clusters, Statistics in Medicine, 25, 723-742.

Examples

data(nydf)

data(nyw)

coords = as.matrix(nydf[,c("longitude", "latitude")])

# find zone with max statistic starting from each individual region

max_zones = dmst.zones(coords, cases = floor(nydf$cases),
                       nydf$pop, w = nyw, ubpop = 0.25,
                       ubd = .25, lonlat = TRUE)

head(max_zones)

dweights

Distance-based weights

Description

dweights constructs a distance-based weights matrix. The dweights function can be used to construct a weights matrix w using the method of Tango (1995), Rogerson (1999), or a basic style.

Usage

dweights(coords, kappa = 1, lonlat = FALSE, type = "basic",
  cases = NULL, pop = NULL)

Arguments

coords

An n × 2 matrix of centroid coordinates for the regions.

kappa

A positive constant related to strength of spatial autocorrelation.

lonlat

The default is FALSE, which specifies that Euclidean distance should be used. If lonlat is TRUE, then the great circle distance is used to calculate the intercentroid distance.

type

The type of weights matrix to construct. Current options are "basic", "tango", and "rogerson". Default is "basic". See Details.

cases

The number of cases observed in each region.

pop

The population size associated with each region.

Details

coords is used to construct an n × n distance matrix d.

If type = "basic", then w_{ij} = exp(−d_{ij}/κ).

If type = "rogerson", then w_{ij} = exp(−d_{ij}/κ) / (cases_i/pop_i ∗ cases_j/pop_j).

If type = "tango", then w_{ij} = exp(−4 ∗ d_{ij}^2/κ^2).
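To make the distance-decay weights concrete, the following base R sketch computes the "basic" and "tango" styles for a toy set of centroids. It is not the package code: the coordinates and kappa are invented, only Euclidean distance is used, the Tango weight is assumed to take the squared form exp(−4 d²/κ²) of Tango (1995), and the "rogerson" style is left out.

```r
# Toy centroids (Euclidean coordinates) and an assumed value of kappa.
coords <- cbind(x = c(0, 1, 3), y = c(0, 0, 0))
kappa <- 2

d <- as.matrix(dist(coords))          # n x n distance matrix

w_basic <- exp(-d / kappa)            # type = "basic"
w_tango <- exp(-4 * (d / kappa)^2)    # type = "tango" (squared form assumed)

round(w_basic, 3)
round(w_tango, 3)
```

Both matrices are symmetric with ones on the diagonal, and the Tango-style weights fall off much faster with distance for the same kappa.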

Value

Returns an n × n matrix of weights.

Author(s)

Joshua French

References

Tango, T. (1995) A class of tests for detecting "general" and "focused" clustering of rare diseases. Statistics in Medicine. 14:2323-2334.

Rogerson, P. (1999) The Detection of Clusters Using A Spatial Version of the Chi-Square Goodness-of-fit Test. Geographical Analysis. 31:130-147.