Title: | Crosswalk Municipality and District Statistics in Germany |
---|---|
Description: | Construct time series for Germany's municipalities (Gemeinden) and districts (Kreise) using an annual crosswalk constructed by the Federal Office for Building and Regional Planning (BBSR). |
Authors: | Moritz Marbach [aut, cre] |
Maintainer: | Moritz Marbach <[email protected]> |
License: | GPL-3 |
Version: | 1.0.1 |
Built: | 2025-01-03 04:14:05 UTC |
Source: | https://github.com/sumtxt/ags |
Defines a distance metric for the AGS
ags_dist(x, y, landw = 10^6, kreisw = 10^3, gemw = 1, ceiling = 99999999)
x, y | vectors of AGS values |
landw | weight of the Bundesland (Land) integers |
kreisw | weight of the Kreis (district) integers |
gemw | weight of the Gemeinde (municipality) integers |
ceiling | truncate all distances at this value |
The distance metric is defined as
abs(x[1:2]- y[1:2])*landw + abs(x[3:5]- y[3:5])*kreisw + abs(x[6:8]- y[6:8])*gemw,
where z[a:b] means all digits between a and b for integer z.
With the default weights, this sum equals the absolute difference between x and y as long as the Land, Kreis and Gemeinde components do not differ in opposite directions.
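The formula can be checked by hand with a few lines of base R. The helper below is a hypothetical sketch of the formula as written above, not the package's implementation, and it assumes 8-digit AGS supplied as character strings.

```r
# Hypothetical helper (not part of the package): evaluates the formula above
# for 8-digit AGS given as character strings.
ags_dist_sketch <- function(x, y, landw = 10^6, kreisw = 10^3, gemw = 1,
                            ceiling = 99999999) {
  # Extract the digits from position `from` to `to` as an integer
  part <- function(z, from, to) as.integer(substr(z, from, to))
  d <- abs(part(x, 1, 2) - part(y, 1, 2)) * landw +
    abs(part(x, 3, 5) - part(y, 3, 5)) * kreisw +
    abs(part(x, 6, 8) - part(y, 6, 8)) * gemw
  # Truncate all distances at the ceiling
  pmin(d, ceiling)
}

# Only the Kreis digits differ: the sum equals abs(14612000 - 14713000)
d_same_sign <- ags_dist_sketch("14612000", "14713000")   # 101000

# Kreis and Gemeinde digits differ in opposite directions:
# the sum is 1999, although abs(1001999 - 1002000) is 1
d_mixed_sign <- ags_dist_sketch("01001999", "01002000")  # 1999
```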
A numerical vector.
ags_dist(14053,14059)
The dataset includes the number of voters and valid votes in all federal elections (Bundestagswahlen) across districts in Saxony.
btw_sn
A data frame with 155 rows and 4 variables:
district
AGS of the district.
year
Election year.
voters
Number of eligible voters.
valid
Number of valid votes.
https://www.regionalstatistik.de
Convert the Name or the AGS of a Bundesland
code_bundesland( sourcevar, origin = "ags", destination = "name", factor = FALSE )
sourcevar |
Vector which contains the codes or names to be converted. |
origin |
The following options are available:
|
destination |
The following options are available:
|
factor |
If |
This function converts Bundesland names or AGS into the AGS, the standardized (English) name, or the Bundesland abbreviation.
If origin="ags", the first two digits are used to identify a Bundesland. It is therefore important that sourcevar is supplied as a character vector with leading zeros where applicable.
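A short base-R sketch of why the leading zero matters. It only illustrates the two-digit prefix logic described above and does not call the package. The district code used here is Kiel (01002) in Schleswig-Holstein (Land code 01).

```r
# The same district stored as a number and as a zero-padded string
ags_numeric <- 1002
ags_string <- "01002"

# Without the leading zero, the first two digits point to Land code 10
prefix_numeric <- substr(as.character(ags_numeric), 1, 2)  # "10"

# With the leading zero, the first two digits are the correct Land code 01
prefix_string <- substr(ags_string, 1, 2)                  # "01"

# One way to restore the zero for 5-digit district codes in base R
padded <- sprintf("%05d", as.integer(ags_numeric))         # "01002"
```

Within the package, format_ags() serves the same padding purpose.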
A character vector.
format_ags() for formatting AGS.
library(dplyr)
data(btw_sn)
btw_sn %>%
  mutate(bl = code_bundesland(district, origin = "ags", destination = "name"))
Formats AGS with a Leading Zero
format_ags(ags, type, verbose = FALSE)
ags |
Input vector that will be coerced into an integer vector. Factor vectors are first coerced to a character vector and then to an integer vector. |
type |
Type of AGS supplied as
The abbreviations |
verbose |
If |
A character vector.
format_ags(c(1,14), type="land")
format_ags(c(1002,14612), type="district")
format_ags(c(01002000,14612000), type="municipality")
This function constructs time series of counts for Germany's municipalities (Gemeinden) and districts (Kreise).
xwalk_ags( data, ags, time, xwalk, variables = NULL, strata = NULL, weight = NULL, fuzzy_time = FALSE, verbose = TRUE )
data |
A data frame or a data frame extension (e.g. a tibble). |
ags |
Name of the character variable (quoted) with municipality AGS (Gemeinden, 8 digits) or district AGS (Kreise, 5 digits). |
time |
Name of the variable (quoted) identifying the year (YYYY format). Values will be coerced to integers. |
xwalk |
Name of the crosswalk. The following crosswalks are available:
|
variables |
Either a vector of names (quoted) for
variables to interpolate or |
strata |
Vector of variable names (quoted) or |
weight |
Name of the interpolation weight or
|
fuzzy_time |
If |
verbose |
If |
This function facilitates the use of crosswalks constructed by the BBSR for municipalities and districts in Germany (Milbert 2010). The crosswalks map one year's set of district/municipality identifiers to a later year's identifiers and provide weights to perform area- or population-weighted interpolation.
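The idea of weighted interpolation can be sketched in base R with a toy crosswalk. All unit codes, column names and weights below are invented for illustration and do not come from the BBSR crosswalks; the sketch shows the general technique, not the package's internal code.

```r
# Toy crosswalk: old unit A is split between new units X and Y,
# old unit B is absorbed entirely into Y. Weights per old unit sum to 1.
xwalk <- data.frame(
  ags_old = c("A", "A", "B"),
  ags_new = c("X", "Y", "Y"),
  pop_conv = c(0.6, 0.4, 1.0)
)

# Toy counts reported for the old units
counts <- data.frame(ags_old = c("A", "B"), voters = c(1000, 500))

# Attach the weights, split each count, then sum within the new units
merged <- merge(counts, xwalk, by = "ags_old")
merged$voters_new <- merged$voters * merged$pop_conv
result <- aggregate(voters_new ~ ags_new, data = merged, FUN = sum)
# X receives 600; Y receives 400 + 500 = 900
```

Because the weights of each old unit sum to one, the total count (1500) is preserved.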
All data rows with NAs in either the ags or time variable are excluded. The same applies to all rows with a value in ags or time that never appears in the crosswalk.
Fuzzy matching uses the absolute difference between the year reported in the data and a crosswalk year. If there is a tie, crosswalk years from before the year reported in the data are preferred.
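The matching rule can be written down in a few lines of base R. The function name and the crosswalk years below are hypothetical; this is a sketch of the rule as described, not the package's code.

```r
# Hypothetical helper: pick the crosswalk year closest to the data year,
# preferring the earlier crosswalk year when two are equally close.
match_year <- function(data_year, xwalk_years) {
  diffs <- abs(xwalk_years - data_year)
  candidates <- xwalk_years[diffs == min(diffs)]
  min(candidates)
}

xwalk_years <- c(1990, 2000)               # invented crosswalk years
tie <- match_year(1995, xwalk_years)       # 1990: tie, earlier year preferred
closest <- match_year(1998, xwalk_years)   # 2000: strictly closer
```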
If area- or population-weighted interpolation is requested (i.e., when variables are supplied), the combination of the variables set in ags, time and strata must uniquely identify a row in data.
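One way to check this requirement before calling the function is to count duplicated key combinations in base R. The data frame and its column names below are toy values chosen for illustration.

```r
# Toy data: the first two rows share the same district and year
toy <- data.frame(
  district = c("14511", "14511", "14521"),
  year = c(2017, 2017, 2017),
  voters = c(100, 120, 90)
)

# Columns passed as ags, time and (if used) strata
key_cols <- c("district", "year")

# Number of rows that repeat an earlier key combination
n_dup <- sum(duplicated(toy[key_cols]))  # 1
```

A result above zero means the keys do not uniquely identify rows, so the data would need to be aggregated or given a strata variable first.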
Caution: Data from https://www.regionalstatistik.de/ sometimes includes
annual values for merged units (e.g., Städteregion Aachen, 05334) and
for their former parts (Kreis Aachen, 05354 and Stadt Aachen, 05313).
When such data is crosswalked with fuzzy_time=TRUE and
interpolated, the final counts will be off by a factor of approximately 2.
The reason is that the final output is the sum of the interpolated counts
for the parts and the measured count of the merged unit.
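The double counting can be reproduced with invented numbers in base R. The counts and the one-to-one weights below are assumptions made for illustration; only the three AGS codes come from the text above.

```r
# Invented counts for one year: the merged unit and its two former parts
reported <- data.frame(
  ags = c("05334", "05354", "05313"),
  count = c(100, 60, 40)
)

# Assumed toy crosswalk: both former parts map fully onto the merged unit,
# and the merged unit maps onto itself
xwalk <- data.frame(
  ags = c("05334", "05354", "05313"),
  ags_new = c("05334", "05334", "05334"),
  conv = c(1, 1, 1)
)

merged <- merge(reported, xwalk, by = "ags")
total <- sum(merged$count * merged$conv)  # 200, twice the true count of 100
```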
If interpolation is requested, the crosswalked and interpolated data are returned. If interpolation is not requested, the data matched with the crosswalk are returned. The following variables are added:
row_id
row number of data before matching.
ags[*]
the crosswalked AGS.
year_xw
the matched year from the crosswalk.
[*]_conv
the interpolation weight.
diff
the absolute difference between year_xw and time.
Milbert, Antonia. 2010. "Gebietsreformen – politische Entscheidungen und Folgen für die Statistik." BBSR-Berichte kompakt 6/2010. Bundesinstitut für Bau-, Stadt- und Raumforschung.
data(btw_sn)
btw_sn_ags20 <- xwalk_ags(
  data = btw_sn,
  ags = "district",
  time = "year",
  xwalk = "xd20",
  variables = c("voters", "valid"),
  weight = "pop"
)
head(btw_sn_ags20)