Package 'arealDB' reference manual

Title:	Harmonise and Integrate Heterogeneous Areal Data
Description:	Many relevant applications in the environmental and socioeconomic sciences use areal data, such as biodiversity checklists, agricultural statistics, or socioeconomic surveys. For applications that surpass the spatial, temporal or thematic scope of any single data source, data must be integrated from several heterogeneous sources. Inconsistent concepts, definitions, or messy data tables make this a tedious and error-prone process. 'arealDB' tackles those problems and helps the user to integrate a harmonised database of areal data. Read the paper at Ehrmann, Seppelt & Meyer (2020) <doi:10.1016/j.envsoft.2020.104799>.
Authors:	Steffen Ehrmann [aut, cre] , Arne Rümmler [aut, ctb] , Felipe Melges [ctb] , Carsten Meyer [aut]
Maintainer:	Steffen Ehrmann <[email protected]>
License:	GPL-3
Version:	0.9.5
Built:	2025-02-11 22:20:57 UTC
Source:	https://github.com/luckinet/arealdb

Edit matches manually in a csv-table

Description

Allows the user to match concepts with an already existing ontology, without actually writing into the ontology, but instead storing the resulting matching table as csv.

Usage

.editMatches(
  new,
  topLevel,
  source = NULL,
  ontology = NULL,
  matchDir = NULL,
  stringdist = TRUE,
  parentClasses = FALSE,
  beep = NULL,
  verbose = TRUE
)

Arguments

`new`	`data.frame(.)` the new concepts that shall be manually matched, includes "label", "class" and "has_broader" columns.
`topLevel`	`logical(1)` whether the new concepts are at the highest level only, i.e., have to be matched without context, or whether they contain columns that must be matched within parent columns.
`source`	`character(1)` any character uniquely identifying the source dataset of the new concepts.
`ontology`	`ontology(1)` either a path where the ontology is stored, or an already loaded ontology.
`matchDir`	`character(1)` the directory where to store source-specific matching tables.
`stringdist`	`logical(1)` whether or not to use string distance to find matches (should not be used for large datasets/when a memory error is shown).
`parentClasses`	`logical(1)` whether or not to search for matches in classes that are hierarchically higher than the target class.
`beep`	`integerish(1)` Number specifying what sound to be played to signal the user that a point of interaction is reached by the program, see `beep`.
`verbose`	`logical(1)` whether or not to give detailed information on the process of this function.

Details

In order to match new concepts into an already existing ontology, it may become necessary to manually match the new concepts with already harmonised concepts, for example, when the new concepts are described with terms that are not yet in the ontology. This function puts together a table in which the user edits matches by hand. With the argument verbose = TRUE, detailed information about the edit process is shown to the user. After matches have been defined, and even if not all necessary matches are finished, the function stores a specific "matching table" with the name match_SOURCE.csv in the respective directory (matchDir), from where work can be picked up and continued at another time.

Fuzzy matching is carried out, and matches with 0, 1 or 2 differing characters are each presented in a respective column.
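
The idea of grouping candidate matches by the number of differing characters can be illustrated with base R. This is a minimal sketch, not the package's implementation: it assumes plain Levenshtein edit distance via `utils::adist()`, and the labels and column names (`new_labels`, `onto_labels`, `dist_0` to `dist_2`) are made up for illustration.

```r
# Hypothetical labels; not taken from any real ontology.
new_labels  <- c("barley", "wheet", "maize")
onto_labels <- c("barley", "wheat", "maize", "rye")

# Pairwise edit distances (rows: new labels, columns: ontology labels).
d <- utils::adist(new_labels, onto_labels)

# For each new label, collect the ontology labels that differ by exactly k characters.
candidates <- function(k) {
  unname(apply(d, 1, function(row) paste(onto_labels[row == k], collapse = " | ")))
}

# One column per distance, mirroring the "respective column" idea (column names assumed).
fuzzy <- data.frame(
  label  = new_labels,
  dist_0 = candidates(0),
  dist_1 = candidates(1),
  dist_2 = candidates(2),
  stringsAsFactors = FALSE
)
print(fuzzy)
```

Computing the full distance matrix needs memory proportional to the product of both label counts, which is one plausible reason why `stringdist = FALSE` is recommended for large datasets.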

Value

A table that contains all new matches, or, if all new concepts were already in the ontology, a table of the already successful matches.

Get the column types of a tibble

Description

(internal function not for user interaction)

Usage

.getColTypes(input = NULL)

Arguments

`input`	`data.frame` table from which to get column types.

Match target terms with an ontology

Description

This function takes a table and replaces the values of various columns with harmonised values listed in the project-specific gazetteer.

Usage

.matchOntology(
  table = NULL,
  columns = NULL,
  dataseries = NULL,
  ontology = NULL,
  colsAsClass = TRUE,
  groupMatches = FALSE,
  stringdist = TRUE,
  strictMatch = FALSE,
  parentClasses = FALSE,
  beep = NULL,
  verbose = FALSE
)

Arguments

`table`	`data.frame(1)` a table that contains columns that should be harmonised by matching with the gazetteer.
`columns`	`character(1)` the columns containing the concepts
`dataseries`	`character(1)` the source dataseries from which territories are sourced.
`ontology`	`onto` path where the ontology/gazetteer is stored.
`colsAsClass`	`logical(1)` whether or not to match `columns` by their name with the respective classes, or with concepts of all classes.
`groupMatches`	`logical(1)` whether or not to group harmonized concepts (in the output) when there are more than one match (for example for broader or narrower matches).
`stringdist`	`logical(1)` whether or not to use string distance to find matches (should not be used for large datasets/when a memory error is shown).
`strictMatch`	`logical(1)` whether or not matches are strict, i.e., there should be clear one-to-one relationships and no changes in broader concepts.
`parentClasses`	`logical(1)` whether or not to search for matches in classes that are hierarchically higher than the target class.
`beep`	`integerish(1)` Number specifying what sound to be played to signal the user that a point of interaction is reached by the program, see `beep`.
`verbose`	`logical(1)` whether or not to give detailed information on the process of this function.

Value

Returns a table that resembles the input table where the target columns were translated according to the provided ontology.

Update an ontology

Description

This function takes a table (spatial) and updates all territorial concepts in the provided gazetteer.

Usage

.updateOntology(
  table = NULL,
  threshold = NULL,
  dataseries = NULL,
  ontology = NULL
)

Arguments

`table`	`character(1)` a table that contains a match column as the basis to update the gazetteer.
`threshold`	`numeric(1)` a threshold value above which matches are updated in the gazetteer.
`dataseries`	`character(1)` the source dataseries of the external concepts for which the gazetteer shall be updated.
`ontology`	`onto` path where the ontology/gazetteer is stored.

Value

called for its side-effect of updating a gazetteer

Archive the data from an areal database

Description

Archive the data from an areal database

Usage

adb_archive(pattern = NULL, variables = NULL, compress = FALSE, outPath = NULL)

Arguments

`pattern`	`character(1)` a regular expression used to filter files to load.
`variables`	`character(.)` columns, typically observed variables, to select.
`compress`	`logical(1)` whether or not the database should be compressed into a tar.gz archive. Will delete the database folder in `outPath`.
`outPath`	`character(1)` directory, where the archive should be stored.

Details

This function prepares and packages the data into an archivable form. The archive contains GeoPackage files for geometries and csv files for all tables, such as inventory, matching and thematic data tables.

Value

no return value, called for the side-effect of creating a database archive.

Backup the current state of an areal database

Description

Backup the current state of an areal database

Usage

adb_backup()

Details

This function creates a tag that is composed of the version and the date, appends it to all stage3 files (tables and geometries), the inventory and the ontology/gazetteer files and stores them in the backup folder of the current areal database.
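
How such a tag might be built and appended to file names can be sketched in base R. The tag layout (`<version>_<YYYYMMDD>`), the separator and both helper names below are assumptions made for illustration; the package may format its tag differently.

```r
# Assumed tag layout "<version>_<YYYYMMDD>"; hypothetical helper, not part of arealDB.
make_tag <- function(version, date = Sys.Date()) {
  paste0(version, "_", format(as.Date(date), "%Y%m%d"))
}

# Append the tag to a file name while keeping the file extension intact.
tag_file <- function(file, tag) {
  ext  <- tools::file_ext(file)
  stem <- tools::file_path_sans_ext(file)
  paste0(stem, "_", tag, ".", ext)
}

tag <- make_tag("1.0.0", date = "2025-02-11")
tagged <- tag_file(c("Estonia.gpkg", "inv_tables.csv"), tag)
print(tagged)
```

Keeping both version and date in the tag means that several backups of the same database version can coexist in the backup folder.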

Value

No return value, called for the side effect of saving the inventory, the stage3 files and modified ontology/gazetteer into the backup directory.

Diagnose database contents

Description

Work in progress, not yet usable.

Usage

adb_diagnose(
  territory = NULL,
  concept = NULL,
  variable = NULL,
  level = NULL,
  year = NULL
)

Arguments

`territory`	description
`concept`	description
`variable`	description
`level`	description
`year`	description

Build an example areal database

Description

This function helps to set up an example database up to a certain step.

Usage

adb_example(path = NULL, until = NULL, verbose = FALSE)

Arguments

`path`	`character(1)` The database gets created by default in tempdir(), but if you want it in a particular location, specify that in this argument.
`until`	`character(1)` The database building step in terms of the function names until which the example database shall be built, one of `"start_arealDB"`, `"regDataseries"`, `"regGeometry"`, `"regTable"`, `"normGeometry"` or `"normTable"`.
`verbose`	`logical(1)` be verbose about building the example database (default `FALSE`).

Details

Setting up a database with an R-based tool can appear cumbersome, overly complex and thus intimidating. By creating an example database, this function allows interested users to learn step by step how to build a database of areal data. Moreover, all functions in this package contain verbose information and ask for information that would be missing or lead to an inconsistent database, before a failure renders hours of work useless.
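
The effect of `until` can be pictured as cutting an ordered list of build steps. The following base R sketch only illustrates that idea; `steps_until()` is a hypothetical helper, and treating `until = NULL` as "run all steps" is an assumption based on the example that builds the full database without setting `until`.

```r
# Build steps in the order listed for the `until` argument.
steps <- c("start_arealDB", "regDataseries", "regGeometry",
           "regTable", "normGeometry", "normTable")

# Hypothetical helper: which steps would be run for a given `until` value?
steps_until <- function(until = NULL) {
  if (is.null(until)) return(steps)      # assumed: NULL builds everything
  until <- match.arg(until, steps)       # errors on an unknown step name
  steps[seq_len(match(until, steps))]
}

print(steps_until("regGeometry"))
```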

Value

No return value, called for the side effect of creating an example database at the specified path.

Examples

if(dev.interactive()){
# to build the full example database
adb_example(path = paste0(tempdir(), "/newDB"))

# to make the example database until a certain step
adb_example(path = paste0(tempdir(), "/newDB"), until = "regDataseries")

}

Initiate an areal database

Description

Initiate a geospatial database or register a database that exists at the root path.

Usage

adb_init(
  root,
  version,
  author,
  licence,
  ontology,
  gazetteer = NULL,
  top = NULL,
  staged = TRUE
)

Arguments

`root`	`character(1)` path to the root directory that contains or shall contain an areal database.
`version`	`character(1)` version identifier for this areal database.
`author`	`character(1)` authors that contributed to building this areal database. Should be a list with items `"cre"` (creator), `"aut"` (authors) and `"ctb"` (contributors).
`licence`	`character(1)` licence (link) for this areal database.
`ontology`	`list(.)` named list with the path(s) of ontologies, where the list name identifies the variable that shall be matched with the ontology at the path.
`gazetteer`	`character(1)` path to the gazetteer that holds the (hierarchical) information of territorial units used in this database.
`top`	`character(1)` the label of the class in the gazetteer that represents the top-most unit (e.g. country) of the areal database that shall be started.
`staged`	`logical(1)` whether or not the file structure is arranged according to stages (with geometries and tables separated), or merely as input/output (of all types).

Details

This is the first function that is run in a project, as it initiates the areal database by creating the default sub-directories and initial inventory tables. When a database has already been set up, this function is used to register that path in the options of the current R session.
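
Registering a path in the options of the current R session relies on R's standard `options()` mechanism. The sketch below shows that mechanism only; the option name `adb_path` is taken from the example of this function, while everything else the package may store or validate is not shown here.

```r
# Path of a (not necessarily existing) database root, used for illustration only.
root <- file.path(tempdir(), "newDB")

# options() returns the previous values, which makes it easy to undo the change.
old <- options(adb_path = root)

# Any later function call in the same session can now look up the path.
registered <- getOption("adb_path")
print(registered)

# Restore the previous state of the session options.
options(old)
```

Because options live only in the running session, the path has to be registered again after R is restarted.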

Value

No return value, called for the side effect of creating the directory structure of the new areal database and tables that contain the database metadata.

Examples

adb_init(root = paste0(tempdir(), "/newDB"),
         version = "1.0.0", licence = "CC-BY-4.0",
         author = list(cre = "Gordon Freeman", aut = "Alyx Vance", ctb = "The G-Man"),
         gazetteer = paste0(tempdir(), "/newDB/territories.rds"),
         top = "al1",
         ontology = list(var = paste0(tempdir(), "/newDB/ontology.rds")))

getOption("adb_path"); getOption("gazetteer_path")

Load the inventory of the currently active areal database

Description

Load the inventory of the currently active areal database

Usage

adb_inventory(type = NULL)

Arguments

`type`	`character(1)` the inventory sub-table to load, either `"dataseries"`, `"tables"`, `"geometries"` or `"references"`.

Value

returns the table selected in type

Load the metadata from an areal database

Description

Load the metadata from an areal database

Usage

adb_metadata()

Load the currently active ontology

Description

Load the currently active ontology

Usage

adb_ontology(..., type = "ontology")

Arguments

`...`	combination of column name in the ontology and value to filter that column by to build a tree of the concepts nested into it; see `make_tree`.
`type`	`character(1)` the type of ontology to load, either `"ontology"` to get the thematic concepts, or `"gazetteer"` to get the territories.

Value

returns a tidy table of an ontology or gazetteer that is used in an areal database.

Extract database contents

Description

Extract database contents

Usage

adb_querry(
  territory = NULL,
  concept = NULL,
  variable = NULL,
  level = NULL,
  year = NULL
)

Arguments

`territory`	`character(.)` combination of column name in the ontology and value to filter that column by to build a tree of the territories nested into it.
`concept`	description
`variable`	description
`level`	description
`year`	description

Value

returns ...

Examples

if(dev.interactive()){
adb_example(path = paste0(tempdir(), "/newDB"))

adb_querry(territory = list(al1 = "a_nation"),
           concept = list(commodity = "barley"),
           variable = "harvested")
}

Reset an areal database to its unfilled state

Description

Reset an areal database to its unfilled state

Usage

adb_reset(what = "all")

Arguments

`what`	`character(1)` what to reset, either `"onto"`, `"gaz"`, `"schemas"`, `"tables"`, `"geometries"` or `"all"` (the default).

Value

no return value, called for its side effect of reorganising an areal database into a state where no reg* or norm* functions have been run

Restore the database from a backup

Description

Restore the database from a backup

Usage

adb_restore(version = NULL, date = NULL)

Arguments

`version`	`character(1)` a version tag for which to restore files.
`date`	`character(1)` a date for which to restore files.

Details

This function searches for files that have the version and date tag, as it was defined in a previous run of adb_backup, to restore them to their original folders. This function overwrites by default, so use with care.

Value

No return value, called for the side effect of restoring files that were previously stored in a backup.

Load the schemas of the currently active areal database

Description

Load the schemas of the currently active areal database

Usage

adb_schemas(pattern = NULL)

Arguments

`pattern`	`character(1)` an optional regular expression. Only schema names which match the regular expression will be processed.

Value

returns a list of schema descriptions

Load the translation tables of the currently active areal database

Description

Load the translation tables of the currently active areal database

Usage

adb_translations(type = NULL, dataseries = NULL)

Arguments

`type`	`character(1)` the type of ontology for which to load translation tables, either `"ontology"` to get the thematic concepts, or `"gazetteer"` to get the territories.
`dataseries`	`character(1)` the name of a dataseries as registered in `regDataseries`.

Value

returns the selected translation table

Normalise geometries

Description

Harmonise and integrate geometries into a standardised format

Usage

normGeometry(
  input = NULL,
  pattern = NULL,
  query = NULL,
  thresh = 10,
  beep = NULL,
  simplify = FALSE,
  stringdist = TRUE,
  strictMatch = FALSE,
  verbose = FALSE
)

Arguments

`input`	`character(1)` path of the file to normalise. If this is left empty, all files at stage two as subset by `pattern` are chosen.
`pattern`	`character(1)` an optional regular expression. Only dataset names which match the regular expression will be processed.
`query`	`character(1)` part of the SQL query (starting from WHERE) used to subset the input geometries, for example `"WHERE NAME_0 IN ('Estonia')"`. The first part of the query (where the layer is defined) is derived from the meta-data of the currently handled geometry.
`thresh`	`integerish(1)` percent value of overlap below which two geometries (the input and the base) are considered to be the same. This is required, because often the polygons from different sources, albeit describing the same territorial unit, aren't completely the same.
`beep`	`integerish(1)` Number specifying what sound to be played to signal the user that a point of interaction is reached by the program, see `beep`.
`simplify`	`logical(1)` whether or not to simplify geometries.
`stringdist`	`logical(1)` whether or not to use string distance to find matches (should not be used for large datasets/when a memory error is shown).
`strictMatch`	`logical(1)` whether or not matches are strict, i.e., there should be clear one-to-one relationships and no changes in broader concepts.
`verbose`	`logical(1)` be verbose about what is happening (default `FALSE`). Furthermore, you can use `suppressMessages` to make this function completely silent.

Details

To normalise geometries, this function proceeds as follows:

1. Read in input and extract initial metadata from the file name.
2. In case filters are set, the new geometry is filtered by those.
3. The territorial names are matched with the gazetteer to harmonise new territorial names (at this step, the function might ask the user to edit the file 'matching.csv' to align new names with already harmonised names).
4. Loop through every nation potentially included in the file that shall be processed and carry out the following steps:
   - In case the geometries are provided as a list of simple feature POLYGONS, they are dissolved into a single MULTIPOLYGON per main polygon.
   - In case the nation to which a geometry belongs has not yet been created at stage three, the following steps are carried out:
     1. Store the current geometry as basis of the respective level (the user needs to make sure that all following levels of the same dataseries are perfectly nested into those parent territories, for example by using the GADM dataset).
   - In case the nation to which the geometry belongs has already been created, the following steps are carried out:
     1. Check whether the new geometries have the same coordinate reference system as the already existing database and re-project the new geometries if this is not the case.
     2. Check whether all new geometries are already exactly matched spatially and stop if that is the case.
     3. Check whether the new geometries are all within the already defined parents, and save those that are not as a new geometry.
     4. Calculate spatial overlap and distinguish the geometries into those that overlap with more and those with less than thresh.
     5. For all units that match, copy gazID from the geometries they overlap.
     6. For all units that do not match, rebuild metadata and a new gazID.
   - Store the processed geometry at stage three.
5. Move the geometry to the folder '/processed', if it is fully processed.
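
The split of geometries by overlap relative to `thresh` can be sketched with plain numbers. This is only an illustration with made-up overlap percentages; it does not compute real spatial overlap and makes no claim about how the package treats either group afterwards.

```r
# Made-up overlap values, in percent, for three hypothetical new units.
overlap <- c(unit_a = 99.2, unit_b = 4.5, unit_c = 57.0)
thresh  <- 10  # same value as the default of the `thresh` argument

# Split the unit names into those overlapping with more and with less than `thresh`.
groups <- split(names(overlap), ifelse(overlap > thresh, "more", "less"))
print(groups)
```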

Value

This function harmonises and integrates so far unprocessed geometries at stage two into stage three of the geospatial database. It produces for each main polygon (e.g. nation) in the registered geometries a spatial file of the specified file-type.

Examples

if(dev.interactive()){
  library(sf)

  # build the example database
  adb_example(until = "regGeometry", path = tempdir())

  # normalise all geometries ...
  normGeometry(pattern = "estonia")

  # ... and check the result
  st_layers(paste0(tempdir(), "/geometries/stage3/Estonia.gpkg"))
  output <- st_read(paste0(tempdir(), "/geometries/stage3/Estonia.gpkg"))
}

Normalise data tables

Description

Harmonise and integrate data tables into a standardised format

Usage

normTable(
  input = NULL,
  pattern = NULL,
  query = NULL,
  ontoMatch = NULL,
  beep = NULL,
  verbose = FALSE
)

Arguments

`input`	`character(1)` path of the file to normalise. If this is left empty, all files at stage two as subset by `pattern` are chosen.
`pattern`	`character(1)` an optional regular expression. Only dataset names which match the regular expression will be processed.
`query`	`character(1)` the expression that would be used in `filter` to subset a tibble in terms of the columns defined via the schema and given as a single character string, such as `"al1 == 'Estonia'"`.
`ontoMatch`	`character(.)` name of the column(s) that shall be matched with an ontology (defined in `adb_init`).
`beep`	`integerish(1)` Number specifying what sound to be played to signal the user that a point of interaction is reached by the program, see `beep`.
`verbose`	`logical(1)` be verbose about translating terms (default `FALSE`). Furthermore, you can use `suppressMessages` to make this function completely silent.
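
How a filter expression given as a single character string, as in `query`, can be applied to a table is sketched below in base R. This is a stand-in for illustration only: the toy table and its column names are made up, and the package itself is documented as passing the expression to `filter` rather than using the code shown here.

```r
# Toy table standing in for a reorganised input table (columns are hypothetical).
tbl <- data.frame(
  al1       = c("Estonia", "Estonia", "Latvia"),
  year      = c(2000, 2001, 2000),
  harvested = c(10, 12, 7),
  stringsAsFactors = FALSE
)

query <- "al1 == 'Estonia'"

# Parse the string into an expression and evaluate it with the table's columns in scope.
keep     <- eval(parse(text = query)[[1]], envir = tbl, enclos = baseenv())
filtered <- tbl[keep, , drop = FALSE]
print(filtered)
```

Because the string is evaluated as code, it should only refer to columns defined via the schema, and it should come from a trusted source.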

Details

To normalise data tables, this function proceeds as follows:

1. Read in input and extract initial metadata from the file name.
2. Employ the function tabshiftr::reorganise() to reshape input according to the respective schema description.
3. The territorial names are matched with the gazetteer to harmonise new territorial names (at this step, the function might ask the user to edit the file 'matching.csv' to align new names with already harmonised names).
4. Harmonise territorial unit names.
5. Store the processed data table at stage three.

Value

This function harmonises and integrates so far unprocessed data tables at stage two into stage three of the areal database. It produces for each main polygon (e.g. nation) in the registered data tables a file that includes all thematic areal data.

Examples

if(dev.interactive()){
  # build the example database
  adb_example(until = "normGeometry", path = tempdir())

  # normalise all available data tables ...
  normTable()

  # ... and check the result
  output <- readRDS(paste0(tempdir(), "/tables/stage3/Estonia.rds"))
}

Register a new dataseries

Description

This function registers a new dataseries of either geometries or areal data into the geospatial database. The entry contains the name and relevant meta-data of a dataseries to enable provenance tracking and reproducibility.

Usage

regDataseries(
  name = NULL,
  description = NULL,
  homepage = NULL,
  version = NULL,
  licence_link = NULL,
  reference = NULL,
  notes = NULL,
  overwrite = FALSE
)

Arguments

`name`	`character(1)` the dataseries abbreviation or name.
`description`	`character()` the "long name" or "brief description" of the dataseries.
`homepage`	`character(1)` the homepage of the data provider where the dataseries or additional information can be found.
`version`	`character(1)` the version number or date when meta data of the dataseries were recorded.
`licence_link`	`character(1)` link to the licence or the webpage from which the licence was copied.
`reference`	`bibentry(1)` in case the dataseries comes with a reference, provide this here as bibentry object.
`notes`	`character(1)` optional notes.
`overwrite`	`logical(1)` whether or not the dataseries to register shall overwrite a potentially already existing older version.

Value

Returns a tibble of the new entry that is appended to 'inv_dataseries.csv'.

Examples

if(dev.interactive()){
  # start the example database
  adb_example(until = "start_arealDB", path = tempdir())

  regDataseries(name = "gadm",
                description = "Database of Global Administrative Areas",
                version = "3.6",
                homepage = "https://gadm.org/index.html",
                licence_link = "https://gadm.org/license.html")
}

Register a new geometry entry

Description

This function registers a new geometry of territorial units into the geospatial database.

Usage

regGeometry(
  ...,
  subset = NULL,
  gSeries = NULL,
  label = NULL,
  ancillary = NULL,
  layer = NULL,
  archive = NULL,
  archiveLink = NULL,
  downloadDate = NULL,
  updateFrequency = NULL,
  notes = NULL,
  overwrite = FALSE
)

Arguments

`...`	`character(1)` optional named argument selecting the main territory into which this geometry is nested. The name of this must be a class of the gazetteer and the value must be one of the territory names of that class, e.g. nation = "Estonia".
`subset`	`character(1)` optional argument to specify which subset the file contains. This could be a subset of territorial units (e.g. only one municipality) or of a target variable.
`gSeries`	`character(1)` the name of the geometry dataseries (see `regDataseries`).
`label`	`list(.)` list of as many columns as there are in common in the ontology and this geometry. Must be of the form `list(class = columnName)`, with 'class' as the class of the ontology corresponding to the respective column name in the geometry.
`ancillary`	`list(.)` optional list of columns containing ancillary information. Must be of the form `list(attribute = columnName)`, where `attribute` can be one or several of `"name_ltn"` (the English name in Latin letters), `"name_lcl"` (the name in local language and letters), `"code"` (any code describing the unit), `"type"` (the type of territorial unit), `"uri"` (the semantic web URI) or `"flag"` (any flag attributed to the unit).
`layer`	`character(1)` the name of the file's layer from which the geometry should be created (if applicable).
`archive`	`character(1)` the original file (perhaps a *.zip) from which the geometry emerges.
`archiveLink`	`character(1)` download-link of the archive.
`downloadDate`	`character(1)` value describing the download date of this dataset (in YYYY-MM-DD format).
`updateFrequency`	`character(1)` value describing the frequency with which the dataset is updated, according to the ISO 19115 Codelist, MD_MaintenanceFrequencyCode. Possible values are: 'continual', 'daily', 'weekly', 'fortnightly', 'quarterly', 'biannually', 'annually', 'asNeeded', 'irregular', 'notPlanned', 'unknown', 'periodic', 'semimonthly', 'biennially'.
`notes`	`character(1)` optional notes that are assigned to all features of this geometry.
`overwrite`	`logical(1)` whether or not the geometry to register shall overwrite a potentially already existing older version.
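
Two of the arguments above have a fixed format: `updateFrequency` must be one of the listed values and `downloadDate` is given as YYYY-MM-DD. A small pre-check before registering a geometry could look like the following base R sketch; `check_meta()` is a hypothetical helper and not part of the package, which may validate these arguments differently or not at all.

```r
# Values listed for `updateFrequency` in the argument description.
freq_codes <- c("continual", "daily", "weekly", "fortnightly", "quarterly",
                "biannually", "annually", "asNeeded", "irregular", "notPlanned",
                "unknown", "periodic", "semimonthly", "biennially")

# Hypothetical pre-check; returns one logical per argument.
check_meta <- function(updateFrequency, downloadDate) {
  ok_freq <- updateFrequency %in% freq_codes
  # The pattern checks the layout; as.Date() returns NA for impossible dates.
  ok_date <- grepl("^[0-9]{4}-[0-9]{2}-[0-9]{2}$", downloadDate) &&
    !is.na(as.Date(downloadDate, format = "%Y-%m-%d"))
  c(updateFrequency = ok_freq, downloadDate = ok_date)
}

good <- check_meta("quarterly", "2019-10-01")
bad  <- check_meta("hourly", "01.10.2019")
print(good)
print(bad)
```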

Details

When processing geometries to which areal data shall be linked, carry out the following steps:

1. Determine the main territory (such as a nation, or any other polygon), a subset (if applicable), the dataseries of the geometry and the ontology label, and provide them as arguments to this function.
2. Run the function.
3. Export the shapefile with the following properties:
   - Format: GeoPackage
   - File name: What is provided as message by this function
   - CRS: EPSG:4326 - WGS 84
   - make sure that 'all fields are exported'
4. Confirm that you have saved the file.

Value

Returns a tibble of the entry that is appended to 'inv_geometries.csv'.

Examples

if(dev.interactive()){
  # build the example database
  adb_example(until = "regDataseries", path = tempdir())

  # The GADM dataset comes as *.7z archive
  regGeometry(gSeries = "gadm",
              label = list(al1 = "NAME_0"),
              layer = "example_geom1",
              archive = "example_geom.7z|example_geom1.gpkg",
              archiveLink = "https://gadm.org/",
              downloadDate = "2019-10-01",
              updateFrequency = "quarterly")

  # The second administrative level in GADM contains names in the columns
  # NAME_0 and NAME_1
  regGeometry(gSeries = "gadm",
              label = list(al1 = "NAME_0", al2 = "NAME_1"),
              ancillary = list(name_lcl = "VARNAME_1", code = "GID_1", type = "TYPE_1"),
              layer = "example_geom2",
              archive = "example_geom.7z|example_geom2.gpkg",
              archiveLink = "https://gadm.org/",
              downloadDate = "2019-10-01",
              updateFrequency = "quarterly")
}

Register a new areal data table

Description

This function registers a new areal data table into the geospatial database.

Usage

regTable(
  ...,
  subset = NULL,
  dSeries = NULL,
  gSeries = NULL,
  label = NULL,
  begin = NULL,
  end = NULL,
  schema = NULL,
  archive = NULL,
  archiveLink = NULL,
  downloadDate = NULL,
  updateFrequency = NULL,
  metadataLink = NULL,
  metadataPath = NULL,
  notes = NULL,
  diagnose = FALSE,
  overwrite = FALSE
)

Arguments

`...`	`character(1)` name and value of the topmost unit under which the table shall be registered. The argument name must be a class of the gazetteer and the value must be one of the territory names of that class, e.g. nation = "Estonia".
`subset`	`character(1)` optional argument to specify which subset the file contains. This could be a subset of territorial units (e.g. only one municipality) or of a target variable.
`dSeries`	`character(1)` the dataseries of the areal data (see `regDataseries`).
`gSeries`	`character(1)` optionally, the dataseries of the geometries, if the geometry dataseries deviates from the dataseries of the areal data (see `regDataseries`).
`label`	`integerish(1)` the label in the ontology this geometry should correspond to.
`begin`	`integerish(1)` the date from which on the data are valid.
`end`	`integerish(1)` the date until which the data are valid.
`schema`	`schema` the schema description of the table to read in (must have been placed in the global environment before calling it here).
`archive`	`character(1)` the original file from which the boundaries emerge.
`archiveLink`	`character(1)` download-link of the archive.
`downloadDate`	`character(1)` value describing the download date of this dataset (in YYYY-MM-DD format).
`updateFrequency`	`character(1)` value describing the frequency with which the dataset is updated, according to the ISO 19115 Codelist, MD_MaintenanceFrequencyCode. Possible values are: 'continual', 'daily', 'weekly', 'fortnightly', 'quarterly', 'biannually', 'annually', 'asNeeded', 'irregular', 'notPlanned', 'unknown', 'periodic', 'semimonthly', 'biennially'.
`metadataLink`	`character(1)` if metadata already exist: link to the meta dataset.
`metadataPath`	`character(1)` if an existing meta dataset was downloaded along with the data: the path where it is stored locally.
`notes`	`character(1)` optional notes.
`diagnose`	`logical(1)` whether or not to try to reorganise the table with the provided schema. Note: this does not yet save the reorganised table into the database; further harmonisation steps are carried out by `normTable` before that.
`overwrite`	`logical(1)` whether or not the geometry to register shall overwrite a potentially already existing older version.
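
Before calling regTable, it can help to check the two arguments with a constrained format. The following is a minimal sketch in base R; the helpers check_frequency and check_download_date are hypothetical and not part of arealDB, and the vector of allowed values is simply copied from the description of `updateFrequency` above.

```r
# Allowed values, copied from the description of 'updateFrequency'.
allowed_freq <- c("continual", "daily", "weekly", "fortnightly", "quarterly",
                  "biannually", "annually", "asNeeded", "irregular",
                  "notPlanned", "unknown", "periodic", "semimonthly",
                  "biennially")

# Hypothetical helper: stop if the value is not one of the allowed terms.
check_frequency <- function(x) {
  stopifnot(is.character(x), length(x) == 1)
  if (!x %in% allowed_freq) {
    stop("'", x, "' is not a valid updateFrequency; choose one of: ",
         paste(allowed_freq, collapse = ", "))
  }
  invisible(x)
}

# Hypothetical helper: stop if the value cannot be parsed as YYYY-MM-DD.
check_download_date <- function(x) {
  stopifnot(is.character(x), length(x) == 1)
  if (is.na(as.Date(x, format = "%Y-%m-%d"))) {
    stop("'", x, "' is not a date in YYYY-MM-DD format")
  }
  invisible(x)
}

check_frequency("quarterly")
check_download_date("2024-10-01")
```

Both helpers return their input invisibly when it is valid and raise an error otherwise, so they can be placed directly in front of a regTable call.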

Details

When processing areal data tables, carry out the following steps:

Determine the main territory (such as a nation, or any other polygon), a subset (if applicable), the ontology label and the dataseries of the areal data and of the geometry, and provide them as arguments to this function.
Provide a begin and end date for the areal data.
Run the function.
(Re)Save the table with the following properties:
- Format: csv
- Encoding: UTF-8
- File name: the name that this function prints as a message
- make sure that the table is not modified or reshaped. Reshaping happens later, during data normalisation via the schema description, which expects the original table.
Confirm that you have saved the file.
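
The (re)saving step above can be sketched in base R. This is only an illustration under stated assumptions: the input file is created here as a stand-in for a downloaded table, its Latin-1 encoding is assumed, and "placeholder_name.csv" is a hypothetical file name; in practice the name is the one that regTable prints as a message. The table is handled as plain lines of text so that its arrangement is not reshaped.

```r
# Hypothetical stand-in for a downloaded table: a small file in Latin-1.
raw_file <- tempfile(fileext = ".csv")
txt <- "territory;year;harvested\nJ\u00f5geva;1990;12\n"
writeBin(iconv(txt, from = "UTF-8", to = "latin1", toRaw = TRUE)[[1]], raw_file)

# Read the table as plain lines, so that rows and columns stay untouched.
original <- readLines(raw_file, warn = FALSE)

# Encoding: UTF-8 (the source encoding 'latin1' is an assumption).
converted <- iconv(original, from = "latin1", to = "UTF-8")

# File name: placeholder (assumption); use the name reported by regTable().
out_file <- file.path(tempdir(), "placeholder_name.csv")
writeLines(converted, out_file, useBytes = TRUE)

# Confirm that the saved file is valid UTF-8 and has the same number of lines.
resaved <- readLines(out_file, encoding = "UTF-8", warn = FALSE)
all(validUTF8(resaved))
length(resaved) == length(original)
```

Working on lines rather than on a parsed data frame avoids accidental changes such as dropped header rows or converted column types, which the schema description would otherwise no longer match.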

Every areal data dataseries (dSeries) may come as a slight permutation of a particular table arrangement. Internally, the function normTable expects a schema description (a list that describes the position of the data components) for each data table, which is saved as paste0("meta_", dSeries, TAB_NUMBER). See the package tabshiftr.

Value

Returns a tibble of the entry that is appended to 'inv_tables.csv' in case update = TRUE.

Examples

if(dev.interactive()){
  # build the example database
  adb_exampleDB(until = "regGeometry", path = tempdir())

  # the schema description for this table
  library(tabshiftr)

  schema_madeUp <-
    setIDVar(name = "al1", columns = 1) %>%
    setIDVar(name = "year", columns = 2) %>%
    setIDVar(name = "commodities", columns = 3) %>%
    setObsVar(name = "harvested",
              factor = 1, columns = 4) %>%
    setObsVar(name = "production",
              factor = 1, columns = 5)

  regTable(nation = "Estonia",
           subset = "barleyMaize",
           label = "al1",
           dSeries = "madeUp",
           gSeries = "gadm",
           begin = 1990,
           end = 2017,
           schema = schema_madeUp,
           archive = "example_table.7z|example_table1.csv",
           archiveLink = "...",
           downloadDate = "2024-10-01",
           updateFrequency = "quarterly",
           metadataLink = "...",
           metadataPath = "my/local/path")
}

Example `gazetteer`

Description

An ontology of territory names (gazetteer)

Usage

territories

Format

Object of class onto for the example territories used in adb_exampleDB.
