Package 'taxalight'

Title: A Lightweight and Lightning-Fast Taxonomic Naming Interface
Description: Creates a local Lightning Memory-Mapped Database ('LMDB') of many commonly used taxonomic authorities and provides functions that can quickly query this data. Supported taxonomic authorities include the Integrated Taxonomic Information System ('ITIS'), National Center for Biotechnology Information ('NCBI'), Global Biodiversity Information Facility ('GBIF'), Catalogue of Life ('COL'), and Open Tree Taxonomy ('OTT'). Name and identifier resolution using 'LMDB' can be hundreds of times faster than either relational databases or internet-based queries. Precise data provenance information for data derived from naming providers is also included.
Authors: Carl Boettiger [aut, cre], Kari Norman [aut]
Maintainer: Carl Boettiger <[email protected]>
License: MIT + file LICENSE
Version: 0.1.5
Built: 2024-10-24 02:42:26 UTC
Source: https://github.com/cboettig/taxalight

Help Index


Return the accepted taxonomic identifier (acceptedNameUsageID) given a scientific name

Description

Returns the accepted taxonomic identifier (acceptedNameUsageID) for each scientific name supplied.

Usage

get_ids(
  name,
  provider = getOption("tl_default_provider", "itis"),
  version = tl_latest_version(),
  dir = tl_dir()
)

Arguments

name

character vector of scientific names

provider

Abbreviation for a known naming provider. Provider data should first be imported with tl_create(). Note: setting provider to "itis_test" is for testing purposes only; use "itis" for the full ITIS data. See Details.

version

version of the authority to use (e.g. four-digit year)

dir

storage location for the LMDB databases

Value

a vector of matching accepted identifiers. Note that if the provider considers the supplied name to be a synonym, the ID returned corresponds to the accepted name and not to the synonym. That is, get_names(get_ids(synonym)) returns the accepted name, not the synonym.

Examples

# slow initial import
sp <- c("Dendrocygna autumnalis", "Dendrocygna bicolor")
get_ids(sp, "itis_test") # use "itis_test" test data for example only

Return scientific names (scientificName) given taxonomic identifiers

Description

Returns the scientific name (scientificName) for each taxonomic identifier supplied.

Usage

get_names(
  id,
  provider = getOption("tl_default_provider", "itis"),
  version = tl_latest_version(),
  dir = tl_dir()
)

Arguments

id

a character vector of taxonomic identifiers, including provider prefix

provider

Abbreviation for a known naming provider. Provider data should first be imported with tl_create(). Note: setting provider to "itis_test" is for testing purposes only; use "itis" for the full ITIS data. See Details.

version

version of the authority to use (e.g. four-digit year)

dir

storage location for the LMDB databases

Value

a vector of matching scientific names

Examples

# slow initial import
get_names(c("ITIS:180092", "ITIS:179913"), "itis_test") # uses test version

taxalight query: rapidly look up scientific names from a local database

Description

taxalight query: rapidly look up scientific names from a local database

Usage

tl(
  x,
  provider = getOption("tl_default_provider", "itis"),
  version = tl_latest_version(),
  dir = tl_dir()
)

Arguments

x

character vector of scientific names, taxonomic identifiers (with provider prefix), or a mix of both.

provider

Abbreviation for a known naming provider. Provider data should first be imported with tl_create(). Note: setting provider to "itis_test" is for testing purposes only; use "itis" for the full ITIS data. See Details.

version

version of the authority to use (e.g. four-digit year)

dir

storage location for the LMDB databases
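
Because x may mix names and prefixed identifiers, it can be useful to see which entries are which before querying. The following is a minimal base-R sketch, assuming identifiers always start with an upper-case provider prefix followed by a colon (as in "ITIS:180092"). The pattern and the variable names are illustrative assumptions; this is not how tl() itself classifies its input.

```r
# Mixed input: two scientific names and two prefixed identifiers.
x <- c("Dendrocygna autumnalis", "ITIS:180092",
       "Dendrocygna bicolor", "ITIS:179913")

# Assumption: an identifier starts with "<PREFIX>:" where PREFIX is upper-case.
looks_like_id <- grepl("^[A-Z]+:", x)

ids        <- x[looks_like_id]    # entries that look like identifiers
names_only <- x[!looks_like_id]   # everything else is treated as a name
```

With the input split this way, ids could be passed to get_names() and names_only to get_ids(), while tl() accepts the mixed vector directly.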

Details

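Naming providers supported by taxalight include ITIS, NCBI, GBIF, COL, and OTT, as listed in the package Description. The "itis_test" provider is for testing purposes only.

The default provider is "itis", which can be changed by setting the tl_default_provider option with options().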
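
As a brief illustration of how the default provider is looked up, the sketch below sets the tl_default_provider option and then evaluates the same getOption() expression that appears as the provider default in Usage. It uses only base R; the variable names old and current are illustrative.

```r
# Set the option, keeping the previous value so it can be restored.
old <- options(tl_default_provider = "itis_test")

# Same expression as the default value of `provider` in the Usage section:
# it returns the option if set, and falls back to "itis" otherwise.
current <- getOption("tl_default_provider", "itis")
current

# Restore the previous setting (this unsets the option if it was unset before).
options(old)
```

Placing the options() call in a project's .Rprofile would apply the setting to every session for that project.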

Value

a data.frame in Darwin Core format with rows matching the acceptedNameUsageID or scientificName requested.

See Also

tl_create

Examples

# slow initial import
sp <- c("Dendrocygna autumnalis", "Dendrocygna bicolor")
id <- c("ITIS:180092", "ITIS:179913")

## example uses "itis_test" provider for illustration only:
tl(sp, "itis_test")
tl(id, "itis_test")

Create a Lightning Memory-Mapped Database (LMDB) for a given provider

Description

Downloads raw data and stores it in a local LMDB database. Importing data is a time-consuming step that needs to be run only once per machine; the imported data persist across sessions.

Usage

tl_create(
  provider = getOption("tl_default_provider", "itis"),
  version = tl_latest_version(),
  dir = tl_dir(),
  lines = 100000L
)

Arguments

provider

Abbreviation for the known naming provider whose data should be imported. Note: setting provider to "itis_test" is for testing purposes only; use "itis" for the full ITIS data. See Details.

version

version of the authority to use (e.g. four-digit year)

dir

storage location for the LMDB databases

lines

number of lines to read in each chunk.
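
To make the role of the lines argument concrete, here is a small base-R sketch of reading a text file in fixed-size chunks from an open connection. It only illustrates the general chunked-reading technique on a temporary file; the file contents, the chunk size of 4, and the variable names are assumptions for illustration, and this is not taxalight's import code.

```r
# Write a tiny stand-in for a large provider dump: 10 tab-separated lines.
path <- tempfile(fileext = ".tsv")
writeLines(sprintf("ID:%d\tname %d", 1:10, 1:10), path)

lines <- 4L                      # chunk size (tl_create() defaults to 100000L)
con <- file(path, open = "r")    # an open connection remembers its position
chunk_sizes <- integer(0)

repeat {
  chunk <- readLines(con, n = lines)   # read at most `lines` lines
  if (length(chunk) == 0L) break       # nothing left to read
  # A real importer would parse `chunk` and write it to the database here.
  chunk_sizes <- c(chunk_sizes, length(chunk))
}

close(con)
unlink(path)
chunk_sizes   # 4 4 2
```

Reading in chunks keeps memory use bounded by the chunk size rather than by the size of the whole file, which is the usual reason for exposing such a parameter.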

Details

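Naming providers supported by taxalight include ITIS, NCBI, GBIF, COL, and OTT, as listed in the package Description. The "itis_test" provider is for testing purposes only.

The default provider is "itis", which can be changed by setting the tl_default_provider option with options().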

Examples

## example uses "itis_test" for illustration only
## (may take more than 5 seconds):
tl_create("itis_test")

taxalight data directory

Description

By default, taxalight stores data for persistent access in the directory returned by tl_dir(). Any function can override this location by passing an alternative path to its dir argument. The location can also be configured system-wide by setting the environment variable TAXALIGHT_HOME, e.g. in the user's .Renviron file (see Sys.setenv()). If TAXALIGHT_HOME is unset, the default location is the operating system's default, as provided by the core R function tools::R_user_dir(). Users can purge the stored data at any time by deleting this directory.
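
The lookup order described above can be sketched in base R as follows. The helper resolve_dir() is hypothetical and only mimics the described behavior; it is not taxalight's own code, and the choice of which = "data" for tools::R_user_dir() is an assumption. tools::R_user_dir() requires R 4.0.0 or later.

```r
# Hypothetical helper: prefer TAXALIGHT_HOME, else fall back to the
# per-user data directory that R suggests for a package named "taxalight".
resolve_dir <- function() {
  Sys.getenv("TAXALIGHT_HOME",
             unset = tools::R_user_dir("taxalight", which = "data"))
}

# Remember any existing value so the session can be restored afterwards.
old <- Sys.getenv("TAXALIGHT_HOME", unset = NA)

# With the variable set, the custom location is used.
custom <- file.path(tempdir(), "taxalight-data")
Sys.setenv(TAXALIGHT_HOME = custom)
with_env <- resolve_dir()

# With the variable unset, the operating-system default is used.
Sys.unsetenv("TAXALIGHT_HOME")
without_env <- resolve_dir()

# Restore the original value, if there was one.
if (!is.na(old)) Sys.setenv(TAXALIGHT_HOME = old)
```

For a persistent setting, a line of the form TAXALIGHT_HOME=/path/to/storage in .Renviron has the same effect as the Sys.setenv() call, where /path/to/storage is a placeholder for a directory of the user's choosing.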

Usage

tl_dir()

Examples

tl_dir()