Title: | A Lightweight and Lightning-Fast Taxonomic Naming Interface |
---|---|
Description: | Creates a local Lightning Memory-Mapped Database ('LMDB') of many commonly used taxonomic authorities and provides functions that can quickly query this data. Supported taxonomic authorities include the Integrated Taxonomic Information System ('ITIS'), National Center for Biotechnology Information ('NCBI'), Global Biodiversity Information Facility ('GBIF'), Catalogue of Life ('COL'), and Open Tree Taxonomy ('OTT'). Name and identifier resolution using 'LMDB' can be hundreds of times faster than either relational databases or internet-based queries. Precise data provenance information for data derived from naming providers is also included. |
Authors: | Carl Boettiger [aut, cre], Kari Norman [aut] |
Maintainer: | Carl Boettiger <[email protected]> |
License: | MIT + file LICENSE |
Version: | 0.1.5 |
Built: | 2024-10-24 02:42:26 UTC |
Source: | https://github.com/cboettig/taxalight |
Return the accepted taxonomic identifier, acceptedNameUsageID, given a scientific name
get_ids(
  name,
  provider = getOption("tl_default_provider", "itis"),
  version = tl_latest_version(),
  dir = tl_dir()
)
name | character vector of scientific names |
provider | Abbreviation for a known naming provider. Provider data should first be imported with |
version | version of the authority to use (e.g. four-digit year) |
dir | storage location for the LMDB databases |
A vector of matching accepted identifiers. Note that if the provider considers the supplied name to be a synonym, the ID corresponds to the accepted name and not the synonym; i.e. get_names(get_ids(synonym)) will return the accepted name rather than the synonym name.
# slow initial import
sp <- c("Dendrocygna autumnalis", "Dendrocygna bicolor")
get_ids(sp, "itis_test") # use "itis_test" test data for example only
Return scientificName names given taxonomic identifiers
get_names(
  id,
  provider = getOption("tl_default_provider", "itis"),
  version = tl_latest_version(),
  dir = tl_dir()
)
id | a character vector of taxonomic identifiers, including provider prefix |
provider | Abbreviation for a known naming provider. Provider data should first be imported with |
version | version of the authority to use (e.g. four-digit year) |
dir | storage location for the LMDB databases |
a vector of matching scientific names
# slow initial import
get_names(c("ITIS:180092", "ITIS:179913"), "itis_test") # uses test version
taxalight query: rapidly look up scientific names from a local database
tl(
  x,
  provider = getOption("tl_default_provider", "itis"),
  version = tl_latest_version(),
  dir = tl_dir()
)
x | character vector of either scientific names or taxonomic identifiers (with prefix). The two can be mixed in a single vector. |
provider | Abbreviation for a known naming provider. Provider data should first be imported with |
version | version of the authority to use (e.g. four-digit year) |
dir | storage location for the LMDB databases |
Naming providers currently recognized by taxalight are:
itis: Integrated Taxonomic Information System, https://www.itis.gov/
ncbi: National Center for Biotechnology Information, https://www.ncbi.nlm.nih.gov/taxonomy
col: Catalogue of Life, http://www.catalogueoflife.org/
gbif: Global Biodiversity Information Facility, https://www.gbif.org/
ott: OpenTree Taxonomy, https://github.com/OpenTreeOfLife/reference-taxonomy
itis_test: a small subset of ITIS, cached locally for testing purposes only.
The default provider is itis, which can be reconfigured by setting tl_default_provider in options().
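As a brief illustration of how that option lookup behaves, the sketch below uses only base R. The value "itis_test" is chosen purely for illustration, and the lookup mirrors the `getOption("tl_default_provider", "itis")` default shown in the usage above.

```r
# Set the option, saving any existing value so it can be restored afterwards.
old <- options(tl_default_provider = "itis_test")

# Same lookup as in the default argument: the option value, falling back to "itis".
current <- getOption("tl_default_provider", "itis")
print(current)

# Restore the previous state; if the option was unset before, it is unset again.
options(old)
```

Setting the option in a project's .Rprofile would apply it to every session started in that project.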
a data.frame in Darwin Core format with rows matching the acceptedNameUsageID or scientificName requested.
# slow initial import
sp <- c("Dendrocygna autumnalis", "Dendrocygna bicolor")
id <- c("ITIS:180092", "ITIS:179913")
## example uses "itis_test" provider for illustration only:
tl(sp, "itis_test")
tl(id, "itis_test")
Download raw data and store it in a local LMDB database. Importing data is a time-consuming step that needs to be run only once per machine; the imported data persists across sessions.
tl_create(
  provider = getOption("tl_default_provider", "itis"),
  version = tl_latest_version(),
  dir = tl_dir(),
  lines = 100000L
)
provider | Abbreviation for a known naming provider. Provider data should first be imported with |
version | version of the authority to use (e.g. four-digit year) |
dir | storage location for the LMDB databases |
lines | number of lines to read in each chunk. |
Naming providers currently recognized by taxalight are:
itis: Integrated Taxonomic Information System, https://www.itis.gov/
ncbi: National Center for Biotechnology Information, https://www.ncbi.nlm.nih.gov/taxonomy
col: Catalogue of Life, http://www.catalogueoflife.org/
gbif: Global Biodiversity Information Facility, https://www.gbif.org/
ott: OpenTree Taxonomy, https://github.com/OpenTreeOfLife/reference-taxonomy
itis_test: a small subset of ITIS, cached locally for testing purposes only.
The default provider is itis, which can be reconfigured by setting tl_default_provider in options().
## example uses "itis_test" for illustration only:
# test may take > 5s
tl_create("itis_test")
By default, taxalight stores data for persistent access in the directory given by tl_dir(). This choice can be overridden in any function by passing an alternative path to the dir argument, or the location can be configured system-wide by setting the environment variable TAXALIGHT_HOME, e.g. in the user's .Renviron file (see Sys.setenv()). If the variable is unset, the default location is the default for the operating system, as provided by the core R function tools::R_user_dir(). Users can manually purge the data storage at any time by deleting this directory.
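The lookup order described above can be sketched in base R. This is not taxalight's own implementation of tl_dir(): `resolve_storage_dir()` is a hypothetical helper, and the arguments passed to tools::R_user_dir() (package name "taxalight", category "data") are assumptions made for illustration. tools::R_user_dir() requires R 4.0.0 or later.

```r
# Hypothetical helper mirroring the behaviour described above; this is not
# taxalight's own implementation of tl_dir().
resolve_storage_dir <- function() {
  home <- Sys.getenv("TAXALIGHT_HOME", unset = "")
  if (nzchar(home)) {
    return(home)
  }
  # Assumption: package name "taxalight" and the "data" category.
  tools::R_user_dir("taxalight", which = "data")
}

# Point the variable at a temporary directory for this session only.
custom <- file.path(tempdir(), "taxalight-data")
Sys.setenv(TAXALIGHT_HOME = custom)
with_override <- resolve_storage_dir()

# Remove the override to fall back to the operating-system default.
Sys.unsetenv("TAXALIGHT_HOME")
os_default <- resolve_storage_dir()
```

For a setting that persists across sessions, a line of the form `TAXALIGHT_HOME=/path/to/dir` in .Renviron has the same effect as the Sys.setenv() call.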
tl_dir()
tl_dir()