Title: | Interface to the 'MinIO' Client |
---|---|
Description: | An R interface to the 'MinIO' Client. The 'MinIO' Client ('mc') provides a modern alternative to UNIX commands like 'ls', 'cat', 'cp', 'mirror', 'diff', 'find' etc. It supports 'filesystems' and Amazon "S3" compatible cloud storage service ("AWS" Signature v2 and v4). This package provides convenience functions for installing the 'MinIO' client and running any operations, as described in the official documentation, <https://min.io/docs/minio/linux/reference/minio-mc.html?ref=docs-redirect>. This package provides a flexible and high-performance alternative to 'aws.s3'. |
Authors: | Carl Boettiger [aut, cre], Markus Skyttner [ctb] |
Maintainer: | Carl Boettiger <[email protected]> |
License: | MIT + file LICENSE |
Version: | 0.0.6 |
Built: | 2024-10-30 02:44:06 UTC |
Source: | https://github.com/cboettig/minioclient |
Install the mc client
install_mc( os = system_os(), arch = system_arch(), path = minio_path(), force = FALSE )
os |
operating system |
arch |
architecture |
path |
destination where binary is installed. |
force |
install even if binary is already found. Can be used to force upgrade. |
This function is just a convenience wrapper for prebuilt MINIO binaries, from https://dl.min.io/client/mc/release/. Should support Windows, Mac, and Linux on both Intel/AMD (amd64) and ARM architectures. For details, see official MINIO docs for your operating system, e.g. https://min.io/docs/minio/macos/index.html.
NOTE: If you want to install to a location other than the default, set the option "minioclient.dir" to the directory containing your "mc" binary, e.g. options("minioclient.dir" = "~/.mc"). This is also used as the location of the config directory. Note that this package will not automatically use MINIO available on $PATH (to promote security and portability in design).
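As a minimal sketch of the note above, the option can be set before installing so that the binary and its config land in a directory of your choosing. The temporary directory used here is only an illustrative assumption, and the install_mc() call is left commented out because it downloads a binary.

```r
# Illustrative location only: any writable directory would do.
mc_dir <- file.path(tempdir(), "mc")

# Set the option named in the note above; options() returns the old value.
old <- options("minioclient.dir" = mc_dir)

# The option is now visible to any code that looks it up.
active <- getOption("minioclient.dir")
print(active)

# With the package installed, the client would then be installed there:
# minioclient::install_mc()

# Restore the previous setting when done.
options(old)
```

Saving and restoring the old value keeps the change local to the session step that needs it.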
path to the minio binary (invisibly)
install_mc()
# Force upgrade
install_mc(force=TRUE)
The MINIO Client
mc(command, ..., path = minio_path(), verbose = interactive())
command |
space-delimited text string of an mc command (starting after the mc ...) |
... |
additional arguments to |
path |
location where mc executable will be installed. By default will use the OS-appropriate storage location. |
verbose |
print output? |
This function forms the basis for all other available commands.
This utility can run any mc command supported by the official minio client, see https://min.io/docs/minio/linux/reference/minio-mc.html. The R package provides wrappers only for the most common use cases, which provide a more natural R syntax and native documentation.
Returns the list from processx::run(), with components status, stdout, stderr, and timeout; invisibly.
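Because the result is returned invisibly, it has to be assigned before it can be inspected. The sketch below shows one way to do that. The helper describe_mc_result() is hypothetical (it is not part of the package), and the list it is applied to is a hand-written stand-in with the four documented components, so the code runs without a network connection or an installed mc binary.

```r
# Hypothetical helper (not part of minioclient): summarise a list with the
# documented components status, stdout, stderr, and timeout.
describe_mc_result <- function(res) {
  # Treat exit status 0 without a timeout as success.
  ok <- identical(as.integer(res$status), 0L) && !isTRUE(res$timeout)
  # Split the captured standard output into non-empty lines.
  lines <- strsplit(res$stdout, "\n", fixed = TRUE)[[1]]
  list(ok = ok, lines = lines[nzchar(lines)])
}

# Stand-in result; the bucket names are made up for illustration.
fake <- list(
  status = 0L,
  stdout = "bucket-a/\nbucket-b/\n",
  stderr = "",
  timeout = FALSE
)

out <- describe_mc_result(fake)
print(out$ok)
print(out$lines)

# With the package installed, the same helper would apply to a real call:
# res <- minioclient::mc("ls play", verbose = FALSE)
# describe_mc_result(res)
```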
List all configured aliases
mc_alias_ls(alias = "")
alias |
optional argument, display only specified alias |
Note that all available
Configured aliases, including secret keys!
mc
Set a new alias for the minio client, possibly using env var defaults.
mc_alias_set( alias = "minio", endpoint = Sys.getenv("AWS_S3_ENDPOINT", "s3.amazonaws.com"), access_key = Sys.getenv("AWS_ACCESS_KEY_ID"), secret_key = Sys.getenv("AWS_SECRET_ACCESS_KEY"), scheme = "https" )
alias |
a short name for this endpoint, default is "minio" |
endpoint |
the endpoint domain name |
access_key |
access key (user), reads from AWS env vars by default |
secret_key |
secret access key, reads from AWS env vars by default |
scheme |
https or http (e.g. for local machine only) |
Returns the list from processx::run(), with components status, stdout, stderr, and timeout; invisibly.
https://min.io/docs/minio/linux/reference/minio-mc.html. Note that keys can be omitted for anonymous use.
mc_alias_set()
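The defaults in the usage line rely on Sys.getenv() with a fallback value. The sketch below illustrates that resolution in base R only. The endpoint "localhost:9000" is a hypothetical local server rather than a value from the package documentation, and the mc_alias_set() call is left commented out because it needs the mc binary and a reachable server.

```r
# Remember the current value so it can be restored afterwards.
old <- Sys.getenv("AWS_S3_ENDPOINT", unset = NA)

# With the variable unset, the fallback from the usage line is returned.
Sys.unsetenv("AWS_S3_ENDPOINT")
default_endpoint <- Sys.getenv("AWS_S3_ENDPOINT", "s3.amazonaws.com")

# Once set, the environment variable takes precedence over the fallback.
Sys.setenv(AWS_S3_ENDPOINT = "localhost:9000")  # hypothetical local server
custom_endpoint <- Sys.getenv("AWS_S3_ENDPOINT", "s3.amazonaws.com")

print(c(default_endpoint, custom_endpoint))

# With the package installed and such a server running, an alias could be set:
# minioclient::mc_alias_set("local", scheme = "http")

# Restore the original environment.
if (is.na(old)) {
  Sys.unsetenv("AWS_S3_ENDPOINT")
} else {
  Sys.setenv(AWS_S3_ENDPOINT = old)
}
```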
This function uses the mc command to set the anonymous access policy for a specified target.
mc_anonymous_set( target, policy = c("download", "upload", "public", "private"), verbose = interactive() )
target |
Character string specifying the target cloud storage bucket or object |
policy |
Character string specifying the anonymous access policy. Must be one of "download", "upload", "public" (upload and download), or "private". |
verbose |
print output? |
Returns the list from processx::run(), with components status, stdout, stderr, and timeout; invisibly.
# create a test bucket on the 'play' server
mc_mb("play/minioclient-test")
# Set anonymous access policy to download
mc_anonymous_set("play/minioclient-test/file.txt", policy = "download")
# Set anonymous access policy to upload
mc_anonymous_set("play/minioclient-test/directory", policy = "upload")
# Set anonymous access policy to public
mc_anonymous_set("play/minioclient-test/file.txt", policy = "public")
# Set anonymous access policy to private (default policy for new buckets)
mc_anonymous_set("play/minioclient-test/directory", policy = "private")
mc_rb("play/minioclient-test")
The cat command returns the contents of the object as a string. This can be useful when reading smaller files (without first downloading to disk).
mc_cat(target, offset = 0, tail = 0, flags = "")
target |
character string specifying the target directory path. |
offset |
start offset, default 0 if not specified |
tail |
tail number of bytes at ending of file, default 0 if not specified |
flags |
additional flags to be passed to the |
a character string with the contents of the file
# upload a file to a bucket and read it back
install_mc()
mc_mb("play/mcr")
mc_cp(system.file(package = "minioclient", "DESCRIPTION"), "play/mcr/DESCRIPTION")
mc_cat("play/mcr/DESCRIPTION")
Edit the config files, e.g. to add a sessionToken
mc_config_set(alias, key, value, json = file.path(minio_path(), "config.json"))
alias |
A configured alias, see |
key |
the parameter name, e.g. |
value |
the value to set the parameter to |
json |
path to the config |
Updates configuration and returns silently (NULL).
mc_config_set("play", key="sessionToken", value="MyTmpSessionToken")
Most commonly used to upload and download files between local filesystem and remote S3 store.
mc_cp(from, to = "", recursive = FALSE, flags = "", verbose = FALSE)
from |
Character string specifying the source file or directory path. Can accept a vector of file paths as well. |
to |
Character string specifying the destination path. |
recursive |
Logical indicating whether to recursively copy directories. Default is FALSE. |
flags |
any additional flags to |
verbose |
Logical indicating whether to report files copied. Default is FALSE. |
See mc("cp -h") for details.
Returns the list from processx::run(), with components status, stdout, stderr, and timeout; invisibly.
mc_mirror
# Copy a file
mc_cp("local/path/to/file.txt", "alias/bucket/path/file.txt")
# Copy a directory recursively
mc_cp("local/directory", "alias/bucket/path/to/directory", recursive = TRUE)
Show disk usage for a target path
mc_du(target, flags = "")
target |
alias/bucket to list |
flags |
optional additional flags |
for more help, run mc_du("-h")
Returns the list from processx::run(), with components status, stdout, stderr, and timeout; invisibly.
# create a new bucket
mc_mb("play/minioclient-test")
# no disk usage on new bucket
mc_du("play/minioclient-test")
# clean up
mc_rb("play/minioclient-test")
The head command returns the first n lines of the object as a string. This can be useful when inspecting the content of a large file (without first having to download and store it on disk locally).
mc_head(target, n = 10, flags = "")
target |
character string specifying the target directory path. |
n |
integer number of lines to read from the beginning, by default 10 |
flags |
additional flags to be passed to the |
a character string with the contents of the file
# upload a CSV file
install_mc()
tf <- tempfile()
write.csv(iris, tf, row.names = FALSE)
mc_mb("play/iris")
mc_cp(tf, "play/iris/iris.csv")
# read first 13 lines from the CSV (header + 12 rows of data)
read.csv(text = mc_head("play/iris/iris.csv", n = 13))
This function uses the mc command to list files and directories at the specified target location.
mc_ls(target, recursive = FALSE, details = FALSE)
target |
character vector specifying the target directory path(s). |
recursive |
Logical indicating whether to recursively list directories. Default is FALSE. |
details |
logical, by default FALSE; if TRUE a data frame with details for the directory listing is returned. |
a vector of file or directory names ("keys" in minio parlance) or, if details is TRUE, a data.frame with the directory listing information
# list all buckets on play server
mc_ls("play/")
mc_ls("play", details = TRUE)
Create a new S3 bucket using mc command
mc_mb(bucket, ignore_existing = TRUE, flags = "", verbose = TRUE)
bucket |
Character string specifying the name of the bucket to create. |
ignore_existing |
do not error if bucket already exists |
flags |
additional flags, see |
verbose |
print output? |
Returns the list from processx::run(), with components status, stdout, stderr, and timeout; invisibly.
# Create a new bucket named "my-bucket"
mc_mb("play/my-bucket")
This function uses the mc command to mirror files and directories from one location to another.
mc_mirror( from, to, overwrite = FALSE, remove = FALSE, flags = "", verbose = FALSE )
from |
Character string specifying the source file or directory path. |
to |
Character string specifying the destination path. |
overwrite |
Logical indicating whether to overwrite existing files. Default is FALSE. |
remove |
Logical indicating whether to remove extraneous files from the destination. Default is FALSE. |
flags |
Additional flags to be passed to the |
verbose |
Logical indicating whether to display verbose output. Default is FALSE. |
Returns the list from processx::run(), with components status, stdout, stderr, and timeout; invisibly.
# Mirror files and directories from source to destination
mc_mirror("path/to/source", "path/to/destination")
# Mirror files and directories with overwrite and remove options
mc_mirror("path/to/source", "path/to/destination", overwrite = TRUE, remove = TRUE)
# Mirror files and directories with additional flags and verbose output
mc_mirror("path/to/source", "path/to/destination", flags = "--exclude '*.txt'", verbose = TRUE)
Move or rename files or directories between servers
mc_mv(from, to, recursive = FALSE, flags = "", verbose = FALSE)
from |
Character string specifying the source file or directory path. Can accept a vector of file paths as well. |
to |
Character string specifying the destination path. |
recursive |
Logical indicating whether to recursively move directories. Default is FALSE. |
flags |
any additional flags to |
verbose |
Logical indicating whether to report files moved. Default is FALSE. |
See mc("mv -h") for details.
Returns the list from processx::run(), with components status, stdout, stderr, and timeout; invisibly.
mc_cp
# move a file
mc_mv("local/path/to/file.txt", "alias/bucket/path/file.txt")
# move a directory recursively
mc_mv("local/directory", "alias/bucket/path/to/directory", recursive = TRUE)
Remove an S3 bucket using mc command
mc_rb(bucket, force = FALSE)
bucket |
Character string specifying the name of the bucket to remove |
force |
Delete bucket without confirmation in non-interactive mode |
Returns the list from processx::run(), with components status, stdout, stderr, and timeout; invisibly.
# Create a new bucket named "my-bucket" on the "play" system
mc_mb("play/my-bucket")
mc_rb("play/my-bucket")
This function uses the mc command to remove files or directories at the specified target location.
mc_rm(target, recursive = FALSE, flags = "", verbose = FALSE)
target |
Character string specifying the target file or directory path to be removed. |
recursive |
Logical indicating whether to recursively remove directories. Default is FALSE. |
flags |
Additional flags to be passed to the |
verbose |
Logical indicating whether to list files removed. Default is FALSE. |
See mc("rm -h") for details.
Returns the list from processx::run(), with components status, stdout, stderr, and timeout; invisibly.
# Remove a file
mc_rm("path/to/file.txt")
# Remove a directory recursively
mc_rm("path/to/directory", recursive = TRUE)
The S3 Select API can be used against CSV and JSON objects stored in minio. If the minio server runs with MINIO_API_SELECT_PARQUET=on, parquet files can also be queried.
mc_sql( target, query = "select * from S3Object", recursive = TRUE, verbose = FALSE )
target |
character alias or path specification at minio for the object (a .csv, .json or .parquet file) |
query |
character string with sql query, by default "select * from S3Object" |
recursive |
logical, by default TRUE, allowing an S3 Select query to work across a minio ALIAS/PATH specification |
verbose |
logical, by default FALSE |
See https://min.io/docs/minio/linux/reference/minio-mc/mc-sql.html# and https://github.com/minio/minio/blob/master/docs/select/README.md
For example "select s.* from S3Object s limit 10" is valid syntax.
More examples of query syntax here: https://docs.aws.amazon.com/AmazonS3/latest/userguide/s3-select-sql-reference-select.html
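Query strings like the one above can be assembled programmatically. The sketch below uses a hypothetical helper, s3_select_query(), which is not part of the package. It assumes the positional column names (_1, _2, ...) from the S3 Select SQL reference. Whether these or header names apply depends on how the CSV input is configured, so treat the column form as an assumption. The mc_sql() call is left commented out because it needs a reachable server.

```r
# Hypothetical helper (not part of minioclient): build an S3 Select query.
s3_select_query <- function(columns = NULL, limit = NULL) {
  # Select everything unless specific columns are requested.
  cols <- if (is.null(columns)) {
    "s.*"
  } else {
    paste0("s.", columns, collapse = ", ")
  }
  query <- paste("select", cols, "from S3Object s")
  # Append an optional row limit.
  if (!is.null(limit)) {
    query <- paste(query, "limit", as.integer(limit))
  }
  query
}

q_all <- s3_select_query(limit = 10)
q_cols <- s3_select_query(columns = c("_1", "_2"), limit = 5)
print(q_all)
print(q_cols)

# With the package installed and the object uploaded, the query could be run:
# minioclient::mc_sql("play/iris/iris.csv", query = q_all)
```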
SQL query results as a data.frame of class tbl_df
install_mc()
# upload a CSV file
tf <- tempfile()
write.csv(iris, tf, row.names = FALSE)
mc_mb("play/iris")
mc_cp(tf, "play/iris/iris.csv")
# read first 12 lines from the CSV
mc_sql("play/iris/iris.csv", query = "select * from S3Object limit 12")