TileDB C++ API Reference
Context
-
class
Context
¶ A TileDB context wraps a TileDB storage manager “instance.” Most objects and functions will require a Context.
Internal error handling is also defined by the Context; the default error handler throws a TileDBError with a specific message.
Example:
    tiledb::Context ctx;
    // Use ctx when creating other objects:
    tiledb::ArraySchema schema(ctx, TILEDB_SPARSE);
    // Set a custom error handler:
    ctx.set_error_handler([](const std::string &msg) {
      std::cerr << msg << std::endl;
    });
Public Functions
-
Context
()¶ Constructor. Creates a TileDB Context with default configuration.
- Exceptions
TileDBError
: if construction fails
-
Context
(const Config &config)¶ Constructor. Creates a TileDB context with the given configuration.
- Exceptions
TileDBError
: if construction fails
-
void
handle_error
(int rc) const¶ Error handler for the TileDB C API calls. Throws an exception in case of error.
- Parameters
rc
: If != TILEDB_OK, calls error handler
-
std::shared_ptr<tiledb_ctx_t>
ptr
() const¶ Returns the C TileDB context object.
-
Context &
set_error_handler
(const std::function<void(const std::string&)> &fn) Sets the error handler callback. If none is set, default_error_handler is used. The callback accepts an error message.
- Return
- Reference to this Context
- Parameters
fn
: Error handler callback function
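The callback mechanism described for handle_error and set_error_handler can be pictured with a small stand-in class. This is only a sketch of the pattern using the standard library: MiniContext, kOk, and the two-argument handle_error are hypothetical names invented for illustration, and std::runtime_error stands in for TileDBError. The real Context retrieves the error message from the C API itself.

```cpp
#include <functional>
#include <stdexcept>
#include <string>

// Plays the role of TILEDB_OK in this sketch (hypothetical constant).
constexpr int kOk = 0;

// Hypothetical stand-in that mimics the documented error-handling behavior.
class MiniContext {
 public:
  MiniContext() : handler_(default_error_handler) {}

  // Replaces the callback and returns *this so calls can be chained.
  MiniContext& set_error_handler(
      const std::function<void(const std::string&)>& fn) {
    handler_ = fn;
    return *this;
  }

  // Invokes the handler only when the return code signals a failure.
  void handle_error(int rc, const std::string& msg) const {
    if (rc != kOk) handler_(msg);
  }

  // Default behavior: throw an exception carrying the message.
  static void default_error_handler(const std::string& msg) {
    throw std::runtime_error(msg);
  }

 private:
  std::function<void(const std::string&)> handler_;
};
```

With the default handler a failing return code throws; after set_error_handler is called with a logging lambda, the same failure is routed to the lambda instead.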
-
bool
is_supported_fs
(tiledb_filesystem_t fs) const¶ Return true if the given filesystem backend is supported.
Example:
    tiledb::Context ctx;
    bool s3_supported = ctx.is_supported_fs(TILEDB_S3);
- Parameters
fs
: Filesystem to check
-
void
cancel_tasks
() const¶ Cancels all background or async tasks associated with this context.
-
void
set_tag
(const std::string &key, const std::string &value)¶ Sets a string/string KV tag on the context.
Public Static Functions
-
static void
default_error_handler
(const std::string &msg)¶ The default error handler callback.
- Exceptions
TileDBError
: with the error message
-
Config
-
class
Config
¶ Carries configuration parameters for a context.
Example:
    Config conf;
    conf["vfs.s3.region"] = "us-east-1";
    conf["vfs.s3.use_virtual_addressing"] = "true";
    Context ctx(conf);
    // array/kv operations with ctx
Public Functions
-
Config
(const std::string &filename) Constructor that takes as input a filename (URI) that stores the config parameters. The file must have the following (text) format:
{parameter} {value}
Anything following a # character is considered a comment and, thus, is ignored.
See Config::set for the various TileDB config parameters and allowed values.
- Parameters
filename
: The name of the file where the parameters will be read from.
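To make the file format concrete, here is a minimal sketch of a reader for "{parameter} {value}" lines with # comments. It is not TileDB's parser: parse_config_text is a hypothetical helper, and details such as how TileDB treats blank lines or values containing spaces are not stated above, so the choices made here (skip blank lines, take the first two whitespace-separated fields) are assumptions.

```cpp
#include <istream>
#include <map>
#include <sstream>
#include <string>

// Hypothetical helper, not part of TileDB: reads "{parameter} {value}" lines
// and ignores everything that follows a '#' on a line.
std::map<std::string, std::string> parse_config_text(std::istream& in) {
  std::map<std::string, std::string> params;
  std::string line;
  while (std::getline(in, line)) {
    // Strip the comment part, if any.
    const auto hash = line.find('#');
    if (hash != std::string::npos) line.erase(hash);

    // Take the first two whitespace-separated fields as key and value.
    std::istringstream fields(line);
    std::string key, value;
    if (fields >> key >> value) params[key] = value;
  }
  return params;
}
```

Lines that are empty or contain only a comment produce no entry, so a file can mix settings and explanatory comments freely.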
-
Config
(tiledb_config_t **config)¶ Constructor from a C config object.
-
void
save_to_file
(const std::string filename)¶ Saves the config parameters to a (local) text file.
-
std::shared_ptr<tiledb_config_t>
ptr
() const¶ Returns the pointer to the TileDB C config object.
-
Config &
set
(const std::string &param, const std::string &value) Sets a config parameter.
Parameters
sm.dedup_coords
If true, cells with duplicate coordinates will be removed during sparse array writes. Note that ties during deduplication are arbitrary. Default: false
sm.check_coord_dups
This is applicable only if sm.dedup_coords is false. If true, an error will be thrown if there are cells with duplicate coordinates during sparse array writes. If false and there are duplicates, the duplicates will be written without errors, but the TileDB behavior could be unpredictable. Default: true
sm.check_coord_oob
If true, an error will be thrown if there are cells with coordinates falling outside the array domain during sparse array writes. Default: true
sm.check_global_order
Checks if the coordinates obey the global array order. Applicable only to sparse writes in global order. Default: true
sm.tile_cache_size
The tile cache size in bytes. Any uint64_t value is acceptable. Default: 10,000,000
sm.array_schema_cache_size
The array schema cache size in bytes. Any uint64_t value is acceptable. Default: 10,000,000
sm.fragment_metadata_cache_size
The fragment metadata cache size in bytes. Any uint64_t value is acceptable. Default: 10,000,000
sm.enable_signal_handlers
Whether or not TileDB will install signal handlers. Default: true
sm.num_async_threads
The number of threads allocated for async queries. Default: 1
sm.num_reader_threads
The number of threads allocated for issuing reads to VFS in parallel. Default: 1
sm.num_writer_threads
The number of threads allocated for issuing writes to VFS in parallel. Default: 1
sm.num_tbb_threads
The number of threads allocated for the TBB thread pool (if TBB is enabled). Note: this is a whole-program setting. Usually this should not be modified from the default. See also the documentation for TBB’s task_scheduler_init class. Default: TBB automatic
sm.consolidation.amplification
The factor by which the size of the dense fragment resulting from consolidating a set of fragments (containing at least one dense fragment) can be amplified. This is important when the union of the non-empty domains of the fragments to be consolidated has a lot of empty cells, which the consolidated fragment will have to fill with the special fill value (since the resulting fragment is dense). Default: 1.0
sm.consolidation.buffer_size
The size (in bytes) of the attribute buffers used during consolidation. Default: 50,000,000
sm.consolidation.steps
The number of consolidation steps to be performed when executing the consolidation algorithm. Default: 1
sm.consolidation.step_min_frags
The minimum number of fragments to consolidate in a single step. Default: UINT32_MAX
sm.consolidation.step_max_frags
The maximum number of fragments to consolidate in a single step. Default: UINT32_MAX
sm.consolidation.step_size_ratio
The size ratio that two (“adjacent”) fragments must satisfy to be considered for consolidation in a single step. Default: 0.0
sm.memory_budget
The memory budget for tiles of fixed-sized attributes (or offsets for var-sized attributes) to be fetched during reads. Default: 5GB
sm.memory_budget_var
The memory budget for tiles of var-sized attributes to be fetched during reads. Default: 10GB
vfs.num_threads
The number of threads allocated for VFS operations (any backend), per VFS instance. Default: number of cores
vfs.min_parallel_size
The minimum number of bytes in a parallel VFS operation (except parallel S3 writes, which are controlled by vfs.s3.multipart_part_size). Default: 10MB
vfs.min_batch_size
The minimum number of bytes in a VFS read operation. Default: 20MB
vfs.min_batch_gap
The minimum number of bytes between two VFS read batches. Default: 500KB
vfs.file.max_parallel_ops
The maximum number of parallel operations on objects with file:/// URIs. Default: vfs.num_threads
vfs.file.enable_filelocks
If set to false, file locking operations are no-ops for file:/// URIs in VFS. Default: true
vfs.s3.region
The S3 region, if S3 is enabled. Default: us-east-1
vfs.s3.aws_access_key_id
Set the AWS_ACCESS_KEY_ID. Default: ""
vfs.s3.aws_secret_access_key
Set the AWS_SECRET_ACCESS_KEY. Default: ""
vfs.s3.aws_session_token
Set the AWS_SESSION_TOKEN. Default: ""
vfs.s3.scheme
The S3 scheme (http or https), if S3 is enabled. Default: https
vfs.s3.endpoint_override
The S3 endpoint, if S3 is enabled. Default: ""
vfs.s3.use_virtual_addressing
The S3 use of virtual addressing (true or false), if S3 is enabled. Default: true
vfs.s3.max_parallel_ops
The maximum number of S3 backend parallel operations. Default: vfs.num_threads
vfs.s3.multipart_part_size
The part size (in bytes) used in S3 multipart writes. Any uint64_t value is acceptable. Note: vfs.s3.multipart_part_size * vfs.s3.max_parallel_ops bytes will be buffered before issuing multipart uploads in parallel. Default: 5MB
vfs.s3.ca_file
Path to SSL/TLS certificate file to be used by cURL for S3 HTTPS encryption. Follows cURL conventions: https://curl.haxx.se/docs/manpage.html Default: ""
vfs.s3.ca_path
Path to SSL/TLS certificate directory to be used by cURL for S3 HTTPS encryption. Follows cURL conventions: https://curl.haxx.se/docs/manpage.html Default: ""
vfs.s3.connect_timeout_ms
The connection timeout in ms. Any long value is acceptable. Default: 3000
vfs.s3.connect_max_tries
The maximum tries for a connection. Any long value is acceptable. Default: 5
vfs.s3.connect_scale_factor
The scale factor for exponential backoff when connecting to S3. Any long value is acceptable. Default: 25
vfs.s3.logging_level
The AWS SDK logging level. This is a process-global setting. The configuration of the most recently constructed context will set process state. Log files are written to the process working directory. Default: off
vfs.s3.request_timeout_ms
The request timeout in ms. Any long value is acceptable. Default: 3000
vfs.s3.proxy_host
The proxy host. Default: ""
vfs.s3.proxy_port
The proxy port. Default: 0
vfs.s3.proxy_scheme
The proxy scheme. Default: "https"
vfs.s3.proxy_username
The proxy username. Note: this parameter is not serialized by tiledb_config_save_to_file. Default: ""
vfs.s3.proxy_password
The proxy password. Note: this parameter is not serialized by tiledb_config_save_to_file. Default: ""
vfs.s3.verify_ssl
Enable HTTPS certificate verification. Default: true
vfs.hdfs.name_node
Name node for HDFS. Default: ""
vfs.hdfs.username
HDFS username. Default: ""
vfs.hdfs.kerb_ticket_cache_path
HDFS kerb ticket cache path. Default: ""
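The note on vfs.s3.multipart_part_size implies a simple memory estimate: part size times the number of parallel S3 operations. The helper below is a hypothetical illustration of that arithmetic, not a TileDB function. The usage figure assumes that the 5MB default means 5 * 1024 * 1024 bytes and that vfs.s3.max_parallel_ops resolves to 4; both numbers are assumptions chosen for the example.

```cpp
#include <cstdint>

// Hypothetical helper: bytes buffered before multipart uploads are issued in
// parallel, i.e. vfs.s3.multipart_part_size * vfs.s3.max_parallel_ops.
constexpr std::uint64_t s3_multipart_buffer_bytes(
    std::uint64_t part_size_bytes, std::uint64_t max_parallel_ops) {
  return part_size_bytes * max_parallel_ops;
}
```

Under the stated assumptions, s3_multipart_buffer_bytes(5 * 1024 * 1024, 4) evaluates to 20,971,520 bytes, so raising either setting raises the write buffer proportionally.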
-
std::string
get
(const std::string &param) const Get a parameter from the configuration by key.
- Return
- Value of configuration parameter
- Parameters
param
: Name of configuration parameter
- Exceptions
TileDBError
: if the parameter does not exist
-
impl::ConfigProxy
operator[]
(const std::string &param) Operator that enables setting parameters with [].
Example:
    Config conf;
    conf["vfs.s3.region"] = "us-east-1";
    conf["vfs.s3.use_virtual_addressing"] = "true";
    Context ctx(conf);
- Return
- ”Proxy” object supporting assignment.
- Parameters
param
: Name of parameter to set
-
Config &
unset
(const std::string &param) Resets a config parameter to its default value.
- Return
- Reference to this Config instance
- Parameters
param
: Name of parameter
-
iterator
begin
(const std::string &prefix)¶ Iterate over params starting with a prefix.
Example:
    tiledb::Config config;
    for (auto it = config.begin("vfs"), ite = config.end(); it != ite; ++it) {
      std::string name = it->first, value = it->second;
    }
- Return
- iterator
- Parameters
prefix
: Prefix to iterate over
-
iterator
begin
()¶ Iterate over all params.
Example:
    tiledb::Config config;
    for (auto it = config.begin(), ite = config.end(); it != ite; ++it) {
      std::string name = it->first, value = it->second;
    }
- Return
- iterator
-
iterator
end
()¶ End iterator.
Public Static Functions
-
static void
free
(tiledb_config_t *config)¶ Wrapper function for freeing a config C object.
-
Exceptions
-
struct
TileDBError
: public runtime_error¶ Exception indicating a TileDB error.
Subclassed by tiledb::AttributeError, tiledb::SchemaMismatch, tiledb::TypeError
-
struct
TypeError
: public tiledb::TileDBError¶ Exception indicating a mismatch between a static and runtime type
-
struct
SchemaMismatch
: public tiledb::TileDBError¶ Exception indicating the requested operation does not match array schema
-
struct
AttributeError
: public tiledb::TileDBError¶ Error related to attributes
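Because the three specific exceptions derive from TileDBError, catch clauses need to list the derived types before the base type. The sketch below demonstrates the ordering with stand-in structs in a namespace called sketch; these mirror the inheritance listed above but are not the real tiledb types, and classify is a hypothetical helper.

```cpp
#include <stdexcept>
#include <string>

namespace sketch {

// Stand-ins mirroring the documented hierarchy (the real types are in tiledb::).
struct TileDBError : std::runtime_error {
  explicit TileDBError(const std::string& msg) : std::runtime_error(msg) {}
};
struct TypeError : TileDBError {
  explicit TypeError(const std::string& msg) : TileDBError(msg) {}
};
struct SchemaMismatch : TileDBError {
  explicit SchemaMismatch(const std::string& msg) : TileDBError(msg) {}
};
struct AttributeError : TileDBError {
  explicit AttributeError(const std::string& msg) : TileDBError(msg) {}
};

// Runs op and reports which handler caught the failure. Derived types come
// first; otherwise the TileDBError clause would catch everything.
template <typename Fn>
std::string classify(Fn&& op) {
  try {
    op();
  } catch (const TypeError&) {
    return "type";
  } catch (const SchemaMismatch&) {
    return "schema";
  } catch (const AttributeError&) {
    return "attribute";
  } catch (const TileDBError&) {
    return "tiledb";
  } catch (const std::exception&) {
    return "other";
  }
  return "ok";
}

}  // namespace sketch
```

Code that only needs the message can catch TileDBError (or std::runtime_error) alone and call what().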
Dimension
-
class
Dimension
¶ Describes one dimension of an Array. The dimension consists of a type, lower and upper bound, and tile-extent describing the memory ordering. Dimensions are added to a Domain.
Example:
    tiledb::Context ctx;
    tiledb::Domain domain(ctx);
    // Create a dimension with inclusive domain [0,1000] and tile extent 100.
    domain.add_dimension(Dimension::create<int32_t>(ctx, "d", {{0, 1000}}, 100));
Public Functions
-
const std::string
name
() const¶ Returns the name of the dimension.
-
tiledb_datatype_t
type
() const¶ Returns the dimension datatype.
-
template<typename
T
>
std::pair<T, T>domain
() const¶ Returns the domain of the dimension.
- Return
- Pair of [lower, upper] inclusive bounds.
- Template Parameters
T
: Domain datatype
-
std::string
domain_to_str
() const¶ Returns a string representation of the domain.
- Exceptions
TileDBError
: if the domain cannot be stringified (TILEDB_ANY)
-
std::string
tile_extent_to_str
() const¶ Returns a string representation of the extent.
- Exceptions
TileDBError
: if the tile extent cannot be stringified (TILEDB_ANY)
-
std::shared_ptr<tiledb_dimension_t>
ptr
() const¶ Returns a shared pointer to the C TileDB dimension object.
Public Static Functions
-
template<typename
T
>
static Dimensioncreate
(const Context &ctx, const std::string &name, const std::array<T, 2> &domain, T extent)¶ Factory function for creating a new dimension with datatype T.
Example:
    tiledb::Context ctx;
    // Create a dimension with inclusive domain [0,1000] and tile extent 100.
    auto dim = Dimension::create<int32_t>(ctx, "d", {{0, 1000}}, 100);
- Return
- A new Dimension object.
- Template Parameters
T
: int, char, etc…
- Parameters
ctx
: The TileDB context.
name
: The dimension name.
domain
: The dimension domain. A pair [lower,upper] of inclusive bounds.
extent
: The tile extent on the dimension.
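One way to read the domain and extent parameters together: the tile extent cuts the inclusive range [lower, upper] into consecutive tiles. The helpers below sketch that arithmetic under the assumption that tiles of extent cells are laid out back to back starting at the lower bound. tile_index and tile_count are hypothetical names, not TileDB functions, and coord is assumed to be at least lower.

```cpp
#include <cstdint>

// Hypothetical helper: index of the tile that holds coord on one dimension,
// assuming tiles of `extent` cells start at `lower` (requires coord >= lower).
constexpr std::int64_t tile_index(std::int64_t coord, std::int64_t lower,
                                  std::int64_t extent) {
  return (coord - lower) / extent;
}

// Hypothetical helper: number of tiles needed to cover the inclusive range
// [lower, upper], i.e. ceil((upper - lower + 1) / extent).
constexpr std::int64_t tile_count(std::int64_t lower, std::int64_t upper,
                                  std::int64_t extent) {
  return (upper - lower + extent) / extent;
}
```

For the example dimension with domain [0,1000] and extent 100, this arithmetic places coordinate 250 in tile 2 and needs 11 tiles to cover all 1001 cells, the last of which is only partially filled.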
-
static Dimension
create
(const Context &ctx, const std::string &name, tiledb_datatype_t datatype, const void *domain, const void *extent)¶ Factory function for creating a new dimension (non typechecked).
- Return
- A new Dimension object.
- Parameters
ctx
: The TileDB context.
name
: The dimension name.
datatype
: The dimension datatype.
domain
: The dimension domain. A pair [lower,upper] of inclusive bounds.
extent
: The tile extent on the dimension.
Domain
-
class
Domain
¶ Represents the domain of an array.
A Domain defines the set of Dimension objects for a given array. The properties of a Domain derive from the underlying dimensions. A Domain is a component of an ArraySchema.
- Note
- The dimension can only be signed or unsigned integral types, as well as floating point for sparse array domains.
Example:
    tiledb::Context ctx;
    tiledb::Domain domain(ctx);
    // Note the dimension bounds are inclusive. The last argument is the tile
    // extent required by Dimension::create; the values are illustrative.
    auto d1 = tiledb::Dimension::create<int>(ctx, "d1", {{-10, 10}}, 7);
    auto d2 = tiledb::Dimension::create<int>(ctx, "d2", {{1, 10}}, 5);
    auto d3 = tiledb::Dimension::create<uint64_t>(ctx, "d3", {{1, 100}}, 10);
    domain.add_dimension(d1);
    domain.add_dimension(d2);
    // domain.add_dimension(d3); // Would throw: all dims must be same type
    domain.cell_num(); // (10 - -10 + 1) * (10 - 1 + 1) = 210 max cells
    domain.type(); // TILEDB_INT32, determined from the dimensions
    domain.ndim(); // 2, d1 and d2
    tiledb::ArraySchema schema(ctx, TILEDB_DENSE);
    schema.set_domain(domain); // Set the array's domain
Public Functions
-
uint64_t
cell_num
() const Returns the total number of cells in the domain. Throws an exception if the domain type is float32 or float64.
- Exceptions
TileDBError
: if cell_num cannot be computed.
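The cell count of an integral domain is the product of (upper - lower + 1) over its dimensions, which is the arithmetic behind the "210 max cells" comment in the Domain example. A minimal sketch follows; cell_count is a hypothetical helper rather than the library's implementation, and it does not guard against overflow for very large domains.

```cpp
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical helper: multiplies the number of cells along each dimension,
// where each pair holds inclusive [lower, upper] integral bounds.
std::uint64_t cell_count(
    const std::vector<std::pair<std::int64_t, std::int64_t>>& bounds) {
  std::uint64_t total = 1;
  for (const auto& b : bounds) {
    total *= static_cast<std::uint64_t>(b.second - b.first + 1);
  }
  return total;
}
```

For bounds [-10, 10] and [1, 10] this gives 21 * 10 = 210. Floating-point domains have no such count, which is why cell_num throws for them.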
-
void
dump
(FILE *out = stdout) const¶ Dumps the domain in an ASCII representation to an output.
- Parameters
out
: (Optional) File to dump output to. Defaults tostdout
.
-
tiledb_datatype_t
type
() const¶ Returns the domain type.
-
unsigned
ndim
() const¶ Returns the number of dimensions.
-
Domain &
add_dimension
(const Dimension &d)¶ Adds a new dimension to the domain.
Example:
    tiledb::Context ctx;
    tiledb::Domain domain(ctx);
    // The last argument is the tile extent (illustrative value).
    auto d1 = tiledb::Dimension::create<int>(ctx, "d1", {{-10, 10}}, 7);
    domain.add_dimension(d1);
-
template<typename ...
Args
>
Domain &add_dimensions
(Args... dims)¶ Adds multiple dimensions to the domain.
Example:
    tiledb::Context ctx;
    tiledb::Domain domain(ctx);
    // The last argument of each call is the tile extent (illustrative values).
    auto d1 = tiledb::Dimension::create<int>(ctx, "d1", {{-10, 10}}, 7);
    auto d2 = tiledb::Dimension::create<int>(ctx, "d2", {{1, 10}}, 5);
    auto d3 = tiledb::Dimension::create<int>(ctx, "d3", {{-100, 100}}, 3);
    domain.add_dimensions(d1, d2, d3);
- Return
- Reference to this Domain.
- Template Parameters
Args
: Variadic dimension datatype
- Parameters
dims
: Dimensions to add
-
bool
has_dimension
(const std::string &name) const¶ Checks if the domain has a dimension of the given name.
- Return
- True if the domain has a dimension of the given name.
- Parameters
name
: Name of dimension to check for
-
std::shared_ptr<tiledb_domain_t>
ptr
() const¶ Returns a shared pointer to the C TileDB domain object.
Attribute
-
class
Attribute
¶ Describes an attribute of an Array cell.
An attribute specifies a name and datatype for a particular value in each array cell. There are 3 supported attribute types:
- Fundamental types, such as char, int, double, uint64_t, etc.
- Fixed sized arrays: T[N] or std::array<T, N>, where T is a fundamental type
- Variable length data: std::string, std::vector<T> where T is a fundamental type
Fixed-size array types using POD types like std::array<T, N> are internally converted to byte-array attributes. E.g. an attribute of type std::array<float, 3> will be created as an attribute of type TILEDB_CHAR with cell_val_num sizeof(std::array<float, 3>).
Therefore, for fixed-length attributes it is recommended to use C-style arrays instead, e.g. float[3] instead of std::array<float, 3>.
Example:
    tiledb::Context ctx;
    auto a1 = tiledb::Attribute::create<int>(ctx, "a1");
    auto a2 = tiledb::Attribute::create<std::string>(ctx, "a2");
    auto a3 = tiledb::Attribute::create<float[3]>(ctx, "a3");
    // Change compression scheme
    tiledb::FilterList filters(ctx);
    filters.add_filter({ctx, TILEDB_FILTER_BZIP2});
    a1.set_filter_list(filters);
    // Add attributes to a schema
    tiledb::ArraySchema schema(ctx, TILEDB_DENSE);
    schema.add_attributes(a1, a2, a3);
Public Functions
-
Attribute
(const Context &ctx, const std::string &name, tiledb_datatype_t type) Construct an attribute with a name and enumerated type. cell_val_num will be set to 1.
- Parameters
ctx
: TileDB context
name
: Name of attribute
type
: Enumerated type of attribute
-
Attribute
(const Context &ctx, const std::string &name, tiledb_datatype_t type, const FilterList &filter_list)¶ Construct an attribute with an enumerated type and given filter list.
-
std::string
name
() const¶ Returns the name of the attribute.
-
tiledb_datatype_t
type
() const¶ Returns the attribute datatype.
-
uint64_t
cell_size
() const¶ Returns the size (in bytes) of one cell on this attribute. For variable-sized attributes returns TILEDB_VAR_NUM.
Example:
    tiledb::Context ctx;
    auto a1 = tiledb::Attribute::create<int>(ctx, "a1");
    auto a2 = tiledb::Attribute::create<std::string>(ctx, "a2");
    auto a3 = tiledb::Attribute::create<float[3]>(ctx, "a3");
    auto a4 = tiledb::Attribute::create<std::array<float, 3>>(ctx, "a4");
    a1.cell_size(); // Returns sizeof(int)
    a2.cell_size(); // Variable sized attribute, returns TILEDB_VAR_NUM
    a3.cell_size(); // Returns 3 * sizeof(float)
    a4.cell_size(); // Stored as byte array, returns sizeof(char).
-
unsigned
cell_val_num
() const¶ Returns number of values of one cell on this attribute. For variable-sized attributes returns TILEDB_VAR_NUM.
Example:
    tiledb::Context ctx;
    auto a1 = tiledb::Attribute::create<int>(ctx, "a1");
    auto a2 = tiledb::Attribute::create<std::string>(ctx, "a2");
    auto a3 = tiledb::Attribute::create<float[3]>(ctx, "a3");
    auto a4 = tiledb::Attribute::create<std::array<float, 3>>(ctx, "a4");
    a1.cell_val_num(); // Returns 1
    a2.cell_val_num(); // Variable sized attribute, returns TILEDB_VAR_NUM
    a3.cell_val_num(); // Returns 3
    a4.cell_val_num(); // Stored as byte array, returns sizeof(std::array<float, 3>).
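For fundamental types and C-style arrays, the documented values of cell_val_num and cell_size follow directly from the C++ type. The sketch below derives them at compile time with standard type traits. values_per_cell and fixed_cell_size are hypothetical helpers written for illustration, not TileDB's implementation, and they deliberately do not cover std::string, std::vector, or std::array attributes.

```cpp
#include <cstddef>
#include <type_traits>

// Hypothetical helper: values per cell for a fundamental type T (1) or a
// C-style array T[N] (N).
template <typename T>
constexpr std::size_t values_per_cell() {
  return std::is_array<T>::value ? std::extent<T>::value : 1;
}

// Hypothetical helper: size in bytes of one fixed-size cell, i.e. the number
// of values times the size of the element type.
template <typename T>
constexpr std::size_t fixed_cell_size() {
  return values_per_cell<T>() *
         sizeof(typename std::remove_extent<T>::type);
}
```

For float[3] this yields 3 values and 3 * sizeof(float) bytes, matching the comments for a3 in the examples above.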
-
Attribute &
set_cell_val_num
(unsigned num) Sets the number of attribute values per cell. This is inferred from the type parameter of the Attribute::create<T>() function, but can also be set manually.
Example:
    // a1 and a2 are equivalent:
    auto a1 = Attribute::create<std::vector<int>>(...);
    auto a2 = Attribute::create<int>(...);
    a2.set_cell_val_num(TILEDB_VAR_NUM);
- Return
- Reference to this Attribute
- Parameters
num
: Cell val number to set.
-
bool
variable_sized
() const¶ Check if attribute is variable sized.
-
FilterList
filter_list
() const¶ Returns a copy of the FilterList of the attribute. To change the filter list, use
set_filter_list()
.- Return
- Copy of the attribute FilterList.
-
Attribute &
set_filter_list
(const FilterList &filter_list)¶ Sets the attribute filter list, which is an ordered list of filters that will be used to process and/or transform the attribute data (such as compression).
-
std::shared_ptr<tiledb_attribute_t>
ptr
() const¶ Returns the C TileDB attribute object pointer.
-
void
dump
(FILE *out = stdout) const¶ Dumps information about the attribute in an ASCII representation to an output.
- Parameters
out
: (Optional) File to dump output to. Defaults tostdout
.
Public Static Functions
-
template<typename
T
>
static Attributecreate
(const Context &ctx, const std::string &name)¶ Factory function for creating a new attribute with datatype T.
Example:
    tiledb::Context ctx;
    auto a1 = tiledb::Attribute::create<int>(ctx, "a1");
    auto a2 = tiledb::Attribute::create<std::string>(ctx, "a2");
    auto a3 = tiledb::Attribute::create<std::array<float, 3>>(ctx, "a3");
    auto a4 = tiledb::Attribute::create<std::vector<double>>(ctx, "a4");
    auto a5 = tiledb::Attribute::create<char[8]>(ctx, "a5");
- Return
- A new Attribute object.
- Template Parameters
T
: Datatype of the attribute. Can either be arithmetic type, C-style array, std::string, std::vector, or any trivially copyable classes (defined by std::is_trivially_copyable).
- Parameters
ctx
: The TileDB context.name
: The attribute name.
-
template<typename
T
>
static Attributecreate
(const Context &ctx, const std::string &name, const FilterList &filter_list)¶ Factory function for creating a new attribute with datatype T and a FilterList.
Example:
    tiledb::Context ctx;
    tiledb::FilterList filter_list(ctx);
    filter_list.add_filter({ctx, TILEDB_FILTER_BYTESHUFFLE})
        .add_filter({ctx, TILEDB_FILTER_BZIP2});
    auto a1 = tiledb::Attribute::create<int>(ctx, "a1", filter_list);
- Return
- A new Attribute object.
- Template Parameters
T
: Datatype of the attribute. Can either be arithmetic type, C-style array,std::string
,std::vector
, or any trivially copyable classes (defined bystd::is_trivially_copyable
).
- Parameters
ctx
: The TileDB context.name
: The attribute name.filter_list
: FilterList to use for attribute
Array Schema
-
class
ArraySchema
: public tiledb::Schema¶ Schema describing an array.
The schema is an independent description of an array. A schema can be used to create multiple arrays, and stores information about its domain, cell types, and compression details. An array schema is composed of:
- A Domain
- A set of Attributes
- Memory layout definitions: tile and cell
- Compression details for Array level factors like offsets and coordinates
Example:
    tiledb::Context ctx;
    tiledb::ArraySchema schema(ctx, TILEDB_SPARSE); // Or TILEDB_DENSE
    // Create a Domain
    tiledb::Domain domain(...);
    // Create Attributes
    auto a1 = tiledb::Attribute::create(...);
    schema.set_domain(domain);
    schema.add_attribute(a1);
    // Specify tile memory layout
    schema.set_tile_order(TILEDB_ROW_MAJOR);
    // Specify cell memory layout within each tile
    schema.set_cell_order(TILEDB_ROW_MAJOR);
    schema.set_capacity(10); // For sparse, set capacity of each tile
    // Create the array on persistent storage with the schema.
    tiledb::Array::create("my_array", schema);
Public Functions
-
ArraySchema
(const Context &ctx, tiledb_array_type_t type)¶ Creates a new array schema.
Example:
    tiledb::Context ctx;
    tiledb::ArraySchema schema(ctx, TILEDB_SPARSE);
- Parameters
ctx
: TileDB context
type
: Array type, sparse or dense.
-
ArraySchema
(const Context &ctx, const std::string &uri)¶ Loads the schema of an existing array.
Example:
    tiledb::Context ctx;
    tiledb::ArraySchema schema(ctx, "s3://bucket-name/array-name");
- Parameters
ctx
: TileDB context
uri
: URI of array
-
ArraySchema
(const Context &ctx, const std::string &uri, tiledb_encryption_type_t encryption_type, const void *encryption_key, uint32_t key_length)¶ Loads the schema of an existing encrypted array.
Example:
    // Load AES-256 key from disk, environment variable, etc.
    uint8_t key[32] = ...;
    tiledb::Context ctx;
    tiledb::ArraySchema schema(ctx, "s3://bucket-name/array-name",
        TILEDB_AES_256_GCM, key, sizeof(key));
- Parameters
ctx
: TileDB context
uri
: URI of array
encryption_type
: The encryption type to use.
encryption_key
: The encryption key to use.
key_length
: Length in bytes of the encryption key.
-
ArraySchema
(const Context &ctx, const std::string &uri, tiledb_encryption_type_t encryption_type, const std::string &encryption_key)¶ Loads the schema of an existing encrypted array.
- Parameters
ctx
: TileDB contexturi
: URI of arrayencryption_type
: The encryption type to use.encryption_key
: The encryption key to use.
-
ArraySchema
(const Context &ctx, tiledb_array_schema_t *schema)¶ Loads the schema of an existing array with the input C array schema object.
- Parameters
ctx
: TileDB contextschema
: C API array schema object
-
void
dump
(FILE *out = stdout) const¶ Dumps the array schema in an ASCII representation to an output.
- Parameters
out
: (Optional) File to dump output to. Defaults tostdout
.
-
tiledb_array_type_t
array_type
() const¶ Returns the array type.
-
uint64_t
capacity
() const¶ Returns the tile capacity.
-
ArraySchema &
set_capacity
(uint64_t capacity)¶ Sets the tile capacity.
- Return
- Reference to this ArraySchema instance.
- Parameters
capacity
: The capacity of a sparse data tile. Note that sparse data tiles exist in sparse fragments, which can be created in both sparse and dense arrays. For more details, see tutorials/tiling-sparse.html.
-
tiledb_layout_t
tile_order
() const¶ Returns the tile order.
-
ArraySchema &
set_tile_order
(tiledb_layout_t layout)¶ Sets the tile order.
- Return
- Reference to this ArraySchema instance.
- Parameters
layout
: Tile order to set.
-
ArraySchema &
set_order
(const std::array<tiledb_layout_t, 2> &p) Sets both the tile and cell orders.
- Return
- Reference to this ArraySchema instance.
- Parameters
p
: Pair of {tile order, cell order}
-
tiledb_layout_t
cell_order
() const¶ Returns the cell order.
-
ArraySchema &
set_cell_order
(tiledb_layout_t layout)¶ Sets the cell order.
- Return
- Reference to this ArraySchema instance.
- Parameters
layout
: Cell order to set.
-
FilterList
coords_filter_list
() const Returns a copy of the FilterList of the coordinates. To change the coordinate compressor, use set_coords_filter_list().
- Return
- Copy of the coordinates FilterList.
-
ArraySchema &
set_coords_filter_list
(const FilterList &filter_list)¶ Sets the FilterList for the coordinates, which is an ordered list of filters that will be used to process and/or transform the coordinate data (such as compression).
Example:
    tiledb::Context ctx;
    tiledb::ArraySchema schema(ctx, TILEDB_SPARSE);
    tiledb::FilterList filter_list(ctx);
    filter_list.add_filter({ctx, TILEDB_FILTER_BYTESHUFFLE})
        .add_filter({ctx, TILEDB_FILTER_BZIP2});
    schema.set_coords_filter_list(filter_list);
- Return
- Reference to this ArraySchema instance.
- Parameters
filter_list
: FilterList to use
-
FilterList
offsets_filter_list
() const Returns a copy of the FilterList of the offsets. To change the offsets compressor, use set_offsets_filter_list().
- Return
- Copy of the offsets FilterList.
-
ArraySchema &
set_offsets_filter_list
(const FilterList &filter_list)¶ Sets the FilterList for the offsets, which is an ordered list of filters that will be used to process and/or transform the offsets data (such as compression).
Example:
    tiledb::Context ctx;
    tiledb::ArraySchema schema(ctx, TILEDB_SPARSE);
    tiledb::FilterList filter_list(ctx);
    filter_list.add_filter({ctx, TILEDB_FILTER_POSITIVE_DELTA})
        .add_filter({ctx, TILEDB_FILTER_LZ4});
    schema.set_offsets_filter_list(filter_list);
- Return
- Reference to this ArraySchema instance.
- Parameters
filter_list
: FilterList to use
-
Domain
domain
() const Returns a copy of the schema’s array Domain. To change the domain, use set_domain().
- Return
- Copy of the array Domain
-
ArraySchema &
set_domain
(const Domain &domain)¶ Sets the array domain.
Example:
    tiledb::Context ctx;
    tiledb::ArraySchema schema(ctx, TILEDB_SPARSE);
    // Create a Domain
    tiledb::Domain domain(...);
    schema.set_domain(domain);
- Return
- Reference to this ArraySchema instance.
- Parameters
domain
: Domain to use
-
ArraySchema &
add_attribute
(const Attribute &attr)¶ Adds an Attribute to the array.
Example:
    tiledb::Context ctx;
    tiledb::ArraySchema schema(ctx, TILEDB_SPARSE);
    schema.add_attribute(Attribute::create<int32_t>(ctx, "attr_name"));
- Return
- Reference to this ArraySchema instance.
- Parameters
attr
: The Attribute to add
-
std::shared_ptr<tiledb_array_schema_t>
ptr
() const Returns a shared pointer to the C TileDB array schema object.
-
void
check
() const¶ Validates the schema.
Example:
    tiledb::Context ctx;
    tiledb::ArraySchema schema(ctx, TILEDB_SPARSE);
    // Add domain, attributes, etc...
    try {
      schema.check();
    } catch (const tiledb::TileDBError& e) {
      std::cout << e.what() << "\n";
      exit(1);
    }
- Exceptions
TileDBError
: if the schema is incorrect or invalid.
-
std::unordered_map<std::string, Attribute>
attributes
() const¶ Gets all attributes in the array.
- Return
- Map of attribute name to copy of Attribute instance.
-
Attribute
attribute
(const std::string &name) const¶ Get a copy of an Attribute in the schema by name.
- Return
- Attribute
- Parameters
name
: Name of attribute
-
unsigned
attribute_num
() const¶ Returns the number of attributes in the schema.
-
Attribute
attribute
(unsigned int i) const¶ Get a copy of an Attribute in the schema by index. Attributes are ordered the same way they were defined when constructing the array schema.
- Return
- Attribute
- Parameters
i
: Index of attribute
-
bool
has_attribute
(const std::string &name) const¶ Checks if the schema has an attribute of the given name.
- Return
- True if the schema has an attribute of the given name.
- Parameters
name
: Name of attribute to check for
Array
-
class
Array
¶ Class representing a TileDB array object.
An Array object represents array data in TileDB at some persisted location, e.g. on disk, in an S3 bucket, etc. Once an array has been opened for reading or writing, interact with the data through Query objects.
Example:
    tiledb::Context ctx;
    // Create an ArraySchema, add attributes, domain, etc.
    tiledb::ArraySchema schema(...);
    // Create empty array named "my_array" on persistent storage.
    tiledb::Array::create("my_array", schema);
Public Functions
-
Array
(const Context &ctx, const std::string &array_uri, tiledb_query_type_t query_type) Constructor. This opens the array for the given query type. The destructor calls the close() method.
Example:
    // Open the array for reading
    tiledb::Context ctx;
    tiledb::Array array(ctx, "s3://bucket-name/array-name", TILEDB_READ);
- Parameters
ctx
: TileDB context.
array_uri
: The array URI.
query_type
: Query type to open the array for.
-
Array
(const Context &ctx, const std::string &array_uri, tiledb_query_type_t query_type, tiledb_encryption_type_t encryption_type, const void *encryption_key, uint32_t key_length) Constructor. This opens an encrypted array for the given query type. The destructor calls the close() method.
Example:
    // Open the encrypted array for reading
    tiledb::Context ctx;
    // Load AES-256 key from disk, environment variable, etc.
    uint8_t key[32] = ...;
    tiledb::Array array(ctx, "s3://bucket-name/array-name", TILEDB_READ,
        TILEDB_AES_256_GCM, key, sizeof(key));
- Parameters
ctx
: TileDB context.
array_uri
: The array URI.
query_type
: Query type to open the array for.
encryption_type
: The encryption type to use.
encryption_key
: The encryption key to use.
key_length
: Length in bytes of the encryption key.
-
Array
(const Context &ctx, const std::string &array_uri, tiledb_query_type_t query_type, tiledb_encryption_type_t encryption_type, const std::string &encryption_key) Constructor. This opens an encrypted array for the given query type. The destructor calls the close() method.
See Array::Array
- Parameters
ctx
: TileDB context.
array_uri
: The array URI.
query_type
: Query type to open the array for.
encryption_type
: The encryption type to use.
encryption_key
: The encryption key to use.
-
Array
(const Context &ctx, const std::string &array_uri, tiledb_query_type_t query_type, uint64_t timestamp)¶ Constructor. This opens the array for the given query type at the given timestamp. The destructor calls the
close()
method.
This constructor takes as input a timestamp, representing time in milliseconds elapsed since 1970-01-01 00:00:00 +0000 (UTC). Opening the array at a timestamp provides a view of the array with all writes/updates that happened at or before timestamp (i.e., excluding those that occurred after timestamp). This is useful for ensuring consistency in a potentially distributed setting, where machines need to operate on the same view of the array.
Example:
// Open the array for reading
tiledb::Context ctx;
// Get some `timestamp` here in milliseconds
tiledb::Array array(ctx, "s3://bucket-name/array-name", TILEDB_READ, timestamp);
- Parameters
ctx
: TileDB context.array_uri
: The array URI.query_type
: Query type to open the array for.timestamp
: The timestamp to open the array at.
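The example above leaves "get some `timestamp`" open. As a hedged sketch, a millisecond timestamp in the stated format can be produced with the standard library alone. The helper names `now_ms` and `ms_ago` are illustrative, not TileDB functions, and the code assumes `std::chrono::system_clock` counts from the Unix epoch (guaranteed since C++20, and the case in practice on mainstream platforms before that).

```cpp
#include <chrono>
#include <cstdint>

// Milliseconds elapsed since 1970-01-01 00:00:00 UTC, as a 64-bit value.
inline uint64_t now_ms() {
  using namespace std::chrono;
  return static_cast<uint64_t>(
      duration_cast<milliseconds>(system_clock::now().time_since_epoch())
          .count());
}

// A timestamp `delta_ms` milliseconds before `now`, clamped at 0 so the
// unsigned subtraction cannot wrap around.
inline uint64_t ms_ago(uint64_t now, uint64_t delta_ms) {
  return delta_ms > now ? 0 : now - delta_ms;
}
```

For instance, `ms_ago(now_ms(), 60000)` yields a timestamp one minute in the past, which could be passed as the `timestamp` argument to view the array as it was then.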
-
Array
(const Context &ctx, const std::string &array_uri, tiledb_query_type_t query_type, tiledb_encryption_type_t encryption_type, const void *encryption_key, uint32_t key_length, uint64_t timestamp)¶ Constructor. This opens the array for the given query type at the given timestamp. The destructor calls the
close()
method.
Same as Array::Array but for encrypted arrays.
Example:
// Open the encrypted array for reading
tiledb::Context ctx;
// Load AES-256 key from disk, environment variable, etc.
uint8_t key[32] = ...;
// Get some `timestamp` here in milliseconds
tiledb::Array array(ctx, "s3://bucket-name/array-name", TILEDB_READ, TILEDB_AES_256_GCM, key, sizeof(key), timestamp);
- Parameters
ctx
: TileDB context.array_uri
: The array URI.query_type
: Query type to open the array for.encryption_type
: The encryption type to use.encryption_key
: The encryption key to use.key_length
: Length in bytes of the encryption key.timestamp
: The timestamp to open the array at.
-
Array
(const Context &ctx, const std::string &array_uri, tiledb_query_type_t query_type, tiledb_encryption_type_t encryption_type, const std::string &encryption_key, uint64_t timestamp)¶ Constructor. This opens the array for the given query type at the given timestamp. The destructor calls the
close()
method.See Array::Array
-
bool
is_open
() const¶ Checks if the array is open.
-
std::string
uri
() const¶ Returns the array URI.
-
ArraySchema
schema
() const¶ Get the ArraySchema for the array.
-
std::shared_ptr<tiledb_array_t>
ptr
() const¶ Returns a shared pointer to the C TileDB array object.
-
void
open
(tiledb_query_type_t query_type)¶ Opens the array. The array is opened using a query type as input.
This indicates that queries created for this Array object will inherit the query type. In other words, an Array object is opened to receive only one type of query. It can always be closed and reopened with another query type. There may also be many different Array objects created and opened with different query types. For instance, one may create and open an array object array_read for reads and another one array_write for writes, and interleave creation and submission of queries for both of these array objects.
Example:
// Open the array for writing
tiledb::Array array(ctx, "s3://bucket-name/array-name", TILEDB_WRITE);
// Close and open again for reading.
array.close();
array.open(TILEDB_READ);
- Parameters
query_type
: The type of queries the array object will be receiving.
- Exceptions
TileDBError
: if the array is already open or other error occurred.
-
void
open
(tiledb_query_type_t query_type, tiledb_encryption_type_t encryption_type, const void *encryption_key, uint32_t key_length)¶ Opens the array, for encrypted arrays.
Example:
// Load AES-256 key from disk, environment variable, etc.
uint8_t key[32] = ...;
// Open the encrypted array for writing
tiledb::Array array(ctx, "s3://bucket-name/array-name", TILEDB_WRITE, TILEDB_AES_256_GCM, key, sizeof(key));
// Close and open again for reading.
array.close();
array.open(TILEDB_READ, TILEDB_AES_256_GCM, key, sizeof(key));
- Parameters
query_type
: The type of queries the array object will be receiving.encryption_type
: The encryption type to use.encryption_key
: The encryption key to use.key_length
: Length in bytes of the encryption key.
-
void
open
(tiledb_query_type_t query_type, tiledb_encryption_type_t encryption_type, const std::string &encryption_key)¶ Opens the array, for encrypted arrays.
See Array::open
-
void
open
(tiledb_query_type_t query_type, uint64_t timestamp)¶ Opens the array for a query type, at the given timestamp.
This function takes as input a timestamp, representing time in milliseconds elapsed since 1970-01-01 00:00:00 +0000 (UTC). Opening the array at a timestamp provides a view of the array with all writes/updates that happened at or before timestamp (i.e., excluding those that occurred after timestamp). This is useful for ensuring consistency in a potentially distributed setting, where machines need to operate on the same view of the array.
Example:
// Open the array for writing
tiledb::Array array(ctx, "s3://bucket-name/array-name", TILEDB_WRITE);
// Close and open again for reading.
array.close();
// Get some `timestamp` in milliseconds here
array.open(TILEDB_READ, timestamp);
- Parameters
query_type
: The type of queries the array object will be receiving.timestamp
: The timestamp to open the array at.
- Exceptions
TileDBError
: if the array is already open or other error occurred.
-
void
open
(tiledb_query_type_t query_type, tiledb_encryption_type_t encryption_type, const void *encryption_key, uint32_t key_length, uint64_t timestamp)¶ Opens the array for a query type, at the given timestamp.
Same as Array::open but for encrypted arrays.
Example:
// Load AES-256 key from disk, environment variable, etc.
uint8_t key[32] = ...;
// Open the encrypted array for writing
tiledb::Array array(ctx, "s3://bucket-name/array-name", TILEDB_WRITE, TILEDB_AES_256_GCM, key, sizeof(key));
// Close and open again for reading.
array.close();
// Get some `timestamp` in milliseconds here
array.open(TILEDB_READ, TILEDB_AES_256_GCM, key, sizeof(key), timestamp);
- Parameters
query_type
: The type of queries the array object will be receiving.encryption_type
: The encryption type to use.encryption_key
: The encryption key to use.key_length
: Length in bytes of the encryption key.timestamp
: The timestamp to open the array at.
-
void
open
(tiledb_query_type_t query_type, tiledb_encryption_type_t encryption_type, const std::string &encryption_key, uint64_t timestamp)¶ Opens the array for a query type, at the given timestamp.
See Array::open
-
void
reopen
()¶ Reopens the array (the array must already be open). This is useful when the array was updated after the Array object was created and opened. To sync up with the updates, the user must either close the array and open it again with open(), or just use reopen() without closing. reopen() is generally faster than the former alternative.
Note: reopening encrypted arrays does not require the encryption key.
Example:
// Open the array for reading
tiledb::Array array(ctx, "s3://bucket-name/array-name", TILEDB_READ);
array.reopen();
- Exceptions
TileDBError
: if the array was not already open or other error occurred.
-
void
reopen_at
(uint64_t timestamp)¶ Reopens the array at a specific timestamp.
Example:
// Open the array for reading
tiledb::Array array(ctx, "s3://bucket-name/array-name", TILEDB_READ);
uint64_t timestamp = tiledb_timestamp_now_ms();
array.reopen_at(timestamp);
- Exceptions
TileDBError
: if the array was not already open or other error occurred.
-
uint64_t
timestamp
() const¶ Returns the timestamp at which the array was opened.
-
void
close
()¶ Closes the array. The destructor calls this automatically.
Example:
tiledb::Array array(ctx, "s3://bucket-name/array-name", TILEDB_READ); array.close();
-
template<typename
T
>
std::vector<std::pair<std::string, std::pair<T, T>>>non_empty_domain
()¶ Retrieves the non-empty domain from the array. This is the union of the non-empty domains of the array fragments.
Example:
tiledb::Context ctx;
tiledb::Array array(ctx, "s3://bucket-name/array-name", TILEDB_READ);
// Specify the domain type (example uint32_t)
auto non_empty = array.non_empty_domain<uint32_t>();
std::cout << "Dimension named " << non_empty[0].first << " has cells in ["
          << non_empty[0].second.first << ", " << non_empty[0].second.second
          << "]" << std::endl;
- Return
- Vector of dim names with a {lower, upper} pair. Inclusive. Empty vector if the array has no data.
- Template Parameters
T
: Domain datatype
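To illustrate what "the union of the non-empty domains of the array fragments" can mean, here is a minimal sketch that merges per-fragment {lower, upper} ranges into one covering range per dimension. It assumes the union is the smallest box that covers every fragment's box; the names `Box` and `union_of_boxes` are hypothetical and this is not the library's implementation.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

// One inclusive {lower, upper} pair per dimension.
template <typename T>
using Box = std::vector<std::pair<T, T>>;

// Smallest box covering every fragment's box. Returns an empty box when
// there are no fragments, mirroring "empty vector if the array has no data".
template <typename T>
Box<T> union_of_boxes(const std::vector<Box<T>>& fragments) {
  Box<T> result;
  for (const auto& frag : fragments) {
    if (result.empty()) {
      result = frag;  // the first fragment seeds the result
      continue;
    }
    for (std::size_t d = 0; d < result.size() && d < frag.size(); ++d) {
      result[d].first = std::min(result[d].first, frag[d].first);
      result[d].second = std::max(result[d].second, frag[d].second);
    }
  }
  return result;
}
```

Two fragments covering [1, 4] x [2, 3] and [3, 8] x [0, 2] would, under this assumption, give a non-empty domain of [1, 8] x [0, 3].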
-
template<typename
T
>
std::unordered_map<std::string, std::pair<uint64_t, uint64_t>>max_buffer_elements
(const std::vector<T> &subarray)¶ Compute an upper bound on the buffer elements needed to read a subarray.
Example:
tiledb::Context ctx;
tiledb::Array array(ctx, "s3://bucket-name/array-name", TILEDB_READ);
std::vector<int> subarray = {0, 2, 0, 2};
auto max_elements = array.max_buffer_elements(subarray);
// For fixed-sized attributes, `.second` is the max number of elements
// that can be read for the attribute. Use it to size the vector.
std::vector<int> data_a1(max_elements["a1"].second);
// In sparse reads, coords are also fixed-sized attributes.
std::vector<int> coords(max_elements[TILEDB_COORDS].second);
// In variable attributes, e.g. std::string type, need two buffers,
// one for offsets and one for cell data.
std::vector<uint64_t> offsets_a2(max_elements["a2"].first);
std::vector<char> data_a2(max_elements["a2"].second);
- Return
- A map of attribute name (including
TILEDB_COORDS
) to the maximum number of elements that can be read in the given subarray. For each attribute, a pair of numbers are returned. The first, for variable-length attributes, is the maximum number of offsets for that attribute in the given subarray. For fixed-length attributes and coordinates, the first is always 0. The second is the maximum number of elements for that attribute in the given subarray. - Template Parameters
T
: The domain datatype
- Parameters
subarray
: Targeted subarray.
-
tiledb_query_type_t
query_type
() const¶ Returns the query type the array was opened with.
-
void
put_metadata
(const std::string &key, tiledb_datatype_t value_type, uint32_t value_num, const void *value)¶ It puts a metadata key-value item to an open array. The array must be opened in WRITE mode, otherwise the function will error out.
- Note
- The writes will take effect only upon closing the array.
- Parameters
key
: The key of the metadata item to be added. UTF-8 encodings are acceptable.value_type
: The datatype of the value.value_num
: The value may consist of more than one item of the same datatype. This argument indicates the number of items in the value component of the metadata.value
: The metadata value in binary form.
-
void
delete_metadata
(const std::string &key)¶ It deletes a metadata key-value item from an open array. The array must be opened in WRITE mode, otherwise the function will error out.
- Note
- The writes will take effect only upon closing the array.
- Note
- If the key does not exist, this has no effect (i.e., the function will not error out).
- Parameters
key
: The key of the metadata item to be deleted.
-
void
get_metadata
(const std::string &key, tiledb_datatype_t *value_type, uint32_t *value_num, const void **value)¶ It gets a metadata key-value item from an open array. The array must be opened in READ mode, otherwise the function will error out.
- Note
- If the key does not exist, then
value
will be NULL. - Parameters
key
: The key of the metadata item to be retrieved. UTF-8 encodings are acceptable.value_type
: The datatype of the value.value_num
: The value may consist of more than one item of the same datatype. This argument indicates the number of items in the value component of the metadata. Keys with empty values are indicated by value_num == 1 and value == NULL.value
: The metadata value in binary form.
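Because the value comes back as an untyped pointer plus an item count, the caller has to turn it into typed data. The following is a minimal sketch of that step, assuming the caller has already checked that `value_type` corresponds to the C++ type `T`. The helper `copy_metadata_value` is hypothetical. This page does not state how long the returned pointer remains valid, so the sketch copies the items as a conservative choice.

```cpp
#include <cstdint>
#include <cstring>
#include <vector>

// Copies `value_num` items of type T out of an untyped metadata buffer.
// Returns an empty vector when `value` is null (key absent or empty value).
// Assumption: the caller verified that the reported datatype matches T.
template <typename T>
std::vector<T> copy_metadata_value(const void* value, uint32_t value_num) {
  std::vector<T> out;
  if (value == nullptr || value_num == 0) return out;
  out.resize(value_num);
  std::memcpy(out.data(), value, sizeof(T) * value_num);
  return out;
}
```

For example, after a `get_metadata` call that reports an int32 value with `value_num == 3`, `copy_metadata_value<int32_t>(value, value_num)` would give a vector of three integers.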
-
bool
has_metadata
(const std::string &key, tiledb_datatype_t *value_type)¶ Checks if key exists in metadata from an open array. The array must be opened in READ mode, otherwise the function will error out.
- Return
- true if the key exists, else false.
- Note
- If the key does not exist, then
value_type
will not be modified. - Parameters
key
: The key of the metadata item to be retrieved. UTF-8 encodings are acceptable.value_type
: The datatype of the value associated with the key (if any).
-
uint64_t
metadata_num
() const¶ Returns the number of metadata items in an open array. The array must be opened in READ mode, otherwise the function will error out.
-
void
get_metadata_from_index
(uint64_t index, std::string *key, tiledb_datatype_t *value_type, uint32_t *value_num, const void **value)¶ It gets a metadata item from an open array using an index. The array must be opened in READ mode, otherwise the function will error out.
- Parameters
index
: The index used to get the metadata.key
: The metadata key.value_type
: The datatype of the value.value_num
: The value may consist of more than one item of the same datatype. This argument indicates the number of items in the value component of the metadata. Keys with empty values are indicated by value_num == 1 and value == NULL.value
: The metadata value in binary form.
Public Static Functions
-
static void
consolidate
(const Context &ctx, const std::string &uri, Config *const config = nullptr)¶ Consolidates the fragments of an array into a single fragment.
You must first finalize all queries to the array before consolidation can begin (as consolidation temporarily acquires an exclusive lock on the array).
Example:
tiledb::Array::consolidate(ctx, "s3://bucket-name/array-name");
- Parameters
ctx
: TileDB contextarray_uri
: The URI of the TileDB array to be consolidated.config
: Configuration parameters for the consolidation.
-
static void
consolidate
(const Context &ctx, const std::string &uri, tiledb_encryption_type_t encryption_type, const void *encryption_key, uint32_t key_length, Config *const config = nullptr)¶ Consolidates the fragments of an encrypted array into a single fragment.
You must first finalize all queries to the array before consolidation can begin (as consolidation temporarily acquires an exclusive lock on the array).
Example:
// Load AES-256 key from disk, environment variable, etc.
uint8_t key[32] = ...;
tiledb::Array::consolidate(ctx, "s3://bucket-name/array-name", TILEDB_AES_256_GCM, key, sizeof(key));
- Parameters
ctx
: TileDB contextarray_uri
: The URI of the TileDB array to be consolidated.encryption_type
: The encryption type to use.encryption_key
: The encryption key to use.key_length
: Length in bytes of the encryption key.config
: Configuration parameters for the consolidation.
-
static void
consolidate
(const Context &ctx, const std::string &uri, tiledb_encryption_type_t encryption_type, const std::string &encryption_key, Config *const config = nullptr)¶ See the Array::consolidate overload that takes the encryption key as a const void* together with a key length.
- Parameters
ctx
: TileDB contextarray_uri
: The URI of the TileDB array to be consolidated.encryption_type
: The encryption type to use.encryption_key
: The encryption key to use.config
: Configuration parameters for the consolidation.
-
static void
create
(const std::string &uri, const ArraySchema &schema)¶ Creates a new TileDB array given an input schema.
Example:
tiledb::Array::create("s3://bucket-name/array-name", schema);
- Parameters
uri
: URI where array will be created.schema
: The array schema.
-
static void
create
(const std::string &uri, const ArraySchema &schema, tiledb_encryption_type_t encryption_type, const void *encryption_key, uint32_t key_length)¶ Creates a new encrypted TileDB array given an input schema.
Example:
// Load AES-256 key from disk, environment variable, etc.
uint8_t key[32] = ...;
tiledb::Array::create("s3://bucket-name/array-name", schema, TILEDB_AES_256_GCM, key, sizeof(key));
- Parameters
uri
: URI where array will be created.schema
: The array schema.encryption_type
: The encryption type to use.encryption_key
: The encryption key to use.key_length
: Length in bytes of the encryption key.
-
static void
create
(const std::string &uri, const ArraySchema &schema, tiledb_encryption_type_t encryption_type, const std::string &encryption_key)¶ Creates a new encrypted TileDB array given an input schema.
See Array::create
- Parameters
uri
: URI where array will be created.schema
: The array schema.encryption_type
: The encryption type to use.encryption_key
: The encryption key to use.
-
static tiledb_encryption_type_t
encryption_type
(const Context &ctx, const std::string &array_uri)¶ Gets the encryption type the given array was created with.
Example:
tiledb_encryption_type_t enc_type = tiledb::Array::encryption_type(ctx, "s3://bucket-name/array-name");
- Return
- The encryption type of the array.
- Parameters
ctx
: TileDB context.array_uri
: The URI of the TileDB array.
-
static void
consolidate_metadata
(const Context &ctx, const std::string &uri, Config *const config = nullptr)¶ Consolidates the metadata of an array.
You must first finalize all queries to the array before consolidation can begin (as consolidation temporarily acquires an exclusive lock on the array).
Example:
tiledb::Array::consolidate_metadata(ctx, "s3://bucket-name/array-name");
- Parameters
ctx
: TileDB contextarray_uri
: The URI of the TileDB array whose metadata will be consolidated.config
: Configuration parameters for the consolidation.
-
static void
consolidate_metadata
(const Context &ctx, const std::string &uri, tiledb_encryption_type_t encryption_type, const void *encryption_key, uint32_t key_length, Config *const config = nullptr)¶ Consolidates the metadata of an encrypted array.
You must first finalize all queries to the array before consolidation can begin (as consolidation temporarily acquires an exclusive lock on the array).
Example:
// Load AES-256 key from disk, environment variable, etc.
uint8_t key[32] = ...;
tiledb::Array::consolidate_metadata(ctx, "s3://bucket-name/array-name", TILEDB_AES_256_GCM, key, sizeof(key));
- Parameters
ctx
: TileDB contextarray_uri
: The URI of the TileDB array whose metadata will be consolidated.encryption_type
: The encryption type to use.encryption_key
: The encryption key to use.key_length
: Length in bytes of the encryption key.config
: Configuration parameters for the consolidation.
-
static void
consolidate_metadata
(const Context &ctx, const std::string &uri, tiledb_encryption_type_t encryption_type, const std::string &encryption_key, Config *const config = nullptr)¶ See the Array::consolidate_metadata overload that takes the encryption key as a const void* together with a key length.
- Parameters
ctx
: TileDB contextarray_uri
: The URI of the TileDB array whose metadata will be consolidated.encryption_type
: The encryption type to use.encryption_key
: The encryption key to use.config
: Configuration parameters for the consolidation.
-
Query¶
-
class
Query
¶ Construct and execute read/write queries on a tiledb::Array.
See examples for more usage details.
Example:
// Open the array for writing
tiledb::Context ctx;
tiledb::Array array(ctx, "my_dense_array", TILEDB_WRITE);
Query query(ctx, array);
query.set_layout(TILEDB_GLOBAL_ORDER);
std::vector<int> a1_data = {1, 2, 3};
query.set_buffer("a1", a1_data);
query.submit();
query.finalize();
array.close();
Public Types
Public Functions
-
Query
(const Context &ctx, const Array &array, tiledb_query_type_t type)¶ Creates a TileDB query object.
The query type (read or write) must be the same as the type used to open the array object.
The storage manager also acquires a shared lock on the array. This means multiple read and write queries to the same array can be made concurrently (in TileDB, only consolidation requires an exclusive lock for a short period of time).
Example:
// Open the array for writing
tiledb::Context ctx;
tiledb::Array array(ctx, "my_array", TILEDB_WRITE);
Query query(ctx, array, TILEDB_WRITE);
- Parameters
ctx
: TileDB contextarray
: Open Array objecttype
: The TileDB query type
-
Query
(const Context &ctx, const Array &array)¶ Creates a TileDB query object.
The query type (read or write) is inferred from the array object, which was opened with a specific query type.
The storage manager also acquires a shared lock on the array. This means multiple read and write queries to the same array can be made concurrently (in TileDB, only consolidation requires an exclusive lock for a short period of time).
Example:
// Open the array for writing
tiledb::Context ctx;
tiledb::Array array(ctx, "my_array", TILEDB_WRITE);
Query query(ctx, array);
// Equivalent to:
// Query query(ctx, array, TILEDB_WRITE);
- Parameters
ctx
: TileDB contextarray
: Open Array object
-
std::shared_ptr<tiledb_query_t>
ptr
() const¶ Returns a shared pointer to the C TileDB query object.
-
tiledb_query_type_t
query_type
() const¶ Returns the query type (read or write).
-
Query &
set_layout
(tiledb_layout_t layout)¶ Sets the layout of the cells to be written or read.
- Return
- Reference to this Query
- Parameters
layout
: For a write query, this specifies the order of the cells provided by the user in the buffers. For a read query, this specifies the order of the cells that will be retrieved as results and stored in the user buffers. The layout can be one of the following:
TILEDB_COL_MAJOR: This means column-major order with respect to the subarray.
TILEDB_ROW_MAJOR: This means row-major order with respect to the subarray.
TILEDB_GLOBAL_ORDER: This means that cells are stored or retrieved in the array global cell order.
TILEDB_UNORDERED: This is applicable only to writes for sparse arrays, or for sparse writes to dense arrays. It specifies that the cells are unordered and, hence, TileDB must sort the cells in the global cell order prior to writing.
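To make "row-major" and "column-major order with respect to the subarray" concrete, the sketch below enumerates the cells of a small 2D subarray in both orders. It is a standalone illustration using only the standard library; `Cell`, `row_major_cells` and `col_major_cells` are hypothetical names, and the subarray bounds are inclusive as elsewhere on this page.

```cpp
#include <array>
#include <cstdint>
#include <vector>

using Cell = std::array<int64_t, 2>;  // {row, col}

// Cells of the inclusive subarray [r0, r1] x [c0, c1] in row-major order:
// the column index varies fastest.
inline std::vector<Cell> row_major_cells(int64_t r0, int64_t r1,
                                         int64_t c0, int64_t c1) {
  std::vector<Cell> cells;
  for (int64_t r = r0; r <= r1; ++r)
    for (int64_t c = c0; c <= c1; ++c)
      cells.push_back(Cell{r, c});
  return cells;
}

// The same cells in column-major order: the row index varies fastest.
inline std::vector<Cell> col_major_cells(int64_t r0, int64_t r1,
                                         int64_t c0, int64_t c1) {
  std::vector<Cell> cells;
  for (int64_t c = c0; c <= c1; ++c)
    for (int64_t r = r0; r <= r1; ++r)
      cells.push_back(Cell{r, c});
  return cells;
}
```

For the subarray [1, 2] x [1, 2], row-major order visits (1,1), (1,2), (2,1), (2,2), while column-major order visits (1,1), (2,1), (1,2), (2,2). A buffer supplied for a write, or filled by a read, would hold attribute values in the corresponding sequence.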
-
tiledb_layout_t
query_layout
() const¶ Returns the layout of the query.
-
bool
has_results
() const¶ Returns true if the query has results. Applicable only to read queries (it returns false for write queries).
-
Status
submit
()¶ Submits the query. Call will block until query is complete.
- Note
- finalize() must be invoked after finishing writing in global layout (via repeated invocations of submit()), in order to flush any internal state. For the case of reads, if the returned status is TILEDB_INCOMPLETE, TileDB could not fit the entire result in the user's buffers. In this case, the user should consume the read results (if any), optionally reset the buffers with set_buffer(), and then resubmit the query until the status becomes TILEDB_COMPLETED. If all buffer sizes are 0 after this function returns, no useful data was read into the buffers, which implies that larger buffers are needed for the query to proceed. In this case, the user must reallocate the buffers (increasing their size), reset the buffers with set_buffer(), and resubmit the query.
- Return
- Query status
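The resubmission rules in the note can be sketched as a loop. The code below does not use TileDB: `MockReadQuery` and `Status` are hypothetical stand-ins that imitate a read query returning partial results, so that the control flow (consume, grow the buffer if nothing was read, resubmit until completed) can be shown in a self-contained way.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

enum class Status { Incomplete, Completed };  // stand-in for the query status

// Hypothetical stand-in for a read query: copies `source` into the caller's
// buffer, at most `capacity` elements per submit().
struct MockReadQuery {
  std::vector<int> source;
  std::size_t pos = 0;
  int* buf = nullptr;
  std::size_t capacity = 0;
  std::size_t last_count = 0;  // elements written by the last submit()

  void set_buffer(int* b, std::size_t n) { buf = b; capacity = n; }

  Status submit() {
    last_count = std::min(capacity, source.size() - pos);
    std::copy_n(source.begin() + pos, last_count, buf);
    pos += last_count;
    return pos == source.size() ? Status::Completed : Status::Incomplete;
  }
};

// Drains the query: consume each partial result, enlarge the buffer when a
// submit produced nothing, and resubmit until the status is Completed.
inline std::vector<int> read_all(MockReadQuery& q, std::size_t initial_capacity) {
  std::vector<int> buffer(initial_capacity);
  std::vector<int> out;
  q.set_buffer(buffer.data(), buffer.size());
  for (;;) {
    Status st = q.submit();
    out.insert(out.end(), buffer.begin(), buffer.begin() + q.last_count);
    if (st == Status::Completed) break;
    if (q.last_count == 0) {
      // Nothing fit: the buffer is too small, so reallocate a larger one.
      buffer.resize(std::max<std::size_t>(1, buffer.size() * 2));
    }
    q.set_buffer(buffer.data(), buffer.size());  // reset before resubmitting
  }
  return out;
}
```

With a real query, the count of elements read would come from `result_buffer_elements()` rather than a `last_count` field, and the status comparison would use the query status values named in the note.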
-
template<typename
Fn
>
voidsubmit_async
(const Fn &callback)¶ Submit an async query, with callback. Call returns immediately.
Example:
// Create query
tiledb::Query query(...);
// Submit with callback
query.submit_async([]() { std::cout << "Callback: query completed.\n"; });
- Note
- Same notes apply as
Query::submit()
.
- Parameters
callback
: Callback function.
-
void
submit_async
()¶ Submit an async query, with no callback. Call returns immediately.
Example:
// Create query
tiledb::Query query(...);
// Submit with no callback
query.submit_async();
- Note
- Same notes apply as
Query::submit()
.
-
void
finalize
()¶ Flushes all internal state of a query object and finalizes the query. This is applicable only to global layout writes. It has no effect for any other query type.
-
std::unordered_map<std::string, std::pair<uint64_t, uint64_t>>
result_buffer_elements
() const¶ Returns the number of elements in the result buffers from a read query. This is a map from the attribute name to a pair of values.
The first is number of elements (offsets) for var size attributes, and the second is number of elements in the data buffer. For fixed sized attributes (and coordinates), the first is always 0.
For variable sized attributes: the first value is the number of cells read, i.e. the number of offsets read for the attribute. The second value is the total number of elements in the data buffer. For example, a read query on a variable-length float attribute that reads three cells would return 3 for the first number in the pair. If the total number of floats read across the three cells was 10, then the second number in the pair would be 10.
For fixed-length attributes, the first value is always 0. The second value is the total number of elements in the data buffer. For example, a read query on a single float attribute that reads three cells would return 3 for the second value. A read query on a float attribute with cell_val_num 2 that reads three cells would return 3 * 2 = 6 for the second value.
If the query has not been submitted, an empty map is returned.
Example:
// Submit a read query.
query.submit();
auto result_el = query.result_buffer_elements();
// For fixed-sized attributes, `.second` is the number of elements
// that were read for the attribute across all cells. Note: number of
// elements and not number of bytes.
auto num_a1_elements = result_el["a1"].second;
// Coords are also fixed-sized.
auto num_coords = result_el[TILEDB_COORDS].second;
// In variable attributes, e.g. std::string type, need two buffers,
// one for offsets and one for cell data ("elements").
auto num_a2_offsets = result_el["a2"].first;
auto num_a2_elements = result_el["a2"].second;
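Once the two counts for a variable-sized attribute are known, the flattened data buffer can be cut back into cells. The sketch below does this for a char attribute. It assumes each offset is the byte position at which a cell starts in the data buffer, which for char data equals the element index; `split_cells` is a hypothetical helper, not a TileDB function.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Splits a flattened char buffer into per-cell strings.
// `num_offsets` is the pair's first value (cells read) and `num_elements`
// its second value (chars read). Both buffers may be larger than that.
inline std::vector<std::string> split_cells(const std::vector<uint64_t>& offsets,
                                            const std::vector<char>& data,
                                            uint64_t num_offsets,
                                            uint64_t num_elements) {
  std::vector<std::string> cells;
  for (uint64_t i = 0; i < num_offsets; ++i) {
    uint64_t begin = offsets[i];
    // The last cell runs to the end of the data that was actually read.
    uint64_t end = (i + 1 < num_offsets) ? offsets[i + 1] : num_elements;
    cells.emplace_back(data.data() + begin,
                       static_cast<std::size_t>(end - begin));
  }
  return cells;
}
```

Only the first `num_offsets` offsets and the first `num_elements` chars are consulted, so buffers that were sized from an upper bound and only partly filled are handled correctly.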
-
template<class
T
>
Query &add_range
(uint32_t dim_idx, T start, T end, T stride = 0)¶ Adds a 1D range along a subarray dimension, in the form (start, end, stride). The datatype of the range must be the same as the dimension datatype.
Example:
// Set a 1D range on dimension 0, assuming the domain type is int64.
int64_t start = 10;
int64_t end = 20;
// Stride is optional
query.add_range(0, start, end);
- Return
- Reference to this Query
- Template Parameters
T
: The dimension datatype
- Parameters
dim_idx
: The index of the dimension to add the range to.start
: The range start to add.end
: The range end to add.stride
: The range stride to add.
-
uint64_t
range_num
(unsigned dim_idx) const¶ Retrieves the number of ranges for a given dimension.
Example:
unsigned dim_idx = 0; uint64_t range_num = query.range_num(dim_idx);
- Return
- The number of ranges.
- Parameters
dim_idx
: The dimension index.
-
template<class
T
>
std::array<T, 3>range
(unsigned dim_idx, uint64_t range_idx)¶ Retrieves a range for a given dimension and range id. The template datatype must be the same as that of the underlying array.
Example:
unsigned dim_idx = 0; unsigned range_idx = 0; auto range = query.range<int32_t>(dim_idx, range_idx);
- Return
- A triplet of the form (start, end, stride).
- Template Parameters
T
: The dimension datatype.
- Parameters
dim_idx
: The dimension index.range_idx
: The range index.
-
uint64_t
est_result_size
(const std::string &attr_name) const¶ Retrieves the estimated result size for a fixed-size attribute.
Example:
uint64_t est_size = query.est_result_size("attr1");
- Return
- The estimated size in bytes.
- Parameters
attr_name
: The attribute name.
-
std::pair<uint64_t, uint64_t>
est_result_size_var
(const std::string &attr_name) const¶ Retrieves the estimated result size for a variable-size attribute.
Example:
std::pair<uint64_t, uint64_t> est_size = query.est_result_size_var("attr1");
- Return
- A pair with first element containing the estimated number of result offsets, and second element containing the estimated number of result value bytes.
- Parameters
attr_name
: The attribute name.
-
uint32_t
fragment_num
() const¶ Returns the number of written fragments. Applicable only to WRITE queries.
-
std::string
fragment_uri
(uint32_t idx) const¶ Returns the URI of the written fragment with the input index. Applicable only to WRITE queries.
-
std::pair<uint64_t, uint64_t>
fragment_timestamp_range
(uint32_t idx) const¶ Returns the timestamp range of the written fragment with the input index. Applicable only to WRITE queries.
-
template<typename
T
= uint64_t>
Query &set_subarray
(const T *pairs, uint64_t size)¶ Sets a subarray, defined in the order dimensions were added. Coordinates are inclusive. For the case of writes, this is meaningful only for dense arrays, and specifically dense writes.
Example:
tiledb::Context ctx; tiledb::Array array(ctx, array_name, TILEDB_READ); int subarray[] = {0, 3, 0, 3}; Query query(ctx, array); query.set_subarray(subarray, 4);
- Note
- set_subarray(std::vector<T>) is preferred as it is safer.
- Template Parameters
T
: Type of array domain.
- Parameters
pairs
: Subarray pointer defined as an array of [start, stop] values per dimension.size
: The number of subarray elements.
-
template<typename
Vec
>
Query &set_subarray
(const Vec &pairs)¶ Sets a subarray, defined in the order dimensions were added. Coordinates are inclusive. For the case of writes, this is meaningful only for dense arrays, and specifically dense writes.
Example:
tiledb::Context ctx; tiledb::Array array(ctx, array_name, TILEDB_READ); std::vector<int> subarray = {0, 3, 0, 3}; Query query(ctx, array); query.set_subarray(subarray);
- Template Parameters
Vec
: Vector datatype. Should always be a vector of the domain type.
- Parameters
pairs
: The subarray defined as a vector of [start, stop] coordinates per dimension.
-
template<typename
T
= uint64_t>
Query &set_subarray
(const std::initializer_list<T> &l)¶ Sets a subarray, defined in the order dimensions were added. Coordinates are inclusive. For the case of writes, this is meaningful only for dense arrays, and specifically dense writes.
Example:
tiledb::Context ctx; tiledb::Array array(ctx, array_name, TILEDB_READ); Query query(ctx, array); query.set_subarray({0, 3, 0, 3});
- Template Parameters
T
: Type of array domain.
- Parameters
pairs
: List of [start, stop] coordinates per dimension.
-
template<typename
T
= uint64_t>
Query &set_subarray
(const std::vector<std::array<T, 2>> &pairs)¶ Sets a subarray, defined in the order dimensions were added. Coordinates are inclusive.
- Note
- set_subarray(std::vector) is preferred and avoids an extra copy.
- Template Parameters
T
: Type of array domain.
- Parameters
pairs
: The subarray defined as pairs of [start, stop] per dimension.
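The overloads above accept the same subarray either as per-dimension pairs or as a flat {start, stop, start, stop, ...} sequence. As a minimal sketch of how the two shapes relate, the hypothetical helpers below flatten the pairs and count the cells covered, relying on the coordinates being inclusive. They assume an integer domain type and are not part of TileDB.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

// Flattens per-dimension {start, stop} pairs into {s0, e0, s1, e1, ...}.
template <typename T>
std::vector<T> flatten_pairs(const std::vector<std::array<T, 2>>& pairs) {
  std::vector<T> flat;
  flat.reserve(pairs.size() * 2);
  for (const auto& p : pairs) {
    flat.push_back(p[0]);
    flat.push_back(p[1]);
  }
  return flat;
}

// Number of cells covered by a flat subarray. Bounds are inclusive, so a
// dimension [start, stop] contributes (stop - start + 1) cells.
template <typename T>
uint64_t cell_count(const std::vector<T>& flat) {
  uint64_t n = 1;
  for (std::size_t i = 0; i + 1 < flat.size(); i += 2)
    n *= static_cast<uint64_t>(flat[i + 1] - flat[i]) + 1;
  return n;
}
```

The subarray {0, 3, 0, 3} used in the examples therefore covers 4 * 4 = 16 cells, which is a useful check when sizing buffers for a dense read or write.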
-
template<typename
T
>
Query &set_coordinates
(T *buf, uint64_t size)¶ Set the coordinate buffer.
Example:
tiledb::Context ctx;
tiledb::Array array(ctx, array_name, TILEDB_WRITE);
// Write to points (0,1) and (2,3) in a 2D array with int domain.
int coords[] = {0, 1, 2, 3};
Query query(ctx, array);
query.set_layout(TILEDB_UNORDERED).set_coordinates(coords, 4);
- Note
- set_coordinates(std::vector<T>) is preferred as it is safer.
- Template Parameters
T
: Type of array domain.
- Parameters
buf
: Coordinate array buffer pointersize
: The number of elements in the coordinate array buffer
-
template<typename
Vec
>
Query &set_coordinates
(Vec &buf)¶ Set the coordinate buffer for unordered queries.
Example:
tiledb::Context ctx;
tiledb::Array array(ctx, array_name, TILEDB_WRITE);
// Write to points (0,1) and (2,3) in a 2D array with int domain.
std::vector<int> coords = {0, 1, 2, 3};
Query query(ctx, array);
query.set_layout(TILEDB_UNORDERED).set_coordinates(coords);
- Template Parameters
Vec
: Vector datatype. Should always be a vector of the domain type.
- Parameters
buf
: Coordinate vector
-
template<typename
T
>
Query &set_buffer
(const std::string &attr, T *buff, uint64_t nelements)¶ Sets a buffer for a fixed-sized attribute.
Example:
tiledb::Context ctx; tiledb::Array array(ctx, array_name, TILEDB_WRITE); int data_a1[] = {0, 1, 2, 3}; Query query(ctx, array); query.set_buffer("a1", data_a1, 4);
-
template<typename
T
>
Query &set_buffer
(const std::string &attr, std::vector<T> &buf)¶ Sets a buffer for a fixed-sized attribute.
Example:
tiledb::Context ctx; tiledb::Array array(ctx, array_name, TILEDB_WRITE); std::vector<int> data_a1 = {0, 1, 2, 3}; Query query(ctx, array); query.set_buffer("a1", data_a1);
-
Query &
set_buffer
(const std::string &attr, void *buff, uint64_t nelements)¶ Sets a buffer for a fixed-sized attribute.
- Note
- This unsafe version does not perform type checking; the given buffer is assumed to be the correct type, and the size of an element in the given buffer is assumed to be the size of the datatype of the attribute.
- Parameters
attr
: Attribute name
buff
: Buffer array pointer with elements of the attribute type.
nelements
: Number of array elements in buffer
-
template<typename
T
>
Query &set_buffer
(const std::string &attr, uint64_t *offsets, uint64_t offset_nelements, T *data, uint64_t data_nelements)¶ Sets a buffer for a variable-sized attribute.
Example:
tiledb::Context ctx; tiledb::Array array(ctx, array_name, TILEDB_WRITE); int data_a1[] = {0, 1, 2, 3}; uint64_t offsets_a1[] = {0, 8}; Query query(ctx, array); query.set_buffer("a1", offsets_a1, 2, data_a1, 4);
- Note
- set_buffer(std::string, std::vector, std::vector) is preferred as it is safer.
- Template Parameters
T
: Attribute value type
- Parameters
attr
: Attribute name
offsets
: Pointer to the offsets array, which marks where each new element begins in the data buffer.
offset_nelements
: Number of elements in offsets buffer.
data
: Buffer array pointer with elements of the attribute type. For variable sized attributes, the buffer should be flattened.
data_nelements
: Number of array elements in data buffer.
-
Query &
set_buffer
(const std::string &attr, uint64_t *offsets, uint64_t offset_nelements, void *data, uint64_t data_nelements)¶ Sets a buffer for a variable-sized attribute.
- Note
- This unsafe version does not perform type checking; the given buffer is assumed to be the correct type, and the size of an element in the given buffer is assumed to be the size of the datatype of the attribute.
- Parameters
attr
: Attribute name
offsets
: Pointer to the offsets array, which marks where each new element begins in the data buffer.
offset_nelements
: Number of elements in offsets buffer.
data
: Buffer array pointer with elements of the attribute type.
data_nelements
: Number of array elements in data buffer.
-
template<typename
T
>
Query &set_buffer
(const std::string &attr, std::vector<uint64_t> &offsets, std::vector<T> &data)¶ Sets a buffer for a variable-sized attribute.
Example:
tiledb::Context ctx; tiledb::Array array(ctx, array_name, TILEDB_WRITE); std::vector<int> data_a1 = {0, 1, 2, 3}; std::vector<uint64_t> offsets_a1 = {0, 8}; Query query(ctx, array); query.set_buffer("a1", offsets_a1, data_a1);
- Template Parameters
T
: Attribute value type
- Parameters
attr
: Attribute name
offsets
: Offsets where a new element begins in the data buffer.
data
: Buffer vector with elements of the attribute type. For variable sized attributes, the buffer should be flattened. E.g., an attribute of type std::string should have a buffer Vec type of std::string, where the values of each cell are concatenated.
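The offsets {0, 8} used with four int values in the examples above are consistent with byte offsets (two cells of two 4-byte values each), which matches the byte-offset convention stated for the utility functions in this reference. The sketch below shows how such offsets could be derived from per-cell value counts; `byte_offsets` is a hypothetical helper and does not call TileDB.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of the TileDB API): given the number of
// values in each cell, returns the byte offset at which each cell starts
// in a flattened data buffer of element type T.
template <typename T>
std::vector<uint64_t> byte_offsets(const std::vector<uint64_t>& values_per_cell) {
  std::vector<uint64_t> offsets;
  offsets.reserve(values_per_cell.size());
  uint64_t next = 0;
  for (uint64_t n : values_per_cell) {
    offsets.push_back(next);  // this cell starts where the previous one ended
    next += n * sizeof(T);    // advance by the size of this cell in bytes
  }
  return offsets;
}
```

With a 4-byte element type, `byte_offsets<int32_t>({2, 2})` gives {0, 8}, the same offsets as in the examples above.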
-
Filter¶
-
class
Filter
¶ Represents a filter. A filter transforms attribute data, e.g. with compression or delta encoding.
Example:
tiledb::Context ctx; tiledb::Filter f(ctx, TILEDB_FILTER_ZSTD); int level = 5; f.set_option(TILEDB_COMPRESSION_LEVEL, &level);
Public Functions
-
Filter
(const Context &ctx, tiledb_filter_type_t filter_type)¶ Creates a Filter of the given type.
Example:
tiledb::Context ctx; tiledb::Filter f(ctx, TILEDB_FILTER_ZSTD);
- Parameters
ctx
: TileDB context
filter_type
: Enumerated type of filter
-
Filter
(const Context &ctx, tiledb_filter_t *filter)¶ Creates a Filter with the input C object.
- Parameters
ctx
: TileDB context
filter
: C API filter object
-
std::shared_ptr<tiledb_filter_t>
ptr
() const¶ Returns a shared pointer to the C TileDB filter object.
-
template<typename T, typename std::enable_if< std::is_arithmetic< T >::value >::type * = nullptr>Filter& tiledb::Filter::set_option(tiledb_filter_option_t option, T value)
Sets an option on the filter. Options are filter dependent; this function throws an error if the given option is not valid for the given filter.
Example:
tiledb::Filter f(ctx, TILEDB_FILTER_ZSTD); f.set_option(TILEDB_COMPRESSION_LEVEL, 5);
- Return
- Reference to this Filter
- Template Parameters
T
: Type of value of option to set.
- Parameters
option
: Enumerated option to set.
value
: Value of option to set.
- Exceptions
TileDBError
: if the option cannot be set on the filter.
std::invalid_argument
: if the option value is the wrong type.
-
Filter &
set_option
(tiledb_filter_option_t option, const void *value)¶ Sets an option on the filter. Options are filter dependent; this function throws an error if the given option is not valid for the given filter.
This version of set_option performs no type checks.
Example:
tiledb::Filter f(ctx, TILEDB_FILTER_ZSTD); int level = 5; f.set_option(TILEDB_COMPRESSION_LEVEL, &level);
- Return
- Reference to this Filter
- Note
- set_option<T>(option, T value) is preferred as it is safer.
- Parameters
option
: Enumerated option to set.
value
: Value of option to set.
- Exceptions
TileDBError
: if the option cannot be set on the filter.
-
template<typename T, typename std::enable_if< std::is_arithmetic< T >::value >::type * = nullptr>void tiledb::Filter::get_option(tiledb_filter_option_t option, T * value)
Gets an option value from the filter.
Example:
tiledb::Filter f(ctx, TILEDB_FILTER_ZSTD); int32_t level; f.get_option(TILEDB_COMPRESSION_LEVEL, &level); // level == -1 (the default compression level)
- Note
- The buffer pointed to by value must be large enough to hold the option value.
- Template Parameters
T
: Type of option value to get.
- Parameters
option
: Enumerated option to get.
value
: Buffer that option value will be written to.
- Exceptions
TileDBError
: if the option cannot be retrieved from the filter.
std::invalid_argument
: if the option value is the wrong type.
-
void
get_option
(tiledb_filter_option_t option, void *value)¶ Gets an option value from the filter.
This version of get_option performs no type checks.
Example:
tiledb::Filter f(ctx, TILEDB_FILTER_ZSTD); int32_t level; f.get_option(TILEDB_COMPRESSION_LEVEL, &level); // level == -1 (the default compression level)
- Note
- The buffer pointed to by value must be large enough to hold the option value.
- Note
- get_option<T>(option, T* value) is preferred as it is safer.
- Parameters
option
: Enumerated option to get.
value
: Buffer that option value will be written to.
- Exceptions
TileDBError
: if the option cannot be retrieved from the filter.
-
tiledb_filter_type_t
filter_type
() const¶ Gets the filter type of this filter.
Public Static Functions
-
static std::string
to_str
(tiledb_filter_type_t type)¶ Returns the input type in string format.
-
Filter List¶
-
class
FilterList
¶ Represents an ordered list of Filters used to transform attribute data.
Example:
tiledb::Context ctx; tiledb::FilterList filter_list(ctx); filter_list.add_filter({ctx, TILEDB_FILTER_BYTESHUFFLE}) .add_filter({ctx, TILEDB_FILTER_BZIP2});
Public Functions
-
FilterList
(const Context &ctx)¶ Construct a FilterList.
Example:
tiledb::Context ctx; tiledb::FilterList filter_list(ctx);
- Parameters
ctx
: TileDB context
-
FilterList
(const Context &ctx, tiledb_filter_list_t *filter_list)¶ Creates a FilterList with the input C object.
- Parameters
ctx
: TileDB context
filter_list
: C API filter list object
-
std::shared_ptr<tiledb_filter_list_t>
ptr
() const¶ Returns a shared pointer to the C TileDB filter list object.
-
FilterList &
add_filter
(const Filter &filter)¶ Appends a filter to a filter list. Data is processed through each filter in the order the filters were added.
Example:
tiledb::FilterList filter_list(ctx); filter_list.add_filter({ctx, TILEDB_FILTER_BYTESHUFFLE}) .add_filter({ctx, TILEDB_FILTER_BZIP2});
- Return
- Reference to this FilterList
- Parameters
filter
: The filter to add
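Since data passes through the filters in the order they were added, adding the same filters in a different order generally produces different output. The standalone sketch below models a filter list as an ordered vector of functions. `Buffer`, `Transform`, `apply_in_order`, `delta_encode`, and `add_one` are hypothetical names, and the two transforms are toy stand-ins rather than TileDB filters.

```cpp
#include <cstddef>
#include <functional>
#include <vector>

using Buffer = std::vector<int>;
using Transform = std::function<Buffer(const Buffer&)>;

// Toy transform: keeps the first value and replaces every later value with
// its difference from the previous input value.
inline Buffer delta_encode(const Buffer& in) {
  Buffer out(in);
  for (std::size_t i = 1; i < in.size(); ++i)
    out[i] = in[i] - in[i - 1];
  return out;
}

// Toy transform: adds one to every value.
inline Buffer add_one(const Buffer& in) {
  Buffer out(in);
  for (int& v : out) v += 1;
  return out;
}

// Applies each transform to the output of the previous one, in list order.
inline Buffer apply_in_order(const std::vector<Transform>& pipeline, Buffer data) {
  for (const Transform& t : pipeline) data = t(data);
  return data;
}
```

For the input {10, 12, 15}, the pipeline {delta_encode, add_one} produces {11, 3, 4}, while {add_one, delta_encode} produces {11, 2, 3}.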
-
Filter
filter
(uint32_t filter_index) const¶ Returns a copy of the Filter in this list at the given index.
Example:
tiledb::FilterList filter_list(ctx); filter_list.add_filter({ctx, TILEDB_FILTER_BYTESHUFFLE}) .add_filter({ctx, TILEDB_FILTER_BZIP2}); auto f = filter_list.filter(1); // f.filter_type() == TILEDB_FILTER_BZIP2
- Return
- Filter
- Parameters
filter_index
: Index of filter to get
- Exceptions
TileDBError
: if the index is out of range
-
uint32_t
max_chunk_size
() const¶ Gets the maximum tile chunk size for the filter list.
- Return
- Maximum tile chunk size
-
uint32_t
nfilters
() const¶ Returns the number of filters in this filter list.
Example:
tiledb::FilterList filter_list(ctx); filter_list.add_filter({ctx, TILEDB_FILTER_BYTESHUFFLE}) .add_filter({ctx, TILEDB_FILTER_BZIP2}); uint32_t n = filter_list.nfilters(); // n == 2
- Return
- Number of filters in this filter list
-
FilterList &
set_max_chunk_size
(uint32_t max_chunk_size)¶ Sets the maximum tile chunk size for the filter list.
- Return
- Reference to this FilterList
- Parameters
max_chunk_size
: Maximum tile chunk size to set
-
Group¶
Object Management¶
-
class
Object
¶ Represents a TileDB object: array, group, key-value (map), or none (invalid).
Public Types
Public Functions
-
std::string
to_str
() const¶ Returns a string representation of the object, including its type and URI.
-
std::string
uri
() const¶ Returns the object URI.
Public Static Functions
-
static Object
object
(const Context &ctx, const std::string &uri)¶ Gets an Object object that encapsulates the object type of the given path.
- Return
- An object that contains the type along with the URI.
- Parameters
ctx
: The TileDB context
uri
: The path to the object.
-
class
ObjectIter
¶ Enables listing TileDB objects in a directory or recursively walking an entire directory tree.
Example:
// List the TileDB objects in an S3 bucket.
tiledb::Context ctx;
tiledb::ObjectIter obj_it(ctx, "s3://bucket-name");
for (auto it = obj_it.begin(), ite = obj_it.end(); it != ite; ++it) {
  const tiledb::Object &obj = *it;
  std::cout << obj << std::endl;
}
Public Functions
-
ObjectIter
(Context &ctx, const std::string &root = ".")¶ Creates an object iterator. Unless set_recursive is invoked, this iterator will iterate only over the children of root. It will also retrieve only TileDB-related objects.
Example:
// List the TileDB objects in an S3 bucket.
tiledb::Context ctx;
tiledb::ObjectIter obj_it(ctx, "s3://bucket-name");
for (auto it = obj_it.begin(), ite = obj_it.end(); it != ite; ++it) {
  const tiledb::Object &obj = *it;
  std::cout << obj << std::endl;
}
- Parameters
ctx
: The TileDB context.root
: The root directory where the iteration will begin.
-
void
set_iter_policy
(bool group, bool array)¶ Determines whether group, array and key-value objects will be iterated on during the walk. The default (if the function is not invoked) is true for all objects.
- Parameters
group
: If true, groups will be considered.
array
: If true, arrays will be considered.
-
void
set_recursive
(tiledb_walk_order_t walk_order = TILEDB_PREORDER)¶ Specifies that the iteration will be over all the directories in the tree rooted at root_.
- Parameters
walk_order
: The walk order.
-
void
set_non_recursive
()¶ Disables recursive traversal.
Public Static Functions
-
static int
obj_getter
(const char *path, tiledb_object_t type, void *data)¶ Callback function to be used when invoking the C TileDB functions for walking through the TileDB objects in the root_ directory. The function retrieves the visited object and stores it in the object vector obj_vec.
- Return
- If 1, the walk should continue to the next object.
- Parameters
path
: The path of a visited TileDB object.
type
: The type of the visited TileDB object.
data
: Cast to the vector where the visited object will be stored.
-
class
iterator
: public std::iterator<std::forward_iterator_tag, const Object>¶ The actual iterator implementation in this class.
-
struct
ObjGetterData
¶ Carries data to be passed to obj_getter.
-
VFS¶
-
class
VFS
¶ Implements a virtual filesystem that enables performing directory/file operations with a unified API on different filesystems, such as local POSIX/Windows, HDFS, AWS S3, etc.
Public Types
-
using
filebuf
= impl::VFSFilebuf¶ Stream buffer for TileDB VFS.
This is unbuffered; each read/write is directly dispatched to TileDB. As such, it is recommended to issue fewer, larger operations.
Example (write to file):
// Create the file buffer.
tiledb::Context ctx;
tiledb::VFS vfs(ctx);
tiledb::VFS::filebuf buff(vfs);
// Create new file, truncating it if it exists.
buff.open("file.txt", std::ios::out);
std::ostream os(&buff);
if (!os.good()) throw std::runtime_error("Error opening file");
std::string str = "This will be written to the file.";
os.write(str.data(), str.size());
// Alternatively:
// os << str;
os.flush();
buff.close();
Example (read from file):
// Create the file buffer.
tiledb::Context ctx;
tiledb::VFS vfs(ctx);
tiledb::VFS::filebuf buff(vfs);
std::string file_uri = "s3://bucket-name/file.txt";
buff.open(file_uri, std::ios::in);
std::istream is(&buff);
if (!is.good()) throw std::runtime_error("Error opening file");
// Read all contents from the file through the stream.
std::string contents;
auto nbytes = vfs.file_size(file_uri);
contents.resize(nbytes);
is.read(&contents[0], nbytes);
buff.close();
Public Functions
-
VFS
(const Context &ctx, const Config &config)¶ Constructor.
- Parameters
ctx
: TileDB context.
config
: TileDB config.
-
void
create_bucket
(const std::string &uri) const¶ Creates an object-store bucket with the input URI.
-
void
remove_bucket
(const std::string &uri) const¶ Deletes an object-store bucket with the input URI.
-
bool
is_bucket
(const std::string &uri) const¶ Checks if an object-store bucket with the input URI exists.
-
void
empty_bucket
(const std::string &bucket) const¶ Empties a bucket.
-
bool
is_empty_bucket
(const std::string &bucket) const¶ Checks if a bucket is empty.
-
void
create_dir
(const std::string &uri) const¶ Creates a directory with the input URI.
-
bool
is_dir
(const std::string &uri) const¶ Checks if a directory with the input URI exists.
-
void
remove_dir
(const std::string &uri) const¶ Removes a directory (recursively) with the input URI.
-
bool
is_file
(const std::string &uri) const¶ Checks if a file with the input URI exists.
-
void
remove_file
(const std::string &uri) const¶ Deletes a file with the input URI.
-
uint64_t
dir_size
(const std::string &uri) const¶ Retrieves the size of a directory with the input URI.
-
std::vector<std::string>
ls
(const std::string &uri) const¶ Retrieves the children in directory uri. This function is non-recursive, i.e., it lists only the entries one level below uri.
-
uint64_t
file_size
(const std::string &uri) const¶ Retrieves the size of a file with the input URI.
-
void
move_file
(const std::string &old_uri, const std::string &new_uri) const¶ Renames a TileDB file from an old URI to a new URI.
-
void
move_dir
(const std::string &old_uri, const std::string &new_uri) const¶ Renames a TileDB directory from an old URI to a new URI.
-
void
touch
(const std::string &uri) const¶ Touches a file with the input URI, i.e., creates a new empty file.
-
std::shared_ptr<tiledb_vfs_t>
ptr
() const¶ Gets the underlying C TileDB VFS object.
Public Static Functions
-
static int
ls_getter
(const char *path, void *data)¶ Callback function to be used when invoking the C TileDB function for getting the children of a URI. It simply adds path to vec (which is cast from data).
- Return
- If 1, the walk should continue to the next object.
- Parameters
path
: The path of a visited TileDB object.
data
: Cast to the vector that will store path.
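The callback pattern described for ls_getter, where an opaque void* is cast back to the vector that collects the paths, can be sketched without TileDB. In this sketch `collect_path` plays the role of the callback and `visit_paths` is a hypothetical stand-in for a C function that walks a listing; neither is a TileDB function. The sketch also assumes that any return value other than 1 stops the walk, which the description above does not state.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// C-style callback signature: receives a path and an opaque user pointer.
using path_callback = int (*)(const char* path, void* data);

// Plays the role of ls_getter: casts `data` back to the vector supplied by
// the caller and appends the visited path. Returns 1 to continue the walk.
inline int collect_path(const char* path, void* data) {
  auto* vec = static_cast<std::vector<std::string>*>(data);
  vec->emplace_back(path);
  return 1;
}

// Hypothetical stand-in for a C walking function (not a TileDB call): invokes
// the callback once per path. Assumption: a return value other than 1 stops.
inline void visit_paths(const char* const* paths, std::size_t n,
                        path_callback cb, void* data) {
  for (std::size_t i = 0; i < n; ++i)
    if (cb(paths[i], data) != 1) break;
}
```

The caller owns the vector and passes its address as `data`, so the callback itself needs no global state.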
-
Utils¶
-
namespace
tiledb
¶ Functions
-
template<typename
T
, typename
E
= typename std::vector<T>>
std::vector<E> group_by_cell
(const std::vector<uint64_t> &offsets, const std::vector<T> &data, uint64_t num_offsets, uint64_t num_data)¶ Convert an (offset, data) vector pair into a single vector of vectors. Useful for “unpacking” variable-length attribute data from a read query result in offsets + data form to a vector of per-cell data.
The offsets must be given in units of bytes.
Example:
std::vector<uint64_t> offsets;
std::vector<char> data;
...
query.set_buffer("attr_name", offsets, data);
query.submit();
...
auto attr_results = query.result_buffer_elements()["attr_name"];
// cell_vals length will be equal to the number of cells read by the query.
// Each element is a std::vector<char> with each cell's data for "attr_name"
auto cell_vals = group_by_cell(offsets, data, attr_results.first, attr_results.second);
// Reconstruct a std::string value for the first cell:
std::string cell_val(cell_vals[0].data(), cell_vals[0].size());
- Note
- This function, and the other utility functions, copy all of the input data when constructing their return values. Thus, these may be expensive for large amounts of data.
- Return
std::vector<E>
- Template Parameters
T
: Underlying attribute datatype
E
: Cell type, usually std::vector<T> or std::string. Must be constructible from {std::vector<T>::iterator, std::vector<T>::iterator}.
- Parameters
offsets
: Offsets vector. This specifies the start offset in bytes of each cell in the data vector.
data
: Data vector. Flat data buffer with cell contents.
num_offsets
: Number of offset elements populated by query. If the entire buffer is to be grouped, pass offsets.size().
num_data
: Number of data elements populated by query. If the entire buffer is to be grouped, pass data.size().
-
template<typename
T
, typename
E
= typename std::vector<T>>
std::vector<E> group_by_cell
(const std::pair<std::vector<uint64_t>, std::vector<T>> &buff, uint64_t num_offsets, uint64_t num_data)¶ Convert an (offset, data) vector pair into a single vector of vectors. Useful for “unpacking” variable-length attribute data from a read query result in offsets + data form to a vector of per-cell data.
The offsets must be given in units of bytes.
Example:
std::vector<uint64_t> offsets;
std::vector<char> data;
...
query.set_buffer("attr_name", offsets, data);
query.submit();
...
auto attr_results = query.result_buffer_elements()["attr_name"];
// cell_vals length will be equal to the number of cells read by the query.
// Each element is a std::vector<char> with each cell's data for "attr_name"
auto cell_vals = group_by_cell(std::make_pair(offsets, data), attr_results.first, attr_results.second);
// Reconstruct a std::string value for the first cell:
std::string cell_val(cell_vals[0].data(), cell_vals[0].size());
- Return
std::vector<E>
- Template Parameters
T
: Underlying attribute datatype
E
: Cell type, usually std::vector<T> or std::string. Must be constructible from {std::vector<T>::iterator, std::vector<T>::iterator}.
- Parameters
buff
: Pair of (offset_vec, data_vec) to be grouped.
num_offsets
: Number of offset elements populated by query.
num_data
: Number of data elements populated by query.
-
template<typename
T
, typename
E
= typename std::vector<T>>
std::vector<E> group_by_cell
(const std::vector<uint64_t> &offsets, const std::vector<T> &data)¶ Convert a generic (offset, data) vector pair into a single vector of vectors. The offsets must be given in units of bytes.
Example:
std::vector<char> buf = {'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i'};
std::vector<uint64_t> offsets = {0, 5};
auto grouped = group_by_cell<char, std::string>(offsets, buf);
// grouped.size() == 2
// grouped[0] == "abcde"
// grouped[1] == "fghi"
- Return
std::vector<E>
- Template Parameters
T
: Underlying attribute datatype
E
: Cell type, usually std::vector<T> or std::string. Must be constructible from {std::vector<T>::iterator, std::vector<T>::iterator}.
- Parameters
offsets
: Offsets vector
data
: Data vector
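To make the byte-offset convention concrete, the following standalone sketch splits a flat buffer into cells using offsets given in bytes. It is a hypothetical re-implementation for illustration, not the TileDB source; the name `split_by_byte_offsets` and the choice of std::out_of_range for inconsistent offsets are assumptions.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical re-implementation for illustration (not the TileDB source):
// splits a flat buffer into cells, where offsets[i] is the byte position at
// which cell i starts. E must be constructible from an iterator pair.
template <typename T, typename E = std::vector<T>>
std::vector<E> split_by_byte_offsets(const std::vector<uint64_t>& offsets,
                                     const std::vector<T>& data) {
  std::vector<E> cells;
  cells.reserve(offsets.size());
  for (std::size_t i = 0; i < offsets.size(); ++i) {
    // Convert byte offsets to element indices; the last cell runs to the end.
    const uint64_t first = offsets[i] / sizeof(T);
    const uint64_t last =
        (i + 1 < offsets.size()) ? offsets[i + 1] / sizeof(T) : data.size();
    if (first > last || last > data.size())
      throw std::out_of_range("offsets do not fit the data buffer");
    cells.emplace_back(data.begin() + static_cast<std::ptrdiff_t>(first),
                       data.begin() + static_cast<std::ptrdiff_t>(last));
  }
  return cells;
}
```

Called as `split_by_byte_offsets<char, std::string>({0, 5}, buf)` on the nine-character buffer from the example above, the sketch returns the two strings "abcde" and "fghi". For 4-byte elements, offsets {0, 8} split four values into two cells of two values each.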
-
template<typename
T
, typename
E
= typename std::vector<T>>
std::vector<E> group_by_cell
(const std::vector<T> &buff, uint64_t el_per_cell, uint64_t num_buff)¶ Convert a vector of elements into a vector of fixed-length vectors.
Example:
std::vector<char> buf = {'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i'};
auto grouped = group_by_cell(buf, 3, buf.size());
std::string grp1(grouped[0].begin(), grouped[0].end()); // "abc"
std::string grp2(grouped[1].begin(), grouped[1].end()); // "def"
std::string grp3(grouped[2].begin(), grouped[2].end()); // "ghi"
// Throws an exception because buf.size() is not divisible by 2:
// group_by_cell(buf, 2, buf.size());
- Return
std::vector<E>
- Template Parameters
T
: Underlying attribute datatype
E
: Cell type, usually std::vector<T> or std::string. Must be constructible from {std::vector<T>::iterator, std::vector<T>::iterator}.
- Parameters
buff
: Data buffer to group
el_per_cell
: Number of elements per cell to group together
num_buff
: Number of elements populated by query. To group the whole buffer, pass buff.size().
-
template<typename
T
, typename
E
= typename std::vector<T>>
std::vector<E> group_by_cell
(const std::vector<T> &buff, uint64_t el_per_cell)¶ Convert a vector of elements into a vector of fixed-length vectors.
Example:
std::vector<char> buf = {'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i'};
auto grouped = group_by_cell(buf, 3);
std::string grp1(grouped[0].begin(), grouped[0].end()); // "abc"
std::string grp2(grouped[1].begin(), grouped[1].end()); // "def"
std::string grp3(grouped[2].begin(), grouped[2].end()); // "ghi"
// Throws an exception because buf.size() is not divisible by 2:
// group_by_cell(buf, 2);
- Return
std::vector<E>
- Template Parameters
T
: Element type
E
: Cell type, usually std::vector<T> or std::string. Must be constructible from {std::vector<T>::iterator, std::vector<T>::iterator}.
- Parameters
buff
: Data buffer to group
el_per_cell
: Number of elements per cell to group together
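The fixed-length grouping and its divisibility requirement can be sketched as follows. This is a hypothetical standalone re-implementation, not the TileDB source; `chunk_fixed` is an illustrative name, and the use of std::invalid_argument is an assumption, since the examples above only say that an exception is thrown.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical re-implementation for illustration (not the TileDB source):
// splits `buff` into consecutive cells of `el_per_cell` elements each.
// Assumption: an indivisible buffer is reported with std::invalid_argument.
template <typename T>
std::vector<std::vector<T>> chunk_fixed(const std::vector<T>& buff,
                                        uint64_t el_per_cell) {
  if (el_per_cell == 0 || buff.size() % el_per_cell != 0)
    throw std::invalid_argument("buffer size is not a multiple of el_per_cell");
  std::vector<std::vector<T>> cells;
  cells.reserve(buff.size() / el_per_cell);
  for (std::size_t i = 0; i < buff.size(); i += el_per_cell)
    cells.emplace_back(buff.begin() + i, buff.begin() + i + el_per_cell);
  return cells;
}
```

On the nine-element buffer from the examples, a cell size of 3 yields three cells, while a cell size of 2 makes the sketch throw.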
-
template<uint64_t
N
, typename
T
>
std::vector<std::array<T, N>> group_by_cell
(const std::vector<T> &buff, uint64_t num_buff)¶ Convert a vector of elements into a vector of fixed-length arrays.
Example:
std::vector<char> buf = {'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i'};
auto grouped = group_by_cell<3>(buf, buf.size());
std::string grp1(grouped[0].begin(), grouped[0].end()); // "abc"
std::string grp2(grouped[1].begin(), grouped[1].end()); // "def"
std::string grp3(grouped[2].begin(), grouped[2].end()); // "ghi"
// Throws an exception because buf.size() is not divisible by 2:
// group_by_cell<2>(buf, buf.size());
- Return
std::vector<std::array<T,N>>
- Template Parameters
N
: Elements per cell
T
: Array element type
- Parameters
buff
: Data buffer to group
num_buff
: Number of elements in buff that were populated by the query.
-
template<uint64_t
N
, typename
T
>
std::vector<std::array<T, N>> group_by_cell
(const std::vector<T> &buff)¶ Convert a vector of elements into a vector of fixed-length arrays.
Example:
std::vector<char> buf = {'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i'};
auto grouped = group_by_cell<3>(buf);
std::string grp1(grouped[0].begin(), grouped[0].end()); // "abc"
std::string grp2(grouped[1].begin(), grouped[1].end()); // "def"
std::string grp3(grouped[2].begin(), grouped[2].end()); // "ghi"
// Throws an exception because buf.size() is not divisible by 2:
// group_by_cell<2>(buf);
- Return
std::vector<std::array<T,N>>
- Template Parameters
N
: Elements per cell
T
: Array element type
- Parameters
buff
: Data buffer to group
-
template<typename
T
, typename
R
= typename T::value_type>
std::pair<std::vector<uint64_t>, std::vector<R>> ungroup_var_buffer
(const std::vector<T> &data)¶ Unpack a vector of variable sized attributes into a data and offset buffer. The offset buffer result is in units of bytes.
Example:
std::vector<char> buf = {'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i'};
// For the sake of example, group buf into groups of 3 elements:
auto grouped = group_by_cell(buf, 3);
// Ungroup into offsets, data pair.
auto p = ungroup_var_buffer(grouped);
auto offsets = p.first; // {0, 3, 6}
auto data = p.second; // {'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i'}
- Return
- Pair where .first is the offset buffer and .second is the data buffer.
- Template Parameters
T
: Vector type. T::value_type is considered the underlying data element type. Should be vector or string.
R
: T::value_type, deduced
- Parameters
data
: Data buffer to unpack
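The inverse operation, packing per-cell containers into a byte-offset buffer plus a flat data buffer, can be sketched in a few lines. `pack_cells` is a hypothetical standalone re-implementation for illustration and not the TileDB source.

```cpp
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical re-implementation for illustration (not the TileDB source):
// concatenates the cells into one flat buffer and records, for each cell,
// the byte offset at which it starts in that buffer.
template <typename Cell>
std::pair<std::vector<uint64_t>, std::vector<typename Cell::value_type>>
pack_cells(const std::vector<Cell>& cells) {
  using R = typename Cell::value_type;
  std::pair<std::vector<uint64_t>, std::vector<R>> out;
  uint64_t byte_pos = 0;
  for (const Cell& c : cells) {
    out.first.push_back(byte_pos);                            // start of cell
    out.second.insert(out.second.end(), c.begin(), c.end());  // flatten
    byte_pos += c.size() * sizeof(R);                         // size in bytes
  }
  return out;
}
```

For three cells of three char values each, the offsets come out as {0, 3, 6}, as in the example above. For cells of a 4-byte type holding two values and one value, the offsets are {0, 8}.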
-
template<typename
V
, typename
T
= typename V::value_type::value_type>
std::vector<T> flatten
(const V &vec)¶ Convert a vector-of-vectors and flatten it into a single vector.
Example:
std::vector<std::string> v = {"a", "bb", "ccc"};
auto flat_v = flatten(v);
std::string s(flat_v.begin(), flat_v.end()); // "abbccc"
std::vector<std::vector<double>> d = {{1.2, 2.1}, {2.3, 3.2}, {3.4, 4.3}};
auto flat_d = flatten(d); // {1.2, 2.1, 2.3, 3.2, 3.4, 4.3};
- Return
std::vector<T>
- Template Parameters
V
: Container type
T
: Return element type
- Parameters
vec
: Vector to flatten
-
Stats¶
-
class
Stats
¶ Encapsulates functionality related to internal TileDB statistics.
Example:
// Enable stats, submit a query, then dump to stdout.
tiledb::Stats::enable();
query.submit();
tiledb::Stats::dump();
// Dump to a string instead.
std::string str;
tiledb::Stats::dump(&str);
Public Static Functions
-
static void
enable
()¶ Enables internal TileDB statistics gathering.
-
static void
disable
()¶ Disables internal TileDB statistics gathering.
-
static void
reset
()¶ Reset all internal statistics counters to 0.
-
static void
dump
(FILE *out = stdout)¶ Dump all statistics counters to some output (e.g., file or stdout).
- Parameters
out
: The output.
-
static void
dump
(std::string *out)¶ Dump all statistics counters to a string.
- Parameters
out
: The output.
-
static void