Multi-attribute Arrays

This tutorial shows how to add multiple attributes to TileDB arrays. We focus only on dense arrays, since everything covered here applies to sparse arrays in a straightforward manner. We recommend reading the tutorial on dense arrays first.

Full programs
Program: multi_attribute
Links: multiattrcpp, multiattrpy

Basic concepts and definitions

Creating a multi-attribute array

Creating the array is similar to what we covered in the simple dense array example. The only difference is that we add two attributes to the array schema instead of one: a1, which stores characters, and a2, which stores floats. Notice, however, that a2 is defined to store two float values per cell.
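As a rough illustration of what such a schema records per attribute, the following plain-Python sketch models the two attributes without calling the TileDB API. The `Attribute` class and its fields are hypothetical names chosen for this example; the sketch only shows how the number of values per cell determines the bytes stored for each cell.

```python
import struct
from dataclasses import dataclass


@dataclass(frozen=True)
class Attribute:
    """Hypothetical stand-in for an attribute definition (not the TileDB API)."""
    name: str
    fmt: str                  # struct format code of one value: "c" = char, "f" = float
    values_per_cell: int = 1  # how many values each cell stores on this attribute

    def cell_size(self) -> int:
        """Bytes needed to store this attribute for a single cell."""
        return struct.calcsize("<" + self.fmt) * self.values_per_cell


# Two attributes, as in this tutorial: a1 holds one character per cell,
# a2 holds two floats per cell.
schema = [Attribute("a1", "c"), Attribute("a2", "f", values_per_cell=2)]

cell_sizes = {attr.name: attr.cell_size() for attr in schema}
print(cell_sizes)  # {'a1': 1, 'a2': 8}
```

The real schema is built with the TileDB API in the full programs linked above; this model is only meant to make the "two floats per cell" detail concrete.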

Note

In the current version of TileDB, once an array has been created, you cannot modify its schema. This means that it is not currently possible to add attributes to, or remove attributes from, an existing array.

Writing to the array

Writing is similar to the simple dense array example. The difference here is that we need to prepare two data buffers (one for a1 and one for a2). Note that there should be a one-to-one correspondence between the values of a1 and a2 in the buffers; for instance, value 1 in data_a1 is associated with value (1.1, 1.2) in data_a2 (recall each cell stores two floats on a2), 2 in data_a1 with (2.1, 2.2) in data_a2, etc.
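The one-to-one correspondence between the two buffers can be sketched in plain Python, independently of the TileDB API. The buffers below are shortened to four cells for illustration, and `pair_cells` is a hypothetical helper, not a TileDB function. It also mimics the requirement that every cell receives a value on every attribute.

```python
VALUES_PER_CELL_A2 = 2  # each cell stores two floats on a2

# Shortened, illustrative buffers for four cells (flat, in cell order).
data_a1 = [1, 2, 3, 4]
data_a2 = [1.1, 1.2, 2.1, 2.2, 3.1, 3.2, 4.1, 4.2]


def pair_cells(a1, a2, values_per_cell):
    """Pair the i-th a1 value with the i-th group of a2 values."""
    if len(a2) != len(a1) * values_per_cell:
        raise ValueError("every cell needs a value for every attribute")
    return [
        (a1[i], tuple(a2[i * values_per_cell:(i + 1) * values_per_cell]))
        for i in range(len(a1))
    ]


cells = pair_cells(data_a1, data_a2, VALUES_PER_CELL_A2)
print(cells[0])  # (1, (1.1, 1.2))
print(cells[1])  # (2, (2.1, 2.2))
```

Because a2 holds two values per cell, its buffer is twice as long as the a1 buffer, and cell i owns the a2 values at positions 2i and 2i + 1 (counting from zero).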

Warning

During writing, you must provide a value for all attributes for the cells being written, otherwise an error will be thrown.

The array on disk now stores the written data. The resulting array is depicted in the figure below.

[Figure: ../_images/multi_attribute.png]

Reading from the array

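We focus on the subarray [1,2], [2,4], that is, the range [1,2] on the first dimension and [2,4] on the second, with both bounds inclusive.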

Subselecting on attributes

While you must provide values for all attributes during writes, the same is not true during reads: you can retrieve any subset of the attributes.
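To make the read semantics concrete, here is a minimal plain-Python sketch of slicing the subarray [1,2], [2,4] while requesting only a1. It does not use the TileDB API. The 4x4 domain with coordinates 1 to 4, the row-major cell order, the stored values, and the `read` helper are all assumptions made for this illustration.

```python
ROWS, COLS = 4, 4  # assumed 4x4 domain with coordinates 1..4 on each dimension

# Illustrative per-attribute buffers in row-major cell order.
array = {
    "a1": list(range(1, ROWS * COLS + 1)),
    "a2": [(i + 0.1, i + 0.2) for i in range(1, ROWS * COLS + 1)],
}


def read(array, row_range, col_range, attrs):
    """Return only the requested attributes for an inclusive subarray."""
    (r0, r1), (c0, c1) = row_range, col_range
    # Convert 1-based coordinates to positions in the row-major buffers.
    positions = [
        (r - 1) * COLS + (c - 1)
        for r in range(r0, r1 + 1)
        for c in range(c0, c1 + 1)
    ]
    return {name: [array[name][p] for p in positions] for name in attrs}


# Read the subarray [1,2], [2,4], subselecting on attribute a1 only.
result = read(array, (1, 2), (2, 4), attrs=["a1"])
print(result)  # {'a1': [2, 3, 4, 6, 7, 8]}
```

The subarray covers 2 x 3 = 6 cells, and the result contains no entry for a2 because it was not requested.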

If you compile and run the example of this tutorial as shown below, you should see the following output:

On-disk structure

Let us look at the contents of the array of this example on disk.

$ ls -l multi_attribute_array/
total 8
drwx------  5 stavros  staff  160 Jun 25 15:34 __1561491299419_1561491299419_fcb0ee91899142baad8a08049c0e2319
-rwx------  1 stavros  staff  159 Jun 25 15:34 __array_schema.tdb
-rwx------  1 stavros  staff    0 Jun 25 15:34 __lock.tdb
drwx------  2 stavros  staff   64 Jun 25 15:34 __meta

$ ls -l multi_attribute_array/__1561491299419_1561491299419_fcb0ee91899142baad8a08049c0e2319/
total 24
-rwx------  1 stavros  staff  939 Jun 25 15:34 __fragment_metadata.tdb
-rwx------  1 stavros  staff   36 Jun 25 15:34 a1.tdb
-rwx------  1 stavros  staff  148 Jun 25 15:34 a2.tdb

TileDB created two separate attribute files in the fragment subdirectory __1561491299419_1561491299419_fcb0ee91899142baad8a08049c0e2319: a1.tdb, which stores the cell values on attribute a1, and a2.tdb, which stores the cell values on attribute a2. The size of a1.tdb is 36 bytes: 16 bytes for storing 16 1-byte characters, plus 20 bytes of metadata overhead. The size of a2.tdb is 148 bytes: 128 bytes for storing 32 4-byte floats (recall that each cell stores two floats), plus the same 20 bytes of metadata overhead.
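The file sizes in the listing can be checked with a little arithmetic. The sketch below takes the 16 cells and the 20-byte overhead from the description above; `attribute_file_size` is a hypothetical helper written for this check, not part of TileDB.

```python
import struct

NUM_CELLS = 16       # cells written in this example
OVERHEAD_BYTES = 20  # per-file metadata overhead stated in the text


def attribute_file_size(fmt, values_per_cell):
    """Data bytes for all cells plus the fixed per-file overhead."""
    data_bytes = NUM_CELLS * values_per_cell * struct.calcsize("<" + fmt)
    return data_bytes + OVERHEAD_BYTES


a1_size = attribute_file_size("c", 1)  # 16 * 1 * 1 + 20
a2_size = attribute_file_size("f", 2)  # 16 * 2 * 4 + 20
print(a1_size, a2_size)  # 36 148
```

Both results match the sizes reported by `ls -l` for a1.tdb and a2.tdb.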