Array Metadata
In this tutorial we will learn how to write, read and consolidate array metadata.
Program: array_metadata
What is array metadata?
Array metadata is a collection of key-value pairs associated with an array.
Each metadata item is of the form key : (value_type, value_num, value), where key is a string key, value_type is the data type of the value, value_num is the number of elements that constitute the value (can be more than one), and value is the metadata value in binary form.
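To make the item layout concrete, here is a minimal sketch that packs Python values into (value_type, value_num, value) triples using only the standard library. The type names in TYPE_FORMATS, the little-endian byte order, and the helper functions are assumptions made for illustration; they are not TileDB's actual type system or API.

```python
import struct

# Hypothetical type table for illustration: type name -> struct format character.
TYPE_FORMATS = {"INT32": "i", "FLOAT32": "f", "UINT8": "B"}


def encode_item(value_type, values):
    """Return (value_type, value_num, value), with value packed as little-endian bytes."""
    fmt = "<" + TYPE_FORMATS[value_type] * len(values)
    return (value_type, len(values), struct.pack(fmt, *values))


def decode_item(item):
    """Unpack the binary value of an item back into a list of Python numbers."""
    value_type, value_num, value = item
    fmt = "<" + TYPE_FORMATS[value_type] * value_num
    return list(struct.unpack(fmt, value))


# A metadata collection is then a mapping from string keys to such triples.
metadata = {
    "aaa": encode_item("INT32", [100]),          # a single-element value
    "bb": encode_item("FLOAT32", [1.5, 2.5]),    # a value with two elements
}
```

The point of storing value_type and value_num next to the raw bytes is that the binary value alone cannot be interpreted: decode_item needs both to rebuild the original numbers.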
TileDB persistently stores the array metadata inside the array directory. However, note that TileDB loads all its metadata upon opening the array in read mode, which means that it assumes all the array metadata is small enough to be maintained in main memory.
Writing array metadata
To write array metadata, the array must be opened in write mode.
Note that the metadata gets flushed to persistent storage only upon closing the array.
Reading array metadata
To read array metadata, the array must be opened in read mode. You can retrieve a single metadata item by its key, or enumerate all array metadata items.
Deleting array metadata
TileDB also allows you to delete metadata items. The array must be opened in write mode and properly closed at the end so that the change gets flushed to persistent storage. Note also that you can mix writing/overwriting and deleting metadata in a single write session (i.e., between opening an array in write mode and closing it).
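The write-session behavior described above (changes are buffered, writes and deletes can be mixed, and nothing reaches storage until the array is closed) can be sketched with a toy model. ToyMetadataSession and its methods are hypothetical names invented for this illustration, and a plain dict stands in for persistent storage; this is not the TileDB API.

```python
class ToyMetadataSession:
    """Toy stand-in for an array opened in write mode (not the TileDB API)."""

    def __init__(self, store):
        self._store = store    # dict standing in for persistent storage
        self._pending = []     # buffered operations, applied in order on close

    def put(self, key, value):
        """Buffer a write (or overwrite) of a metadata item."""
        self._pending.append(("put", key, value))

    def delete(self, key):
        """Buffer the deletion of a metadata item."""
        self._pending.append(("delete", key, None))

    def close(self):
        """Flush all buffered operations to the store, in the order issued."""
        for op, key, value in self._pending:
            if op == "put":
                self._store[key] = value
            else:
                self._store.pop(key, None)
        self._pending.clear()


storage = {"aaa": 100, "bb": [1.5, 2.5]}

session = ToyMetadataSession(storage)
session.put("aaa", 200)        # overwrite an existing item
session.delete("bb")           # delete in the same session
session.put("ccc", "hello")    # write a new item
before_close = dict(storage)   # snapshot: nothing has been flushed yet
session.close()                # the changes become visible only now
```

In this sketch, before_close still holds the original two items, while storage reflects all three operations after close() returns.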
On-disk structure
Every array metadata write session (i.e., opening the array in write mode, writing/deleting some metadata and closing the array) creates a timestamped array metadata file inside the __meta directory in the array directory:
$ ls -l my_array/__meta/
total 8
-rwx------ 1 stavros staff 127 Sep 7 18:27 __1567895268179_1567895268179_87a009d6b2cf46b68d74621635863b45
Notice that the file name has an identical structure to that of a fragment name (see Fragments and Consolidation), i.e., it consists of a timestamp range and a UUID. The same semantics for opening an array at a timestamp also apply to metadata, i.e., if the array is opened at a timestamp before 1567895268179, the above file will be ignored.
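The naming scheme and the timestamp rule can be illustrated with a small parser. The helper names below are hypothetical, and the visibility check is a simplification: it assumes a file is taken into account only when the end of its timestamp range is not later than the timestamp at which the array is opened, which matches the single-timestamp case described above.

```python
from typing import NamedTuple


class MetaFileName(NamedTuple):
    t_start: int   # start of the timestamp range (milliseconds)
    t_end: int     # end of the timestamp range (milliseconds)
    uuid: str      # unique identifier of the write session


def parse_meta_file_name(name):
    """Split a name of the form __<t_start>_<t_end>_<uuid> into its parts."""
    if not name.startswith("__"):
        raise ValueError(f"unexpected metadata file name: {name!r}")
    t_start, t_end, uuid = name[2:].split("_")
    return MetaFileName(int(t_start), int(t_end), uuid)


def is_visible(name, open_timestamp):
    """Assumed rule: ignore files whose range ends after the open timestamp."""
    return parse_meta_file_name(name).t_end <= open_timestamp


name = "__1567895268179_1567895268179_87a009d6b2cf46b68d74621635863b45"
parsed = parse_meta_file_name(name)
```

With this rule, is_visible(name, 1567895268178) is False (the file is ignored), while opening at 1567895268179 or later sees it.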
Multiple separate write sessions (executed either serially or in parallel) create multiple timestamped metadata files, similar to fragments (and again no array locking is necessary here).
$ ls -l my_array/__meta/
total 8
-rwx------ 1 stavros staff 127 Sep 7 18:27 __1567895268179_1567895268179_87a009d6b2cf46b68d74621635863b45
-rwx------ 1 stavros staff 127 Sep 7 19:21 __1567898509507_1567898509507_f0d9756d932540729059eabcfe6856d1
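One way to picture how several metadata files combine when the array is opened is to replay them in timestamp order. The sketch below is a toy model under two assumptions that the text above does not spell out: files are applied from oldest to newest so that later sessions take precedence (as with fragments), and each file is represented as a list of hypothetical ("put" | "delete", key, value) operations rather than TileDB's real binary format.

```python
def start_timestamp(name):
    """Extract the first timestamp from a name of the form __<t1>_<t2>_<uuid>."""
    return int(name[2:].split("_")[0])


def load_metadata(files, open_timestamp=None):
    """Replay metadata files in timestamp order and return the merged items.

    files maps a file name to the list of operations recorded in that session.
    Files that start after open_timestamp are ignored.
    """
    merged = {}
    for name in sorted(files, key=start_timestamp):
        if open_timestamp is not None and start_timestamp(name) > open_timestamp:
            continue
        for op, key, value in files[name]:
            if op == "put":
                merged[key] = value
            else:
                merged.pop(key, None)
    return merged


files = {
    "__1567895268179_1567895268179_87a009d6b2cf46b68d74621635863b45": [
        ("put", "aaa", 100),
        ("put", "bb", [1.5, 2.5]),
    ],
    "__1567898509507_1567898509507_f0d9756d932540729059eabcfe6856d1": [
        ("put", "aaa", 200),       # overwrites the earlier value
        ("delete", "bb", None),    # removes an item written earlier
    ],
}

latest = load_metadata(files)
as_of_first = load_metadata(files, open_timestamp=1567895268179)
```

Because every session writes its own file, no session needs to modify what another one wrote; the merge happens only at read time.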
Consolidating array metadata
To avoid accumulating an unbounded number of array metadata files, TileDB lets you consolidate all of them into a single file, similar again to fragment consolidation.
Continuing the above example, the result of consolidation is:
$ ls -l my_array/__meta/
total 8
-rwx------ 1 stavros staff 127 Sep 7 19:30 __1567895268179_1567898509507_7382ed6aef65427e8cc9b076e6970c61
Notice that now the new file name contains a timestamp range that includes both the timestamps of the consolidated array metadata file names. This is again very similar to fragment consolidation.
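The way the consolidated name is derived from the input names can be sketched as follows. This is only an illustration of the naming rule visible in the listings above (earliest start timestamp, latest end timestamp, fresh UUID); the function name is hypothetical and the real consolidation also merges the file contents.

```python
import uuid


def consolidated_name(names):
    """Build a name whose timestamp range spans all the input file names."""
    starts, ends = [], []
    for name in names:
        t_start, t_end, _ = name[2:].split("_")
        starts.append(int(t_start))
        ends.append(int(t_end))
    # A new UUID is generated, so the suffix differs on every call.
    return f"__{min(starts)}_{max(ends)}_{uuid.uuid4().hex}"


result = consolidated_name([
    "__1567895268179_1567895268179_87a009d6b2cf46b68d74621635863b45",
    "__1567898509507_1567898509507_f0d9756d932540729059eabcfe6856d1",
])
```

For the two files in the running example, the resulting name starts with __1567895268179_1567898509507_, matching the range shown in the listing.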
Encrypting array metadata
Array metadata inherits the encryption filters of the array. This means that if the array is encrypted, the array metadata will be encrypted as well. Similar to array writes/reads, in order to write/read array metadata the array must be opened with the encryption key (see Encryption). Likewise, consolidating the metadata of an encrypted array requires providing the encryption key.