GenomicsDB Protobuf Documentation
genomicsdb_callsets_mapping.proto
CallsetMappingPB
Field | Type | Label | Description |
---|---|---|---|
callsets | | repeated | |
SampleIDToTileDBIDMap
Field | Type | Label | Description |
---|---|---|---|
sample_name | | required | |
row_idx | | required | |
idx_in_file | | required | |
stream_name | | optional | |
filename | | optional | |
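
As a rough illustration of the two messages above, the sketch below builds a callset mapping as a plain Python dictionary and checks each entry against the required and optional field names from the SampleIDToTileDBIDMap table. The value types (string names, integer indices), the sample names, and the file names are assumptions made for illustration; `validate_callset` is a hypothetical helper, not part of GenomicsDB.

```python
import json

# Field names copied from the SampleIDToTileDBIDMap table.
REQUIRED_FIELDS = ("sample_name", "row_idx", "idx_in_file")
OPTIONAL_FIELDS = ("stream_name", "filename")


def validate_callset(entry):
    """Return a list of problems found in one callset entry (empty if none)."""
    problems = [f"missing required field: {name}"
                for name in REQUIRED_FIELDS if name not in entry]
    known = set(REQUIRED_FIELDS) | set(OPTIONAL_FIELDS)
    problems += [f"unknown field: {name}"
                 for name in sorted(entry) if name not in known]
    return problems


# Hypothetical samples and file names, for illustration only.
callset_mapping = {
    "callsets": [
        {"sample_name": "sample_A", "row_idx": 0, "idx_in_file": 0,
         "filename": "a.vcf.gz"},
        {"sample_name": "sample_B", "row_idx": 1, "idx_in_file": 0,
         "filename": "b.vcf.gz"},
    ]
}

all_problems = [problem
                for entry in callset_mapping["callsets"]
                for problem in validate_callset(entry)]
print(json.dumps(callset_mapping, indent=2))
```

The dictionary layout follows the message structure (a repeated `callsets` field holding one entry per sample), so it can be serialized to JSON without further reshaping.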
genomicsdb_coordinates.proto
ContigInterval
Field | Type | Label | Description |
---|---|---|---|
contig | | required | |
begin | | optional | |
end | | optional | |
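
Because only `contig` is required in ContigInterval, an interval can name a whole contig, a start position, or a closed range. The sketch below shows one way to produce such intervals from region strings. The `contig:begin-end` string format and the `parse_region` helper are assumptions for illustration, not something GenomicsDB defines here, and contig names that themselves contain `:` are not handled.

```python
def parse_region(region):
    """Parse 'contig', 'contig:begin' or 'contig:begin-end' into a dict
    that uses the ContigInterval field names."""
    contig, sep, span = region.partition(":")
    if not contig:
        raise ValueError("contig is required")
    interval = {"contig": contig}
    if sep and span:
        begin, dash, end = span.partition("-")
        interval["begin"] = int(begin)
        # 'end' stays unset when the region string gives only a start.
        if dash and end:
            interval["end"] = int(end)
    return interval


whole_contig = parse_region("chr1")
from_position = parse_region("chr1:100")
closed_range = parse_region("chr1:100-200")
```

Leaving `begin` and `end` out of the dictionary, rather than setting them to a sentinel, mirrors how unset optional fields behave in the message.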
ContigPosition
Field | Type | Label | Description |
---|---|---|---|
contig | | required | |
position | | required | |
GenomicsDBColumn
Field | Type | Label | Description |
---|---|---|---|
tiledb_column | | optional | |
contig_position | | optional | |
GenomicsDBColumnInterval
Field | Type | Label | Description |
---|---|---|---|
tiledb_column_interval | | optional | |
contig_interval | | optional | |
GenomicsDBColumnOrInterval
Field | Type | Label | Description |
---|---|---|---|
column | | optional | |
column_interval | | optional | |
TileDBColumnInterval
Field | Type | Label | Description |
---|---|---|---|
begin | | required | |
end | | required | |
genomicsdb_export_config.proto
AnnotationSource
Field | Type | Label | Description |
---|---|---|---|
filename | | required | |
data_source | | required | |
attributes | | repeated | |
is_vcf | | optional | Default: true |
file_chromosomes | | repeated | |
ExportConfiguration
Field | Type | Label | Description |
---|---|---|---|
workspace | | required | |
reference_genome | | optional | |
array_name | | optional | |
generate_array_name_from_partition_bounds | | optional | Default: true |
query_column_ranges | | repeated | Define only one of query_column_ranges and query_contig_intervals; query_contig_intervals is recommended. |
query_contig_intervals | | repeated | |
query_row_ranges | | repeated | Define only one of query_row_ranges and query_sample_names. |
query_sample_names | | repeated | |
attributes | | repeated | |
query_filter | | optional | End of the fields shared with QueryConfiguration. |
vcf_header_filename | | optional | |
vcf_output_filename | | optional | |
vcf_output_format | | optional | |
vid_mapping_file | | optional | |
vid_mapping | | optional | |
callset_mapping_file | | optional | |
callset_mapping | | optional | |
max_diploid_alt_alleles_that_can_be_genotyped | | optional | Start of the other configuration fields. |
max_genotype_count | | optional | |
index_output_VCF | | optional | |
produce_GT_field | | optional | |
produce_FILTER_field | | optional | |
sites_only_query | | optional | |
produce_GT_with_min_PL_value_for_spanning_deletions | | optional | |
scan_full | | optional | |
segment_size | | optional | Default: 10485760 |
combined_vcf_records_buffer_size_limit | | optional | |
enable_shared_posixfs_optimizations | | optional | Default: false |
bypass_intersecting_intervals_phase | | optional | Default: false |
spark_config | | optional | |
annotation_source | | repeated | |
annotation_buffer_size | | optional | Default: 10240 |
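
The field descriptions above pair up the query selectors: column ranges with contig intervals, and row ranges with sample names. The sketch below checks a dictionary-based export configuration against that reading. It assumes "only one of the following two fields" means the two fields in each pair should not both be set; the workspace path, sample name, and the `check_export_config` helper are hypothetical.

```python
# Field pairs that, under the assumption above, should not both be set.
EXCLUSIVE_PAIRS = [
    ("query_column_ranges", "query_contig_intervals"),
    ("query_row_ranges", "query_sample_names"),
]


def check_export_config(config):
    """Return a list of error strings for one export configuration dict."""
    errors = []
    if "workspace" not in config:
        errors.append("workspace is required")
    for first, second in EXCLUSIVE_PAIRS:
        # An empty list counts as unset, as it would for a repeated field.
        if config.get(first) and config.get(second):
            errors.append(f"set only one of {first} and {second}")
    return errors


# Hypothetical configuration, for illustration only.
export_config = {
    "workspace": "/path/to/workspace",
    "query_contig_intervals": [{"contig": "chr1", "begin": 100, "end": 200}],
    "query_sample_names": ["sample_A"],
}

errors = check_export_config(export_config)
```

A configuration that sets both members of a pair, or omits `workspace`, produces one error string per problem instead of raising, which makes it easy to report everything at once.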
GenomicsDBColumnOrIntervalList
Field | Type | Label | Description |
---|---|---|---|
column_or_interval_list | | repeated | |
QueryConfiguration
Simple query configuration for GenomicsDB::query_variant_calls, for use when the class has been initialized with ExportConfiguration.
Field | Type | Label | Description |
---|---|---|---|
array_name | | optional | |
generate_array_name_from_partition_bounds | | optional | Default: true |
query_column_ranges | | repeated | Define only one of query_column_ranges and query_contig_intervals; query_contig_intervals is recommended. |
query_contig_intervals | | repeated | |
query_row_ranges | | repeated | Define only one of query_row_ranges and query_sample_names. |
query_sample_names | | repeated | |
attributes | | repeated | |
query_filter | | optional | |
RowRange
Field | Type | Label | Description |
---|---|---|---|
low | | required | |
high | | required | |
RowRangeList
Field | Type | Label | Description |
---|---|---|---|
range_list | | repeated | |
SparkConfig
Field | Type | Label | Description |
---|---|---|---|
query_block_size | | optional | |
query_block_size_margin | | optional | |
genomicsdb_import_config.proto
ImportConfiguration
Field | Type | Label | Description |
---|---|---|---|
size_per_column_partition | | required | Default: 16384 |
row_based_partitioning | | optional | Default: false |
produce_combined_vcf | | optional | Default: false |
produce_tiledb_array | | optional | Default: true |
column_partitions | | repeated | |
vid_mapping_file | | optional | |
vid_mapping | | optional | |
callset_mapping_file | | optional | |
callset_mapping | | optional | |
treat_deletions_as_intervals | | optional | Default: true |
num_parallel_vcf_files | | optional | Default: 1 |
delete_and_create_tiledb_array | | optional | Default: false |
do_ping_pong_buffering | | optional | Default: true |
offload_vcf_output_processing | | optional | Default: true |
discard_vcf_index | | optional | Default: true |
segment_size | | optional | Default: 10485760 |
compress_tiledb_array | | optional | Default: true |
num_cells_per_tile | | optional | Default: 1000 |
fail_if_updating | | optional | Default: false |
tiledb_compression_type | | optional | Default: 1 |
tiledb_compression_level | | optional | Default: -1 |
consolidate_tiledb_array_after_load | | optional | Default: false |
disable_synced_writes | | optional | Default: true |
ignore_cells_not_in_partition | | optional | |
lb_callset_row_idx | | optional | Default: 0 |
ub_callset_row_idx | | optional | |
enable_shared_posixfs_optimizations | | optional | Default: false |
disable_delta_encode_for_offsets | | optional | Default: false |
disable_delta_encode_for_coords | | optional | Default: false |
enable_bit_shuffle_gt | | optional | Default: false |
enable_lz4_compression_gt | | optional | Default: false |
reference_genome | | optional | |
vcf_header_filename | | optional | |
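
Many ImportConfiguration fields carry a default, so a field that is left unset reads back as its documented default. The sketch below mimics that behavior with plain dictionaries. It copies only a subset of the defaults from the table, and `effective_import_config` is a hypothetical helper rather than a GenomicsDB function.

```python
# A subset of the defaults documented in the ImportConfiguration table.
IMPORT_DEFAULTS = {
    "size_per_column_partition": 16384,
    "row_based_partitioning": False,
    "produce_combined_vcf": False,
    "produce_tiledb_array": True,
    "num_parallel_vcf_files": 1,
    "segment_size": 10485760,
    "compress_tiledb_array": True,
    "num_cells_per_tile": 1000,
    "tiledb_compression_level": -1,
}


def effective_import_config(user_config):
    """Overlay user-supplied values on the documented defaults."""
    merged = dict(IMPORT_DEFAULTS)
    merged.update(user_config)
    return merged


# Only segment_size is overridden; every other field keeps its default.
config = effective_import_config({"segment_size": 1048576})
```

Copying the defaults before updating keeps `IMPORT_DEFAULTS` unchanged across calls.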
Partition
Field | Type | Label | Description |
---|---|---|---|
begin | | required | |
workspace | | optional | |
array_name | | optional | |
generate_array_name_from_partition_bounds | | optional | |
vcf_output_filename | | optional | |
vcf_header_filename | | optional | |
end | | optional | |
genomicsdb_vid_mapping.proto
Chromosome
Field | Type | Label | Description |
---|---|---|---|
name | | required | |
length | | required | |
tiledb_column_offset | | required | |
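
The Chromosome fields suggest that each contig occupies a contiguous block of columns starting at `tiledb_column_offset`. The sketch below converts between a contig position and a flattened column index under that reading. The 1-based position convention, the contig names and lengths, and both helper functions are assumptions for illustration; check the GenomicsDB sources before relying on the exact arithmetic.

```python
# Hypothetical contigs; the keys mirror the Chromosome field names.
CHROMOSOMES = [
    {"name": "chr1", "length": 1000, "tiledb_column_offset": 0},
    {"name": "chr2", "length": 500, "tiledb_column_offset": 1000},
]


def contig_position_to_column(chromosomes, contig, position):
    """Map a 1-based contig position (assumed) to a flattened column index."""
    for chrom in chromosomes:
        if chrom["name"] == contig:
            if not 1 <= position <= chrom["length"]:
                raise ValueError(f"position {position} is outside {contig}")
            return chrom["tiledb_column_offset"] + position - 1
    raise KeyError(f"unknown contig: {contig}")


def column_to_contig_position(chromosomes, column):
    """Inverse mapping: find the contig whose column block contains `column`."""
    for chrom in chromosomes:
        start = chrom["tiledb_column_offset"]
        if start <= column < start + chrom["length"]:
            return chrom["name"], column - start + 1
    raise ValueError(f"column {column} is not covered by any contig")


first_chr2_column = contig_position_to_column(CHROMOSOMES, "chr2", 1)
last_chr1_position = column_to_contig_position(CHROMOSOMES, 999)
```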
FieldLengthDescriptorComponentPB
Field | Type | Label | Description |
---|---|---|---|
variable_length_descriptor | | optional | |
fixed_length | | optional | |
GenomicsDBFieldInfo
Field | Type | Label | Description |
---|---|---|---|
name | | required | |
type | | repeated | |
vcf_field_class | | repeated | |
vcf_type | | optional | |
length | | repeated | |
vcf_delimiter | | repeated | |
VCF_field_combine_operation | | optional | |
vcf_name | | optional | Useful when the VCF header defines multiple fields of different types or lengths with the same name (FILTER, FORMAT, INFO). |
disable_remap_missing_with_non_ref | | optional | Default: false |
VidMappingPB
Field | Type | Label | Description |
---|---|---|---|
fields | | repeated | |
contigs | | repeated | |
Scalar Value Types
.proto Type | Notes | C++ | Java | Python | Go |
---|---|---|---|---|---|
double | | double | double | float | float64 |
float | | float | float | float | float32 |
int32 | Uses variable-length encoding. Inefficient for encoding negative numbers; if your field is likely to have negative values, use sint32 instead. | int32 | int | int | int32 |
int64 | Uses variable-length encoding. Inefficient for encoding negative numbers; if your field is likely to have negative values, use sint64 instead. | int64 | long | int/long | int64 |
uint32 | Uses variable-length encoding. | uint32 | int | int/long | uint32 |
uint64 | Uses variable-length encoding. | uint64 | long | int/long | uint64 |
sint32 | Uses variable-length encoding. Signed int value. Encodes negative numbers more efficiently than a regular int32. | int32 | int | int | int32 |
sint64 | Uses variable-length encoding. Signed int value. Encodes negative numbers more efficiently than a regular int64. | int64 | long | int/long | int64 |
fixed32 | Always four bytes. More efficient than uint32 if values are often greater than 2^28. | uint32 | int | int | uint32 |
fixed64 | Always eight bytes. More efficient than uint64 if values are often greater than 2^56. | uint64 | long | int/long | uint64 |
sfixed32 | Always four bytes. | int32 | int | int | int32 |
sfixed64 | Always eight bytes. | int64 | long | int/long | int64 |
bool | | bool | boolean | boolean | bool |
string | A string must always contain UTF-8 encoded or 7-bit ASCII text. | string | String | str/unicode | string |
bytes | May contain any arbitrary sequence of bytes. | string | ByteString | str | []byte |
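
The notes on variable-length encoding can be made concrete with a small sketch of the protobuf wire format. It implements base-128 varints and the ZigZag mapping used by the sint types, then compares encoded sizes. The function names are illustrative and not part of any protobuf library.

```python
MASK_64 = 0xFFFFFFFFFFFFFFFF


def encode_varint(value):
    """Encode a non-negative integer as a base-128 varint (7 bits per byte)."""
    if value < 0:
        raise ValueError("varints encode non-negative integers only")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # continuation bit: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_int64(value):
    """int32/int64 encoding: negatives become 64-bit two's complement."""
    return encode_varint(value & MASK_64)


def zigzag64(value):
    """Map signed to unsigned so small magnitudes stay small: 0, -1, 1, -2 -> 0, 1, 2, 3."""
    return ((value << 1) ^ (value >> 63)) & MASK_64


def encode_sint64(value):
    """sint32/sint64 encoding: ZigZag first, then varint."""
    return encode_varint(zigzag64(value))


negative_as_int = len(encode_int64(-1))    # ten bytes
negative_as_sint = len(encode_sint64(-1))  # one byte
large_as_varint = len(encode_varint(2**28))  # five bytes, versus four for fixed32
```

A negative int32 or int64 always costs ten bytes because its sign-extended two's-complement form fills all 64 bits, while the ZigZag form of -1 fits in one byte. Similarly, a varint needs five bytes from 2^28 upward, which is where the four-byte fixed32 becomes the smaller encoding.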