# GenomicsDB Protobuf Documentation

## genomicsdb_callsets_mapping.proto

### CallsetMappingPB

| Field | Type | Label | Description |
| ----- | ---- | ----- | ----------- |
| callsets | [SampleIDToTileDBIDMap](#SampleIDToTileDBIDMap) | repeated | |

### SampleIDToTileDBIDMap

| Field | Type | Label | Description |
| ----- | ---- | ----- | ----------- |
| sample_name | [string](#string) | required | |
| row_idx | [int64](#int64) | required | |
| idx_in_file | [int64](#int64) | required | |
| stream_name | [string](#string) | optional | |
| filename | [string](#string) | optional | |
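As an illustration of how the `CallsetMappingPB` and `SampleIDToTileDBIDMap` fields fit together, the sketch below builds a callset mapping as a plain Python dictionary. It does not use any GenomicsDB library. The `SampleEntry` dataclass and the `callset_mapping` helper are hypothetical stand-ins for generated protobuf classes, and the sample names and file names are made up.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class SampleEntry:
    """Hypothetical stand-in for one SampleIDToTileDBIDMap message."""
    sample_name: str                    # required
    row_idx: int                        # required
    idx_in_file: int                    # required
    stream_name: Optional[str] = None   # optional
    filename: Optional[str] = None      # optional


def callset_mapping(entries):
    """Build a CallsetMappingPB-shaped dict, omitting unset optional fields."""
    return {
        "callsets": [
            {key: value for key, value in asdict(entry).items() if value is not None}
            for entry in entries
        ]
    }


# Illustrative data only: two samples, one per (made-up) input file.
mapping = callset_mapping([
    SampleEntry("sample_A", row_idx=0, idx_in_file=0, filename="a.vcf.gz"),
    SampleEntry("sample_B", row_idx=1, idx_in_file=0, filename="b.vcf.gz"),
])
print(json.dumps(mapping, indent=2))
```

Leaving unset optional fields out of the dictionary mirrors how optional protobuf fields are simply absent when not set.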

## genomicsdb_coordinates.proto

### ContigInterval

| Field | Type | Label | Description |
| ----- | ---- | ----- | ----------- |
| contig | [string](#string) | required | |
| begin | [int64](#int64) | optional | |
| end | [int64](#int64) | optional | |

### ContigPosition

| Field | Type | Label | Description |
| ----- | ---- | ----- | ----------- |
| contig | [string](#string) | required | |
| position | [int64](#int64) | required | |

### GenomicsDBColumn

| Field | Type | Label | Description |
| ----- | ---- | ----- | ----------- |
| tiledb_column | [int64](#int64) | optional | |
| contig_position | [ContigPosition](#ContigPosition) | optional | |

### GenomicsDBColumnInterval

| Field | Type | Label | Description |
| ----- | ---- | ----- | ----------- |
| tiledb_column_interval | [TileDBColumnInterval](#TileDBColumnInterval) | optional | |
| contig_interval | [ContigInterval](#ContigInterval) | optional | |

### GenomicsDBColumnOrInterval

| Field | Type | Label | Description |
| ----- | ---- | ----- | ----------- |
| column | [GenomicsDBColumn](#GenomicsDBColumn) | optional | |
| column_interval | [GenomicsDBColumnInterval](#GenomicsDBColumnInterval) | optional | |

### TileDBColumnInterval

| Field | Type | Label | Description |
| ----- | ---- | ----- | ----------- |
| begin | [int64](#int64) | required | |
| end | [int64](#int64) | required | |
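The coordinate messages nest several levels deep, so a small sketch may help show how a single position and an interval are wrapped in a `GenomicsDBColumnOrInterval`. The helpers `contig_interval` and `column_or_interval` are hypothetical and use plain dictionaries rather than generated protobuf classes. The tables list both members of each wrapper as optional; the sketch assumes that exactly one of them is meant to be set, which the tables themselves do not state.

```python
def contig_interval(contig, begin=None, end=None):
    """ContigInterval-shaped dict: contig is required, begin and end are optional."""
    msg = {"contig": contig}
    if begin is not None:
        msg["begin"] = begin
    if end is not None:
        msg["end"] = end
    return msg


def column_or_interval(*, position=None, interval=None):
    """GenomicsDBColumnOrInterval-shaped dict wrapping a position or an interval.

    Assumption (not stated in the tables): exactly one member is set.
    `position` is a (contig, position) tuple; `interval` is a ContigInterval dict.
    """
    if (position is None) == (interval is None):
        raise ValueError("set exactly one of position or interval")
    if position is not None:
        contig, pos = position
        return {"column": {"contig_position": {"contig": contig, "position": pos}}}
    return {"column_interval": {"contig_interval": interval}}


single = column_or_interval(position=("1", 12345))
span = column_or_interval(interval=contig_interval("1", begin=100, end=500))
whole_contig = contig_interval("2")  # begin and end left unset
```

The same wrappers could carry `tiledb_column` or `tiledb_column_interval` instead when a query is expressed in flattened TileDB column coordinates rather than contig coordinates.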

## genomicsdb_export_config.proto

### AnnotationSource

| Field | Type | Label | Description |
| ----- | ---- | ----- | ----------- |
| filename | [string](#string) | required | |
| data_source | [string](#string) | required | |
| attributes | [string](#string) | repeated | |
| is_vcf | [bool](#bool) | optional | Default: true |
| file_chromosomes | [string](#string) | repeated | |

### ExportConfiguration

| Field | Type | Label | Description |
| ----- | ---- | ----- | ----------- |
| workspace | [string](#string) | required | |
| reference_genome | [string](#string) | optional | |
| array_name | [string](#string) | optional | |
| generate_array_name_from_partition_bounds | [bool](#bool) | optional | Default: true |
| query_column_ranges | [GenomicsDBColumnOrIntervalList](#genomicsdb_pb-GenomicsDBColumnOrIntervalList) | repeated | Define only one of query_column_ranges and query_contig_intervals; query_contig_intervals is recommended. |
| query_contig_intervals | [ContigInterval](#ContigInterval) | repeated | |
| query_row_ranges | [RowRangeList](#genomicsdb_pb-RowRangeList) | repeated | Define only one of query_row_ranges and query_sample_names. |
| query_sample_names | [string](#string) | repeated | |
| attributes | [string](#string) | repeated | |
| query_filter | [string](#string) | optional | End of the fields shared with QueryConfiguration. |
| vcf_header_filename | [string](#string) | optional | |
| vcf_output_filename | [string](#string) | optional | |
| vcf_output_format | [string](#string) | optional | |
| vid_mapping_file | [string](#string) | optional | |
| vid_mapping | [VidMappingPB](#VidMappingPB) | optional | |
| callset_mapping_file | [string](#string) | optional | |
| callset_mapping | [CallsetMappingPB](#CallsetMappingPB) | optional | |
| max_diploid_alt_alleles_that_can_be_genotyped | [uint32](#uint32) | optional | Other configuration |
| max_genotype_count | [uint32](#uint32) | optional | |
| index_output_VCF | [bool](#bool) | optional | |
| produce_GT_field | [bool](#bool) | optional | |
| produce_FILTER_field | [bool](#bool) | optional | |
| sites_only_query | [bool](#bool) | optional | |
| produce_GT_with_min_PL_value_for_spanning_deletions | [bool](#bool) | optional | |
| scan_full | [bool](#bool) | optional | |
| segment_size | [uint32](#uint32) | optional | Default: 10485760 |
| combined_vcf_records_buffer_size_limit | [uint32](#uint32) | optional | |
| enable_shared_posixfs_optimizations | [bool](#bool) | optional | Default: false |
| bypass_intersecting_intervals_phase | [bool](#bool) | optional | Default: false |
| spark_config | [SparkConfig](#genomicsdb_pb-SparkConfig) | optional | |
| annotation_source | [AnnotationSource](#genomicsdb_pb-AnnotationSource) | repeated | |
| annotation_buffer_size | [uint32](#uint32) | optional | Default: 10240 |

### GenomicsDBColumnOrIntervalList

| Field | Type | Label | Description |
| ----- | ---- | ----- | ----------- |
| column_or_interval_list | [GenomicsDBColumnOrInterval](#GenomicsDBColumnOrInterval) | repeated | |

### QueryConfiguration

Simple query configuration for `GenomicsDB::query_variant_calls`, for use when the class has been initialized with an ExportConfiguration.

| Field | Type | Label | Description |
| ----- | ---- | ----- | ----------- |
| array_name | [string](#string) | optional | |
| generate_array_name_from_partition_bounds | [bool](#bool) | optional | Default: true |
| query_column_ranges | [GenomicsDBColumnOrIntervalList](#genomicsdb_pb-GenomicsDBColumnOrIntervalList) | repeated | Define only one of query_column_ranges and query_contig_intervals; query_contig_intervals is recommended. |
| query_contig_intervals | [ContigInterval](#ContigInterval) | repeated | |
| query_row_ranges | [RowRangeList](#genomicsdb_pb-RowRangeList) | repeated | Define only one of query_row_ranges and query_sample_names. |
| query_sample_names | [string](#string) | repeated | |
| attributes | [string](#string) | repeated | |
| query_filter | [string](#string) | optional | |

### RowRange

| Field | Type | Label | Description |
| ----- | ---- | ----- | ----------- |
| low | [int64](#int64) | required | |
| high | [int64](#int64) | required | |

### RowRangeList

| Field | Type | Label | Description |
| ----- | ---- | ----- | ----------- |
| range_list | [RowRange](#genomicsdb_pb-RowRange) | repeated | |

### SparkConfig

| Field | Type | Label | Description |
| ----- | ---- | ----- | ----------- |
| query_block_size | [int64](#int64) | optional | |
| query_block_size_margin | [int64](#int64) | optional | |
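The `ExportConfiguration` and `QueryConfiguration` tables describe two pairs of mutually exclusive query fields. The sketch below shows one way such a rule could be checked before a configuration is handed to GenomicsDB. It works on a plain dictionary keyed by the field names from the tables. `check_export_config` is a hypothetical helper, not part of GenomicsDB, and the workspace path, sample name, and attribute names are placeholders.

```python
# Pairs of fields of which only one should be defined, per the tables above.
EXCLUSIVE_PAIRS = (
    ("query_column_ranges", "query_contig_intervals"),
    ("query_row_ranges", "query_sample_names"),
)


def check_export_config(cfg):
    """Return a list of problems found in an ExportConfiguration-shaped dict."""
    problems = []
    if not cfg.get("workspace"):
        problems.append("workspace is required")
    for first, second in EXCLUSIVE_PAIRS:
        if cfg.get(first) and cfg.get(second):
            problems.append(f"define only one of {first} and {second}")
    return problems


# Placeholder values for illustration only.
export_cfg = {
    "workspace": "/path/to/workspace",
    "query_contig_intervals": [{"contig": "1", "begin": 100, "end": 500}],
    "query_sample_names": ["sample_A"],
    "attributes": ["REF", "ALT", "GT"],
    "segment_size": 10485760,
}
print(check_export_config(export_cfg))  # prints []
```

Returning a list of problems rather than raising on the first one makes it easier to report every conflict in a configuration at once.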

## genomicsdb_import_config.proto

### ImportConfiguration

| Field | Type | Label | Description |
| ----- | ---- | ----- | ----------- |
| size_per_column_partition | [int64](#int64) | required | Default: 16384 |
| row_based_partitioning | [bool](#bool) | optional | Default: false |
| produce_combined_vcf | [bool](#bool) | optional | Default: false |
| produce_tiledb_array | [bool](#bool) | optional | Default: true |
| column_partitions | [Partition](#genomicsdb_pb-Partition) | repeated | |
| vid_mapping_file | [string](#string) | optional | |
| vid_mapping | [VidMappingPB](#VidMappingPB) | optional | |
| callset_mapping_file | [string](#string) | optional | |
| callset_mapping | [CallsetMappingPB](#CallsetMappingPB) | optional | |
| treat_deletions_as_intervals | [bool](#bool) | optional | Default: true |
| num_parallel_vcf_files | [int32](#int32) | optional | Default: 1 |
| delete_and_create_tiledb_array | [bool](#bool) | optional | Default: false |
| do_ping_pong_buffering | [bool](#bool) | optional | Default: true |
| offload_vcf_output_processing | [bool](#bool) | optional | Default: true |
| discard_vcf_index | [bool](#bool) | optional | Default: true |
| segment_size | [int64](#int64) | optional | Default: 10485760 |
| compress_tiledb_array | [bool](#bool) | optional | Default: true |
| num_cells_per_tile | [int64](#int64) | optional | Default: 1000 |
| fail_if_updating | [bool](#bool) | optional | Default: false |
| tiledb_compression_type | [int32](#int32) | optional | Default: 1 |
| tiledb_compression_level | [int32](#int32) | optional | Default: -1 |
| consolidate_tiledb_array_after_load | [bool](#bool) | optional | Default: false |
| disable_synced_writes | [bool](#bool) | optional | Default: true |
| ignore_cells_not_in_partition | [bool](#bool) | optional | |
| lb_callset_row_idx | [int64](#int64) | optional | Default: 0 |
| ub_callset_row_idx | [int64](#int64) | optional | |
| enable_shared_posixfs_optimizations | [bool](#bool) | optional | Default: false |
| disable_delta_encode_for_offsets | [bool](#bool) | optional | Default: false |
| disable_delta_encode_for_coords | [bool](#bool) | optional | Default: false |
| enable_bit_shuffle_gt | [bool](#bool) | optional | Default: false |
| enable_lz4_compression_gt | [bool](#bool) | optional | Default: false |
| reference_genome | [string](#string) | optional | |
| vcf_header_filename | [string](#string) | optional | |

### Partition

| Field | Type | Label | Description |
| ----- | ---- | ----- | ----------- |
| begin | [GenomicsDBColumn](#GenomicsDBColumn) | required | |
| workspace | [string](#string) | optional | |
| array_name | [string](#string) | optional | |
| generate_array_name_from_partition_bounds | [bool](#bool) | optional | |
| vcf_output_filename | [string](#string) | optional | |
| vcf_header_filename | [string](#string) | optional | |
| end | [GenomicsDBColumn](#GenomicsDBColumn) | optional | |
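Many `ImportConfiguration` fields carry a documented default, so a sparse configuration still has well-defined effective values. The sketch below overlays user settings on a subset of the defaults from the table and checks that each `Partition` has its required `begin` field. `IMPORT_DEFAULTS` and `effective_import_config` are hypothetical names, plain dictionaries stand in for generated protobuf classes, and the workspace path and array name are placeholders.

```python
# A subset of the defaults listed in the ImportConfiguration table.
IMPORT_DEFAULTS = {
    "size_per_column_partition": 16384,
    "row_based_partitioning": False,
    "produce_combined_vcf": False,
    "produce_tiledb_array": True,
    "num_parallel_vcf_files": 1,
    "segment_size": 10485760,
    "num_cells_per_tile": 1000,
    "tiledb_compression_level": -1,
}


def effective_import_config(user_cfg):
    """Overlay user settings on the documented defaults.

    Raises ValueError if a partition lacks its required `begin` field.
    """
    for index, partition in enumerate(user_cfg.get("column_partitions", [])):
        if "begin" not in partition:
            raise ValueError(f"column_partitions[{index}] is missing 'begin'")
    return {**IMPORT_DEFAULTS, **user_cfg}


# Placeholder values for illustration only.
import_cfg = effective_import_config({
    "num_parallel_vcf_files": 4,
    "column_partitions": [
        {
            "begin": {"contig_position": {"contig": "1", "position": 1}},
            "workspace": "/path/to/workspace",
            "array_name": "partition_0",
        }
    ],
})
print(import_cfg["segment_size"], import_cfg["num_parallel_vcf_files"])  # prints 10485760 4
```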

## genomicsdb_vid_mapping.proto

### Chromosome

| Field | Type | Label | Description |
| ----- | ---- | ----- | ----------- |
| name | [string](#string) | required | |
| length | [int64](#int64) | required | |
| tiledb_column_offset | [int64](#int64) | required | |

### FieldLengthDescriptorComponentPB

| Field | Type | Label | Description |
| ----- | ---- | ----- | ----------- |
| variable_length_descriptor | [string](#string) | optional | |
| fixed_length | [int32](#int32) | optional | |

### GenomicsDBFieldInfo

| Field | Type | Label | Description |
| ----- | ---- | ----- | ----------- |
| name | [string](#string) | required | |
| type | [string](#string) | repeated | |
| vcf_field_class | [string](#string) | repeated | |
| vcf_type | [string](#string) | optional | |
| length | [FieldLengthDescriptorComponentPB](#FieldLengthDescriptorComponentPB) | repeated | |
| vcf_delimiter | [string](#string) | repeated | |
| VCF_field_combine_operation | [string](#string) | optional | |
| vcf_name | [string](#string) | optional | Useful when multiple fields of different types or lengths with the same name (FILTER, FORMAT, INFO) are defined in the VCF header. |
| disable_remap_missing_with_non_ref | [bool](#bool) | optional | Default: false |

### VidMappingPB

| Field | Type | Label | Description |
| ----- | ---- | ----- | ----------- |
| fields | [GenomicsDBFieldInfo](#GenomicsDBFieldInfo) | repeated | |
| contigs | [Chromosome](#Chromosome) | repeated | |

## Scalar Value Types

| .proto Type | Notes | C++ | Java | Python | Go |
| ----------- | ----- | --- | ---- | ------ | -- |
| double | | double | double | float | float64 |
| float | | float | float | float | float32 |
| int32 | Uses variable-length encoding. Inefficient for encoding negative numbers; if your field is likely to have negative values, use sint32 instead. | int32 | int | int | int32 |
| int64 | Uses variable-length encoding. Inefficient for encoding negative numbers; if your field is likely to have negative values, use sint64 instead. | int64 | long | int/long | int64 |
| uint32 | Uses variable-length encoding. | uint32 | int | int/long | uint32 |
| uint64 | Uses variable-length encoding. | uint64 | long | int/long | uint64 |
| sint32 | Uses variable-length encoding. Signed int value. These more efficiently encode negative numbers than regular int32s. | int32 | int | int | int32 |
| sint64 | Uses variable-length encoding. Signed int value. These more efficiently encode negative numbers than regular int64s. | int64 | long | int/long | int64 |
| fixed32 | Always four bytes. More efficient than uint32 if values are often greater than 2^28. | uint32 | int | int | uint32 |
| fixed64 | Always eight bytes. More efficient than uint64 if values are often greater than 2^56. | uint64 | long | int/long | uint64 |
| sfixed32 | Always four bytes. | int32 | int | int | int32 |
| sfixed64 | Always eight bytes. | int64 | long | int/long | int64 |
| bool | | bool | boolean | bool | bool |
| string | A string must always contain UTF-8 encoded or 7-bit ASCII text. | string | String | str/unicode | string |
| bytes | May contain any arbitrary sequence of bytes. | string | ByteString | str | []byte |
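The notes on variable-length encoding in the scalar table can be made concrete with a small sketch of the protobuf wire format. It implements base-128 varints and the ZigZag mapping used by `sint64` in plain Python, independently of any protobuf library. The function names are illustrative, not library APIs.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF


def encode_varint(value):
    """Encode a non-negative integer as a protobuf base-128 varint."""
    if value < 0:
        raise ValueError("varints encode unsigned values")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # continuation bit: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def as_int64_wire(n):
    """int32/int64: a negative value goes on the wire as 64-bit two's complement."""
    return n & MASK64


def zigzag64(n):
    """sint64: interleave negative and positive values so small magnitudes stay small."""
    return ((n << 1) ^ (n >> 63)) & MASK64


print(len(encode_varint(as_int64_wire(-1))))  # 10 bytes when declared int64
print(len(encode_varint(zigzag64(-1))))       # 1 byte when declared sint64
print(len(encode_varint(2**28)))              # 5 bytes as uint32; fixed32 always uses 4
```

This is why fields such as `tiledb_compression_level`, whose default is -1, cost more bytes as `int32` than they would as `sint32`, although for a handful of configuration fields the difference is negligible.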