Skip to content

Catalog

Tasks

Name Description Parameters
samtools_faidx Create a FASTA index (.fai file) using samtools faidx. ref (Reference)
create_sequence_dictionary Create a sequence dictionary (.dict file) using GATK CreateSequenceDictionary. ref (Reference)
bwa_index Create BWA index files for a reference genome using bwa index. ref (Reference)
bwa_mem Align FASTQ reads to reference genome using BWA-MEM. ref (Reference), r1 (R1), r2 (R2
sort_sam Sort a SAM/BAM file. alignment (Alignment), sort_order (str)
mark_duplicates Mark duplicate reads in a BAM file. alignment (Alignment)
merge_bam_alignment Merge alignment data from aligned BAM with data in unmapped BAM. aligned_bam (Alignment), unmapped_bam (Alignment), ref (Reference)
base_recalibrator Generate a Base Quality Score Recalibration report. alignment (Alignment), ref (Reference), known_sites (list[KnownSites])
apply_bqsr Apply Base Quality Score Recalibration to a BAM file. alignment (Alignment), ref (Reference), bqsr_report (BQSRReport)
haplotype_caller Call germline variants in GVCF mode using GATK HaplotypeCaller. alignment (Alignment), ref (Reference)
joint_call_gvcfs Consolidate GVCFs into GenomicsDB and joint-genotype in a single task. gvcfs (list[Variants]), ref (Reference), intervals (list[str]), cohort_id (str)
combine_gvcfs Combine multiple per-sample GVCFs into a single multi-sample GVCF. gvcfs (list[Variants]), ref (Reference), cohort_id (str)
genomics_db_import Import GVCFs to GenomicsDB workspace for scalable joint genotyping. gvcfs (list[Variants]), workspace_path (Path), intervals (list[str]
variant_recalibrator Build a VQSR recalibration model using GATK VariantRecalibrator. vcf (Variants), ref (Reference), resources (list[KnownSites]), mode (str)
apply_vqsr Apply VQSR recalibration to a VCF using GATK ApplyVQSR. vcf (Variants), ref (Reference), vqsr_model (VQSRModel), truth_sensitivity_filter_level (float

Workflows

Name Description Parameters
prepare_reference Prepare reference genome for alignment and variant calling. build (str)
preprocess_sample Pre-process a single sample's reads for variant calling. build (str), sample_id (str), run_bqsr (bool)
germline_short_variant_discovery Germline short variant discovery from preprocessed BAMs. build (str), cohort_id (str)

Asset Types

Asset Key Class Module Fields
aligner_index AlignerIndex types/reference.py aligner, build, reference_cid
alignment Alignment types/alignment.py bqsr_applied, duplicates_marked, format, r1_cid, reference_cid, sample_id, sorted, tool
alignment_index AlignmentIndex types/alignment.py alignment_cid, sample_id
bqsr_report BQSRReport types/alignment.py alignment_cid, sample_id, tool
duplicate_metrics DuplicateMetrics types/alignment.py alignment_cid, sample_id, tool
known_sites KnownSites types/variants.py build, known, prior, resource_name, training, truth, vqsr_mode
known_sites_index KnownSitesIndex types/variants.py known_sites_cid
r1 R1 types/reads.py mate_cid, sample_id, sequencing_platform
r2 R2 types/reads.py mate_cid, sample_id, sequencing_platform
reference Reference types/reference.py build
reference_index ReferenceIndex types/reference.py build, reference_cid, tool
sequence_dict SequenceDict types/reference.py build, reference_cid, tool
variants Variants types/variants.py build, caller, sample_count, sample_id, source_samples, variant_type, vqsr_mode
variants_index VariantsIndex types/variants.py sample_id, variants_cid
vqsr_model VQSRModel types/variants.py build, mode, sample_id, tranches_path, variants_cid