GATK4 HaplotypeCaller step, in gVCF mode, first step for subsequent whole cohort Joint Genotyping.

IARCbioinfo/gatk4-HaplotypeCaller-nf

gatk4-HaplotypeCaller-nf

GATK4 HaplotypeCaller step, in gVCF mode, first step for subsequent whole cohort Joint Genotyping, following the GATK Best Practices (step Call Variants Per-Sample).

Description

Small pipeline to call variants on recalibrated BAM files, on a per-sample basis, and store the resulting gVCF files. The pipeline uses a scatter-gather strategy: calling is split across intervals and the per-interval results are then merged. A subsequent pipeline performs the full cohort calling on all the gVCF files.
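As a rough illustration of what scatter-gather means here, the dry-run sketch below prints one HaplotypeCaller command per interval shard and a single merge command that joins the shard outputs. It is not taken from the pipeline's source: the shard file names, the `.g.vcf` output naming, and the choice of Picard `MergeVcfs` for the gather step are assumptions made for the example.

```shell
#!/usr/bin/env bash
# Dry-run sketch of scatter-gather: commands are printed, not executed.
# All file names below are hypothetical.

# Scatter: print one HaplotypeCaller command per interval shard.
scatter_cmds() {
  local bam="$1" ref="$2"; shift 2
  local sample shard
  sample="$(basename "$bam" .bam)"
  for shard in "$@"; do
    echo "gatk HaplotypeCaller -R $ref -I $bam -L $shard -ERC GVCF -O ${sample}.$(basename "$shard" .interval_list).g.vcf"
  done
}

# Gather: print one Picard MergeVcfs command that joins the shard gVCFs.
gather_cmd() {
  local sample="$1"; shift
  local inputs="" f
  for f in "$@"; do inputs="$inputs I=$f"; done
  echo "java -jar picard.jar MergeVcfs$inputs O=${sample}.g.vcf"
}

scatter_cmds test_001.bam ref.fasta shard_0.interval_list shard_1.interval_list
gather_cmd test_001 test_001.shard_0.g.vcf test_001.shard_1.g.vcf
```

Because each shard command is independent, a workflow engine such as nextflow can run them in parallel and only needs to wait for all of them before the merge.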

Dependencies

  1. This pipeline is based on nextflow. As we have several nextflow pipelines, we have centralized the common information in the IARC-nf repository. Please read it carefully as it contains essential information for the installation, basic usage and configuration of nextflow and our pipelines.
  2. GATK4 executables
  3. Picard Tools

Input

  • --input : your input BAM file(s) (do not forget the quotes for multiple BAM files e.g. --input "test_*.bam")
  • --output_dir : the folder that will contain your test_123.gVCF file or your test_001.gVCF, test_002.gVCF, ... files.
  • --ref_fasta : your reference in FASTA format. It must be the same as, or compatible with, the reference used to align your BAM file(s).
  • --gatk_exec : the full path to your GATK4 binary file.
  • --picard_dir : directory that contains picard.jar
  • --interval_list : a file for the intervals to call on. More information on interval_list format.
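For the --interval_list option, one format GATK accepts is a plain-text file with a `.list` extension and one interval per line, written either as a contig name or as `contig:start-stop`. The sketch below writes such a file and runs a simple sanity check on it. The coordinates are made up for illustration, and the check is deliberately naive: it would reject contig names that themselves contain a colon.

```shell
# Write a small GATK-style interval file (coordinates are hypothetical).
cat > target.list <<'EOF'
chr1:1000000-1100000
chr2:5000000-5050000
chrX
EOF

# Count lines that are neither "contig" nor "contig:start-stop".
# grep exits non-zero when the count is 0, hence the "|| true".
bad_lines=$(grep -Evc '^[^:[:space:]]+(:[0-9]+-[0-9]+)?$' target.list || true)
echo "malformed lines: $bad_lines"
```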

A nextflow.config file is also included; modify it to suit environments other than our pre-configured clusters (see Nextflow configuration).

Usage for Cobalt cluster

nextflow run iarcbioinfo/gatk4-HaplotypeCaller-nf -profile cobalt --input "/data/test_*.bam" --output_dir myGVCFs --ref_fasta /ref/Homo_sapiens_assembly38.fasta --gatk_exec /bin/gatk-4.0.4.0/gatk --interval_list target.list
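Before launching the run, it can save time to confirm that the reference is usable by GATK, which expects a FASTA index (`.fai`) and a sequence dictionary (`.dict`) next to the FASTA file. The helper below is a minimal sketch of such a check. `check_ref` is a hypothetical name and is not part of the pipeline.

```shell
# Report which companion files GATK expects next to a reference FASTA are missing.
# Returns 0 only when the FASTA, its .fai index and its .dict dictionary all exist.
check_ref() {
  local fasta="$1" missing=0
  local dict="${fasta%.*}.dict"   # e.g. ref.fasta -> ref.dict
  local fai="${fasta}.fai"        # e.g. ref.fasta -> ref.fasta.fai
  local f
  for f in "$fasta" "$fai" "$dict"; do
    if [ ! -e "$f" ]; then
      echo "missing: $f" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Example call with the reference path from the usage command.
check_ref /ref/Homo_sapiens_assembly38.fasta || echo "reference is not ready for GATK"
```

If either companion file is missing, it can usually be created with `samtools faidx` (index) and GATK's `CreateSequenceDictionary` tool (dictionary).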
