Hi,
I am working on the VQSR step (using GATK 2.8.1) for variants called by UnifiedGenotyper (UG) from ~500 cattle whole genomes. I run VariantRecalibrator as follows:
${JAVA} ${GATK}/GenomeAnalysisTK.jar -T VariantRecalibrator \
-R ${REF} -input ${OUTPUT}/GATK-502-sorted.full.vcf.gz \
-resource:HD,known=false,training=true,truth=true,prior=15.0 HD_bosTau6.vcf \
-resource:JH_F1,known=false,training=true,truth=false,prior=10.0 F1_uni_idra_pp_trusted_only_LMQFS_bosTau6.vcf \
-resource:dbsnp,known=true,training=false,truth=false,prior=6.0 BosTau6_dbSNP138_NCBI.vcf \
-an QD -an MQRankSum -an ReadPosRankSum -an FS -an MQ -an DP -an HaplotypeScore \
-mode SNP \
-recalFile ${OUTPUT}/gatk_502_sorted_fixed.recal \
-tranchesFile ${OUTPUT}/gatk_502_sorted_fixed.tranches \
-rscriptFile ${OUTPUT}/gatk_502_sorted_fixed.plots.R
HD_bosTau6.vcf : ~770k markers on the Illumina bovine high-density chip array
F1_uni_idra_pp_trusted_only_LMQFS_bosTau6.vcf : ~5.4M SNPs
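In case it helps anyone reproduce this, the marker counts above can be double-checked by counting the non-header lines in each VCF. This is only a minimal sketch: `count_records` is a hypothetical helper name, and it assumes the files sit in the current directory and that any `.gz` file is gzip/bgzip compressed.

```shell
#!/bin/sh
# Count variant records (lines not starting with '#') in a VCF.
# count_records is a hypothetical helper, not part of GATK.
count_records() {
    case "$1" in
        # bgzip output is gzip-compatible, so gzip -dc can read it
        *.gz) gzip -dc "$1" | grep -vc '^#' ;;
        *)    grep -vc '^#' "$1" ;;
    esac
}

# File names taken from the command above; missing files are skipped.
for vcf in HD_bosTau6.vcf \
           F1_uni_idra_pp_trusted_only_LMQFS_bosTau6.vcf \
           BosTau6_dbSNP138_NCBI.vcf; do
    [ -f "$vcf" ] || continue
    printf '%s\t%s\n' "$vcf" "$(count_records "$vcf")"
done
```

The counts only confirm the size of each resource; they do not show how many of those sites actually overlap the callset being recalibrated.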
The tranches PDF I got looks really weird; please see the attached file.
Then I tried varying the 'prior' score of the training VCFs and also supplying an additional VCF file from another project as a training set. I still got a tranches graph similar to the one above, e.g.:
-resource:HD,known=false,training=true,truth=true,prior=15.0 HD_bosTau6.vcf
-resource:JH_F1,known=false,training=true,truth=false,prior=12.0 F1_uni_idra_pp_trusted_only_LMQFS_bosTau6.vcf
-resource:DN,known=false,training=true,truth=false,prior=12.0 HC-Plat-FB.3in3.vcf.gz
-resource:dbsnp,known=true,training=false,truth=false,prior=6.0 BosTau6_dbSNP138_NCBI.vcf
HC-Plat-FB.3in3.vcf.gz : ~ 14M markers
It is worth mentioning that I ran VariantRecalibrator with the same parameters and training sets on another 50 whole genomes very recently, and it worked fine. In fact, I had run VariantRecalibrator on the 500 animals before, when I accidentally used an unfiltered VCF called by UG as the training set. Surprisingly, I got a good tranches graph that time, similar to the graph shown in the GATK Best Practices. Do you have any suggestions for me?
Thanks,