Channel: vqsr — GATK-Forum
Browsing latest articles
Browse All 326 View Live

VQSR for single sample exome/targeted regions

Hi, I read on the best practices slides that I should not use VQSR if the cohort is small. I have only one sample from a single individual. I was wondering how useful it is to perform VQSR on this sample by...


VQSR error - no MQ annotation detected

Hi, I am trying to run VQSR and an error occurred. Here are my commands java -Xmx240g -jar GenomeAnalysisTK.jar \ -T VariantRecalibrator \ -R /ref/ucsc.hg19.fasta \ -input input_raw.vcf \ -recalFile...


No false positives in VQSR tranche plot

I'm doing a large variant calling project on a cohort of ~10,000 exomes. I've run into an issue with VQSR. Everything appears to be working normally except for my output tranche plot (attached), where...

How to interpret Gaussian mixture model plots

Hi, I ran VQSR with 150+ whole genome samples. Attached is one of the Gaussian mixture model plots. I have read the VQSR guide on how to interpret the plots, but I do not quite understand them. Referring to...

VQSR resource format: which elements of the VCF are needed?

Dear GATK team, I am working with maize aDNA and would like to find SNPs called in aDNA samples that are at least as good as those in HapMap. Am I right to assume that variant recalibration is the...


Having a problem with the VQSR step

Hi there. I'm Lynn, and I'm doing whole genome sequencing on human samples. I'm currently using GATK for variant calling and am now having problems with the VQSR step. Here is the command I entered:...

VariantRecalibrator SNP and INDEL failure rate

Hi, I've been struggling with some issues we have been having with VariantRecalibrator. Here's the story. We run VariantRecalibrator on our new sample in combination with the previous ones (gVCF...

VQSR failed with "No data found" with whole genome variant calls, but...

Hi, I ran VQSR on the SNP calls from a WGS sample mapped to the b37+decoy reference, and it failed at the training step with the error message "No data found". I then removed the SNP calls on the small contigs...


VQSR with missing annotation fields

Hi, I am calling variants (non-model organism) following the best practice workflow. After HaplotypeCaller (with GVCF) and GenotypeGVCFs, I want to perform VQSR (separately for SNPs and INDELs) to the...


Question related to VQSR

Hi everyone, this might be very basic, but I just want to clarify what I understand after applying the VQSR steps to WGS data. For SNPs I've set tranches as described in the best...

Do strand flips in the dbSNP or training file cause problems for BQSR, Mutect...

Our group is working on putting together a file of known germline variants in a non-model organism. While we have a large set of known variants, my colleague has noted that some number of these are...

CombineGVCFs subsampling questions

Hi GATK! I want to merge ~3000 HC outputs into one large cohort. However, even if I run it directly by scattering over 30M genome chunks, it would still take a long time to compute. So I think I should first...

Warning message from VariantDataManager

Hello, I notice the following warning messages during the first step of VQSR: Done. There were 3 WARN...


How VQSR deals with multiallelic SNPs and indels

Hi, may I know how VQSR deals with multiallelic SNPs and indels? How does it classify them as pass or fail?

VQSR - ./. genotypes retained after VQSR filtering

We have generated a set of variant calls based on the GRCh38 pipeline described in GATK using GATK version 3.3. We observed that many calls made on the ALT contigs had the genotype call "./.". On...


Variant recalibration (VQSR) after pooled calling

Hi - we have whole genome samples consisting of non-barcoded DNA from 50 diploid individuals with 80-120x coverage. We use version 3.5-0-g36282e. We align, then use BQSR. For variant calling, we...

VQSR --maxGaussians parameter

Hi, I am performing VQSR (GATK 3.7) using the SNP model on individual chromosomes across hundreds of WGS samples. However, for some chromosomes, it ran without a problem using the --maxGaussians default,...


VQSR Resources for Indels

Hi, I noticed that on the page for setting the right arguments for VQSR you mentioned that Mills and dbSNP should be used as resources for INDEL variant recalibration. However, at the bottom of the...

I have 50 exome samples belonging to 25 families. Do I run GenotypeGVCFs on...

We have exome sequencing data for 50 samples in total for a cardiac disease, but they were sequenced in different batches. Some of the batches are even 2 years old. We have relationship...

Low coverage loci - GATK pipeline

Hi GATK team, I am posting this question for everyone's benefit as it will shed more light on how HaplotypeCaller and other GATK programs deal with low coverage positions. For the sake of this example,...

Suggestions for WGS 5X Sequences

Hi Geraldine or Sheila, I am in the process of customizing a GATK pipeline for processing aDNA. I have processed a couple of 3000-year-old WGS sequences so far using GATK best practices, and although...


What is the purpose of multiple True Sites in VQSR

I have 3 questions: 1) What is the exact purpose of having both HapMap and Omni True Sites in VQSR, versus just one? 2) If I want to restrict the variant calling to my custom list of positions, which of...


My VQSR tranches plot shows cumulative variants in tranches 0-90, 90-99, 99-99.9

Dear GATK-Team, My VQSR tranches plot (exome data) shows cumulative variants in tranches 0-90, 90-99, 99-99.9. To my understanding it should be the other way round (like in your article link). My tranche...

VQSR error

I used to run VQSR using the following command. It worked very well for approximately 400 samples. But now, for the first time, I am getting an error when running VQSR after adding a few more samples to the old ones....

VariantRecalibrator error

Dear GATK Team, I'm trying to perform variant recalibration on data from 3 WGS samples. I am following the pipeline described in Best Practices. I have generated individual g.vcf files for each patient...


VQSR on specific genomic region

Dear GATK Team, I have exome-data of many individuals (>2000) called with the HaplotypeCaller, but only of a specific set of genes from the genome. I would like to apply the VQSR-tool to recalibrate...

Can't use VQSR on non-model organism or small dataset

The problem: Our preferred method for filtering variants after the calling step is to use VQSR, a.k.a. recalibration. However, it requires well-curated training/truth resources, which are typically not...

VQSR and VariantAnnotator on Samtools VCFs

Hi everyone! My goal is to run VQSR on VCFs generated with samtools mpileup. According to GATK best practices, I first have to run VariantAnnotator on each of my VCFs in order to do that. Here's the...

WGS+WES combined discovery/genotyping

Hi GATK team, Hope you had great holidays! We're analyzing small families where some individuals have been sequenced by WGS (HiSeqX) and others by WES (HiSeq4000). Could you please advise on the best...



VariantAnnotator is not annotating variants with InbreedingCoeff

Hi, I am using GATK VariantAnnotator to annotate my VCF with the InbreedingCoeff but when I check the output VCF I see that no variant was annotated with the InbreedingCoeff. I've used a pedigree file...

VQSR / CNN filtering for small (~100) gene panels

Hello, I'm trying to perform germline variant calling on a panel with ~100 genes. I was wondering what the bare minimum (in terms of sample size) would be for variant filtering via VQSR. If the sample...

Too many (?) variants detected by joint genotyping of 8232 exomes

Hello, I am about to finish analyzing 8232 exome samples. I have used GATK 3.8 and 3.6 throughout my workflow, and followed the best practices guidelines. After performing variant calling by running...

GenotypeGVCFs call confidence and emit thresholds

We have ~10,000 WES samples and generated gVCFs for each using HC. To generate a multi-sample consensus VCF, we performed joint genotyping using GenotypeGVCFs. Subsequently we performed VQSR analysis...


error with VariantRecalibrator java.lang.IllegalArgumentException: No data...

Dear GATK Team, I have one whole genome dataset called with HaplotypeCaller. I would like to apply VariantRecalibrator to recalibrate my variant set, but I get back an error as follows: INFO...

Variant Quality Score Recalibration (VQSR)

VQSR stands for Variant Quality Score Recalibration. In a nutshell, it is a sophisticated filtering technique applied to the variant callset that uses machine learning to model the technical profile of...

Do you need to do Variant Quality Score Recalibration when calling somatic...

Hi, I am currently working to call somatic variants from tumour samples with matched normal pairs from the same patient. I have carried out all of the steps in this tutorial:...


VQSR --recal-file not recognized

In GATK 4.0.6.0, is the --recal-file option not required, or has it been changed? ApplyVQSR uses this recal file, but will that throw an error? Thank you very much. gatk VariantRecalibrator \ -R...


Applying VQSR to the Raw VCF vs Filtered VCF

Hi, I am working on a germline WES dataset with ~450 samples. All the variants were called following an adapted version of GATK Best Practices, using GATK 4.0.3. My question is about at which step we...

Variant recalibration tranche plot gives a very high number of novel variants...

Hi, My question concerns the VQSR step. I am using GATK version 3.7 on 350 human WES samples. After calling with HaplotypeCaller, I used the VariantRecalibrator function with the following...

Error occurs in VariantRecalibrator: Malformed floating point valueprior

The problem is about VQSR. I have no idea how to fix this error: A USER ERROR has occurred: Unknown file is malformed: Malformed floating point valueprior The input VCF file is generated by...

Recommendations for calling on and filtering ~100 low coverage samples

I have about 80-100 population-specific WGS samples with coverage around 5-10X. What modifications would you recommend in GATK best practices to suit variant calling (and VQSR)? Also, what are the...


Seeking help debugging a VQSR tranche plot

Dear GATK Team, I have received data from ~ 35k whole exome sequences from a collaborator and while proceeding with variant filtering I noticed that the VQSR tranche plot looked abnormal. Contrary to...

Split multiallelic variants before VQSR and CNNScoreVariants: GATK team opinion

Hi, if I remember correctly, I saw a user on the forum suggest splitting multiallelic variants before VQSR. I think that is logical (I have never done it before), but I would like to know the opinion...


ExcessHet filtering in cohorts with family members

This post mentions that the first step in Best Practice VQSR filtering involves hard filtering on ExcessHet. The post also states that "ExcessHet filtering applies only to callsets with a large number...

What is the difference between --truth-sensitivity-tranche and...

I'm using GATK v4.0.3.0. I want to use the recommended ApplyVQSR --ts-filter-level values, as specified at the end of GATK's document #1259 (albeit this document was written for GATK3, but I...


A way to come up with "truth set" to use VQSR

Dear GATK Team, I have a question regarding finding cutoffs for hard filtering. I am working with yeast for which we do not have a good true variation set. I am following the best practices and have...

gnomAD in VQSR

Hi, I am not sure why no one has asked this before, but I need help please, as I couldn't find sufficient info: Is it recommended to use the gnomAD variant database as the known, training and truth set with...

VQSR filtering and dbSNP

Hi, For VQSR filtering, it's assumed that all calls made by HaplotypeCaller or MuTect2 are put into VQSR? Polymorphic sites should only be culled AFTER the vcf is annotated with VQSLOD scores, and a...

Weird VQSR filtering pattern

Hi, I performed HC joint calling on 300 normal tissue samples. I did every step according to the best practices and got a VQSR plot very similar to this one. Most variants have the same MQ and FS...


Are there issues with using reads coming from different technologies and...

Hello! We are analyzing WGS data from 60 samples (6 groups, 10 samples/group) produced by HiSeq4000. The mean coverage per sample is 25x (lowest sample is 15x). Now we realize we need to sequence more...
