Version 1.02
VarScan is a tool for detecting variants from alignments of next-gen sequencing data.
Perform all tasks (parse alignments, combine variants, get readcounts) in one step
ALIGNMENTS: File of read alignments in Blat (PSLX), Bowtie, cross_match, Novoalign, or Newbler format (required).
See "FRONT END ALIGNMENT" section below for recommended alignment parameters.
DETECTION OPTIONS:
--fasta-file File of read sequences in FASTA format (required for indel genotypes)
--quality-file File of read quality scores in FASTA format (required for base qualities)
--ref-dir Directory containing reference sequence FASTAs (required for indel genotypes)
--min-align-score Specifies a minimum BLAST-like alignment score (matches - mismatches - gaps) [25]
--min-identity Specifies a minimum sequence identity for alignments (matches / bases) [90]
--primer-trim Length of M13/MID primer tail at start of read; variants within it are ignored [0]
--default-qual-score Quality score to assign to bases when there is no quality information [15]
FILTERING OPTIONS:
--num-samples Number of samples in the pool to auto-set the following parameters [1]
--min-coverage Minimum total coverage to call a variant [1]
--min-reads2 Minimum variant-supporting reads [1]
--min-avg-qual Minimum variant base quality [0]
--min-var-freq Minimum variant allele frequency [0]
--min-strands2 Minimum variant strands observed [1]
OUTPUT OPTIONS:
--output-dir Output directory where results will be saved [./]
--sample Sample name to use as base for output files [sample]
Filtered SNP calls will be output to [output_dir]/[sample].snps.combined.readcounts.filtered
Filtered INDEL calls will be output to [output_dir]/[sample].indels.combined.readcounts.filtered
Intermediate output files will include:
[output_dir]/[sample].alignments Alignments for uniquely-placed reads meeting criteria
[output_dir]/[sample].snps Individual read-level SNP calls
[output_dir]/[sample].indels Individual read-level indel calls
[output_dir]/[sample].snps.combined.readcounts Unfiltered SNPs with read counts
[output_dir]/[sample].indels.combined.readcounts Unfiltered INDELs with read counts
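
As an illustration of the naming scheme above, the following Python sketch builds the expected output paths from the --output-dir and --sample values. It is not part of VarScan; the function name expected_outputs and the dictionary keys are hypothetical and chosen only for this example.

```python
import os


def expected_outputs(output_dir="./", sample="sample"):
    """Return the output paths listed in the usage text.

    Hypothetical helper: only the file suffixes come from the usage text;
    the function name and dictionary keys are illustrative.
    """
    base = os.path.join(output_dir, sample)
    return {
        # Intermediate files
        "alignments": base + ".alignments",
        "snps": base + ".snps",
        "indels": base + ".indels",
        "snps_readcounts": base + ".snps.combined.readcounts",
        "indels_readcounts": base + ".indels.combined.readcounts",
        # Final filtered calls
        "snps_filtered": base + ".snps.combined.readcounts.filtered",
        "indels_filtered": base + ".indels.combined.readcounts.filtered",
    }


if __name__ == "__main__":
    for label, path in expected_outputs("results", "pool1").items():
        print(label, path)
```

A helper like this can be used to check that every intermediate file exists before moving on to a later step.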
Parse the alignments file, score alignments, and detect sequence changes.
ALIGNMENTS: File of read alignments in Blat (PSLX), Bowtie, cross_match, Novoalign, or Newbler format (required).
OPTIONS:
--fasta-file File of read sequences in FASTA format (required for indel genotypes)
--quality-file File of read quality scores in FASTA format (required for base qualities)
--ref-dir Directory containing reference sequence FASTAs (required for indel genotypes)
--min-align-score Specifies a minimum BLAST-like alignment score (matches - mismatches - gaps) [25]
--min-identity Specifies a minimum sequence identity for alignments (matches / bases) [90]
--primer-trim Length of M13/MID primer tail at start of read; variants within it are ignored [0]
--default-qual-score Quality score to assign to bases when there is no quality information [15]
--min-qual-score Minimum base quality score for variants to be called [15]
--output-alignments Output file to contain qualifying single best alignment for each read
--output-snps Output file to contain SNPs
--output-indels Output file to contain indels
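
The two alignment thresholds above can be sketched in Python as follows. This is a minimal illustration, not VarScan's actual code. It assumes that identity is expressed as a percentage (suggested by the default of 90), that "bases" means the number of aligned read bases, and that both thresholds are inclusive; the function names are hypothetical.

```python
def alignment_score(matches, mismatches, gaps):
    """BLAST-like score as described for --min-align-score."""
    return matches - mismatches - gaps


def percent_identity(matches, bases):
    """Identity as matches / bases, scaled to a percentage (assumption)."""
    if bases <= 0:
        return 0.0
    return 100.0 * matches / bases


def passes_alignment_filters(matches, mismatches, gaps, bases,
                             min_align_score=25, min_identity=90):
    """Return True if an alignment meets both thresholds (inclusive, assumed)."""
    score_ok = alignment_score(matches, mismatches, gaps) >= min_align_score
    identity_ok = percent_identity(matches, bases) >= min_identity
    return score_ok and identity_ok


if __name__ == "__main__":
    # A 36-base read with 34 matches and 2 mismatches: score 32, identity ~94.4
    print(passes_alignment_filters(34, 2, 0, 36))
    # The same read with 6 mismatches: score 24, below the default of 25
    print(passes_alignment_filters(30, 6, 0, 36))
```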
Combine variants (SNPs or indels) detected across multiple reads
VARIANTS: File of variants from alignment parsing
OUTPUT: Output file for combined variants
Determine read counts supporting each allele
VARIANTS: File of *combined* variants
ALIGNMENTS: Original alignments file
OUTPUT: Output file for variant read counts
OPTIONS:
--fasta-file File of read sequences in FASTA format (required for indel genotypes)
--quality-file File of read quality scores in FASTA format (required for base qualities)
--ref-dir Directory containing reference sequence FASTAs (required for indel genotypes)
--default-qual-score Quality score to assign to bases when there is no quality information [15]
--min-qual-score Minimum base quality score for variants to be called [15]
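
How the two quality options might interact when counting reads per allele is sketched below in Python. This is an assumption-laden illustration rather than VarScan's implementation: it assumes that a base without quality information receives --default-qual-score, and that a base is counted only if its quality is at least --min-qual-score. The function name and the input layout are hypothetical.

```python
from collections import Counter


def count_alleles(observations, default_qual_score=15, min_qual_score=15):
    """Count reads supporting each allele at one position.

    observations: iterable of (base, quality) pairs, where quality is an
    int or None when no quality information is available (hypothetical
    layout chosen for this example).
    """
    counts = Counter()
    for base, qual in observations:
        if qual is None:
            # No quality information: fall back to the default score
            qual = default_qual_score
        if qual >= min_qual_score:
            counts[base] += 1
    return counts


if __name__ == "__main__":
    reads = [("A", 30), ("G", 10), ("G", None), ("A", None), ("G", 20)]
    print(dict(count_alleles(reads)))
```

With the defaults shown, a base lacking quality information always passes, because the default score equals the minimum score.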
Combine information from multiple read counts files
VARIANTS: Read counts files, separated by commas
OUTPUT: Output file for combined variant read counts
Filter variants based on coverage, read counts, allele frequency, quality, etc.
VARIANTS: File of variants from alignment parsing
OPTIONS:
--output-file File to contain variants passing filter
--min-coverage Minimum total coverage [1]
--min-reads2 Minimum variant-supporting reads [1]
--min-avg-qual Minimum variant base quality [0]
--min-var-freq Minimum variant allele frequency [0]
--min-strands2 Minimum variant strands observed [1]
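
The filter thresholds above combine naturally into a single pass/fail test, sketched here in Python. This is not VarScan's code. It assumes that variant allele frequency is computed as variant-supporting reads divided by total coverage and is compared as a fraction (the usage text does not state whether --min-var-freq is a fraction or a percentage), and that all thresholds are inclusive. The function and argument names are hypothetical.

```python
def passes_filter(coverage, reads2, avg_qual, strands2,
                  min_coverage=1, min_reads2=1, min_avg_qual=0,
                  min_var_freq=0.0, min_strands2=1):
    """Return True if a variant meets every threshold.

    coverage: total reads at the position
    reads2:   variant-supporting reads
    avg_qual: average base quality of the variant allele
    strands2: number of strands on which the variant was observed
    """
    # Assumed definition: frequency as a fraction of total coverage
    var_freq = reads2 / coverage if coverage > 0 else 0.0
    return (coverage >= min_coverage
            and reads2 >= min_reads2
            and avg_qual >= min_avg_qual
            and var_freq >= min_var_freq
            and strands2 >= min_strands2)


if __name__ == "__main__":
    strict = dict(min_coverage=10, min_reads2=2, min_avg_qual=15,
                  min_var_freq=0.2, min_strands2=2)
    print(passes_filter(20, 5, 25, 2, **strict))  # frequency 0.25
    print(passes_filter(20, 3, 25, 2, **strict))  # frequency 0.15
```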
Restrict SNP calls to a given set of chromosome positions
VARIANTS: File of variants from alignment parsing
POSITIONS: Tab-delimited file of chrom-positions at which to include variants
OUTPUT: Output file for variants at the provided positions
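
The position restriction can be pictured with the Python sketch below. It is an illustration only, not VarScan's implementation. It assumes that both the positions file and the variants file are tab-delimited with the chromosome in the first column and the position in the second; the layout of the variants file is not given in this text, so that part is an assumption. The function names are hypothetical.

```python
def load_positions(lines):
    """Read (chrom, position) pairs from tab-delimited lines."""
    positions = set()
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        # Skip headers or malformed lines without a numeric position
        if len(fields) >= 2 and fields[1].isdigit():
            positions.add((fields[0], int(fields[1])))
    return positions


def limit_variants(variant_lines, positions):
    """Keep only variant lines whose (chrom, position) is in positions.

    Assumes the first two tab-delimited columns are chrom and position.
    """
    kept = []
    for line in variant_lines:
        fields = line.rstrip("\n").split("\t")
        if (len(fields) >= 2 and fields[1].isdigit()
                and (fields[0], int(fields[1])) in positions):
            kept.append(line)
    return kept


if __name__ == "__main__":
    wanted = load_positions(["chr1\t100\n", "chr2\t250\n"])
    variants = ["chr1\t100\tA\tG\n", "chr1\t101\tC\tT\n", "chr2\t250\tG\tA\n"]
    for line in limit_variants(variants, wanted):
        print(line, end="")
```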
VarScan performance relies heavily on the accuracy of the read alignments.
To obtain alignments in a format compatible with VarScan, our recommendations are as follows:
BLAT: Run with the -out=pslx parameter. Give VarScan a single file with all PSLX alignments.
Newbler: Run with the -pairt parameter. Give VarScan the 454PairAlign.txt file.
Bowtie: Run with the -m 1 parameter. Give VarScan the Bowtie output file.
cross_match: For 454 data, run with these parameters: -minmatch 12 -minscore 25 -penalty -4 -discrep_lists -tags -gap_init -3 -gap_ext -1
For Illumina data, run with these parameters: -minmatch 12 -minscore 25 -minmargin 1 -discrep_lists -gap1_only -tags
Give VarScan the cross_match output file.
Novoalign: Run with these parameters: -a -t 120
Give VarScan the Novoalign output file.
Daniel C. Koboldt, <dkoboldt at genome.wustl.edu>
The Genome Center at Washington University School of Medicine
St. Louis, Missouri, USA
Copyright 2009 Daniel C. Koboldt and Washington University
All rights reserved.
This program is free for non-commercial use.