Question

CNV Analysis on Genes

0


10.8 years ago

Adrian Pelin ★ 2.6k

Hello,

Is there a CNV caller that works on non-model organisms and takes a BAM alignment, a FASTA reference and a GFF annotation of all genes as input?

Basically, I am trying to find out which genes are present in high copy numbers.

I am creating the BAM file by aligning Illumina paired-end reads to the reference with bwa.

Thank you, Adrian

cnv gff fastq • 3.3k views

ADD COMMENT • link updated 10.8 years ago by Charles Warden 8.3k • written 10.8 years ago by Adrian Pelin ★ 2.6k

score 1 · Answer 1 · 2014-01-28

1


10.8 years ago

Charles Warden 8.3k

Depends on your application, but the input files should typically be the same (for common and non-model organisms).

I happen to like CoNIFER if you are doing exon-capture analysis:

http://conifer.sourceforge.net/

You can also use DNAcopy to call segments based upon the normalized values from CoNIFER (in theory, DNAcopy can work with any table of normalized abundances across windows in the genome):

http://www.bioconductor.org/packages/2.13/bioc/html/DNAcopy.html

ADD COMMENT • link 10.8 years ago by Charles Warden 8.3k

0


Also, VarScan can technically work on any .pileup file produced from a .bam file, but I wouldn't trust the copy number calls unless you had paired samples (and used the "copynumber" function instead of "mpileup2cns"). And you'll need to use another tool to get gene annotations.

ADD REPLY • link 10.8 years ago by Charles Warden 8.3k

0


My organism is intronless, so CoNIFER seems inappropriate. VarScan doesn't care about gene annotations, and reports variation. I just need to know the copy number in the genome of each gene I feed the program from my annotation file.
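To make the goal concrete, here is a minimal Python sketch of one naive way to turn per-gene read depth into a relative copy-number estimate. It is not what CoNIFER, VarScan or any other tool mentioned in this thread does. It assumes you already have a mean depth per gene (from whatever coverage tool you prefer) and that most genes are single copy, so the median depth across genes is a fair baseline. The function name, the `ploidy` parameter and the example depths are all hypothetical.

```python
from statistics import median


def relative_copy_number(gene_depths, ploidy=1):
    """Scale each gene's mean depth by the median depth across all genes.

    Assumption: most genes are single copy per haploid genome, so the
    median depth approximates the depth of a gene present `ploidy` times.
    """
    baseline = median(gene_depths.values())
    if baseline == 0:
        raise ValueError("median depth is zero; cannot normalise")
    return {gene: ploidy * depth / baseline
            for gene, depth in gene_depths.items()}


# Hypothetical mean depths per gene, for illustration only.
example_depths = {
    "geneA": 48.0,
    "geneB": 52.0,
    "geneC": 50.0,
    "geneD": 151.0,
    "geneE": 49.0,
}

estimates = relative_copy_number(example_depths)
for gene, copies in sorted(estimates.items()):
    print(f"{gene}\t{copies:.2f}")
```

This ignores GC bias, mappability and the fact that reads from near-identical gene copies may be placed arbitrarily by the aligner, so treat the output as a rough screen rather than a call.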

ADD REPLY • link 10.8 years ago by Adrian Pelin ★ 2.6k

0


Although CoNIFER is designed to be given a list of targeted regions, you can define your "targeted" regions as your genes. CoNIFER doesn't actually know or care what the regions represent.

That said, I can't absolutely say this is the best strategy - it is just what I can think of.
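As a rough illustration of defining genes as the "targeted" regions, the Python sketch below pulls `gene` features out of GFF3 lines and converts them to BED-style intervals (0-based start, end-exclusive). The function name and the example records are hypothetical, and the exact target-file layout a given tool expects should be checked against that tool's documentation.

```python
def gff_genes_to_bed(gff_lines, feature_type="gene"):
    """Return (seqid, start0, end, name) tuples for matching GFF3 features.

    GFF3 coordinates are 1-based and inclusive; BED is 0-based and
    end-exclusive, so only the start is shifted by one.
    """
    regions = []
    for line in gff_lines:
        # Skip blank lines, comments and directives such as ##gff-version.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != feature_type:
            continue
        seqid, start, end = fields[0], int(fields[3]), int(fields[4])
        # Use the ID attribute as the region name when it is present.
        name = "."
        for pair in fields[8].split(";"):
            key, _, value = pair.strip().partition("=")
            if key == "ID":
                name = value
                break
        regions.append((seqid, start - 1, end, name))
    return regions


# Hypothetical GFF3 records, for illustration only.
example_gff = [
    "##gff-version 3",
    "contig1\tsrc\tgene\t101\t400\t.\t+\t.\tID=gene0001;Name=abc",
    "contig1\tsrc\tCDS\t101\t400\t.\t+\t0\tID=cds0001;Parent=gene0001",
    "contig2\tsrc\tgene\t1\t90\t.\t-\t.\tID=gene0002",
]

targets = gff_genes_to_bed(example_gff)
for seqid, start0, end, name in targets:
    print(f"{seqid}\t{start0}\t{end}\t{name}")
```

For a real annotation, pass an open file handle instead of the example list, since the function only needs an iterable of lines.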

ADD REPLY • link 10.8 years ago by Charles Warden 8.3k

0


I use the cn.mops package (http://www.bioconductor.org/packages/release/bioc/html/cn.mops.html) for exactly the same purpose. We have a reference, a number of BAM files and also a GFF file. If you are interested, I could provide an example of the R code for the analysis.