Hello,
Is there a CNV caller that works on non-model organisms and takes a BAM alignment, a FASTA reference, and a GFF annotation of all genes as input?
Basically, I am trying to find out which genes are present in high copy number.
I am creating the BAM file by aligning Illumina paired-end reads to the reference with bwa.
Thank you, Adrian
Also, VarScan can technically work on any .pileup file produced from a .bam file, but I wouldn't trust the copy number calls unless you had paired samples (and used the "copynumber" function instead of "mpileup2cns"). You'll also need another tool to get gene annotations.
My organism is intronless, so CoNIFER seems inappropriate. VarScan doesn't care about gene annotations and just reports variation. I just need to know the genomic copy number of each gene that I feed the program from my annotation file.
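For a single sample, one rough first pass at "copy number per gene" is the mean read depth of each gene divided by the typical depth across all genes. The sketch below is only that, not a real CNV caller. It assumes you already have a BED file of gene intervals with the gene name in column 4, and the output of something like `samtools bedcov genes.bed aln.bam` (file names are placeholders), where the last column is the sum of per-base depths over the interval. The function name `relative_copy_number` is made up for illustration.

```python
from statistics import median


def relative_copy_number(bedcov_lines):
    """Map gene name -> mean depth divided by the median per-gene mean depth.

    Assumes tab-separated lines: chrom, start, end, name, ..., depth_sum
    (BED coordinates, so interval length is end - start).
    """
    mean_depth = {}
    for line in bedcov_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 5:
            continue  # skip blank or malformed lines
        start, end, name = int(fields[1]), int(fields[2]), fields[3]
        depth_sum = float(fields[-1])  # assumed: sum of per-base depths
        length = end - start
        if length > 0:
            mean_depth[name] = depth_sum / length

    if not mean_depth:
        return {}

    # Use the median gene as the "single copy" baseline; this is an assumption
    # that breaks down if most genes are themselves amplified.
    baseline = median(mean_depth.values())
    if baseline == 0:
        raise ValueError("median depth is zero; cannot normalize")
    return {name: depth / baseline for name, depth in mean_depth.items()}


if __name__ == "__main__":
    import sys

    for gene, ratio in sorted(relative_copy_number(sys.stdin).items()):
        print(gene, f"{ratio:.2f}", sep="\t")
```

A ratio near 1 means "about as deep as the typical gene", and values well above 1 are candidates for high copy number. This ignores GC bias and mappability, so treat the numbers as a ranking rather than as integer copy counts.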
Although CoNIFER is designed to take a list of targeted regions, you can define your "targeted" regions as your genes. CoNIFER doesn't actually know or care what the regions represent.
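Turning the genes in a GFF file into a plain list of regions could look something like the sketch below. It assumes a GFF3-style file where gene records have `gene` in the third column and an `ID=` attribute; the function name `gff_genes_to_bed` is made up for illustration, and you should check what region format your tool actually expects.

```python
def gff_genes_to_bed(gff_lines, feature_type="gene"):
    """Yield (chrom, start0, end, name) for each matching GFF feature."""
    for line in gff_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines, comments and directives
        fields = line.split("\t")
        if len(fields) < 9 or fields[2] != feature_type:
            continue
        chrom, start, end, attrs = fields[0], int(fields[3]), int(fields[4]), fields[8]

        # Use the ID attribute as the region name; fall back to coordinates.
        name = f"{chrom}:{start}-{end}"
        for pair in attrs.split(";"):
            key, _, value = pair.strip().partition("=")
            if key == "ID" and value:
                name = value
                break

        # GFF is 1-based inclusive; BED is 0-based half-open.
        yield chrom, start - 1, end, name


if __name__ == "__main__":
    import sys

    for chrom, start0, end, name in gff_genes_to_bed(sys.stdin):
        print(chrom, start0, end, name, sep="\t")
```

Saved as, say, `genes_to_bed.py` (a placeholder name), it would be run as `python genes_to_bed.py < annotation.gff > genes.bed`.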
That said, I can't absolutely say this is the best strategy - it is just what I can think of.
I use the cn.mops package (http://www.bioconductor.org/packages/release/bioc/html/cn.mops.html) for exactly the same purpose. We have a reference, a number of BAM files, and a GFF file. If you are interested, I could provide an example of the R code for the analysis.
Can you share your R code? Thanks!