Hello,
I have a problem using ExomeDepth with a reference FASTA file. I analysed approximately 500 target genes (>7000 exons, hg19), and without a FASTA file and the GC content everything worked fine.
Since GC content influences amplification, coverage and CNV detection, I want to include this factor. I therefore tried several ways to get appropriate FASTA sequences.
1. BioMart
mart <- useMart("ensembl", dataset = "hsapiens_gene_ensembl")
FASTA <- getSequence(chromosom, start, stop, type = "hgnc_symbol", seqType = "gene_exon", mart = mart)
Error:
`Reference fasta file provided so exomeDepth will compute the GC content in each window
Error in (function (classes, fdef, mtable) :
unable to find an inherited method for function ‘scanFa’ for signature ‘"data.frame", "GRanges"’`
2. BSgenome
FASTA<-getSeq(BSgenome.Hsapiens.UCSC.hg19,chromosom,start,stop)
show(FASTA)
  A DNAStringSet instance of length 10
     width seq
 [1]   845 GTTGAAAAGTGATCAGGTTCATTTTATTGACTACACAGAAGCAATTCCATTT...GAGGAGGCAGATCACGGCGAAGACAATGAAGCTGTACGGGCCGAGGCCCTC
 [2]   129 CCTGGATGAACGGGAAGATCAAGCCCACGGTGAAGTTGGAGAGCCAGTGCAC...GCCGAGAGGACTGCAGGAAGATCTCAGTGATGAGCAGCGCGGGTATGGGAC
 [3]    77 CTGGGCCCGAGGGCATGTCCTATGACGTAGGAGATGACACAGACGATGCTGATGTATGGCATCCAGGACACTGTGTC
 [4]   103 CCTGCAGTGCCAGAGCTGCAGTGAGCACGCAGCAGGCTATGAGGCAGATGGAGAAGCCCAGCAGCAGCAGCAGCCTCCGACCCAGGAGCTCCACCACGAACAC
 [5]   112 CGGCGCAGAAGGTCATGACCACGTTCACGGCCCCGGTGCCGGCCGTCACGTA...CTCCTCCGGCACGCCGGCGCTCAGGTAGATCTGGTCCGCGTAGTAGTAGAT
Error:
Reference fasta file provided so exomeDepth will compute the GC content in each window
Error in (function (classes, fdef, mtable) : unable to find an inherited method for function ‘scanFa’ for signature ‘"DNAStringSet", "GRanges"’
3. Downloading human_g1k_v37.fasta.gz
Chromsome, Start and Stop contain only the 7000 exon locations. With methods 1 and 2 I was able to get the target sequences, but the counting step (getBamCounts) did not work: different errors occur, indicating that something is wrong with the FASTA input. I think there are some file formatting issues.
my.counts <- getBamCounts(data.frame(Chromsome, Start, Stop),
                          bam.files = bam.files,
                          include.chr = FALSE,
                          referenceFasta = FASTA)
I haven't tried human_g1k_v37.fasta.gz so far, because I don't know how to load it into R.
Do you have an idea how to transform the files so that they work, or how to load the whole-genome FASTA file?
It is always a good idea to include the output of sessionInfo() as well as any error messages you get when posting an R question.
I edited the post and attached the errors.
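Not a definitive answer, but a minimal sketch of what the error messages seem to point at: the value given as referenceFasta ends up in scanFa, which has no method for a data.frame or a DNAStringSet but does accept the path to an indexed FASTA file on disk. The toy FASTA content, the temporary file and the two example regions below are made up for illustration, and I am assuming Rsamtools, GenomicRanges and Biostrings are installed.

```r
library(Rsamtools)      # indexFa(), scanFa()
library(GenomicRanges)  # GRanges(); also attaches IRanges
library(Biostrings)     # letterFrequency()

# Toy reference standing in for an uncompressed genome FASTA (made-up content).
fasta.path <- tempfile(fileext = ".fasta")
writeLines(c(">1", "GGCCAATTGGCCAATT",
             ">2", "ATATATATGCGCGCGC"), fasta.path)

# scanFa() needs a .fai index next to the FASTA file.
indexFa(fasta.path)

# Two toy "exons"; the sequence names must match the FASTA headers
# (here "1" and "2", i.e. no "chr" prefix).
targets <- GRanges(seqnames = c("1", "2"),
                   ranges = IRanges(start = c(1, 9), end = c(8, 16)))

# A file path is accepted here, unlike a data.frame or a DNAStringSet.
seqs <- scanFa(fasta.path, targets)

# GC fraction of each target region.
gc <- as.vector(letterFrequency(seqs, letters = "GC", as.prob = TRUE))
print(gc)
```

If this is indeed the cause, the same idea should carry over to the real data: decompress human_g1k_v37.fasta.gz first (this sketch assumes an uncompressed file), index it once with indexFa, and pass the file path itself as referenceFasta instead of sequences fetched with getSequence or getSeq. Since that reference names its chromosomes without a "chr" prefix, the exon coordinates and include.chr = FALSE would need to follow the same naming.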