Errors When Using The Easyrnaseq Package From Bioconductor
1
2
Entering edit mode
12.0 years ago
e.karasmani ▴ 140

Dear All,

I am using easyRNAseq to analyze some RNA seq data.....

and this is what i am doing.....

 library(easyRNASeq)
 library(BSgenome.Mmusculus.UCSC.mm9)
chr.sizes=seqlengths(Mmusculus)
bamfiles=dir(getwd(),pattern="*\\.bam$")

rnaSeq <- easyRNASeq(filesDirectory=getwd(),
,organism="Mmusculus"
,chr.sizes="auto"
,readLength=36L
,annotationMethod="biomaRt"
,count="exons"
,filenames=bamfiles[1]
,outputFormat="RNAseq"
)

and that is what i get.....

Checking arguments...
Fetching annotations...
Summarizing counts...
Processing s_7_1_I12-1147-01.rnaseq.mm9.srt.bam
 Warning messages:
1: In easyRNASeq(filesDirectory = getwd(), , organism = "Mmusculus",  :
  There are 366088 features/exons defined in your annotation that overlap! This implies that some reads will be counted more     than once! Is that really what you want?
2: In fetchCoverage(rnaSeq, format = format, filename = filename, filter = filter,  :
  You enforce UCSC chromosome conventions, however the provided alignments are not compliant. Correcting it.
3: In fetchCoverage(rnaSeq, format = format, filename = filename, filter = filter,  :
  Not all the chromosome names in your chromosome size list 'chr.sizes' are present in your read file(s) (aln or bam).
4: In fetchCoverage(rnaSeq, format = format, filename = filename, filter = filter,  :
 The available chromosomes in both your read file(s) (aln or bam) and 'chr.sizes' list were restricted to their common term.
These are: chr1, chr10, chr11, chr12, chr13, chr14, chr15, chr16, chr17, chr18, chr19, chr2, chr3, chr4, chr5, chr6, chr7, chr8,     chr9, chrM, chrX, chrY.
Error in unlist(aggregate(readCoverage(obj)[match(names(genomicAnnotation(obj)),  :
  error in evaluating the argument 'x' in selecting a method for function 'unlist': Error in .Call2("Rle_getStartEndRunAndOffset", x,      start, end, PACKAGE = "IRanges") :
    ' x' values larger than vector length 'sum(width)'

What I am doing wrong????

How can I use the program to analyze my RNA seq data

Could you please help me?

Thank you in advance

Best regards Lena

rna bioconductor • 3.2k views
ADD COMMENT
0
Entering edit mode

Please update your question with the output of sessionInfo().

Also, you may have better luck sending this question to the bioconductor mailing list, since I know the author of this package watches that list closely, and I'm not sure if he's on biostars.

ADD REPLY
0
Entering edit mode
12.0 years ago

This post should answer it easyRNASeq package: error when summarizing per gene. According to the developer (I am copying the answer for future reference for someone)

The error is probably indeed due to your annotation file in combination with you chromosome size list. Every gene loci should be fully contained in the chromosomes, i.e. their coordinates should not be larger than the chromosome size you provide for the chromosome they're located on. I believe I'm checking this in the code, but you might have just found a configuration of providing the annotation and chromosome sizes arguments that I overlooked. The first use case in the "Use Cases" section of the vignette explains why the chromosome sizes are so important to the process. The development version of easyRNASeq (available around October 2nd (PDT time)), makes this a lot easier.

You can easily check if this is the case by running the following command:

lapply(names(seqlengths(Mmusculus)),function(chr,rngs,sizes){
    cat("processing chromosome",chr,"\n")
sel <- space(exon_range) == chr
stopifnot(all(start(exon_range[sel,]) <= sizes[chr]))
stopifnot(all(end(exon_range[sel,]) <= sizes[chr]))
},exon_range,seqlengths(Mmusculus))

It should stop if any of your gene coordinate is larger that the chromosome size it's on. Note that I've just written this bit of code up in this email, so there might be typos. If there are and you can't figure them out, just let me know. If there are no typos and it stops, then you would have to modify either your exon_range object or more easily your chromosome sizes one. For this you could simply use the scanBamHeader (Rsamtools package) function to extract the chromosome sizes from your bam file as follow (change the first list to match your settings):

filesList <- dir("[your bam directory]",pattern="*\\.bam$",full.name=TRUE)

## read the headers
            headers <- scanBamHeader(filesList)

            ## Two sanity checks
            if(!all(sapply(headers,
                           function(header,expected){
                             identical(sort(names(header$targets)),expected)
                           },sort(names(headers[[1]]$targets))))){
              stop("Not all BAM files use the same chromosome names.")
            }

            chr.sizes <- headers[[1]]$targets
            if(!all(sapply(headers, function(header,chr.sizes){
              identical(header$targets[match(names(chr.sizes),
                                             names(header$targets))],chr.sizes)
            },chr.sizes))){
              stop("The chromosome lengths differ between BAM files.")
            }

            ## check if we got some chr sizes at all
            if(length(chr.sizes)==0){
              stop("No chromosome sizes could be determined from your BAM file(s).Is the BAM header present?\nIf not, you can use the 'chr.sizes' argument to provided the chromosome sizes information.")
            }

That's actually an excerpt of the easyRNASeq development version code to extract the chromosome size from the BAM files. I think that this is the reason for the error. If not, we can take this discussion off the list to figure it out, provided you can give me access to your annotation object and to an excerpt of your data (that you could upload to my dropbox). This error should actually not occur in the development version anymore, but it would be great to give it a try after October 3rd again. Let me know how it goes, Cheers


Nicolas Delhomme

Genome Biology Computational Support

European Molecular Biology Laboratory

Tel: +49 6221 387 8310

Email: nicolas.delhomme at embl.de

So, either try

1) using the development version

or 2) contact him

ADD COMMENT

Login before adding your answer.

Traffic: 1648 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6