Error Import File With "Import" Of Rtracklayer
12.8 years ago
ska007 • 0

Hi, this is related to the question found here (http://biostar.stackexchange.com/questions/3414?sort=newest#sort-top). I am trying to import a GFF file that contains a set of chromosomal positions for all chromosomes in mouse. When I use the rtracklayer import() command, names(genes) returns the chromosomes in this order:

library(rtracklayer)
genes = import("file.gff")
names(genes)
"chr1" "chr10" "chr11" "chr12" "chr13" "chr14" "chr15" "chr16" "chr17" "chr18" "chr19" "chr2" "chr3" "chr4" "chr5" "chr6" "chr7" "chr8" "chr9" "chrX" "chrY"

How do I make sure that the import doesn't change the chromosome order? I want the data imported into the dataRange genes in this order:

"chr1" "chr2" "chr3" "chr4" "chr5" "chr6" "chr7" "chr8" "chr9" "chr10" "chr11" "chr12" "chr13" "chr14" "chr15" "chr16" "chr17" "chr18" "chr19" "chrX" "chrY"

Thanks in advance.

chip-seq next-gen sequencing r • 3.1k views

The names are sorted in lexical order. However, the order of the chromosome names is irrelevant for downstream analysis, e.g. overlap computation in GenomicRanges/IRanges. Conversely, if the order in which the names appear makes any difference to your results, there is possibly something wrong with your analysis.

12.8 years ago
Neilfws 49k

I don't think the order in which the data are imported has any bearing on your problem.

First, I assume that rtracklayer imports entries in the order in which they appear in the GFF file.

Second, I think you are confused by the output of names(genes). In fact, the chromosomes are sorted in that output. However, the sorting is by the ASCII value of the characters in the strings, not by chromosome number - hence "chr10" comes after "chr1" but before "chr11". This is a quite normal way to sort strings and tells you nothing about how the data are stored.
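To illustrate the sorting point, here is a minimal base-R sketch. It is not rtracklayer code: the vector `chroms` and the helper `natural_chrom_order` are hypothetical names made up for this example, and the helper assumes every name has a "chr" prefix followed by either a number or a letter such as X or Y.

```r
# Hypothetical example data: a few mouse chromosome names in arbitrary order.
chroms <- c("chr2", "chrX", "chr10", "chr1", "chrY", "chr11", "chr3")

# Plain string sort (byte order): "chr10" and "chr11" land before "chr2".
lexical <- sort(chroms, method = "radix")

# Hypothetical helper: order by the numeric part first, then put
# non-numeric names (X, Y) at the end in alphabetical order.
natural_chrom_order <- function(x) {
  id  <- sub("^chr", "", x)                 # drop the "chr" prefix
  num <- suppressWarnings(as.numeric(id))   # NA for "X" and "Y"
  num[is.na(num)] <- Inf                    # push non-numeric names last
  x[order(num, id)]
}

natural <- natural_chrom_order(chroms)

print(lexical)
print(natural)
```

The first vector shows the string order that names(genes) reports; the second shows the order the question asks for. If `genes` is a GRanges object in your rtracklayer version, sortSeqlevels() from GenomeInfoDb may be a way to apply that ordering to the object itself, but I have not checked that against the version used in the question.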


Hi, I thought so too at first. To check whether that was what was happening, I extracted genes$chr2 from the .gff file and the chr2 reads from my BAM file, and computed their overlaps. That gave me the correct overlap.


I checked whether the GFF file was imported in that order (using the command spaces(genes)), and it was. The file.gff itself has the correct order.


OK, so we're agreed that any issues you have with rtracklayer don't relate to chromosome order? I don't really understand your problem.
