BSgenome format error in subseq()
10.3 years ago
catherine ▴ 250

Sorry if I'm asking a stupid question...

I'm trying to extract sequences using a data frame "DINU" that contains the chromosome name, start, and end positions. I ran this before in R on Windows and it worked perfectly, but on a Mac it fails, apparently because of the format of the chromosome name. Here is my script.

> subseq(DINU$Chr[i],start=200,end=400)
Error in .Call2("solve_user_SEW", refwidths, start, end, width, translate.negative.coord,  : 
  solving row 1: 'allow.nonnarrowing' is FALSE and the supplied start (200) is > refwidth + 1
> subseq(chr2L,start=200,end=400)
  201-letter "DNAString" instance
seq: ATTGCAACGTTAAATACAGCACAATATATGATCG...TATGATCGCGTATGCGAGAGTAGTGCCAACATAT
> DINU$Chr[i]
[1] "chr2L"

As shown above, the function does not recognize the chromosome name taken from the data frame, even though it is a character string, as the function's description requires.

> class(DINU$Chr[i])
[1] "character"

Thanks in advance for any ideas.

R • 3.6k views
10.3 years ago

DINU$Chr[i] needs to be an XVector object, not a string. That's why you're getting the error.

Edit: I guess I should note that you can use a string, but then it needs to be the sequence itself. As is, you're passing in the 5-character string "chr2L", which, as the error indicates, is shorter than the start coordinate. You might try subseq(DINU$Chr[i], start=1, end=5) if this is unclear. BTW, I'm guessing from your syntax that you're iterating over the data frame in a for loop. You really don't want to do that in a functional language like R, as the performance won't be good. Try apply() or just restructure things.
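
To make that concrete, here is a minimal sketch of looking the chromosome up by name before subsetting, and of doing all rows at once without a loop. It assumes the genome is BSgenome.Dmelanogaster.UCSC.dm3 (a guess based on the "chr2L" name), and the toy data frame with its Start and End column names is hypothetical; only the Chr column appears in the question.

```r
# Assumption: the genome package is BSgenome.Dmelanogaster.UCSC.dm3.
# Substitute whichever BSgenome package you actually use.
library(BSgenome.Dmelanogaster.UCSC.dm3)
genome <- Dmelanogaster

# Hypothetical stand-in for DINU; Start and End are assumed column names.
DINU <- data.frame(
  Chr   = c("chr2L", "chr2R"),
  Start = c(200L, 1000L),
  End   = c(400L, 1200L),
  stringsAsFactors = FALSE
)

# One row: fetch the chromosome sequence by name, then take the subsequence.
i <- 1
chr_seq <- genome[[DINU$Chr[i]]]   # a DNAString, not the string "chr2L"
one_seq <- subseq(chr_seq, start = DINU$Start[i], end = DINU$End[i])

# All rows at once: getSeq() accepts vectors of names, starts, and ends.
seqs <- getSeq(genome, names = DINU$Chr, start = DINU$Start, end = DINU$End)
```

The difference from the original call is that subseq() receives the sequence object for the chromosome rather than its name. The getSeq() call returns a DNAStringSet with one element per row, so no for loop is needed.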
