Hi, I have a very basic question.
I want to map genomic coordinates to sequences, but I am not sure how to interpret the coordinates for the minus strand. As an example, using the UCSC Genome Browser I get
>hg18_refGene_NM_000474 range=chr7:19121616-19122860 5'pad=0 3'pad=0 strand=- repeatMasking=none
CAGGCGGAGCCCCCCACCCCCTCAGCAGGGCCGGAGACCTAGATGTCATT
...
I then downloaded chr7.fa for hg18. Since every line holds 50 bases, with the sequence starting on line 2, I expected to find the sequence on line 382434.
sed -n 382434p chr7.fa
GACCAAACTCTAAGGTTCTCtaaattttttatatttatttattGCAGAAA
This does not match. I also could not find the reverse complement of the first sequence in chr7.fa using grep. Could anyone tell me where I went wrong?
Thanks.
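For anyone hitting the same thing, here is a minimal sketch of one way to check it, assuming chr7.fa has a single header line and 50 bases per line as described above. Line 382434 does hold the start coordinate: it covers 19121601-19121650, so 19121616 is its 16th base. For strand=- the browser prints the reverse complement of the range, so the first 50 bases shown correspond to 19122811-19122860, i.e. the end of the range. Those bases straddle lines 382458 and 382459 of the file, which would explain why a line-by-line grep finds nothing. The function names `revcomp` and `fasta_range` are made up for illustration.

```shell
# Reverse-complement DNA read from stdin, one sequence per line (case is preserved).
revcomp() {
  rev | tr 'ACGTacgt' 'TGCAtgca'
}

# Print bases START..END (1-based, inclusive) of a single-record FASTA file.
# Usage: fasta_range FILE START END
# The header is dropped and the wrapped lines are joined before cutting.
fasta_range() {
  grep -v '^>' "$1" | tr -d '\n' | cut -c "$2-$3"
}
```

With hg18 chr7.fa in the current directory, `fasta_range chr7.fa 19121616 19122860 | revcomp | fold -w 50 | head -n 1` should give the first line of the browser output, if the reading above is right.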
+1 for rev, a forgotten goodie.
Thanks a lot, I see my mistake now.