Aligning reads ranging from 110 bp to 56,520 bp to a reference sequence
5.7 years ago
cecilio11 ▴ 120

Hi Biostars,

I have a set of sequences (from a genome sequencing file) that mapped to a reference sequence (let's call it "ref1"). The file containing the mapped sequences could be named "mapped_ref1". The sequences range in size from 110 bp to 56,520 bp. They do not represent the whole genome, only several genes. I want to identify a set of those genes by sequencing them.

I want to sequence the genes that have: 1) the highest identity to the reference genome, and 2) the highest coverage.

I am planning to align the sequences in "mapped_ref1" to the reference sequence "ref1". Once I have the alignments, I can BLAST the genes to identify them and to find the ones with the highest identity to the "ref1" genes. Then I will choose the ones with the highest coverage.
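
To make the ranking step concrete, here is a minimal sketch, not a definitive implementation. It assumes the alignments are available in PAF, the tab-separated format that minimap2 writes by default, and it assumes "coverage" means the fraction of each sequence covered by its alignment. Identity is approximated as matching bases divided by alignment block length (PAF columns 10 and 11). The names `Hit`, `parse_paf_line`, and `rank_hits`, as well as the three example records, are hypothetical and only for illustration.

```python
from typing import Iterable, List, NamedTuple


class Hit(NamedTuple):
    query: str             # name of the aligned sequence
    target: str            # name of the reference sequence
    identity: float        # matching bases / alignment block length
    query_coverage: float  # aligned part of the query / query length


def parse_paf_line(line: str) -> Hit:
    """Parse the 12 mandatory PAF columns of one alignment record."""
    f = line.rstrip("\n").split("\t")
    qname, qlen, qstart, qend = f[0], int(f[1]), int(f[2]), int(f[3])
    tname = f[5]
    matches, block_len = int(f[9]), int(f[10])
    return Hit(qname, tname, matches / block_len, (qend - qstart) / qlen)


def rank_hits(lines: Iterable[str], min_identity: float = 0.0) -> List[Hit]:
    """Sort hits by identity first, then by query coverage (both descending)."""
    hits = [parse_paf_line(line) for line in lines if line.strip()]
    hits = [h for h in hits if h.identity >= min_identity]
    return sorted(hits, key=lambda h: (h.identity, h.query_coverage), reverse=True)


# Made-up records, assumed for illustration only (not real data).
example = [
    "seqA\t1000\t0\t900\t+\tref1\t50000\t100\t1000\t855\t900\t60",
    "seqB\t110\t0\t110\t+\tref1\t50000\t2000\t2110\t110\t110\t60",
    "seqC\t56520\t1000\t29260\t-\tref1\t50000\t5000\t33260\t25434\t28260\t60",
]

for hit in rank_hits(example):
    print(f"{hit.query}\t{hit.identity:.3f}\t{hit.query_coverage:.3f}")
```

If "coverage" instead means sequencing depth over each gene, the second sort key would have to come from a depth calculation on the read alignments rather than from the PAF coordinates.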

Dear biostars, what tools would you recommend to do this job?

Would you recommend a different strategy to accomplish the job at hand?

Best regards,

cecilio11

alignment • 827 views

Tool requests are Question posts, not Tool posts (the latter are for announcements of new software etc). I’ve changed it for you this time, but please bear it in mind for the future.


Can you elaborate on the "highest coverage" part of your question?

You seem to already have sequences to work with, so why do you then want to sequence them again?
