Finding Contig Repeat Counts By Mapping Contigs To The Reference Genome
11.6 years ago
misaghb ▴ 20

Hi guys, I have a set of contigs of genome G (assembled de novo with Velvet) and I also have the complete reference sequence of G. I want to know the repeat count (an integer) of each contig in the reality, obtained by mapping the contigs to the reference genome and counting the exact matches.

Which tools are easiest to use? At the moment I'm only interested in a two-column result: one column with the contig names and the other with an integer, the repeat count of that contig in the reference genome. Everything else is just a bonus. Could you please let me know which tool is better, or how I can easily produce this result from MUMmer or BLAST output?

Thanks.
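
For the strict case described above (exact matches only), a rough sketch in plain Python can count how often each contig occurs verbatim in the reference, on either strand. This is only an illustration, not the output of MUMmer or BLAST: the function names are made up for this example, it assumes the sequences fit in memory as strings, and it counts overlapping occurrences.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    table = str.maketrans("ACGTacgt", "TGCAtgca")
    return seq.translate(table)[::-1]


def count_occurrences(text, pattern):
    """Count occurrences of pattern in text, including overlapping ones."""
    if not pattern:
        return 0
    count = 0
    start = text.find(pattern)
    while start != -1:
        count += 1
        # Move one base forward so overlapping matches are also found.
        start = text.find(pattern, start + 1)
    return count


def exact_repeat_count(contig, reference):
    """Number of exact copies of contig in reference, both strands."""
    contig = contig.upper()
    reference = reference.upper()
    forward = count_occurrences(reference, contig)
    rc = reverse_complement(contig)
    if rc == contig:
        # A reverse-palindromic contig would otherwise be counted twice.
        return forward
    return forward + count_occurrences(reference, rc)
```

Calling `exact_repeat_count` once per contig and printing the contig name and the returned number, separated by a tab, gives the two-column table. For large genomes a dedicated aligner is likely to be much faster than this string search.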

contigs mapping alignment reference repeats cnv • 3.6k views

This question is unclear to me. What is your research question? What exactly are you trying to do? What kind of data do you have: genomic, transcriptomic, etc.? Are you just trying to determine read depth at a given locus? What does "repeat count of each contig in the reality" mean -- are you identifying repeat regions within contigs? Is there a strain difference in genome "G" -- why not map sequence reads onto the reference instead of contigs?

Please edit your question above. Thanks.


Thanks for replying, Josh. As I said, the data are genomic sequences. You can completely forget about sequence reads, read mapping, and read depth. By repeat count I mean the number of times contig Ci is observed in the reference genome G (100% match, or above some threshold, e.g. 98%).
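
The thresholded definition above could be sketched as a small post-processing step on BLAST tabular output (`-outfmt 6`, whose default columns are qseqid, sseqid, pident, length, mismatch, gapopen, qstart, qend, sstart, send, evalue, bitscore). The function names, the `contig_lengths` dictionary and the default cut-offs are assumptions made for illustration. The sketch counts every HSP that covers the whole contig at or above the identity threshold, so a copy that BLAST splits into several partial HSPs would be missed.

```python
from collections import Counter


def count_blast_hits(lines, contig_lengths, min_identity=98.0, min_coverage=1.0):
    """Count qualifying hits per contig from BLAST outfmt 6 lines.

    contig_lengths maps contig name -> length in bases (hypothetical input,
    e.g. collected while reading the contig FASTA file).
    """
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        qseqid = fields[0]
        pident = float(fields[2])
        qstart, qend = int(fields[6]), int(fields[7])
        # Fraction of the contig covered by this alignment.
        covered = (qend - qstart + 1) / contig_lengths[qseqid]
        if pident >= min_identity and covered >= min_coverage:
            counts[qseqid] += 1
    # Report every contig, including those with no qualifying hit.
    return {name: counts[name] for name in contig_lengths}


def format_table(counts):
    """Render the two-column result: contig name, tab, repeat count."""
    return "\n".join(f"{name}\t{n}" for name, n in counts.items())
```

Setting `min_identity=100.0` reproduces the exact-match case; lowering `min_coverage` slightly would tolerate alignments that stop a few bases short of the contig ends.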
