Mapping About Four Hundred Contigs
12.5 years ago
wHo ▴ 20

I have about four hundred contigs and want to map them to a reference sequence. However, there are too many to handle one at a time. Is there software that can map contigs to a reference sequence? I know the sequences of the contigs; I need to map them to the reference so that I can see where each contig is located on it. Thank you.

The order of the contigs on the reference sequence

contigs reference • 4.0k views

Too much for what exactly? (I assume you meant aligning them individually?)


If I BLAST those contigs against the reference sequence one by one, I think it is too much work. You are right.


Have you tried running BLAST locally on your own computer against your reference sequence? You don't have to run the contigs one by one; you can launch it with a multi-sequence FASTA file.
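For anyone following along, here is a minimal sketch of preparing such a multi-sequence FASTA file. It assumes, purely for illustration, that each contig sits in its own file named contig_*.fasta; the toy sequences and all file names are invented. If your assembler already wrote all contigs into one file, this step is not needed.

```shell
# Toy setup: three made-up one-contig FASTA files in a scratch directory.
# The names contig_*.fasta and mycontigs.fasta are assumptions for this sketch.
workdir=$(mktemp -d)
printf '>contig_1\nACGTACGTAC\n' > "$workdir/contig_1.fasta"
printf '>contig_2\nTTGACCATGA\n' > "$workdir/contig_2.fasta"
printf '>contig_3\nGGCATTACGA\n' > "$workdir/contig_3.fasta"

# Concatenate every per-contig file into one multi-sequence FASTA file.
cat "$workdir"/contig_*.fasta > "$workdir/mycontigs.fasta"

# Each record starts with ">", so counting header lines counts the contigs.
n_contigs=$(grep -c '^>' "$workdir/mycontigs.fasta")
echo "contigs in mycontigs.fasta: $n_contigs"
```

Checking the count before launching BLAST is a cheap way to confirm that no contig was lost while merging.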


I know what you mean. I have run BLAST on NCBI. Using BLAST, I can find the location of each contig, but I think there is too much data. So I want to know whether there is software that can do this work. I already know the order of the contigs on the reference sequence; what I need is to determine their exact positions. An appropriate tool makes the work easy.


Although I know the order of the contigs on the reference sequence, I have many questions. For example, some contigs overlap. In addition, there are many lines and contigs in different colors, and I don't know what they mean. I will give you a screenshot.


I am not talking about BLAST on the NCBI website but about a local BLAST on your own machine. It runs perfectly well with large files, and the output file will give you all the information you need concerning the positions and possible overlaps of your contigs.
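To illustrate how overlaps can be read from such an output file, here is a rough sketch using only standard Unix tools. It assumes tabular BLAST output (-outfmt 6 or 7, default 12 columns) against a single reference sequence; the rows below are made up so that the example runs without BLAST installed, and no e-value filtering is applied.

```shell
# Toy BLAST tabular output (-outfmt 7 style); every row here is invented.
# Columns: qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore
blast_out=$(mktemp)
{
  printf '# BLASTN (toy example)\n'
  printf 'contig_1\tref\t99.5\t1000\t5\t0\t1\t1000\t1\t1000\t0.0\t1800\n'
  printf 'contig_2\tref\t99.0\t600\t6\t0\t1\t600\t1500\t901\t0.0\t1050\n'
  printf 'contig_3\tref\t100.0\t500\t0\t0\t1\t500\t2000\t2499\t0.0\t920\n'
} > "$blast_out"

# 1) drop comment lines, 2) turn each hit into (contig, start, end) on the
# reference with start <= end (minus-strand hits have sstart > send),
# 3) sort by start, 4) report hits that begin before the furthest end seen so far.
overlaps=$(
  grep -v '^#' "$blast_out" |
  awk -F '\t' 'BEGIN { OFS = "\t" }
       { s = $9 + 0; e = $10 + 0
         if (s > e) { t = s; s = e; e = t }
         print $1, s, e }' |
  sort -t "$(printf '\t')" -k2,2n |
  awk -F '\t' 'BEGIN { OFS = "\t" }
       NR > 1 && $1 != cur_name && $2 + 0 <= cur_end {
         ov_end = ($3 + 0 < cur_end) ? $3 + 0 : cur_end
         print cur_name, $1, $2, ov_end
       }
       NR == 1 || $3 + 0 > cur_end { cur_name = $1; cur_end = $3 + 0 }'
)
printf 'contig_a\tcontig_b\toverlap_start\toverlap_end\n%s\n' "$overlaps"
```

Each reported line names two contigs and the reference interval they share. A contig with several hits contributes one interval per hit, so real data may need filtering first.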

How familiar are you with command-line tools? Are you running a Unix-based machine?


I am a beginner. I run a Linux-based machine. I also have many questions, such as those about the screenshot I gave.


Running BLAST locally is definitely something you should learn, then. Using ready-made tools will only get you so far, and after that, you will have to start learning how to run the tools yourself (and analyse their flat-file outputs).

12.5 years ago

Have a look at MUMmer, and especially at NUCmer from the MUMmer package.
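A rough sketch of what a NUCmer run could look like, assuming MUMmer 3.x is installed and on the PATH. The input file names (reference.fasta, mycontigs.fasta), the output prefix, and the helper function name are all assumptions made for this example.

```shell
# Sketch only. map_with_nucmer is a hypothetical helper, not part of MUMmer.
map_with_nucmer() {
  ref=$1; contigs=$2; prefix=$3
  # Align all contigs to the reference; writes $prefix.delta
  nucmer --prefix="$prefix" "$ref" "$contigs" || return 1
  # For each stretch of a contig, keep only its best alignment to the reference
  delta-filter -q "$prefix.delta" > "$prefix.filtered.delta" || return 1
  # Coordinate table sorted by reference position (-r),
  # with percent coverage (-c) and sequence lengths (-l)
  show-coords -r -c -l "$prefix.filtered.delta" > "$prefix.coords"
}

if command -v nucmer >/dev/null 2>&1; then
  map_with_nucmer reference.fasta mycontigs.fasta contigs_vs_ref
else
  echo "nucmer not found on PATH; install MUMmer first" >&2
fi
```

The resulting .coords file lists, for every alignment, the start and end on the reference and on the contig, which gives the contig positions along the reference.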


OK, I will look at it. Thank you.

12.5 years ago

Briefly, I would install BLAST, and then run it on my contigs:

1. Make the BLAST database with the reference genome

makeblastdb -in reference.fasta -dbtype nucl -parse_seqids -out ref_db
blastdbcmd -db ref_db -info

2. Run the BLAST query with my contigs

blastn -query mycontigs.fasta -task blastn -db ref_db -outfmt 7 -out blast.output

3. Read the output file in R and make a basic visualization (untested)

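library(seqinr)
# Length of the first sequence in the reference FASTA file
L <- length(read.fasta("reference.fasta")[[1]])
# -outfmt 7 is tabular; read.table skips its "#" comment lines by default
blast <- read.table("blast.output")
blast <- blast[blast[, 11] <= 0.0001, ]  # column 11 = e-value threshold
queries <- unique(as.character(blast[, 1]))
plot(0, 0, xlim = c(0, L), ylim = c(0, length(queries)), type = "n",
     xlab = "Reference position", ylab = "Contig")
# One row per contig, one segment per hit (columns 9 and 10 = start/end on the reference)
for (i in seq_along(queries)) {
    tmp <- blast[blast[, 1] == queries[i], ]
    cols <- rainbow(nrow(tmp))
    for (j in seq_len(nrow(tmp))) {
        lines(x = c(tmp[j, 9], tmp[j, 10]), y = rep(i, 2), col = cols[j])
    }
}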

This is just to show you how much freer you will be in your own research if you learn how to use day-to-day tools such as BLAST, R, and so on.
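As a text-only complement to the plot, the same output file can be reduced to one line per contig, ordered along the reference. This is a sketch under assumptions: default 12-column tabular output, a single reference sequence, the same 0.0001 e-value threshold as above, and "best hit" taken to mean the highest bit score. The rows are invented so that it runs without BLAST.

```shell
# Toy BLAST tabular output (-outfmt 7 style); every row here is invented.
# Columns: qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore
blast_out=$(mktemp)
{
  printf '# BLASTN (toy example)\n'
  printf 'contig_A\tref\t99.0\t800\t8\t0\t1\t800\t5000\t5799\t0.0\t1400\n'
  printf 'contig_A\tref\t90.0\t100\t10\t0\t1\t100\t200\t299\t1e-20\t110\n'
  printf 'contig_B\tref\t98.5\t400\t6\t0\t1\t400\t1600\t1201\t0.0\t700\n'
  printf 'contig_C\tref\t85.0\t60\t9\t0\t1\t60\t9000\t9059\t0.5\t40\n'
} > "$blast_out"

# Keep hits passing the e-value threshold, remember the highest-scoring hit
# per contig, then sort the contigs by their start position on the reference.
order=$(
  grep -v '^#' "$blast_out" |
  awk -F '\t' 'BEGIN { OFS = "\t" }
       $11 + 0 <= 0.0001 {
         score = $12 + 0
         if (!($1 in best) || score > best[$1]) {
           best[$1] = score
           s = $9 + 0; e = $10 + 0
           strand[$1] = (s <= e) ? "+" : "-"   # sstart > send means minus strand
           if (s > e) { t = s; s = e; e = t }
           start[$1] = s; stop[$1] = e
         }
       }
       END { for (q in best) print q, start[q], stop[q], strand[q] }' |
  sort -t "$(printf '\t')" -k2,2n
)
printf 'contig\tref_start\tref_end\tstrand\n%s\n' "$order"
```

Contigs without any hit below the threshold (contig_C in the toy data) simply do not appear in the table, which makes them easy to spot by comparing against the full contig list.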


Thank you. I'll try it.

12.5 years ago
mgalactus ▴ 780

You could also use CONTIGuator, which also provides convenient maps viewable with the ACT tool from the Sanger Institute.


In your opinion, which is the better choice, CONTIGuator or MUMmer? Thank you.


Well, since I'm the maintainer of CONTIGuator, I may have a little conflict of interest, but MUMmer is less user-friendly in my opinion.


Thank you for your opinion.

12.5 years ago
sconlan • 0

ABACAS (http://abacas.sourceforge.net/) is a wrapper for MUMmer that is useful for mapping contigs to a reference. It works great for bacterial genomes; I'm not sure how it scales to larger genomes.


OK, thank you for your answer.
