How to find identical or similar sequences within a FASTA file
10.0 years ago
Kurban ▴ 230

Hello, I am trying to find the best way to search for sequences that are identical or similar to a given sequence within a set of transcriptome sequences in a FASTA file, which was assembled from RNA-seq data. I know there are many tools, but I don't know which one was developed for this purpose. Could anyone give me some tips? Thanks.

similar-sequence • 3.4k views

I'm not clear on what exactly you are looking to do: compare sequences from different samples, or within the same sample? There are many different strategies for either, from clustering (usearch/UPARSE/CD-HIT, etc.) to alignment (BLAST, etc.). Can you please clarify your original post with your research question?


Sorry @Josh Herr, I was not clear.

I have a FASTA file which contains around 144,000 transcripts/sequences (the transcriptome of an insect). My boss gave me several nucleotide sequences and asked me whether the FASTA file contains any sequences that are identical or similar to them and, if so, which ones and how similar they are.

I want to align those sequences one by one against the transcriptome (the FASTA file).

I am new to this kind of analysis.


Sounds like BLAST would be a good solution. You can install it locally and use it from the command line.


Thank you @Josh Herr, @Siva and @Geek_y.

10.0 years ago
Siva ★ 1.9k

You can create a BLAST database from those 144,000 transcripts and run a BLASTN search against it, using your nucleotide sequences as the queries.
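As a rough sketch of what that could look like when driven from Python, assuming BLAST+ is installed and on your PATH: the file names (`transcriptome.fasta`, `queries.fasta`, `hits.tsv`), the database name, and the E-value cutoff are placeholders I made up, not anything from the original question. The helper functions are hypothetical too. They only wrap the standard `makeblastdb` and `blastn` commands and read the default tabular output (`-outfmt 6`).

```python
import subprocess

# Column order of BLAST's default tabular output (-outfmt 6).
OUTFMT6_COLUMNS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]
INT_COLUMNS = ("length", "mismatch", "gapopen", "qstart", "qend", "sstart", "send")
FLOAT_COLUMNS = ("pident", "evalue", "bitscore")


def makeblastdb_cmd(transcriptome_fasta, db_name):
    """Command that builds a nucleotide BLAST database from the transcriptome."""
    return ["makeblastdb", "-in", transcriptome_fasta,
            "-dbtype", "nucl", "-out", db_name]


def blastn_cmd(query_fasta, db_name, out_tsv, evalue="1e-5"):
    """Command that searches the query sequences against the database.

    The E-value cutoff here is an arbitrary placeholder; choose your own.
    """
    return ["blastn", "-query", query_fasta, "-db", db_name,
            "-evalue", evalue, "-outfmt", "6", "-out", out_tsv]


def parse_outfmt6(text):
    """Turn tabular BLAST output into a list of dicts, one per hit."""
    hits = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        hit = dict(zip(OUTFMT6_COLUMNS, line.split("\t")))
        for key in INT_COLUMNS:
            hit[key] = int(hit[key])
        for key in FLOAT_COLUMNS:
            hit[key] = float(hit[key])
        hits.append(hit)
    return hits


def search(transcriptome_fasta, query_fasta, db_name, out_tsv):
    """Build the database, run blastn, and return the parsed hits."""
    subprocess.run(makeblastdb_cmd(transcriptome_fasta, db_name), check=True)
    subprocess.run(blastn_cmd(query_fasta, db_name, out_tsv), check=True)
    with open(out_tsv) as handle:
        return parse_outfmt6(handle.read())


# Example call (requires BLAST+ and the placeholder files to exist):
# hits = search("transcriptome.fasta", "queries.fasta", "transcriptome_db", "hits.tsv")
```

In each hit, `sseqid` is the matching transcript and `pident` is the percent identity over the aligned region, which covers the "which ones, and how similar" part of the question.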

10.0 years ago

You can use cd-hit-est or usearch for this purpose. They cluster similar sequences and keep one representative sequence per cluster, based on a user-defined percent similarity threshold.

If you need to compare them against another set of sequences, you need to run BLAST or a similar alignment tool.
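For the clustering option, a minimal sketch of a cd-hit-est call from Python might look like the following, assuming CD-HIT is installed and on your PATH. The helper name, the file names, and the 95% identity threshold are illustrative assumptions, not recommendations. The word size (`-n`) should be chosen to suit the identity threshold as described in the CD-HIT user guide; 10 is used here for 0.95.

```python
import subprocess


def cd_hit_est_cmd(input_fasta, output_fasta, identity=0.95, word_size=10, threads=1):
    """Build a cd-hit-est command that clusters nucleotide sequences.

    identity  -> -c, sequence identity threshold as a fraction (placeholder: 0.95)
    word_size -> -n, must suit the chosen threshold (see the CD-HIT user guide)
    threads   -> -T, number of threads
    """
    if not 0.0 < identity <= 1.0:
        raise ValueError("identity must be a fraction in (0, 1]")
    return ["cd-hit-est", "-i", input_fasta, "-o", output_fasta,
            "-c", str(identity), "-n", str(word_size), "-T", str(threads)]


def cluster(input_fasta, output_fasta, **options):
    """Run cd-hit-est; representatives go to output_fasta, clusters to output_fasta + '.clstr'."""
    subprocess.run(cd_hit_est_cmd(input_fasta, output_fasta, **options), check=True)


# Example call (requires CD-HIT and the placeholder input file to exist):
# cluster("transcriptome.fasta", "transcriptome_nr95.fasta", identity=0.95)
```

Note that this reduces redundancy within one FASTA file. For matching a handful of query sequences against the transcriptome, the alignment route is the more direct fit.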


The free (32-bit) version of usearch will be very slow.
