Incomplete alignment EMBOSS needleall
2.4 years ago

Hello everyone

I am trying to align query FASTQ nucleotide sequences against a reference (.fa) using the EMBOSS needleall alignment tool. The query file has 2271611 sequences to align. However, the output SAM file contains alignment scores for only 881 sequences. I would like to generate an alignment score for every query sequence. The run also produces a needleall.error file, but it is empty. I cannot figure out what the reason could be.

Looking forward to your response.

Thanks

/home/user/EMBOSS-6.6.0/emboss/needleall -asequence X3.fa -bsequence X3rL41-post.merged.fastq -gapopen 10 -gapextend 0.5 -datafile 'EDNAFULL'   -outfile X3rL41_post_new.sam -aformat sam  

I'm not sure if you're using the appropriate tool here ...

needleall does pairwise comparison of many sequences against many other sequences (usually the same set as the input). Moreover, I think it is meant for FASTA-formatted files (not FASTQ).

When aligning short reads to a reference, you're better off using one of the NGS aligners: HISAT, Salmon?, STAR, bwa, ...


It does take a FASTQ file as input. The advantage of this tool is that it allows alignment of FASTQ sequences against FASTA sequences.

2.4 years ago

I'm not sure it takes FASTQ as input (yes, the example on their site is named .fastq, but when you look at that file it is FASTA-formatted).
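A quick way to check what a file really contains, regardless of its extension, is to look at its first character: FASTA records start with ">" and FASTQ records start with "@". The sketch below is only a heuristic; the function name guess_seq_format is made up for this illustration, and it assumes the file is uncompressed and has no leading blank lines.

```shell
# Report whether a sequence file looks like FASTA or FASTQ,
# judging only by the first character of the file.
# (guess_seq_format is a hypothetical helper, not part of EMBOSS.)
guess_seq_format() {
    first_char=$(head -c 1 "$1")
    case "$first_char" in
        '>') echo "fasta" ;;
        '@') echo "fastq" ;;
        *)   echo "unknown" ;;
    esac
}
```

For example, guess_seq_format X3rL41-post.merged.fastq would print fastq if the file from the question is a real FASTQ file.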

Anyway, all those NGS aligners take FASTQ as input and align against a FASTA reference, so I don't see the advantage here.

Moreover, needleall uses the Needleman-Wunsch algorithm, which is a global aligner, so it may not report non-global alignments (which will be the case for most NGS reads, especially raw reads, aligned against a reference).


Thank you so much for your response. A simple tweak from fastq to fasta solved the problem and global alignment was turned off :-). Cheers.
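The reply does not say how the FASTQ-to-FASTA conversion was done. One possible way, shown here as a minimal sketch, is an awk one-liner that keeps the header and sequence lines of each record. It assumes the common 4-line-per-record FASTQ layout (sequences not wrapped across lines); the function name fastq_to_fasta and the output file name in the usage note are hypothetical.

```shell
# Convert a 4-line-per-record FASTQ file to FASTA on stdout.
# Assumes sequence and quality strings are not wrapped over several lines.
# (fastq_to_fasta is a hypothetical helper, not part of EMBOSS.)
fastq_to_fasta() {
    awk 'NR % 4 == 1 { print ">" substr($0, 2) }   # header: swap leading "@" for ">"
         NR % 4 == 2 { print }                      # sequence line, copied unchanged
        ' "$1"
}
```

Usage would look like fastq_to_fasta X3rL41-post.merged.fastq > X3rL41-post.merged.fa, with the resulting file passed to -bsequence in place of the FASTQ file.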

2.4 years ago

The number of reported alignments could be related to the -minscore (minimum alignment score) option.

I have checked this with a small fastq file:

$ needleall -minscore 12 -stdout ../../petase.fa SRR17458628_1.h1000.fastq -auto | wc -l

83

$ needleall -minscore 18 -stdout ../../petase.fa SRR17458628_1.h1000.fastq -auto | wc -l

33
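To see how many reads are missing from the output at a given threshold, the read count of the input can be compared with the number of alignment records in a SAM output file. The following is a rough sketch: both function names are hypothetical, the FASTQ count assumes 4 lines per record, and the SAM count assumes header lines (and only header lines) start with "@", as in the SAM format.

```shell
# Number of reads in a 4-line-per-record FASTQ file.
# (hypothetical helper)
count_fastq_reads() {
    awk 'END { print int(NR / 4) }' "$1"
}

# Number of alignment records in a SAM file:
# every non-empty line that is not an "@" header line.
# (hypothetical helper)
count_sam_records() {
    awk '!/^@/ && NF > 0 { n++ } END { print n + 0 }' "$1"
}
```

With the files from the question, count_fastq_reads X3rL41-post.merged.fastq and count_sam_records X3rL41_post_new.sam would give the two numbers to compare.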
