SOLVED: Multiple aligner failures but no error (Closed)
8.0 years ago
skbrimer ▴ 740

*EDIT* Thank you, everyone. It was user error. Please see the information below.


Hello everyone,

I have encountered a problem I have never had before. I have Salmonella sequencing data (Ion Torrent) that will not map, and I have no idea why.

First we tried DNAStar. It runs through without error, but at the end it says there is no sequencing data and fails to align.

Next I tried TMAP and got a segmentation fault [core dump] after the initial load. (All right, I lied. That is an error.)

Next I tried bwa mem. It loads all the reads and finishes without error, but samtools flagstat reports 0 reads mapped.

Lastly I tried SPAdes for a de novo assembly, which worked, and I was able to build Salmonella contigs.

FastQC is able to read the data without issue, and the data looks OK.

Has anyone encountered this type of issue before?

Any advice, help, or pointing me in the right direction will be greatly appreciated.

Thank you,

Sean

alignment • 1.5k views

Can you try BBMap?

-----------------   Results   ------------------   

Genome:                 1
Key Length:             13
Max Indel:              16000
Minimum Score Ratio:    0.56
Mapping Mode:           normal
Reads Used:             3004412 (563086674 bases)

Mapping:            63.640 seconds.
Reads/sec:          47209.62
kBases/sec:         8848.02


Read 1 data:        pct reads   num reads   pct bases      num bases

mapped:               0.0000%           0     0.0000%              0
unambiguous:          0.0000%           0     0.0000%              0
ambiguous:            0.0000%           0     0.0000%              0
low-Q discards:       0.0000%           0     0.0000%              0

perfect best site:    0.0000%           0     0.0000%              0
semiperfect site:     0.0000%           0     0.0000%              0

Match Rate:               NA           NA        NaN%              0
Error Rate:              NaN%           0        NaN%              0
Sub Rate:                NaN%           0        NaN%              0
Del Rate:                NaN%           0        NaN%              0
Ins Rate:                NaN%           0        NaN%              0
N Rate:                  NaN%           0        NaN%              0

Total time:         67.784 seconds.

Can you try these options: minratio=0.15 ignorequality slow ordered? They are meant for Nanopore/PacBio reads with errors/indels.


In what format is your Ion Torrent data? And would you paste the first 10-12 lines of data?


It is an unmapped BAM, but in FASTQ it looks like this:

@YE0QL:00664:05761
TGAAAGCGAGCTATGTACTGAATACCGCAGAACTGCACGCGCCGCTGCAGAAAAACCAGGTGGTCGGCACCATCAACTTCCAGCTGGACGGTAAAACCATCGAGCAGCGTCCGCTGGTCGTAGACTGCGAAGAGATCCCGGGAAGGGAACTTCTTCGGTAAAATCATTGATTACATTAAATTAATGTTCCATCACTGGTTTGGATAAAATTGAACTCTTGAAAGTGTGATTTTCGTCCCCTATATACTATGCATCAGTAAACTCCGCCCTGTGG
+
7;;;05@?AABB<<;::5;==4;>949<;55.59=595<<99244449945555$549;7>>9>==8<<<,//997<@:>9=4/--(---(-899,=699999555599994999:4:9@90///----/)/:<555:)55)56;>3;7<=7<<7<;:>>@@2;;;;8<@A259>>:<808-7-000*/)//9/34:59;4<4::::;.:044056;;9<=<265A<99999*////77&///78;86///79;=666=@055;6544)45544
@YE0QL:00664:05804
ATATACGCTGCTATCGTTGGCTTTTACATAATCATCGCGCTTCTTTT
+
BBB@?A;:555999==B=@<@BDC6;9@@=2;443333333-3333&
@YE0QL:00664:05808
TTGCTTCAGGGCGCCTGAGATCCGCGCATCGTTCCAGCCAGCATCCTGCAAGAGCATATTCGCCAGTTCAGAGGCCTTATTGACCCGCACGCAGCCGGAACTTAACGCCCTGGCGTCACGCTGAAAGAGAGTATGGTTAGGCGTATCGTGCAGATAGATAGCATCGTGAACTACGGTCAATATTCAAAA

That first read is definitely from Salmonella enterica.


Just to be sure it's not a hidden formatting issue, what happens if you convert the BAM to FASTQ and align with BBMap?

As @genomax2 noted, the reads are from S. enterica, so they should align.
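
If you want to rule out a formatting problem before re-running the aligner, a quick structural check on the converted FASTQ can help. The following is only a minimal sketch in Python, assuming plain 4-line records (no wrapped sequences); the function name `check_fastq` is made up for illustration and is not part of any tool mentioned in this thread.

```python
import io


def check_fastq(handle):
    """Return (n_records, problems) for a stream of 4-line FASTQ records.

    Assumption: every record is exactly four lines (header, sequence,
    '+' separator, quality). Wrapped FASTQ is not handled.
    """
    problems = []
    n = 0
    while True:
        header = handle.readline()
        if not header:
            break  # end of file
        if not header.strip():
            continue  # tolerate blank lines between or after records
        seq = handle.readline().rstrip("\r\n")
        plus = handle.readline().rstrip("\r\n")
        qual = handle.readline().rstrip("\r\n")
        n += 1
        if not header.startswith("@"):
            problems.append((n, "header does not start with '@'"))
        if not plus.startswith("+"):
            problems.append((n, "missing '+' separator line"))
        if len(seq) != len(qual):
            problems.append((n, "sequence and quality lengths differ"))
    return n, problems


if __name__ == "__main__":
    demo = io.StringIO("@r1\nACGT\n+\nIIII\n@r2\nACGT\n+\nII\n")
    count, issues = check_fastq(demo)
    print(count, "records checked;", len(issues), "problem(s) found")
    for record_number, message in issues:
        print("record", record_number, ":", message)
```

A record cut off at the end of the file (sequence present but no quality line) would show up here as a missing separator plus a length mismatch.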


You both are correct. It is fine. I apparently have a truncated genome file.

The GenBank summary file is 4.0 KB in size.
The full GenBank file I was using (truncated) is 3.4 MB in size.
The full GenBank file I just reloaded is 11.4 MB.

Right now mapping seems to be moving along just fine with the correct file. The truncated file had all the genes listed but was missing the actual sequence.

Thank you both for your quick help!
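
For anyone who hits the same thing: a reference that has annotations but no sequence can be caught before mapping by counting the bases it actually contains. This is a rough sketch in Python, assuming a standard GenBank flat file (sequence between `ORIGIN` and `//`) or a FASTA file; `count_reference_bases` is a hypothetical helper name, not something from BBMap, bwa, or any other tool in this thread.

```python
import io


def count_reference_bases(handle):
    """Count sequence letters in a GenBank flat file or a FASTA file.

    Assumptions: GenBank sequence lines sit between an 'ORIGIN' line and
    the '//' record terminator; FASTA headers start with '>'.
    """
    total = 0
    in_origin = False
    is_fasta = False
    for line in handle:
        line = line.rstrip("\r\n")
        if line.startswith(">"):
            is_fasta = True  # FASTA header: every other line is sequence
            continue
        if is_fasta:
            total += sum(ch.isalpha() for ch in line)
        elif line.startswith("ORIGIN"):
            in_origin = True
        elif line.startswith("//"):
            in_origin = False
        elif in_origin:
            # Lines look like "        1 gatcctccat atacaacggt";
            # count only the letters, skipping positions and spaces.
            total += sum(ch.isalpha() for ch in line)
    return total


if __name__ == "__main__":
    truncated = io.StringIO("LOCUS       X\nFEATURES\n     gene   1..20\n//\n")
    if count_reference_bases(truncated) == 0:
        print("Reference contains no sequence; re-download it before mapping.")
```

A result of 0 (or a number far below the expected genome size) would point to the kind of truncated download described above.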


Also for the record :)

It works so much better with something to actually align.

-----------------   Results   ------------------

Genome:                 1
Key Length:             13
Max Indel:              16000
Minimum Score Ratio:    0.56
Mapping Mode:           normal
Reads Used:             3004412 (563086674 bases)

Mapping:            1309.738 seconds.
Reads/sec:          2293.90
kBases/sec:         429.92


Read 1 data:        pct reads   num reads   pct bases      num bases

mapped:              77.7100%     2334730    77.1825%      434604398
unambiguous:         76.7708%     2306510    76.2666%      429447182
ambiguous:            0.9393%       28220     0.9159%        5157216
low-Q discards:       0.0000%           0     0.0000%              0

perfect best site:   12.8440%      385886    10.9398%       61600284
semiperfect site:    12.8440%      385886    10.9398%       61600284

Match Rate:               NA           NA    93.2227%      424412648
Error Rate:          80.5406%     1948844     6.7759%       30848592
Sub Rate:            75.7501%     1832929     1.7104%        7786916
Del Rate:            17.2835%      418210     4.5387%       20663278
Ins Rate:            20.9759%      507554     0.5268%        2398398
N Rate:               0.0059%         143     0.0014%           6436

Total time:         1314.855 seconds.
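
Since the aligners in this thread exited cleanly even with 0 reads mapped, it may be worth checking the mapped count explicitly in a script. Below is a small sketch in Python that reads the `mapped:` line from a stats block laid out like the one above. It assumes that exact column order (pct reads, num reads, pct bases, num bases); `parse_bbmap_mapped` is an illustrative name only.

```python
def parse_bbmap_mapped(stats_text):
    """Return (pct_reads, num_reads) from the 'mapped:' line, or None.

    Assumption: the line has the layout shown in this thread, e.g.
    'mapped:   77.7100%   2334730   77.1825%   434604398'.
    """
    for line in stats_text.splitlines():
        if line.startswith("mapped:"):
            fields = line.split()
            pct_reads = float(fields[1].rstrip("%"))
            num_reads = int(fields[2])
            return pct_reads, num_reads
    return None


if __name__ == "__main__":
    failed_run = "mapped:               0.0000%           0     0.0000%              0"
    result = parse_bbmap_mapped(failed_run)
    if result is not None and result[1] == 0:
        print("No reads mapped; check the reference before trusting the run.")
```

Treating a zero mapped count as a failure turns a silent result like the first run into a visible one.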