How to fix a "truncated file" error when sorting a SAM file
6.1 years ago
sunnykevin97 ▴ 990

Hi, I'm converting a SAM file to a sorted BAM, and samtools reports that the SAM file is truncated. But when I view the file with samtools, it does not show as truncated. Is there any other way to convert it to BAM? SRR1.sam is 47 GB.

Sorting bam

samtools view -u SRR1.sam | samtools sort -@ 5 -T /tmp/SRR1.samsort -o SRR1.sort.bam 
[W::sam_read1] Parse error at line 10363596
[main_samview] truncated file.
[bam_sort_core] merging from 5 files and 5 in-memory blocks...
assembly genome alignment • 2.7k views
**samtools view -bS SRR1.sam > SRR1.bam** 
[W::sam_read1] Parse error at line 10363596
[main_samview] truncated file.
SRR1.sam => 47 GB
SRR1.bam => 1.9 GB
samtools quickcheck -v -v -v -v SRR1.bam && echo 'all ok' 
verbosity set to 4
checking SRR1.bam
opened SRR1.bam
SRR1.bam is sequence data
SRR1.bam has 455 targets in header.
SRR1.bam has good EOF block.
all ok

sunnykevin97 : Please use ADD REPLY/ADD COMMENT when responding to existing answers/comments to keep threads logically organized.

SUBMIT ANSWER is for new answers to the original question.


How did you obtain this sam file?

6.1 years ago

First, just convert the SAM file to BAM:

samtools view -bS your_file.sam > your_file.bam

Then, using samtools v1.4.1, check the result:

samtools quickcheck -v -v -v -v   your_file.bam && echo 'all ok'

Then paste the output for your file.


@Vijay: Samtools is now at v.1.9. You should upgrade.


I tried with samtools v1.9. SRR.sam => 47 GB

samtools view -bS SRR.sam > SRR.bam 
[W::sam_read1] Parse error at line 10363596
[main_samview] truncated file.

SRR.bam => 1.9 GB
samtools quickcheck -v -v -v -v SRR.bam  && echo 'all ok' 
verbosity set to 4
checking SRR.bam
opened SRR.bam
SRR.bam is sequence data
SRR.bam has 455 targets in header.
SRR.bam has good EOF block.
all ok

There is something wrong with your file at that specified line.

Can you pull out the lines around that error location? sed -n '10363595,10363597p;10363598q' your_file


sed -n '10363595,10363597p;10363598q' SRR.sam

SRR.5228479 99  chr6    65895862    3   250M    =   65896126    513 TGGCCAATATGAAGGAGCTACTGGCTTCATGAAATTCCAAAATCCAGTGGGTCAGCAATTAAATCTTAAAGCTCCAAAATGACCTCCTTTGATTCCATGTCTCACACTTAGGCATGCTGATACAAGGGGTGGGCTCCCAAGGACTTGGGCAGCTCTGCCTCTGTGGCTCTTCAGGGTACATACCCCATGACTGCTTTCACAGGCTGGCATTGAGTGTCTGCTGTTTTTCCAAGTGCACAGTGCAAGCTGT  DDDDDIIIIIIIIIIGIIIIIIHIIIHHHIIIIIIIIIIIIIIIIIIIIIIIIIIIIHIIIIIIIIIIIIIIIIIIIIHHIIIIIIIIIIGIIIIIIHIIIGHIIIIIIIIIIIIIIIIIIIIIIHHHIIIIIHHIIIIIIHIIIIHGIIIIIIIIIIIIIIIIIIIGIGIIGGGHHHHHGHEHHIIHFHIHHIIEHHGHIIIIHIHIICHEEC7FEHIHIGH?BAFEEHIHCHFGIHIIHHII?GIIH.  PQ:i:21 SM:i:3  UQ:i:0  MQ:i:0  XQ:i:0  NM:i:0
SRR.5228479 147 chr6    65896126    3   249M    =   65895862    -513    TTCTGGTGTATGGAGGATGGTGGCCATCTTCTCACAGCTCCACTAGGCACTGCCCCAATGGGCACTCAGTGTTGGGGCTCCAACCACACATTTCTCCTCTGTATTGCCCTAGTAGAGGTTTTCCATGAWarning - out of bounds at position (28,-1)
Warning - out of bounds at position (27,-1)
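
The excerpt above suggests that line 10363596 was cut off partway through its sequence and that warning text from another program ended up inside the SAM file. A minimal sketch for checking whether other lines are damaged the same way, assuming a tab-delimited SAM file (find_bad_sam_lines is a hypothetical helper name, not a samtools command):

```shell
# Hypothetical helper: list SAM body lines that cannot be valid alignment records.
# A record needs at least 11 tab-separated fields, and SEQ (field 10) and
# QUAL (field 11) must have the same length unless one of them is "*".
# Usage: find_bad_sam_lines FILE.sam
find_bad_sam_lines() {
    awk -F'\t' '
        /^@/ { next }                                   # skip header lines
        NF < 11 { print NR ": only " NF " fields"; next }
        $10 != "*" && $11 != "*" && length($10) != length($11) {
            print NR ": SEQ/QUAL length mismatch"
        }
    ' "$1"
}
```

The printed numbers are line numbers in the file, so they can be compared directly with the line reported by samtools. This only catches structural damage; it does not validate flags, CIGAR strings, or tags.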

You could try deleting the problem record with sed -i '10363596d' your.sam and see if that fixes the problem. Make a backup of the file first (if you want to be careful), since the command changes the file in place. Otherwise, you may need to redo the alignment.
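
One caveat with deleting a single line: every later line moves up by one, so if the following line is also damaged (in the excerpt above, line 10363597 is a stray warning message), samtools may stop at the same line number again. A minimal sketch that instead writes a cleaned copy and leaves the original untouched, assuming a tab-delimited SAM file (drop_bad_sam_lines is a hypothetical helper name):

```shell
# Hypothetical helper: copy a SAM file, keeping header lines and only those
# records that have at least 11 tab-separated fields and matching SEQ/QUAL
# lengths (or "*" in either field). The input file is not modified.
# Usage: drop_bad_sam_lines IN.sam OUT.sam
drop_bad_sam_lines() {
    awk -F'\t' '
        /^@/ { print; next }                            # keep header lines
        NF >= 11 && ($10 == "*" || $11 == "*" || length($10) == length($11))
    ' "$1" > "$2"
}
```

For example, drop_bad_sam_lines SRR.sam SRR.clean.sam, then run the samtools view and sort commands on SRR.clean.sam. This drops only structurally broken lines; reads that were never written correctly are still missing, so redoing the alignment may be the safer route.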


I tried sed -i '10363596d' SRR.sam, but samtools still reports a truncated SAM file:

samtools view -u SRR.sam | samtools sort -@ 10 -T test/SRR.sort -o SRR.bam

[W::sam_read1] Parse error at line 10363596
[main_samview] truncated file.
[bam_sort_core] merging from 0 files and 10 in-memory blocks...

Generated BAM file: 1.8 GB


You may want to cut your losses and redo the alignment. Even if you manage to get past this error there may be something else wrong.
