.bam format is not enough for mapped long reads
5.2 years ago

Hello,

After mapping Nanopore reads to a reference genome with ngmlr, I converted the output to .bam with samtools and found that part of the mapping information in the CIGAR string was lost.

I mapped the reads again using the --bam-fix option of ngmlr, and now I don't really understand how to process the output .sam files (I am going to search for structural variations).

I tried to convert these fixed .sam files to .bam with samtools, and received an error:

[W::sam_read1] Parse error at line 347854
[main_samview] truncated file.

Which software can I use to process my data? Or is there another format for long-read data that I don't know about?

Many thanks, Anastasiia

alignment • 1.9k views

Did the suggestion solve the problem? If so please mark the answer as accepted.

5.2 years ago

Make sure you're using the most recent version of samtools, as support for very high numbers (>64k) of CIGAR operations was only relatively recently added to the BAM specification. Alternatively, use the CRAM format.
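To see whether this limit actually affects a given SAM file, a quick scan along the following lines may help. It is only a minimal sketch: the function names `count_cigar_ops` and `check_sam_lines` are made up for illustration, and it assumes that the parse error comes either from a cut-off line (fewer than the 11 mandatory SAM fields) or from a read with more than 65535 CIGAR operations, which is the most that the 16-bit counter in a classic BAM record can hold. The real cause of the error may be something else.

```python
import re

# A classic BAM record stores the CIGAR operation count in 16 bits,
# so at most 65535 operations fit without the long-CIGAR extension.
BAM_MAX_CIGAR_OPS = 65535

# One CIGAR operation: a length followed by an operation letter from the SAM spec.
CIGAR_OP = re.compile(r"\d+[MIDNSHP=X]")


def count_cigar_ops(cigar):
    """Return the number of operations in a SAM CIGAR string ('*' counts as 0)."""
    if cigar == "*":
        return 0
    return len(CIGAR_OP.findall(cigar))


def check_sam_lines(lines):
    """Yield (line_number, problem) for suspicious alignment lines.

    Hypothetical helper: flags lines with fewer than the 11 mandatory SAM
    fields (a possible sign of truncation) and reads whose CIGAR has more
    operations than a classic BAM record can store.
    """
    for line_number, line in enumerate(lines, start=1):
        if line.startswith("@"):  # header line, nothing to check
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            yield line_number, "only %d fields" % len(fields)
            continue
        n_ops = count_cigar_ops(fields[5])  # CIGAR is the 6th SAM column
        if n_ops > BAM_MAX_CIGAR_OPS:
            yield line_number, "%d CIGAR operations" % n_ops
```

Passing an open file handle for the .sam file to `check_sam_lines` and printing what it yields should show whether the line named in the samtools warning is cut short or simply carries a very long CIGAR. In the second case, an up-to-date samtools or CRAM output, as suggested above, is the thing to try.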
