Long DNA reads aligned
1
1
Entering edit mode
6.4 years ago
tarek.mohamed ▴ 370

Hi

What are the best long reads aligners? I have dna sequencing data generated by minion. Which aligners should I use, I need to detect SNPs and indels downstream

Thanks

nanopore • 3.1k views
ADD COMMENT
2
Entering edit mode
6.4 years ago

Minimap2 is the currently recommended aligner for long read DNA and cDNA/RNA sequencing. If you are specifically interested in large structural variation you should also try ngmlr.

ADD COMMENT
0
Entering edit mode

I used Minimap2 to align my minion reads to a reference genome. I then tried to sort it using picard but I got an error. Why there is no sequence for this read? $ java -jar /Users/tarekmagdyshehatamohamed/miniconda3/envs/bioinfo/share/picard-2.17.0-0/picard.jar SortSam I=BC01_minimap.sam O=BC01_aln_sorted.bam SORT_ORDER=coordinate

Exception in thread "main" htsjdk.samtools.SAMFormatException: Error parsing text SAM file. CIGAR covers 442 bases but the sequence is 0 read bases ; File BC01_minimap.sam; Line 518
Line: 726b2822-4c7f-4812-8e6f-e21f4629d92b  272 chr17   1661134 0   6S4M1D75M1D15M1I29M2D12M1I8M1D5M2I13M1I11M2I71M1D11M1D81M1D29M2D28M1I13M1I8M1I2M3D1M1D10M   *   0   0   *   *   NM:i:33 ms:i:678    AS:i:678    nn:i:0  tp:A:S  cm:i:31 s1:i:265    dv:f:0.0533

Here is the complete error message

20:45:39.629 INFO  NativeLibraryLoader - Loading libgkl_compression.dylib from jar:file:/Users/tarekmagdyshehatamohamed/miniconda3/envs/bioinfo/share/picard-2.17.0-0/picard.jar!/com/intel/gkl/native/libgkl_compression.dylib
[Tue Jul 17 20:45:39 CDT 2018] SortSam INPUT=BC01_minimap.sam OUTPUT=BC01_aln_sorted.bam SORT_ORDER=coordinate    VERBOSITY=INFO QUIET=false VALIDATION_STRINGENCY=STRICT COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=500000 CREATE_INDEX=false CREATE_MD5_FILE=false GA4GH_CLIENT_SECRETS=client_secrets.json USE_JDK_DEFLATER=false USE_JDK_INFLATER=false
[Tue Jul 17 20:45:39 CDT 2018] Executing as tarekmagdyshehatamohamed@FSMD25VN00GJ1GN on Mac OS X 10.13.5 x86_64; OpenJDK 64-Bit Server VM 1.8.0_121-b15; Deflater: Intel; Inflater: Intel; Picard version: 2.17.0-SNAPSHOT
[Tue Jul 17 20:45:39 CDT 2018] picard.sam.SortSam done. Elapsed time: 0.00 minutes.
Runtime.totalMemory()=257425408
To get help, see http://broadinstitute.github.io/picard/index.html#GettingHelp
Exception in thread "main" htsjdk.samtools.SAMFormatException: Error parsing text SAM file. CIGAR covers 442 bases but the sequence is 0 read bases ; File BC01_minimap.sam; Line 518
Line: 726b2822-4c7f-4812-8e6f-e21f4629d92b  272 chr17   1661134 0   6S4M1D75M1D15M1I29M2D12M1I8M1D5M2I13M1I11M2I71M1D11M1D81M1D29M2D28M1I13M1I8M1I2M3D1M1D10M   *   0   0   *   *   NM:i:33 ms:i:678    AS:i:678    nn:i:0  tp:A:S  cm:i:31 s1:i:265    dv:f:0.0533
    at htsjdk.samtools.SAMLineParser.reportErrorParsingLine(SAMLineParser.java:457)
    at htsjdk.samtools.SAMLineParser.parseLine(SAMLineParser.java:355)
    at htsjdk.samtools.SAMTextReader$RecordIterator.parseLine(SAMTextReader.java:268)
    at htsjdk.samtools.SAMTextReader$RecordIterator.next(SAMTextReader.java:255)
    at htsjdk.samtools.SAMTextReader$RecordIterator.next(SAMTextReader.java:228)
    at htsjdk.samtools.SamReader$AssertingIterator.next(SamReader.java:576)
    at htsjdk.samtools.SamReader$AssertingIterator.next(SamReader.java:548)
    at picard.sam.SortSam.doWork(SortSam.java:100)
    at picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:268)
    at picard.cmdline.PicardCommandLine.instanceMain(PicardCommandLine.java:98)
    at picard.cmdline.PicardCommandLine.main(PicardCommandLine.java:108)
ADD REPLY
1
Entering edit mode

That read (flag 272) is not a primary alignment, therefore the sequence is not reported (twice).

Just use samtools for sorting, and you can do it simultaneously with alignment and avoid intermediate files:

minimap2 -t 8 -a yourgenome.fa yourreads.fastq.gz | samtools sort -@8 -o youralignment.bam

In my example I used 8 threads for alignment and sorting, which you'll have to adapt for your system.

ADD REPLY
0
Entering edit mode

I am trying to call SNPs and indels, but all the tools I found are compatible with bam files generated with specific aligners. Is there a universal SNP caller that an work with minimap2. I do not want to use nanopolish since it works only with fast5 files. I am currently testing nanosv, is there any other suggestions?

ADD REPLY
0
Entering edit mode

@WouterDeCoster Sorry if this sounds silly, second command "samtools sort -@8 -o youralignment.bam" doesn't seem to work. It instead says samtools sort [options] <in.bam> <out.prefix>, whereas I have input file in .sam format. I have also tried with "minimap2 -ax map-ont -L -t 8". Is there something I missed/misunderstood? Thanks!

ADD REPLY
0
Entering edit mode

Make sure you have a reasonably recent version of samtools. It seems you have an older version, which does not support the syntax I described above.

ADD REPLY

Login before adding your answer.

Traffic: 1661 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6