Hi, I am using Pilon to polish and elongate my PacBio assembly.
However, Pilon always fails with the error shown at the bottom of this post.
I suspected that one of my .bam files was faulty, so I ran Picard ValidateSamFile on it.
The program reported that this .bam is missing read group information (↓).
## HISTOGRAM java.lang.String
Error Type Count
ERROR:MISSING_READ_GROUP 1
WARNING:RECORD_MISSING_READ_GROUP 22324694
How can I fix this .bam file? Any comments would be appreciated.
Best, Jung
Exception in thread "main" java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at com.simontuffs.onejar.Boot.run(Boot.java:340)
at com.simontuffs.onejar.Boot.main(Boot.java:166)
Caused by: java.lang.UnsupportedOperationException: Cannot query stream-based BAM file
at htsjdk.samtools.BAMFileReader.query(BAMFileReader.java:410)
at htsjdk.samtools.SamReader$PrimitiveSamReaderToSamReaderAdapter.query(SamReader.java:498)
at htsjdk.samtools.SamReader$PrimitiveSamReaderToSamReaderAdapter.query(SamReader.java:503)
at htsjdk.samtools.SamReader$PrimitiveSamReaderToSamReaderAdapter.queryOverlapping(SamReader.java:365)
at org.broadinstitute.pilon.BamFile.readsInRegion(BamFile.scala:329)
at org.broadinstitute.pilon.BamFile.recruitBadMates(BamFile.scala:357)
at org.broadinstitute.pilon.GapFiller$$anonfun$recruitJumps$1.apply(GapFiller.scala:380)
at org.broadinstitute.pilon.GapFiller$$anonfun$recruitJumps$1.apply(GapFiller.scala:379)
at scala.collection.immutable.List.foreach(List.scala:381)
at org.broadinstitute.pilon.GapFiller.recruitJumps(GapFiller.scala:379)
at org.broadinstitute.pilon.GapFiller.recruitReads(GapFiller.scala:391)
at org.broadinstitute.pilon.GapFiller.assembleAcrossBreak(GapFiller.scala:51)
at org.broadinstitute.pilon.GapFiller.fixBreak(GapFiller.scala:45)
at org.broadinstitute.pilon.GenomeRegion$$anonfun$identifyAndFixIssues$4.apply(GenomeRegion.scala:383)
at org.broadinstitute.pilon.GenomeRegion$$anonfun$identifyAndFixIssues$4.apply(GenomeRegion.scala:381)
at scala.collection.immutable.List.foreach(List.scala:381)
at org.broadinstitute.pilon.GenomeRegion.identifyAndFixIssues(GenomeRegion.scala:381)
at org.broadinstitute.pilon.GenomeFile$$anonfun$processRegions$4.apply(GenomeFile.scala:120)
at org.broadinstitute.pilon.GenomeFile$$anonfun$processRegions$4.apply(GenomeFile.scala:109)
at scala.collection.Iterator$class.foreach(Iterator.scala:893)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1336)
at scala.collection.parallel.ParIterableLike$Foreach.leaf(ParIterableLike.scala:972)
at scala.collection.parallel.Task$$anonfun$tryLeaf$1.apply$mcV$sp(Tasks.scala:49)
at scala.collection.parallel.Task$$anonfun$tryLeaf$1.apply(Tasks.scala:48)
at scala.collection.parallel.Task$$anonfun$tryLeaf$1.apply(Tasks.scala:48)
at scala.collection.parallel.Task$class.tryLeaf(Tasks.scala:51)
at scala.collection.parallel.ParIterableLike$Foreach.tryLeaf(ParIterableLike.scala:969)
at scala.collection.parallel.AdaptiveWorkStealingTasks$WrappedTask$class.internal(Tasks.scala:169)
at scala.collection.parallel.AdaptiveWorkStealingForkJoinTasks$WrappedTask.internal(Tasks.scala:443)
at scala.collection.parallel.AdaptiveWorkStealingTasks$WrappedTask$class.compute(Tasks.scala:149)
at scala.collection.parallel.AdaptiveWorkStealingForkJoinTasks$WrappedTask.compute(Tasks.scala:443)
at scala.concurrent.forkjoin.RecursiveAction.exec(RecursiveAction.java:160)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
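A note on the two symptoms in this post. ValidateSamFile says the reads carry no read group, which Picard's AddOrReplaceReadGroups can add. The `Cannot query stream-based BAM file` exception is raised by htsjdk when a region query is requested on a BAM it cannot access through an index; in practice this is often resolved by coordinate-sorting and indexing the BAM, though that is an assumption here and not something confirmed in this thread. Below is a minimal sketch of that repair, assuming Picard and samtools are installed. The file names, the `picard.jar` path, and all read-group values (`rg1`, `lib1`, `ILLUMINA`, `unit1`, `sample1`) are placeholders and must be replaced with values that describe the real library.

```python
from pathlib import Path
import subprocess


def read_group_cmd(in_bam, out_bam, picard_jar="picard.jar",
                   rgid="rg1", rglb="lib1", rgpl="ILLUMINA",
                   rgpu="unit1", rgsm="sample1"):
    """Build a Picard AddOrReplaceReadGroups command.

    All read-group values are placeholders (assumptions), not taken
    from the original post.
    """
    return [
        "java", "-jar", picard_jar, "AddOrReplaceReadGroups",
        f"I={in_bam}", f"O={out_bam}",
        f"RGID={rgid}", f"RGLB={rglb}", f"RGPL={rgpl}",
        f"RGPU={rgpu}", f"RGSM={rgsm}",
    ]


def sort_and_index_cmds(in_bam, sorted_bam):
    """Build samtools commands that coordinate-sort and then index a BAM."""
    return [
        ["samtools", "sort", "-o", sorted_bam, in_bam],
        ["samtools", "index", sorted_bam],
    ]


def has_index(bam):
    """Return True if a .bai index sits next to the BAM (x.bam.bai or x.bai)."""
    p = Path(bam)
    return Path(str(p) + ".bai").exists() or p.with_suffix(".bai").exists()


def repair(in_bam, rg_bam, sorted_bam, dry_run=True):
    """Print the repair commands, and run them only when dry_run is False."""
    cmds = [read_group_cmd(in_bam, rg_bam)] + sort_and_index_cmds(rg_bam, sorted_bam)
    for cmd in cmds:
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)
    return cmds


if __name__ == "__main__":
    # Hypothetical file names; dry_run=True only prints the commands.
    repair("reads.bam", "reads.rg.bam", "reads.rg.sorted.bam", dry_run=True)
```

After running the commands for real, `has_index("reads.rg.sorted.bam")` should return True, and the sorted, indexed file is the one to pass to Pilon. Re-running ValidateSamFile on it is a reasonable check that the read-group messages are gone.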
Hi Pierre and other experts,
I ran Picard ValidateSamFile and found the following errors about mates not being found and a missing read group.
I used TopHat instead of BWA. Is there any difference between TopHat-derived and BWA-derived BAM files?
Also, what should I do about "Mate not found" for paired-end reads? I think it should have been a warning rather than an error. Does this error have any significance for an RNA-Seq TopHat BAM file?
Thanks Adrian
Hello oriolebaltimore,
You should know that the old 'Tuxedo' pipeline of TopHat(2) and Cufflinks is no longer the "advisable" tool for RNA-seq analysis. The software is deprecated or in low maintenance and should be replaced by HISAT2, StringTie and Ballgown. See this paper: Transcript-level expression analysis of RNA-seq experiments with HISAT, StringTie and Ballgown. There are also other alternatives, including alignment with STAR or BBMap, or pseudo-alignment with Salmon.
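As a rough illustration of the suggested replacement, here is a minimal sketch that builds HISAT2, samtools and StringTie commands for one paired-end sample. It assumes those tools are installed and that a HISAT2 index already exists. The index prefix, file names, read-group ID and sample name are hypothetical. Setting `--rg-id` and `--rg` at alignment time means the BAM already has a read group, which avoids the missing read group message from ValidateSamFile.

```python
def hisat2_cmd(index_prefix, r1, r2, out_sam,
               rg_id="rg1", sample="sample1", threads=4):
    """Build a HISAT2 paired-end alignment command.

    --dta reports alignments tailored for transcript assemblers such as
    StringTie; --rg-id / --rg write an @RG header line and tag each read.
    rg_id and sample are placeholder values (assumptions).
    """
    return [
        "hisat2", "-p", str(threads), "--dta",
        "-x", index_prefix, "-1", r1, "-2", r2,
        "--rg-id", rg_id, "--rg", f"SM:{sample}",
        "-S", out_sam,
    ]


def sort_index_cmds(in_sam, sorted_bam):
    """Coordinate-sort the SAM into a BAM, then index it."""
    return [
        ["samtools", "sort", "-o", sorted_bam, in_sam],
        ["samtools", "index", sorted_bam],
    ]


def stringtie_cmd(sorted_bam, annotation_gtf, out_gtf, threads=4):
    """Build a StringTie assembly command guided by a reference annotation."""
    return [
        "stringtie", sorted_bam,
        "-G", annotation_gtf, "-o", out_gtf, "-p", str(threads),
    ]


if __name__ == "__main__":
    # Hypothetical inputs; this only prints the commands it would run.
    plan = [hisat2_cmd("genome_index", "s1_R1.fastq.gz", "s1_R2.fastq.gz", "s1.sam")]
    plan += sort_index_cmds("s1.sam", "s1.sorted.bam")
    plan.append(stringtie_cmd("s1.sorted.bam", "genes.gtf", "s1.gtf"))
    for cmd in plan:
        print(" ".join(cmd))
```

The commands are built as lists so they can be passed to `subprocess.run(cmd, check=True)` without shell quoting issues once the placeholder names are replaced.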