17 months ago
khanhlpbao
I had GATK successfully installed on my computer for a long time. I recently got a workstation and cloned my setup to an Ubuntu server VM. However, when I run a test with MarkDuplicatesSpark, it does not work the way it did on my computer. This is the log file — can anyone tell me what the error is and how to fix it?
09:07:15.341 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/mnt/test/gatk-4.2.0.0/gatk-package-4.2.0.0-local.jar!/com/intel/gkl/native/libgkl_compression.so
Jul 20, 2023 9:07:15 AM shaded.cloud_nio.com.google.auth.oauth2.ComputeEngineCredentials runningOnComputeEngine
INFO: Failed to detect whether we are running on Google Compute Engine.
09:07:15.549 INFO MarkDuplicatesSpark - ------------------------------------------------------------
09:07:15.549 INFO MarkDuplicatesSpark - The Genome Analysis Toolkit (GATK) v4.2.0.0
09:07:15.549 INFO MarkDuplicatesSpark - For support and documentation go to https://software.broadinstitute.org/gatk/
09:07:15.549 INFO MarkDuplicatesSpark - Executing as root@PostAlignmentNode on Linux v4.15.0-213-generic amd64
09:07:15.549 INFO MarkDuplicatesSpark - Java runtime: OpenJDK 64-Bit Server VM v17.0.7+7-Ubuntu-0ubuntu118.04
09:07:15.549 INFO MarkDuplicatesSpark - Start Date/Time: July 20, 2023 at 9:07:15 AM UTC
09:07:15.549 INFO MarkDuplicatesSpark - ------------------------------------------------------------
09:07:15.549 INFO MarkDuplicatesSpark - ------------------------------------------------------------
09:07:15.550 INFO MarkDuplicatesSpark - HTSJDK Version: 2.24.0
09:07:15.550 INFO MarkDuplicatesSpark - Picard Version: 2.25.0
09:07:15.551 INFO MarkDuplicatesSpark - Built for Spark Version: 2.4.5
09:07:15.551 INFO MarkDuplicatesSpark - HTSJDK Defaults.COMPRESSION_LEVEL : 2
09:07:15.551 INFO MarkDuplicatesSpark - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
09:07:15.551 INFO MarkDuplicatesSpark - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
09:07:15.551 INFO MarkDuplicatesSpark - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
09:07:15.551 INFO MarkDuplicatesSpark - Deflater: IntelDeflater
09:07:15.551 INFO MarkDuplicatesSpark - Inflater: IntelInflater
09:07:15.551 INFO MarkDuplicatesSpark - GCS max retries/reopens: 20
09:07:15.551 INFO MarkDuplicatesSpark - Requester pays: disabled
09:07:15.551 INFO MarkDuplicatesSpark - Initializing engine
09:07:15.551 INFO MarkDuplicatesSpark - Done initializing engine
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
23/07/20 09:07:15 INFO SparkContext: Running Spark version 2.4.5
23/07/20 09:07:16 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
23/07/20 09:07:16 INFO SparkContext: Submitted application: MarkDuplicatesSpark
23/07/20 09:07:16 INFO SecurityManager: Changing view acls to: root
23/07/20 09:07:16 INFO SecurityManager: Changing modify acls to: root
23/07/20 09:07:16 INFO SecurityManager: Changing view acls groups to:
23/07/20 09:07:16 INFO SecurityManager: Changing modify acls groups to:
23/07/20 09:07:16 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); groups with view permissions: Set(); users with modify permissions: Set(root); groups with modify permissions: Set()
23/07/20 09:07:16 INFO Utils: Successfully started service 'sparkDriver' on port 41987.
23/07/20 09:07:16 INFO SparkEnv: Registering MapOutputTracker
23/07/20 09:07:16 INFO SparkEnv: Registering BlockManagerMaster
23/07/20 09:07:16 INFO BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
23/07/20 09:07:16 INFO BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up
23/07/20 09:07:16 INFO DiskBlockManager: Created local directory at /tmp/blockmgr-22787b06-66c3-493a-8f19-840aad0dfe66
23/07/20 09:07:16 INFO MemoryStore: MemoryStore started with capacity 19.0 GB
23/07/20 09:07:16 INFO SparkEnv: Registering OutputCommitCoordinator
23/07/20 09:07:16 INFO Utils: Successfully started service 'SparkUI' on port 4040.
23/07/20 09:07:16 INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at http://PostAlignmentNode:4040
23/07/20 09:07:16 INFO Executor: Starting executor ID driver on host localhost
23/07/20 09:07:17 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 35963.
23/07/20 09:07:17 INFO NettyBlockTransferService: Server created on PostAlignmentNode:35963
23/07/20 09:07:17 INFO BlockManager: Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
23/07/20 09:07:17 INFO BlockManagerMaster: Registering BlockManager BlockManagerId(driver, PostAlignmentNode, 35963, None)
23/07/20 09:07:17 INFO BlockManagerMasterEndpoint: Registering block manager PostAlignmentNode:35963 with 19.0 GB RAM, BlockManagerId(driver, PostAlignmentNode, 35963, None)
23/07/20 09:07:17 INFO BlockManagerMaster: Registered BlockManager BlockManagerId(driver, PostAlignmentNode, 35963, None)
23/07/20 09:07:17 INFO BlockManager: Initialized BlockManager: BlockManagerId(driver, PostAlignmentNode, 35963, None)
09:07:17.287 INFO MarkDuplicatesSpark - Spark verbosity set to INFO (see --spark-verbosity argument)
23/07/20 09:07:17 INFO GoogleHadoopFileSystemBase: GHFS version: 1.9.4-hadoop3
23/07/20 09:07:17 WARN BlockManager: Putting block broadcast_0 failed due to exception java.lang.reflect.InaccessibleObjectException: Unable to make field transient java.lang.Object[] java.util.ArrayList.elementData accessible: module java.base does not "opens java.util" to unnamed module @5f341870.
23/07/20 09:07:17 WARN BlockManager: Block broadcast_0 could not be removed as it was not found on disk or in memory
23/07/20 09:07:17 INFO SparkUI: Stopped Spark web UI at http://PostAlignmentNode:4040
23/07/20 09:07:17 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
23/07/20 09:07:17 INFO MemoryStore: MemoryStore cleared
23/07/20 09:07:17 INFO BlockManager: BlockManager stopped
23/07/20 09:07:17 INFO BlockManagerMaster: BlockManagerMaster stopped
23/07/20 09:07:17 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
23/07/20 09:07:17 INFO SparkContext: Successfully stopped SparkContext
09:07:17.882 INFO MarkDuplicatesSpark - Shutting down engine
[July 20, 2023 at 9:07:17 AM UTC] org.broadinstitute.hellbender.tools.spark.transforms.markduplicates.MarkDuplicatesSpark done. Elapsed time: 0.04 minutes.
Runtime.totalMemory()=167772160
java.lang.reflect.InaccessibleObjectException: Unable to make field transient java.lang.Object[] java.util.ArrayList.elementData accessible: module java.base does not "opens java.util" to unnamed module @5f341870
at java.base/java.lang.reflect.AccessibleObject.checkCanSetAccessible(AccessibleObject.java:354)
at java.base/java.lang.reflect.AccessibleObject.checkCanSetAccessible(AccessibleObject.java:297)
at java.base/java.lang.reflect.Field.checkCanSetAccessible(Field.java:178)
at java.base/java.lang.reflect.Field.setAccessible(Field.java:172)
at org.apache.spark.util.SizeEstimator$$anonfun$getClassInfo$3.apply(SizeEstimator.scala:336)
at org.apache.spark.util.SizeEstimator$$anonfun$getClassInfo$3.apply(SizeEstimator.scala:330)
at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186)
at org.apache.spark.util.SizeEstimator$.getClassInfo(SizeEstimator.scala:330)
at org.apache.spark.util.SizeEstimator$.visitSingleObject(SizeEstimator.scala:222)
at org.apache.spark.util.SizeEstimator$.org$apache$spark$util$SizeEstimator$$estimate(SizeEstimator.scala:201)
at org.apache.spark.util.SizeEstimator$.estimate(SizeEstimator.scala:69)
at org.apache.spark.util.collection.SizeTracker$class.takeSample(SizeTracker.scala:78)
at org.apache.spark.util.collection.SizeTracker$class.afterUpdate(SizeTracker.scala:70)
at org.apache.spark.util.collection.SizeTrackingVector.$plus$eq(SizeTrackingVector.scala:31)
at org.apache.spark.storage.memory.DeserializedValuesHolder.storeValue(MemoryStore.scala:665)
at org.apache.spark.storage.memory.MemoryStore.putIterator(MemoryStore.scala:222)
at org.apache.spark.storage.memory.MemoryStore.putIteratorAsValues(MemoryStore.scala:299)
at org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:1165)
at org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:1156)
at org.apache.spark.storage.BlockManager.doPut(BlockManager.scala:1091)
at org.apache.spark.storage.BlockManager.doPutIterator(BlockManager.scala:1156)
at org.apache.spark.storage.BlockManager.putIterator(BlockManager.scala:914)
at org.apache.spark.storage.BlockManager.putSingle(BlockManager.scala:1481)
at org.apache.spark.broadcast.TorrentBroadcast.writeBlocks(TorrentBroadcast.scala:123)
at org.apache.spark.broadcast.TorrentBroadcast.<init>(TorrentBroadcast.scala:88)
at org.apache.spark.broadcast.TorrentBroadcastFactory.newBroadcast(TorrentBroadcastFactory.scala:34)
at org.apache.spark.broadcast.BroadcastManager.newBroadcast(BroadcastManager.scala:62)
at org.apache.spark.SparkContext.broadcast(SparkContext.scala:1489)
at org.apache.spark.rdd.NewHadoopRDD.<init>(NewHadoopRDD.scala:79)
at org.apache.spark.SparkContext$$anonfun$newAPIHadoopFile$2.apply(SparkContext.scala:1160)
at org.apache.spark.SparkContext$$anonfun$newAPIHadoopFile$2.apply(SparkContext.scala:1146)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
at org.apache.spark.SparkContext.withScope(SparkContext.scala:699)
at org.apache.spark.SparkContext.newAPIHadoopFile(SparkContext.scala:1146)
at org.apache.spark.api.java.JavaSparkContext.newAPIHadoopFile(JavaSparkContext.scala:478)
at org.disq_bio.disq.impl.file.PathSplitSource.getPathSplits(PathSplitSource.java:96)
at org.disq_bio.disq.impl.formats.bgzf.BgzfBlockSource.getBgzfBlocks(BgzfBlockSource.java:66)
at org.disq_bio.disq.impl.formats.bam.BamSource.getPathChunks(BamSource.java:125)
at org.disq_bio.disq.impl.formats.sam.AbstractBinarySamSource.getReads(AbstractBinarySamSource.java:86)
at org.disq_bio.disq.HtsjdkReadsRddStorage.read(HtsjdkReadsRddStorage.java:166)
at org.disq_bio.disq.HtsjdkReadsRddStorage.read(HtsjdkReadsRddStorage.java:127)
at org.broadinstitute.hellbender.engine.spark.datasources.ReadsSparkSource.getHeader(ReadsSparkSource.java:188)
at org.broadinstitute.hellbender.engine.spark.GATKSparkTool.initializeReads(GATKSparkTool.java:575)
at org.broadinstitute.hellbender.engine.spark.GATKSparkTool.initializeToolInputs(GATKSparkTool.java:554)
at org.broadinstitute.hellbender.engine.spark.GATKSparkTool.runPipeline(GATKSparkTool.java:544)
at org.broadinstitute.hellbender.engine.spark.SparkCommandLineProgram.doWork(SparkCommandLineProgram.java:31)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:140)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:192)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:211)
at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:160)
at org.broadinstitute.hellbender.Main.mainEntry(Main.java:203)
at org.broadinstitute.hellbender.Main.main(Main.java:289)
23/07/20 09:07:17 INFO ShutdownHookManager: Shutdown hook called
23/07/20 09:07:17 INFO ShutdownHookManager: Deleting directory /tmp/spark-9f07a197-15fa-46d8-a87a-289b746dc51b
Using GATK jar /mnt/test/gatk-4.2.0.0/gatk-package-4.2.0.0-local.jar
Running:
java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -Xmx32g -jar /mnt/test/gatk-4.2.0.0/gatk-package-4.2.0.0-local.jar MarkDuplicatesSpark -I /mnt/test/test.bam -O sort.bam -M M.txt -OBI
This is a Java version mismatch, not a problem with your GATK install. Your log shows the tool running on OpenJDK 17 (`Java runtime: OpenJDK 64-Bit Server VM v17.0.7`), but GATK 4.2.0.0 bundles Spark 2.4.5, which only supports Java 8. The `java.lang.reflect.InaccessibleObjectException: ... module java.base does not "opens java.util"` failure is the Java 9+ module system blocking Spark's reflective access to JDK internals. Install a Java 8 runtime on the VM and make sure GATK picks it up, or try using their Docker distribution, which ships with a compatible Java. They have a tutorial on how to set it up.
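As a quick sanity check before re-running, you can confirm which Java major version is on your PATH. The sketch below is illustrative (the `parse_major` helper name and the sample banner strings are my own); it just shows how `java -version` banners map to a major version, including the old `1.8.x` naming that means Java 8:

```shell
#!/bin/sh
# Extract the Java major version from a `java -version` banner line.
# Old-style banners ("1.8.0_292") report Java 8 as "1.8"; new-style
# banners ("17.0.7") lead with the major version directly.
parse_major() {
  v=$(printf '%s' "$1" | sed -n 's/.*version "\([0-9]*\)\.\([0-9]*\).*/\1.\2/p')
  case "$v" in
    1.*) echo 8 ;;            # "1.8.x" style means Java 8
    *)   echo "${v%%.*}" ;;   # "11.x", "17.x", ... lead with the major
  esac
}

parse_major 'openjdk version "17.0.7" 2023-04-18'  # prints 17 -> too new for Spark 2.4.5
parse_major 'java version "1.8.0_292"'             # prints 8  -> what GATK 4.2.0.0 expects
```

In practice you would feed it the live banner, e.g. `parse_major "$(java -version 2>&1 | head -n 1)"`, and anything other than 8 explains the crash you are seeing.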