Hi everyone :
I used the software cutadapt to remove the adapter after I received the raw data from sequencing company.(The command is:
cutadapt \
-a AGATCGGAAGAGCACACGTCTGAACTCCAGTCAC \
-A AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGTAGATCTCGGTGGTCGCCGTATCATT \
-o A01_R1.clean.fastq.gz -p A01_R2.clean.fastq.gz \
A01_R1.fastq.gz A01_R2.fastq.gz )
After I gained the clean data,I mapped short reads of WGS to the hg19 reference using BWA mem.But but the sam file of the output didn't update after run more than 24h or more longer even the Linux Server showed program still running in the background,when I used multi-thread to run the command
nohup bwa mem -t 4 -M -R "@RG\tID:A01\tSM:A01\tPL:illumina\tLB:lib1\tPU:unit1" /public1/data/reference/public/Homo_sapiens_assembly19.fasta /public1/barcode/WGS_SC/PF_data/SingleCellAnal/A01_R1.clean.fastq.gz /public1/barcode/WGS_SC/PF_data/SingleCellAnal/A01_R2.clean.fastq.gz 1>A01.sam 2>stderr/A01.bwa.stderr &.
The log of A01.bwa.stderr using the multi-thread as follows:
[M::mem_pestat] mean and std.dev: (1405.67, 1625.87)
[M::mem_pestat] low and high boundaries for proper pairs: (1, 10748)
[M::mem_pestat] analyzing insert size distribution for orientation RR...
[M::mem_pestat] (25, 50, 75) percentile: (1291, 3007, 3959)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 9295)
[M::mem_pestat] mean and std.dev: (3058.62, 2334.28)
[M::mem_pestat] low and high boundaries for proper pairs: (1, 12396)
[M::mem_pestat] skip orientation FF
[M::mem_pestat] skip orientation RF
[M::mem_pestat] skip orientation RR
Strangely,when I changed single-thread, the output was fine.It looks like OK,because the log as follows:
[M::mem_pestat] (25, 50, 75) percentile: (303, 344, 394)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (121, 576)
[M::mem_pestat] mean and std.dev: (349.67, 66.67)
[M::mem_pestat] low and high boundaries for proper pairs: (30, 667)
[M::mem_pestat] skip orientation RF as there are not enough pairs
[M::mem_pestat] skip orientation RR as there are not enough pairs
[M::mem_process_seqs] Processed 14216 reads in 6.793 CPU sec, 6.725 real sec
[main] Version: 0.7.17-r1188
[main] CMD: bwa mem -M -R @RG\tID:A01\tSM:A01\tPL:illumina\tLB:lib1\tPU:NONE
/public1/data/reference/public/Homo_sapiens_assembly19.fasta /public1/barcode/WGS_SC/PF_data/SingleCellAnal/A01_R1.clean.fastq.gz /public1/barcode/WGS_SC/PF_data/SingleCellAnal/A01_R2.clean.fastq.gz
[main] Real time: 123044.077 sec; CPU: 123276.915 sec
So,I'm confused very much if the operation of cutadapt makes the pair-end fq files incomplete.And why the single-thread seem to make no mistakes?Whether the output after run single thread command is believable? Too many questions and thanks for your time. Best
Actually you need to paste the command line you used, otherwise we can not analyze the problems. Thanks.
the command line I used is:
Do you check the nohup.out file and stderr file to see if there is any error message?
Yes,I have checked them ,but there's nothing wrong have been reported.
It is possible that a lot of stuff is happening in memory, but 24h is suspicious. What's the config on the machine you're running
bwa
on?I used 32 threads ,8 cpu cores of the Linux server to map the reads.I don't know the other config of the machine ,but if it is necessary I'll check them.Thank you!