Hello everybody,
Let me give you some background so you can understand my problem.
1- I produced a normal .BAM file.
2- I extracted ONLY the unmapped reads, which are 29M reads. It looks like this (first line):
HX6_24184:8:1205:14783:73264 141 * 0 0 55S22M74S * 0 0 TCAAGAAGTTTTAGCAGAAGAAATTCCAATGCTTTTATTATATGGAGAAATTGAAAATACAGTTTATAGACCAGAAAAATATGATTATTGGACAACTAGATATGACCATACTAAACTAGATCATCCTAAATTATCATATGTAATAAGACCA AAAAFJJJJJJJFJJJJJJJJJJJJJJJJJJJJJJJJJJJJFJJJJJFFFFJJFJJJJJJFJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJAJFJJFJJJFFJJJJJJJJJFFJAJFJJJJJJJ<JJJJFJJJJJJFFJAAFAJFJJA RX:Z:AAACACCCAATGAAAC QX:Z:-AFFFJJJJFJJJJJJ BC:Z:CGTATCGG QT:Z:AAF<AFJJ XS:f:-138 XC:Z: AC:Z: AS:f:-137 XM:A:0 AM:A:0 XT:i:0 BX:Z:AAACACCCAATGAAAC-1 RG:Z:HAP6977146:LibraryNotSpecified:1:unknown_fc:0
3- I wanted to convert the unmapped.bam to paired-end fastq files (R1, R2 and singletons).
To convert the .BAM to .fastq I used samtools fastq:
samtools fastq -T BX -s ./singletons.fastq ./phased_possorted_unmapped_bf0x4_bam.bam -1 phased_possorted_unmapped_bf0x4_R1.fastq -2 phased_possorted_unmapped_bf0x4_R2.fastq
From the output I have the following sizes and number of reads:
- phased_possorted_unmapped_bf0x4_R1.fastq --> 2.1GB and 7M reads
- phased_possorted_unmapped_bf0x4_R2.fastq --> 2.4GB and 7M reads
- singletons.fastq --> 4.9GB and 15M reads
Then I wanted to do exactly the same, but with my .BAM sorted by read name before doing this step.
I used samtools sort -n on my .bam:
samtools sort -n ./phased_possorted_unmapped_bf0x4_bam.bam -o ./phased_possorted_unmapped_bf0x4_sorted_bam.bam
Then I ran samtools fastq again on the sorted file, and these are the results:
- phased_possorted_unmapped_bf0x4_sorted_R1.fastq --> 3.7GB and 12.5M reads
- phased_possorted_unmapped_bf0x4_sorted_R2.fastq --> 4.2GB and 12.5M reads
- sorted_singletons.fastq --> 1.4GB and 4M reads
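For reference, here is a minimal sketch of how read counts like the ones above can be obtained. It assumes every FASTQ record is exactly four lines (no wrapped sequences), and the function name count_fastq_reads is only illustrative, not part of samtools.

```shell
# Print "<file>\t<number of reads>" for each FASTQ file given as an argument.
# Assumption: each record is exactly four lines (header, sequence, '+', qualities).
count_fastq_reads() {
    for f in "$@"; do
        awk -v name="$f" 'END { printf "%s\t%d\n", name, NR / 4 }' "$f"
    done
}

# Example call (file names taken from the commands in this post):
#   count_fastq_reads phased_possorted_unmapped_bf0x4_R1.fastq singletons.fastq
```

Adding up the rounded counts, both runs account for roughly the same 29M reads in total (7M + 7M + 15M versus 12.5M + 12.5M + 4M). Only the split between paired and singleton output changes.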
I don't understand why the number of reads in each fastq file differs between the two runs. Sorting should not modify anything, so I should get the same number of reads regardless of the sorting.
I would appreciate it if someone could explain what is happening.
guillepalou4: Please don't change the tag on this question back to tool. The tool tag is generally reserved for posts that are about new tools (not for questions, like this one, about existing tools).
Hey, sorry, I didn't put it back. I just updated my post because I had an error in it. Thanks for the advice though.
Gah, you rebroke the formatting :(