Hi!
I have paired-end Illumina RNA-Seq data in a single .fastq file. I want to compare alignment quality between the bwa aln and bwa mem algorithms, but I don't know how to use aln with this type of data. I know that for bwa mem I just have to set the '-p' parameter. However, I cannot find how to declare that all reads are in one file for aln; if I don't set any additional parameters, the quality control shows that a single-end alignment was performed.
If anyone could give me a tip or explain where I could find the answer (I didn't find it here: http://bio-bwa.sourceforge.net/bwa.shtml#3, or maybe I just don't quite understand the description :) ), I would highly appreciate it.
Agata
If you have a file with interleaved reads, they can easily be split into the two constituent files using reformat.sh from the BBMap suite. To be safe, you should verify that the reads are correctly paired before splitting them.
Thank you, I will definitely try it :)
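For anyone who wants to see what "verify the pairing, then split" means in practice, here is a minimal Python sketch. It is not how reformat.sh works internally; it only illustrates the idea. It assumes plain four-line FASTQ records, mates stored as consecutive records, and read names that match once a trailing /1 or /2 and any comment field are dropped. The function names (read_fastq, mate_key, deinterleave) are made up for this example.

```python
from itertools import islice


def read_fastq(lines):
    """Yield (header, seq, plus, qual) tuples from an iterable of FASTQ lines.

    Assumes strict four-line records (no wrapped sequences).
    """
    it = iter(lines)
    while True:
        record = [line.rstrip("\n") for line in islice(it, 4)]
        if not record:
            return
        if len(record) != 4:
            raise ValueError("truncated FASTQ record")
        yield tuple(record)


def mate_key(header):
    """Return the read name without the comment field or a trailing /1 or /2."""
    name = header[1:].split()[0]
    if name.endswith(("/1", "/2")):
        name = name[:-2]
    return name


def deinterleave(lines):
    """Split interleaved records into two lists, checking that names pair up."""
    records = list(read_fastq(lines))
    if len(records) % 2:
        raise ValueError("odd number of records; input is not interleaved pairs")
    mates1, mates2 = [], []
    for r1, r2 in zip(records[0::2], records[1::2]):
        if mate_key(r1[0]) != mate_key(r2[0]):
            raise ValueError(f"names do not pair: {r1[0]} vs {r2[0]}")
        mates1.append(r1)
        mates2.append(r2)
    return mates1, mates2


if __name__ == "__main__":
    example = [
        "@read1/1\n", "ACGT\n", "+\n", "IIII\n",
        "@read1/2\n", "TTGA\n", "+\n", "IIII\n",
    ]
    first, second = deinterleave(example)
    print(first)
    print(second)
```

Once the two mate files exist, they can be given to bwa aln separately and the results combined with bwa sampe. For real data sets a dedicated tool such as reformat.sh is the better choice, since this sketch loads all records into memory.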
Most aligners assume your paired-end data is in two FASTQ files. Why do you have it in one file?
Data from SRA often contain paired-end FASTQs as a single file. They can be separated on download using one of the "--split" options of fastq-dump (for example "--split-files").
This is somewhat different. In SRA files the paired-end reads are joined. In OP's case, the fastq files have interleaved paired-end reads (this seems to be more common in the metagenomics community for some reason).
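To make the difference concrete: in a joined record both mates sit in one sequence line, so splitting means cutting each record in two, whereas interleaved mates are already separate records. A rough Python sketch of cutting a joined record follows. It assumes the length of the first mate is known and passed in as len1; the function name split_joined and the /1 and /2 suffixes are choices made for this example only.

```python
def split_joined(header, seq, qual, len1):
    """Cut one joined record into two mates at position len1.

    len1 (length of the first mate) is an assumed, user-supplied value.
    """
    if len(seq) != len(qual):
        raise ValueError("sequence and quality lengths differ")
    if not 0 < len1 < len(seq):
        raise ValueError("len1 must fall inside the joined read")
    name = header.split()[0]  # drop the comment field, keep '@name'
    mate1 = (name + "/1", seq[:len1], "+", qual[:len1])
    mate2 = (name + "/2", seq[len1:], "+", qual[len1:])
    return mate1, mate2


if __name__ == "__main__":
    m1, m2 = split_joined("@SRR0.1 example", "ACGTTTGA", "IIIIJJJJ", 4)
    print(m1)
    print(m2)
```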
I actually cannot answer this question. I got the data from my supervisor and until now I hadn't even thought about it :) I don't know about the standards, but I will definitely ask about it at the next consultation.