8.6 years ago
gsuryawanshi
I am running the following command to convert the paired-end .sai output of bwa aln into a single SAM file:
bwa sampe ref.fa read1.sai read2.sai read1.fastq read2.fastq >read.sam
and I am getting the following error:
[bwa_sai2sam_pe_core] convert to sequence coordinate...
[fread] Unexpected end of file
The SAM file has headers but no sequences. When I use samse to convert read1.sai and read2.sai into separate read1.sam and read2.sam files, it works fine. I also took the first 50 sequences from read1.fastq and read2.fastq, aligned them, and converted the result to a SAM file with sampe, and that worked too. So I am not sure why I am getting this error. I would really appreciate any help. Thanks.
Mis-paired reads in your two fastq files may be the problem. Did you trim/process the two files separately (instead of using a paired-end aware trimming program)?
Thanks for the reply. I trimmed both reads simultaneously, so there are no mis-paired reads. However, it is possible that during trimming the entire sequence of a read was trimmed away, leaving the read empty (a read name with no sequence). When I deleted the sequence of one read (out of 50 in my test fastq file) while keeping its read name, I got the same error as with the full data. So my guess is that sampe has a problem when the sequence length is 0. I am going through the source code to find out whether there is a way to change the minimum sequence length.
Before you spend time looking through the code, verify the number of reads in your trimmed files (taking into account the 4-line fastq record). Trimming programs would not leave 0-base reads behind (with just the ID) unless you used non-standard means to trim the data.
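For the check suggested above, something along these lines could count the records and list any zero-length reads in one pass. This is only a minimal sketch: it assumes strict 4-line fastq records with no wrapped sequence lines and no trailing blank lines, and the function names `read_fastq` and `summarize_fastq` are made up for illustration.

```python
from itertools import islice


def read_fastq(handle):
    """Yield (name, seq, qual) from a strict 4-line-per-record fastq stream."""
    while True:
        # Take the next 4 lines; an empty list means end of file.
        record = [line.rstrip("\n") for line in islice(handle, 4)]
        if not record:
            return
        if len(record) < 4:
            # Assumption: anything other than a multiple of 4 lines is an error.
            raise ValueError("truncated fastq record: " + record[0])
        name, seq, _plus, qual = record
        yield name, seq, qual


def summarize_fastq(handle):
    """Return (total_records, names_of_zero_length_reads)."""
    total = 0
    empty = []
    for name, seq, _qual in read_fastq(handle):
        total += 1
        if len(seq) == 0:
            empty.append(name)
    return total, empty
```

Calling `summarize_fastq(open("read1.fastq"))` and then the same for read2.fastq should give equal totals for a properly paired data set, and the second element of each result shows which read names have no bases left.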
I did the read count; there are no missing reads. I wrote the trimming script myself, and it keeps a read trimmed to 0 bases if its mate is longer than 25 bp. This causes no issue when I use samse.
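If the 0-base reads do turn out to be what sampe chokes on, one possible workaround is to drop every pair in which either mate is empty before running bwa aln, so the two files stay in sync. The sketch below is untested against real bwa output and rests on assumptions: both files hold the same number of strict 4-line records in the same order, and `fastq_records`, `filter_pairs` and `min_len` are hypothetical names, not part of bwa or any trimming tool.

```python
from itertools import islice


def fastq_records(handle):
    """Yield each fastq record as a list of its 4 raw lines."""
    while True:
        record = list(islice(handle, 4))
        if not record:
            return
        if len(record) < 4:
            raise ValueError("truncated fastq record")
        yield record


def filter_pairs(in1, in2, out1, out2, min_len=1):
    """Write only pairs whose mates both have >= min_len bases.

    Returns (kept, dropped). Assumes in1 and in2 list mates in the same order.
    """
    kept = dropped = 0
    for rec1, rec2 in zip(fastq_records(in1), fastq_records(in2)):
        # Line index 1 of each record is the sequence line.
        if len(rec1[1].strip()) >= min_len and len(rec2[1].strip()) >= min_len:
            out1.writelines(rec1)
            out2.writelines(rec2)
            kept += 1
        else:
            dropped += 1
    return kept, dropped
```

The filtered files would then need to go through bwa aln again before sampe, since the existing .sai files still describe the unfiltered reads. Dropping the whole pair loses the surviving mate; those mates could instead be written to a separate file and aligned as single-end reads with samse, which the thread reports handles them without trouble.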
Can you try this:
Also make sure you indexed the reference.