I am looking for advice on the output of SolexaQA++ lengthsort -c when preprocessing my RNA-Seq data.
Having already used SolexaQA++ dynamictrim to trim by read quality, I then wanted to remove any resulting short reads using SolexaQA++ lengthsort and, as the data is paired-end, I used the -c flag.
Here are the relevant lines from my bash script:
path1="/PATH/TO/trim/Sample1_R1"
path2="/PATH/TO/trim/Sample1_R2"
SolexaQA++ lengthsort -c -l 36 -d "/PATH/TO/sort" $path1$".fastq.trimmed.gz", $path2$".fastq.trimmed.gz"
I expected six resulting files: paired-end, singleton and discard for each input file (R1 and R2). However, only two were produced: Sample1_R2.trimmed.gz.clean and Sample1_R2.trimmed.gz.paired. What happened to R1?
Has something gone wrong? If so, how? And if not, what do these files contain?
EDIT:
If it helps, the input files are trimmed FASTQ files. Here are the first 8 lines of Sample1_R1.fastq.trimmed when unzipped:
@HWI-7001326F:29:C732HANXX:8:1101:1258:1926 1:N:0:ATCACGAT
TCATGAGAAAAGGAACTCCGTCTCATCTGGCATTGCCAATAAAC
+
FFFFFFFFFFFF<FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
@HWI-7001326F:29:C732HANXX:8:1101:1457:1985 1:N:0:ATCACGAT
CAACAACTTTGAAGGGTCTTGAAAGGGCAGGTAGTCCTCTAACTGAAGATTTCTCAACTCTAAAAGGAGTTGGTTTCAAACTCACAGAAGCCATAACTGAAGAGATCGGAAGAGCACACG
+
BBBBBFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFBFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFB
The output files both appear to be empty.
This was the end of the terminal output:
...
Cleaned from read 2: @HWI-7001326F:29:C732HANXX:8:2316:20703:101335
Cleaned from read 2: @HWI-7001326F:29:C732HANXX:8:2316:20624:101360
Cleaned from read 2: @HWI-7001326F:29:C732HANXX:8:2316:20904:101266
Cleaned from read 2: @HWI-7001326F:29:C732HANXX:8:2316:20879:101307
Cleaned from read 2: @HWI-7001326F:29:C732HANXX:8:2316:20776:101348
Cleaned from read 2: @HWI-7001326F:29:C732HANXX:8:2316:20940:101369
Paired reads were written to:
/PATH/TO/sort/.clean
/PATH/TO/sort/C732HANXX-1721-01-01-01_L008_R2.fastq.trimmed.gz.clean
100% [==================================================]
Writing files...
Why has this happened?
Perhaps you could tell us what they contain? In particular, the first 8 lines of each file would be helpful, as would the number of input reads and the number of reads in each output file, and of course anything the program printed to the screen.
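A minimal sketch of how that information could be gathered from the shell. The helper names (count_reads, first_records, show_args) are hypothetical and not part of SolexaQA++. The sketch assumes four-line FASTQ records and that gzip is installed. Reading through gzip -cdf on standard input decompresses gzip data and passes uncompressed data through unchanged, so the same helpers should work on both the .gz inputs and the output files.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for inspecting FASTQ files; not part of SolexaQA++.

# Count records, assuming exactly four lines per record (no wrapped sequences).
count_reads() {
    echo $(( $(gzip -cdf < "$1" | wc -l) / 4 ))
}

# Print the first two records (8 lines) of a plain or gzip-compressed file.
first_records() {
    gzip -cdf < "$1" | head -n 8
}

# Print each argument in brackets, one per line, exactly as the shell passes it.
show_args() {
    printf '[%s]\n' "$@"
}

# Example usage with the paths from the question (commented out; adjust to real files):
# count_reads "/PATH/TO/trim/Sample1_R1.fastq.trimmed.gz"
# first_records "/PATH/TO/trim/Sample1_R2.fastq.trimmed.gz"
# show_args $path1$".fastq.trimmed.gz", $path2$".fastq.trimmed.gz"
```

Note that in bash a comma is not an argument separator, so the last example would print the first file name with a trailing comma attached. Whether that is related to the missing R1 output is something to check rather than assume.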