Dear all,
I have large FASTQ files (about 6-7 GB each) whose headers are not compatible with Trinity; the following error appeared when I ran it:
Error, pairs.K25.stats is empty. Be sure to check your fastq reads and ensure that the read names are identical except for the /1 or /2 designation
The headers in my left FASTQ file look like this:
@D69F08P1:337:C4GGBACXX:2:1101:1378:2161 1:N:0:ATGTCA
GGGGGCAGTGACCAGCTGGACTGAGCTCATAGTATGGAGCTTCAGGCAGCTGCCTTTTCACCTGAAGGCCCCAGATAGGCTTAACCCAGGGGTCTTCTTT
+
BBBFFFFFF<FFFIIIIIIIIIIIIIIIIIIIBFFIIIFIIIIIIIIIIIIIIIIIIIIIIIIFFFFFFFFFFFFFFFFFFFFFFFFFFFFF77BBFFFF
@D69F08P1:337:C4GGBACXX:2:1101:1497:2202 1:N:0:ATGTCA
TATGGATGGTGATCACTCAGGCTGAAACCCCCAGCAAGGAATCTTTGGATGAGGGCCAGCTGAGATCTCTCTTGGTCGGAGTATGCATCCATGATCATGG
and the headers in the right FASTQ file look like this:
D69F08P1:337:C4GGBACXX:2:1101:1378:2161 2:N:0:ATGTCA
GTTCCAATCTGTCTCATGTATGGAAAAGAAGACCCCTGGGTTAAGCCTATCTGGGGCCTTCAGGTGAAAAGGCAGCTGCCTGAAGCTCCATACTATGAGC
+
BBBFFFFFFFFFFIIIIIFIIIIIIIIFFIIIIIIIIIIIFIIIIFIIIIIIIIIIIIIIIIIIFFIIFFFFFFFFFFFFFFFFBBFFFFFFFFFFFFFF
@D69F08P1:337:C4GGBACXX:2:1101:1497:2202 2:N:0:ATGTCA
CGAATATAGAGAAAGCCAGCAGACTTGCCAATGTGCCTTCTGGGATGAAGACAATTCCGACGACTTAGTCTCCCAATCTGCCCAGGAATCAGATCAATGG
I think the header (for the left reads, say) should be changed to @D69F08P1:337:C4GGBACXX:2:1101:1378:2161:N:0:ATGTCA/1,
which would be acceptable to Trinity. Am I right? Could you please share commands for doing this? Thanks so much in advance.
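For illustration, here is a minimal Python sketch of the kind of header rewrite being asked about. It is not the command from any reply in this thread. It assumes the two-field Illumina header shown above (read ID, a space, then a field starting with the read number 1 or 2) and rewrites it to the simpler form @ID/1 or @ID/2, dropping the rest of the second field. That output form and the function names to_slash_header and convert_fastq_lines are my own assumptions.

```python
def to_slash_header(header: str) -> str:
    """Turn '@ID 1:N:0:ATGTCA' into '@ID/1'; return anything else unchanged."""
    if not header.startswith("@"):
        return header
    parts = header.split()
    if len(parts) < 2:
        return header
    # The second field starts with the read number (1 = left, 2 = right).
    read_no = parts[1].split(":", 1)[0]
    if read_no not in ("1", "2"):
        return header
    return f"{parts[0]}/{read_no}"


def convert_fastq_lines(lines):
    """Yield FASTQ lines, rewriting only the first line of each 4-line record.

    Counting lines (rather than testing for a leading '@') matters because
    quality strings may also start with '@'.
    """
    for i, line in enumerate(lines):
        line = line.rstrip("\n")
        yield to_slash_header(line) if i % 4 == 0 else line
```

Applied to the first left-read header above, to_slash_header would give @D69F08P1:337:C4GGBACXX:2:1101:1378:2161/1, and the matching right read would end in /2, so the two names differ only in the suffix.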
To save to file:
Thanks a lot, Geek. It works fine with uncompressed files, but I don't know why it doesn't work with gzipped files, even when I put "gunzip filename | " before your command! I'm dealing with many large files, so any suggested command to solve the problem would be really appreciated.
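A note on the gzip problem: "gunzip filename" decompresses the file in place and writes nothing to standard output, so the pipe after it receives no data. "gunzip -c filename" (or "zcat filename") writes the decompressed data to standard output instead. As an alternative that avoids the pipe altogether, the sketch below streams a plain or gzipped FASTQ file through Python's gzip module. The names open_fastq and rewrite_headers are hypothetical, and fix_header stands for whatever header-rewriting function you supply.

```python
import gzip


def open_fastq(path, mode="rt"):
    """Open a FASTQ file as text; use gzip when the name ends in '.gz'."""
    if str(path).endswith(".gz"):
        return gzip.open(path, mode)
    return open(path, mode)


def rewrite_headers(src, dst, fix_header):
    """Stream src to dst line by line, so large files never sit in memory.

    fix_header (caller-supplied, hypothetical) is applied to the first
    line of every 4-line FASTQ record; all other lines are copied as-is.
    """
    with open_fastq(src, "rt") as fin, open_fastq(dst, "wt") as fout:
        for i, line in enumerate(fin):
            if i % 4 == 0:
                line = fix_header(line.rstrip("\n")) + "\n"
            fout.write(line)
```

Because the output is opened by the same helper, giving dst a name ending in .gz keeps the result compressed, which helps when there are many large files.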
Did you read the Google thread? I recently used Trinity on raw FASTQ files from an Illumina HiSeq1000 and it worked fine, without any error regarding /1 or /2. Regarding gz files, use