5.8 years ago
priya120195
My raw files look like this:
@ERX123456.1.1 HWI-ST1018:118:D1RUFACXX:6:1101:1217:2216 length=101
NGGTCAAGAGCGCTTCCACCAACGCACAGCTGGTTGCTGAGGACATCGCTCGTCAGCTGGAGAACCGTGTGACCTTCCGCCGTGCTATGAAGCAGTGCATG
+ERX123456.1.1 HWI-ST1018:118:D1RUFACXX:6:1101:1217:2216 length=101
#0<BFFFFFFFFFIIFFFIIFIIIIIFFIIIIIFFIIIIIIIIIIIIIIIIIFFFFFFFFBFFF<BBFFFFFBBBBFFBFFFFFBBFFFFFFFFFFFFBFF
@ERX123456.2.1 HWI-ST1018:118:D1RUFACXX:6:1101:1642:2219 length=101
NATCACGTCCATAGGATAGTAGTTCATTCTCGCTATTTCAAGCGTTATCTGCTCGACCTCTTCGTCGGAGAAATGCTTGAATACTTCGGACGCATTCTCCG
+ERX123456.2.1 HWI-ST1018:118:D1RUFACXX:6:1101:1642:2219 length=101
#0<BFFFFFFFFFIIIIIIIIIFIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIFIIIIIIIFFFFFFBFFFFFBFFFFFFFFFFFFFFBFFFFFFFB<
@ERX123456.3.1 HWI-ST1018:118:D1RUFACXX:6:1101:1725:2240 length=101
NAGCTTCCGGGAAAAAAAGCGGAAAGCCAAAGAAGCCGGGAAATCTTGCGGCAGATGCACTTTCAGCCAAGGCACATCAGAGTGTGCGCAATGCCGACCAG
When I remove the sample ID and the length from the annotation line, prinseq works fine for paired-end reads. Please suggest how to write a Perl script that removes the sample ID and the length from the annotation lines of the entire file.
Perl is terrible, I suspect
awk '{print $1}' input.fastq > output.fastq
would both work and be much faster.
Depends on what priya120195 means by "sample ID". If that refers to ERX123456, then this would not work. In that case this should work
BTW: This is an interleaved paired-end file. You probably realize that.
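If "sample ID" does mean the ERX123456.1.1 part, one way to drop both it and the length is sketched below. This is only an illustration, not necessarily the command meant in the reply: the function name strip_fastq_ids is made up, and it assumes strict four-line FASTQ records (no wrapped sequence lines) with headers laid out as "@accession read-name length=N", as in the excerpt. Lines are picked by position rather than by a leading "@", because a quality string can also start with "@".

```shell
# Hypothetical helper: reads FASTQ on stdin, writes cleaned FASTQ on stdout.
# Assumes 4-line records and headers like
#   @ERX123456.1.1 HWI-ST1018:118:D1RUFACXX:6:1101:1217:2216 length=101
strip_fastq_ids() {
    awk '
        NR % 4 == 1 { print "@" $2; next }  # header: keep only the second field
        NR % 4 == 3 { print "+"; next }     # separator: a bare "+" is valid FASTQ
        { print }                           # sequence and quality lines unchanged
    '
}
```

It would be run as `strip_fastq_ids < input.fastq > output.fastq`. Each header becomes e.g. `@HWI-ST1018:118:D1RUFACXX:6:1101:1217:2216`, so both mates of a pair end up with the same name; whether prinseq accepts that for your data is worth checking on a small subset first.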
Hello priya120195,
Please use the formatting bar (especially the code option) to present your post better. I've done it for you this time.
Thank you!