Hi members,
I am using a Perl program for transposable element analysis, which requires the unmapped fastq reads from Bismark output. My original raw data was paired-end, and I used the following command line for Bismark mapping:
bismark_v0.21.0/bismark ~/Bismark/Genome -bowtie2 --ambiguous --non_directional -unmapped -R 10 score_min L,0,0.6 -N 1 -1 Epiril_22C_R1_L1/Epiril368_22_C_R1_L001_R1_val_1.fq.gz -2 Epiril_22C_R1_L1/Epiril368_22_C_R1_L001_R2_val_1.fq.gz -o New_Output/Epiril368_22C_Rep1_L1.bam
This command gave me unmapped read 1 and read 2 files, which I concatenated as follows:
cat unmapped.read1.fq unmapped.read2.fq > unmapped.fq
Now, when I run the Perl program epiTEome, it gives me this error:
perl epiTEome.pl -gff Analysis/tair10TEs.gff3 -ref Analysis/TAIR10_chr_all.epiTEome.masked.fasta -un trial/Epiril368_4_C_R1_L001_R1_val_1.fq.gz_unmapped_reads_1.fq -t Analysis/teid.lst
Possible precedence issue with control flow operator at epiTEome.pl line 482.
INFO epiTEome.pl Fri Jun 28 09:49:46 2019 Start program!
INFO epiTEome.pl Fri Jun 28 09:49:46 2019 Run Module: readGffFile!
INFO epiTEome.pl Fri Jun 28 09:49:49 2019 STEP 1: read ends mapping.
INFO epiTEome.pl Fri Jun 28 09:49:49 2019 Run Module: splitFastq!
------------- EXCEPTION: Bio::Root::Exception -------------
MSG: Missing sequence and/or quality data; line: 4
STACK: Error::throw
STACK: Bio::Root::Root::throw /usr/local/share/perl/5.26.1/Bio/Root/Root.pm:449
STACK: Bio::SeqIO::fastq::next_dataset /usr/local/share/perl/5.26.1/Bio/SeqIO/fastq.pm:121
STACK: main::splitFastq epiTEome.pl:501
STACK: main::main epiTEome.pl:93
STACK: epiTEome.pl:55
What I understand from the error is that there is a problem in my unmapped.fq file.
Any suggestions on how to quality check my unmapped fastq files?
Thank you! Regards
Run the validateFiles utility from Jim Kent's UCSC tools (after downloading it, add execute permissions with chmod a+x validateFiles) to make sure your fastq files are in the proper format.
Are you sure that the Perl program you are using expects a combined file (like the one you made, with R2 appended to the end of the R1 file)?
Thank you for the prompt suggestions.
Here is the report from validateFiles:
Yes, the program can take concatenated fastq files; however, the error occurs even when I use a file without merging.
The head of the fastq file is:
Thank you! Regards
As you can see, somehow your sequence (line 2) has been appended to the end of line 1 (the fastq header).
How did that happen?
A good fastq record should look like this.
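For reference, a well-formed record in the common four-line layout has a header starting with @, the sequence, a separator line starting with +, and a quality string of the same length as the sequence. A minimal Python sketch for finding where a file departs from that layout is below. The function name check_fastq and the read names and sequences in the demo are made up for illustration (they are not taken from the poster's data), and the check assumes unwrapped four-line records rather than multi-line FASTQ.

```python
import io


def check_fastq(handle, max_errors=5):
    """Return up to max_errors (line_number, message) tuples for a 4-line FASTQ stream."""
    errors = []
    line_no = 0
    while len(errors) < max_errors:
        # Read a fixed block of four lines; quality strings may themselves
        # begin with '@' or '+', so searching for '@' alone is not reliable.
        record = [handle.readline() for _ in range(4)]
        if not record[0]:
            break  # clean end of file
        header, seq, plus, qual = (line.rstrip("\r\n") for line in record)
        if not header.startswith("@"):
            errors.append((line_no + 1, "header does not start with '@'"))
        if not seq:
            errors.append((line_no + 2, "empty sequence line"))
        if not plus.startswith("+"):
            errors.append((line_no + 3, "separator line does not start with '+'"))
        if len(seq) != len(qual):
            errors.append((line_no + 4, "sequence and quality lengths differ"))
        line_no += 4
    return errors[:max_errors]


# Hypothetical records: one well formed, one with the sequence fused into the header.
good = "@read1\nACGTN\n+\nIIII#\n"
fused = "@read1 ACGTN\n+\nIIII#\n"

good_errors = check_fastq(io.StringIO(good))
fused_errors = check_fastq(io.StringIO(fused))
print(good_errors)
print(fused_errors)
```

To scan a real file, pass an open text handle instead of the io.StringIO demo, for example the handle from open("unmapped.fq"), or from gzip.open(path, "rt") for a .fq.gz input. Running it on the trimmed input files as well as on the unmapped output shows at which step the records first go wrong.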
That is a strange choice of index (if CAAAAN is real).
Please use the formatting bar (especially the code option) to present your post better. I've done it for you this time. Thank you!
Many thanks for your suggestions and corrections. I will make sure to use the formatting bar in future.
I don't have any idea how it happened. I think I made some mistake while mapping the reads.
If you have any suggestions for correcting it, please let me know. Otherwise I will have to start again from indexing the genome and mapping.
Thank you! Regards
I am not sure what to tell you. If your original files were fine, then just running bismark should not have done this. You would need to backtrack and redo things as needed.