Hi! Members,
I am using a Perl program for transposable element analysis, which requires the unmapped fastq reads from Bismark output. My original raw data was paired-end, and I used the following command line for Bismark mapping:
bismark_v0.21.0/bismark ~/Bismark/Genome -bowtie2 --ambiguous --non_directional -unmapped -R 10 score_min L,0,0.6 -N 1 -1 Epiril_22C_R1_L1/Epiril368_22_C_R1_L001_R1_val_1.fq.gz -2 Epiril_22C_R1_L1/Epiril368_22_C_R1_L001_R2_val_1.fq.gz -o New_Output/Epiril368_22C_Rep1_L1.bam
The output from this command gave me unmapped read1 and read2 files, which I concatenated as:
cat unmapped.read1.fq unmapped.read2.fq > unmapped.fq
Now, when I run the Perl program epiTEome, it gives me an error:
perl epiTEome.pl -gff Analysis/tair10TEs.gff3 -ref Analysis/TAIR10_chr_all.epiTEome.masked.fasta -un trial/Epiril368_4_C_R1_L001_R1_val_1.fq.gz_unmapped_reads_1.fq -t Analysis/teid.lst
Possible precedence issue with control flow operator at epiTEome.pl line 482.
INFO epiTEome.pl Fri Jun 28 09:49:46 2019 Start program!
INFO epiTEome.pl Fri Jun 28 09:49:46 2019 Run Module: readGffFile!
INFO epiTEome.pl Fri Jun 28 09:49:49 2019 STEP 1: read ends mapping.
INFO epiTEome.pl Fri Jun 28 09:49:49 2019 Run Module: splitFastq!
------------- EXCEPTION: Bio::Root::Exception -------------
MSG: Missing sequence and/or quality data; line: 4
STACK: Error::throw
STACK: Bio::Root::Root::throw /usr/local/share/perl/5.26.1/Bio/Root/Root.pm:449
STACK: Bio::SeqIO::fastq::next_dataset /usr/local/share/perl/5.26.1/Bio/SeqIO/fastq.pm:121
STACK: main::splitFastq epiTEome.pl:501
STACK: main::main epiTEome.pl:93
STACK: epiTEome.pl:55
What I understand from the error is that there is a problem in my unmapped.fq file.
Any suggestions on how to quality-check my unmapped fastq files?
Thank you! Regards
Run the validateFiles utility from Jim Kent's UCSC tools (after downloading, add execute permissions with chmod a+x validateFiles) to make sure your fastq files are in proper format.
Are you sure that the Perl program you are using expects a combined file (like the one you made, with R2 appended at the end of the R1 file)?
Thank you for the prompt suggestions.
Here is the report from validateFiles:
Yes, the program can take concatenated fastq files; however, the error occurs even when I use a file without merging.
The head of the fastq file is:
Thank you! Regards
As you can see somehow your sequence (line 2) has gotten appended at the end of line 1 (fastq header).
How did that happen?
A good fastq record should look like this.
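For reference, each FASTQ record is four lines: a header starting with @, the sequence, a separator starting with +, and a quality string the same length as the sequence. Below is a minimal Python sketch of that structural check, in case validateFiles is not at hand. The function name check_fastq and the error messages are made up for illustration, and it assumes unwrapped (single-line) sequences, which is what Bismark and most current tools write.

```python
import sys
from typing import Iterable, List, Tuple


def check_fastq(lines: Iterable[str]) -> List[Tuple[int, str]]:
    """Return (line_number, message) for each structural problem found.

    Assumes 4-line records: @header, sequence, +separator, quality.
    """
    problems: List[Tuple[int, str]] = []
    record: List[str] = []
    line_no = 0
    for line_no, raw in enumerate(lines, start=1):
        record.append(raw.rstrip("\r\n"))
        if len(record) < 4:
            continue
        header, seq, plus, qual = record
        first = line_no - 3  # line number of this record's header
        if not header.startswith("@"):
            problems.append((first, "header does not start with '@'"))
        if not seq:
            problems.append((first + 1, "empty sequence line"))
        if not plus.startswith("+"):
            problems.append((first + 2, "separator does not start with '+'"))
        if len(seq) != len(qual):
            problems.append((first + 3, "sequence and quality lengths differ"))
        record = []
    if record:
        # Fewer than 4 lines were left over at the end of the input.
        problems.append((line_no - len(record) + 1, "incomplete record at end of file"))
    return problems


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python check_fastq.py unmapped.fq
    with open(sys.argv[1]) as handle:
        for number, message in check_fastq(handle):
            print(f"line {number}: {message}")
```

A record whose sequence has been glued onto the header line has only three lines, so every following record is shifted and the check starts reporting problems from that point on.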
That is a strange choice of index (if CAAAAN is real).
Please use the formatting bar (especially the code option) to present your post better. I've done it for you this time. Thank you!
Many thanks for your suggestions and corrections! I will make sure to use the formatting bar in future.
I have no idea how it happened. I think I made a mistake while mapping the reads.
If you have any suggestions to correct it, please let me know. Otherwise I will have to start again from indexing the genome and mapping.
Thank you! Regards
I am not sure how to tell you. If your original files were fine, then just running bismark should not have done this. You would need to backtrack and re-do things as needed.
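One cheap way to backtrack is to count lines in the fastq file from each step (raw, trimmed, unmapped) and see where the count first stops being a multiple of four, or where R1 and R2 stop agreeing. A rough Python sketch follows; the helper names open_fastq, count_records and mates_consistent are hypothetical, and it only looks at line counts, not at the content of each record.

```python
import gzip
from typing import Iterable, TextIO, Tuple


def open_fastq(path: str) -> TextIO:
    """Open a plain or gzip-compressed fastq file in text mode."""
    return gzip.open(path, "rt") if path.endswith(".gz") else open(path, "r")


def count_records(lines: Iterable[str]) -> Tuple[int, int]:
    """Return (complete_records, leftover_lines), assuming 4 lines per record."""
    total = sum(1 for _ in lines)
    return divmod(total, 4)


def mates_consistent(r1_lines: Iterable[str], r2_lines: Iterable[str]) -> bool:
    """True if both mates hold the same number of complete records and no leftovers."""
    r1 = count_records(r1_lines)
    r2 = count_records(r2_lines)
    return r1 == r2 and r1[1] == 0
```

For example, count_records(open_fastq(path)) can be run on the trimmed R1 and R2 inputs and then on the two unmapped-read files; the first step that leaves leftover lines is where to look.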