When aligning some cleaned Illumina short reads to a reference, I am getting "skip orientation" lines in the BWA mem output, e.g.:
[M::mem_pestat] skip orientation FF as there are not enough pairs
[M::mem_pestat] analyzing insert size distribution for orientation FR...
[M::mem_pestat] (25, 50, 75) percentile: (169, 207, 262)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 448)
[M::mem_pestat] mean and std.dev: (220.19, 67.49)
[M::mem_pestat] low and high boundaries for proper pairs: (1, 541)
[M::mem_pestat] skip orientation RF as there are not enough pairs
[M::mem_pestat] skip orientation RR as there are not enough pairs
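For context, the boundaries printed for the FR orientation appear to follow from the percentiles by simple interquartile-range arithmetic. Below is a minimal Python sketch of that calculation. The multipliers (2.0 for the mean/std.dev window, 3.0 for proper pairs, and a 4 standard deviation widening) are what I recall from the BWA source and should be treated as assumptions. The function name `pestat_bounds` is hypothetical, and the mean and std.dev are taken from the log rather than recomputed.

```python
def pestat_bounds(p25, p75, mean, std):
    """Approximate the insert-size windows BWA mem reports in its mem_pestat lines.

    Assumed constants (recalled from the BWA source, not confirmed here):
    2.0 * IQR for the mean/std.dev window, 3.0 * IQR for proper pairs,
    and widening to mean +/- 4 * std when that is the wider window.
    """
    iqr = p75 - p25

    # Window used to select insert sizes for the mean/std.dev estimate.
    stat_low = max(int(p25 - 2.0 * iqr + 0.499), 1)
    stat_high = int(p75 + 2.0 * iqr + 0.499)

    # Window used to decide whether a pair counts as a proper pair.
    low = int(p25 - 3.0 * iqr + 0.499)
    high = int(p75 + 3.0 * iqr + 0.499)
    if low > mean - 4.0 * std:
        low = int(mean - 4.0 * std + 0.499)
    if high < mean + 4.0 * std:
        high = int(mean + 4.0 * std + 0.499)
    low = max(low, 1)

    return (stat_low, stat_high), (low, high)


# Values copied from the FR lines of the log above.
print(pestat_bounds(169, 262, 220.19, 67.49))  # ((1, 448), (1, 541))
```

With the numbers from the log this reproduces (1, 448) and (1, 541), which suggests the FR estimate itself is behaving as expected.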
How much of an issue is this?
I am given to understand this can occur when the read headers lose their orientation, which can sometimes happen when trimming etc.
I have manually checked the headers as follows:
cat PSU4_ISF1A_1-cleanQ30.fastq | head -4
@V300033288L3C001R0020004451/1
AGTAGACGCGGGGGGCGGGTCGCGCGAGCAGGTCGGCG
+
EEEFD?DDCADEACEEFAECDFFDFECFEFFEDDBDFE
cat PSU4_ISF1A_2-cleanQ30.fastq | head -4
@V300033288L3C001R0020004451/2
CGACGTCTGGCCCGGCCAGCTCGCCGACGCCGCCGACGACCCGCAGACACCGGACATGCTCGCCCTCCTGC
+
FEDFFFFFFFFCFFBFEFFEBFFBFF>EEFFFEFFADFCFFFFF:DBF7FFFF=FFFEDCDFCFFFFFDFE
(I know Q30 is harsh trimming for alignment, but this is a de novo assembly and I am just aligning to look for contamination with blobtools.)
I have also run BBTools repair.sh, and the resulting singletons.fastq is empty.
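Checking only the first record of each file does not show whether the two files stay in sync all the way through. A minimal Python sketch of a full check follows, assuming uncompressed FASTQ files with exactly four lines per record and read names that end in /1 and /2 as shown above. The function names `read_names` and `first_mismatch` are hypothetical.

```python
import itertools


def read_names(path):
    """Yield each record's read name, without the leading '@' or a trailing /1 or /2."""
    with open(path) as handle:
        for line_number, line in enumerate(handle):
            if line_number % 4 != 0:
                continue  # only the first line of each 4-line record is a header
            fields = line.rstrip("\n")[1:].split()
            name = fields[0] if fields else ""
            if name.endswith(("/1", "/2")):
                name = name[:-2]
            yield name


def first_mismatch(r1_path, r2_path):
    """Return (record_index, name1, name2) for the first out-of-sync pair, or None.

    A name of None means that file ran out of records before the other one.
    """
    pairs = itertools.zip_longest(read_names(r1_path), read_names(r2_path))
    for index, (name1, name2) in enumerate(pairs, start=1):
        if name1 != name2:
            return index, name1, name2
    return None


# Example call with the file names from this post:
# first_mismatch("PSU4_ISF1A_1-cleanQ30.fastq", "PSU4_ISF1A_2-cleanQ30.fastq")
```

A return value of `None` would mean every record in the first file has a same-named mate at the same position in the second file, which is the ordering BWA mem relies on when given two FASTQ files.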
What else can be causing this issue (and how much of an issue is it)?
Those FASTQ headers look weird if this is Illumina data. Did you do something to them? Looks like they are missing the : separators.
Not that I am aware of. They look the same in the raw (non-quality-trimmed) reads. They are from BGI, if that makes a difference. They send the reads pre-adapter-trimmed, so maybe it is something they did?
Not familiar with BGI data but if they are not Illumina then I suppose that is how the headers are for BGI data. Learned something new.
Yeah, they call it "BGI Seq", but for all intents and purposes it is just a carbon copy of Illumina (not sure how they get away with it). I just refer to it as Illumina on forums as it is easier :)