Every one of my Illumina seqs has an 'N' at bases 18 and 48, and at -14 on the reverse. I don't think this is normal... Has anyone seen this before?
10.5 years ago
Daniel ★ 4.0k

I have just received an amplicon dataset, but my seqs all follow the same pattern:

>M02538:3:000000000-A6UM4:1:1101:10565:1083 1:N:0:0
TGGGGAATCTTGCACAANGGAGGAAACTCTGATGCAGCGACGCCGCGNGAGTGATGAA---------GCGTNGGGAGCAAACAGG
_________________^-----------------------------^-----------------------^

I find this highly irregular. I have aligned it against a reference sequence, and each N lines up with a reference base and keeps the alignment consistent, so the Ns are miscalled bases rather than insertions.

Does anyone have any ideas?

Thanks

illumina • 3.0k views

It could also be a bubble moving through the lane.


It could have been a failed cycle due to a temporary hardware failure (like a camera communication issue). Not unheard of. Looks like it's a MiSeq run so we can't query other lanes on that flowcell.

Can you check a few reads and tell us the quality score you see at the problematic locations?
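If it helps, here is a minimal sketch for pulling the base call and quality score at chosen positions out of a FASTQ file. It assumes plain four-line FASTQ records with Phred+33 qualities (the usual encoding for current Illumina output). The names `read_fastq`, `calls_at` and the file name `reads.fastq` are made up for illustration, and treating negative positions as "counted from the end of the read" is just one reading of the -14 in the question.

```python
def read_fastq(handle):
    """Yield (header, sequence, quality) from a four-line-per-record FASTQ handle."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            return
        seq = handle.readline().rstrip("\n")
        handle.readline()  # the '+' separator line
        qual = handle.readline().rstrip("\n")
        yield header[1:], seq, qual


def calls_at(seq, qual, positions, offset=33):
    """Return (position, base, phred) for each requested position.

    Positive positions are 1-based from the start of the read;
    negative positions count back from the end (-1 is the last base).
    Positions that fall outside the read are skipped.
    """
    out = []
    for pos in positions:
        idx = pos - 1 if pos > 0 else len(seq) + pos
        if 0 <= idx < len(seq):
            out.append((pos, seq[idx], ord(qual[idx]) - offset))
    return out


# Example use (hypothetical file name), printing the first ten reads:
#
# from itertools import islice
# with open("reads.fastq") as fh:
#     for header, seq, qual in islice(read_fastq(fh), 10):
#         print(header, calls_at(seq, qual, [18, 48, -14]))
```

An N call normally comes with a very low quality score, so a column of low scores at exactly those positions would point at the cycle itself rather than at the library.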


Sorry, yes, it's a MiSeq.

Here are some FastQC quality charts.

The quality does drop at the Ns at base 18 and at -14 (reverse), but not at the one at base 48, as far as I can see.


No need to apologize--just thinking out loud :)

I originally thought the cycle was a total wipeout due to a temporary sequencer hardware problem, but those FastQC graphs just imply a crappy cycle. Weird that the scores recover so quickly after the first blip.

Are you positive that every base on every read at those cycle positions is an "N"? This is very important. Because if not every base is an N at that position, then just a portion of the flowcell could have had a problem (a bubble, for example). It's worth loading up the flowcell in Illumina's Sequencing Analysis Viewer and looking at the images for that cycle.
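A rough way to answer that from the reads themselves, before opening the viewer, is to count the fraction of N calls at a given position per tile. This is only a sketch: it assumes the standard Illumina read name layout (`instrument:run:flowcell:lane:tile:x:y`, as in the header posted above, where the tile is `1101`), and `tile_of` and `n_fraction_by_tile` are made-up helper names.

```python
from collections import defaultdict


def tile_of(header):
    """Extract the tile field from an Illumina-style read name.

    Assumes the layout instrument:run:flowcell:lane:tile:x:y, optionally
    followed by a space and the read/filter/index fields.
    """
    name = header.lstrip(">@").split()[0]
    return name.split(":")[4]


def n_fraction_by_tile(records, position):
    """Map tile -> fraction of reads whose base at `position` is N.

    `records` is an iterable of (header, sequence) pairs. `position` is
    1-based from the start of the read, or negative to count from the end.
    Reads too short to cover the position are ignored.
    """
    counts = defaultdict(lambda: [0, 0])  # tile -> [n_calls, reads_seen]
    for header, seq in records:
        idx = position - 1 if position > 0 else len(seq) + position
        if not 0 <= idx < len(seq):
            continue
        tally = counts[tile_of(header)]
        tally[1] += 1
        if seq[idx] in "Nn":
            tally[0] += 1
    return {tile: n / total for tile, (n, total) in counts.items()}
```

If every tile comes back at 1.0, the whole cycle failed. If only some tiles are affected, that fits a localized problem such as a bubble better.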

I will check this. We discussed that a bubble caused the early issues with the reverse reads.

Base 48 is binned together with bases 45-49 in your FastQC plot, so the drop in quality is not as obvious.
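To illustrate the dilution with made-up numbers (these are not values from the actual run): if four cycles in a bin have a mean quality of 37 and one bad cycle has a mean of 12, the bin as a whole still looks respectable. The sketch assumes every cycle contributes the same number of reads, so averaging the per-cycle means matches pooling all the values in the bin.

```python
def bin_mean(per_cycle_means):
    """Mean quality of a bin, given the mean quality of each cycle in it."""
    return sum(per_cycle_means) / len(per_cycle_means)


# Hypothetical per-cycle mean qualities for cycles 45-49; cycle 48 is the bad one.
cycles_45_to_49 = [37, 37, 37, 12, 37]
print(bin_mean(cycles_45_to_49))  # the single-cycle drop to 12 is averaged away
```

With these numbers the bin mean is 32, which is much less striking than a lone cycle at 12.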

Oh yeah, of course. Being stupid.

Did you ever solve this problem? Devon and I both suspected a bubble, but I'd like to know for certain.


From talking to the guys who ran the machine, we think this is probably what happened, but I can't think of any way to confirm it without doing another run. I should check with whoever gets the next dataset to see if they find the same thing... (This was one of their first times running the MiSeq.)

I am just doing the best I can with the data. Because the pattern is so consistent, I can just factor it in.


Did you look at Sequencing Analysis Viewer? Using it, you can very easily see whether a certain region of the flowcell was problematic. From there you can look at the corresponding thumbnail images. Believe me, if there's a bubble you will see it in the thumbnails!
