Question

Bacterial RNA-SEQ - Quality Control

3.7 years ago

Sammy ▴ 30

Heyya,

I'm trying to figure out quality control for RNA-Seq. I ran FastQC on the first batch and, for the same sample, forward strand looks the same as on each lane. The same goes for the reverse. Is this normal?

Plus it looks kind of like this:

I have a huge peak in the per sequence GC content plot. [image: FastQC per sequence GC content plot]

Per base and per sequence quality score look alright.

I read all over the internet but those problems seem to be too specific.

RNA-Seq • 1.2k views

ADD COMMENT • link updated 3.7 years ago by swbarnes2 14k • written 3.7 years ago by Sammy ▴ 30

forward strand looks the same as on each lane.

What does that mean? If you have very good quality data, Q scores may be pegged very high.

ADD REPLY • link 3.7 years ago by GenoMax 147k

I mean that the "per sequence GC content" plot for L001_R1 looks identical to those for L002_R1, L003_R1 and L004_R1. OK, so that looks alright then? What about the 2 peaks? I read that's a sign of contamination, but the quality scores are still good even then?

ADD REPLY • link updated 3.7 years ago by GenoMax 147k • written 3.7 years ago by Sammy ▴ 30

You should first scan/trim this data before re-running FastQC. Post that result here. As @swbarnes2 says, you likely have something odd going on here.

ADD REPLY • link 3.7 years ago by GenoMax 147k

Hi GenoMax,

I have run Trimmomatic (operation: SLIDINGWINDOW, number of bases to average across: 4, average quality required: 20) on some of my samples and run FastQC again. The per sequence GC content is not much different. Do you have any suggestions? Or how should I approach this issue?

I'll be honest. The prognosis doesn't look that good to me (biostar link).
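For anyone following along, here is roughly what a sliding-window quality trim with a window of 4 and a required average quality of 20 does to a single read. This is a simplified Python sketch, not Trimmomatic's actual implementation: it assumes Phred+33 quality encoding, cuts the read at the start of the first window whose mean quality falls below the threshold, and leaves reads shorter than the window untouched. The function names are made up for illustration.

```python
def phred33_scores(quality_string):
    """Decode a FASTQ quality string, assuming Phred+33 encoding."""
    return [ord(ch) - 33 for ch in quality_string]


def sliding_window_trim(sequence, quality_string, window=4, min_mean_quality=20):
    """Cut the read at the first window whose mean quality is below the threshold.

    Simplified illustration only: reads shorter than the window are
    returned unchanged, and no other trimming steps are applied.
    """
    scores = phred33_scores(quality_string)
    for start in range(len(scores) - window + 1):
        window_mean = sum(scores[start:start + window]) / window
        if window_mean < min_mean_quality:
            # Keep everything before the first failing window.
            return sequence[:start], quality_string[:start]
    return sequence, quality_string


# Six high-quality bases ('I' = Q40) followed by four poor ones ('#' = Q2).
trimmed_seq, trimmed_qual = sliding_window_trim("ACGTACGTAC", "IIIIII####")
print(trimmed_seq, trimmed_qual)
```

Note that a trim like this only removes low-quality tails. It does not look at base composition at all, so on its own it would not be expected to change the shape of the GC content plot much.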

ADD REPLY • link updated 3.7 years ago by GenoMax 147k • written 3.7 years ago by Sammy ▴ 30

I don't quite understand why you have this result, but at this point go ahead and start your alignments. You don't need to worry about splicing, so things should be simpler. Let us know what % alignment you get. If the alignments look wonky, then it is possible that you have some kind of contamination. But we will get to that later.

ADD REPLY • link 3.7 years ago by GenoMax 147k

I'm thinking of asking the sequencing provider for the adapters they used. Would that be a good idea? Maybe that's the problem. I checked in here but who knows. This was a source of inspiration.

ADD REPLY • link 3.7 years ago by Sammy ▴ 30

What would you do with the adapter sequence? Again, are you absolutely sure that deviation from that theoretical blue line (which was probably calculated for DNA from eukaryotic species) really is a problem for your sample?

ADD REPLY • link 3.7 years ago by swbarnes2 14k

score 0 · Answer 1 · 2021-02-26

Are you entirely sure that what FastQC is expecting is what you are aiming for? I would suspect a sample mix-up if I had a sample of, say, M. tuberculosis where the GC content peaked at about 50%!
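One way to check this yourself is to bin the per-read GC content and compare where it peaks against the known GC content of your organism (the M. tuberculosis genome, for example, is roughly 65% GC). Below is a minimal Python sketch of that idea. The function names and the 5% bin width are my own choices for illustration, and it works on plain read sequences that you would have to pull out of your FASTQ files first.

```python
from collections import Counter


def gc_percent(sequence):
    """Percentage of G and C among called bases (A, C, G, T); None if there are none."""
    seq = sequence.upper()
    called = sum(seq.count(base) for base in "ACGT")
    if called == 0:
        return None
    return 100.0 * (seq.count("G") + seq.count("C")) / called


def gc_histogram(sequences, bin_width=5):
    """Count reads per GC bin; keys are the lower edge of each bin in percent."""
    histogram = Counter()
    for sequence in sequences:
        gc = gc_percent(sequence)
        if gc is None:
            continue  # skip reads with no called bases (e.g. all N)
        # Fold exactly 100% GC into the top bin instead of creating a bin at 100.
        bin_start = min(int(gc // bin_width) * bin_width, 100 - bin_width)
        histogram[bin_start] += 1
    return histogram


reads = ["ATGC", "GGGG", "AAAA", "NNNN"]
for bin_start, count in sorted(gc_histogram(reads).items()):
    print(f"{bin_start:3d}-{bin_start + 5:3d}% GC: {count} read(s)")
```

If the main peak sits near your organism's expected GC content, the deviation from FastQC's reference curve is probably not a problem in itself.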

However, that sharp peak on the far side looks like a large number of garbage reads that are all GGGGGGGGGGG.
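A quick way to test that suspicion is to count how many reads are almost entirely G. The Python sketch below does that; the 90% cutoff and the function names are arbitrary choices for illustration, and the reader assumes a plain, uncompressed FASTQ with exactly four lines per record. Runs of G like this are commonly seen on two-colour Illumina instruments, where a cluster that gives no signal is called as G, but the thread does not say which instrument was used, so treat that as only one possible explanation.

```python
def read_fastq_sequences(lines):
    """Yield the sequence line of each record, assuming 4 lines per FASTQ record."""
    for line_number, line in enumerate(lines):
        if line_number % 4 == 1:
            yield line.strip()


def is_poly_g(sequence, min_fraction=0.9):
    """True if at least min_fraction of the read's bases are G."""
    seq = sequence.upper()
    return len(seq) > 0 and seq.count("G") / len(seq) >= min_fraction


def poly_g_fraction(sequences, min_fraction=0.9):
    """Fraction of reads that are (almost) entirely G; 0.0 if there are no reads."""
    total = 0
    poly_g = 0
    for sequence in sequences:
        total += 1
        if is_poly_g(sequence, min_fraction):
            poly_g += 1
    return poly_g / total if total else 0.0


example_records = [
    "@read1\n", "GGGGGGGGGG\n", "+\n", "IIIIIIIIII\n",
    "@read2\n", "ACGTACGTAC\n", "+\n", "IIIIIIIIII\n",
]
print(poly_g_fraction(read_fastq_sequences(example_records)))
```

For a real file you would pass an open file handle to `read_fastq_sequences` instead of the example list (use `gzip.open(path, "rt")` for `.fastq.gz` files). If the fraction is large, removing those reads before alignment is worth considering.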