NGS Sequence quality
2.1 years ago

Hi everyone, I am confused about my sequence quality. If the quality drops in the middle portion of the reads (please see the attached image), how can we trim it? Is it possible to trim only the low-quality part? As the picture shows, read quality drops from 61 bp to 79 bp. Can we trim only this region with fastp or any other tool?

Sequence NGS Data quality Trimming • 2.0k views

Were the samples showing the dip in Q-scores run on the same flowcell, or part of the same pool of samples? Or were they from a separate run?


I have a total of six samples: S1, S2, S5, S6, S8, and S10. Five samples (S2, S5, S6, S8, and S10) were run on the same flowcell, and the R1 reads of all five samples are good. But the quality of the R2 reads of all five samples drops at positions 62 to 79.


In that case it seems there was a transient problem with the run, e.g. a bubble in the lane, that led to the drop in Q-scores. If there are no N calls in positions 62-79 you may be able to use the data as is. If there are N calls in that region then you may need to discard those reads, unless you can trim the 3' part from the N's onwards and use the bases before position 62.
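
If you go the trimming route, the idea of cutting a read at its first N can be sketched in a few lines of Python. This is only an illustration, not a replacement for a real trimmer: the function name `trim_at_first_n` and the `min_len` cutoff of 50 are made up for the example.

```python
def trim_at_first_n(seq, qual, min_len=50):
    """Cut a read at its first N; return None if what is left is too short.

    min_len is a hypothetical cutoff chosen for this example.
    """
    cut = seq.find("N")
    if cut == -1:
        return seq, qual          # no N call, keep the read unchanged
    if cut < min_len:
        return None               # too little sequence left before the N
    return seq[:cut], qual[:cut]  # keep only the 5' part before the first N


# A made-up 80 bp read whose first N call is at position 62
seq = "A" * 61 + "N" + "C" * 18
qual = "I" * 80
trimmed = trim_at_first_n(seq, qual)
print(len(trimmed[0]))  # 61
```

If a read is dropped entirely (the `None` case), its mate in the other file has to be dropped as well, otherwise the R1 and R2 files go out of sync.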


Thanks, GenoMax. I am not sure how to check for N calls in the reads.

2.1 years ago
Prash ▴ 280

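Dear Smarendra, you would find N (no-call) bases in your FASTQ reads. Please check the reads in those files.

You can inspect the reads with "head", or search for them with grep "N". Note that a plain grep also matches Illumina header lines, which usually contain ":N:", so restrict the search to the sequence lines (the second line of each four-line record). If many reads contain Ns, you may have to discard those reads as a whole.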
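
For anyone not comfortable with grep, here is a rough Python sketch of the same check that looks only at the sequence lines. It assumes a standard FASTQ with exactly four lines per record; the function name `count_reads_with_n` and the file name in the comment are made up for the example.

```python
import io


def count_reads_with_n(handle):
    """Return (total_reads, reads_with_n) for an open FASTQ text handle."""
    total = with_n = 0
    for i, line in enumerate(handle):
        if i % 4 == 1:  # second line of each 4-line record is the sequence
            total += 1
            if "N" in line:
                with_n += 1
    return total, with_n


# Tiny in-memory example. For a real gzipped file you could pass
# gzip.open("sample_R2.fastq.gz", "rt") instead (hypothetical file name).
demo = io.StringIO(
    "@r1 2:N:0:1\nACGT\n+\nIIII\n"
    "@r2 2:N:0:1\nACNT\n+\nII#I\n"
)
print(count_reads_with_n(demo))  # (2, 1)
```

Both headers in the example contain ":N:", yet only the second read is counted, because header and quality lines are skipped.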

Hope this helps, Prash


Thanks @Prash for helping, but I am new to bioinformatics, so I did not understand how to check for Ns.


Assuming you have the reads, please open the FASTQ files and search whether the 150 bp reads contain "N" in place of A, T, G or C. If you are a Windows user, you can even search by opening the file in Notepad.


My files will not open; these are WGS reads. However, please see the attached image generated with MultiQC.


That confirms that you have N's in up to 30% of reads at those positions. If you have plenty of data available then you can simply remove the reads with N's in the middle. You can do that using bbduk.sh from the BBMap suite with the option maxns=0:

maxns=-1            If non-negative, reads with more Ns than this 
                    (after trimming) will be discarded.

A guide for BBDuk is available: https://jgi.doe.gov/data-and-tools/software-tools/bbtools/bb-tools-user-guide/bbduk-guide/
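
If you would like to double-check the per-position N content yourself rather than rely on the plot, something along these lines should work. It is a plain-Python sketch that assumes four-line FASTQ records, and the function name `n_fraction_by_position` is invented for the example. Pure Python is slow on full WGS files, so it is probably best run on a subsample.

```python
import io
from collections import Counter


def n_fraction_by_position(handle):
    """Map 1-based read position -> fraction of reads with N at that position."""
    n_counts = Counter()
    covered = Counter()  # number of reads long enough to cover each position
    for i, line in enumerate(handle):
        if i % 4 != 1:   # only the sequence line of each record
            continue
        for pos, base in enumerate(line.strip(), start=1):
            covered[pos] += 1
            if base == "N":
                n_counts[pos] += 1
    return {pos: n_counts[pos] / covered[pos] for pos in sorted(covered)}


demo = io.StringIO("@r1\nACGT\n+\nIIII\n@r2\nANGT\n+\nI#II\n")
print(n_fraction_by_position(demo))  # {1: 0.0, 2: 0.5, 3: 0.0, 4: 0.0}
```

On the R2 files discussed here, you would expect the fractions to rise around positions 62-79 and stay near zero elsewhere.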


Thanks, GenoMax. If I run this tool, will my R2 reads be removed or trimmed? I mean, will the whole R2 read be removed, or will only the N calls be removed from the R2 read?


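With maxns=0 a read containing any N is discarded as a whole, not trimmed. You should run the tool on the R1 and R2 files together (e.g. bbduk.sh in1=R1.fq.gz in2=R2.fq.gz out1=clean_R1.fq.gz out2=clean_R2.fq.gz maxns=0, where the file names are placeholders), since any read removed from the R2 file needs its mate removed from the R1 file to keep the two files in sync.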
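
To make the pairing point concrete, here is a small Python illustration of filtering both files together. It is not how BBDuk is implemented, just the same idea in plain code; `read_fastq`, `filter_pairs` and `max_ns` are names invented for the example, and it assumes four-line records with R1 and R2 in the same order.

```python
import io


def read_fastq(handle):
    """Yield (header, seq, plus, qual) tuples from a 4-line-per-record FASTQ."""
    while True:
        record = [handle.readline().rstrip("\n") for _ in range(4)]
        if not record[0]:  # empty header line means end of file
            return
        yield tuple(record)


def filter_pairs(r1_handle, r2_handle, max_ns=0):
    """Yield only pairs in which neither mate has more than max_ns N calls."""
    for rec1, rec2 in zip(read_fastq(r1_handle), read_fastq(r2_handle)):
        if rec1[1].count("N") <= max_ns and rec2[1].count("N") <= max_ns:
            yield rec1, rec2


r1 = io.StringIO("@a/1\nACGT\n+\nIIII\n@b/1\nACGT\n+\nIIII\n")
r2 = io.StringIO("@a/2\nACGT\n+\nIIII\n@b/2\nANGT\n+\nI#II\n")
for rec1, rec2 in filter_pairs(r1, r2):
    print(rec1[0], rec2[0])  # @a/1 @a/2
```

Pair b is dropped from both files even though only its R2 mate contains an N, which is what keeps the two outputs in sync.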


Thanks, @genomax. I got your point.


Hi GenoMax, if you don't mind, could you please share your email address so that I can send you some output results?
