Most accurate way to measure adapter content in NGS Illumina reads
8.1 years ago
Denis ▴ 310

I need to benchmark several adapter-removal tools (Trimmomatic, cutadapt, etc.) on my 18S MiSeq metagenomics data. How can I measure the quantity of adapters that survive in my data after trimming (the so-called false negatives)? I'm thinking of using the blastn-short task of standalone BLAST for that. Is that a correct and effective way to do the job? Any suggestions would be appreciated. Many thanks, Denis

next-gen • 1.8k views
8.1 years ago
chen ★ 2.5k

What's your read length? Is it paired-end or single-end?

If your sequencing is paired-end, e.g. 2x150, then for reads that come from DNA templates shorter than 150 bp you may find adapter bases at the tails of both read1 and read2.

How do you find these adapters? There are 3 different overlapping patterns, depending on the DNA template length (TLEN):

1. Not overlapped: TLEN > 2x150

ATCGATTTAGTTT...ATTAGGGATTA
------------------------------------------------------------TGTAATCGTAGT...AATACGATCGA

2. Overlapped: 150 < TLEN < 2x150

ATCGATTTAGTTT...ATTAGGGATTA
-------------------...-----------AGGGATTACTATCT...AGATTC

3. Overlapped, with adapter read-through: TLEN < 150

ATCGATTTAGTTT...ATTAGGGATTA-adapter
ATCGATTTAGTTT...ATTAGGGATTA-adapter
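
The three cases above can be written as a small helper. This is only a sketch: the function name `overlap_pattern` is made up for illustration, and assigning the boundary values (TLEN exactly equal to the read length or to twice the read length) to patterns 2 and 1 is an assumption, since the list above does not say where they belong.

```python
def overlap_pattern(tlen, read_len=150):
    """Return 1, 2 or 3 for the overlap pattern of a read pair.

    tlen     -- DNA template (insert) length in bases
    read_len -- length of each read; 150 matches the 2x150 example
    """
    if tlen < read_len:
        # Template shorter than one read: both reads run into adapter.
        return 3
    if tlen < 2 * read_len:
        # Reads overlap each other but stay inside the template.
        return 2
    # Template at least as long as both reads together: no overlap.
    return 1


if __name__ == "__main__":
    for tlen in (400, 200, 100):
        print(tlen, overlap_pattern(tlen))
```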

Try searching for pattern 3 and counting the adapter bases. The count should be close to 0 if the data is really clean.
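
A minimal sketch of that search for a single read pair, assuming both reads are given 5' to 3' as they appear in the FASTQ files. It is not the AfterQC implementation; the function names, `min_overlap` and `max_mismatch_rate` are illustrative choices. The idea: in pattern 3 the first L bases of read1 equal the last L bases of the reverse complement of read2, where L is the template length, so everything beyond L in either read is adapter.

```python
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Reverse-complement an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def adapter_bases_in_pair(read1, read2, min_overlap=20, max_mismatch_rate=0.1):
    """Return (adapter bases in read1, adapter bases in read2).

    Returns (0, 0) when no pattern-3 overlap is found.
    """
    rc2 = reverse_complement(read2)
    n, m = len(read1), len(read2)
    # Try candidate template lengths, longest first.
    for tlen in range(min(n, m), min_overlap - 1, -1):
        head = read1[:tlen]       # template as seen by read1
        tail = rc2[m - tlen:]     # template as seen by read2
        mismatches = sum(a != b for a, b in zip(head, tail))
        if mismatches <= max_mismatch_rate * tlen:
            return n - tlen, m - tlen
    return 0, 0


if __name__ == "__main__":
    insert = "ACGTTGCAAGGCTTAACCGGTTAGCATCGA"  # made-up 30 bp template
    adapter = "AGATCGGAAG"                     # made-up adapter fragment
    r1 = insert + adapter
    r2 = reverse_complement(insert) + adapter
    print(adapter_bases_in_pair(r1, r2, max_mismatch_rate=0.0))
```

Summing the returned values over all pairs gives a total count of adapter bases. Low-complexity or repetitive templates can produce spurious overlaps, so the thresholds would need tuning on real data.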

BTW, I also developed a tool, AfterQC (http://github.com/OpenGene/AfterQC; Automatic Filtering, Trimming, Error Removing and Quality Control for fastq data), which automatically cuts adapters by utilizing pattern 3.

Thanks for your response! I apologize that not all details of the experimental design were provided in the initial post. We have 2x250 MiSeq PE reads. As for TLEN in my experiment, it corresponds to pattern 2 rather than pattern 3. Which tool should I use to search for adapters in the cleaned data? Is blastn-short OK for that?

You'd be better off writing the program yourself; it is not difficult.

You can take a look at the AfterQC code and may find your way from there.
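
As a rough starting point for such a program, here is a sketch that counts leftover adapter bases at the 3' end of trimmed reads by direct comparison against a known adapter sequence. It is not taken from AfterQC. The sequence `AGATCGGAAGAGC` is the common start of Illumina TruSeq adapters and is only an assumption about the library; substitute whatever adapter was actually used. The function names, `min_len` and `max_mismatches` are likewise illustrative.

```python
ADAPTER = "AGATCGGAAGAGC"  # assumption: TruSeq-style adapter start


def adapter_hit_length(read, adapter=ADAPTER, min_len=8, max_mismatches=1):
    """Return the number of bases from the first adapter hit to the read end.

    A hit is a position where the read matches the start of the adapter
    over at least min_len bases with at most max_mismatches mismatches.
    Returns 0 if no hit is found.
    """
    for start in range(len(read) - min_len + 1):
        window = read[start:start + len(adapter)]
        # zip() stops at the shorter string, so a window cut off by the
        # read end is compared against an adapter prefix.
        mismatches = sum(a != b for a, b in zip(window, adapter))
        if mismatches <= max_mismatches:
            return len(read) - start
    return 0


def count_adapter_reads(sequences, **kwargs):
    """Return (reads with a hit, total adapter bases) for an iterable of reads."""
    reads_hit = 0
    bases = 0
    for seq in sequences:
        hit = adapter_hit_length(seq, **kwargs)
        if hit:
            reads_hit += 1
            bases += hit
    return reads_hit, bases


if __name__ == "__main__":
    reads = ["ACGTACGTACGTTTGACCAAGATCGGAAG", "ACGTACGTACGTTTGACC"]
    print(count_adapter_reads(reads, max_mismatches=0))
```

Running this on the output of each trimming tool gives a comparable count of residual adapter per tool. Adapter remnants shorter than `min_len` are missed by design, since very short matches cannot be told apart from chance.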
