How to remove overrepresented sequences from paired-end reads with cutadapt?
7.1 years ago
Lila M ★ 1.3k

Hi guys, I have a question. After running FastQC, I discovered that some of my reads contain overrepresented sequences, as follows:

fastq.R1
Sequence    Count   Percentage  Possible Source
GATCGGAAGAGCACACGTCTGAACTCCAGTCACAGTCAACAATCTCGTATGCCGTCTTCTGCTTGAAAAAAAAA  141860  0.4972976921930607  TruSeq Adapter, Index 13 (97% over 40bp)
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN 55712   0.19530134659142676 No Hit
GATCGGAAGAGCACACGTCTGAACTCCAGTCACAGTCAACAATCTCGTAT  50886   0.17838354973167975 TruSeq Adapter, Index 13 (97% over 40bp)

fastq.R2
Sequence    Count   Percentage  Possible Source
GGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGG  72457   0.2540018249205738  No Hit
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN 61284   0.21483428569265142 No Hit

However, the adapter content module looks perfect. I would like to remove those adapters or overrepresented sequences. I've never done this with paired-end (PE) data before, so I'm trying to figure it out. At the moment I'm trying:

cutadapt -a GATCGGAAGAGCACACGTCTGAACTCCAGTCACAGTCAACAATCTCGTATGCCGTCTTCTGCTTGAAAAAAAAA -a NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN -a GATCGGAAGAGCACACGTCTGAACTCCAGTCACAGTCAACAATCTCGTAT -A GGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGG -A NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN -o out.1.fastq -p out.2.fastq R1.fastq R2.fastq

Could anyone with experience tell me if this is right?

Thank you in advance!

RNA-Seq fastqc cutadapt trimming paired-end • 8.3k views

If someone else has the same issue, I would like to add more information. If you plan to map the reads with STAR, for example, you may get an error like EXITING because of FATAL ERROR in reads input: short read sequence line . It can be solved by adding the cutadapt parameter -m N, which discards reads shorter than N bases after trimming. In my case I chose N based on the minimum sequence length reported by FastQC. I hope this helps!
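To check whether a trimmed file still contains reads that are too short (or empty, which is what seems to trigger this kind of STAR error), something like the following may help. It is only a sketch: it assumes a plain, uncompressed FASTQ with exactly four lines per record, and the function name and the demo file are made up for illustration.

```shell
# Count reads shorter than a given length in an uncompressed FASTQ file.
# Assumes exactly 4 lines per record (no wrapped sequence lines).
count_short_reads() {
  # $1 = FASTQ path, $2 = minimum acceptable read length
  awk -v min="$2" 'NR % 4 == 2 && length($0) < min { n++ } END { print n + 0 }' "$1"
}

# Tiny made-up demo file: one 8 bp read and one empty read.
demo=$(mktemp)
cat > "$demo" <<'EOF'
@read1
ACGTACGT
+
IIIIIIII
@read2

+

EOF

# Number of reads shorter than 1 bp, i.e. empty reads.
count_short_reads "$demo" 1
rm -f "$demo"
```

If the count is above zero for the length your aligner needs, rerunning cutadapt with a suitable -m value should remove those reads (and, for paired-end data, their mates).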

7.1 years ago
glihm ▴ 660

Hello Lila M,

As described in the cutadapt documentation, you are doing things correctly.

You can give several adapters (by repeating -a/-g multiple times), and you can restrict the adapter search to a particular mate of the pair (-a/-g for R1 and -A/-G for R2).

So the command you posted should work as you expect. ;)
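One caveat that may be worth checking before relying on it: as far as I know, cutadapt treats N in an adapter sequence as a wildcard, so an adapter made only of Ns could match almost any read, and a poly-G adapter might also clip genuine G-rich sequence. Below is a minimal sketch of an alternative, assuming a reasonably recent cutadapt. It uses the prefix shared by the two R1 hits in the FastQC table as the R1 adapter, --nextseq-trim for poly-G tails (useful only if the Gs come from dark cycles on a two-colour instrument, which the thread does not confirm), --trim-n for runs of N at read ends, and -m to drop pairs that end up too short. The quality cutoff of 20 and the minimum length of 20 are hypothetical values, not taken from this thread. The script only prints the command so it can be reviewed first.

```shell
#!/bin/sh
# Sketch only: prints a cutadapt command for review instead of running it.

# Prefix shared by both adapter-like R1 sequences in the FastQC table above.
R1_ADAPTER="GATCGGAAGAGCACACGTCTGAACTCCAGTCAC"
# Hypothetical minimum read length after trimming; choose one that suits your aligner.
MIN_LEN=20

print_cutadapt_cmd() {
  # --nextseq-trim: quality trimming that also removes trailing high-quality G runs
  # --trim-n:       remove N bases from the ends of reads
  # -m:             discard reads (and their mates) shorter than MIN_LEN after trimming
  printf '%s ' cutadapt \
    -a "$R1_ADAPTER" \
    --nextseq-trim=20 \
    --trim-n \
    -m "$MIN_LEN" \
    -o out.1.fastq -p out.2.fastq \
    R1.fastq R2.fastq
  printf '\n'
}

print_cutadapt_cmd
```

Once the printed command looks right, it can be pasted into the terminal as is. Rerunning FastQC on out.1.fastq and out.2.fastq afterwards shows whether the overrepresented sequences are gone.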


Thank you very much!
