How to figure out which adapter has been used
5.7 years ago

I have two degradome files, downloaded from https://trace.ncbi.nlm.nih.gov/Traces/sra/?run=SRR3690386 and https://trace.ncbi.nlm.nih.gov/Traces/sra/?run=SRR3710487

In the publication, the authors stated: "Small RNA and degradome reads were generated from Illumina HiSeqTM analysis." (Original publication: https://bmcgenomics.biomedcentral.com/articles/10.1186/s12864-017-3556-2 )

So I found this Illumina document, which lists a large number of adapter sequences:

https://support.illumina.com/content/dam/illumina-support/documents/documentation/chemistry_documentation/experiment-design/illumina-adapter-sequences-1000000002694-10.pdf

How do I know which one to use? Eventually I want to trim the adapters with scythe, which asks for an _adapter_file.fasta_.

https://github.com/vsbuffalo/scythe

degradome adapter trimming • 1.6k views
  1. Ask the experimental/sequencing core that generated the data.
  2. Detect the adapter from the reads themselves with bbmerge.

The Illumina document you linked above leaves you with only two options. You could include both sequences in your adapter file, or zgrep the sequence file for each sequence to quickly identify which one to use.
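To make the zgrep suggestion concrete, here is a minimal sketch. It assumes gzipped FASTQ input; the function name `count_adapter_hits` and the file name in the usage comment are hypothetical, and the candidate sequence would be copied from the Illumina document.

```shell
# Count the lines of a gzipped FASTQ file that contain a candidate adapter.
# -c prints the number of matching lines instead of the lines themselves.
count_adapter_hits() {
    adapter=$1
    fastq_gz=$2
    zgrep -c "$adapter" "$fastq_gz"
}

# Hypothetical usage: run once per candidate and compare the counts.
# count_adapter_hits CANDIDATE_SEQUENCE reads.fastq.gz
```

A candidate that matches a large share of the reads is probably the adapter that was used; a count near zero suggests it was not.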


Could you elaborate on the zgrep option? I have two FASTQ files right now.


I've tried the TruSeq adapters and the contamination rate is 0.000132, which seems low, right?

5.7 years ago
GenoMax 147k

If you convert the SRR3710487 data file to plain sequence, the 3' adapter that was used becomes apparent.

TTGCATGTGGACTAATCTATCTCAATTATCGTATGCCGTCTTCTGCTTG
AGAGCATCCCAGGCTCATGACTCAATTATCGTATGCCGTCTTCTGCTTG
TAGGTCCCAGCAATGAAGGTCTCAATTATCGTATGCCGTCTTCTGCTTG
NTGGAGTCGAAATCGGTAGGCTCAATTATCGTATGCCGTCTTCTGCTTG
NCATTCTTGAGTCAGTTGGTCTCAATTATCGTATGCCGTCTTCTGCTTG
ACAGACGTTATGTTCCCGCTCTCAATTATCGTATGCCGTCTTCTGCTTG
TGACCATCCTGTGACCCGTGCTCAATTATCGTATGCCGTCTTCTGCTTG
TCTTATTTTTGTGTACTGTACTCAATTATCGTATGCCGTCTTCTGCTTG
NAGGAGGTCGGATGGAACTACCTCAATTATCGTATGCCGTCTTCTGCTT
CCAAGTCTGATCCTATGGTTCTCAATTATCGTATGCCGTCTTCTGCTTG
CTACAGTCCGACCATCGAGTCTCAATTATCGTATGCCGTCTTCTGCTTG
ACCCTTATTCATGTTGTTTCCTCAATTATCGTATGCCGTCTTCTGCTTG
CTTCCTTTTACTGGGTTGAACTCAATTATCGTATGCCGTCTTCTGCTTG
CCACCTCTTGGCCGCTTCGCCTCAATTATCGTATGCCGTCTTCTGCTTG
AGAAGAACTATTTGAAGTTCCTCAATTATCGTATGCCGTCTTCTGCTTG
AGCAATTCGGCATCGGTGGCCTCAATTATCGTATGCCGTCTTCTGCTTG
TACGTACGTAGTAGTATCGCCTCAATTATCGTATGCCGTCTTCTGCTTG
TTCCTGGTTCAGGCCTCAGCCTCAATTATCGTATGCCGTCTTCTGCTTG
TTCCGAGGAGCGAGATCAGCCTCAATTATCGTATGCCGTCTTCTGCTTG

For SRR3690386

TACCTGGTTGATCCTGCCAGTTGCAACTCGTATGCCGTCTTCTGCTTGA
GGGTGTGGCGCACGTATCTTGTTGCAACTCGTATGCCGTCTTCTGCTTG
TGTGGATAATTCTGTTATAAGTTGCAACTCGTATGCCGTCTTCTGCTTG
GATCGTCCACAAGAAGACTGGTTGCAACTCGTATGCCGTCTTCTGCTTG
TACCTGGTTGATCCTGCCAGGTTGCAACTCGTATGCCGTCTTCTGCTTG
AAATCGAATGCTTTGTTACCGTTGCAACTCGTATGCCGTCTTCTGCTTG
TACCTGGTTGATCCTGCCAGGTTGCAACTCGTATGCCGTCTTCTGCTTG
TATCATCTTGATAGTCCTTTAGTTGCAACTCGTATGCCGTCTTCTGCTT
CTGTCACTGCTAGACCTGTGCGTTGCAACTCGTATGCCGTCTTCTGCTT
TGGTGGTGCGAAGTATCGTGCGTTGCAACTCGTATGCCGTCTTCTGCTT
CGAGAGAGAGGGAGGGAGGGAGTTGCAACTCGTATGCCGTCTTCTGCTT
CAAGAACAAGATTTGGAGAAGTTGCAACTCGTATGCCGTCTTCTGCTTG
AAGGTGGAGTCAAAGAATGCGTTGCAACTCGTATGCCGTCTTCTGCTTG
NTCCTTCTGATTACCTCTTCCGTTGCAACTCGTATGCCGTCTTCTGCTT
GGGATCGGAGTAATGATTAAGTTGCAACTCGTATGCCGTCTTCTGCTTG
TACCTGGTTGATCCTGCCAGGTTGCAACTCGTATGCCGTCTTCTGCTTG
NTTAGACTGTATATGGATACGTTGCAACTCGTATGCCGTCTTCTGCTTG
ACTGGCTTCTGAATTCGACCGTTGCAACTCGTATGCCGTCTTCTGCTTG
TACCTGGTTGATCCTGCCAGGTTGCAACTCGTATGCCGTCTTCTGCTTG
TACCTGGTTGATCCTGCCAGGTTGCAACTCGTATGCCGTCTTCTGCTTG
CTTTTGCGGCCCTTCATTTCGTTGCAACTCGTATGCCGTCTTCTGCTTG
NACGCCTGCCTGGGCGTCACGTTGCAACTCGTATGCCGTCTTCTGCTTG

Use reformat.sh from the BBMap suite like this to see the result.

 reformat.sh in=SRR3690386.fastq.gz out=stdout.fa | grep -v ">" | less
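Instead of reading the sequences by eye, the shared 3' end can also be tallied. The sketch below uses only standard tools; the function name `top_read_suffixes` and the default suffix length of 20 are assumptions chosen for illustration, not part of BBMap.

```shell
# Tally the last K bases of every read in a gzipped FASTQ file and list the
# five most frequent suffixes; a dominant suffix is a likely 3' adapter.
top_read_suffixes() {
    fastq_gz=$1
    k=${2:-20}   # suffix length to compare (assumed default)
    gzip -dc "$fastq_gz" |
        awk -v k="$k" 'NR % 4 == 2 && length($0) >= k {
            print substr($0, length($0) - k + 1)
        }' |
        sort | uniq -c | sort -k1,1nr | head -n 5
}

# Hypothetical usage:
# top_read_suffixes SRR3690386.fastq.gz 20
```

Because the inserts differ in length, the adapter is cut off at slightly different points in different reads, so a few closely related suffixes may share the top of the list.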

this is very helpful!


One more question here! If I use those adapters, scythe removes all of the sequences!

(bio) ➜ mrcv git:(master) ✗ ./sw/scythe/scythe -a data/degradome/adapter.SRR3710487.fasta -o data/degradome/SRR3710487.trimmed.fastq data/degradome/SRR3710487.fastq
prior: 0.300

Adapter Trimming Complete
contaminated: 20305282, uncontaminated: 15, total: 20305297
contamination rate: 0.999999

I'm using

CTCAATTATCGTATGCCGTCTTCTGCTTG

and

GTTGCAACTCGTATGCCGTCTTCTGCTTG
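For reference, scythe expects the adapters in FASTA format. A minimal way to write one file per run could look like the following; the record names after `>` and the second file name are made up for illustration, and the files are written to the current directory.

```shell
# Write each run's 3' adapter to its own FASTA file.
# The record names after ">" are arbitrary labels chosen for this example.
printf '>SRR3710487_3p_adapter\nCTCAATTATCGTATGCCGTCTTCTGCTTG\n' > adapter.SRR3710487.fasta
printf '>SRR3690386_3p_adapter\nGTTGCAACTCGTATGCCGTCTTCTGCTTG\n' > adapter.SRR3690386.fasta
```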


Check what the default minimum-length filter is for scythe. You may need to turn it off or reduce its value.

With bbduk.sh from BBMap you can use literal=CTCAATTATCGTATGCCGTCTTCTGCTTG ktrim=r, and that should remove only the adapter. By default bbduk.sh uses minlength=10, so reads shorter than 10 bases after trimming are discarded. You can change that as needed.
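Spelled out as a full command, that could look like the sketch below. It assumes bbduk.sh is on your PATH; the wrapper name `trim_3p_adapter` is hypothetical, and the values k=23, mink=11 and hdist=1 are common starting points for adapter trimming rather than settings taken from this thread.

```shell
# Right-trim a literal 3' adapter with bbduk.sh (BBMap suite).
# ktrim=r removes the matching adapter and everything to its right;
# minlength discards reads shorter than that after trimming.
trim_3p_adapter() {
    in_fastq=$1
    out_fastq=$2
    adapter=$3
    bbduk.sh in="$in_fastq" out="$out_fastq" \
        literal="$adapter" ktrim=r k=23 mink=11 hdist=1 minlength=10
}

# Hypothetical usage:
# trim_3p_adapter SRR3710487.fastq.gz SRR3710487.trimmed.fastq.gz CTCAATTATCGTATGCCGTCTTCTGCTTG
```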

5.3 years ago

Based on what the authors described, I ended up doing this:

gunzip -c ${file_name}.fastq.gz | fastx_clipper -Q33 -l 21 | fastx_trimmer -Q33 -l 21 | paste - - - - | sed 's/^@/>/g'| cut -f1-2 | tr '\t' '\n' > ${file_name}.trimmed.21.fasta

So I kept the first 21 bp of each read and discarded the rest.
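For anyone without the FASTX-Toolkit, the truncation and FASTA conversion can be approximated with awk alone. This is only a sketch of the "keep the first 21 bases" step: the function name `fastq_head_to_fasta` is hypothetical, and it does not reproduce fastx_clipper's adapter clipping or its other filters.

```shell
# Read FASTQ on stdin, keep the first N bases of each read, write FASTA.
# Reads shorter than N bases are dropped.
fastq_head_to_fasta() {
    n=${1:-21}
    awk -v n="$n" '
        NR % 4 == 1 { header = ">" substr($0, 2) }   # "@id" becomes ">id"
        NR % 4 == 2 && length($0) >= n {
            print header
            print substr($0, 1, n)
        }'
}

# Hypothetical usage:
# gunzip -c reads.fastq.gz | fastq_head_to_fasta 21 > reads.trimmed.21.fasta
```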
