How to figure out which adapter has been used
If you convert the SRR3710487 data file to plain sequence, the 3' adapter that was used is apparent:
TTGCATGTGGACTAATCTATCTCAATTATCGTATGCCGTCTTCTGCTTG
AGAGCATCCCAGGCTCATGACTCAATTATCGTATGCCGTCTTCTGCTTG
TAGGTCCCAGCAATGAAGGTCTCAATTATCGTATGCCGTCTTCTGCTTG
NTGGAGTCGAAATCGGTAGGCTCAATTATCGTATGCCGTCTTCTGCTTG
NCATTCTTGAGTCAGTTGGTCTCAATTATCGTATGCCGTCTTCTGCTTG
ACAGACGTTATGTTCCCGCTCTCAATTATCGTATGCCGTCTTCTGCTTG
TGACCATCCTGTGACCCGTGCTCAATTATCGTATGCCGTCTTCTGCTTG
TCTTATTTTTGTGTACTGTACTCAATTATCGTATGCCGTCTTCTGCTTG
NAGGAGGTCGGATGGAACTACCTCAATTATCGTATGCCGTCTTCTGCTT
CCAAGTCTGATCCTATGGTTCTCAATTATCGTATGCCGTCTTCTGCTTG
CTACAGTCCGACCATCGAGTCTCAATTATCGTATGCCGTCTTCTGCTTG
ACCCTTATTCATGTTGTTTCCTCAATTATCGTATGCCGTCTTCTGCTTG
CTTCCTTTTACTGGGTTGAACTCAATTATCGTATGCCGTCTTCTGCTTG
CCACCTCTTGGCCGCTTCGCCTCAATTATCGTATGCCGTCTTCTGCTTG
AGAAGAACTATTTGAAGTTCCTCAATTATCGTATGCCGTCTTCTGCTTG
AGCAATTCGGCATCGGTGGCCTCAATTATCGTATGCCGTCTTCTGCTTG
TACGTACGTAGTAGTATCGCCTCAATTATCGTATGCCGTCTTCTGCTTG
TTCCTGGTTCAGGCCTCAGCCTCAATTATCGTATGCCGTCTTCTGCTTG
TTCCGAGGAGCGAGATCAGCCTCAATTATCGTATGCCGTCTTCTGCTTG
For SRR3690386:
TACCTGGTTGATCCTGCCAGTTGCAACTCGTATGCCGTCTTCTGCTTGA
GGGTGTGGCGCACGTATCTTGTTGCAACTCGTATGCCGTCTTCTGCTTG
TGTGGATAATTCTGTTATAAGTTGCAACTCGTATGCCGTCTTCTGCTTG
GATCGTCCACAAGAAGACTGGTTGCAACTCGTATGCCGTCTTCTGCTTG
TACCTGGTTGATCCTGCCAGGTTGCAACTCGTATGCCGTCTTCTGCTTG
AAATCGAATGCTTTGTTACCGTTGCAACTCGTATGCCGTCTTCTGCTTG
TACCTGGTTGATCCTGCCAGGTTGCAACTCGTATGCCGTCTTCTGCTTG
TATCATCTTGATAGTCCTTTAGTTGCAACTCGTATGCCGTCTTCTGCTT
CTGTCACTGCTAGACCTGTGCGTTGCAACTCGTATGCCGTCTTCTGCTT
TGGTGGTGCGAAGTATCGTGCGTTGCAACTCGTATGCCGTCTTCTGCTT
CGAGAGAGAGGGAGGGAGGGAGTTGCAACTCGTATGCCGTCTTCTGCTT
CAAGAACAAGATTTGGAGAAGTTGCAACTCGTATGCCGTCTTCTGCTTG
AAGGTGGAGTCAAAGAATGCGTTGCAACTCGTATGCCGTCTTCTGCTTG
NTCCTTCTGATTACCTCTTCCGTTGCAACTCGTATGCCGTCTTCTGCTT
GGGATCGGAGTAATGATTAAGTTGCAACTCGTATGCCGTCTTCTGCTTG
TACCTGGTTGATCCTGCCAGGTTGCAACTCGTATGCCGTCTTCTGCTTG
NTTAGACTGTATATGGATACGTTGCAACTCGTATGCCGTCTTCTGCTTG
ACTGGCTTCTGAATTCGACCGTTGCAACTCGTATGCCGTCTTCTGCTTG
TACCTGGTTGATCCTGCCAGGTTGCAACTCGTATGCCGTCTTCTGCTTG
TACCTGGTTGATCCTGCCAGGTTGCAACTCGTATGCCGTCTTCTGCTTG
CTTTTGCGGCCCTTCATTTCGTTGCAACTCGTATGCCGTCTTCTGCTTG
NACGCCTGCCTGGGCGTCACGTTGCAACTCGTATGCCGTCTTCTGCTTG
Use reformat.sh from the BBMap suite this way to see the result:
reformat.sh in=SRR3690386.fastq.gz out=stdout.fa | grep -v ">" | less
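If eyeballing the reads is not enough, the shared 3' tail can also be picked out by counting k-mers: a sequence that is present in nearly every read is probably adapter. The following is only a minimal sketch, assuming plain sequence lines as input (one read per line, as produced by the command above). The k-mer length of 12 is an arbitrary choice, and the four reads are copied from the SRR3710487 listing so that the block runs on its own.

```shell
# Four reads copied from the SRR3710487 listing above; in practice, pipe in
# the plain-sequence output of the reformat.sh command instead.
reads='TTGCATGTGGACTAATCTATCTCAATTATCGTATGCCGTCTTCTGCTTG
AGAGCATCCCAGGCTCATGACTCAATTATCGTATGCCGTCTTCTGCTTG
TAGGTCCCAGCAATGAAGGTCTCAATTATCGTATGCCGTCTTCTGCTTG
NTGGAGTCGAAATCGGTAGGCTCAATTATCGTATGCCGTCTTCTGCTTG'

k=12   # k-mer length (arbitrary choice for this sketch)

# Count every k-mer of every read, then list the most frequent ones.
# Output columns: count, k-mer. Ties are broken alphabetically.
top=$(printf '%s\n' "$reads" |
  awk -v k="$k" '{ for (i = 1; i + k - 1 <= length($0); i++) n[substr($0, i, k)]++ }
                 END { for (s in n) print n[s], s }' |
  sort -k1,1nr -k2,2 | head -n 5)
echo "$top"
```

K-mers whose count is close to the number of reads should all fall inside the common tail; overlapping them gives a candidate adapter sequence to check against the kit documentation.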
Following what the authors describe, I ended up doing this:
gunzip -c ${file_name}.fastq.gz | fastx_clipper -Q33 -l 21 | fastx_trimmer -Q33 -l 21 | paste - - - - | sed 's/^@/>/g'| cut -f1-2 | tr '\t' '\n' > ${file_name}.trimmed.21.fasta
So I kept the first 21 bp of each read and discarded the rest.
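For readers without the FASTX-Toolkit, the "keep the first 21 bases and write FASTA" part can be sketched with awk alone. This is an approximation under stated assumptions, not a drop-in replacement: it does no adapter clipping and no length filtering, it assumes strict four-line FASTQ records, and the file name demo.fastq.gz and the helper fq are made up for the demonstration (the two reads come from the SRR3710487 listing).

```shell
# Hypothetical helper: emit one FASTQ record with a dummy quality string
# of the same length as the sequence.
fq() { printf '@%s\n%s\n+\n%s\n' "$1" "$2" "$(printf '%s' "$2" | tr 'ACGTN' 'IIIII')"; }

# Tiny demo input so the sketch is self-contained.
{
  fq read1 TTGCATGTGGACTAATCTATCTCAATTATCGTATGCCGTCTTCTGCTTG
  fq read2 AGAGCATCCCAGGCTCATGACTCAATTATCGTATGCCGTCTTCTGCTTG
} | gzip -c > demo.fastq.gz

# Header line: turn the leading "@" into ">". Sequence line: keep bases 1-21.
# The "+" and quality lines are dropped, which yields FASTA.
gunzip -c demo.fastq.gz |
  awk 'NR % 4 == 1 { sub(/^@/, ">"); print }
       NR % 4 == 2 { print substr($0, 1, 21) }' > demo.trimmed.21.fasta

cat demo.trimmed.21.fasta
```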
You have only two options available in the Illumina document you included above. You could include both sequences in your adapters file, or zgrep the sequence file with either sequence to see if you can quickly identify one to use.
Could you elaborate on the zgrep option? I have two fastq files right now.
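One way to read the zgrep suggestion, as a rough sketch: count how many reads contain each candidate adapter and compare that with the total number of reads. Everything named here is an assumption for illustration. demo.fastq.gz and the helper fq are made up, the first two demo reads are copied from the SRR3690386 listing above, cand_a is the tail those reads share, and cand_b is the start of the TruSeq adapter. Replace both candidates with the two sequences from the Illumina document and run the counts on each of your fastq files.

```shell
# Hypothetical helper: emit one FASTQ record with a dummy quality string.
fq() { printf '@%s\n%s\n+\n%s\n' "$1" "$2" "$(printf '%s' "$2" | tr 'ACGTN' 'IIIII')"; }

# Tiny demo file so the sketch runs on its own.
{
  fq read1 TACCTGGTTGATCCTGCCAGGTTGCAACTCGTATGCCGTCTTCTGCTTG
  fq read2 GGGTGTGGCGCACGTATCTTGTTGCAACTCGTATGCCGTCTTCTGCTTG
  fq read3 ACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTA
} | gzip -c > demo.fastq.gz

# Candidate adapters (assumptions: substitute the ones from your document).
cand_a=TCGTATGCCGTCTTCTGCTTG
cand_b=AGATCGGAAGAGCACACGTCT

# Total reads = lines / 4. zgrep -c counts the lines that contain the pattern;
# it exits non-zero when nothing matches, hence the "|| true".
total=$(( $(gzip -dc demo.fastq.gz | wc -l) / 4 ))
hits_a=$(zgrep -c "$cand_a" demo.fastq.gz || true)
hits_b=$(zgrep -c "$cand_b" demo.fastq.gz || true)

echo "reads=$total $cand_a=$hits_a $cand_b=$hits_b"
```

The candidate found in a large share of the reads is the one to trim; a candidate that matches almost nothing was most likely not used in the library.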
I've tried the TruSeq adapters and the contamination rate is 0.000132, which seems low, right?