Hi All
I have some Illumina reads, but I don't know which adapters were used. How can I find out whether my reads contain adapters, and of which type?
Best
FastQC only gives a warning when the overrepresented sequences occur within the first 200,000 sequences; see the FastQC documentation.
Suppose read1.fastq and read2.fastq are the paired-end data, with 4 lines per read.
Download common Illumina adapters from https://github.com/vsbuffalo/scythe/blob/master/illumina_adapters.fa
Go through each adapter, e.g. sample the first and last 1 million reads of read1.fastq for the truseq-forward-contam adapter:
cat read1.fastq | head -4000000 | grep AGATCGGAAGAGCACACGTCTGAACTCCAGTCAC | wc -l
cat read1.fastq | tail -4000000 | grep AGATCGGAAGAGCACACGTCTGAACTCCAGTCAC | wc -l
If the above commands output >100, then sampling 1 million reads from read2.fastq for the truseq-reverse-contam adapter should give a similar number:
cat read2.fastq | head -4000000 | grep AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGTA | wc -l
cat read2.fastq | tail -4000000 | grep AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGTA | wc -l
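To avoid repeating the grep by hand for every adapter, the same check can be looped over the whole adapter file. The following is only a rough sketch: the function name count_adapters is made up for illustration, and it assumes each adapter sequence in the FASTA file sits on a single line and that the FASTQ file is uncompressed with exactly 4 lines per read. Unlike the plain grep above, it only looks at the sequence lines, and it only samples the start of the file.

```shell
# Count, for every adapter in a FASTA file, how many of the first N reads
# of a FASTQ file contain that adapter as an exact substring.
# NOTE: count_adapters is an illustrative name, not an existing tool.
count_adapters() {
    fastq=$1                 # e.g. read1.fastq
    adapters=$2              # e.g. illumina_adapters.fa
    nreads=${3:-1000000}     # number of reads to sample from the top

    # Keep only the sequence lines (line 2 of every 4-line record).
    seqs=$(mktemp)
    head -n "$((nreads * 4))" "$fastq" | awk 'NR % 4 == 2' > "$seqs"

    # Walk the FASTA: remember each header, then count hits for its sequence.
    name=unnamed
    while IFS= read -r line || [ -n "$line" ]; do
        case $line in
            '>'*) name=${line#>} ;;
            '')   ;;
            *)    printf '%s\t%s\n' "$name" "$(grep -c -F -- "$line" "$seqs")" ;;
        esac
    done < "$adapters"

    rm -f "$seqs"
}
```

Called as count_adapters read1.fastq illumina_adapters.fa 1000000, it prints one line per adapter with the adapter name and the number of sampled reads containing it, so adapters with counts well above 100 stand out.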
In reference to the comment above, I tried the following commands:
cat read1.fastq | head -4000000 | grep <Adaptersequence> | wc -l
cat read1.fastq | tail -4000000 | grep <Adaptersequence> | wc -l
on my fastq file and got different output numbers (2731 and 1818 respectively).
What does this signify?
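For what it is worth, head and tail sample different parts of the file, so the two counts need not match. If each sample really covered 1 million reads, 2731 and 1818 hits correspond to roughly 0.27% and 0.18% of the reads. A rough sketch for getting a single figure over the whole file is below; adapter_fraction is a made-up name, and it assumes an uncompressed FASTQ file with 4 lines per read and only counts exact matches of the adapter.

```shell
# Report how many reads in a whole FASTQ file contain a given adapter,
# as a count and as a percentage of all reads.
# NOTE: adapter_fraction is an illustrative name, not an existing tool.
adapter_fraction() {
    # $1 = FASTQ file, $2 = adapter sequence
    awk -v adapter="$2" '
        NR % 4 == 2 {                      # sequence line of each record
            total++
            if (index($0, adapter)) hits++ # exact substring match
        }
        END {
            pct = total ? 100 * hits / total : 0
            printf "%d of %d reads (%.2f%%)\n", hits, total, pct
        }
    ' "$1"
}
```

Usage would be along the lines of adapter_fraction read1.fastq AGATCGGAAGAGCACACGTCTGAACTCCAGTCAC.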
You can find the Illumina adapter sequences on the Illumina website.
Hi, I have the same problem. How did you resolve it?