reliable way to trim
5.7 years ago
gbl1 ▴ 80

Hello,

I realized that my data were not trimmed, and I would like to get something out of them. I was advised to use Trimmomatic.

So, I tested, and obtained sequences like:

>M02764:119:000000000-C5R9K:1:1101:8653:1645 1:N:0:1
GTTACTAACTTCTTAGGACCCATGATCGGGGACTGAGCAAGCTGTTGCTGAAACCAGCCGACCTGCCT
GGGCCGACTAACCCTGCCCTGGCCGGCTGCAAGGTGAGGACCTGCCGCAACTCGCTGTAGATCGGA
AGAGCACACGTCTGAACTCCAGTCACGCTGAAGAAGCTCGTATGCCGTCTTCTGCGTGAAAAAAAAAAAAATCAGG

I still have a lot of junk in the read. My adapter is GGTGAGGACCTGCCGCAACTCGCTGT, so I would like to obtain GTTACTAACTTCTTAGGACCCATGATCGGGGACTGAGCAAGCTGTTGCTGAAACCAGCCGACCTGCCTGGGCCGACTAACCCTGCCCTGGCCGGCTGCAA as output.

I could use "sed", but it would need a perfect match, and sequencers are often not that reliable. Any advice?
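
To make the concern about a perfect match concrete, here is a minimal Python sketch of exact-match trimming, which is roughly what a plain "sed" substitution would do. The two toy reads are hypothetical and not taken from the run; the second one carries a single substitution inside the adapter.

```python
ADAPTER = "GGTGAGGACCTGCCGCAACTCGCTGT"  # adapter quoted in the question


def trim_exact(read: str, adapter: str = ADAPTER) -> str:
    """Cut the read at the first exact occurrence of the adapter."""
    pos = read.find(adapter)
    # str.find returns -1 when there is no exact match: the read is left untouched.
    return read if pos == -1 else read[:pos]


# Hypothetical toy reads: 10 bases of insert, the adapter, then trailing bases.
clean = "ACGTACGTAC" + ADAPTER + "AGATCGGAAG"
# Same read with one sequencing error in the adapter (index 5, G -> T).
noisy = "ACGTACGTAC" + ADAPTER[:5] + "T" + ADAPTER[6:] + "AGATCGGAAG"

print(trim_exact(clean))           # only the 10-base insert remains
print(trim_exact(noisy) == noisy)  # the damaged adapter is not found, nothing is trimmed
```

A single wrong base is enough for the exact search to miss the adapter, which is why an error-tolerant trimmer is the safer choice here.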

radseq trimming • 2.1k views

How did your adapter end up in the middle of the read? :o


The adapter was added by restriction/ligation.

Illumina gives a read with the full-size primer (50 b), and after that there is the poly-A and random bases.


Can you show how you used Trimmomatic?

java -jar /Users/benjamin/Downloads/Trimmomatic-0.38/trimmomatic-0.38.jar PE -phred33 /Users/benjamin/Downloads/Leduc_PCR_MiSeq-20190221R/A001-Vs-04-P-GCTGAAGA-CAGATGTA-Leduc-run20190221R_S1_L001_R1_001.fastq.gz /Users/benjamin/Downloads/Leduc_PCR_MiSeq-20190221R/A001-Vs-04-P-GCTGAAGA-CAGATGTA-Leduc-run20190221R_S1_L001_R2_001.fastq.gz output_forward_unpaired.fq.gz output_reverse_paired.fq.gz output_reverse_unpaired.fq.gz ILLUMINACLIP:TruSeq3-PE.fa:2:30:10 LEADING:3 TRAILING:3 SLIDINGWINDOW:4:15 MINLEN:36

You used ILLUMINACLIP:TruSeq3-PE.fa, while you say your adapter is GGTGAGGACCTGCCGCAACTCGCTGT?

5.7 years ago
GenoMax 147k

Use bbduk.sh from the BBMap suite.

bbduk.sh -Xmx2g in=your.fq.gz out=trim.fq.gz literal=GGTGAGGACCTGCCGCAACTCGCTGT ktrim=r k=21
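
For intuition about what ktrim=r k=21 asks for: once a k-mer from the adapter is found in a read, that k-mer and everything to its right are removed. The Python sketch below illustrates that idea only. It is not BBDuk's implementation, it ignores reverse complements, quality values and shorter k-mers at the read end, and the function names are made up for the example.

```python
ADAPTER = "GGTGAGGACCTGCCGCAACTCGCTGT"

# The read posted in the question, joined from its three lines.
READ = (
    "GTTACTAACTTCTTAGGACCCATGATCGGGGACTGAGCAAGCTGTTGCTGAAACCAGCCGACCTGCCT"
    "GGGCCGACTAACCCTGCCCTGGCCGGCTGCAAGGTGAGGACCTGCCGCAACTCGCTGTAGATCGGA"
    "AGAGCACACGTCTGAACTCCAGTCACGCTGAAGAAGCTCGTATGCCGTCTTCTGCGTGAAAAAAAAAAAAATCAGG"
)


def kmers(seq: str, k: int) -> set:
    """All substrings of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def ktrim_right(read: str, adapter: str = ADAPTER, k: int = 21) -> str:
    """Trim the leftmost adapter k-mer found in the read and everything after it."""
    reference = kmers(adapter, k)
    for i in range(len(read) - k + 1):
        if read[i:i + k] in reference:
            return read[:i]
    return read  # no adapter k-mer found: keep the read as it is


print(ktrim_right(READ))
```

On the posted read this keeps the bases in front of the adapter, which is the output asked for in the question.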

I need assistance:

MachincBenjamin:Leduc_PCR_MiSeq-20190221R benjamin$ /Users/benjamin/Downloads/bbmap/bbduk.sh -Xmx2g in=/Users/benjamin/Downloads/Leduc_PCR_MiSeq-20190221R/A008-Sj-D-N-GCTGAAGA-CAGTTTGT-Leduc-run20190221R_S8_L001_R2_001.fastq.gz   out=trim.fa literal=GGTGAGGACCTGCCGCAACTCGCTGT ktrim=r k=21
java -ea -Xmx2g -Xms2g -cp /Users/benjamin/Downloads/bbmap/current/ jgi.BBDuk -Xmx2g in=/Users/benjamin/Downloads/Leduc_PCR_MiSeq-20190221R/A008-Sj-D-N-GCTGAAGA-CAGTTTGT-Leduc-run20190221R_S8_L001_R2_001.fastq.gz out=trim.fa literal=GGTGAGGACCTGCCGCAACTCGCTGT ktrim=r k=21
Exception in thread "main" java.lang.NoClassDefFoundError: java/util/concurrent/ThreadLocalRandom
    at jgi.BBDuk.<clinit>(BBDuk.java:4827)
Caused by: java.lang.ClassNotFoundException: java.util.concurrent.ThreadLocalRandom
    at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
    ... 1 more

After you uncompress the bbmap source, don't move any of the folders below the top-level folder. Add the top-level folder to your $PATH.

If you have paired-end data then you need to do:

bbduk.sh -Xmx2g in1=your_R1.fq.gz in2=your_R2.fq.gz  out1=trim_R1.fq.gz out2=trim_R2.fq.gz  literal=GGTGAGGACCTGCCGCAACTCGCTGT ktrim=r k=21 tbo tpe

Do not trim paired-end reads independently; otherwise the read order in the two files will get out of sync.

If you need the data in fasta format, do the conversion after trimming.

reformat.sh in=trimmed.fq.gz out=trimmed.fa
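
To illustrate why the two files must be processed together, here is a small Python sketch with hypothetical reads and a hypothetical minimum length of 5. It assumes a policy of dropping a pair when either mate becomes too short after trimming; that is one common choice, not a statement about what any particular tool does by default.

```python
ADAPTER = "GGTGAGGACCTGCCGCAACTCGCTGT"


def trim(seq: str) -> str:
    """Exact-match right trim, kept deliberately simple for the example."""
    pos = seq.find(ADAPTER)
    return seq if pos == -1 else seq[:pos]


def filter_independently(reads, min_len):
    """Trim one file on its own and drop reads that end up too short."""
    return [(name, trim(seq)) for name, seq in reads if len(trim(seq)) >= min_len]


def filter_as_pairs(r1, r2, min_len):
    """Trim both mates and keep or drop them together."""
    out1, out2 = [], []
    for (n1, s1), (n2, s2) in zip(r1, r2):
        t1, t2 = trim(s1), trim(s2)
        if len(t1) >= min_len and len(t2) >= min_len:
            out1.append((n1, t1))
            out2.append((n2, t2))
    return out1, out2


# Hypothetical mates: pairB's forward read is almost entirely adapter.
r1 = [("pairA", "ACGTACGTACGT"), ("pairB", "ACG" + ADAPTER), ("pairC", "TTGCATTGCATG")]
r2 = [("pairA", "GGCATGCATGCA"), ("pairB", "CCATGGCATGCA"), ("pairC", "AATTCCGGAATT")]

ind1 = filter_independently(r1, 5)
ind2 = filter_independently(r2, 5)
print([n for n, _ in ind1], [n for n, _ in ind2])  # the second entries no longer belong together

p1, p2 = filter_as_pairs(r1, r2, 5)
print([n for n, _ in p1], [n for n, _ in p2])      # same names, same order in both lists
```

After independent filtering the second record of one file is paired with the wrong mate in the other, which is the order problem described above.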

I guess there's something wrong:

MachincBenjamin:Leduc_PCR_MiSeq-20190221R benjamin$ bbduk.sh -Xmx2g in1=A008-Sj-D-N-GCTGAAGA-CAGTTTGT-Leduc-run20190221R_S8_L001_R1_001.fastq.gz in2=A008-Sj-D-N-GCTGAAGA-CAGTTTGT-Leduc-run20190221R_S8_L001_R2_001.fastq.gz   literal=GGTGAGGACCTGCCGCAACTCGCTGT ktrim=r k=21 tbo tpe
java -ea -Xmx2g -Xms2g -cp /Users/benjamin/Code-source/bbmap/current/ jgi.BBDuk -Xmx2g in1=A008-Sj-D-N-GCTGAAGA-CAGTTTGT-Leduc-run20190221R_S8_L001_R1_001.fastq.gz in2=A008-Sj-D-N-GCTGAAGA-CAGTTTGT-Leduc-run20190221R_S8_L001_R2_001.fastq.gz literal=GGTGAGGACCTGCCGCAACTCGCTGT ktrim=r k=21 tbo tpe
Exception in thread "main" java.lang.NoClassDefFoundError: java/util/concurrent/ThreadLocalRandom
    at jgi.BBDuk.<clinit>(BBDuk.java:4827)
Caused by: java.lang.ClassNotFoundException: java.util.concurrent.ThreadLocalRandom
    at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
    ... 1 more
MachincBenjamin:Leduc_PCR_MiSeq-20190221R benjamin$ echo $PATH
/opt/local/bin:/opt/local/sbin:/opt/local/bin:/opt/local/sbin:/opt/local/bin:/opt/local/sbin:/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin:/Library/TeX/texbin:/opt/X11/bin:/Users/benjamin/Code-source/bbmap/

What OS are you using? Are you using the latest java available for that OS?


I use OSX

MachincBenjamin:Leduc_PCR_MiSeq-20190221R benjamin$ java -version
java version "1.6.0_51"
Java(TM) SE Runtime Environment (build 1.6.0_51-b11-457)
Java HotSpot(TM) 64-Bit Server VM (build 20.51-b01-457, mixed mode)

Can you upgrade your java? I believe bbmap is only tested for Java 1.7 and up. I have Java 9.0.1 on a mac and that works fine.


Grrr… I tried, and it is still the same. The java that actually gets used is the one in /usr/bin/, and the newer versions come later in the PATH. Any idea how to get around that?


Can you run it directly, using the full path to the java you want:

/path_to_java_you_want/java -ea -Xmx2g -Xms2g -cp /Users/benjamin/Code-source/bbmap/current/ jgi.BBDuk -Xmx2g in1=A008-Sj-D-N-GCTGAAGA-CAGTTTGT-Leduc-run20190221R_S8_L001_R1_001.fastq.gz in2=A008-Sj-D-N-GCTGAAGA-CAGTTTGT-Leduc-run20190221R_S8_L001_R2_001.fastq.gz literal=GGTGAGGACCTGCCGCAACTCGCTGT ktrim=r k=21 tbo tpe

Which parameters should I adjust in order to tolerate more mismatches?

>Sj-A-N_M02764:119:000000000-C5R9K:1:2115:12793:16601
CTCTGAGCCGGGGTGCCACAGGTCTGAACTCCAGTCACGGTGAGAACCTGCCGCAACTCGCTGT
>Sj-A-N_M02764:119:000000000-C5R9K:1:2115:10399:16010
CTCTGAGCCGGGGTGCCACACGTCTGAACTCCAGTCACGCTGAAGAATCTCGTATGCCGTCTTCTGCTTGAAAAAAAAAAAACAAACAAAACCAAAATAGTATTGAAAGAAAGAATCAATAAAGATACTCTAACGGCAACGCTATCAGCTAAGAGCCGTTCATACGTAGTGTCAATAGATTCGATAAAAATGTTTCACCATACAGCATAAAACACGACCGCCACAGACCACCCATGAGATCACATGAAAAAGCTCAGAGATCAAACGTCATCACAGACAGTTATAGTATCTTAAAACCCATATTCCACTGATTCAATGAACAATTA

Use hdist= followed by a number to allow that many mismatches (Hamming distance). Here is a guide for bbduk.
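
As a rough picture of what a Hamming-distance allowance means, the Python sketch below scans the first read from the previous post for a window that differs from the adapter by at most a given number of substitutions. It is a simplification: it compares the whole adapter rather than k-mers, so it is not how BBDuk applies hdist internally, and the function names are made up for the example.

```python
ADAPTER = "GGTGAGGACCTGCCGCAACTCGCTGT"

# First read from the post above; its adapter copy carries one substitution.
READ = "CTCTGAGCCGGGGTGCCACAGGTCTGAACTCCAGTCACGGTGAGAACCTGCCGCAACTCGCTGT"


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def trim_hdist(read: str, adapter: str = ADAPTER, hdist: int = 1) -> str:
    """Trim at the leftmost window within `hdist` substitutions of the adapter."""
    n = len(adapter)
    for i in range(len(read) - n + 1):
        if hamming(read[i:i + n], adapter) <= hdist:
            return read[:i]
    return read  # no window is close enough: keep the read unchanged


print(trim_hdist(READ, hdist=0) == READ)  # exact matching misses the adapter
print(trim_hdist(READ, hdist=1))          # one allowed mismatch finds it
```

Note that a Hamming distance only counts substitutions at fixed positions, so a missing or extra base shifts everything after it and is not covered.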


Hi,

I actually have a small issue:

CCGTGTGCGCCTCACCCCTGCATGGTGAGGACCTGCCGCA CTCGCTGT
CCGTGTGCGCCTCACCCCTGCATGGTGAGGACCTGCC CAACTCGCTGT
                       GGTGAGGACCTGCCGCAACTCGCTGT

If a base is missing in the sequence from Illumina, the adapter cannot be trimmed. How can I clean that?


You could use literal=CCGTGTGCGCCTCACCCCTG,GGTGAGGACCTGC in this specific case. Is hdist=1 or 2 not working?


No, hdist=1 or 2 only works for mismatches, not for insertions/deletions.

I showed just two examples, but it could happen anywhere…
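
One way to tolerate a missing or extra base is to score candidate adapter positions by edit distance (substitutions, insertions and deletions) instead of Hamming distance. The Python sketch below is only an illustration of that idea on the two reads shown earlier, with the gaps closed up. It uses a standard approximate-matching dynamic programme run on the reversed strings so that the result is the start of the adapter. The function name and the max_edits parameter are made up, and adapters truncated at the read end are not handled.

```python
ADAPTER = "GGTGAGGACCTGCCGCAACTCGCTGT"

# The two reads from the earlier post, written without the gap.
READ_A = "CCGTGTGCGCCTCACCCCTGCATGGTGAGGACCTGCCGCACTCGCTGT"  # one A missing
READ_B = "CCGTGTGCGCCTCACCCCTGCATGGTGAGGACCTGCCCAACTCGCTGT"  # one G missing


def best_adapter_start(read: str, adapter: str, max_edits: int):
    """Start index of the best adapter match within max_edits edits, or None."""
    # Reversing both strings turns "where does the match end" into
    # "where does the match start" in the original read.
    r, a = read[::-1], adapter[::-1]
    prev = [0] * (len(r) + 1)  # empty adapter prefix matches anywhere at no cost
    for i in range(1, len(a) + 1):
        cur = [i] + [0] * len(r)
        for j in range(1, len(r) + 1):
            cur[j] = min(
                prev[j - 1] + (a[i - 1] != r[j - 1]),  # match or substitution
                prev[j] + 1,                           # adapter base missing from the read
                cur[j - 1] + 1,                        # extra base in the read
            )
        prev = cur
    # prev[j] is the edit distance of the whole adapter to the best substring
    # of r ending at j. Prefer the lowest distance, then the leftmost start.
    best = min(range(len(r) + 1), key=lambda j: (prev[j], -j))
    if prev[best] > max_edits:
        return None
    return len(read) - best


for read in (READ_A, READ_B):
    start = best_adapter_start(read, ADAPTER, max_edits=1)
    print(read if start is None else read[:start])
```

With one allowed edit both reads are cut just before the damaged adapter, while the same call with max_edits=0 finds nothing, since neither read contains an exact copy.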


I added markup to your post for increased readability. You can do this by selecting the text and clicking the 101010 button, which is in your toolbar when you compose or edit a post.
