Question

Trimming of RNA-seq reads

7.0 years ago

fatimarasool ▴ 10

I am working on the wheat genome and want to do a comparative genome analysis of three wheat varieties. I have sequence files in FASTQ format (Illumina 1.9 encoding) and checked the read quality with FastQC. The GC content (47-49%) is flagged as outside the normal range. What is the normal %GC value for RNA-seq reads? My other question is that the k-mer content is also flagged. How can I correct it? Trimming requires an adapter sequence file, but I don't have one. Is it possible to fix these two errors, and if so, how? Or can I skip the trimming step and go straight to mapping?

In this file all parameters pass except k-mer content and GC content. Is there any need to trim it? If yes, how can I do that?

file:///home/comsats-ra/fatimamphilldata/G1_cleaned_R1_fastqc.html#M11

RNA-Seq sequencing alignment • 3.2k views

updated 7.0 years ago by chen ★ 2.5k • written 7.0 years ago by fatimarasool ▴ 10

score 3 · Accepted Answer · 2017-12-07

Expect several FastQC modules to fail on RNA-seq datasets. The GC content should be similar across samples, but otherwise you can ignore a "Fail" from that module. Similarly, enriched k-mers are expected in RNA-seq data. You should not attempt to correct this; the data are already correct.
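To check that GC content is similar across samples without reopening each FastQC report, something like the following could work. It is a minimal sketch that assumes uncompressed FASTQ files in the standard 4-line record layout; `gc_percent` is a hypothetical helper name and the file names in the usage comment are placeholders.

```shell
# gc_percent FILE: overall %GC of an uncompressed FASTQ file.
# Assumes 4-line records, with the sequence on line 2 of each record.
gc_percent() {
  awk 'NR % 4 == 2 {
         total += length($0)              # bases in this read
         gc    += gsub(/[GCgc]/, "", $0)  # gsub returns the number of G/C replaced
       }
       END { if (total > 0) printf "%.2f\n", 100 * gc / total }' "$1"
}

# Usage (placeholder file names):
# for f in G1_R1.fastq G2_R1.fastq G3_R1.fastq; do
#   printf '%s\t%s\n' "$f" "$(gc_percent "$f")"
# done
```

If the three varieties give values within a percent or two of each other, the FastQC GC warning is probably not worth worrying about.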

You can trim reads with Trim Galore!, which has the default adapters built in. Having said that, it's quicker to just use STAR for alignment, in which case you don't need to bother trimming adapters, since STAR's default local alignment soft-clips read ends that do not match the reference.
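If you do want to trim, a paired-end Trim Galore call could be wrapped roughly as below. This is only a sketch: `run` and `trim_pair` are hypothetical helper names, the file names are placeholders, and `DRY_RUN=1` is an illustrative switch that prints the command instead of running it. With no adapter option given, Trim Galore tries to auto-detect the adapter type from the reads.

```shell
# run CMD...: execute the command, or just print it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# trim_pair R1 R2 OUTDIR: adapter- and quality-trim one paired-end sample.
# No adapter sequence is passed, so Trim Galore auto-detects it.
trim_pair() {
  run trim_galore --paired --output_dir "$3" "$1" "$2"
}

# Usage (placeholder file names):
# DRY_RUN=1 trim_pair G1_R1.fastq.gz G1_R2.fastq.gz trimmed   # preview
# trim_pair G1_R1.fastq.gz G1_R2.fastq.gz trimmed             # run for real
```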

score 2 · Accepted Answer · 2017-12-07

You can also look into DNApi, a de novo adapter prediction algorithm for small RNA sequencing data.

link: https://github.com/jnktsj/DNApi

DNApi predicts adapters iteratively and requires Python (2 or 3) under a Linux/Unix environment. It accepts compressed or uncompressed FASTQ files, or redirected standard input (stdin), as input. You can simply run:

$ python dnapi.py <fastq>

or

$ <process-generates-fastq> | python dnapi.py -

To see the detailed usage, type:

$ python dnapi.py [-h | --help]

DNApi can predict most 3′ adapters correctly with the default parameters. However, if you want to tweak the parameters or run other prediction modes, see [prediction modes and parameters](https://github.com/jnktsj/DNApi#prediction-modes-and-parameters) for more detail.

score 2 · Accepted Answer · 2017-12-07

You can use fastp to trim adapters from Illumina sequencing data without needing to know the adapter sequences.

Just download fastp and run:

fastp -i in.fq -o out.fq

That's all: out.fq contains the adapter-trimmed reads.

For paired-end data, the command looks like:

fastp -i in1.fq -o out1.fq -I in2.fq -O out2.fq

Gzip is supported for both input and output.
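As a quick sanity check after trimming, you could compare read counts before and after. The sketch below assumes 4-line FASTQ records; `count_reads` is a hypothetical helper name that handles both plain and gzipped files. Since fastp also applies quality and length filtering by default, a modest drop in read count is expected.

```shell
# count_reads FILE: number of reads in a plain or gzipped FASTQ file.
# Assumes 4 lines per record.
count_reads() {
  case "$1" in
    *.gz) gzip -cd "$1" ;;
    *)    cat "$1" ;;
  esac | awk 'END { print NR / 4 }'
}

# Usage (file names from the commands above):
# echo "before: $(count_reads in.fq)  after: $(count_reads out.fq)"
```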