I am working on the wheat genome and want to do a comparative genome analysis of 3 wheat varieties. I have sequence files from Illumina 1.9 in FASTQ format. I checked the read quality with FastQC. The GC content (47-49%) is flagged as outside the normal range. What is the normal %GC for RNA-seq reads? My other question is that the k-mer content is also flagged as out of range. How can I correct it?
For trimming, an adapter sequence file is required, but I don't have one. Is it possible to fix these two errors? If yes, how? Can I skip the trimming step and go straight to mapping?
In this file all parameter values are fine except k-mers and GC content. Is there any need to trim it? If yes, how?
One expects many failed FastQC modules in RNA-seq datasets. GC content should be similar across samples, but otherwise ignore a "Fail" in FastQC there. Similarly, you expect enriched k-mers. You should not attempt to correct this; the data are already fine as they are.
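If you want to check that GC content really is similar across your samples, here is a minimal Python sketch. It computes the overall %GC over all bases in each FASTQ file, which is a simplification and not necessarily the exact calculation FastQC uses. The function names are made up for illustration, and it assumes standard 4-line FASTQ records.

```python
import gzip
import sys
from itertools import islice


def open_fastq(path):
    """Open a plain or gzip-compressed FASTQ file in text mode."""
    return gzip.open(path, "rt") if path.endswith(".gz") else open(path, "rt")


def read_sequences(lines):
    """Yield the sequence line of every 4-line FASTQ record."""
    # The bases are on the 2nd line of each record: index 1, then every 4th line.
    for seq in islice(lines, 1, None, 4):
        yield seq.strip().upper()


def gc_percent(lines):
    """Return the overall %GC across all reads, ignoring N bases."""
    gc = total = 0
    for seq in read_sequences(lines):
        gc += seq.count("G") + seq.count("C")
        total += len(seq) - seq.count("N")
    return 100.0 * gc / total if total else 0.0


if __name__ == "__main__":
    # Print one tab-separated line per file given on the command line.
    for path in sys.argv[1:]:
        with open_fastq(path) as handle:
            print(f"{path}\t{gc_percent(handle):.1f}")
```

Saved under a name of your choice (say `gc_check.py`, a hypothetical filename), you would run it as `python gc_check.py sample1.fastq.gz sample2.fastq.gz sample3.fastq.gz` and compare the three values. If they sit within a point or two of each other, there is nothing to fix.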
You can trim reads with Trim Galore!, which has the default adapters built in, so you don't need a separate adapter file. Having said that, it's quicker to just use STAR for alignment: it soft-clips read ends that don't match the reference, so you don't need to bother trimming adapters.
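To get a feel for whether adapters are worth worrying about at all, a rough Python sketch like the one below counts how many reads contain the start of the standard Illumina adapter. It assumes your libraries use the common Illumina adapter, whose first 13 bases (`AGATCGGAAGAGC`) are what Trim Galore looks for when auto-detecting; Nextera or small RNA kits use different sequences. It only looks for exact matches, so treat the result as a lower bound. The function name and the read cap are made up for illustration.

```python
import gzip
import sys
from itertools import islice

# First 13 bases of the standard Illumina adapter (assumed library type).
ILLUMINA_ADAPTER_PREFIX = "AGATCGGAAGAGC"


def adapter_fraction(lines, adapter=ILLUMINA_ADAPTER_PREFIX, max_reads=1_000_000):
    """Return the fraction of reads containing an exact copy of `adapter`.

    Only the first `max_reads` records are inspected to keep it quick.
    """
    hits = total = 0
    # Sequence lines are the 2nd line of every 4-line FASTQ record.
    sequences = islice(lines, 1, None, 4)
    for seq in islice(sequences, max_reads):
        total += 1
        if adapter in seq.upper():
            hits += 1
    return hits / total if total else 0.0


if __name__ == "__main__":
    for path in sys.argv[1:]:
        opener = gzip.open if path.endswith(".gz") else open
        with opener(path, "rt") as handle:
            print(f"{path}\t{adapter_fraction(handle):.4f}")
```

A small fraction means trimming will change very little, which fits with the advice to go straight to alignment.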
DNApi is a de novo (iterative) adapter prediction algorithm for small RNA sequencing data. DNApi requires Python (2 or 3) under a Linux/Unix environment. It accepts compressed or uncompressed FASTQ files, or redirected standard input (stdin), as input. You can simply run:
$ python dnapi.py <fastq>
or
$ <process-generates-fastq> | python dnapi.py -
To see the detailed usage, type:
$ python dnapi.py [-h | --help]
DNApi can predict most 3′ adapters correctly with the default parameters. However, if you want to tweak the parameters or run other prediction modes, see [prediction modes and parameters](https://github.com/jnktsj/DNApi#prediction-modes-and-parameters) for more details.