I'm using the Trim Galore tool for quality and adapter trimming, but I found no option to remove poly-T stretches from sequencing data produced by poly-A-enriched libraries. Could anybody please let me know how I should remove poly-T from the data? Thanks
written 9.5 years ago by seta • updated 2.0 years ago by Ram
Have you tried mapping the RNA-seq data yet? It may not be necessary to remove poly-A stretches because they won't map to a unique location and you may already have enough reads to not worry about it. Another poster asked a similar question: Statistics About Poly-A Tails In Rnaseq Reads.
It is not clear-cut that trimming poly-A tails is beneficial. Are you certain you need to trim the poly-A tail? As Jason pointed out, many short-read mappers won't be affected by poly-A tails. For assembly, it may help discern splice variants (see the comment in MIRA's manual).
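If you want a rough number before deciding whether trimming is worth it, something like the following could do. It is only a minimal sketch: the function name, the `min_run=10` threshold, and the toy reads are my assumptions, and it only looks for a poly-A run at the 3' end or a poly-T run at the 5' end of each read.

```python
import re
from itertools import islice

def poly_tail_fraction(lines, min_run=10):
    """Fraction of reads ending in >= min_run A's or starting with >= min_run T's.

    `lines` is any iterable of FASTQ lines (4 lines per record).
    The threshold min_run=10 is an arbitrary assumption, not a recommendation.
    """
    tail_a = re.compile(r"A{%d,}$" % min_run)
    head_t = re.compile(r"^T{%d,}" % min_run)
    total = hits = 0
    it = iter(lines)
    while True:
        record = list(islice(it, 4))  # header, sequence, '+', qualities
        if len(record) < 4:
            break
        seq = record[1].strip().upper()
        total += 1
        if tail_a.search(seq) or head_t.search(seq):
            hits += 1
    return hits / total if total else 0.0

# Toy records (made up for illustration): r1 ends in poly-A, r3 starts with poly-T.
example = [
    "@r1", "ACGTACGTAAAAAAAAAAAA", "+", "I" * 20,
    "@r2", "ACGTACGTACGTACGTACGT", "+", "I" * 20,
    "@r3", "TTTTTTTTTTTTACGTACGT", "+", "I" * 20,
    "@r4", "ACGTAAAACGTACGTACGTA", "+", "I" * 20,
]
fraction = poly_tail_fraction(example)
print(fraction)
```

For a real file you could pass an open file handle (or `gzip.open(path, "rt")`) instead of the list. If the fraction is small, mapping first and ignoring the tails may be good enough.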
Automatic Filtering, Trimming, and Error Removing for FASTQ data
It currently supports the Illumina 1.8 or newer format.
AFTER can simply go through all FASTQ files in a folder and write a good folder and a bad folder, which contain the good reads and the bad reads of each FASTQ file, respectively.
Besides removing polyX, it can also:
Trim reads at the front and tail according to bad per-base sequence content
Detect and eliminate bubble artifacts caused by fluid dynamics issues in the sequencer
Filter low-quality reads
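To make "removing polyX" concrete, here is a minimal sketch of trimming a homopolymer run from the 3' end of a read. This is not AFTER's actual implementation, just an illustration of the idea; the function name and the `min_run` threshold are assumptions.

```python
def trim_polyx_tail(seq, qual, min_run=10):
    """Remove a 3' homopolymer run of at least min_run bases from seq and qual.

    Illustrative only: real tools usually also tolerate mismatches inside the run.
    """
    if not seq:
        return seq, qual
    last = seq[-1]
    # Length of the run of identical bases at the 3' end.
    run = len(seq) - len(seq.rstrip(last))
    if run >= min_run:
        return seq[:-run], qual[:-run]
    return seq, qual

trimmed = trim_polyx_tail("ACGT" + "A" * 12, "I" * 16)
untouched = trim_polyx_tail("ACGTAAA", "I" * 7)
print(trimmed, untouched)
```

A poly-T stretch at the 5' end could be handled the same way on the reversed strings.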
written 9.4 years ago by chen • updated 2.0 years ago by Ram
You can use bbduk from the bbmap suite. Just create a polyA.fa (e.g. >polyA\nAAAAAAAAAAAAA), gzip it into the bbmap resources folder, then run bbduk with ref=resources/polyA.fa.gz. It may be necessary to add a polyT sequence to your FASTA as well.
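The bbduk steps above can be sketched roughly as follows. This only writes the gzipped reference and builds the command line without running it. `in=`, `out=`, `ref=`, `ktrim=` and `k=` are BBDuk parameters, but the values chosen here (`ktrim=r`, `k=13`), the read file names, and the helper names are my assumptions. Note that `ktrim=r` removes the match and everything to its right, so a poly-T stretch at the start of a read would probably need its own pass with `ktrim=l`.

```python
import gzip
import tempfile
from pathlib import Path

def write_poly_reference(path, base="A", length=13):
    """Write a gzipped one-entry FASTA such as >polyA followed by 13 A's."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    with gzip.open(path, "wt") as handle:
        handle.write(f">poly{base}\n{base * length}\n")
    return path

def bbduk_command(reads_in, reads_out, ref, ktrim="r", k=13):
    """Build (but do not run) a bbduk.sh command line; parameter values are assumptions."""
    return [
        "bbduk.sh",
        f"in={reads_in}",
        f"out={reads_out}",
        f"ref={ref}",
        f"ktrim={ktrim}",  # r: trim the match and everything to its right
        f"k={k}",          # k-mer length; must not exceed the reference length
    ]

# Written to a temporary directory here; the post above uses the bbmap resources folder.
workdir = Path(tempfile.mkdtemp())
ref = write_poly_reference(workdir / "resources" / "polyA.fa.gz")
cmd = bbduk_command("reads.fq.gz", "trimmed.fq.gz", ref)  # hypothetical file names
print(" ".join(cmd))
```

You could hand `cmd` to `subprocess.run` once you are happy with the parameters.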
I'm quite sure you can likewise add the sequences to the adapter file of trim_galore.
It performs poly-N trimming as well as quality trimming, searching the whole read for the best large fragment. It retains a high percentage of bases and is pretty effective.
Other posts may also help, such as: Trim Poly-A Tail And Trailing Nucleotides