Trim contaminated reads for RNA-seq
22 months ago
AlonsoCid • 0

Hello friends,

I'm doing an RNA-seq experiment. According to Kraken, 12% of my reads belong to other organisms, and I need to remove them before continuing. I'm working with an organism that has never been used before in this field, so I don't have a reference genome and need to do everything from scratch. Kraken lets me detect the contaminating reads but not delete them. Do you know of any other tool?

I would like to use the Kraken database.

contamination Trimming RNA-seq • 1.0k views
22 months ago
kashiff007 ★ 1.9k

"12% contamination" is too general to act on. The first step is to check for adapters and, if they are present, remove them with the FASTX-Toolkit, Trim Galore, Trimmomatic, or a similar tool; the list is long. Next, check the reads' average quality. Read ends often have poor quality (on Illumina data, typically the 3' end), so trim a few bases there. You can also filter out reads whose quality is poor over their whole length. All of these initial steps can be done with any of the read-processing tools mentioned above.

Coming to your next concern: you don't have an assembled genome for mapping. You can use a de novo approach, which does not require a reference genome to reconstruct the transcriptome and is typically used when the genome is unknown, incomplete, or substantially altered compared to the reference. For a better-quality analysis, though, you will need a reference for RNA-seq. You can build your own reference genome (initially of low quality, but good enough for RNA-seq) from WGS reads or any high-throughput sequencing data that covers the whole genome.
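
To illustrate the quality trimming and filtering steps described in this answer, here is a minimal Python sketch. It is not how FASTX-Toolkit, Trim Galore or Trimmomatic work internally; the function names, the Phred+33 encoding and the thresholds (`min_q`, `min_mean_q`, `min_len`) are all assumptions chosen for the example.

```python
def read_fastq(lines):
    """Yield (header, seq, qual) from an iterable of FASTQ lines (4 per record)."""
    it = iter(lines)
    # zip over the same iterator consumes four lines at a time;
    # a trailing incomplete record is silently ignored.
    for header, seq, _plus, qual in zip(it, it, it, it):
        yield header.rstrip("\n"), seq.rstrip("\n"), qual.rstrip("\n")


def trim_ends(seq, qual, min_q=20, offset=33):
    """Trim bases from both read ends while their Phred score is below min_q."""
    start, end = 0, len(seq)
    while end > start and ord(qual[end - 1]) - offset < min_q:
        end -= 1
    while start < end and ord(qual[start]) - offset < min_q:
        start += 1
    return seq[start:end], qual[start:end]


def mean_quality(qual, offset=33):
    """Mean Phred score of a quality string (0.0 for an empty read)."""
    if not qual:
        return 0.0
    return sum(ord(c) - offset for c in qual) / len(qual)


def clean_reads(records, min_q=20, min_mean_q=20, min_len=30):
    """Trim each read, then keep it only if it is long and good enough."""
    for header, seq, qual in records:
        seq, qual = trim_ends(seq, qual, min_q)
        if len(seq) >= min_len and mean_quality(qual) >= min_mean_q:
            yield header, seq, qual
```

With a file on disk this could be used as `clean_reads(read_fastq(open("reads.fastq")))`, where `reads.fastq` is a placeholder name. For real data a dedicated trimming tool is the safer choice, since it also handles adapters and paired reads.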


Thanks for your answer. I have already trimmed them. The problem is that, of the remaining high-quality reads, 12% belong to bacteria and fungi, not to my organism, and I would like to delete those reads. We are using wild mussels, and they are full of bacteria and fungi.

Should I just continue the process with the contamination?


The best way to remove such contamination (reads from other microorganisms) is to map your FASTQ reads to a reference genome for your wild mussels; the mapped reads in the SAM/BAM output are your reads of interest.
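
As a rough sketch of the "keep only the mapped reads" idea, the Python below collects the names of reads that aligned, based on the SAM FLAG field (bit 0x4 means "unmapped"), and then keeps the matching FASTQ records. The function names are made up for the example, and it assumes single-end reads whose FASTQ names match the SAM QNAME column.

```python
def mapped_read_names(sam_lines):
    """Return the names of reads that have at least one mapped alignment."""
    names = set()
    for line in sam_lines:
        if line.startswith("@"):  # SAM header line
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:  # not a complete alignment record
            continue
        flag = int(fields[1])
        if not flag & 0x4:  # bit 0x4 set means the read is unmapped
            names.add(fields[0])
    return names


def keep_reads(fastq_lines, names):
    """Yield the 4-line FASTQ records whose read name is in `names`."""
    it = iter(fastq_lines)
    for header, seq, plus, qual in zip(it, it, it, it):
        # Read name = header without the leading '@', up to the first space.
        if header[1:].split()[0] in names:
            yield header, seq, plus, qual
```

In practice `samtools view -F 4` does the same filtering directly on a SAM/BAM file; the sketch is only meant to show what is being selected.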

If you don't have a mussel genome, you can use BLAST to search all the reads from your FASTQ against the nr database. Then remove the reads that hit any bacteria or fungi (or specific named taxa). The remaining reads (which do not hit any bacteria or fungi but may hit other organisms) can be used for further analyses.
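
The remove-by-hit step can be sketched in Python as well. Since the question already has Kraken results, this example swaps BLAST for Kraken's per-read output (tab-separated: `C`/`U`, read ID, taxonomy ID, ...) and drops every classified read, or only reads assigned to a chosen set of taxonomy IDs. The function names are hypothetical, and it assumes the read IDs in the Kraken output match the FASTQ headers and that the taxonomy ID column is a plain number.

```python
def classified_read_ids(kraken_lines, taxids=None):
    """Collect IDs of reads Kraken classified ('C' in column 1).

    If `taxids` is given (a set of taxonomy IDs as strings), only reads
    assigned exactly to one of those IDs are collected.
    """
    ids = set()
    for line in kraken_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3 or fields[0] != "C":
            continue
        if taxids is None or fields[2] in taxids:
            ids.add(fields[1])
    return ids


def drop_reads(fastq_lines, drop_ids):
    """Yield the 4-line FASTQ records whose read name is NOT in `drop_ids`."""
    it = iter(fastq_lines)
    for header, seq, plus, qual in zip(it, it, it, it):
        # Read name = header without the leading '@', up to the first space.
        if header[1:].split()[0] not in drop_ids:
            yield header, seq, plus, qual
```

Note that matching on exact taxonomy IDs does not walk the taxonomy tree, so removing "all bacteria and fungi" this way needs either the full list of IDs or a database that contains only the unwanted groups. Kraken also has an `--unclassified-out` option that writes the unclassified reads to a file, which may already be enough if everything it classifies is unwanted.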

I would not recommend continuing the analysis with the contamination still in the data.


Thanks again, I'm using BBSplit to get rid of the exogenous reads. Let's see if I manage to make it work.
