Paired-end sequence assembly with Trinity (data mining doubt)
9.6 years ago

Hello everyone, I'm a Master's student from Spain and this is my very first project in bioinformatics and RNA-Seq, so please excuse me if my questions are too basic or not clearly explained.

I have two files containing 100 bp paired-end reads from Illumina RNA-Seq and I want to assemble them into a de novo transcriptome using Trinity. So far so good, but I have some doubts about how the two files (containing the forward (F) and reverse (R) reads) are combined: is a "consensus" sequence built from each pair for the k-mer dictionary construction and the downstream steps? (Actually, I don't really know whether such a "consensus" sequence is formed at all when the Inchworm step of the assembly runs.)

My main doubt is whether the F and R reads of each 100 bp pair need to be the same length. I ask because the first 9-10 bases of each read have poor per-position quality, and if I trim them I don't know whether it will be a disaster (because I don't know whether Trinity aligns the F and R reads, or whether it just converts the R reads to their reverse complement and extracts the k-mers from the F and the converted R reads independently).

I know it's a bit messy, but I would be very grateful if anyone could help me.

RNA-Seq next-gen Assembly • 2.0k views
9.6 years ago
seta ★ 1.9k

Hi, if your two files contain the R and F reads separately, you can either combine them or pass them separately, depending on the Trinity command-line options you use. You should evaluate the quality of your data and trim the poor-quality bases first. Having poor-quality bases at the beginning of the reads is normal, and you can trim them to get a better assembly, so there is no need to worry about it.
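On the question of read lengths: as far as I understand, Inchworm builds its k-mer catalog from each read on its own rather than from a merged F/R consensus, so reads of different lengths are not a problem at that stage. Below is a minimal Python sketch of that idea, not Trinity's actual code. The function names are made up for illustration, k = 25 is assumed here (I believe that is Trinity's default), and collapsing each k-mer with its reverse complement is a simplification for non-strand-specific data.

```python
from collections import Counter

# Translation table for complementing DNA bases (N stays N).
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def kmers(seq, k):
    """Return all overlapping k-mers of seq (empty list if seq is shorter than k)."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def count_kmers(reads, k=25):
    """Count k-mers over a collection of reads, one read at a time.

    Each read contributes len(read) - k + 1 k-mers, whatever its length,
    so a trimmed 90-base read simply contributes fewer k-mers than an
    untrimmed 100-base read. Assumption: a k-mer and its reverse
    complement are counted as the same (canonical) k-mer.
    """
    counts = Counter()
    for read in reads:
        for kmer in kmers(read.upper(), k):
            canonical = min(kmer, reverse_complement(kmer))
            counts[canonical] += 1
    return counts


# Usage (hypothetical reads of unequal length):
#   counts = count_kmers([forward_read_100bp, reverse_read_90bp], k=25)
```

The point of the sketch is only that nothing in this kind of k-mer counting requires the F and R reads to have equal lengths.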


Thank you so much for answering, Seta! It's difficult to know what you are really doing and how it affects your results when working with such huge data sets.

I think I will trim the beginning of the reads in both files (F and R reads) with the FASTQ Trimmer by column tool implemented in Galaxy and then run Trinity. I hope it helps improve the results.
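For anyone who prefers a script over Galaxy, the same trimming step could be sketched in Python roughly like this. It assumes plain, uncompressed FASTQ with exactly four lines per record; the function name, the file names in the usage comment, and the default of 10 bases are hypothetical choices, not anything prescribed by Trinity or Galaxy.

```python
def trim_fastq_5prime(in_path, out_path, n_bases=10):
    """Remove the first n_bases from every read in a 4-line-per-record FASTQ file.

    The sequence and quality lines are trimmed by the same amount so they
    stay the same length. Returns the number of records written.
    """
    n_records = 0
    with open(in_path) as fin, open(out_path, "w") as fout:
        while True:
            header = fin.readline()
            if not header:  # end of file
                break
            seq = fin.readline().rstrip("\n")
            plus = fin.readline()
            qual = fin.readline().rstrip("\n")
            fout.write(header)
            fout.write(seq[n_bases:] + "\n")
            fout.write(plus)
            fout.write(qual[n_bases:] + "\n")
            n_records += 1
    return n_records


# Usage (hypothetical file names), applied to both files of the pair:
#   trim_fastq_5prime("reads_F.fastq", "reads_F.trimmed.fastq", n_bases=10)
#   trim_fastq_5prime("reads_R.fastq", "reads_R.trimmed.fastq", n_bases=10)
```

Because every record is kept and only shortened, the F and R files stay in the same order, so the read pairs remain matched.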

Thank you again.
