Difficulty understanding SOAPdenovo2 parameters
8.1 years ago
eischzj12 • 0

Hello,

I'm currently trying to construct contigs using DNA libraries and am stuck on a particular parameter within the SOAPdenovo2 configuration file. I'm having a hard time understanding what to set "rd_len_cutoff" to. I've read the instructions from the website and they weren't very thorough, at least within my spectrum of understanding. Specifically, the instructions say that rd_len_cutoff tells the assembler what length to cut the reads from the current library to.

How do I determine the ideal length? I apologize if there is already a post that explains this somewhere on here, but I couldn't find one that was more thorough than the source below (I took my time to look before posting).

I also had the same problem for the "rank" parameter.

Thanks for your time!

http://soap.genomics.org.cn/soapdenovo.html

Assembly soapdenovo2 • 2.4k views

Can you post some information about the data you have? Especially the read length (after trimming and adaptor removal).

rd_len_cutoff: the assembler cuts the reads from the current library to this length, i.e. it is the position after which SOAPdenovo trims off all remaining bases.
So if your read length is 200 and you set rd_len_cutoff = 150, only the first 150 bp of each read are kept.
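
To make the 200 vs. 150 example concrete, here is a minimal Python sketch of what the cutoff does to a read. It is only an illustration of the behaviour described above, not SOAPdenovo's own code; the function name `apply_rd_len_cutoff` is made up for this example.

```python
def apply_rd_len_cutoff(read, cutoff):
    """Keep only the first `cutoff` bases; shorter reads are left unchanged."""
    return read[:cutoff]


# Two made-up reads: one of 200 bp and one of 120 bp.
reads = ["A" * 200, "C" * 120]
trimmed = [apply_rd_len_cutoff(read, 150) for read in reads]

print([len(read) for read in trimmed])  # [150, 120]
```

Note that a read already shorter than the cutoff is untouched, which is why setting the cutoff to the longest read length means nothing gets trimmed.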


Do you mean after trimming/removing adaptors from something like Trimmomatic? I did that and was able to generate a FastQC report; would I be able to get the information you're asking about from that? Or when you say length do you mean genome size estimate?

And I understand that soapdenovo trims all bases after a particular position, but I'm not sure that I understand the utility in it. So from your example, you have an example read length of 200, why would you set a rd_len_cutoff to 150? Apologies once again for the confusion, I'm an undergrad with limited knowledge about this topic.


The question about length was about your library preparation info; the 200 was just an example. In conclusion, after trimming, set this parameter to the length of the longest read you have.

8.1 years ago
Rohit ★ 1.5k

As Medhat already mentioned, rd_len_cutoff is the length to which the reads are trimmed. It is usually chosen based on data quality; the value I use is the length of the longest read, so that no reads are trimmed further.
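
If you want to check the longest read length yourself, something like the following Python sketch would do it. It assumes a plain four-line-per-record FASTQ layout (no wrapped sequence lines); the function name `read_length_counts` and the example reads are made up for illustration.

```python
from collections import Counter


def read_length_counts(fastq_lines):
    """Count read lengths, assuming 4 lines per FASTQ record (sequence on line 2)."""
    counts = Counter()
    for line_number, line in enumerate(fastq_lines):
        if line_number % 4 == 1:  # the sequence line of each record
            counts[len(line.strip())] += 1
    return counts


# Tiny in-memory example with two made-up reads of 10 bp and 6 bp.
example = [
    "@read1", "ACGTACGTAC", "+", "IIIIIIIIII",
    "@read2", "ACGTAC", "+", "IIIIII",
]
counts = read_length_counts(example)

print(max(counts))  # 10
```

For a real file you would pass an open file handle instead of the list, e.g. `with open("trimmed_1.fq") as handle:` (hypothetical file name). The "Sequence length" entry in the Basic Statistics section of a FastQC report gives the same range of lengths.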

The rank parameter sets the order in which the read libraries are considered. For example, a library with rank 1 is used first for scaffolding, followed by rank 2, and so on. Multiple libraries can share the same rank so that they are used at the same time.
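
As a rough sketch of how rank and rd_len_cutoff sit together in a config file, the Python below writes out two [LIB] sections. The key names (max_rd_len, avg_ins, reverse_seq, asm_flags, rd_len_cutoff, rank, q1, q2) are taken from the SOAPdenovo2 manual as I read it; the insert sizes, cutoffs, and file names are invented, and the helper `render_config` is hypothetical.

```python
LIB_KEYS = ("avg_ins", "reverse_seq", "asm_flags", "rd_len_cutoff", "rank", "q1", "q2")


def render_config(max_rd_len, libraries):
    """Build config text with one [LIB] section per library dictionary."""
    lines = [f"max_rd_len={max_rd_len}"]
    for lib in libraries:
        lines.append("[LIB]")
        for key in LIB_KEYS:
            lines.append(f"{key}={lib[key]}")
    return "\n".join(lines) + "\n"


# Hypothetical libraries: a short-insert paired-end library used first (rank 1)
# and a longer-insert library used afterwards for scaffolding (rank 2).
libraries = [
    {"avg_ins": 200, "reverse_seq": 0, "asm_flags": 3, "rd_len_cutoff": 100,
     "rank": 1, "q1": "pe200_1.fq", "q2": "pe200_2.fq"},
    {"avg_ins": 2000, "reverse_seq": 1, "asm_flags": 2, "rd_len_cutoff": 50,
     "rank": 2, "q1": "mp2k_1.fq", "q2": "mp2k_2.fq"},
]

config_text = render_config(100, libraries)
print(config_text)
```

A common pattern is to rank libraries from the smallest insert size to the largest, and to give the same rank to libraries of similar insert size so they are scaffolded together; treat the numbers here as placeholders for your own data.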


How do I determine the length of the longest-read? Is that something I can find in the FastQC report?

Also for rank, if multiple libraries can have the same rank in order to be used at the same time, then why bother considering one before the other?
