Hello, I ran an assembly using SPAdes and got a set of contigs.
I would like to know whether these contigs are forward (5' to 3'), reverse (3' to 5'), or a mixture of forward and reverse. Thank you.
Neither sequencing methods nor the assembly programs have the concept of forward or reverse. DNA pieces are sequenced in random orientations, so R1
reads are called forward just for convenience. It doesn't mean that all of them are on the same strand of DNA.
Real DNA is a double-stranded molecule; each end has one strand's 5' end and the other strand's 3' end.
We of course write out DNA sequences by writing out just one strand, but which way is forward is largely arbitrary. It's not part of the chemistry of the DNA molecules, so no software is going to be able to infer an arbitrary human convention from sequencing data.
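To make the point about strands concrete, here is a minimal Python sketch. The function name `reverse_complement` and the example sequence are made up for illustration, and the code assumes plain A/C/G/T bases with no ambiguity codes. A contig and its reverse complement describe the same double-stranded molecule, so either one is an equally valid way to write it down.

```python
# Complement table for unambiguous DNA bases (assumption: no IUPAC ambiguity codes).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the opposite strand of seq, written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


# Hypothetical contig, written 5' to 3' as the assembler happened to emit it.
contig = "ATGGCGTTAC"

# The other strand of the same molecule, also written 5' to 3'.
other_strand = reverse_complement(contig)
print(other_strand)  # GTAACGCCAT

# Taking the reverse complement twice gives back the original sequence,
# so neither representation is more "forward" than the other.
print(reverse_complement(other_strand) == contig)  # True
```

Which of the two strings ends up in the FASTA file is a matter of convention (for example, matching a reference or the coding strand of a gene), not something the reads themselves determine.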
A mixture. Assembly methods generally extend contigs in either direction unless you are using a reference-guided assembly.
Just for convenience? Really? Mapping tools take the reverse complement of R2, so I can't understand what you mean by "just for convenience".
Forward (R1) and reverse (R2) have meaning only with regard to that particular read - not with regard to the whole DNA molecule. You can have a piece of DNA sequenced in a particular forward-reverse orientation, and then another read pair may have the same piece of DNA in reverse orientation.
Really! And here are a couple more exclamations to further stress my point!!
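The idea that forward and reverse only have meaning within a single read pair can be sketched in a few lines of Python. This is a toy model, not how any real sequencer or simulator is implemented: the function `read_pair`, the `flipped` flag, and the example fragment are all hypothetical names chosen for illustration.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the opposite strand of seq, written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def read_pair(fragment: str, read_len: int, flipped: bool) -> tuple[str, str]:
    """Toy model of one paired-end read pair.

    `flipped` stands in for the fact that a fragment can end up being
    sequenced starting from either of its two strands.
    """
    template = reverse_complement(fragment) if flipped else fragment
    r1 = template[:read_len]                      # read 1 starts at one end
    r2 = reverse_complement(template)[:read_len]  # read 2 starts at the other end
    return r1, r2


fragment = "AAACCCGGGTTTACGTACGTTTGACCGA"  # hypothetical 28 bp fragment

pair_a = read_pair(fragment, 6, flipped=False)
pair_b = read_pair(fragment, 6, flipped=True)

# Same piece of DNA, but the roles of R1 and R2 are swapped.
print(pair_a)
print(pair_b)
print(pair_a == pair_b[::-1])
```

In this model the two calls return the same two sequences with R1 and R2 exchanged, which is why an R1 file contains reads from both strands of the genome.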
I see.
So, without looking at forward or reverse, can we say that R1 = reverse_complement_of(R2)? I did paired-end sequencing and got R1 and R2, and I would like to merge them as R = R1 + reverse_complement_of(R2) so that I can use R in single-end tools.
For contigs and scaffolds, can I say that all contigs or scaffolds generated by SPAdes are in the same orientation?
Thank you!!
Because of how the chemistry of Illumina library preps works, each read reads inwards from one end of the fragment, so the two reads run in opposite directions. And if you are implying that read 1 should literally be the rev-comp of read 2, why would anyone pay to sequence the exact same thing twice? In most applications, read 1 and read 2 are separated by an unknown number of bases, so you can't just put them end to end.
If you are absolutely sure that the tool you want will not take paired end data, then throw away read 2, and hope that someone has a good answer as to why you paid for data you aren't using.
And putting your question in bold will not change the answer. No, not all scaffolds will be in the same orientation.
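A small Python sketch may help show why R1 + reverse_complement(R2) is not a real sequence. It assumes a made-up 28 bp fragment and 8 bp reads; `simulate_pair` is a hypothetical helper written for this example, not part of any tool mentioned in this thread.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the opposite strand of seq, written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def simulate_pair(fragment: str, read_len: int) -> tuple[str, str]:
    """Toy paired-end model: each read runs inward from one end of the fragment."""
    r1 = fragment[:read_len]
    r2 = reverse_complement(fragment)[:read_len]
    return r1, r2


fragment = "AAACCCGGGTTTACGTACGTTTGACCGA"  # hypothetical fragment, 28 bp
read_len = 8
r1, r2 = simulate_pair(fragment, read_len)

# R2 is the reverse complement of the far end of the fragment, not of R1.
print(r1 == reverse_complement(r2))                   # False
print(reverse_complement(r2) == fragment[-read_len:])  # True

# Gluing the reads together skips the unsequenced middle of the fragment.
merged = r1 + reverse_complement(r2)
unsequenced = len(fragment) - 2 * read_len
print(merged)              # AAACCCGGTTGACCGA
print(merged in fragment)  # False: this 16-mer does not exist in the fragment
print(unsequenced)         # 12 bases between the reads are simply missing
```

In real data the gap length varies from pair to pair and is not known in advance. When the reads do overlap, a dedicated read-merging tool is the usual approach rather than plain concatenation.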
We use the data in many kinds of analysis, and this time we would like to use it in single-end tools. We can't choose between R1 and R2, so we make R1+complement_rev(R2).
You do as you wish, but I would never use a tool that prefers or requires single-end data when paired-end data is available. Like @swbarnes2 said, what would be the purpose of throwing away half the data?
If I understand correctly what you are trying to do, taking R1+complement_rev(R2) is not preserving the information either. In paired data, R1 and R2 are on opposite strands and are known to be separated by a relatively small number of bases. If you use R1+complement_rev(R2), that will not preserve any information about the relative position (relative closeness) of those two reads.
I will say it one more time, and promise this will be the last: I don't see a sound rationale for what you are trying to do, because you are bound to lose some information compared to using paired-end data as intended. But hey, do as you wish.