Question

Which is better: PacBio or Illumina for de novo transcriptome?

4

3.5 years ago

karenkvn ▴ 40

I am obtaining quotes for transcriptome sequencing. I see that researchers often use both PacBio and Illumina to obtain and correct their transcriptome, but I only have the funds to do one of these.

My aim is to produce a de novo assembly for a eukaryote with no reference genome - a protist with a very large genome.

There seems to be a lot of conflicting advice out there. I have been told that PacBio Iso-Seq is great for transcriptomes as it produces whole transcripts. I am puzzled about why it is recommended above Illumina for de novo assembly, as it seems that the high indel rate will make it difficult to predict and annotate genes. I am concerned that there will be a large number of genes that I will miss as they will have poor database matches.

The quotes that I have obtained are similar for PacBio and Illumina NextSeq (2 x 75 bp, 150 cycles).

Any advice or opinions would be welcome!

Thank you.

Karen

transcriptome PacBio Illumina • 2.3k views

updated 3.5 years ago by Friederike 9.0k • written 3.5 years ago by karenkvn ▴ 40

score 4 · Answer 1 · 2021-06-04

PacBio Iso-Seq is based on oligo-dT priming, so we generally do not see rRNA being an issue. https://www.pacb.com/applications/rna-sequencing/

There have been many studies that use Iso-Seq in the absence of a reference genome, not only for genome annotation but also to provide a reference transcriptome that can be coupled with multi-sample RNA-seq data for DE studies.

Examples: garlic https://www.pacb.com/blog/garlic-study/

ant brain https://www.biorxiv.org/content/10.1101/2021.04.20.440671v1

purslane https://www.mdpi.com/2223-7747/10/4/655

On Sequel II(e) systems, Iso-Seq generates full-length isoforms with >99% accuracy, so ORF prediction can be done directly.
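To make the point about ORF prediction and indels concrete, here is a minimal sketch of a six-frame longest-ORF scan. It is an illustration only, not what any Iso-Seq pipeline actually runs; the function names (`reverse_complement`, `longest_orf`) and the toy sequences are made up for this example, and it assumes the standard genetic code with ATG starts, which may not hold for every protist.

```python
# Toy six-frame ORF scan. Assumes standard stop codons and ATG starts;
# some protists use alternative genetic codes, so treat this as a sketch.
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def longest_orf(seq):
    """Return (strand, start, end, length) of the longest ATG-to-stop ORF.

    Coordinates are 0-based on the scanned strand, end is exclusive and
    includes the stop codon. Returns None if no complete ORF is found.
    """
    seq = seq.upper()
    best = None
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # first in-frame start codon opens an ORF
                elif start is not None and codon in STOP_CODONS:
                    length = i + 3 - start
                    if best is None or length > best[3]:
                        best = (strand, start, i + 3, length)
                    start = None  # look for the next ORF in this frame
    return best


# Hypothetical toy transcript and the same sequence with one base deleted.
intact = "CCATGAAATTTGGGTAACC"
with_deletion = "CCATGAATTTGGGTAACC"

print(longest_orf(intact))
print(longest_orf(with_deletion))
```

Comparing the two calls shows why the per-read error rate matters: a single uncorrected deletion shifts the reading frame, and the complete ORF found in the intact sequence is no longer recovered.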

If you have any further questions, feel free to contact me at etseng@pacb.com or via Twitter @Magdoll

score 2 · Answer 2 · 2021-06-03

3.5 years ago

Friederike 9.0k

I'd say it depends a bit on how long the transcripts typically are and how much alternative splicing is going on, i.e. how many isoforms you expect to see for the same gene. For highly diverse transcriptomes with lots of alternative splicing, I'd definitely not bank on Illumina alone: the reads are just too short and they don't span exon-exon junctions nearly often enough.
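A rough way to see the read-length argument is to count how many exon-exon junctions a single read can cover. The sketch below is a simplification under stated assumptions: the exon lengths are invented, the function names are hypothetical, `min_overhang=8` is an arbitrary anchor size, and it looks at one read at a time (it ignores the second mate of a 2 x 75 bp pair).

```python
def junctions_spanned(exon_lengths, read_start, read_length, min_overhang=8):
    """Count junctions covered by one read on the spliced transcript.

    A junction counts only if the read has at least min_overhang bases
    on both sides of it (an arbitrary threshold chosen for illustration).
    """
    junctions = []
    position = 0
    for length in exon_lengths[:-1]:
        position += length
        junctions.append(position)  # junction sits after each internal exon end
    read_end = read_start + read_length
    return sum(
        1
        for j in junctions
        if read_start + min_overhang <= j <= read_end - min_overhang
    )


def max_junctions_per_read(exon_lengths, read_length, min_overhang=8):
    """Best case over all read start positions along the transcript."""
    total = sum(exon_lengths)
    read_length = min(read_length, total)  # a read cannot exceed the transcript
    return max(
        junctions_spanned(exon_lengths, start, read_length, min_overhang)
        for start in range(total - read_length + 1)
    )


# Hypothetical four-exon transcript (770 nt, three junctions).
exons = [200, 150, 300, 120]

print(max_junctions_per_read(exons, 75))          # one short read
print(max_junctions_per_read(exons, sum(exons)))  # one full-length read
```

With exons of this size a 75 bp read can never link two junctions, so which splicing events occur together in the same isoform has to be inferred by the assembler, whereas a full-length read observes all junctions at once.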

written 3.5 years ago by Friederike 9.0k

1

karenkvn: If you have only one shot at doing this, then you are likely to get more data with Illumina, and it will likely be useful (though you may not get information about alternative splicing etc.). If your organism has not been successfully sequenced using PacBio, then the hurdles of getting a good library there are going to be higher. Unless you do some selection against rRNAs, they would likely form a large part of the data.

Not having enough sequence data is going to severely constrain what you can discover. Ideally you would want to do DNA-seq as well, but in the real world non-research constraints always take center stage.

written 3.5 years ago by GenoMax 147k

0

"If your organism has not been successfully sequenced using PacBio then the hurdles of getting a good library there are going to be higher."

Excellent point that I hadn't given much thought to. Can you elaborate a bit? Do you mean if there's no genome or if there's no information about the rRNA?

written 3.5 years ago by Friederike 9.0k

1

I was thinking of practical experimental hurdles. We informaticians never consider the possibility that a particular organism may be difficult to extract DNA/RNA from. It may have tough cell walls, or it may produce carbohydrates/proteins that end up in the nucleic acid prep and cause problems with sequencing. PacBio will require special handling for Iso-Seq, and any additional problem like the ones above will add to the risk of failure. If the rRNA sequence is not known (likely), then trying to deplete it would be yet another challenge.

written 3.5 years ago by GenoMax 147k

0

Right, making sure that you're able to obtain enough high-quality starting material is probably going to be a greater issue for PacBio because it generally needs more input to start with. On the other hand, Illumina sequencing will present a biased picture, too, if the starting material isn't of sufficiently high quality.

written 3.5 years ago by Friederike 9.0k