Queries regarding RNA-Seq sequencing
5.2 years ago
glady ▴ 320

Hello all,

I have some very simple questions regarding human transcriptome sequencing:

1) Which read layout would be better for sequencing: single-end 1x150 or paired-end 2x75?

2) What would be a good sequencing depth for transcriptomic analysis: 20M, 30M or >30M reads?

3) Can we perform the transcriptomic study on tumor samples only, without the matched normal samples?

The goal of the project is to use a multi-omic approach to analyze more than 50 tumor samples by transcriptome and exome sequencing. Since the dataset is going to be big, my question is whether we can do this without sequencing the normal samples for the transcriptome, and what would be a good library type (single-end/paired-end) and depth (20M or >30M) for this task.

Thank you in advance.

RNA-Seq • 1.5k views
The answer to all these questions depends on what genome you are using, whether you are interested in splicing variation, and what biological question is being addressed.

Very likely human since normal/tumor are being referenced.

Yes, tell us your ultimate goal and the organism.

5.2 years ago
JC 13k

1) If you have a well-defined transcriptome, it doesn't matter which one you use; if you want to find new isoforms, prefer paired-end reads.

2) How many genes/transcripts do you have? This again depends on how much information you have for your genome/transcriptome.

3) Maybe.

Question 3 depends more than most on the question being asked. TCGA has very few or no normal samples and is still very useful for asking some questions, just not for the question "which genes are upregulated in cancer?".

How can I correlate my expression dataset with TCGA to identify the enriched pathways or genes? Are there any R packages that I can use to do so, or any tutorials?

Unfortunately not. It's not usually possible to combine the results of different studies into a single analysis. I was just using TCGA as an example of a data set that provides useful information even though it does not have normals.

Really, we need to know what biological question you are attempting to answer to know what the right way to design and analyse the experiment is.

I agree with this. 3) is quite possible without normals if you are interested in clustering or classifying samples based on the relative expression in each patient within a cohort, or if you are interested in finding co-expression networks.
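
As a rough illustration of the co-expression idea, here is a minimal Python sketch that links gene pairs whose expression is strongly correlated across tumor samples only. The gene names, expression values, the `coexpression_edges` helper and the 0.9 cutoff are all made up for the example; real analyses usually work on normalized counts for thousands of genes and use dedicated network packages.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def coexpression_edges(expression, threshold=0.9):
    """Return (gene_a, gene_b, r) for gene pairs with |r| >= threshold.

    The threshold is an arbitrary illustrative cutoff, not a recommendation.
    """
    edges = []
    for gene_a, gene_b in combinations(sorted(expression), 2):
        r = pearson(expression[gene_a], expression[gene_b])
        if abs(r) >= threshold:
            edges.append((gene_a, gene_b, round(r, 3)))
    return edges


# Hypothetical log-scale expression for 4 genes across 5 tumor samples.
expression = {
    "GENE_A": [1.0, 2.0, 3.0, 4.0, 5.0],
    "GENE_B": [2.1, 3.9, 6.2, 8.0, 9.8],
    "GENE_C": [5.0, 4.1, 2.9, 2.0, 1.1],
    "GENE_D": [3.0, 1.0, 4.0, 1.0, 5.0],
}
edges = coexpression_edges(expression)
for edge in edges:
    print(edge)
```

Note that no normal samples are needed here: the correlations are computed purely from variation between the tumors in the cohort.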

I'm sorry, but can you give me some more details on this point? How can I cluster the samples by their expression? Is there any literature where similar work has been done?

Hierarchical clustering based on Z-scored expression values for a selection of genes is an option. Check any NGS paper where people analyzed cohorts of samples / patients. A clustering step (or any kind of complexity reduction step) is typically among the first figures.
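
To make that concrete, below is a small pure-Python sketch of the idea: Z-score each gene across samples, then group samples by average-linkage agglomerative clustering on Euclidean distance. The sample names, expression values and helper functions are hypothetical, and the choice of average linkage and Euclidean distance is only one reasonable option, not the only way to do it.

```python
from math import sqrt
from statistics import mean, pstdev


def zscore_rows(matrix):
    """Z-score each gene (row) across samples; constant rows become zeros."""
    scaled = []
    for row in matrix:
        mu, sd = mean(row), pstdev(row)
        scaled.append([(v - mu) / sd if sd > 0 else 0.0 for v in row])
    return scaled


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def average_linkage(profiles, n_clusters):
    """Merge the closest clusters (mean pairwise distance) until n_clusters remain."""
    clusters = [[name] for name in profiles]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                dists = [euclidean(profiles[a], profiles[b])
                         for a in clusters[i] for b in clusters[j]]
                d = sum(dists) / len(dists)
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Hypothetical log-expression matrix: rows are selected genes, columns are samples.
samples = ["T1", "T2", "T3", "T4"]
genes_by_sample = [
    [8.0, 8.2, 2.1, 2.0],  # gene 1
    [1.0, 1.3, 6.9, 7.2],  # gene 2
    [3.0, 3.4, 6.0, 6.2],  # gene 3
]
z = zscore_rows(genes_by_sample)
# Transpose so that each sample is described by its vector of gene Z-scores.
profiles = {s: [row[k] for row in z] for k, s in enumerate(samples)}
clusters = average_linkage(profiles, n_clusters=2)
print(clusters)
```

For a real cohort you would more likely use an existing implementation, for example `scipy.cluster.hierarchy.linkage` in Python or `hclust` in R, and plot the result as a heatmap with a dendrogram.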

Thank you for your useful suggestions. Could you share some links to such literature? It would be really helpful.

5.2 years ago

With regard to question 1, I will have to disagree with @JC. Even if you have a well-defined transcriptome, paired-end is preferable since it gives more accurate results (even for gene-level analysis). You can read more about that and suggested tools here.

For question 2, it depends on what you want to do with the data. If you are only interested in gene-level analysis, you can get away with less depth than if you want to do transcript-level analysis. Just remember that a lot happens at the transcript level in cancer - see e.g. this and this paper. Such analysis can be done with IsoformSwitchAnalyzeR, and an example (from TCGA data) can be found here.

As mentioned in the comments, the answer to question 3 depends on what you want to do with the data and what the goals of the project are.
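
For planning purposes, it may help to see that the two layouts from question 1 produce the same number of sequenced bases per fragment, so the choice is about the information gained from pairing rather than data volume. The sketch below is simple arithmetic under two assumptions that are not stated in the thread: depth is counted in fragments (read pairs for paired-end), and the cohort is taken as exactly 50 samples. The function names are made up for the example.

```python
def bases_per_fragment(reads_per_fragment, read_length):
    """Bases sequenced from one fragment: 1x150 and 2x75 both give 150."""
    return reads_per_fragment * read_length


def total_gigabases(n_samples, fragments_per_sample, reads_per_fragment, read_length):
    """Total sequencing output for the cohort, in gigabases."""
    per_fragment = bases_per_fragment(reads_per_fragment, read_length)
    return n_samples * fragments_per_sample * per_fragment / 1e9


# Layouts from question 1: (reads per fragment, read length in bases).
layouts = {"1x150": (1, 150), "2x75": (2, 75)}
n_samples = 50  # assumption: "more than 50" is rounded down to 50

for depth in (20_000_000, 30_000_000):
    for name, (n_reads, length) in layouts.items():
        gb = total_gigabases(n_samples, depth, n_reads, length)
        print(f"{name}, {depth // 1_000_000}M fragments x {n_samples} samples: {gb:.0f} Gb")
```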
