Transcriptomes comparison and check for shared transcripts with arbitrary similarity
7.7 years ago
pbigbig ▴ 250

Hi everyone,

I have Illumina RNA-seq data from individuals of two groups (two different phenotypes). After preprocessing the reads and running de novo assembly with Trinity, the assemblies in group 1 are 45-55 MB (megabytes, FASTA format), while those in group 2 are only 2-9 MB. This difference could be due to the high duplication rate in the group 2 raw reads (checked with the FASTQC "deduplicate" module).

To decide whether these assemblies are usable for further analysis (e.g. differential expression, SNP discovery) or whether, in the worst case, we have to redo everything from library preparation, I want to check what proportion of transcripts (at an arbitrary similarity threshold) is shared between the individual transcriptomes of the two groups. Which tool or method could help me do that? Any thoughts on how useful these data are (i.e. how much we can still get out of this poor-quality data) are also welcome.

Thank you in advance for your suggestions!

rnaseq transcriptome de novo assembly • 1.5k views

I don't think I quite follow what you're trying to do, but could you BLAT one assembly against the other with a specified e-value cutoff and count the contigs with a hit?
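As a rough illustration of the counting step, here is a minimal Python sketch. It assumes the alignment was written in 12-column BLAST tabular format (for example with BLAT's `-out=blast8` option), where column 3 is percent identity and column 11 is the e-value. The function names, the thresholds (`max_evalue=1e-10`, `min_identity=90.0`) and the toy data are all illustrative assumptions, not recommended settings.

```python
import io


def fasta_ids(handle):
    """Collect sequence IDs (first word after '>') from a FASTA stream."""
    ids = set()
    for line in handle:
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:
                ids.add(fields[0])
    return ids


def queries_with_hit(handle, max_evalue=1e-10, min_identity=90.0):
    """Return query IDs that have at least one hit passing both cutoffs.

    Expects 12-column BLAST tabular lines:
    query, subject, %identity, length, mismatches, gap opens,
    q.start, q.end, s.start, s.end, e-value, bit score.
    The default cutoffs are arbitrary placeholders.
    """
    hits = set()
    for line in handle:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        cols = line.rstrip("\n").split("\t")
        query, identity, evalue = cols[0], float(cols[2]), float(cols[10])
        if evalue <= max_evalue and identity >= min_identity:
            hits.add(query)
    return hits


def shared_fraction(query_ids, hit_ids):
    """Fraction of query contigs that have a qualifying hit in the other assembly."""
    if not query_ids:
        return 0.0
    return len(query_ids & hit_ids) / len(query_ids)


# Toy in-memory data standing in for real files (hypothetical contig names).
fasta = io.StringIO(">c1 len=300\nACGT\n>c2 len=200\nGGCC\n>c3\nTTAA\n")
table = io.StringIO(
    "c1\tt9\t98.5\t300\t4\t0\t1\t300\t1\t300\t1e-150\t550\n"
    "c2\tt4\t80.0\t100\t20\t0\t1\t100\t1\t100\t1e-5\t60\n"
)

ids = fasta_ids(fasta)
hit = queries_with_hit(table)
frac = shared_fraction(ids, hit)
print(f"{len(hit)} of {len(ids)} contigs have a hit ({frac:.1%})")
```

With real data, the two `io.StringIO` objects would be replaced by `open(...)` calls on the query assembly and the alignment table. Running the comparison in both directions gives a shared fraction for each assembly, which matters here because the two groups differ so much in size.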


Thanks for your suggestion, I will BLAT them against each other. Sorry for my clumsy explanation. In short, I have RNA-seq data from different individuals (same tissue and species), and some of the datasets are much smaller than the others after deduplication (because of PCR artifacts). I would therefore like to examine what proportion of similar transcripts they share, to see whether it is feasible to continue with further analysis, e.g. SNP discovery.
