Hi,
I have already generated RNA-seq data for many individuals (~hundreds) with no reference genome, and I am interested in only a subset of transcripts (i.e. transcripts of just a few specific genes). Given the number of individuals, assembling the whole transcriptome for each one would be very time-consuming. I have been thinking that subsetting the reads specific to the genes of interest before the actual assembly might be an option. I could use orthologous sequences of these genes from closely related species and, for instance, the bowtie aligner to pull out the reads for these genes, and then assemble transcripts de novo from that subset of reads. Do you think this is a good approach? Is there any other way to select a subset of reads based on sequence similarity? I appreciate any suggestions. Thanks.
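To make the idea of selecting reads by sequence similarity concrete, here is a minimal sketch of what I have in mind, written in plain Python rather than with an aligner. It "baits" reads by shared k-mers with the ortholog sequences. All function names and the values of `k` and `min_hits` are my own assumptions for illustration, not taken from any existing tool, and a real run on hundreds of samples would probably need something much faster.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (N stays N)."""
    table = str.maketrans("ACGTN", "TGCAN")
    return seq.upper().translate(table)[::-1]


def kmers(seq, k):
    """Return the set of all k-mers in a sequence."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def build_bait_index(bait_seqs, k):
    """Collect k-mers from the bait (ortholog) sequences on both strands."""
    index = set()
    for seq in bait_seqs:
        index |= kmers(seq, k)
        index |= kmers(reverse_complement(seq), k)
    return index


def read_matches(read_seq, index, k, min_hits):
    """True if the read shares at least min_hits k-mer positions with the baits."""
    seq = read_seq.upper()
    hits = 0
    for i in range(len(seq) - k + 1):
        if seq[i:i + k] in index:
            hits += 1
            if hits >= min_hits:
                return True
    return False


def filter_fastq(lines, index, k=15, min_hits=2):
    """Yield (header, seq, plus, qual) for FASTQ records that match the baits.

    `lines` is any iterable of FASTQ lines, e.g. an open file handle.
    Assumes plain 4-line records; an incomplete trailing record is ignored.
    """
    it = (line.rstrip("\n") for line in lines)
    for header, seq, plus, qual in zip(it, it, it, it):
        if read_matches(seq, index, k, min_hits):
            yield header, seq, plus, qual
```

The intended use would be to build the index once from the ortholog sequences, stream each individual's FASTQ file through `filter_fastq`, and write the surviving records to a new file for assembly. A smaller `k` should tolerate more divergence between species at the cost of more spurious matches. For paired-end data the two mates would also have to be kept together, which this sketch does not handle.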
Thanks! It helps.