I know there are a couple of good tools for detecting chimeric artifacts that arise during transcriptome assembly in species with a reference genome, but what about de novo assemblies with no reference genome?
Chimeric artifacts can be detected without a reference genome using tools such as UCHIME or ChimeraSlayer.
Both of these tools can use the input sequences themselves as the reference, and detect PCR chimeric artefacts by comparing the abundance of each candidate chimera with the abundance of its potential parental sequences. This rests on the hypothesis that PCR chimeric artefacts should be less abundant than the sequences they originate from (as they were subject to fewer PCR cycles). You will, of course, be discarding potentially biologically interesting chimeric transcripts if these are expressed at lower levels than the non-chimeric transcripts they originate from.
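To make the abundance hypothesis concrete, here is a toy Python sketch. It is not the UCHIME or ChimeraSlayer algorithm: it assumes sequences of equal length, exact matches, and a single breakpoint, and the names find_parents, min_ratio and min_segment are made up for illustration. A sequence is flagged only if its left part matches one more-abundant sequence and its right part matches a different more-abundant sequence.

```python
from typing import Dict, Optional, Tuple


def find_parents(
    query: str,
    abundance: Dict[str, int],
    min_ratio: float = 2.0,   # hypothetical threshold: parents must be >= 2x as abundant
    min_segment: int = 3,     # hypothetical minimum length of each chimera segment
) -> Optional[Tuple[str, str, int]]:
    """Return (left_parent, right_parent, breakpoint) if `query` looks like a
    two-parent chimera of more abundant sequences, otherwise None.

    Toy model only: exact prefix/suffix matching, no alignment, no scoring.
    """
    query_abundance = abundance[query]
    # Only sequences sufficiently more abundant than the query can be parents.
    candidates = [
        seq for seq, count in abundance.items()
        if seq != query and count >= min_ratio * query_abundance
    ]
    # Try every breakpoint that leaves at least min_segment bases on each side.
    for breakpoint in range(min_segment, len(query) - min_segment + 1):
        left, right = query[:breakpoint], query[breakpoint:]
        left_parents = [p for p in candidates if p.startswith(left)]
        right_parents = [p for p in candidates if p.endswith(right)]
        for left_parent in left_parents:
            for right_parent in right_parents:
                if left_parent != right_parent:
                    return left_parent, right_parent, breakpoint
    return None


# Made-up example data: sequence -> abundance.
counts = {
    "AAAAAAAA": 100,
    "CCCCCCCC": 80,
    "GGGGGGGG": 50,
    "AAAACCCC": 5,   # rare, and explainable as AAAA|CCCC from two abundant parents
}

flagged = {seq: find_parents(seq, counts) for seq in counts}
```

With this made-up data only the rare AAAACCCC sequence is flagged. If its count were raised above that of its parents it would no longer be flagged, which is the same behaviour that makes a low-expressed genuine chimeric transcript indistinguishable from an artefact under this hypothesis.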
Check this paper: Optimizing de novo assembly of short-read RNA-seq data for phylogenomics
The FullLengtherNEXT tool also has an option to check whether sequences are chimeric. If you have a large number of sequences, it is better to install the tool locally along with its databases. I ran it from the command line with the option "-q, --chimera_detection" (described in its help as "apply chimera detection mode") to check for chimeric sequences.
Thanks, but I was thinking more of chimeras resulting from mis-assembly of reads.
By mis-assembly of reads, do you mean that there will be no chimeric reads supporting the chimeras? If so, these should be quite easy to spot. But if you do have chimeric reads supporting the mis-assembly, how would you interpret their presence?