Hello,
I have recently been using a pseudoaligner (Salmon) for RNA-seq differential gene expression analysis; it is well documented and works very well. Has anybody used pseudoaligners such as Salmon for variant calling and/or fusion detection?
I know variant calling on RNA-seq data has limitations, in that it would not uncover intronic variants, and that variant calling and fusion pipelines built on more traditional splice-aware aligners (STAR) are out there. But if I wanted to do differential gene expression and variant calling on the same dataset, it seems pointless to use a pseudoaligner if I would have to run something like STAR anyway for variant detection. Thanks.
Hopefully someone will let me know if this is wrong. As I understand it, pseudoaligners do not check the alignment of every base of a read against a reference. They determine which transcripts a read is compatible with, and may not align the read to the reference completely if no further information can be gained. There are some YouTube videos that do a good job of explaining the algorithm at a high level. So I don't know whether you can get good SNP data from these methods.
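To make the "compatible transcripts" idea concrete, here is a toy sketch in Python. It is only an illustration under my own assumptions, not how Salmon or kallisto are actually implemented: the transcript sequences, the k-mer size of 5, and the function names (build_index, pseudoalign) are all made up for the example. It looks up each k-mer of a read and intersects the sets of transcripts that contain it, skipping k-mers it cannot find.

```python
from collections import defaultdict


def kmers(seq, k):
    """Return all overlapping substrings of length k."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def build_index(transcripts, k):
    """Map each k-mer to the set of transcript names that contain it."""
    index = defaultdict(set)
    for name, seq in transcripts.items():
        for km in kmers(seq, k):
            index[km].add(name)
    return index


def pseudoalign(read, index, k):
    """Return the transcripts compatible with every indexed k-mer of the read.

    K-mers absent from the index (for example, ones overlapping a mismatch)
    are skipped, so no per-base information is ever reported.
    """
    compatible = None
    for km in kmers(read, k):
        hits = index.get(km)
        if hits is None:
            continue
        compatible = set(hits) if compatible is None else compatible & hits
    return compatible or set()


# Hypothetical transcripts sharing a prefix but differing at the 3' end.
TRANSCRIPTS = {
    "t1": "ACGTTGCAAGGCT",
    "t2": "ACGTTGCATTCGA",
}
K = 5
index = build_index(TRANSCRIPTS, K)

# A read from the shared region is compatible with both transcripts.
print(sorted(pseudoalign("CGTTGCA", index, K)))
# An exact t1 read and the same read with its last base changed
# give the same answer: the mismatch leaves no trace in the output.
print(sorted(pseudoalign("ACGTTGCAAGGCT", index, K)))
print(sorted(pseudoalign("ACGTTGCAAGGCA", index, K)))
```

In this toy version the read carrying a mismatch is assigned to t1 exactly like the perfect read, which is the reason I am unsure about SNP calling: the output says which transcripts a read fits, not where each base lands or whether it matches.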
Apparently, there are some methods for fusion detection with pseudoaligners that work with reads that are compatible with multiple transcripts. There is a preprint that describes this: https://www.biorxiv.org/content/10.1101/166322v1 The group behind it is very well known for pseudoalignment development.
I am still learning myself, so do your own research.
Thanks for the info!