Hi everyone,
I am currently trying to do transcript-level differential expression analysis using the output from Salmon and the swish method from the fishpond package. I have observed something that is potentially compromising my analysis. For some transcripts, I get 0 counts in the primary estimates (the NumReads column of the quant.sf file). However, after running swish, those same transcripts are identified as highly differentially expressed, and plotting the counts with plotInfReps shows values well above 0.
Is it correct to think this might be due to very high mapping uncertainty in Salmon, which leads to 0 counts in the primary estimates but very high values in the inferential replicates from the bootstraps?
If so, would it be best practice to filter these transcripts out of the analysis, and do you have any recommendation on how to do it?
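To make the question concrete, here is a minimal Python sketch of the kind of filter I have in mind. It is only an illustration: the function name, the threshold and the transcript IDs are hypothetical, the numbers are made up rather than taken from my data, and nothing here is part of Salmon or fishpond.

```python
from statistics import mean


def flag_zero_point_estimates(point_counts, infrep_counts, min_infrep_mean=10.0):
    """Return IDs of transcripts whose point estimate is 0 while the mean
    across inferential replicates is at least `min_infrep_mean`.

    point_counts:  dict mapping transcript ID -> primary count estimate
    infrep_counts: dict mapping transcript ID -> list of replicate counts
    The default threshold of 10 is an arbitrary, hypothetical choice.
    """
    flagged = []
    for tx, point in point_counts.items():
        reps = infrep_counts.get(tx, [])
        if point == 0 and reps and mean(reps) >= min_infrep_mean:
            flagged.append(tx)
    return flagged


# Made-up illustrative values (NOT from my actual Salmon output).
point_counts = {"tx_A": 0.0, "tx_B": 0.0, "tx_C": 152.3}
infrep_counts = {
    "tx_A": [48.0, 61.5, 39.2, 55.1],      # zero point estimate, high replicates
    "tx_B": [0.0, 1.2, 0.0, 0.4],          # zero point estimate, low replicates
    "tx_C": [150.1, 149.8, 155.0, 153.7],  # consistent estimates
}

flagged = flag_zero_point_estimates(point_counts, infrep_counts)
print(flagged)
```

With these made-up values only tx_A is flagged. Whether dropping such transcripts is statistically sound is exactly what I am unsure about.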
Any input is appreciated.
I did some further analysis, and this may actually be a purely Salmon-related question. I will use a specific transcript from my data as an example.
The quant.sf file from Salmon contains the following values for this transcript:
However, after running the ConvertBootstrapsToTSV.py script on my bootstraps and extracting the counts for this same transcript from the TSV file, I get the following counts:
So, as you can see, there is some variation between the individual bootstraps, but the bootstrap values are all far from 0, unlike the primary estimate in the quant.sf file.
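For anyone who wants to run the same comparison on their own output, here is a minimal Python sketch of how the point estimate can be compared with the spread of the bootstrap counts. The function name is hypothetical and the numbers in the example are invented placeholders, not the values from my files.

```python
from statistics import mean, stdev


def summarize_bootstraps(point_estimate, bootstrap_counts):
    """Compare one transcript's point estimate with its bootstrap counts.

    Returns the bootstrap mean, range and coefficient of variation, plus a
    flag telling whether the point estimate lies outside the bootstrap range.
    Needs at least two bootstrap values (required by `stdev`).
    """
    boot_mean = mean(bootstrap_counts)
    boot_min = min(bootstrap_counts)
    boot_max = max(bootstrap_counts)
    boot_sd = stdev(bootstrap_counts)
    return {
        "point_estimate": point_estimate,
        "bootstrap_mean": boot_mean,
        "bootstrap_min": boot_min,
        "bootstrap_max": boot_max,
        # Coefficient of variation; undefined when the mean is 0.
        "bootstrap_cv": boot_sd / boot_mean if boot_mean > 0 else float("nan"),
        "point_outside_bootstrap_range": not (boot_min <= point_estimate <= boot_max),
    }


# Invented placeholder values (NOT my real data).
summary = summarize_bootstraps(0.0, [40.0, 50.0, 60.0])
for key, value in summary.items():
    print(f"{key}: {value}")
```

A point estimate that falls outside the whole bootstrap range, as in this placeholder example, is the pattern I am describing.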
I ran Salmon 1.4.0 with the following command:
Any idea why this is happening?