Hi,
I have a question about using Salmon and DESeq2 with mixed (paired-end and single-end) read libraries.
My libraries were originally paired-end, but I quality-trimmed them, which left a lot of single-end reads that I don't want to throw away. I want to use Salmon to quantify expression. Since Salmon can't work with both library types at the same time, I quantified the single-end and paired-end reads separately and then, for each transcript, added the read counts from the same sample:
salmon quantmerge --column numreads -o cohort95_nr_quant.sf --quants Sample*_quant
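python3 -c "
import pandas as pd

# Merged NumReads table written by salmon quantmerge (one column per quant directory)
dfs = pd.read_csv('quants/cohort95_nr_quant.sf', sep='\t')
ndf = pd.DataFrame({'Name': dfs['Name']})
# range(1, 95) yields Sample1 .. Sample94
libs = ['Sample' + str(num) for num in range(1, 95)]
for lib in libs:
    # Sum the paired-end and single-end read counts of the same sample
    ndf[lib] = dfs[lib + '.paired_quant'] + dfs[lib + '.single_quant']
ndf.to_csv('quants/cohort95_summed_nr_quant.sf', index=False, header=True, sep='\t')
"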
So I ended up with a file like:
Name Sample1 Sample2 Sample3 ...
Transcript1 4811.874 11930.11 7938.97
Transcript2 34.0 79.0 104.0
Transcript3 229841.3 262170.9 222405.4
Transcript4 0.0 11.0 6.0
Transcript5 0.0 0.0 0.0
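To make the summing step concrete, here is a small self-contained sketch on toy data. It assumes the merged columns are named `<sample>.paired_quant` and `<sample>.single_quant`, as in my script above. The helper name `sum_paired_single` and the toy values are made up for illustration.

```python
import pandas as pd

def sum_paired_single(merged):
    """Add the paired-end and single-end NumReads columns of each sample."""
    # Hypothetical helper: sample names are taken from the '.paired_quant' columns,
    # and a matching '.single_quant' column is assumed to exist for each of them.
    suffix = '.paired_quant'
    samples = [c[:-len(suffix)] for c in merged.columns if c.endswith(suffix)]
    summed = pd.DataFrame({'Name': merged['Name']})
    for sample in samples:
        summed[sample] = merged[sample + '.paired_quant'] + merged[sample + '.single_quant']
    return summed

# Toy stand-in for the quantmerge output (values are invented)
toy = pd.DataFrame({
    'Name': ['Transcript1', 'Transcript2'],
    'Sample1.paired_quant': [10.0, 0.0],
    'Sample1.single_quant': [2.5, 1.0],
    'Sample2.paired_quant': [7.0, 3.0],
    'Sample2.single_quant': [0.0, 4.0],
})
summed = sum_paired_single(toy)
print(summed)
```

Reading the sample names from the column suffixes avoids hard-coding the sample range, but it is only one way to do it.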
Now I want to use DESeq2. I'm following the tximport tutorial for going from transcripts to genes, but I don't know how to use the file above. As far as I understand, tximport() only takes the original quant.sf files from Salmon.
What can I do?
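One possible route that skips tximport, sketched in Python to match the script above: collapse the summed table to gene level with a transcript-to-gene mapping, then round to integers, since DESeq2 expects integer counts when given a plain count matrix. The `tx2gene` table and the gene IDs below are hypothetical. This is only a rough sketch: plain summed counts do not carry the effective-length information that tximport reads from the original quant.sf files, so any length-based correction is lost.

```python
import pandas as pd

# Toy summed transcript-level counts, same layout as the table above
counts = pd.DataFrame({
    'Name': ['Transcript1', 'Transcript2', 'Transcript3'],
    'Sample1': [4811.874, 34.0, 229841.3],
    'Sample2': [11930.11, 79.0, 262170.9],
})

# Hypothetical transcript-to-gene mapping (the gene IDs are invented)
tx2gene = pd.DataFrame({
    'Name': ['Transcript1', 'Transcript2', 'Transcript3'],
    'Gene': ['GeneA', 'GeneA', 'GeneB'],
})

# Attach gene IDs, then sum the counts of all transcripts belonging to each gene
gene_counts = (
    counts.merge(tx2gene, on='Name', how='inner')
          .drop(columns='Name')
          .groupby('Gene')
          .sum()
)

# Round the estimated read counts to integers
gene_counts = gene_counts.round().astype(int)
print(gene_counts)
```

The resulting table has one row per gene and one column per sample, and could be written out with `to_csv` for loading into R.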
Hi Andrés,
I am now in the same predicament you were in, and I am trying to decide how to sum the single-end and paired-end Salmon outputs. Did you ever decide on a solution?
Best, Natalie