Hi, I'm running a script with STAR, umi_tools dedup and RSEM. Running RSEM without the deduplication step works fine, but running RSEM after deduplication produces this error:
Read ST-E00114:1178:HFL75CCX2:8:2209:25652:18291_CGATTAATAT: The adjacent two lines do not represent the two mates of a paired-end read! (RSEM assumes the two mates of a paired-end read should be adjacent)
I tried using convert-sam-for-rsem, but the result is the same. Do you know a way to solve this?
Thanks
Yes, it is sorted by name, but I also tried sorting by coordinate and the result is the same. Before mapping, the FASTQ files had the same number of reads (so I assume there were no orphans), and RSEM runs perfectly if I skip the deduplication step. So I suspect the problem is in the deduplication step, and that it generates some orphans. I ran umi_tools dedup with the following parameters.
So in theory the output shouldn't contain unpaired reads.
It's not at all clear that those command options will affect reads that started out as properly paired if the deduplication itself breaks up a pair. It's hard to do anything but speculate unless you provide the full command lines.
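In the meantime, one way to test the orphan hypothesis directly is to scan the deduplicated alignments and list every record that is not immediately followed by its mate, which is the condition the RSEM error message describes. The following is only a minimal sketch, not part of RSEM or umi_tools: `find_unpaired` is a hypothetical helper name, and it assumes the input is SAM text (for example the output of `samtools view -h`) in which mates are expected to be adjacent. It only looks at the read name and the first-in-pair / second-in-pair flag bits (0x40 and 0x80).

```python
def find_unpaired(sam_lines):
    """Return the read names of records that are not adjacent to their mate.

    Assumption: sam_lines is an iterable of SAM text lines in which the two
    mates of each alignment are supposed to sit on consecutive lines.
    """
    records = []
    for line in sam_lines:
        # Skip blank lines and header lines (they start with '@').
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        records.append((fields[0], int(fields[1])))  # (QNAME, FLAG)

    orphans = []
    i = 0
    while i < len(records):
        name, flag = records[i]
        if i + 1 < len(records):
            next_name, next_flag = records[i + 1]
            # A valid pair shares a name and has one record flagged as
            # first in pair (0x40) and the other as second in pair (0x80).
            mate_bits = {flag & 0xC0, next_flag & 0xC0}
            if name == next_name and mate_bits == {0x40, 0x80}:
                i += 2
                continue
        orphans.append(name)
        i += 1
    return orphans


if __name__ == "__main__":
    import sys

    # Usage (file name is a placeholder): python check_mates.py dedup.sam
    with open(sys.argv[1]) as handle:
        for read_name in find_unpaired(handle):
            print(read_name)
```

If this prints any read names for the deduplicated file but none for the file before deduplication, that would support the idea that the deduplication step is splitting pairs.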