2.1 years ago
yoshifumimiya
I have no experience with RNA-Seq of human plasma, but I tried analyzing some data in a Kaggle notebook.
I used GRCh38.primary_assembly.genome.fa as the reference sequence, but the mapping rate was low (1.18-12.41%).
Does anyone know the reason for the low mapping rate?
Thank you in advance.
Is this your own data? As a first check, take a few unmapped reads from these alignments (10 or so is fine) and BLAST them at NCBI to see what they align to. You can get those reads by using the --un-conc filename option with hisat2 (that option is for paired-end data; --un is the equivalent for unpaired reads). Also, would human plasma be expected to contain many human reads?
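To make the "take 10 or so unmapped reads and BLAST them" step concrete, here is a minimal sketch that randomly samples reads from a FASTQ file and writes them as FASTA, which can be pasted into the NCBI BLAST web form. The function name and file paths are hypothetical, and it assumes plain 4-line FASTQ records (optionally gzipped), such as the files hisat2 writes for unaligned reads.

```python
import gzip
import random


def sample_fastq_to_fasta(fastq_path, fasta_path, n=10, seed=0):
    """Reservoir-sample up to n reads from a FASTQ file and write them as FASTA.

    Assumes each record is exactly 4 lines (header, sequence, '+', quality).
    Returns the number of reads written.
    """
    opener = gzip.open if fastq_path.endswith(".gz") else open
    rng = random.Random(seed)
    reservoir = []
    with opener(fastq_path, "rt") as handle:
        index = 0
        while True:
            header = handle.readline()
            if not header:
                break  # end of file
            sequence = handle.readline().strip()
            handle.readline()  # '+' separator line
            handle.readline()  # quality line (not needed for BLAST)
            record = (header.strip()[1:], sequence)  # drop the leading '@'
            if len(reservoir) < n:
                reservoir.append(record)
            else:
                # Replace an existing entry with probability n / (index + 1)
                j = rng.randint(0, index)
                if j < n:
                    reservoir[j] = record
            index += 1
    with open(fasta_path, "w") as out:
        for name, sequence in reservoir:
            out.write(f">{name}\n{sequence}\n")
    return len(reservoir)
```

Usage would look like sample_fastq_to_fasta("unmapped.1.fq", "to_blast.fa"), where both file names are placeholders. Sampling at random rather than taking the first reads avoids picking only reads from one edge of the flow cell.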
Thank you for your valuable input. The data come from a public database.
The paper mentions "by pretreatment with rtStar™ tRF&tiRNA Pretreatment Kit (Arraystar Inc., MD, USA)", so the data appear to be intended for microRNA analysis.
As you suggested, I will also check the unmapped reads with blast. Thank you very much for your kind reply.
If the data are for miRNA, then you probably need a different aligner (e.g. bowtie v1.x, which does gapless alignments) and a way to make sure the data are trimmed properly. miRNA libraries may contain a special adapter that is ligated directly to the miRNA and has to be trimmed before alignment. If you are not looking to do novel miRNA discovery, you could also align against the human reference miRNA sequences from miRBase or RNAcentral instead of the genome.
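To illustrate why trimming matters here: a mature miRNA is only about 22 nt long, so most of a longer read is adapter, and untrimmed reads will rarely align. Below is a minimal sketch of 3' adapter removal using exact matching only. The function name is hypothetical, and the adapter shown in the comment is the Illumina TruSeq small RNA 3' adapter, used purely as an example; the actual adapter for this dataset is an assumption that has to be checked against the library prep description. In practice a dedicated trimmer such as cutadapt, which tolerates mismatches, is the usual choice.

```python
def trim_3prime_adapter(read, adapter, min_overlap=5):
    """Remove a 3' adapter and everything after it (exact matches only).

    If the full adapter is not found, also check whether the read ends
    with a truncated prefix of the adapter at least min_overlap bases long.
    """
    pos = read.find(adapter)
    if pos != -1:
        return read[:pos]
    # Adapter may run off the end of the read: look for a partial prefix,
    # longest first. The lower bound is clamped to 1 to avoid an empty match.
    longest = min(len(adapter), len(read)) - 1
    for overlap in range(longest, max(min_overlap, 1) - 1, -1):
        if read.endswith(adapter[:overlap]):
            return read[:-overlap]
    return read


# Example adapter (assumption, not taken from the dataset's paper):
# Illumina TruSeq small RNA 3' adapter
EXAMPLE_ADAPTER = "TGGAATTCTCGGGTGCCAAGG"
```

After trimming, reads are typically also filtered by length (for instance, discarding very short fragments) before alignment with bowtie or comparison against miRBase sequences; the exact cutoffs depend on the analysis.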
Thank you for your comments. They are very informative. I will try using the reference miRNA first. I very much appreciate your advice.