Dear all,
I am working on 12 samples of strand-specific bacterial RNA-seq data (4 replicates for each of 3 treatments). I have finished the DE analysis using BWA-MEM for mapping, HTSeq for read count summarization, and DESeq2 for the DE test.
Now I want to investigate antisense RNAs (find where they are located and quantify their differential expression), since my libraries were constructed with a strand-specific protocol. A test with RSeQC also confirmed that they are strand-specific.
I searched the web and listed several steps I need to work through:
1, separate reads/fragments mapped to the + and - strands (my data is PE, so probably fragments). (Do I need to do this?)
2, identify reads that map to the strand opposite the genome annotation. (Here I mean that if a gene is annotated on the + strand of the genome, I need to identify reads mapped to the - strand at this location, if there are any.)
3, summarize those counts.
4, there might be some contamination, so a test is needed to calculate the baseline level and eliminate predictions below the noise level.
5, use IGV to visualize each of them and check them.
6, use DESeq2 or edgeR to test those predicted antisense RNAs for differential expression.
7, moreover, do some wet-lab experiments to verify the expression of those antisense RNAs.
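To make steps 1 to 4 concrete, here is a minimal Python sketch of what I have in mind. It is only an illustration under assumptions, not a tested pipeline: the function names, the toy coordinates (1-based, inclusive), and the `min_count` and `fold` thresholds are all made up by me, and the noise filter simply compares each gene's antisense fraction with the overall antisense fraction, which may overestimate the background if real antisense transcription is widespread. The strand logic relies only on the standard SAM FLAG bits (0x10 reverse strand, 0x40 first in pair, 0x80 second in pair).

```python
from collections import Counter


def fragment_strand(flag, protocol="reverse"):
    """Infer the transcript strand ('+' or '-') from one read's SAM FLAG.

    protocol follows the htseq-count convention: 'yes' means read 1 lies on
    the transcript strand, 'reverse' means read 1 lies on the opposite strand.
    """
    minus = bool(flag & 0x10)   # 0x10: this read aligned to the reverse strand
    if flag & 0x80:             # 0x80: second read in the pair, so mirror it
        minus = not minus
    if protocol == "reverse":   # read 1 is complementary to the transcript
        minus = not minus
    return "-" if minus else "+"


def count_by_orientation(reads, genes, protocol="reverse"):
    """Count sense and antisense fragments per gene (steps 1 to 3).

    reads: iterable of (flag, start, end); genes: list of
    (name, start, end, strand). Both layouts are hypothetical.
    """
    sense, antisense = Counter(), Counter()
    for flag, start, end in reads:
        if not flag & 0x40:     # count each fragment once, through read 1
            continue
        strand = fragment_strand(flag, protocol)
        for name, g_start, g_end, g_strand in genes:
            if start <= g_end and end >= g_start:   # intervals overlap
                target = sense if strand == g_strand else antisense
                target[name] += 1
    return sense, antisense


def above_noise(sense, antisense, min_count=10, fold=5.0):
    """Keep genes whose antisense fraction clearly exceeds the background (step 4).

    Background = overall antisense fraction (an assumption, see above).
    """
    total = sum(sense.values()) + sum(antisense.values())
    if total == 0:
        return []
    background = sum(antisense.values()) / total
    kept = []
    for name, a in antisense.items():
        if a < min_count:
            continue
        if a / (a + sense[name]) > fold * background:
            kept.append(name)
    return sorted(kept)


# Toy example with made-up genes and reads.
genes = [("geneA", 100, 500, "+"), ("geneB", 600, 900, "-")]
reads = [(83, 150, 250), (99, 200, 300), (147, 260, 360), (99, 650, 750)]
sense, antisense = count_by_orientation(reads, genes)
print(dict(sense), dict(antisense))
```

In a real run the reads would come from the BAM file rather than a hand-written list, and the thresholds would need to be chosen from the data.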
Are these steps enough to do the job? Please give some suggestions.
I learned that SeqMonk has an antisense RNA analysis pipeline (it has the drawback of 3' UTR overlap), and I am going to use SeqMonk for this task. But I still have some questions about the steps listed above that confuse me.
1, in the HTSeq results, a large proportion of reads are marked as “no_feature”.
I think the antisense RNAs are in this fraction, because I set the strand option when using HTSeq to count fragments, so fragments mapped to the opposite strand of a gene would get “no_feature”. Am I right?
2, how do I separate reads mapped to the + and - strands, and how do I summarize antisense counts?
Could I use HTSeq to summarize them by just setting the strand parameter to the opposite of what I used to summarize reads mapped to transcripts/genes?
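To show what I mean by flipping the strand parameter, here is a small Python sketch that only builds the two htseq-count command lines, one for sense counts and one for antisense counts. The file names, the feature type `gene`, and the ID attribute `locus_tag` are placeholders I made up and would need to match the real GFF; the options used (`-f`, `-r`, `-s`, `-t`, `-i`) are the standard htseq-count ones.

```python
import shlex

# Swapping the stranded setting is the whole idea of the antisense run.
OPPOSITE = {"yes": "reverse", "reverse": "yes"}


def htseq_count_cmd(bam, gff, stranded, feature_type="gene", id_attr="locus_tag"):
    """Build an htseq-count argument list for a position-sorted BAM file."""
    if stranded not in ("yes", "no", "reverse"):
        raise ValueError("stranded must be 'yes', 'no' or 'reverse'")
    return [
        "htseq-count",
        "-f", "bam",          # input format
        "-r", "pos",          # paired-end data sorted by position
        "-s", stranded,       # strand-specific counting mode
        "-t", feature_type,   # GFF feature type to count over (placeholder)
        "-i", id_attr,        # GFF attribute used as the gene ID (placeholder)
        bam,
        gff,
    ]


sense_setting = "reverse"     # assumption: the setting used for the gene counts
sense_cmd = htseq_count_cmd("sample1.sorted.bam", "genome.gff", sense_setting)
antisense_cmd = htseq_count_cmd(
    "sample1.sorted.bam", "genome.gff", OPPOSITE[sense_setting]
)

print(shlex.join(sense_cmd))
print(shlex.join(antisense_cmd))
```

What I am not sure about is whether genes that overlap on opposite strands would inflate the antisense counts, since reads that are sense to one gene would also look antisense to its neighbor.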
Thanks!
Sorry to reply so late; I have been working on my dissertation. Thanks for your answers. The HTSeq output showed very low unambiguous alignment (0 for archaeal samples and 0.001%). About 10-20% of reads are no_feature, so I have now decided to use HTSeq for count summarization and DESeq2 for testing, and also to use SeqMonk as a check, since it has a built-in antisense RNA pipeline!