Edit after noticing that this is mainly about differential RNA-seq analysis:
First and foremost, to assess significance you need biological replicates; only replicates give you an estimate of the variance. This has been discussed, for example, in this question:
Rna-Seq Biological Replicates...
Second, I would like to mention that you cannot prove that a gene is not expressed just because you haven't found evidence of expression (a proof of non-existence is not feasible here).
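To put a rough number on this: under a simple Poisson sampling model (an assumption on my part, real RNA-seq counts are usually overdispersed), a transcript with expected count lambda gives zero reads with probability exp(-lambda). So zero reads only bounds the expression from above. A minimal sketch, with hypothetical function names:

```python
import math

def prob_zero_reads(expected_count):
    """P(0 reads) under a Poisson model with the given expected count."""
    return math.exp(-expected_count)

def upper_bound_expected_count(confidence=0.95):
    """Largest expected count still compatible with observing zero reads.

    Solves exp(-lam) = 1 - confidence for lam.
    """
    return -math.log(1.0 - confidence)

# A transcript expected to yield 2 reads at this depth is still missed
# roughly 13.5% of the time.
p_missed = prob_zero_reads(2.0)

# With zero reads observed, expected counts up to about 3 cannot be
# excluded at 95% confidence.
bound = upper_bound_expected_count(0.95)

print(round(p_missed, 3), round(bound, 2))
```

In other words, an empty gene in one sample is compatible with low but non-zero expression, and sequencing deeper only tightens the bound.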
For computing p-values of differential expression I recommend the R packages DESeq or edgeR.
I have already explained some of this in another answer, which also links to other materials and papers:
What Metrics Are Best To Describe The "Coverage" Of Rna-Seq Data?
However, it is definitely a problem if a gene has very few or zero counts in one or more groups; current methods might not be able to assign p-values properly, or at all, in these cases.
If I understand you correctly, you want to know if a very small number of reads (say at least one) in an RNA-seq experiment is evidence for the region being transcribed (not necessarily expressed).
Yes, every single sequence and its alignment is evidence in itself, provided the sequencer or protocol doesn't make up sequences! We have to agree on this point: the sequence doesn't lie, but of course there can be errors.
Of course you would like to have more evidence, so exons with very low coverage will need to be studied more closely.
Where else could the reads come from?
- They could originate from a duplicated/highly similar or repetitive region
- They could be poor alignments of reads with many sequencing errors
- The sequences could be contamination from vectors or adaptors
To show that your gene is transcribed, you have to look at the individual alignments:
- Filter out alignments with duplicate hits to the genome. Do you still get coverage?
- Look at the individual alignments. How good are they? Are there large indels?
- Apply quality filtering (after removing duplicates, not before).
- Look for protocol-specific contamination.
- Look at where in the gene the alignments are: are they all in one locus, or do they span exons/introns?
- Re-align the reads against the genome using a more sensitive aligner (e.g. FASTA or SSEARCH). Do they still align to only a single position?
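The first few checks can be sketched roughly as follows. This is only an illustration on made-up records: the field names, the exon coordinates and the thresholds (`min_mapq=30`, `max_indel=5`) are all hypothetical, and in practice the records would come from a SAM/BAM file.

```python
from collections import Counter

# Hypothetical alignment records (one dict per reported alignment).
alignments = [
    {"read": "r1", "chrom": "chr1", "pos": 1050, "mapq": 60, "max_indel": 0},
    {"read": "r2", "chrom": "chr1", "pos": 1320, "mapq": 55, "max_indel": 1},
    {"read": "r3", "chrom": "chr1", "pos": 1100, "mapq": 3, "max_indel": 0},
    {"read": "r3", "chrom": "chr7", "pos": 88000, "mapq": 3, "max_indel": 0},
    {"read": "r4", "chrom": "chr1", "pos": 1500, "mapq": 40, "max_indel": 25},
]

def unique_hits(records):
    """Keep only reads that align to exactly one genomic position."""
    hits_per_read = Counter(rec["read"] for rec in records)
    return [rec for rec in records if hits_per_read[rec["read"]] == 1]

def good_quality(records, min_mapq=30, max_indel=5):
    """Drop poorly mapped alignments and those with large indels."""
    return [
        rec for rec in records
        if rec["mapq"] >= min_mapq and rec["max_indel"] <= max_indel
    ]

def covered_exons(records, exons, chrom):
    """Names of exons with at least one alignment start inside them."""
    found = set()
    for rec in records:
        if rec["chrom"] != chrom:
            continue
        for name, (start, end) in exons.items():
            if start <= rec["pos"] <= end:
                found.add(name)
    return found

# Hypothetical gene on chr1 with two exons.
exons = {"exon1": (1000, 1200), "exon2": (1300, 1400)}

step1 = unique_hits(alignments)   # remove multi-hit reads first
step2 = good_quality(step1)       # then apply quality filtering
supported = covered_exons(step2, exons, "chr1")

print([rec["read"] for rec in step2], sorted(supported))
```

If reads survive these filters and fall in more than one exon, that is much better evidence than a pile of reads at a single locus.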
Hope this helps.
Are you interested in differential analysis or simply in evidence of transcripts? That is not clear. For DE analysis you are better off using the raw counts, at least for DESeq or edgeR; the packages compute the normalization internally.
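To illustrate why raw counts are the expected input: DESeq-style normalization estimates one size factor per sample directly from the raw counts, using the median of each sample's ratios to the per-gene geometric mean. The sketch below reimplements that idea in plain Python on made-up counts; it is not the package's code, and the function name is hypothetical.

```python
import math
from statistics import median

# Made-up raw counts: keys are genes, list positions are samples.
counts = {
    "geneA": [100, 200, 50],
    "geneB": [10, 20, 5],
    "geneC": [40, 80, 20],
    "geneD": [0, 6, 1],  # contains a zero, so it is skipped below
}

def size_factors(count_table):
    """Median-of-ratios size factors, one per sample."""
    n_samples = len(next(iter(count_table.values())))
    log_ratios = [[] for _ in range(n_samples)]
    for gene_counts in count_table.values():
        if min(gene_counts) == 0:
            continue  # the geometric mean would be zero
        log_geo_mean = sum(math.log(c) for c in gene_counts) / n_samples
        for j, c in enumerate(gene_counts):
            log_ratios[j].append(math.log(c) - log_geo_mean)
    return [math.exp(median(r)) for r in log_ratios]

factors = size_factors(counts)
print([round(f, 2) for f in factors])
```

If the counts were already scaled (e.g. to RPKM), this estimate and the variance model built on the counts would no longer be meaningful.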
I am interested in finding genes that are expressed only in one tissue and not in the others. So I would say it is differential, but in an absolute way: no up- and down-regulation.
The problem is that you cannot conclude that something is not expressed just because you have no reads.