Entering edit mode
5.8 years ago
Michel Edwar
▴
80
Ihave two conditions to compare. I am using HISAT2, sam tools and HTSEQ pipeline. However, I end up getting zero values for well known active genes in both WT and treated. I have viewed the output of the .bam files using IGV and it shows reads for these particular genes. Here is my code
hisat2-build -p 4 mus-genome.fa mkr
# run the following command to produce sam file and convert to bam file sort
hisat2Ref=/mnt/c/Users/Admin/Desktop/3adera-rna/mkr-index/mkr1
hisat2 -x ${hisat2Ref} \
-U RORgt24h_S9_L001_R1_001.fastq |
samtools view -bh - |
samtools sort - > mkr1_file.bam
samtools index mkr1_file.bam
# finally i use HTseq-count
htseq-count -f bam mkr1_file.bam mkr-mus-genome.gtf > 3adera1.txt
I wonder if you can help me pinpoint which step I am missing here. thank you
Did you check if your sequences are named the same in your gff file as in your fasta file? (that's a typical cause of these kind of issues)
yes i checked that's what i get mus-genome.fa
for mkr-mus-genome.gtf, i get
so what should i change?
Is it that all counts are zero (supporting the suggestion of lieven.sterck ) or are there some values > 0?
Not all counts are zero. i checked some of the known genes and they are zeros, but the volcano plot looks normal
right, since not all counts are zero the cmdline is technically fine it seems, so it will be likely something else then.
You said there are reads aligned in those regions when inspecting it in IGV. do they 100% coincide with the given gene-structures in your gtf file (== are all intron/exon boundaries respected)?