5.8 years ago
makwana.kd
I have an output file (text format) which I imported into an Excel spreadsheet. I see three columns, but I do not see any numeric values for the counts. Is this normal?
Column1 Column2 Column3
" XF" Z __ambiguous[ENSMUSG00000098178.1+ENSMUSG00000106106.2]
" XF" Z __ambiguous[ENSMUSG00000098178.1+ENSMUSG00000106106.2]
" XF" Z __alignment_not_unique
" XF" Z __alignment_not_unique
" XF" Z __alignment_not_unique
" XF" Z __alignment_not_unique
" XF" Z __no_feature
" XF" Z __no_feature
" XF" Z __alignment_not_unique
3 (5) words: "never use Excel" (for this).
Importing this kind of data file into Excel often causes unexpected behaviour.
You're better off processing this file on the command line in your Linux environment. I assume you ran the previous steps on the command line as well, no?
Now for your specific issue: can you post the output of
head <your htseq output file>
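As an illustration of processing such a file on the command line rather than in Excel, here is a minimal sketch that tallies how often each XF:Z: label occurs. The function name `tally_xf` is made up for this example, and it assumes the file holds whitespace-separated `XF:Z:<label>` fields like the ones posted in this thread. It only summarises the file at hand and is not a substitute for the htseq-count count table.

```shell
# tally_xf FILE: count how often each XF:Z: assignment occurs in FILE.
# Hypothetical helper; assumes whitespace-separated XF:Z:<label> fields.
# Steps: put one field per line, keep only the XF tags, strip the
# "XF:Z:" prefix, then count identical labels, most frequent first.
tally_xf() {
  tr -s ' \t' '\n' < "$1" |
    grep '^XF:Z:' |
    sed 's/^XF:Z://' |
    sort |
    uniq -c |
    sort -k1,1nr
}
```

It would be called as `tally_xf <your htseq output file>`, and prints one line per label with its count in front.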
Hi Lieven, the following is the command I used:
htseq-count -m union -f bam -s no -r name ALZT22-2Cunsorted.bam geneassembly.gff3 -o counread.text
The BAM file is sorted by name.
The head command gives me the following output:
XF:Z:__ambiguous[ENSMUSG00000098178.1+ENSMUSG00000106106.2]
XF:Z:__ambiguous[ENSMUSG00000098178.1+ENSMUSG00000106106.2]
XF:Z:__alignment_not_unique
XF:Z:__alignment_not_unique
XF:Z:__alignment_not_unique
XF:Z:__alignment_not_unique
XF:Z:__no_feature
XF:Z:__no_feature
XF:Z:__alignment_not_unique
XF:Z:__alignment_not_unique
Which file is this head output from? It does not look to be from counread.text, is it? If so, then the output from your htseq command is not correct.
Sorry, there was a misspelling in the above-mentioned command. This is the corrected one:
htseq-count -m union -f bam -s no -r name ALZT22-2Cunsorted.bam geneassembly.gff3 -o countread.text
Yes, the head command output was for countread.text
krishna@dntdaretouchit:/mnt/e/cannon$ head countread.text
XF:Z:__ambiguous[ENSMUSG00000098178.1+ENSMUSG00000106106.2]
XF:Z:__ambiguous[ENSMUSG00000098178.1+ENSMUSG00000106106.2]
XF:Z:__alignment_not_unique
XF:Z:__alignment_not_unique
XF:Z:__alignment_not_unique
XF:Z:__alignment_not_unique
XF:Z:__no_feature
XF:Z:__no_feature
XF:Z:__alignment_not_unique
XF:Z:__alignment_not_unique
krishna@dntdaretouchit:/mnt/e/cannon$
Did you not mention in a previous post that you converted the BAM file to SAM format? If so, then you need to change your htseq command accordingly.
In any case, the content of your countread.text file is not correct (it looks like some kind of SAM format?).
That was a different BAM file, which was giving me an error, so I converted it to a SAM file and ran that through HTSeq. That file gave me the following output:
chr1 3206084 255 1S139M = 3206084 -139 NTACAGTTAACCAACTTATACAGTTAACCAACTCCTACACTAGGTTCCTGAGCATTTCCTTAAACTTGCTAGTTCTGGTTTCCTGGCATGTGAGAGTAAGTCACATGGTAGGAGGCTGCCTTTCTATCJJJFJJFJJJAJFJJJFJFFJFJJJJJJJJJJJJJJJJJJJJJJJFJJJJJJJJFJJJJJA<<jjjjjjjjjjjjjjjjjjjjjjjjjjjjjjfjjjjjjjffjjjjjfjfjjjjjjjjfjjjjjjjjjjjjjjafaaa nh:i:1="" hi:i:1="" as:i:276="" nm:i:0="" xf:z:ensmusg00000051951.5="" gwnj-0965:181:gw180227920:7:2124:9881:65265="" 163="" chr1="" 3206084="" 255="" 139m1s="3206084" 139="" tacagttaaccaacttatacagttaaccaactcctacactaggttcctgagcatttccttaaacttgctagttctggtttcctggcatgtgagagtaagtcacatggtaggaggctgcctttctatcattcaattttagn<="" p="">
Because I wanted to bypass the BAM-to-SAM conversion step (I have 36 files and each SAM file would be around 40,000,000KB), I wanted to try a different BAM file. I therefore generated a new BAM file with the STAR aligner, sorted by name, and ran it through HTSeq. This is the file that gives the output mentioned above in this post.
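A possible explanation for the XF:Z: lines, going by the htseq-count documentation: the -o option is described as writing out the alignment records, each annotated with its feature assignment in an XF tag, while the table of counts itself is printed to standard output. If that applies here, the counts would need to be captured by redirecting standard output rather than via -o. A minimal sketch, assuming htseq-count is installed and on the PATH; the wrapper name `count_reads` and the output name `counts.txt` are made up for this example, and the other options are copied from the command above.

```shell
# count_reads BAM GFF OUT: run htseq-count with the options used in this
# thread and save the count table, which is printed on stdout, to OUT.
# Hypothetical wrapper; assumes htseq-count is installed and on PATH.
count_reads() {
  bam=$1
  gff=$2
  out=$3
  if ! command -v htseq-count >/dev/null 2>&1; then
    echo "htseq-count not found on PATH" >&2
    return 127
  fi
  # No -o here: the redirect captures the count table.
  htseq-count -m union -f bam -s no -r name "$bam" "$gff" > "$out"
}
```

For the files in this thread, the call would be `count_reads ALZT22-2Cunsorted.bam geneassembly.gff3 counts.txt`, after which `head counts.txt` should show gene identifiers followed by numbers if the run succeeded.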