Hi, while analyzing RNA-seq data I ran into some questions about using samtools and htseq-count. Can anyone help?
- I want to use htseq-count to get counts for all known genes, so I ran:
htseq-count brain_fetus1.sam ~/knowngene_hg19.gff
Error occured in line 3 of file /cchome/che/knowngene_hg19.gff.
Error: Feature uc001aaa.3. does not contain a 'gene_id' attribute
[Exception type: SystemExit, raised in count.py:55]
The GFF file looks like this:
chr1 hg19_knownGene gene 11874 14409 0.000000 + . ID=uc001aaa.3;Name=
chr1 hg19_knownGene mRNA 11874 14409 0.000000 + . ID=uc001aaa.3;Name=;Parent=uc001aaa.3
chr1 hg19_knownGene exon 11874 12227 0.000000 + . ID=uc001aaa.3.;Name=;Parent=uc001aaa.3
chr1 hg19_knownGene exon 12613 12721 0.000000 + . ID=uc001aaa.3.;Name=;Parent=uc001aaa.3
chr1 hg19_knownGene exon 13221 14409 0.000000 + . ID=uc001aaa.3.;Name=;Parent=uc001aaa.3
Is there an appropriate tool to convert GTF to GFF?
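For reference, the error seems to come from the exon lines carrying only ID, Name and Parent attributes, with no gene_id. A minimal sketch of one possible workaround is below: it rewrites each exon line into GTF-style attributes, using the Parent value as both gene_id and transcript_id. The function name gff3_exon_to_gtf is hypothetical, the file is assumed to be tab-delimited, and treating Parent as the gene is an assumption (for knownGene it is a transcript ID, so counts would be per transcript rather than per gene).

```python
def gff3_exon_to_gtf(line):
    """Convert one GFF3 exon line to a GTF-style line.

    Returns None for comments, blank lines, non-exon features,
    and exon lines without a Parent attribute.
    Assumption: columns are tab-delimited and Parent stands in for gene_id.
    """
    if line.startswith("#") or not line.strip():
        return None
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 9 or fields[2] != "exon":
        return None
    # Parse "key=value;key=value" attributes; empty values such as "Name=" are kept.
    attrs = dict(
        item.split("=", 1) for item in fields[8].split(";") if "=" in item
    )
    parent = attrs.get("Parent")
    if not parent:
        return None
    # GTF attributes use the form: key "value";
    fields[8] = f'gene_id "{parent}"; transcript_id "{parent}";'
    return "\t".join(fields)


def convert_file(gff_path, gtf_path):
    """Write the converted exon lines of gff_path to gtf_path."""
    with open(gff_path) as src, open(gtf_path, "w") as dst:
        for line in src:
            converted = gff3_exon_to_gtf(line)
            if converted is not None:
                dst.write(converted + "\n")
```

If I read the htseq-count help correctly, its -i (--idattr) option can also point at a different attribute than gene_id, which might avoid the conversion entirely; I have not verified that on this file.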
- Another question is about samtools:
To get the counts in specified regions, I ran:
samtools mpileup -l test.bed brain.bam > test.txt
The test.bed file:
chr1 11873 14409 uc001aaa.3 0 + 11873 11873 0 3 354,109,1189, 0,739,1347,
chr1 11873 14409 uc010nxr.1 0 + 11873 11873 0 3 354,52,1189, 0,772,1347,
chr1 11873 14409 uc010nxq.1 0 + 12189 13639 0 3 354,127,1007, 0,721,1529,
It seems the -l option doesn't work: the resulting test.txt still contains counts from the whole genome.
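In case a simpler region file is easier to debug with, here is a rough sketch that expands each 12-column BED line into one interval per exon, using the block sizes and block starts in the last two columns. The function name bed12_to_exons is hypothetical, and this only reshapes the regions; I am not claiming it explains the -l behaviour.

```python
def bed12_to_exons(line):
    """Expand one BED12 line into a list of (chrom, start, end, name) exon intervals.

    Coordinates stay 0-based, half-open, as in BED.
    Block starts are offsets relative to the chromStart column.
    """
    fields = line.split()
    if len(fields) < 12:
        raise ValueError("expected a 12-column BED line")
    chrom, chrom_start, name = fields[0], int(fields[1]), fields[3]
    # The block columns end with a trailing comma, so skip empty pieces.
    sizes = [int(x) for x in fields[10].split(",") if x]
    starts = [int(x) for x in fields[11].split(",") if x]
    return [
        (chrom, chrom_start + offset, chrom_start + offset + size, name)
        for offset, size in zip(starts, sizes)
    ]


if __name__ == "__main__":
    example = "chr1 11873 14409 uc001aaa.3 0 + 11873 11873 0 3 354,109,1189, 0,739,1347,"
    for chrom, start, end, name in bed12_to_exons(example):
        print(chrom, start, end, name, sep="\t")
```

For the first line of test.bed this gives the intervals 11873-12227, 12612-12721 and 13220-14409, which line up with the exon lines in the GFF file once the 0-based BED starts are shifted to 1-based.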
Thanks,
Che
Thanks. I think I got it.