Hi all,
I ran the code below:
samtools view MT1/Tophat_Out/accepted_hits.sorted.bam | python -m HTSeq.scripts.count -q -s no - ~/Indexes/Mus_musculus/UCSC/mm10/Genes/genes.gtf > MT1/MT1.count.txt
Then I got this error: /bin/python: No module named HTSeq.scripts
I reran the code and, although I am not sure what changed, I got a new error:
Error occurred when reading beginning of SAM/BAM file.
file has no sequences defined (mode='r') - is it SAM/BAM format? Consider opening with check_sq=False
[Exception type: ValueError, raised in libcalignmentfile.pyx:1000]
I found a solution on the Internet:
https://www.seqanswers.com/forum/bioinformatics/bioinformatics-aa/9688-problem-using-htseq
which said: "the script was added to ./local/bin instead of /bin"
However, I don't know how to apply it in my case. Could you please help? Thank you so much!
What does pip show HTSeq return?

And is there a particular reason for using HTSeq outside a Python script in the first place? Not to say that it doesn't work, but salmon or featureCounts are far more common programs for quantifying RNA-seq data. And it has been a long time since I saw someone using TopHat, since there are faster and more accurate aligners out there by now.

Thank you for your help!
Name: HTSeq
Version: 2.0.2
Summary: A framework to process and analyze data from high-throughput sequencing (HTS) assays
Home-page: https://github.com/htseq
Author: Simon Anders, Fabio Zanini
Author-email: fabio.zanini@unsw.edu.au
License: GPL3
Location: /gpfs2/home/user/.pyenv/versions/3.8.0/lib/python3.8/site-packages
Requires: numpy, pysam
Required-by:
I am trying to learn bulk RNA-seq analysis, so I am trying to reproduce this paper: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6373869/
The code above is from step 4 of protocol 1 in this paper. I could not find any better free bulk RNA-seq material with data and code to follow along, which is why I chose this paper, even though I know there are better tools, as you said.
I reran the code from the post and, although I am not sure what changed, I got a new error:
Error occurred when reading beginning of SAM/BAM file. file has no sequences defined (mode='r') - is it SAM/BAM format? Consider opening with check_sq=False [Exception type: ValueError, raised in libcalignmentfile.pyx:1000]
With pip show HTSeq, you confirmed that this module is installed on your system. However, its install path is within a pyenv virtual environment. So you need to activate the Python 3.8 virtual environment first, which you probably did without realizing when the error message changed.

Did you enter a pyenv activation command, or put one into your .bash_profile or .zprofile file?

As far as the second error is concerned:
This indicates that your SAM/BAM file has no header. If I am not mistaken, older versions of samtools view printed the header by default. Now, you need to specify the -h flag for this behavior. Therefore, try adding -h to the samtools view call in your pipeline.
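As a minimal sketch, reusing the paths from the original post (the variable names and the two sanity checks are my own additions, not part of the protocol in the paper):

```shell
#!/usr/bin/env bash
# Paths copied from the original post; adjust them to your own layout.
bam="MT1/Tophat_Out/accepted_hits.sorted.bam"
gtf="$HOME/Indexes/Mus_musculus/UCSC/mm10/Genes/genes.gtf"
out="MT1/MT1.count.txt"

# Check 1 (my addition): is HTSeq importable by the python found on PATH?
if ! python -c "import HTSeq" 2>/dev/null; then
    echo "HTSeq is not importable from: $(command -v python || echo 'no python on PATH')"
# Check 2 (my addition): do the input file and samtools exist?
elif [ ! -f "$bam" ] || ! command -v samtools >/dev/null 2>&1; then
    echo "samtools or $bam not found"
else
    # -h keeps the header lines (@HD, @SQ, ...) in the SAM stream,
    # which the reader behind HTSeq needs when reading from stdin.
    samtools view -h "$bam" \
        | python -m HTSeq.scripts.count -q -s no - "$gtf" > "$out"
fi
```

If the first message is printed, the python on your PATH is probably not the pyenv one that pip show reported, which would explain the "No module named HTSeq.scripts" error.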
Thank you so much! It worked.