Hello Biostars, I am using HTSeq to get raw counts from STAR-mapping. Here is my code.
#!/bin/bash. #SBATCH --job-name=HTSeq-count
#SBATCH --constraint=molpro
#SBATCH --nodes=1
#SBATCH --mem=50GB
#SBATCH --ntasks=8
#SBATCH --partition=mid
#SBATCH --time=1-0
#SBATCH --output=ak19-%j.out module load anaconda/3.6
source activate env_name
TMPDIR=/kuacc/users/akashgari19/tmp mainPath=/kuacc/users/akashgari19.
inpPath=/kuacc/users/akashgari19/Mouse_Mapping1
outdir=/kuacc/users/akashgari19/New_HTSeq gtf=/kuacc/users/akashgari19/Mouse_Genome declare -a pairedSamples1=("CON-1_" "CON-2_" "EDA-1_" "EDA-2_" "IL6-1_" "IL6-2_" "LIF-1_" "LIF-2_" "OSM-1_" "OSM-2_") for pairedSample in ${pairedSamples1[@]} ; do python -m HTSeq.scripts.count -f bam -r pos -m intersection-nonempty -s reverse -t exon $inpPath/$pairedSample"Aligned.sortedByCoord.out.bam" $gtf/Mus_musculus.GRCm38.101.chr.gtf > $pairedSample_HTSeq.txt ;done
I got the following error message:
Warning: Mate records missing for 12126 records; first such record: <SAM_Alignment object: Paired-end read 'A00957:13:H2TJY DSXY:3:2176:14805:10692' aligned to 1:[4897735,4897836)/->.
Also, I did not get any output file, and when I tried several times, I got the error message saying can not find the index file for the bam file. I tried to search online and read from the HTSeq manual, but I could not tackle it. below is another error message I get.
Could not retrieve index file for '/kuacc/users/akashgari19/Mouse_Mapping/CON-1_Aligned.sortedByCoord.out.bam'
Can you please help me? I also want to get differential exon usage and tried to read about DEXSeq following this manual DEXSeq manual, but could not figure out how to use HTSeq for getting the input data for DEXSeq. Can you please guide me on this too? Thank you very much.
how did you sort the sam/bam files that you use as input for HTseq?
in any case make sure they are name-sorted (not the default position sorted).
EDIT: ok, the
-r pos
should evoke that input is positional sorted.can you check if those missing mates are part of your input sam/bam file? Perhaps STAR omitted them in the output of the alignment step?
Sorry, if it was not clear. So, it worked when I just use a single bam file, with pos setting , instead of feeding all of them. I will try again with changing it to name. How about getting reads for DEXSeq? Thanks.
ok, then something might be wrong with your bash loop.
do you by any chance have a hidden file called ".txt" ? at first sight I think
$pairedSample_HTSeq
is not set and you might thus end up with only files called.txt
.I think I understood what you said. Thx. Can someone help me with how to prepare the data for DEXSeq from HTSeq? I read the manual but still could not prepare the input data.
Thank you very much.