8.3 years ago
Mike
Hi,
I am running htseq-count to count reads from a STAR output BAM file, but I get the following error:
STAR --genomeDir /databank/igenomes/Homo_sapiens/STAR --readFilesIn R1.fastq --runThreadN 8 --outFileNamePrefix R1a --outSAMtype BAM Unsorted
htseq-count -m union -i gene_name R1aAligned.out.bam hg19.gtf > output_usingbam.counts
Error occured when reading first line of sam file.
('SAM line does not contain at least 11 tab-delimited fields.', 'line 1 of file R1aAligned.out.bam')
[Exception type: ValueError, raised in _HTSeq.pyx:1276]
But there is no error when I use the default STAR output (SAM) file in htseq-count:
STAR --genomeDir /databank/igenomes/Homo_sapiens/STAR --readFilesIn R1.fastq --runThreadN 8 --outFileNamePrefix R1a
htseq-count -m union -i gene_name R1aAligned.out.sam hg19.gtf > output_usingsam.counts
It runs successfully.
I also get an error when I sort the SAM file.
Please help me find the problem.
Thanks
Thanks Michael,
My BAM file is not sorted, so I used the following command, but still got an error:
htseq-count: error: no such option: -f
Maybe you should update your htseq program.
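For context, the first error appears when htseq-count parses a binary BAM file as text SAM, and the `-f` option (`sam` or `bam`) is what selects the input format in versions that support it. The following is only a minimal sketch, not the poster's actual command: `guess_format` is a hypothetical helper that picks the value for `-f` from the first two bytes of the file, assuming BAM files are BGZF-compressed and therefore start with the gzip magic bytes `1f 8b`.

```shell
# Hypothetical helper: guess the value for htseq-count's -f option.
# BAM files are BGZF (gzip-compatible) and start with the bytes 1f 8b;
# plain-text SAM files do not. Note that a gzipped SAM file would also
# be reported as "bam" by this simple check.
guess_format() {
    magic=$(head -c 2 "$1" | od -An -tx1 | tr -d ' \n')
    if [ "$magic" = "1f8b" ]; then
        echo bam
    else
        echo sam
    fi
}

# Possible usage with the file names from the question (not run here):
# fmt=$(guess_format R1aAligned.out.bam)
# htseq-count -f "$fmt" -m union -i gene_name R1aAligned.out.bam hg19.gtf > output_usingbam.counts
```

If htseq-count answers `no such option: -f`, the installed version predates BAM support, which is why updating is the first thing to try.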
You might also try the gene counting built into STAR itself (it works with STAR 2.5.2). Setting the parameter --quantMode GeneCounts will produce a file whose "counts coincide with those produced by htseq-count with default parameters" (STAR Manual 2.5.2a).
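A rough sketch of that approach, under a few assumptions: the STAR call is the one from the question with `--quantMode GeneCounts` added, annotations are supplied via `--sjdbGTFfile` (unless the index was already built with them), and the output table is `R1aReadsPerGene.out.tab`, following the prefix used above. The helper name `unstranded_counts` is made up for illustration. The table starts with four summary rows (`N_unmapped`, `N_multimapping`, `N_noFeature`, `N_ambiguous`), and column 2 holds the unstranded counts.

```shell
# Possible STAR call (not run here); hg19.gtf is the annotation from the question.
# STAR --genomeDir /databank/igenomes/Homo_sapiens/STAR --readFilesIn R1.fastq \
#      --runThreadN 8 --outFileNamePrefix R1a \
#      --sjdbGTFfile hg19.gtf --quantMode GeneCounts

# Hypothetical helper: print gene ID and unstranded count (columns 1 and 2),
# skipping the four summary rows at the top of ReadsPerGene.out.tab.
unstranded_counts() {
    awk -F'\t' 'NR > 4 { print $1 "\t" $2 }' "$1"
}

# Possible usage:
# unstranded_counts R1aReadsPerGene.out.tab > R1a.star.counts
```

Note that STAR keys the counts by the GTF `gene_id` attribute by default, whereas the htseq-count commands above use `-i gene_name`, so the row labels may differ between the two outputs.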
Thanks,
I updated htseq, but still get an error.
Does your SAM file have the correct header (samtools view -SH R1aAligned.out.sam), and does it match the GTF's chromosome names? Which version of STAR are you using, and what is the content of R1aLog.final.out?
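One way to run that comparison on a text SAM file, as a sketch only: the function names below are hypothetical, and the code reads the `@SQ` header lines directly with grep instead of going through samtools. It lists the sequence names that occur both in the SAM header and in column 1 of the GTF.

```shell
# Sequence names declared in the SAM header (@SQ lines, SN: tag).
sam_seqnames() {
    grep '^@SQ' "$1" | tr '\t' '\n' | sed -n 's/^SN://p' | sort -u
}

# Sequence names used in column 1 of the GTF (comment lines skipped).
gtf_seqnames() {
    grep -v '^#' "$1" | cut -f1 | sort -u
}

# Names present in both files. An empty result means the naming schemes
# differ (for example "chr1" in one file and "1" in the other).
shared_seqnames() {
    sam_names=$(mktemp)
    gtf_names=$(mktemp)
    sam_seqnames "$1" > "$sam_names"
    gtf_seqnames "$2" > "$gtf_names"
    comm -12 "$sam_names" "$gtf_names"
    rm -f "$sam_names" "$gtf_names"
}

# Possible usage with the files from the question:
# shared_seqnames R1aAligned.out.sam hg19.gtf
```

If the header has no `@SQ` lines at all, the first function prints nothing, which would also point to a problem with the SAM file itself.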
Thanks, it works now.
The problem was in the STAR output SAM file.
Could you please check whether the alignment results are OK?
This is R1aLog.final.out file:
Mapping speed, Million of reads per hour | 147.32
% of reads unmapped: too many mismatches | 0.00%
% of reads unmapped: too short | 7.66%
% of reads unmapped: other | 0.23%
It looks OK; you have ~ 92% mapping rate (82 % uniquely mapped reads + 10% multi-mapping reads). With HTSeq-count only the 82% will be processed.
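The sum above can be pulled straight from the log. This is a small sketch with a hypothetical function name, assuming the usual `Log.final.out` layout in which each statistic sits on its own line as `label | value` and the two relevant labels are `Uniquely mapped reads %` and `% of reads mapped to multiple loci`.

```shell
# Hypothetical helper: report unique, multi-mapping and combined mapping
# rates from a STAR Log.final.out file.
mapping_rate() {
    awk -F'|' '
        /Uniquely mapped reads %/            { u = $2 }
        /% of reads mapped to multiple loci/ { m = $2 }
        END {
            # Strip whitespace and the percent sign before adding.
            gsub(/[ \t%]/, "", u)
            gsub(/[ \t%]/, "", m)
            printf "unique=%.2f multi=%.2f total=%.2f\n", u, m, u + m
        }' "$1"
}

# Possible usage:
# mapping_rate R1aLog.final.out
```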
You can check your raw fastq file for adapter contamination (e.g. with FastQC reporter) and maybe trim if found. You can also check your alignment with RSeQC (gene body coverage; strandedness, read distribution over annotation).
Thank you very much. I really appreciate your nice explanation.
Hi Mike, I am having the same problem. You mentioned that the problem was in the STAR output SAM file. How did you solve it? Thanks!