8.0 years ago
biologo
Hello, friends:
I was using htseq-count to count reads, but something went wrong, like this:
Error occured when processing SAM input (line 35008273 of file sort.sam):
'pair_alignments' needs a sequence of paired-end alignments
[Exception type: ValueError, raised in __init__.py:603]
The input is paired-end reads, and these are the commands I ran:
samtools sort -n uniqmap.bam -o sort.bam
samtools view -@ 4 sort.bam -O SAM -o sort.sam
htseq-count -m union -s no sort.sam $GTF > union.out
I have checked it several times but am still stuck. I am looking forward to your reply, thanks.
As it says, htseq-count expects paired-end alignments but did not find them there. Maybe in your uniqmap.bam one mate of a pair was filtered out because it was not uniquely mapping while the other mate was. I'd check that.
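One way to check for this is to scan the SAM file for reads that are flagged as paired but whose mate is absent. The following is a minimal sketch, not part of HTSeq: the function name `find_orphans` is hypothetical, and it assumes plain tab-separated SAM text with the standard flag bits (0x1 paired, 0x40 first in pair, 0x80 second in pair).

```python
from collections import defaultdict

# Standard SAM flag bits used here
PAIRED, READ1, READ2 = 0x1, 0x40, 0x80


def find_orphans(sam_lines):
    """Return sorted names of paired-flagged reads for which only one mate is present."""
    mates_seen = defaultdict(set)
    for line in sam_lines:
        # Skip blank lines and header lines
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        name, flag = fields[0], int(fields[1])
        if not flag & PAIRED:
            continue  # single-end record, not a mate of anything
        mates = mates_seen[name]
        if flag & READ1:
            mates.add(1)
        if flag & READ2:
            mates.add(2)
    return sorted(name for name, mates in mates_seen.items() if mates != {1, 2})
```

Called as `find_orphans(open("sort.sam"))`, a non-empty result would suggest that some mates were lost before counting. Note that it holds all read names in memory, so it is meant for spot checks rather than very large files.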
I already told him he didn't have to pre-filter earlier in this thread: what's wrong with htseq-count codes???
But reading is hard.
Dear WouterDeCoster, I am so sorry, it has been such a busy day. Some people told me the main problem may be the original BAM file (TopHat2 output). I am still a newbie, but I do not think the original file is at fault. Anyway, thank you for your kind help; I will tell you the result later.
It is empty again:
samtools sort -@ 15 -n accepted_hits.bam -o accepted_hits_sort.bam
htseq-count -f bam -m union -s no accepted_hits_sort.bam $GTF > union.out
Even with the original file, the output is empty again.
What is empty? Could you be more specific? Earlier you were talking about that error message, but now it's something else?
Yes, the same error message. I searched for an answer, and it may be caused by the original BAM file from TopHat2, which mixes paired and single reads, so I then separated them. (Before the reads were mixed, I never met this error.)
J00142:73:HGKWTBBXX:2:1101:1215:7679 65 chr5 156482411 50 55M chr16 3454521 0 GCCATTTTGGCATGTGAATAGAGAACATGAGCCTCTATTCCAGCACATTGATGTG FJAFJFFJJF<JFFJJJJJJ7AFJFJFJJF-F-AJ<JJJJJJ7F<A7F<FAJJJ< AS:i:-4 XN:i:0 XM:i:1 XO:i:0 XG:i:0 NM:i:1 MD:Z:48G6 YT:Z:UU XS:A:- NH:i:1
J00142:73:HGKWTBBXX:2:2227:1215:43796 0 chrMT 10806 50 42M * 0 0 GACTTTCCAAAAAACACATAATTTGAATCAACACAACCACCC FJJ<AFFJFJJJJJJJJJFF<JJJFFFJJJAJJJFFJJJJJJ AS:i:0 XN:i:0 XM:i:0 XO:i:0 XG:i:0 NM:i:0 MD:Z:42 YT:Z:UU XS:A:+ NH:i:1
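The flag column (second field) of the two records shows the mix: 65 marks a paired read that is first in its pair, while 0 has no pairing bits set at all, i.e. a single-end record. A small sketch for decoding flags, assuming the bit meanings from the SAM specification; the names `FLAG_BITS` and `decode_flag` are only illustrative:

```python
# Bit meanings as defined for the FLAG field in the SAM specification
FLAG_BITS = {
    0x1: "paired",
    0x2: "proper_pair",
    0x4: "unmapped",
    0x8: "mate_unmapped",
    0x10: "reverse",
    0x20: "mate_reverse",
    0x40: "first_in_pair",
    0x80: "second_in_pair",
    0x100: "secondary",
    0x200: "qc_fail",
    0x400: "duplicate",
    0x800: "supplementary",
}


def decode_flag(flag):
    """Return the names of all bits set in a SAM flag value."""
    return [name for bit, name in FLAG_BITS.items() if flag & bit]


print(decode_flag(65))  # ['paired', 'first_in_pair']
print(decode_flag(0))   # []
```

A file in which some records decode with `paired` and others without it contains both paired-end and single-end alignments.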
htseq-count succeeded, but the tough thing is that I want to use the BAM file for MACS2 callpeak, and the mixed one does not seem to work. Again, I am so happy for your reply.
Dear michael.ante: thank you for your kind suggestion; it is running now. The thing is, I need to use the BAM file next in another program, and that program reported an error, even with accepted_hits.bam. After that I used the BED file instead, successfully, some days ago. But yesterday I started to wonder what was wrong with the BAM file, ran htseq-count again, and then posted this.
I'm not sure what's wrong, but htseq-count can take BAM files, so there is no need to convert to SAM. See the -f flag in the manual.
Thanks a lot, I will try it later, but the question is still there.
Finally, I found what was wrong: when processing the clean_data, I used the single-end mode, which produced the unpaired reads. Thank you for your precious time.
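For anyone hitting the same problem, a quick way to see whether cleaning left the two FASTQ files out of sync is to compare read IDs record by record. This is only a sketch under assumptions: the function names are hypothetical, each record is assumed to span exactly four lines, and an old-style `/1` or `/2` suffix on the ID is stripped before comparing.

```python
from itertools import zip_longest


def read_ids(fastq_lines):
    """Yield the read ID from each FASTQ header (every 4th non-blank line)."""
    non_blank = (line for line in fastq_lines if line.strip())
    for i, line in enumerate(non_blank):
        if i % 4 == 0:
            # Header looks like "@ID optional description"
            read_id = line.strip()[1:].split()[0]
            # Drop an old-style /1 or /2 mate suffix if present
            if read_id.endswith(("/1", "/2")):
                read_id = read_id[:-2]
            yield read_id


def first_mismatch(r1_lines, r2_lines):
    """Return (index, id1, id2) of the first out-of-sync record, or None if in sync."""
    pairs = zip_longest(read_ids(r1_lines), read_ids(r2_lines))
    for i, (id1, id2) in enumerate(pairs):
        if id1 != id2:
            return i, id1, id2
    return None
```

If `first_mismatch` returns anything other than `None` for the cleaned R1 and R2 files, mates were dropped or reordered independently, which is what single-end cleaning of paired data tends to do.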