3.1 years ago
Simon Ahn
Hi. I'm new to bioinformatics and am trying to extract read counts from FASTQ files.
I compared my result with the answer count matrix, and my read counts are doubled.
(The left one is from the answer read count matrix, and the right one is my result.)
I used the commands below on Ubuntu to get my result. Could you please tell me what went wrong?
hisat2 -p 50 \
-x [ENSEMBL reference file] \
-1 [fastq file_1] \
-2 [fastq file_2] \
-S [output file name].sam
samtools sort -@ 8 -o [output file name].bam [input file name].sam
featureCounts -p -T 10 -a [GTF file] \
-o [output file name] \
[input file name].bam
I think I didn't apply the paired-end option in one of the commands, but I couldn't figure out which one.
Basically, it sounds like they have tacitly changed how the tool operates, and as a result most training materials have become outdated, leading to bugs and inconsistencies down the line ...
http://subread.sourceforge.net/
I don't even understand this:
It kind of sounds like in the past -p would count reads as pairs, and now one needs to pass both -p and --countReadPairs together. But then what effect does -p alone have?
Problem solved thanks to you guys!
According to the featureCounts manual, I should have added --countReadPairs so that each fragment (the forward and reverse reads of a pair) is counted once. That explains why my result was doubled. IMHO, passing only -p just makes the run stop when the input is the wrong data type. Thanks a lot!
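For anyone landing here with the same doubled counts, here is a minimal sketch of the corrected featureCounts call, assuming a Subread release where -p only marks the input as paired-end and --countReadPairs is what switches counting from reads to fragments. The file names are hypothetical placeholders, not the ones from the original post. The script prints the command and only runs it if featureCounts and the input files are actually present.

```shell
# Hypothetical placeholder paths; substitute your own files.
# (Kept free of spaces because the command string is word-split below.)
GTF="annotation.gtf"
BAM="sample.bam"
OUT="counts.txt"

# -p               : the input contains paired-end reads
# --countReadPairs : count fragments (read pairs) instead of individual reads
# -T 10            : number of threads, as in the original command
CMD="featureCounts -p --countReadPairs -T 10 -a $GTF -o $OUT $BAM"

# Show the command that would be run.
echo "$CMD"

# Run it only when the tool and the input files are available.
if command -v featureCounts >/dev/null 2>&1 && [ -f "$GTF" ] && [ -f "$BAM" ]; then
  $CMD
fi
```

With only -p, each mate of a pair is counted on its own, which matches the doubling seen in the question.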