I'm trying to understand whether or not I should be using the -p
flag for featureCounts
.
Here's the explanation:
-p If specified, fragments (or templates) will be counted
instead of reads. This option is only applicable for
paired-end reads; single-end reads are always counted as
reads
Here's the output with and without the flag:
# Program:featureCounts v2.0.1; Command:"/expanse/projects/jcl110/anaconda3/envs/VEBA-assembly_env/bin/featureCounts" "-a" "veba_output/assembly/SRR5720299/intermediate/1__assembly/scaffolds.fasta.saf" "-o" "featurecounts.tsv" "-F" "SAF" "--tmpDir" "veba_output/assembly/SRR5720299/tmp/featurecounts" "-T" "1" "veba_output/assembly/SRR5720299/intermediate/2__alignment/mapped.sorted.bam"
Geneid Chr Start End Strand Length veba_output/assembly/SRR5720299/intermediate/2__alignment/mapped.sorted.bam
SRR5720299__NODE_1_length_71757_cov_11.4961 SRR5720299__NODE_1_length_71757_cov_11.4961 1 71757 + 71757 11519
SRR5720299__NODE_2_length_69346_cov_13.9814 SRR5720299__NODE_2_length_69346_cov_13.9814 1 69346 + 69346 13008
SRR5720299__NODE_3_length_55956_cov_13.3612 SRR5720299__NODE_3_length_55956_cov_13.3612 1 55956 + 55956 10725
SRR5720299__NODE_4_length_53799_cov_18.5165 SRR5720299__NODE_4_length_53799_cov_18.5165 1 53799 + 53799 12722
SRR5720299__NODE_5_length_48507_cov_9.1492 SRR5720299__NODE_5_length_48507_cov_9.1492 1 48507 + 48507 6543
SRR5720299__NODE_6_length_45224_cov_17.2711 SRR5720299__NODE_6_length_45224_cov_17.2711 1 45224 + 45224 10538
SRR5720299__NODE_7_length_41565_cov_12.9338 SRR5720299__NODE_7_length_41565_cov_12.9338 1 41565 + 41565 8035
SRR5720299__NODE_8_length_41525_cov_13.7932 SRR5720299__NODE_8_length_41525_cov_13.7932 1 41525 + 41525 7738
(base) [jespinoz@exp-15-06 TestVEBA]$
(base) [jespinoz@exp-15-06 TestVEBA]$ head featurecounts.paired.tsv
# Program:featureCounts v2.0.1; Command:"/expanse/projects/jcl110/anaconda3/envs/VEBA-assembly_env/bin/featureCounts" "-a" "veba_output/assembly/SRR5720299/intermediate/1__assembly/scaffolds.fasta.saf" "-o" "featurecounts.paired.tsv" "-F" "SAF" "--tmpDir" "veba_output/assembly/SRR5720299/tmp/featurecounts" "-T" "1" "-p" "veba_output/assembly/SRR5720299/intermediate/2__alignment/mapped.sorted.bam"
Geneid Chr Start End Strand Length veba_output/assembly/SRR5720299/intermediate/2__alignment/mapped.sorted.bam
SRR5720299__NODE_1_length_71757_cov_11.4961 SRR5720299__NODE_1_length_71757_cov_11.4961 1 71757 + 71757 5801
SRR5720299__NODE_2_length_69346_cov_13.9814 SRR5720299__NODE_2_length_69346_cov_13.9814 1 69346 + 69346 6573
SRR5720299__NODE_3_length_55956_cov_13.3612 SRR5720299__NODE_3_length_55956_cov_13.3612 1 55956 + 55956 5411
SRR5720299__NODE_4_length_53799_cov_18.5165 SRR5720299__NODE_4_length_53799_cov_18.5165 1 53799 + 53799 6386
SRR5720299__NODE_5_length_48507_cov_9.1492 SRR5720299__NODE_5_length_48507_cov_9.1492 1 48507 + 48507 3322
SRR5720299__NODE_6_length_45224_cov_17.2711 SRR5720299__NODE_6_length_45224_cov_17.2711 1 45224 + 45224 5330
SRR5720299__NODE_7_length_41565_cov_12.9338 SRR5720299__NODE_7_length_41565_cov_12.9338 1 41565 + 41565 4111
SRR5720299__NODE_8_length_41525_cov_13.7932 SRR5720299__NODE_8_length_41525_cov_13.7932 1 41525 + 41525 3892
Using paired-end method, it's ~2x without the paired-end method but not exact.
My questions:
Why isn't -p run 2X of a run without -p? Is it because only one read in the pair aligns in some cases?
Would it be a safe bet to just set -p for all my paired-end runs?
the
--countReadPairs
applies only for new versions of featurcounts, in those cases-p
has no effect on the counts that are reportedThe OP must be using an older version, and the question is rather why isn't the number exactly 2x and whether they should be using -p at all.
Based on the names this data looks to be some kind of assembly. We have not idea how that was made and what aligner is being used to align the reads so some of the odd behavior seen may be a result of the data in case.
So just to be clear, I should use both
-p
and--countReadPairs
for my paired reads? Also, if I happened to use single ended reads (e.g., ONT reads) and the above options were on, would it throw an error or just ignore?As long as you have upgraded to a recent version that has these options.
There are no paired-end reads in case of ONT so these options do not apply.
Ok cool I just tested it. You can only use the above with paired end as it fails with single ended long reads. So to be clear, if you are using ONT reads then don't do -p and --countReadPairs.