Question

FeatureCounts Assertion `l_name < 100' failed.

0

Entering edit mode

8.2 years ago

shg018 ▴ 10

Hi I have been trying to run featureCounts on my mapped bam extension files with the command: featureCounts -s 2 -t exon -g gene_id -a /oasis/.../genes_l2.gtf -o /oasis/.../CompleteRNA.txt file.bam

and have been getting:

    ==========     _____ _    _ ____  _____  ______          _____  
    =====         / ____| |  | |  _ \|  __ \|  ____|   /\   |  __ \ 
      =====      | (___ | |  | | |_) | |__) | |__     /  \  | |  | |
        ====      \___ \| |  | |  _ <|  _  /|  __|   / /\ \ | |  | |
          ====    ____) | |__| | |_) | | \ \| |____ / ____ \| |__| |
    ==========   |_____/ \____/|____/|_|  \_\______/_/    \_\_____/
  v1.5.1

//========================== featureCounts setting ===========================\\
||                                                                            ||
||             Input files : 1 BAM file                                       ||
||                           P FF1_1_R.bam                                    ||
||                                                                            ||
||             Output file : /oasis/tscc/scratch/shagupta/L2_oocyte/Final ... ||
||                 Summary : /oasis/tscc/scratch/shagupta/L2_oocyte/Final ... ||
||              Annotation : /oasis/tscc/scratch/shagupta/Genome_new/gene ... ||
||      Dir for temp files : /oasis/tscc/scratch/shagupta/L2_oocyte/Final ... ||
||                                                                            ||
||                 Threads : 1                                                ||
||                   Level : meta-feature level                               ||
||              Paired-end : no                                               ||
||         Strand specific : reversely stranded                               ||
||      Multimapping reads : not counted                                      ||
|| Multi-overlapping reads : not counted                                      ||
||       Overlapping bases : 0.0%                                             ||
||                                                                            ||
\\===================== http://subread.sourceforge.net/ ======================//

//================================= Running ==================================\\
||                                                                            ||
|| Load annotation file /oasis/.../genes_l2. ... ||
||    Features : 690739                                                       ||
||    Meta-features : 46170                                                   ||
||    Chromosomes/contigs : 165                                               ||
||                                                                            ||
|| Process BAM file FF1_1_R.bam...                                            ||
||    Paired-end reads are included.                                          ||
||    Assign reads to features...                                             ||
featureCounts: readSummary.c:1578: process_pairer_header: Assertion `l_name < 100' failed.
/var/spool/torque/mom_priv/jobs/6660772.tscc-mgr.local.SC: line 18:   668 Aborted                 featureCounts -s 2 -t exon -g gene_id -a /oasis/.../genes_l2.gtf -o /oasis/.../CompleteRun1.txt ${file}

--------------------------------------------------------------------------------------------------------------------------------------------------------------

I was wondering about the possible reason for said error. Can it be due to low mapping percentage in bam file.

RNA-Seq • 3.8k views

ADD COMMENT • link updated 8.2 years ago by Devon Ryan 104k • written 8.2 years ago by shg018 ▴ 10

1

Entering edit mode

This suggests that you have a read whose name is at least 100 character long. Apparently the buffer used by featureCounts can't handle longer. That's not really a problem with you BAM file, though I've never seen a real file with a read name that long.

ADD REPLY • link 8.2 years ago by Devon Ryan 104k

0

Entering edit mode

Thanks!! This really helped! :)

ADD REPLY • link 8.2 years ago by shg018 ▴ 10

0

Entering edit mode

Would you recommend any way of fixing this, without going and manually editing gene name in the annotation file?

ADD REPLY • link 8.2 years ago by shg018 ▴ 10

1

Entering edit mode

You mean the name in the BAM file...that and finding where this is coded in featureCounts and changing it are the only solutions. You might just give htseq-count a try (it's much slower, so keep that in mind).

ADD REPLY • link 8.2 years ago by Devon Ryan 104k

0

Entering edit mode

So, I used htseq-counts to read counts this time, using the sam files this time, and I get a file with 0 reads:

__no_feature    1358293
__ambiguous 0
__too_low_aQual 0
__not_aligned   0
__alignment_not_unique  2714159

and a warning during runtime:

Warning: Mate records missing for 30774 records; first such record: <SAM_Alignment object: Paired-end read 'HWI-D00611:118:C6KAHANXX:1:1110:15398:49565' aligned          to ENSMUST00000209497.1|ENSMUSG00000110001.1|OTTMUSG00000060230.1|OTTMUST000001         47449.1|RP23-323L3.5-001|RP23-323L3.5|2415|:[1812,1899)/->.

and I was wondering if the output is because of this warning, (if you could also tell me what this warning means?). Or could the probable cause be that my gtf file was wrong (I used long-noncoding RNA specific gtf file which I also used in the genome build).

ADD REPLY • link updated 8.1 years ago by Devon Ryan 104k • written 8.1 years ago by shg018 ▴ 10

1

Entering edit mode

The warning isn't a big deal, the buffer that it uses to store a read while looking for its mate isn't that big.

If you alignments matched a feature then presumably your BAM and GTF files use different chromosome names (featureCounts can handle that, htseq-count can't).

ADD REPLY • link 8.1 years ago by Devon Ryan 104k

0

Entering edit mode

But if I mapped using the same gtf file, that is I included the annotation file while building the genome, then how would my BAM/ SAM(in this case), have different names? Shouldn't it be continuous across all the files?

ADD REPLY • link 8.1 years ago by shg018 ▴ 10

0

Entering edit mode

What matters is the fasta file chromosome names, if those match the GTF then the BAM will too. If not then you'll have issues.

If they do match, have a look at things in IGV.

ADD REPLY • link 8.1 years ago by Devon Ryan 104k

0

Entering edit mode

Hi, sorry to keep bugging you, but what exactly do I look for in IGV?

Also, I used my gtf and fa files from http://www.gencodegenes.org/mouse_releases/current.html since I needed specific ones for lon-non coding rna's, and I don't observe this problem when I map to a complete genome! Would you recommend any other site to get lnc-specific gtf and fa files from to build my genome?

ADD REPLY • link 8.1 years ago by shg018 ▴ 10

0

Entering edit mode

Hello shg018!

It appears that your post has been cross-posted to another site: https://support.bioconductor.org/p/87502/

This is typically not recommended as it runs the risk of annoying people in both communities.

ADD REPLY • link 8.2 years ago by WouterDeCoster 47k