Question

R removes 1st column (gene-id) from featureCounts count.txt table

21 months ago

Pegasus ▴ 120

Hi all,

I generated a count.txt from sorted.bam files using featureCounts on Linux, following the usual RNA-Seq data analysis steps.

1- Using a text editor, I checked the count.txt file and found the following columns:

geneid  Chr Start   End Strand  Length  sample1 sample2 etc

However,

2- The first column name (geneid) was removed when I opened the file using R.

(EMPTY) Chr Start   End Strand  Length  sample1 sample2 etc

Neither colnames() nor rownames() showed the geneid title.
When I tried to rename or add a name for the 1st column, R renamed the 2nd column instead, replacing Chr with geneid.

So, why did R remove the geneid name, and how can I add it back so that I can proceed to edgeR?

Any help you can provide is greatly appreciated.

featureCounts RNA-Seq R • 1.3k views

ADD COMMENT • link 21 months ago by Pegasus ▴ 120

Not an R expert.

It sounds as if the first column is being used as an index (row names in R), and is therefore not named. It may be helpful to others if you paste the output of `head count.txt` and show the exact RStudio command used to open the file.

ADD REPLY • link 21 months ago by Mensur Dlakic ★ 28k

That's unusual. Please show the command to read the file.

ADD REPLY • link 21 months ago by ATpoint 86k

score 1 · Answer 1 · 2023-03-06

Thanks for the replies. I was able to fix it using the steps below. I will keep the script here in case anyone else faces the same issue.

Read in the count matrix without row names: counts <- read.table("feature.counts", header = TRUE, check.names = FALSE)
Add a new column with the gene IDs: counts <- cbind(geneid = rownames(counts), counts)
Remove the row names: rownames(counts) <- NULL
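
For anyone who wants to try this end to end, here is a minimal sketch on a toy table. The file contents, the temporary path, and the names annotation_cols and count_matrix are made up for illustration, and row.names = 1 is only an assumption about how the gene IDs ended up as row names in the first place (the original read command was never posted).

```r
# Write a toy featureCounts-like table to a temporary file (made-up values).
path <- tempfile(fileext = ".txt")
writeLines(c(
  "# Program:featureCounts (comment line; read.table skips it by default)",
  "geneid\tChr\tStart\tEnd\tStrand\tLength\tsample1\tsample2",
  "geneA\tchr1\t100\t900\t+\t800\t10\t12",
  "geneB\tchr1\t1500\t2100\t-\t600\t0\t3"
), path)

# Assumed cause of the symptom: the first column is consumed as row names,
# so "geneid" no longer appears in colnames() and "Chr" comes first.
counts <- read.table(path, header = TRUE, sep = "\t",
                     check.names = FALSE, row.names = 1)
before <- colnames(counts)

# The fix from the steps above: copy the row names into a real first column,
# then reset the row names to the default 1, 2, ...
counts <- cbind(geneid = rownames(counts), counts)
rownames(counts) <- NULL

# Optional follow-up: split the annotation columns from a numeric count matrix
# that keeps the gene IDs as row names.
annotation_cols <- c("geneid", "Chr", "Start", "End", "Strand", "Length")
count_matrix <- as.matrix(counts[, setdiff(colnames(counts), annotation_cols)])
rownames(count_matrix) <- counts$geneid

# With edgeR installed, something like the following should then work:
# y <- edgeR::DGEList(counts = count_matrix, genes = counts[, annotation_cols])
```

Note that gene IDs stored as row names are not lost; they are just not a named column. If the goal is only to get to edgeR, a count matrix with gene IDs as row names is usually all that is needed, so the geneid column is mainly useful for keeping the annotation alongside the counts.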