Issues using the Value Matching command in R
4.8 years ago
Adeler001 • 0

Hello, I am trying to use the value matching operator (%in%) in R to find a specific Ensembl exon ID in the exon count table that featureCounts generated for my RNA-seq data. Everything works until I use %in%. Here is the script I used:

tab= read.table("exons_RNA-seq_sorted.csv", header=F, sep="\t")
tab= tab[-c(1), ]
line1 = tab[1,]
line1b = as.matrix(line1)
colnames(tab) = line1b
tab= tab[-c(1), ]


tab = tab[tab$'ENSG00000223972.4'%in%'Geneid',]
printFilter(tab, 'variants in CTNNA1 for D4440')
tab=zzef1
R RNA-Seq • 914 views

Please provide example data: a couple of rows from your file, exons_RNA-seq_sorted.csv.

Why not use read.csv and keep the headers?

Where does 'Geneid' come from? Is it a variable or a string?

Which packages are loaded? Where does printFilter come from?


Geneid is one of the column headers of my table. I didn't think of using read.csv, and I didn't install any packages. I just tried read.csv and got this error message: Error in read.table(file = file, header = header, sep = sep, quote = quote, : more columns than column names
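One way to look into a "more columns than column names" error is to count the fields on each line under different separators. The sketch below writes a small stand-in file (the file contents and the second gene ID are made up for illustration, and the real file may be laid out differently) and compares a comma separator with a tab separator using base R's count.fields.

```r
# Hypothetical stand-in for the real file: tab-delimited, header on the first line.
path <- tempfile(fileext = ".txt")
writeLines(c(
  "Geneid\tChr\tStart\tEnd\tStrand\tLength\tD4001\tD4002",
  "ENSG00000223972.4\tchr1\t11869\t12227\t+\t35\t8\t22",
  "ENSG00000227232.4\tchr1\t14363\t29806\t-\t40\t15\t3"
), path)

# Number of fields found on each line for two candidate separators.
n_comma <- count.fields(path, sep = ",")
n_tab   <- count.fields(path, sep = "\t")

print(n_comma)  # one field per line: commas do not split this file
print(n_tab)    # the same count on every line: tab is the real separator
```

If the counts differ between the header line and the data lines for the separator in use, that mismatch is the likely source of the error.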


Here is what my table looks like:

Geneid             Chr   Start  end    Strand  Length  D4001  D4002  D4003  D4004  D4005  D4006
ENSG00000223972.4  chr1  11869  12227  +       35      8      22     33     44     55     66

tab[tab[,1]=="ENSG00000223972.4",]

Check that the first column is of character type. I have also posted a link to a similar question on Stack Overflow, where you can get further help.
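As a rough sketch of the type check and of the direction of %in%: the toy table below is made up (including the second gene ID), and stringsAsFactors = TRUE is set only to imitate older R versions where strings were read as factors. The column goes on the left of %in% and the wanted IDs on the right. The reversed form from the question asks for a column named after the ID, which does not exist, so it selects no rows.

```r
# Hypothetical toy table standing in for the featureCounts output.
tab <- data.frame(
  Geneid = c("ENSG00000223972.4", "ENSG00000227232.4"),
  D4001  = c(8, 15),
  stringsAsFactors = TRUE  # imitate older R defaults (strings become factors)
)

print(class(tab$Geneid))                 # check the column type first
tab$Geneid <- as.character(tab$Geneid)   # compare plain strings from here on

# Column on the left, wanted IDs on the right.
wanted <- c("ENSG00000223972.4")
hit <- tab[tab$Geneid %in% wanted, ]

# Reversed form: tab$"ENSG00000223972.4" is NULL (no such column),
# so the logical index is empty and no rows are returned.
miss <- tab[tab$"ENSG00000223972.4" %in% "Geneid", ]

print(hit)
print(nrow(miss))
```

Because wanted is a vector, the same line works unchanged for several IDs at once, which is the main reason to prefer %in% over == here.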


So it is not a CSV file. I am not sure what you are trying to do, but if you want to subset rows based on the Geneid value, try this example:

# example data (in your case you would be reading the file.)
df1 <- read.table(text = "
Geneid                                   Chr    Start      end       Strand   Length    D4001   D4002    D4003   D4004    D4005 D4006
ENSG00000223972.1           chr1  11869    12227      +            35            8             22           33         44           55     66
ENSG00000223972.2           chr1  11869    12227      +            35            8             22           33         44           55     66
ENSG00000223972.3           chr1  11869    12227      +            35            8             22           33         44           55     66
ENSG00000223972.4           chr1  11869    12227      +            35            8             22           33         44           55     66
           ", header = TRUE)

df1[ df1$Geneid == "ENSG00000223972.4", ]
#              Geneid  Chr Start   end Strand Length D4001 D4002 D4003 D4004 D4005 D4006
# 4 ENSG00000223972.4 chr1 11869 12227      +     35     8    22    33    44    55    66
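
For reading from a file rather than from inline text, something like the following may work. It assumes the file is tab-delimited and starts with a comment line beginning with "#" (which would explain why the original script dropped the first row before taking the header). The stand-in file contents, the second gene ID, and the comment line are made up for illustration.

```r
# Hypothetical stand-in file; the real layout is an assumption.
path <- tempfile(fileext = ".txt")
writeLines(c(
  "# Program:featureCounts (hypothetical comment line)",
  "Geneid\tChr\tStart\tEnd\tStrand\tLength\tD4001\tD4002",
  "ENSG00000223972.4\tchr1\t11869\t12227\t+\t35\t8\t22",
  "ENSG00000227232.4\tchr1\t14363\t29806\t-\t40\t15\t3"
), path)

# header = TRUE keeps the column names; lines starting with "#" are skipped.
counts <- read.table(path, header = TRUE, sep = "\t",
                     comment.char = "#", stringsAsFactors = FALSE)

# Subset by Geneid, same idea as in the example above.
row <- counts[counts$Geneid == "ENSG00000223972.4", ]
print(row)
```

This replaces the manual steps of removing rows and assigning colnames by hand, and it keeps the count columns numeric instead of turning everything into text.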