9.2 years ago
zizigolu
Friends,
I have a microarray dataset with AGI IDs in rows and samples in columns, and a txt file of transcription factors (AGI IDs). My supervisor asked me to find and extract the expression of only the genes that are in my transcription factor list; in other words, I should extract the rows that are transcription factors. As always, I need your help.
Thank you
sorry Alex,
I did it as below, but the output file contains the same rows as my microarray dataset, while I need only the rows shared between the two files
What you are trying to pass to `intersect` is not what that function takes as input. Run `?intersect` to see what it takes as input. Then look again at tables `mycounts` and `S` and decide what rows or columns you need to extract from those tables to pass to `intersect`.
thank you, but `mycounts` has 22000 rows and `S` has 100 rows, so I need the 100 rows from `mycounts` that are the same as in `S`
You need to extract whichever row or column from `mycounts` contains your genes of interest. You need to do the same for `S`. Then pass those to `intersect`.
sorry, I don't know how to extract... so I should extract 22000 rows from one file and 100 rows from the other one, then pass those to `intersect`?
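A minimal sketch of what this advice could look like, assuming toy data in which the gene IDs of `mycounts` sit in a hypothetical column `id` and those of `S` in a hypothetical column `V1` (the real column names are not shown in the thread). The point is that `intersect` compares two vectors of IDs, not two whole tables:

```r
# Toy stand-ins for the real tables; all values and column names are assumptions.
mycounts <- data.frame(
  id      = c("AT1G01010", "AT1G01020", "AT1G01030", "AT1G01040"),
  sample1 = c(5.1, 7.3, 2.2, 9.8),
  sample2 = c(4.9, 7.0, 2.5, 9.1),
  stringsAsFactors = FALSE
)
S <- data.frame(
  V1 = c("AT1G01010", "AT1G01030", "AT5G99999"),
  stringsAsFactors = FALSE
)

# Pull out the ID column of each table as a plain vector,
# then ask which IDs occur in both.
shared_ids <- intersect(mycounts$id, S$V1)
print(shared_ids)
```

Note that the result is only a vector of shared IDs, not the expression rows themselves; selecting the rows is a separate step.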
Perhaps you are trying to take a subset of one table, based on values in a second table. It's not clear to me what you are trying to do. Maybe you need to use the `%in%` operator. Perhaps read the edit to my answer and go through the linked examples, as well as search online about this operator.
Alex,
I am only trying to extract 100 genes from 22000 genes. The 100 genes are in the S file and the 22000 genes are in my microarray dataset
anyway thank you
Alex gave you the correct hint. You only need to select the corresponding column IDs:
In case the columns are named `id` and `probe_id`
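A minimal sketch of that case, combining the two column names with the `%in%` operator suggested earlier. Which table holds `id` and which holds `probe_id` is an assumption here, and the data are made up:

```r
# Toy data; the assignment of `id` to mycounts and `probe_id` to S is assumed.
mycounts <- data.frame(
  id      = c("AT1G01010", "AT1G01020", "AT1G01030", "AT1G01040"),
  sample1 = c(5.1, 7.3, 2.2, 9.8),
  sample2 = c(4.9, 7.0, 2.5, 9.1),
  stringsAsFactors = FALSE
)
S <- data.frame(
  probe_id = c("AT1G01010", "AT1G01030", "AT5G99999"),
  stringsAsFactors = FALSE
)

# %in% returns one TRUE/FALSE per row of mycounts:
# TRUE where that row's ID also appears in S.
keep <- mycounts$id %in% S$probe_id

# Use the logical vector to keep only the matching rows (all columns).
tf_expr <- mycounts[keep, ]
print(tf_expr)
```

Unlike `intersect`, which returns only the shared IDs, indexing with the logical vector returns the full expression rows for those IDs.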
thank you
I did it as below
What is the NULL?
Try to leave out the `c(mycounts)`; this will mess up your data...
thank you,
I opened the output file but it has 21000 rows, while I need only the 100 genes that I stored in the `S` variable
In case of reading `S`, you state `header=F`, which means you have `S$V1` representing the first column.
Please have a look at your data (e.g. with `head(S)`) and what data type is used.
sorry, the result was the same
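A small sketch of the check suggested above, assuming the gene list is a single-column file without a header line. Inline text stands in for the real file here, so the IDs are made up:

```r
# read.table() with header = FALSE names the columns V1, V2, ...
# The text= argument stands in for the real gene-list file (an assumption).
S <- read.table(
  text = "AT1G01010\nAT1G01030\nAT5G99999",
  header = FALSE,
  stringsAsFactors = FALSE
)

print(names(S))     # column names assigned by read.table
print(head(S))      # first rows of the table
str(S)              # structure, including the type of each column
print(class(S$V1))  # data type of the ID column
```

If the IDs turn out to be stored in a different column or with an unexpected type, that is worth sorting out before filtering `mycounts`.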
`identifier`: name of the column containing IDs in your `mycounts` dataset
`S$V1`: the first column of your `S` dataset
Yes, but it told me:
Error in match(x, table, nomatch = 0L) : object 'identifier' not found
Just substitute `identifier` with the name of the ID column in your `mycounts` data frame. The error you get means that there is no column called `identifier` in your `mycounts` data frame. It would be easier to help if you posted at least the first lines of the datasets you want to intersect.
sorry, look at the below please
then what is the solution?
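A minimal sketch of the substitution described above, assuming a hypothetical ID column called `gene_id` (the real name has to be read off `names(mycounts)`). It first reproduces the error with the placeholder name, then uses the actual column name:

```r
# Toy data; `gene_id` is a hypothetical column name.
mycounts <- data.frame(
  gene_id = c("AT1G01010", "AT1G01020", "AT1G01030", "AT1G01040"),
  sample1 = c(5.1, 7.3, 2.2, 9.8),
  stringsAsFactors = FALSE
)
S <- data.frame(V1 = c("AT1G01010", "AT1G01030"), stringsAsFactors = FALSE)

# Using the placeholder literally fails, because mycounts has no column
# (and the session has no object) called `identifier`.
res <- tryCatch(
  subset(mycounts, identifier %in% S$V1),
  error = function(e) conditionMessage(e)
)
print(res)

# Look up the real column names ...
print(names(mycounts))

# ... and use the ID column's actual name in the filter.
tf_rows <- subset(mycounts, gene_id %in% S$V1)
print(tf_rows)
```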
thank you, no error anymore
but the output file still contains 22000 rows, while my purpose here is to extract only 100 rows (the same as the excluding gene list) from those 22000 rows in `mycounts`. Maybe intersection means something different and I should use another word for my question, something like removing all rows except the 100 genes in the excluding file.
How many rows are there in dataset `S`? Can you post the first lines?
totally 102 lines
The code above should extract only the 102 genes in the `mycounts` dataset. See this example:
I am not sure if this is what you are asking, but you can do the inverse filter by using the `!` operator:
Note that `subset` will only print the filtered dataset on the screen. To save it to a file, you need to assign it to a variable and save it:
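The filtering, the inverse filter and the saving step described above can be sketched together as follows. This is only an illustration under stated assumptions: the ID column name `id`, the toy values and the output file name are all hypothetical.

```r
# Toy data standing in for the real tables (names and values are assumptions).
mycounts <- data.frame(
  id      = c("AT1G01010", "AT1G01020", "AT1G01030", "AT1G01040"),
  sample1 = c(5.1, 7.3, 2.2, 9.8),
  sample2 = c(4.9, 7.0, 2.5, 9.1),
  stringsAsFactors = FALSE
)
S <- data.frame(V1 = c("AT1G01010", "AT1G01030"), stringsAsFactors = FALSE)

# Keep only the rows whose ID is in the gene list.
tf_rows <- subset(mycounts, id %in% S$V1)

# Inverse filter: keep only the rows whose ID is NOT in the gene list.
non_tf_rows <- subset(mycounts, !(id %in% S$V1))

# subset() on its own just prints; assigning the result (as above) keeps it,
# and write.table() saves it. A temporary directory is used here.
out_file <- file.path(tempdir(), "tf_expression.txt")
write.table(tf_rows, file = out_file, sep = "\t",
            quote = FALSE, row.names = FALSE)
```

With real data, `nrow(tf_rows)` is a quick way to check that the number of extracted rows is close to the number of genes in the list before opening the output file.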