Hello everyone,
I downloaded bladder cancer data from TCGA. I extracted the sample IDs with this code:
head(Blca_res$id)
output: 'a8c61671-89cb-43bc-8c88-5c107954d11c,''b03b7b9b-00ef-4e0d-bac2-0b1059d57a87,''bf98764d-1604-4a14-8e06-1c785a085db9,''c0bc697a-ac64-4605-9abc-f0fe85eb481a,''bd52f6c8-6f8b-4056-8a3e-8cdc96644952,''ab504dbf-e1f0-46d2-83f9-0f4066055c71'
I used this to get the same from the clinical data:
head(tcgaBlca_data@colData$sample_id)
output: 'f9bd70b2-6cde-48e5-9f0d-55d86ccfeba8,''3cae49a3-6deb-40f9-84cc-68b9b53543ff,''015e6b08-ab3c-4d1d-99e4-77b5e10bd7fc,''f09e1eeb-bcd5-4dba-92f0-7d4b34b81ce7,''0ac8e522-3c64-42f2-a66f-bd40530a328a,''3c71158d-98ff-4ef5-923f-ba31a25036ec'
There are more than 60,000 rows with these sample IDs. What I want to find out is whether each sample ID in Blca_res$id is also present in tcgaBlca_data@colData$sample_id. E.g., is 'a8c61671-89cb-43bc-8c88-5c107954d11c' from Blca_res$id also in tcgaBlca_data@colData$sample_id?
Any suggestions on how I can implement this in R?
Regards,
Is the format of your head output correct? Do the sample IDs actually have commas in the string?
No. There are no commas, but a dot like this: .
But I have sorted it using a more readable column in the data.
Thanks
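For reference, the membership check asked about above can be sketched in base R with %in%, intersect, setdiff and match. This is only a minimal sketch: res_ids and clin_ids are hypothetical toy vectors standing in for Blca_res$id and tcgaBlca_data@colData$sample_id, and it assumes both columns hold plain UUID strings (as.character and trimws guard against factors and stray whitespace).

```r
# Toy stand-ins (assumption): replace with Blca_res$id and
# tcgaBlca_data@colData$sample_id in the real analysis.
res_ids <- c("a8c61671-89cb-43bc-8c88-5c107954d11c",
             "b03b7b9b-00ef-4e0d-bac2-0b1059d57a87",
             "bf98764d-1604-4a14-8e06-1c785a085db9")
clin_ids <- c("f9bd70b2-6cde-48e5-9f0d-55d86ccfeba8",
              "a8c61671-89cb-43bc-8c88-5c107954d11c",
              "3cae49a3-6deb-40f9-84cc-68b9b53543ff")

# Normalise: factors become character, surrounding whitespace is dropped
res_ids  <- trimws(as.character(res_ids))
clin_ids <- trimws(as.character(clin_ids))

# One TRUE/FALSE per element of res_ids: is it present in clin_ids?
in_clinical <- res_ids %in% clin_ids

# Count how many IDs matched and how many did not
print(table(in_clinical))

# IDs found in both vectors, and IDs found only in res_ids
shared      <- intersect(res_ids, clin_ids)
only_in_res <- setdiff(res_ids, clin_ids)

# Check one specific ID
single_hit <- "a8c61671-89cb-43bc-8c88-5c107954d11c" %in% clin_ids

# Position of each res_ids entry within clin_ids (NA when absent)
match_pos <- match(res_ids, clin_ids)
```

If every ID is expected to match, all(in_clinical) gives a single TRUE/FALSE answer, and match_pos can be used to reorder the clinical rows to line up with Blca_res.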