TCGAbiolinks: Accessing Matched Samples' Data Across -Omics Hierarchies
11 weeks ago
Desmond • 0

Hi All,

I'm an MSc student trying to gain experience in multi-omics data analysis. To this end, I've been working with the TCGAbiolinks and mixOmics R packages, and I would appreciate some clarification on how to proceed.

Basically, I start by downloading RNA-seq and miRNA sample metadata for TCGA-BRCA via TCGAbiolinks::GDCquery() and TCGAbiolinks::getResults(). I then deduplicate each output on its 'cases' column and check whether any cases are shared between the RNA-seq output and the miRNA output. The deduplication works (no duplicate cases remain within either output), but the intersection between the two outputs is empty. My R script so far is provided below:

    library(TCGAbiolinks)
    library(tidyverse)
    library(stringr)

    # Keep only the first row for each value in the 'cases' column.
    deduplicate_tcga_query_outputs <- function(x) {
      x[!duplicated(x$cases), ]
    }

    project <- "TCGA-BRCA"


    query_rnaseq <- GDCquery(
      project = project,
      data.category = "Transcriptome Profiling",
      data.type = "Gene Expression Quantification",
      access = "open"
      )


    query_mirna <- GDCquery(
      project = project,
      data.category = "Transcriptome Profiling",
      data.type = "miRNA Expression Quantification",
      access = "open"
      )

    output_rnaseq <- getResults(query_rnaseq)
    output_mirna <- getResults(query_mirna)

    output_rnaseq <- deduplicate_tcga_query_outputs(output_rnaseq)
    output_mirna  <- deduplicate_tcga_query_outputs(output_mirna)

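    # Sanity check: both should be 0 after deduplication.
    sum(duplicated(output_rnaseq$cases))
    sum(duplicated(output_mirna$cases))

    # Cases present in both outputs; this comes back empty for me.
    shared_cases <- intersect(output_rnaseq$cases, output_mirna$cases)

    shared_cases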

To my understanding, the mixOmics package requires multi-omics data to be measured across matching participants (when performing N-integration). Additionally, it provides a small multi-omics toy data set constructed from the TCGA-BRCA project to help users get started (so clearly, matching cases must exist within TCGA-BRCA across -omics hierarchies).
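
In case it helps frame the question: my understanding is that the `cases` column holds full aliquot-level TCGA barcodes, so two assays run on the same participant may never share an identical string. The sketch below does not use my actual data. The barcodes are made up for illustration, and `participant_id()` and `keep_first()` are hypothetical helpers that assume the first 12 characters of a barcode identify the participant. It only shows the kind of comparison I think I need, using base R.

```r
# Made-up aliquot barcodes for illustration only (not real GDC query results).
rnaseq_cases <- c(
  "TCGA-XX-0001-01A-11R-0000-07",
  "TCGA-XX-0002-01A-11R-0000-07",
  "TCGA-XX-0002-11A-11R-0000-07"   # second sample from participant 0002
)
mirna_cases <- c(
  "TCGA-XX-0001-01A-11R-0001-13",
  "TCGA-XX-0003-01A-11R-0001-13"
)

# Hypothetical helper: assumes characters 1-12 of a barcode
# (project, tissue source site, participant) identify the participant.
participant_id <- function(barcode) substr(barcode, 1, 12)

# Comparing full barcodes finds nothing, because the trailing fields differ.
shared_full <- intersect(rnaseq_cases, mirna_cases)

# Comparing participant-level IDs recovers the overlap.
shared_participants <- intersect(
  participant_id(rnaseq_cases),
  participant_id(mirna_cases)
)

# Hypothetical helper: keep one barcode per shared participant
# (the first one encountered wins).
keep_first <- function(barcodes, ids) {
  pid <- participant_id(barcodes)
  barcodes[pid %in% ids & !duplicated(pid)]
}

matched_rnaseq <- keep_first(rnaseq_cases, shared_participants)
matched_mirna  <- keep_first(mirna_cases, shared_participants)
```

If that assumption holds, the same truncation could presumably be applied to `output_rnaseq$cases` and `output_mirna$cases` before calling `intersect()`. Note that keeping the first barcode per participant ignores whether a sample is tumor or normal, so I am not sure this is the right way to choose between multiple samples from one participant.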

If anyone has experience integrating samples across -omics hierarchies using TCGAbiolinks in R, I would greatly appreciate your input.

Thanks so much.

TCGAbiolinks mixOmics multi-omics TCGA-BRCA