Join, Subtract and Group: Compare using dataset lists
8.3 years ago
khaynes ▴ 50

Hello community. I am working on my first Galaxy workflow. I have a question about the "Join, Subtract and Group: Compare" tool.

My goal is to generate "fetch closest" output: for each transcription start site (TSS), the overlapping or the closest feature. The features are elements of interest from the ENCODE ChromHMM chromatin state class data; the TSSs come from my own list of 23,245 genes, each represented as a 1 bp interval at its TSS.

Here is a summary of my workflow:

  • Input 1: Single interval file - TSS sites
  • Input 2: Dataset collection of interval files - 15 files total: chromatin states and their coordinates (i.e., 1_Active_Promoter, 2_Weak_Promoter, 3_Poised_Promoter, ...15_Repetitive/CNV)
  • Join TSS intervals (1 bp long) with chromatin state intervals collection (overlapping by 1 bp or more) -- output is 15 files
  • Fetch closest non-overlapping chromatin state to TSS -- output is 15 files
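To make the two operations concrete, the Join and Fetch steps above can be sketched in plain Python for a single TSS and a single chromatin-state file. This is only an illustration of the logic, not Galaxy's implementation: the function names are hypothetical, the coordinates are made up, and intervals are assumed to be 0-based and half-open.

```python
from typing import List, Optional, Tuple

# (chrom, start, end); assumed 0-based, half-open coordinates
Interval = Tuple[str, int, int]


def overlaps(a: Interval, b: Interval) -> bool:
    """True if a and b share at least 1 bp on the same chromosome."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def gap(a: Interval, b: Interval) -> int:
    """Distance in bp between two non-overlapping intervals on one chromosome."""
    if a[2] <= b[1]:          # a lies entirely upstream of b
        return b[1] - a[2]
    return a[1] - b[2]        # a lies entirely downstream of b


def join_overlapping(tss: Interval, features: List[Interval]) -> List[Interval]:
    """Hypothetical stand-in for the Join step: features overlapping the TSS."""
    return [f for f in features if overlaps(tss, f)]


def closest_non_overlapping(tss: Interval,
                            features: List[Interval]) -> Optional[Interval]:
    """Hypothetical stand-in for the Fetch step: nearest feature that does not overlap."""
    candidates = [f for f in features if f[0] == tss[0] and not overlaps(tss, f)]
    if not candidates:
        return None
    return min(candidates, key=lambda f: gap(tss, f))


# Made-up example: one 1 bp TSS and a few chromatin-state intervals
tss = ("chr1", 100, 101)
features = [("chr1", 90, 105), ("chr1", 120, 130), ("chr1", 50, 60), ("chr2", 100, 101)]
overlapping = join_overlapping(tss, features)
nearest = closest_non_overlapping(tss, features)
```

In this example the TSS has both an overlapping feature and a different nearest non-overlapping feature, which is exactly the situation described in the next step.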

Goal for the next step: TSSs that have an overlapping feature (Join output) also appear in the Fetch output, but paired with the nearest non-overlapping feature instead of the overlapping one. I want to replace that subset of genes in the Fetch output with the corresponding rows from the Join output.

I think the best next step is to run "Join, Subtract and Group: Compare" on column 1 (gene names) to retrieve the rows of the Fetch output that have no match in the Join output. I would then Concatenate this "cleaned" Fetch output with the Join output.
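The Compare-then-Concatenate idea amounts to an anti-join followed by a union. A minimal Python sketch for one pair of files, assuming each file has been read into a list of rows with the gene name in the first column (the function name and example rows are hypothetical, not Galaxy tool behavior):

```python
from typing import List

Row = List[str]


def replace_with_overlaps(fetch_rows: List[Row],
                          join_rows: List[Row],
                          key_col: int = 0) -> List[Row]:
    """Keep every Join row, plus the Fetch rows whose key has no match in Join."""
    joined_keys = {row[key_col] for row in join_rows}
    # "Compare" step: Fetch rows that do NOT match any Join row on the key column
    cleaned_fetch = [row for row in fetch_rows if row[key_col] not in joined_keys]
    # "Concatenate" step: Join rows first, then the cleaned Fetch rows
    return join_rows + cleaned_fetch


# Made-up rows: geneA has an overlapping feature, geneB does not
fetch_rows = [["geneA", "chr1", "120", "130"], ["geneB", "chr1", "500", "600"]]
join_rows = [["geneA", "chr1", "90", "105"]]
merged = replace_with_overlaps(fetch_rows, join_rows)
```

Here `merged` contains the Join row for geneA and the Fetch row for geneB, so each gene ends up with one kind of hit. If a gene overlaps several features in the same file, all of its Join rows are kept.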

Will Compare work on inputs that are dataset collections (multiple files each)? Can the tool pair up the corresponding files automatically, or should I pair up the output files manually for the Compare and Concatenate steps?

Thank you.

Compare workflow dataset list • 1.9k views

You'll want to post this on the Galaxy site.
