Hi, I recently downloaded data from FANTOM5 Phase 2.0 (http://biomart.gsc.riken.jp/) with the goal of identifying transcription start sites (TSSs), but I cannot find any documentation for it. For example, I downloaded FANTOM5 Phase 2. I'm a bit confused.
For example, the FANTOM site has CAGE data for many human samples from different cell lines and tissues, but this file seems to be combined from phase 1 and phase 2. Does this mean that all the data were somehow averaged and these are the peaks of the averages? Also, why does each row seem to be based on a different transcript?
Thanks in advance.
Great, thanks, that is super helpful.
Here are a couple of follow-up questions. 1. Using the example above, p1@LINC200277 is about 29 bp wide. Does that mean the TSS for this gene can be at any of the 29 positions in this range?
1) That means we found evidence of transcription start sites at all of those positions. 2) To make it easier for you, we took the peak (the position with the most TSS signal) and used that to calculate distances.
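To make the "peak within a cluster" idea concrete, here is a minimal Python sketch. It is only an illustration, not the FANTOM5 pipeline: the function names, the coordinates, and the per-base tag counts are all made up, and breaking ties in favour of the leftmost position is an assumption.

```python
def dominant_tss(start, counts):
    """Return the position with the highest tag count in a cluster.

    start  -- coordinate of the first base in the cluster (hypothetical)
    counts -- per-base tag counts, one entry per position in the cluster
    Ties go to the leftmost position; that tie rule is an assumption.
    """
    if not counts:
        raise ValueError("cluster has no positions")
    # max() returns the first index that reaches the maximum count.
    best_offset = max(range(len(counts)), key=lambda i: counts[i])
    return start + best_offset


def distance_to_peak(peak, annotated_tss, strand="+"):
    """Signed distance from an annotated TSS to the peak, in transcript orientation."""
    delta = peak - annotated_tss
    return delta if strand == "+" else -delta


# Made-up 29 bp cluster: signal at many positions, one clear maximum.
counts = [0, 1, 3, 2, 5, 8, 40, 12, 6, 3] + [1] * 19
peak = dominant_tss(1000, counts)
print(peak)                          # representative position for the cluster
print(distance_to_peak(peak, 1010))  # distance relative to a hypothetical annotated TSS
```

So a cluster can show start-site signal across its whole width, while a single representative position is still used whenever one coordinate is needed for distance calculations.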