NIH Roadmap Epigenomics data BAM files
8.5 years ago
Saad Khan ▴ 440

Hi, I was wondering if anyone has access to the BAM files for the ChIP-seq data from the NIH Roadmap Epigenomics project (http://egg2.wustl.edu/roadmap/web_portal/processed_data.html#ChipSeq_DNaseSeq).

Only BED or tagAlign files are available there. If anyone has converted those tagAlign files to BAM and can give me access to them, kindly let me know.

NIH roadmap-epigenomics • 3.0k views
8.5 years ago
Denise CS ★ 5.2k

It seems the BAMs are available from the NIH Roadmap Epigenomics Project Data Listings page. The original paper also points to the sequencing data available from the European Nucleotide Archive (ENA) under study accession PRJEB4795.


BAM files are only available for some tissues for the ChIP-seq data. For the others, only SRA and BED files are available. I don't want to start from scratch with the SRA files and run the whole pipeline again, so I was wondering whether someone has already done it or has successfully converted the tagAlign or BED files to BAM.

The tagAlign files don't carry much information about mapping quality etc., but the BED files seem to have some numbers along with the read ID. I don't know what those numbers are, but if anybody does, do let me know. The BED files for the unconsolidated epigenomes are available and look something like this:

    chr1 10084 10283 62BU8AAXX110111:4:104:1337:6620 0 -
    chr1 12881 13080 62BU8AAXX110111:4:72:15560:1099 0 -
    chr1 16276 16475 62BU8AAXX110111:4:23:16833:6138 0 -
    chr1 48005 48204 62BU8AAXX110111:4:108:5179:18053 0 -

I am trying to do some comparisons with other data using csaw. Unfortunately, csaw only takes BAM files as input.

With bedToBam I need to specify a mapping quality as well, so I am confused about what to do. Should I just use the consolidated tagAlign files, which look something like the lines below, and give each read a mapping quality > 10 with bedToBam?

    chr1 10153 10189 N 1000 -
    chr1 10154 10190 N 1000 +
    chr1 10156 10192 N 1000 -
    chr1 10156 10192 N 1000 -

Can someone let me know, please?
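One way to see what the BED and tagAlign files actually contain before converting them is to tabulate the columns. The sketch below assumes the six-column layout shown in the excerpts (chromosome, start, end, name, score, strand) and that the file has been decompressed to plain text. `summarize_reads` is a hypothetical helper name, not part of bedtools or any other package. Column 5 is the generic BED score field; whether it reflects mapping quality in these particular files is not established in this thread.

```shell
# Hypothetical helper: summarise a 6-column BED/tagAlign file.
# Assumed columns: chrom, start, end, name, score, strand.
summarize_reads() {
    awk '
        NF >= 6 {
            n++
            len[$3 - $2]++      # read length = end - start
            score[$5]++         # column 5: BED score field
            strand[$6]++        # column 6: strand (+ or -)
        }
        END {
            printf "reads\t%d\n", n
            for (l in len)    printf "length\t%s\t%d\n", l, len[l]
            for (s in score)  printf "score\t%s\t%d\n", s, score[s]
            for (d in strand) printf "strand\t%s\t%d\n", d, strand[d]
        }
    ' "$1" | sort   # sort only to make the line order stable
}
```

Running `summarize_reads some_file.tagAlign` (the file name is a placeholder) prints one tab-separated line per distinct read length, score value, and strand, with its count. If every read has the same score, as in the tagAlign excerpt above, that column cannot be used to filter reads by quality.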


I converted the BED files to BAM files. You don't have to supply a mapping quality when converting BED files to BAM with bedtobam from the bedtools package; the `-mapq` option is optional. Here is how you might convert a sorted BED file for an hg19 dataset, where the `-g` argument is the genome (chromosome sizes) file for hg19:

bedtools bedtobam -i $sortedBedFile -g hg19 > $bamFile

Also, if you are interested in the quality of the dataset, use phantompeakqualtools, which has been used extensively in the ENCODE project.
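The bedtobam conversion above can be wrapped into a small script that also handles the sorting step. This is a minimal sketch, not the exact procedure used in this thread: `sort_bed` and `bed_to_bam` are hypothetical helper names, all file names are placeholders, and the input is assumed to be decompressed, tab-separated text. It relies on bedtools reading from standard input via `-i stdin` and on the `-mapq` option, which defaults to 255 when omitted.

```shell
# Hypothetical helpers; file names passed in are placeholders.

# Sort a BED/tagAlign file by chromosome, then numerically by start.
sort_bed() {
    LC_ALL=C sort -k1,1 -k2,2n "$1"
}

# Convert BED/tagAlign to BAM.
#   $1 = input BED or tagAlign file (plain text)
#   $2 = genome file: one "chrom<TAB>size" line per chromosome
#   $3 = output BAM path
#   $4 = MAPQ to assign to every read (optional, 255 if omitted)
bed_to_bam() {
    bed=$1; genome=$2; bam=$3; mapq=${4:-255}
    if ! command -v bedtools >/dev/null 2>&1; then
        echo "bedtools not found on PATH" >&2
        return 127
    fi
    sort_bed "$bed" | bedtools bedtobam -i stdin -g "$genome" -mapq "$mapq" > "$bam"
}
```

For example, `bed_to_bam reads.bed hg19.chrom.sizes reads.bam 30` would assign MAPQ 30 to every read. Because the value is the same for all reads, it only serves to get the file past a quality filter and carries no real alignment information. As far as I know, csaw expects coordinate-sorted and indexed BAM files, so a `samtools sort` and `samtools index` on the result would probably still be needed.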


Hi, I was hoping you could help. Did you solve this by going ahead and converting the tagAlign files to BAM, or did you eventually find the original BAMs, or do something else entirely?
