3 months ago
DareDevil
I downloaded ChIP-seq data from GEO (GSE25769). How can I use this data to view the peaks? The original analysis was performed with the old reference genome hg18. What do the values in columns 2, 5, and 7 represent?
Below is the output of running zcat GSM632892_HCT405_realign.txt.gz | head:
#RUN_TIME Wed Oct 8 13:51:55 2008
#SOFTWARE_VERSION @(#) $Id: qualityFilter.pl,v 1.8 2007/11/26 14:42:26 tc Exp $
#FILTER_CRITERION ((CHASTITY>=0.6))
GGAATGGAATGGAATGGAATGGAACAACCCGAATGG 15906 1 ref_chr4:49347516 F GGAATGGAATGGAATGGAATGGATCAACCCGAGTGG 15906
GAACTTGATTTAAAATAATGTTGTATGTAGTATTTA 18000 1 ref_chr4:133533796 F GAACTTGATTTAAAATAATGTTGTATGTAGTATTTA 14859
GGACTAAGAATTGGGAGTACCCAGGACATCCAATTA 18000 8
GTCTTAGGCACAGTAATCAAGGAACCTAAGACCGAG 18000 1 ref_chr1:84351102 F GTCTTAGGCACAGTAATCAAGGAACCTAAGACCGAG 14859
GCAAAGACAAAAATCTTTCTAAGATTGGCCAAAATG 18000 1 ref_chr4:23417003 F GCAAAGACAAAAATCTTTCTAAGATTGGCCAAAATG 14859
GAAGTGCAGTGGTGGGATCTTGGCTCACTGCAAACT 18000 9
GGAAGGAGAGAAGAGATTGTAATAGAAATTAACAAT 18000 1 ref_chr17:64738595 R ATTGTTAATTTCTATTACAATCTCTTCTCTCCTTCC 14859
Also posted on SE Bioinformatics.
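For viewing rows like the ones shown above in a genome browser, one option is to convert the aligned reads to BED. The following is a minimal sketch and not a confirmed description of the file format. It assumes that column 3 is the number of reported hits (so only rows with a value of 1 carry a position), that column 4 is a 1-based leftmost coordinate prefixed with ref_, and that column 5 gives the strand as F or R. None of these assumptions is confirmed in this thread. The function name realign_to_bed is hypothetical.

```python
def realign_to_bed(lines):
    """Yield BED6-style tuples for rows that carry a single alignment.

    Assumptions (not confirmed by the source thread):
      - fields are whitespace-separated
      - column 3 == "1" marks a read with exactly one reported hit
      - column 4 looks like "ref_chrN:position", position being 1-based
      - column 5 is "F" (forward) or "R" (reverse)
    """
    for line in lines:
        # Skip header/comment lines and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        fields = line.split()
        # Rows without a position (e.g. "... 18000 8") have fewer fields.
        if len(fields) < 5 or fields[2] != "1":
            continue
        seq, target, strand_flag = fields[0], fields[3], fields[4]
        chrom, pos = target.rsplit(":", 1)
        if chrom.startswith("ref_"):
            chrom = chrom[len("ref_"):]
        start = int(pos) - 1          # BED is 0-based, half-open
        end = start + len(seq)        # read length taken from the sequence
        strand = "+" if strand_flag == "F" else "-"
        # The read sequence is used as the name; the BED score is set to 0.
        yield (chrom, start, end, seq, 0, strand)


# Example usage (file name taken from the question):
# import gzip
# with gzip.open("GSM632892_HCT405_realign.txt.gz", "rt") as fh:
#     for row in realign_to_bed(fh):
#         print(*row, sep="\t")
```

The resulting coordinates would still be on hg18, as stated in the question, so any browser session or peak caller would need to use that assembly.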
Honestly, download the FASTQ files and process them yourself. There is nothing to gain from using legacy genomes and formats that are no longer standard.
The raw FASTQ files for this dataset are not available.
Then intrinsically the entire analysis and conclusions are not reproducible. I personally would never touch such a dataset.
This is quite dismissive, and non-scientific.
That said, from years of experience with ChIP-seq I can confidently say that it is a noisy assay that requires appropriate controls and replication. Without raw data, and with only these tables at hand and no clear record of how they were created, I think you are just building on uncertainty. Not worth it, in my opinion. I would rather check whether other datasets are available.