7.6 years ago
Shicheng Guo
Hi All,
I have a BED file containing hundreds of human genomic regions. I want to get some basic characteristics of these regions, such as GC content. Is there a Perl script that can do this without downloading the FASTA files for these regions?
I know that if you download the FASTA files for these regions, you can use the following script to calculate GC content:
http://alrlab.research.pdx.edu/aquificales/scripts/get_gc_content.pl
or like this:
mysql --user=genome --host=genome-mysql.cse.ucsc.edu -A -e "select chrom, size from hg19.chromInfo" > hg19.chrom.sizes
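For reference, the GC calculation itself is simple once the sequence for each region is in hand. The following is only a minimal sketch in Python, not the linked Perl script: the function names, the toy genome, and the toy BED lines are all hypothetical, and it assumes BED coordinates are 0-based and half-open and that N and other ambiguous bases should be ignored.

```python
from collections import Counter


def gc_content(seq):
    """Return the GC fraction of seq, ignoring N and other non-ACGT symbols."""
    counts = Counter(seq.upper())
    gc = counts["G"] + counts["C"]
    acgt = gc + counts["A"] + counts["T"]
    return gc / acgt if acgt else 0.0


def parse_bed(lines):
    """Yield (chrom, start, end) from BED lines (0-based, half-open)."""
    for line in lines:
        line = line.strip()
        # Skip blank lines and the optional BED header lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split()[:3]
        yield chrom, int(start), int(end)


def region_gc(bed_lines, genome):
    """Return (chrom, start, end, gc) per region; genome is {chrom: sequence}."""
    return [
        (chrom, start, end, gc_content(genome[chrom][start:end]))
        for chrom, start, end in parse_bed(bed_lines)
    ]


# Toy data standing in for a real reference sequence and BED file.
toy_genome = {"chr1": "ACGTGGCCNNAT"}
toy_bed = ["chr1\t0\t4", "chr1\t4\t10"]

for row in region_gc(toy_bed, toy_genome):
    print(*row, sep="\t")
```

With real data, `genome` would have to be filled from whatever sequence source is available, which is exactly the part the question is trying to avoid downloading.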
Sincerely,
The post in the current version seems to be missing a valid question or is incomplete. Can you take a look and amend as needed?
I am trying to understand your question. Is it as follows:
How can I find the nucleotide composition (GC content and such) of genomic regions from a BED file using online tools, without downloading the reference FASTA file to my server/computer, and preferably using the UCSC server for the computation?
I believe you might find public versions of Galaxy to be the best way to handle such projects. You can get data into Galaxy (like BED files for particular regions) directly from UCSC, with no need to download anything to your server or computer.
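If a scripted route is preferred over Galaxy, one possible alternative is to request only the bases for each region over HTTP instead of downloading the whole reference. The sketch below assumes the UCSC REST API `getData/sequence` endpoint, which to my understanding takes `genome`, `chrom`, `start` and `end` (BED-style coordinates) and returns JSON with a `dna` field; please check this against the current UCSC documentation before relying on it. The function names are hypothetical, and nothing is fetched unless `fetch_region_gc` is called.

```python
import json
from urllib.request import urlopen

# Assumed endpoint of the UCSC REST API; verify against the UCSC documentation.
API = "https://api.genome.ucsc.edu/getData/sequence"


def sequence_url(genome, chrom, start, end):
    """Build the request URL for one region (BED-style start and end)."""
    return f"{API}?genome={genome};chrom={chrom};start={start};end={end}"


def gc_from_response(payload):
    """Compute the GC fraction from a JSON response that carries a 'dna' field."""
    dna = json.loads(payload)["dna"].upper()
    gc = dna.count("G") + dna.count("C")
    acgt = gc + dna.count("A") + dna.count("T")
    return gc / acgt if acgt else 0.0


def fetch_region_gc(genome, chrom, start, end):
    """Fetch one region from the server and return its GC fraction."""
    with urlopen(sequence_url(genome, chrom, start, end), timeout=30) as resp:
        return gc_from_response(resp.read().decode("utf-8"))
```

For a few hundred regions, calling `fetch_region_gc("hg19", chrom, start, end)` once per BED line with a short pause between requests should be polite to the server; for much larger files a local reference is probably still the better choice.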