Splitting chromosomes in bins of 100kb
10.3 years ago
ChIP ▴ 600

Hi!

I have a genome table file for hg19 that lists the length of each chromosome. How can I split each chromosome into bins of 100 kb?

The file looks like this:

chr1    0    249250621
chr2    0    243199373
chr3    0    198022430
chr4    0    191154276
chr5    0    180915260
chr6    0    171115067
chr7    0    159138663
chr8    0    146364022
chr9    0    141213431
chr10    0    135534747
chr11    0    135006516
chr12    0    133851895
chr13    0    115169878
chr14    0    107349540
chr15    0    102531392
chr16    0    90354753
chr17    0    81195210
chr18    0    78077248
chr19    0    59128983
chr20    0    63025520
chr21    0    48129895
chr22    0    51304566
chrX    0    155270560
chrY    0    59373566
chrM    0    16571

Is there a one-liner in awk or Perl for this?

Thank you

ChIP-Seq next-gen • 3.6k views
10.3 years ago
slw287r ▴ 140

Try bedtools makewindows.
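For reference, a minimal sketch of what that invocation might look like, assuming bedtools is installed and the extents are in a tab-separated three-column BED file. The file names and the two demo intervals here are made up for illustration and are not real hg19 lengths. `-b` takes the BED file of intervals to split and `-w` the window size in bp.

```shell
# Hypothetical demo extents (not real hg19 lengths), tab-separated BED
printf 'chrA\t0\t250000\nchrB\t0\t16571\n' > demo_extents.bed

# -b: BED file of intervals to split; -w: window size in bp
if command -v bedtools >/dev/null 2>&1; then
    bedtools makewindows -b demo_extents.bed -w 100000 > demo_windows.bed
else
    echo "bedtools not found on PATH" >&2
fi
```

For the table in the question, the same call would be pointed at the real extents file in place of demo_extents.bed.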
10.3 years ago
$ awk '
    BEGIN {
        binSize = 100000;
        binIdx = 0;
    }
    {
        chr = $1;
        start = $2;
        stop = $3;
        # Walk the interval in binSize steps and clip the final bin at stop,
        # so the partial tail and short chromosomes such as chrM are kept.
        for (binStart = start; binStart < stop; binStart += binSize) {
            binEnd = binStart + binSize;
            if (binEnd > stop) binEnd = stop;
            print chr"\t"binStart"\t"binEnd"\tbin-"binIdx;
            binIdx++;
        }
    }' chrExtents.bed \
    > myBins.bed
10.3 years ago
Vivek ★ 2.7k
awk '{for (i = $2; i < $3; i += 100000) { end = i + 100000; if (end > $3) end = $3; print $1"\t"i"\t"end }}' file.txt
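
Whichever approach is used, it may be worth checking that the bins tile each chromosome completely. The following is only a sketch under assumptions: the input is 0-based, half-open BED as in the question, and the file names and the two demo intervals are hypothetical (not real hg19 lengths). It bins the demo file, keeps the final partial bin, then sums bin lengths per chromosome so the totals can be compared with the original extents.

```shell
# Hypothetical demo extents (not real hg19 lengths), tab-separated BED
printf 'chrA\t0\t250000\nchrB\t0\t16571\n' > demo_extents.bed

# Split each interval into 100 kb bins, clipping the last bin at the end
awk -v binSize=100000 'BEGIN { OFS = "\t" }
{
    for (s = $2; s < $3; s += binSize) {
        e = s + binSize
        if (e > $3) e = $3
        print $1, s, e
    }
}' demo_extents.bed > demo_bins.bed

# Sum bin lengths per chromosome; each total should equal end - start
awk 'BEGIN { OFS = "\t" }
{ covered[$1] += $3 - $2 }
END { for (c in covered) print c, covered[c] }' demo_bins.bed | sort > demo_covered.txt
```

If a chromosome's total in demo_covered.txt is smaller than its length in the extents file, some bases (typically the last partial bin) were dropped.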