18 months ago
Sd
Hello, I have 3000 WGS CRAM files and I want to split them into 1 Mbp chunks. I want to split at exact genomic coordinates, e.g. 1 to 1,000,000 bp, 1,000,001 to 2,000,000 bp, 2,000,001 to 3,000,000 bp, and so on, for all chromosomes, so that each chunk covers the same region in every sample. Is there any way I can do this?
In addition to the good answers provided, it might be helpful to look at tabix. Basically, for a one-time cost, tabix will index a file, making subsequent searches for a range of data ultra fast.
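Whichever indexed tool ends up doing the extraction, the fixed windows themselves are easy to generate once and reuse for every sample. Here is a minimal Python sketch of that step. The function names (`fixed_windows`, `region_string`) and the toy chromosome lengths are made up for illustration; real lengths would come from the reference `.fai` file or the CRAM header.

```python
from typing import Dict, Iterator, Tuple


def fixed_windows(
    chrom_lengths: Dict[str, int], size: int = 1_000_000
) -> Iterator[Tuple[str, int, int]]:
    """Yield (chrom, start, end) windows with 1-based inclusive coordinates.

    Windows start at 1, size + 1, 2 * size + 1, ... and the last window on
    each chromosome is truncated at the chromosome length.
    """
    for chrom, length in chrom_lengths.items():
        for start in range(1, length + 1, size):
            end = min(start + size - 1, length)
            yield chrom, start, end


def region_string(chrom: str, start: int, end: int) -> str:
    """Format a window as the usual 'chrom:start-end' region string."""
    return f"{chrom}:{start}-{end}"


if __name__ == "__main__":
    # Hypothetical chromosome lengths, for illustration only.
    toy_lengths = {"chrA": 2_500_000, "chrB": 1_000_000}
    for window in fixed_windows(toy_lengths):
        print(region_string(*window))
```

Because the windows depend only on the chromosome lengths, every sample aligned to the same reference gets identical chunk boundaries. Each region string could then be passed to a region query on an indexed file, for example `samtools view` on a CRAM with a `.crai` index. Note that a read overlapping a window boundary would normally be returned by both neighbouring queries.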