Extract groups of SNPs and store them in separate files
7.4 years ago
Ana ▴ 200

Hi all,

I have a genotype file containing 500K SNPs (i.e. 500,000 lines). I want to split it into consecutive groups of 5,000 SNPs and store each group in a separate file, keeping the whole line for every SNP. That is, SNPs 1-5000 go in file 1, SNPs 5001-10000 in file 2, SNPs 10001-15000 in file 3, and so on, so in the end I should have 100 files. I know I can create a single group with the sed command in Unix, like

sed -n 1,5000p geno.file > out.file

but how can I do that sequentially to create all groups?

These are the first few lines of my genotype file:

HanXRQChr01 145635054   N   N   N   T   N   N   N   N   N   N   N   
HanXRQChr01 145798243   N   N   N   C   N   N   N   N   N   N   N   
HanXRQChr01 145823740   N   N   N   N   N   N   N   N   N   N   N   
HanXRQChr01 145933533   N   N   N   N   N   N   N   N   N   N   N   
HanXRQChr01 146276377   N   N   N   N   N   N   N   N   N   N   N
HanXRQChr01 146433063   N   N   N   G   N   G   N   N   N   N   N

Thanks in advance for any suggestion or help

7.4 years ago

Use split with the -l option: https://linux.die.net/man/1/split

SPLIT(1)                  BSD General Commands Manual                 SPLIT(1)

NAME
     split -- split a file into pieces

SYNOPSIS
     split [-a suffix_length] [-b byte_count[k|m]] [-l line_count]
           [-p pattern] [file [name]]

DESCRIPTION
     The split utility reads the given file and breaks it up into files of
     1000 lines each.  If file is a single dash (`-') or absent, split reads
     from the standard input.

     The options are as follows:

     -l line_count
             Create smaller files n lines in length.
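
As a rough illustration of the -l option described above, here is a minimal sketch on a toy file. The 12-line input, the chunk size of 5, and the output prefix "group_" are assumptions made for the demo; for the real geno.file the chunk size would be 5000, which for 500,000 lines gives 100 pieces.

```shell
# Work in a scratch directory so the demo does not touch real data.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Toy stand-in for geno.file: 12 lines instead of 500,000 (assumption).
i=1
while [ "$i" -le 12 ]; do
    printf 'HanXRQChr01\t%d\tN\tN\n' "$i"
    i=$((i + 1))
done > geno.file

# Split into pieces of 5 whole lines each (use -l 5000 for the real file).
# "group_" is a hypothetical prefix; split appends aa, ab, ac, ...
split -l 5 geno.file group_

# Show how many lines ended up in each piece; the last one holds the remainder.
wc -l group_*
```

The pieces sort in the same order as the original lines, so group_aa holds the first chunk, group_ab the second, and so on.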

Thanks, Pierre, for the hint. I had forgotten about "split". It was actually quite easy; I simply used $ split -l 5000 geno.file
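
If numbered output files (file 1, file 2, ...) are preferred over split's default letter suffixes, one possible alternative is awk. This is only a sketch, not what was used in this thread: the toy 12-line input, the chunk size n=5, and the "group_NNN.txt" naming are all assumptions for the demo, and n would be 5000 for the real file.

```shell
# Work in a scratch directory so the demo does not touch real data.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Toy stand-in for geno.file: 12 lines instead of 500,000 (assumption).
awk 'BEGIN { for (i = 1; i <= 12; i++) printf "HanXRQChr01\t%d\tN\tN\n", i }' > geno.file

# Start a new numbered output file every n lines and copy each line unchanged.
awk -v n=5 '
    (NR - 1) % n == 0 {
        if (out != "") close(out)             # avoid keeping many files open
        out = sprintf("group_%03d.txt", ++k)  # hypothetical naming scheme
    }
    { print > out }
' geno.file

ls group_*.txt
```

Closing each finished file before opening the next keeps the number of open files at one, which matters when the input is split into many pieces.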
