Ubuntu csplit command
4.6 years ago

I have a file like the one below, containing 50,000,000 rows. I want to split it by chromosome, so that output file 1 contains chr1, output file 2 contains chr2, and so on.

V1    V2 V3 V4 V5   V6
1 chr1 10469  +  3  3 TCGC
2 chr1 10470  - 25 30 GCGA
3 chr1 10471  +  1  5 GCGG
4 chr1 10472  - 13 39 CCGC
5 chr1 10484  +  0  6 CCGG

I am using Ubuntu and the csplit command, but I could not figure it out. Could you please help me with the syntax?

Thanks Shrinka


It can be as simple as grep chr1 yourfile > chr1file, grep chr2 yourfile > chr2file, and so on. Add the header at the top if you need it.
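The suggestion above can be sketched as a small loop. This is only an illustration: the file name sample.CpG, the three chromosome names, and the column layout of the stand-in data are assumptions, not details of the real file.

```shell
#!/usr/bin/env bash
# Sketch: one grep pass per chromosome.
# The input name and the chromosome list below are hypothetical.
set -euo pipefail

infile="sample.CpG"

# Tiny stand-in for the real file (assumed whitespace-separated columns).
cat > "$infile" <<'EOF'
chr1 10469 + 3 3 TCGC
chr1 10470 - 25 30 GCGA
chr2 10471 + 1 5 GCGG
chr10 10472 - 13 39 CCGC
EOF

for chr in chr1 chr2 chr10; do
    # -w matches chr1 only as a whole word, so chr10 lines stay out of the chr1 file.
    # The > redirection writes the matching lines to a per-chromosome file.
    grep -w "$chr" "$infile" > "$infile.$chr"
done

# Show how many lines ended up in each output file.
wc -l "$infile".chr*
```

Each pass rereads the whole input, so for a 50,000,000-row file this trades speed for simplicity. grep reads line by line, so the size of the file should not matter for memory.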


Thanks for your reply. I have tried that, but it produces a 0 KB output file. Maybe it is a memory-related issue: my RAM may not be enough to handle a file this big. So I thought the csplit command could be useful in a memory-constrained situation.

Thanks Shrinka


If your example file above is correct, then the above command should work. Did you copy the file over to Unix from a Windows machine by any chance?
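One thing a Windows-to-Unix copy can introduce is CRLF line endings. Whether that is what is happening here is not known, so the following is only a sketch of how one might check for carriage returns and strip them. The file name sample_crlf.txt and its contents are made up for the example.

```shell
#!/usr/bin/env bash
# Sketch: detect and strip Windows (CRLF) line endings.
# The file name and its contents are hypothetical.
set -euo pipefail

infile="sample_crlf.txt"

# Stand-in file written with CRLF line endings.
printf 'chr1 10469 + 3 3 TCGC\r\nchr1 10470 - 25 30 GCGA\r\n' > "$infile"

# Count the lines that contain a carriage return; 0 means Unix line endings.
# grep exits non-zero when the count is 0, hence the "|| true" under set -e.
crlf_lines=$(grep -c "$(printf '\r')" "$infile" || true)
echo "lines with a carriage return: $crlf_lines"

# If any were found, write a cleaned copy without the carriage returns.
if [ "$crlf_lines" -gt 0 ]; then
    tr -d '\r' < "$infile" > "$infile.unix"
fi
```

Running file on the input is another quick check; for a text file with Windows line endings it typically reports "with CRLF line terminators".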


Thank you for your response.

I have Windows 10 on my laptop. I set up Ubuntu by using this and am working in it: https://crashcourse.housegordon.org/split-fasta-files.html

I used precisely this command:

grep "chr1" B19818.CEMT_178.Bisulfite-Seq.hg38.B19818_2_lanes_dupsFlagged.q5.5mC.CpG

It generates 0 KB files.

If needed, I can send you one of the files.

Regards

Shrinka


grep uses very little memory, since it reads the file line by line. Please post the exact command you are typing and which OS you are using.




You need to use output redirection:

grep -w chr1 B19818.CEMT_178.Bisulfite-Seq.hg38.B19818_2_lanes_dupsFlagged.q5.5mC.CpG > B19818.CEMT_178.Bisulfite-Seq.hg38.B19818_2_lanes_dupsFlagged.q5.5mC.CpG.chr1

The matching lines are written to a new file named B19818.CEMT_178.Bisulfite-Seq.hg38.B19818_2_lanes_dupsFlagged.q5.5mC.CpG.chr1
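As an alternative to running grep once per chromosome, the split can be done in a single pass with awk. This is a different technique from the grep command above, shown only as a sketch. It assumes the chromosome name is in the first whitespace-separated column; if the real file has a leading row-number column, $1 would need to become $2. The file names sample.CpG and split.*.txt are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: split a file by chromosome in one pass with awk.
# Assumes the chromosome name is in column 1; all file names are hypothetical.
set -euo pipefail

infile="sample.CpG"

# Tiny stand-in for the real file.
cat > "$infile" <<'EOF'
chr1 10469 + 3 3 TCGC
chr1 10470 - 25 30 GCGA
chr2 10471 + 1 5 GCGG
chr10 10472 - 13 39 CCGC
EOF

# Every line is written to a file named after the value in its first column,
# for example split.chr1.txt. awk keeps one output file open per chromosome.
# If the file starts with a header line, "NR > 1" before the braces would skip it.
awk '{ print > ("split." $1 ".txt") }' "$infile"

# Show how many lines ended up in each output file.
wc -l split.*.txt
```

Like grep, awk processes the input line by line, so memory use should stay small regardless of the number of rows.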


Nope, the same problem remains.


Put an example file up at pastebin.com (it does not need to be the complete file).
