How to split .bed file in R
3.8 years ago

Hello, I am new to bioinformatics and I have a question about .bed files.

I have a mini .bed file sample that contains 500,000 rows and a corresponding .bigWig file.

I want to split the .bed file randomly into 50 chunks and get the density of each chunk from the .bigWig file. My question is: how can I do this?

  • I found a function "bigWigAverageOverBed" by ENCODE that can take these two files as input and output "the average score of big wig over each bed", which is exactly what I want.
  • But how should I split the rows in the .bed file? I am doing this in R. Do I just randomly pick 10,000 rows at a time, save them as chunk1.bed, chunk2.bed, etc., and run each one against the .bigWig file? Or can't I save the files separately?

Thank you.

split  --lines=10000  --additional-suffix=.bed input.bed split.
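Note that split on its own cuts the file into consecutive blocks, so the chunks follow the original row order rather than being random. A minimal sketch of a randomized variant, assuming GNU coreutils (shuf and split) are available; the function name split_bed_random and the file names input.bed and chunk. are placeholders for illustration:

```shell
# Shuffle the rows of a BED file, then cut the shuffled stream into fixed-size chunks.
# Assumes GNU coreutils (shuf, split); split_bed_random is a hypothetical helper name.
split_bed_random() {
    input=$1     # BED file to split, e.g. input.bed
    lines=$2     # rows per chunk, e.g. 10000
    prefix=$3    # output prefix, e.g. chunk.
    # "-" tells split to read the shuffled rows from standard input
    shuf "$input" | split --lines="$lines" --additional-suffix=.bed - "$prefix"
}

# 500,000 rows at 10,000 rows per chunk gives 50 files:
# chunk.aa.bed, chunk.ab.bed, ...
# split_bed_random input.bed 10000 chunk.
```

Every row lands in exactly one chunk, so concatenating the chunks gives back the original rows in a different order.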

Thank you for the reply, but is this in R? I want to use R.


No, that's why it's not an answer. But anyway, sooner or later you'll have to run a bash command with bigWigAverageOverBed...


Let's say I successfully use R to split the .bed file into 50 separate files. Is it possible to use the separate files as input to the bigWigAverageOverBed function? Since the row size is different (than the .bigWig file).


"Since the row size is different (than the .bigWig file)."

huh ???


Ummm... so I'm basically confused about how the bigWigAverageOverBed function works. The documentation only says "Compute average score of big wig over each bed". But how exactly?

"Since the row size is different (than the .bigWig file)." "huh ???"

So the size of the input .bed file doesn't matter? Maybe I should split my question into the following two questions:

  1. Initially I have a .bed file with 500,000 rows and a corresponding .bigWig file. If I input these two files into the bigWigAverageOverBed function, what will I get and in what format? (From googling, it seems the output would be the same .bed file with a new mean column appended?)

  2. If I split the .bed file into 50 sub-files (chunk1.bed, chunk2.bed, etc.), can I still input a sub-file (chunk1.bed) and the same .bigWig file into the bigWigAverageOverBed function? If yes, what will I get and in what format?
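For reference on both questions: bigWigAverageOverBed is a command-line program (one of the UCSC utilities), not an R function. It scores each row of the BED file independently against the bigWig, so the number of rows in the BED file does not have to match anything and a chunk is as valid an input as the full file. It is called as `bigWigAverageOverBed in.bw in.bed out.tab` and writes a tab-separated file with one line per BED row and the columns name, size, covered, sum, mean0, mean. The name comes from the fourth BED column and should be unique. The tool's usage text also lists a `-bedOut=out.bed` option for echoing the input BED with a mean column appended; check this against your installed version. A rough sketch of running it over all chunks, assuming the tool is on your PATH; score_chunks, BWAOB, and signal.bigWig are hypothetical names used only for illustration:

```shell
# Score every chunk against the same bigWig file.
# score_chunks and the BWAOB variable are hypothetical names for this sketch;
# BWAOB only lets you point at a different executable.
score_chunks() {
    bigwig=$1
    shift
    tool=${BWAOB:-bigWigAverageOverBed}
    for bed in "$@"; do
        # usage: bigWigAverageOverBed in.bw in.bed out.tab
        "$tool" "$bigwig" "$bed" "${bed%.bed}.tab" || return 1
    done
}

# score_chunks signal.bigWig chunk.*.bed
# Each resulting .tab file has one line per BED row:
# name  size  covered  sum  mean0  mean
```

From R, the same command could be launched with system2() and the .tab files read back with read.table().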


See related R solutions at StackOverflow


But I later need to input the .bed file into the function "bigWigAverageOverBed". Will splitting it and saving the pieces as separate files make them invalid?

