7.8 years ago
boczniak767
Hi, I'd like to create a BED file defining subsequences (randomly chosen, 100-400 bp long, different for each input sequence) of the 1 kb sequences listed in a BED file. I've of course looked at bedtools and searched the web, but haven't found anything useful.
bedtools random will generate fixed-length sequences restricted only by the chromosomes' boundaries.
As I'd like to use the BED file to extract sequences from the genome, I've also considered extracting the 1 kb FASTA sequences (using my BED file) and trimming them. But I haven't found out how to do that pseudo-random trimming.
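For what it's worth, the sub-interval selection described above can be sketched directly in Python without bedtools. This is only a minimal sketch under some assumptions: the input is tab-separated BED (0-based, half-open), the function name random_subintervals and its parameters are made up for illustration, and intervals shorter than the minimum length are simply skipped.

```python
import random


def random_subintervals(bed_lines, min_len=100, max_len=400, seed=None):
    """Yield one random sub-interval (as a BED line) per input BED interval.

    Hypothetical helper: assumes tab-separated BED with at least three columns.
    """
    rng = random.Random(seed)  # a seed makes the "random" choice reproducible
    for line in bed_lines:
        line = line.rstrip("\n")
        # Skip blank lines and BED header/comment lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split("\t")
        chrom, start, end = fields[0], int(fields[1]), int(fields[2])
        span = end - start
        if span < min_len:
            continue  # too short to hold a sub-interval of the minimum length
        # Pick a length first, then a start offset that keeps it inside the parent.
        length = rng.randint(min_len, min(max_len, span))
        offset = rng.randint(0, span - length)
        new_start = start + offset
        # Extra columns (name, score, strand, ...) are carried over unchanged.
        yield "\t".join([chrom, str(new_start), str(new_start + length)] + fields[3:])


if __name__ == "__main__":
    example = ["chr1\t1000\t2000\tpeak1", "chr2\t500\t1500\tpeak2"]
    for row in random_subintervals(example, seed=1):
        print(row)
```

The resulting lines could then be written to a file and passed to a sequence extraction tool in the usual way. Each input interval gets its own independently drawn length and position, which is the part bedtools random does not offer.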
It seems that I've found a solution. Because I need these pseudo-peaks as a background for the analysis of peaks detected in real data, the easiest approach is to call peaks the same way on some random data. I'll try the samtools merge and samtools view -s commands to create a subsample of randomized alignments from my BAM files, and call peaks using my standard parameters.
I think it's the end of my monologue ;-) It turns out that using peaks called on random data is not efficient: I actually got fewer ranges than from the real data, although I used a bigger file.
Eventually I used bedtools shuffle with the -incl option, restricting placement to the 2 kb upstream gene sequences. As the input (-i) I used randomized positions with regard to genes from a multiplied peak file.