Question

Picard: using ScatterIntervalsByNs

12 months ago

Matteo Ungaro ▴ 120

Hi there,

I have a FASTA genome (GRCh38) and want to produce a BED file listing the intervals of N bases it contains. Picard appears to have a tool for this, ScatterIntervalsByNs; however, I'm unsure whether it actually does what I need.

In practice, the command below produces a one-based interval file rather than the zero-based, half-open format that is standard for BED. If someone has more experience, I would like to know whether and how I can use this output with bedtools to subtract these intervals from the BED coordinates of the entire genome.

Thanks in advance!

java -jar picard.jar ScatterIntervalsByNs \
      R=hg38.fna \
      OT=N \
      O=hg38_one.intervals

Tags: bedtools, intervals, picard, bed

score 2 · Accepted Answer · 2024-06-17

12 months ago

Pierre Lindenbaum 166k

https://gatk.broadinstitute.org/hc/en-us/articles/360036453012-IntervalListToBed-Picard

Trivially simple command line program to convert an IntervalList file to a BED file.
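
For the workflow asked about in the question, a minimal sketch might look like the following. It assumes the legacy `KEY=value` Picard syntax used in the question and that `samtools` and `bedtools` are on the PATH. The file names `n_regions.bed`, `genome.bed`, and `genome_no_n.bed` are placeholders, and the helper function names are hypothetical.

```shell
# Build a whole-genome BED (contig, 0, length) from a FASTA index (.fai).
# The .fai can be created with: samtools faidx hg38.fna
fai_to_bed() {
  awk 'BEGIN { OFS = "\t" } { print $1, 0, $2 }' "$1"
}

# Convert the one-based interval list to zero-based BED, then remove the
# N intervals from the whole-genome BED. File names are placeholders.
subtract_n_regions() {
  java -jar picard.jar IntervalListToBed \
      I=hg38_one.intervals \
      O=n_regions.bed
  fai_to_bed hg38.fna.fai > genome.bed
  bedtools subtract -a genome.bed -b n_regions.bed > genome_no_n.bed
}
```

Calling `subtract_n_regions` in the directory that holds these files should leave the non-N regions in `genome_no_n.bed`.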

@Pierre Lindenbaum, I see. Essentially, what it does is remove the header and subtract 1 from the start coordinate? I just want to make sure, because my more straightforward approach would have been to use grep -v and awk to accomplish the same thing. Let me know, thanks!

Reply • 12 months ago by Matteo Ungaro ▴ 120

"Essentially, what it does is remove the header and subtract 1 from the start coordinate"

yep :-)

Reply • 12 months ago by Pierre Lindenbaum 166k
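
The grep/awk alternative mentioned in this thread can be sketched as below. The sample interval list is made up for illustration. Note that in the interval_list format the contig name is the first column and the one-based start is the second, so the subtraction applies to `$2`. This sketch writes only the three required BED columns; IntervalListToBed itself, as far as I know, also writes name, score, and strand columns, which `bedtools subtract` does not need.

```shell
# Hypothetical sample input: two header lines (starting with '@') followed by
# two one-based, inclusive intervals (contig, start, end, strand, name).
printf '@HD\tVN:1.6\n@SQ\tSN:chr1\tLN:1000\nchr1\t1\t100\t+\tNmer\nchr1\t501\t600\t+\tNmer\n' \
  > example.interval_list

# Drop the header lines, then shift the start by 1 to get zero-based,
# half-open BED coordinates. The end coordinate stays unchanged.
grep -v '^@' example.interval_list \
  | awk 'BEGIN { FS = OFS = "\t" } { print $1, $2 - 1, $3 }' \
  > example.bed

cat example.bed
```

The end coordinate is left as-is because a one-based inclusive end and a zero-based exclusive end are the same number.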