Question

Reformatting bedGraph files in a shell script

6.9 years ago

m98 ▴ 440

I have 100 bedGraph files that I need to reformat before converting them into bigWig files for display in UCSC Genome Browser. Briefly:

These bedGraph files were obtained using genomeCoverageBed on bam files (for which the SN tag did not have "chr" in front of the chromosome number).
Since they have no "chr" in the bedgraph file, when I convert them into bigWig files with bedGraphToBigWig, the bigwig files cannot be displayed in the UCSC Genome Browser.

I need to do 2 things to my bedGraph files:

Add "chr" in front of every chromosome number
Replace "MT" by "M"

I have managed to run the following command directly on the command line for a single file:

awk '{print "chr" $0}' file.bedgraph | sed 's/MT/M/g' > file.modified.bedgraph

However, since I am using an actual script for all my files, I then wrote:

for i in ls($BEDGRAPHDIR/*bedgraph):
do

SAMPLE=`basename $i`
SAMPLE=${SAMPLE%.bedGraph}
echo $SAMPLE

awk -v LINE="$0" '{print "chr" $LINE}' ${OUTDIR}/temp/${SAMPLE}.bedgraph | sed 's/MT/M/g' > ${OUTDIR}/temp/${SAMPLE}.modified.bedgraph

done

However, my .modified.bedgraph files are not produced correctly: they end up empty, and I get the following error (I am working on a cluster): needLargeMem: trying to allocate 0 bytes (limit: 100000000000)

I am at a loss as to why my command does not work.

Thanks

bedGraph bigWig ucsc chr awk • 2.7k views

updated 6.7 years ago by Biostar 20 • written 6.9 years ago by m98 ▴ 440

shorter:

sed 's/^/chr/;s/^chrMT/chrM/' file.bedgraph
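
As for the loop in the question, it probably goes wrong before awk matters: `for i in ls(...):` is not valid shell syntax, `${SAMPLE%.bedGraph}` will not strip a lowercase `.bedgraph` suffix (the match is case-sensitive), and the input seems to be read from `${OUTDIR}/temp` rather than `$BEDGRAPHDIR`. An empty output file would then explain the "allocate 0 bytes" message downstream. Below is a minimal sketch of the sed call wrapped in a loop. The function names `add_chr_prefix` and `reformat_all` are made up for illustration, and the sketch assumes the inputs end in lowercase `.bedgraph` and contain no track or header lines.

```shell
#!/usr/bin/env bash

# Hypothetical helper: prefix every line with "chr", then rename chrMT to chrM.
# Both substitutions are anchored at the line start, so only the first
# (chromosome) column is touched.
add_chr_prefix() {
    sed -e 's/^/chr/' -e 's/^chrMT/chrM/' "$1"
}

# Hypothetical helper: reformat every *.bedgraph file in directory $1 and
# write <sample>.modified.bedgraph into directory $2.
reformat_all() {
    local in_dir out_dir f sample
    in_dir=$1
    out_dir=$2
    mkdir -p "$out_dir"
    for f in "$in_dir"/*.bedgraph; do
        # If the glob matched nothing, $f is the literal pattern; skip it.
        [ -e "$f" ] || continue
        # basename with a suffix argument strips ".bedgraph" (case-sensitive).
        sample=$(basename "$f" .bedgraph)
        add_chr_prefix "$f" > "$out_dir/$sample.modified.bedgraph"
    done
}
```

With the variables from the question, this would be called as `reformat_all "$BEDGRAPHDIR" "$OUTDIR/temp"`. Writing to a separate output directory keeps the modified files from being picked up by the glob if the script is run again.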