Reformating bedGraph files in shell script
0
0
Entering edit mode
6.2 years ago
m98 ▴ 420

I have 100 bedGraph files that I need to reformat before converting them into bigWig files for display in UCSC Genome Browser. Briefly:

  • These bedGraph files were obtained using genomeCoverageBed on bam files (for which the SN tag did not have "chr" in front of the chromosome number).

  • Since they have no "chr" in the bedgraph file, when I convert them into bigWig files with bedGraphToBigWig, the bigwig files cannot be displayed in the UCSC Genome Browser.

I need to do 2 things to my bedGraph files:

  • Add "chr" in front of every chromosome number
  • Replace "MT" by "M"

I have managed to run the following command directly on the command line for a single file:

awk '{print "chr" $0}' file.bedgraph | sed 's/MT/M/g' > $file.modified.bedgraph

However, since I am using an actual script for all my files, I then wrote:

for i in ls($BEDGRAPHDIR/*bedgraph):
do

SAMPLE=`basename $i`
SAMPLE=${SAMPLE%.bedGraph}
echo $SAMPLE

awk -v LINE="$0" '{print "chr" $LINE}' ${OUTDIR}/temp/${SAMPLE}.bedgraph | sed 's/MT/M/g' > ${OUTDIR}/temp/${SAMPLE}.modified.bedgraph

done

However, my .modified.bedgraph are not made - they end up empty and I get a following error (I am working on a cluster: needLargeMem: trying to allocate 0 bytes (limit: 100000000000)

I am a loss as to why my command does not work.

Thanks

bedGraph bigWig ucsc chr awk • 2.5k views
ADD COMMENT
1
Entering edit mode

shorter:

sed 's/^/chr/;s/^chrMT/chrM/' file.bedgraph
ADD REPLY
0
Entering edit mode

Was literally about to add an update with that :) Thanks!

ADD REPLY

Login before adding your answer.

Traffic: 1938 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6