Hi,
I followed these steps to generate hg38 restriction sites for Juicer:
Step 1) Use 'findRestrictionSites.pl' from the HiCUP pipeline.
Edit these two lines in the script to set the chromosome FASTA path and the download URL:
my $chr_file = "/proj/uppstore2018034/santhilal/bin/juicer/Homo_sapiens.GRCh38.dna.chromosome.$chromosome".".fa.gz";
system("wget ftp://ftp.ensembl.org/pub/release-94/fasta/homo_sapiens/dna/Homo_sapiens.GRCh38.dna.chromosome."."$chromosome".".fa.gz");
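Then run it in a loop over chromosomes 1 to 22:
for i in $(seq 1 22);
do
# Name each GATC (MboI) site, split the coordinates on ':' and '-', and move the name to the last column
perl findRestrictionSites.pl --genome_version=hg38 --chromosome=${i} --re_site=GATC \
  | awk '{printf("MboI_site_1%d\t%s\n",NR,$0)}' \
  | sed 's/\:/\t/g' | sed 's/\-/\t/g' \
  | awk '{print $2"\t"$3"\t"$4"\t"$1;}' \
  > /proj/uppstore2018034/santhilal/bin/juicer/MboI_restriction_sites/hg38_chr${i}_MboI.rmap ;
# Collapse each chromosome to one line: chromosome name (without 'chr') followed by the space-separated values from column 2
# Note: this reads from restriction_sites/, so it must point to the same directory the .rmap was written to above
cut -f1,2 restriction_sites/hg38_chr${i}_MboI.rmap \
  | awk 'BEGIN{FS="\t"}{ if( !seen[$1]++ ) order[++oidx] = $1; stuff[$1] = stuff[$1] $2 " " } END { for( i = 1; i <= oidx; i++ ) print order[i]"\t"stuff[order[i]] }' \
  | sed 's/\t/ /g' | sed 's/chr//g' \
  > restriction_sites/hg38_chr${i}_MboI.txt ;
done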
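To make the second command in the loop easier to follow, here is a minimal sketch of the same collapse step on a toy file. The toy rows and the `toy.rmap` / `toy.txt` names are made up for illustration, and I am assuming the .rmap columns are chromosome, start, end, name. Unlike the one-liner above, this variant does not leave a trailing space on the line.

```shell
# Toy .rmap with hypothetical coordinates (not real hg38 sites)
workdir=$(mktemp -d)
printf 'chr1\t11\t14\tMboI_site_11\nchr1\t25\t28\tMboI_site_12\nchr1\t40\t43\tMboI_site_13\n' \
  > "$workdir/toy.rmap"

# Keep chromosome and start position, then emit one line per chromosome:
# the name without the "chr" prefix, followed by the positions separated by spaces
cut -f1,2 "$workdir/toy.rmap" \
  | awk 'BEGIN { FS = "\t" }
         { if (!seen[$1]++) order[++n] = $1; sites[$1] = sites[$1] " " $2 }
         END { for (i = 1; i <= n; i++) {
                 name = order[i]; sub(/^chr/, "", name)
                 print name sites[order[i]] } }' \
  > "$workdir/toy.txt"

cat "$workdir/toy.txt"
```

For the toy input this should give a single line starting with `1` and followed by the three start positions.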
Step 2) Merge restriction sites from all chromosomes (in order)
cat restriction_sites/hg38_chr1_MboI.txt restriction_sites/hg38_chr2_MboI.txt restriction_sites/hg38_chr3_MboI.txt [....] restriction_sites/hg38_chr22_MboI.txt > restriction_sites/hg38_MboI.txt.bak
Step 3) Extract chromosome sizes to append to the restriction sites file
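If you would rather not type out every file name, a loop keeps the chromosomes in numeric order (a shell glob would sort chr10 before chr2). This is only a sketch on three toy files in a temporary directory; `merged.txt` is a placeholder name, and for the real data you would loop over `seq 1 22`.

```shell
# Toy per-chromosome files with hypothetical site positions
workdir=$(mktemp -d)
mkdir -p "$workdir/restriction_sites"
for i in 1 2 3; do
  printf '%s 10 20\n' "$i" > "$workdir/restriction_sites/hg38_chr${i}_MboI.txt"
done

# Concatenate in numeric chromosome order; the redirect applies to the whole loop
for i in $(seq 1 3); do
  cat "$workdir/restriction_sites/hg38_chr${i}_MboI.txt"
done > "$workdir/restriction_sites/merged.txt"

cat "$workdir/restriction_sites/merged.txt"
```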
cat references/hg38.fa.fai | cut -f1,2 | sort -g -k1,1 | sed 's/\t/ /g' > references/hg38.fa.fai.tmp
Step 4) Combine chromosome size with restriction sites file
join restriction_sites/hg38_MboI.txt.bak references/hg38.fa.fai.tmp > restriction_sites/hg38_MboI.txt
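One thing to watch with `join`: it expects both inputs to be sorted on the join field in the same (lexical) order, and the chromosome names have to match exactly, with or without the "chr" prefix. As a possible alternative, here is a small awk sketch that looks the size up by name instead, so the order of the files does not matter. The file names and numbers below are toy values, not real hg38 data, and I am assuming the .fai has the usual name and length in its first two columns.

```shell
# Toy inputs (hypothetical values): merged site lines and a .fai index
workdir=$(mktemp -d)
printf '1 11 25 40\n2 7 19\n' > "$workdir/sites.txt"
printf 'chr1\t50\t6\t60\t61\nchr2\t30\t70\t60\t61\n' > "$workdir/genome.fa.fai"

# First file (.fai): remember each chromosome size, keyed by name without "chr".
# Second file (sites): append the matching size to the end of the line.
awk 'NR == FNR { name = $1; sub(/^chr/, "", name); size[name] = $2; next }
     ($1 in size) { print $0, size[$1] }' \
    "$workdir/genome.fa.fai" "$workdir/sites.txt" \
  > "$workdir/sites_with_sizes.txt"

cat "$workdir/sites_with_sizes.txt"
```

Chromosomes that are missing from the .fai are silently dropped here, so it is worth comparing line counts before and after.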
I hope this helps :)
Follow the Wiki which is even linked on the page you linked.
Thanks, but the problem was that the script was looking for that file somewhere other than the "installation" folder, which was weird. That is fixed now.