I need to convert CRAM files to BAM files using pysam. I know from this question that converting a CRAM file to a BAM file requires the appropriate reference genome. I also know that I can identify the right reference genome from the CRAM file header, which in my case has lines like this:
@SQ SN:chr1 LN:248956422 M5:6aef897c3d6ff0c78aff06ac189178dd UR:/mnt/ssd/MegaBOLT_scheduler/reference/G42_refdata/GCA_000001405.15_GRCh38_no_alt_plus_hs38d1_analysis_set.fna
I think this means the CRAM files were produced using a local copy of GCA_000001405.15_GRCh38_no_alt_plus_hs38d1_analysis_set.fna. I also have a local copy of that file, but at a different path.
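To check that my local copy really contains the same sequences, I assume I could compare the M5 values in the header against checksums computed from my FASTA. As far as I understand the SAM specification, M5 is the MD5 of the sequence in uppercase with whitespace removed. Below is a rough sketch of that check; the function names `fasta_md5s` and `mismatched_contigs` are my own and not part of pysam.

```python
import hashlib


def fasta_md5s(fasta_path):
    """Return {contig_name: md5_hex} for an uncompressed FASTA file.

    Each sequence is hashed line by line after removing whitespace and
    upper-casing, which is my understanding of how the M5 tag is defined.
    """
    md5s = {}
    name, digest = None, None
    with open(fasta_path) as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                if name is not None:
                    md5s[name] = digest.hexdigest()
                # The contig name is the first word after '>'
                name = line[1:].split()[0]
                digest = hashlib.md5()
            elif name is not None:
                digest.update("".join(line.split()).upper().encode("ascii"))
    if name is not None:
        md5s[name] = digest.hexdigest()
    return md5s


def mismatched_contigs(header_sq, md5s):
    """List the SN names whose header M5 differs from (or is absent in) md5s.

    header_sq is a list of dicts with 'SN' and 'M5' keys, one per @SQ line.
    """
    return [sq["SN"] for sq in header_sq if md5s.get(sq["SN"]) != sq.get("M5")]
```

I believe the @SQ entries could be read with `pysam.AlignmentFile(cram_file_path, "rc").header.to_dict()["SQ"]`, and an empty list from `mismatched_contigs` would then suggest that my local FASTA matches the one used to write the CRAM.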
How can I use pysam to convert these CRAM files to BAM using the correct reference genome?
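My guess is that the reference has to be passed explicitly. samtools view has a `-T` (`--reference`) option for this, so I imagine something like the sketch below. The helper names `build_view_args` and `cram_to_bam` and all the paths are placeholders of my own, and I have not confirmed that this is the recommended way to do it in pysam.

```python
def build_view_args(cram_path, bam_path, reference_path, threads=8):
    """Build samtools view arguments that name the reference FASTA explicitly."""
    return [
        "-@", str(threads),    # additional threads
        "-b",                  # write BAM output
        "-T", reference_path,  # reference FASTA used to decode the CRAM
        "-o", bam_path,
        cram_path,
    ]


def cram_to_bam(cram_path, bam_path, reference_path, threads=8):
    """Convert a CRAM file to BAM through pysam's samtools view wrapper."""
    import pysam  # imported here so build_view_args works without pysam installed

    # catch_stdout=False because the output goes to the -o file, not to stdout
    pysam.view(
        *build_view_args(cram_path, bam_path, reference_path, threads),
        catch_stdout=False,
    )
```

I would then call it as `cram_to_bam("sample.cram", "sample.bam", "/my/path/GCA_000001405.15_GRCh38_no_alt_plus_hs38d1_analysis_set.fna")`, where all three paths are made up for illustration.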
I've seen code for converting CRAM to BAM in pysam that takes this form: pysam.view("-@", "8", "-b", "-o", bam_file_path, cram_file_path)
but I don't trust it, because it appears to assume that pysam knows where to look for the correct reference genome (presumably from the CRAM header or some environment variable). Any help would be much appreciated. Thank you.