4 months ago
devenvyas
I am trying to convert 1000 Genomes data from GRCh38 to GRCh37, but I am stuck. I have run CrossMap on the CRAM files (and on the same files converted to BAM), and I keep getting the error below. Does anyone have any suggestions? My whole pipeline is GRCh37, but these files are GRCh38.
https://www.internationalgenome.org/data-portal/data-collection/30x-grch38
[Vyas@sahara 1000G_30]$ CrossMap.py bam hg38ToHg19.over.chain.gz HG02126.final.cram output.bam
Insert size = 200.000000
Insert size stdev = 30.000000
Number of stdev from the mean = 3.000000
@ 2024-06-30 22:12:07: Read chain_file: hg38ToHg19.over.chain.gz
Traceback (most recent call last):
File "/home/progs/anaconda/bin/CrossMap.py", line 1364, in <module>
crossmap_bam_file(mapping = mapTree, chainfile = chain_file, infile = in_file, outfile_prefix = out_file, chrom_size = targetChromSizes, IS_size=options.insert_size, IS_std=options.insert_size_stdev, fold=options.insert_size_fold)
File "/home/progs/anaconda/bin/CrossMap.py", line 714, in crossmap_bam_file
if len(samfile.header) ==0:
File "pysam/calignmentfile.pyx", line 1184, in pysam.calignmentfile.AlignmentFile.header.__get__ (pysam/calignmentfile.c:14583)
ValueError: unknown field code 'AH' in record 'SQ'
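The traceback suggests that the installed pysam rejects the `AH` field on the `@SQ` header lines before any read is lifted over. One possible workaround, besides trying a newer pysam/CrossMap, might be to drop that field from the header text. The sketch below is only an illustration under that assumption: `strip_sq_field` is a hypothetical helper, and the example header is made up rather than taken from HG02126.final.cram.

```python
def strip_sq_field(header_text, field="AH"):
    """Return SAM header text with `field` removed from every @SQ line."""
    cleaned = []
    for line in header_text.splitlines():
        if line.startswith("@SQ"):
            # Header fields are tab-separated TAG:VALUE pairs.
            parts = line.split("\t")
            parts = [p for p in parts if not p.startswith(field + ":")]
            line = "\t".join(parts)
        cleaned.append(line)
    return "\n".join(cleaned) + "\n"


# Made-up header for illustration only (not the real 1000 Genomes header).
example = (
    "@HD\tVN:1.5\tSO:coordinate\n"
    "@SQ\tSN:chr1\tLN:248956422\n"
    "@SQ\tSN:alt_example\tLN:1000\tAH:chr1:100-200\n"
)

print(strip_sq_field(example), end="")
```

In practice the header text could come from `samtools view -H`, and the cleaned header could be applied with `samtools reheader`. Removing `AH` discards the alternate-locus annotation, so this only makes sense if nothing downstream relies on it.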
devenvyas Why did you delete this post? Please do not delete posts that have received feedback. Interact with the people that have invested effort into helping you, and work towards providing closure to your post.