This code works perfectly:
def read_vcf(file_path):
    with open(file_path, 'r') as f:
        lines = [l for l in f if not l.startswith('##')]
    return pd.read_csv(
        io.StringIO(''.join(lines)),
        dtype={'#CHROM': str, 'POS': int, 'ID': str, 'REF': str, 'ALT': str,
               'QUAL': str, 'FILTER': str, 'INFO': str},
        sep='\t'
    ).rename(columns={'#CHROM': 'CHROM'})
However, after reading up on how to open gzip files, I came up with this:
def read_vcf(file_path):
    with io.TextIOWrapper(gzip.open(file_path, 'r')) as f:
        lines = [l for l in f if not l.startswith('##')]
    return pd.read_csv(
        io.StringIO(''.join(lines)),
        dtype={'#CHROM': str, 'POS': int, 'ID': str, 'REF': str, 'ALT': str,
               'QUAL': str, 'FILTER': str, 'INFO': str},
        sep='\t'
    ).rename(columns={'#CHROM': 'CHROM'})
According to this post, io.TextIOWrapper is needed to avoid the error described there, which I was getting too.
But now I get an error I don't understand.
I printed the lines read by both versions of the function and they look the same.
Why am I getting this error?
Hmm, not sure at all, but I never need the TextIOWrapper stuff and just use
gzip.open(args.vcf, 'rt')
for reading as text.
What are the contents of line 2654, please?
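For what it's worth, the gzip.open(..., 'rt') suggestion above could be sketched roughly like this. It is only a minimal sketch: the helper names open_vcf and vcf_body_lines are made up for illustration, and it assumes compressed files can be recognised by a '.gz' suffix. It may or may not be related to the error in the question.

```python
import gzip


def open_vcf(file_path):
    """Open a VCF as text, whether or not it is gzip-compressed.

    Assumption: compressed files are recognised by a '.gz' suffix.
    """
    if str(file_path).endswith('.gz'):
        # 'rt' makes gzip decode bytes to str, so no io.TextIOWrapper is needed
        return gzip.open(file_path, 'rt')
    return open(file_path, 'r')


def vcf_body_lines(file_path):
    """Return the '#CHROM' header line and the records, skipping '##' meta lines."""
    with open_vcf(file_path) as f:
        return [l for l in f if not l.startswith('##')]
```

The list returned by vcf_body_lines can then be joined and passed to pd.read_csv through io.StringIO exactly as in the read_vcf function from the question, so the same parsing code serves both plain and gzipped files.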