Hi, I have a file containing reads aligned to a reference genome, in FASTA format. I want to find the positions in the reference genome where the reads aligned, using Python. Is there a module that can read this kind of file and return the aligned positions?
You probably have a BAM file. If you want a Python module, you can look at pysam.
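If the file does turn out to be a BAM file, a minimal sketch with pysam could look like the following. The function name `iter_read_positions` and the path `reads.bam` are made up for illustration, and pysam has to be installed separately; `reference_start` is 0-based.

```python
def iter_read_positions(bam_path):
    """Yield (read name, reference name, 0-based start) for each mapped read."""
    import pysam  # third-party package; imported here so the sketch loads without it

    with pysam.AlignmentFile(bam_path, "rb") as bam:
        for read in bam:
            if read.is_unmapped:
                continue  # unmapped reads have no reference position
            yield read.query_name, read.reference_name, read.reference_start


# Hypothetical usage; "reads.bam" is a placeholder path:
# for name, ref, start in iter_read_positions("reads.bam"):
#     print(name, ref, start + 1)  # +1 converts to a 1-based coordinate
```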
Thanks for the reply, but I do not have a BAM file. I have a FASTA file. I have attached a screenshot. I want to find the positions where the reads aligned to the reference genome.
This is not in fasta format.
If you can post an example of the file, we can help you better.
Thanks for the reply. Actually, it's an aligned file. I am unable to upload the file here, so I am adding a screenshot of it.
SeqIO or AlignIO from Biopython can help you parse FASTA files. In your case, it looks like all you need to do is count the number of '-' characters before the first base of each read to determine its mapping position.
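Here is a rough plain-Python sketch of that counting idea, without Biopython. It assumes the file is a gapped FASTA in which every read is padded with leading '-' characters up to the column where it aligns; the screenshot is not reproduced here, so the layout and the example records below are invented. The function names are also just for illustration.

```python
def read_fasta(lines):
    """Yield (name, sequence) pairs from an iterable of FASTA lines."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            header = line[1:].split()
            name, chunks = (header[0] if header else ""), []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def leading_gap_offset(seq, gap="-"):
    """Count the gap characters in front of the first base (0-based start column)."""
    return len(seq) - len(seq.lstrip(gap))


def aligned_starts(lines):
    """Map each record name to its 0-based start column in the alignment."""
    return {name: leading_gap_offset(seq) for name, seq in read_fasta(lines)}


# Invented example records, not taken from the poster's file
example = """>ref
ACGTACGTAC
>read1
---TACGT--
>read2
------GTAC
""".splitlines()

print(aligned_starts(example))  # {'ref': 0, 'read1': 3, 'read2': 6}
```

For a real file you can pass the open file handle instead of the example list, e.g. `with open("aligned.fasta") as fh: aligned_starts(fh)` (the file name is a placeholder). Add 1 to get 1-based positions. Note that the offset is an alignment column: if the reference line itself contains '-' characters, the column will not match the reference coordinate exactly.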
I should note that this is rather an unusual format for storing such data, so maybe you should reconsider the procedure that produced this file.
OK, thank you for the help.