Hi, I have a file containing reads aligned to the reference genome in FASTA format. I want to find the position in the reference genome where each read aligned, using Python. Is there a module that can read this kind of file and return the aligned positions?
You probably have a BAM file. If you want a Python module, you can look at pysam.
Thanks for the reply, but I do not have a BAM file. I have a FASTA file, and I have attached a screenshot. I want to find the positions where the reads aligned to the reference genome. ![reads aligned to the reference](/media/images/e2150a86-66db-4074-bad5-a9f63d95)
This is not in FASTA format.
If you can post an example of the file, we can help you better.
Thanks for the reply. Actually, it's an aligned file. I am unable to upload the file here, so I am adding a screenshot of it. ![enter image description here](/media/images/ac3859e5-e833-4f73-8e46-0c26f28d)
SeqIO or AlignIO from Biopython can help you parse FASTA files. In your case, it looks like all you need to do is count the number of '-' characters before each read sequence: that count is the read's offset from the start of the alignment, so the mapping position is the count plus one in 1-based coordinates.
I should note that this is a rather unusual format for storing such data, so you may want to reconsider the procedure that produced this file.
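The gap-counting idea can be sketched in plain Python as follows. This is only a minimal illustration, not a tested solution for your file: it assumes a gapped FASTA alignment in which the first record is the reference and every read is padded with '-' to the alignment length. The names `read_fasta`, `aligned_span`, and the example sequences are made up for this sketch. Also note that if the reference row itself contains gaps, the alignment column is no longer the same as the reference coordinate.

```python
from io import StringIO


def read_fasta(handle):
    """Yield (name, sequence) pairs from a FASTA-formatted text handle."""
    name, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if name is not None:
                yield name, "".join(chunks)
            name, chunks = line[1:].strip(), []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def aligned_span(seq, gap="-"):
    """Return the 1-based (start, end) alignment columns covered by a read.

    Leading and trailing gap characters are treated as padding.
    Returns None if the sequence consists only of gap characters.
    """
    without_leading = seq.lstrip(gap)
    if not without_leading:
        return None
    start = len(seq) - len(without_leading) + 1  # leading gaps + 1
    end = len(seq.rstrip(gap))                   # last non-gap column
    return start, end


# Hypothetical example data standing in for the real file.
example = """>reference
ACGTACGTACGT
>read1
---TACGT----
>read2
ACGT--------
"""

records = list(read_fasta(StringIO(example)))

# Skip the first record, assumed here to be the reference.
for name, seq in records[1:]:
    start, end = aligned_span(seq)
    print(name, start, end)
# prints:
# read1 4 8
# read2 1 4
```

To run it on a real file, replace `StringIO(example)` with an open file handle, for example `open("alignment.fasta")` (the file name is a placeholder).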
OK, thank you for the help.