Getting sequence information by using atoms
2.4 years ago

I have a file in CIF format whose contents are as follows.

My goal is to get the sequence (amino acid letters) from the atom records. Is there a Python function that does this?

[Image: screenshot of the CIF file contents, not reproduced here]

atoms python cif protein • 1.0k views
awk '($1=="ATOM") {print $6}' < in.cif | paste -s -d '-'

??

Please do not post images of the data. Always post the data itself and the expected output so that the issue is easier to understand.

2.4 years ago
Joe 21k

You should be able to use Biopython for this; it already has functions for this kind of task. Alternatively, you could use something like UCSF Chimera or PyMOL.
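As a rough sketch of the Biopython route, something like the following should work. It assumes Biopython is installed; the function name `peptide_sequences` and the structure ID "query" are placeholders of mine, not anything from the question.

```python
def peptide_sequences(cif_path):
    """Return one sequence string per continuous peptide fragment."""
    # Imported inside the function so that a script containing it
    # still loads on a machine without Biopython.
    from Bio.PDB import MMCIFParser, PPBuilder

    parser = MMCIFParser(QUIET=True)  # QUIET suppresses parser warnings
    structure = parser.get_structure("query", cif_path)
    # PPBuilder starts a new fragment wherever the backbone is broken,
    # so a chain with unresolved residues yields several sequences.
    peptides = PPBuilder().build_peptides(structure)
    return [str(pp.get_sequence()) for pp in peptides]
```

Calling `peptide_sequences("in.cif")` would then give a list of one-letter sequences, one per fragment. For a file with several models, the fragments of every model are returned.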

If you want to use generic command-line tools, as Pierre pointed out, the information you need is in column 6 of the ATOM records.
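The column-6 idea can also be sketched in plain Python, collapsing the per-atom residue names into one letter per residue. This assumes the common wwPDB column order, in which field 7 is the chain ID and field 9 the residue number (that part is my assumption; check the `_atom_site` header of your file). The function name and the example records are made up for illustration.

```python
# Three-letter to one-letter codes for the 20 standard amino acids.
THREE_TO_ONE = {
    "ALA": "A", "ARG": "R", "ASN": "N", "ASP": "D", "CYS": "C",
    "GLN": "Q", "GLU": "E", "GLY": "G", "HIS": "H", "ILE": "I",
    "LEU": "L", "LYS": "K", "MET": "M", "PHE": "F", "PRO": "P",
    "SER": "S", "THR": "T", "TRP": "W", "TYR": "Y", "VAL": "V",
}


def sequences_from_atom_lines(lines):
    """Return {chain_id: one-letter sequence} built from ATOM records.

    Assumed layout: field 6 = residue name, field 7 = chain ID,
    field 9 = residue number.
    """
    chains = {}     # chain ID -> list of one-letter codes
    last_seen = {}  # chain ID -> number of the last residue added
    for line in lines:
        fields = line.split()
        if len(fields) < 9 or fields[0] != "ATOM":
            continue
        res_name, chain_id, res_num = fields[5], fields[6], fields[8]
        # Every atom of a residue repeats its name; keep only the first.
        if last_seen.get(chain_id) == res_num:
            continue
        last_seen[chain_id] = res_num
        # Non-standard residues are written as "X".
        chains.setdefault(chain_id, []).append(THREE_TO_ONE.get(res_name, "X"))
    return {chain: "".join(codes) for chain, codes in chains.items()}


# Made-up ATOM records, only to show the expected layout.
example = """\
ATOM 1 N N . MET A 1 1 ? 24.4 10.1 5.0 1.00 20.0 ? 1 MET A N 1
ATOM 2 C CA . MET A 1 1 ? 25.0 11.0 5.5 1.00 20.0 ? 1 MET A CA 1
ATOM 3 N N . GLY A 1 2 ? 26.0 12.0 6.0 1.00 20.0 ? 2 GLY A N 1
ATOM 4 N N . ALA B 2 1 ? 30.0 15.0 9.0 1.00 20.0 ? 1 ALA B N 1
""".splitlines()

print(sequences_from_atom_lines(example))  # {'A': 'MG', 'B': 'A'}
```

For a real file, pass the open file handle instead of the example list, e.g. `with open("in.cif") as handle: sequences_from_atom_lines(handle)`.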

Note, however, that extracting a sequence this way may not be that simple: crystal structure files often have discontinuities in their sequences compared with the genomic annotations.
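One symptom of such a discontinuity is a jump in the residue numbering within a chain. A minimal check, assuming the residue numbers are plain integers (no insertion codes) and follow the reference sequence, could look like this; `find_gaps` and the example numbers are hypothetical.

```python
def find_gaps(residue_numbers):
    """Return (last_before_gap, first_after_gap) pairs for one chain.

    residue_numbers: integers in the order the residues appear in the file.
    """
    gaps = []
    for previous, current in zip(residue_numbers, residue_numbers[1:]):
        if current - previous > 1:
            gaps.append((previous, current))
    return gaps


# Hypothetical chain in which residues 4-6 were not resolved.
observed = [1, 2, 3, 7, 8]
print(find_gaps(observed))  # [(3, 7)]
```

An empty list means the numbering is continuous, which does not by itself guarantee that the structure covers the full annotated sequence.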
