21 months ago
namck
I have a very large file containing more than 1000 merged FASTA sequences. I want to extract the list of accession IDs from the file, but the following code gives me an error. Can anybody tell me why, and how to solve it? All my sequences contain a header line like this:
>NZ_QJRW01000001 (<'gap type:'>, <'description'>, 'gap start:', XXXXX, 'gap end:', XXXXX)
Here is a snippet from the python script I have tried,
If you want to stop the loop the first time there is no match, use "break". If you want to skip the line with no match and carry on with the next line, use "continue".
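To make the difference concrete, here is a minimal sketch, not the original poster's script. It assumes the accession is everything between ">" and the first whitespace on a header line. The function name extract_accessions, the second accession and the sequence lines are made up for illustration.

```python
import re

# Assumption: the accession is the run of non-whitespace characters right after '>'.
ACCESSION_RE = re.compile(r"^>(\S+)")


def extract_accessions(lines):
    """Collect the accession ID from every FASTA header line in `lines`."""
    accessions = []
    for line in lines:
        match = ACCESSION_RE.match(line)
        if match is None:
            # Not a header (sequence or blank line): skip it and keep looping.
            # A `break` here would instead end the loop at the first such line.
            continue
        accessions.append(match.group(1))
    return accessions


# Illustrative input: the first header is the one from the question,
# the second header and the sequence lines are hypothetical.
example = [
    ">NZ_QJRW01000001 (<'gap type:'>, <'description'>, 'gap start:', XXXXX, 'gap end:', XXXXX)",
    "ACGTNNNNACGT",
    ">NZ_QJRW01000002 (hypothetical second record)",
    "TTGACA",
]
print(extract_accessions(example))
```

Since an open file is iterated line by line, the same function should work on a real file by passing the file handle in place of the list, so the whole file never has to be held in memory.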
Any particular reason you are doing this with Python (other than perhaps learning Python)? Or is there more to this script than just accession recovery? Otherwise, that can be done with a simple grep and cut.