2.0 years ago
themercenary
How can I use R to split a protein FASTA file into three columns: the ID part (such as NP_001007096.1) in the first column, the rest of the specific naming from the header in the second column, and the sequence data in the third column?
Provide an example of the headers.
Is this a FASTA? Will the first line always start with '>'? Is the sequence always on a single line regardless of length or will it wrap?
In addition: if you don't preserve the structure of a FASTA file, you will end up with a worthless FASTA file. I mean that the second line must be the sequence itself. You cannot arbitrarily put any other information in the second line, only the sequence.
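Pending answers to the questions above, here is a minimal base-R sketch of one way the split could be done. It assumes every record starts with '>', that the ID is everything up to the first whitespace in the header, and that a sequence may wrap over several lines. The function name `fasta_to_table` and the example records are made up for illustration. The result is a three-column table, not a FASTA file, so it should be written to a separate file rather than over the original.

```r
# Hypothetical helper: turn the lines of a FASTA file into a data frame
# with columns id, description and sequence.
fasta_to_table <- function(lines) {
  lines <- trimws(lines)
  lines <- lines[nzchar(lines)]                # drop blank lines
  is_header <- startsWith(lines, ">")
  # Assumption: the input is a FASTA, so the first non-blank line is a header
  stopifnot(length(lines) > 0, is_header[1])

  headers <- sub("^>[[:space:]]*", "", lines[is_header])

  # cumsum() labels every line with the index of the record it belongs to
  record <- cumsum(is_header)
  seq_groups <- split(
    lines[!is_header],
    factor(record[!is_header], levels = seq_along(headers))
  )
  # Join wrapped sequence lines into one string per record
  seqs <- vapply(seq_groups, paste, character(1), collapse = "")

  data.frame(
    id          = sub("[[:space:]].*$", "", headers),
    description = trimws(sub("^[^[:space:]]+", "", headers)),
    sequence    = unname(seqs),
    stringsAsFactors = FALSE
  )
}

# Invented example records (descriptions and sequences are placeholders)
example_lines <- c(
  ">NP_001007096.1 hypothetical description one",
  "MKTAYIAK",
  "QRQISFVK",
  ">XP_000000.1 hypothetical description two",
  "MSDNE"
)
tab <- fasta_to_table(example_lines)
print(tab)
```

For a real file, something like `fasta_to_table(readLines("proteins.faa"))` should work (the file name is a placeholder), and `write.table(tab, "proteins.tsv", sep = "\t", quote = FALSE, row.names = FALSE)` would save the table next to the untouched FASTA.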