Hi,
I have a list of genes of particular interest to me, and some data in GTF format, both from my RNA-seq experiment and from the official annotation (I can easily convert it to BED format with gtf2bed). Each gene or transcript in these files has an identifier.
I would like to do some operations (compare my data to the official annotation, for example), but ONLY on these genes of interest. I select them, for example, because they share the same Gene Ontology term.
Is there a parser that lets me read a GTF (or BED) file, select the lines containing the identifiers I request, and write the results to a new GTF (or BED) file?
I can code a simple parser myself, but if a solution already exists (which is very likely) I would be interested. I am currently coding in R, but a Python script would be fine too.
Thanks for your advice
You are perfectly correct, I was going for something needlessly complicated. Thanks a lot for your help.
I'm glad it was that easy :)
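You might need the -w flag to match whole words only, or grep could find MAPK inside a line containing MAPK3. Note that the capital -F flag only makes grep treat the patterns as fixed strings rather than regular expressions, so on its own it would still match MAPK3.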
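Since a Python script was mentioned as acceptable, here is a minimal sketch that avoids the partial-match problem by comparing identifiers exactly against the attribute values in column 9 of the GTF. The function names (filter_gtf_lines, filter_gtf_file) are made up for illustration, and I am assuming the identifiers live under gene_id, transcript_id or gene_name; adjust the keys to whatever your files actually use.

```python
import re

# Matches key "value" pairs in the GTF attributes column (column 9).
ATTR_RE = re.compile(r'(\S+) "([^"]*)"')

# Assumed attribute keys; change these if your identifiers are stored elsewhere.
DEFAULT_KEYS = ("gene_id", "transcript_id", "gene_name")


def filter_gtf_lines(lines, wanted_ids, keys=DEFAULT_KEYS):
    """Yield GTF lines whose attribute value equals one of wanted_ids exactly."""
    wanted = set(wanted_ids)
    for line in lines:
        if line.startswith("#"):
            continue  # skip header/comment lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue  # not a complete GTF record
        attrs = dict(ATTR_RE.findall(fields[8]))
        # Exact comparison, so "MAPK" does not match "MAPK3".
        if any(attrs.get(key) in wanted for key in keys):
            yield line


def filter_gtf_file(gtf_path, ids_path, out_path, keys=DEFAULT_KEYS):
    """Read one identifier per line from ids_path and write matching GTF lines."""
    with open(ids_path) as handle:
        wanted = {row.strip() for row in handle if row.strip()}
    with open(gtf_path) as src, open(out_path, "w") as dst:
        dst.writelines(filter_gtf_lines(src, wanted, keys))
```

Calling filter_gtf_file("annotation.gtf", "ids.txt", "subset.gtf") (file names are placeholders) would then write only the records for the requested genes. For a one-off job grep is still simpler; this is mainly useful if you want the filtering inside a larger script.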