Dear all,
Simple google search gave me the following:
http://gtrnadb.ucsc.edu/genomes/eukaryota/Hsapi19/hg19-tRNAs.fa
https://www.ncbi.nlm.nih.gov/genome/51
https://www.ncbi.nlm.nih.gov/gene/?term=tRNA%20AND%20human
But how can I do it programmically in Python and find all human tRNA-sequences using Bio.Entrez?
I’ve read biopython Cookbook. I found the following example for some Arabidopsis thaliana chromosomes.
17.2.2 Annotated Chromosomes Continuing from the previous example, let’s also show the tRNA genes. We’ll get their locations by parsing the GenBank files for the five Arabidopsis thaliana chromosomes. You’ll need to download these files from the NCBI FTP site ftp://ftp.ncbi.nlm.nih.gov/genomes/Arabidopsis_thaliana, and preserve the subdirectory names or edit the paths below:
from reportlab.lib.units import cm
from Bio import SeqIO
from Bio.Graphics import BasicChromosome
entries = [("Chr I", "CHR_I/NC_003070.gbk"),
("Chr II", "CHR_II/NC_003071.gbk"),
("Chr III", "CHR_III/NC_003074.gbk"),
("Chr IV", "CHR_IV/NC_003075.gbk"),
("Chr V", "CHR_V/NC_003076.gbk")]
max_len = 30432563 #Could compute this
telomere_length = 1000000 #For illustration
chr_diagram = BasicChromosome.Organism()
chr_diagram.page_size = (29.7*cm, 21*cm) #A4 landscape
for index, (name, filename) in enumerate(entries):
record = SeqIO.read(filename,"genbank")
length = len(record)
features = [f for f in record.features if f.type=="tRNA"]
#Record an Artemis style integer color in the feature's qualifiers,
#1 = Black, 2 = Red, 3 = Green, 4 = blue, 5 =cyan, 6 = purple
for f in features: f.qualifiers["color"] = [index+2]
cur_chromosome = BasicChromosome.Chromosome(name)
#Set the scale to the MAXIMUM length plus the two telomeres in bp,
#want the same scale used on all five chromosomes so they can be
#compared to each other
cur_chromosome.scale_num = max_len + 2 * telomere_length
#Add an opening telomere
start = BasicChromosome.TelomereSegment()
start.scale = telomere_length
cur_chromosome.add(start)
#Add a body - again using bp as the scale length here.
body = BasicChromosome.AnnotatedChromosomeSegment(length, features)
body.scale = length
cur_chromosome.add(body)
#Add a closing telomere
end = BasicChromosome.TelomereSegment(inverted=True)
end.scale = telomere_length
cur_chromosome.add(end)
#This chromosome is done
chr_diagram.add(cur_chromosome)
chr_diagram.draw("tRNA_chrom.pdf", "Arabidopsis thaliana")
It might warn you about the labels being too close together - have a look at the forward strand (right hand side) of Chr I, but it should create a colorful PDF file.
Is it a good way for human genome? And what about Bio.Entrez?
Many thanks!
Natasha
Thank you very much!