4.7 years ago
anamasic03
Hi! Is it possible to find the number of genes in a genome, and which programs can I use to do it? Thank you in advance.
This is a solution using the biomaRt R package.
Anyway, this is your HOMEWORK, so I suggest you read the user guide and try the code yourself. You can then ask further questions: how many transcripts, how many non-coding genes, etc.
For example: how many protein-coding genes are there in the human genome?
# R code
> library(biomaRt)
> # Connect to the Ensembl human gene dataset
> mart <- useMart("ensembl", dataset = "hsapiens_gene_ensembl")
> # Fetch the ID and biotype of every gene annotated as protein-coding
> df <- getBM(attributes = c("ensembl_gene_id", "gene_biotype"),
+             filters = "biotype",
+             values = "protein_coding",
+             mart = mart)
> dim(df)   # rows = genes, columns = attributes
[1] 22799     2
> head(df, 2)
  ensembl_gene_id   gene_biotype
1 ENSG00000198888 protein_coding
2 ENSG00000198763 protein_coding
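The follow-up questions mentioned above (non-coding genes and so on) could probably be approached by fetching every gene with its biotype and tabulating the result. What follows is only a minimal sketch: the helper name `count_by_biotype` is hypothetical, and the toy data frame uses made-up placeholder IDs standing in for a real getBM() result.

```r
# Hypothetical helper: tally genes per biotype from a data frame shaped like
# the getBM() result above (columns ensembl_gene_id and gene_biotype).
count_by_biotype <- function(genes) {
  # Drop repeated gene IDs so each gene is counted only once
  genes <- genes[!duplicated(genes$ensembl_gene_id), ]
  counts <- table(genes$gene_biotype)
  # Most frequent biotype first
  sort(counts, decreasing = TRUE)
}

# Toy input standing in for a real query result (IDs are placeholders)
toy <- data.frame(
  ensembl_gene_id = c("G1", "G2", "G3", "G3", "G4"),
  gene_biotype = c("protein_coding", "lncRNA", "protein_coding",
                   "protein_coding", "miRNA"),
  stringsAsFactors = FALSE
)
biotype_counts <- count_by_biotype(toy)
print(biotype_counts)

# With a live Ensembl connection, the same helper could be fed all genes:
# all_genes <- getBM(attributes = c("ensembl_gene_id", "gene_biotype"),
#                    mart = mart)
# count_by_biotype(all_genes)
```

The exact numbers you get from Ensembl will depend on the release you query, so expect them to differ from the count shown above.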
Which genome?
My homework is to write a program that determines the number of genes in the human genome.
If it's homework, shouldn't you perhaps give it a try yourself before just asking here?
Think about this: are you being asked to write a gene prediction program (quite tough), or a program that finds this information by downloading existing annotations?
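If the assignment is the second kind, one possible approach is to count the "gene" features in a downloaded GTF annotation file. The sketch below is only an illustration under stated assumptions: the function name `count_gtf_genes`, the toy lines, and the file path in the final comment are all hypothetical, and it assumes the standard tab-separated GTF layout with the feature type in column 3 and a `gene_id "..."` entry in column 9.

```r
# Hypothetical helper: count distinct gene IDs among "gene" features
# in a character vector of GTF-formatted lines.
count_gtf_genes <- function(gtf_lines) {
  # Skip header/comment lines, which start with '#'
  gtf_lines <- gtf_lines[!grepl("^#", gtf_lines)]
  fields <- strsplit(gtf_lines, "\t", fixed = TRUE)
  # Keep well-formed rows whose feature type (column 3) is "gene"
  is_gene <- vapply(fields,
                    function(f) length(f) >= 9 && f[3] == "gene",
                    logical(1))
  # Column 9 holds the attributes; pull out the quoted gene_id value
  attrs <- vapply(fields[is_gene], function(f) f[9], character(1))
  gene_ids <- sub('.*gene_id "([^"]+)".*', "\\1", attrs)
  length(unique(gene_ids))
}

# Toy annotation: two genes and one transcript (all values are made up)
toy_gtf <- c(
  "#!genome-build toy",
  "1\ttoy\tgene\t100\t900\t.\t+\t.\tgene_id \"GENE_A\"; gene_biotype \"protein_coding\";",
  "1\ttoy\ttranscript\t100\t900\t.\t+\t.\tgene_id \"GENE_A\"; transcript_id \"TX_A1\";",
  "1\ttoy\tgene\t2000\t2500\t.\t-\t.\tgene_id \"GENE_B\"; gene_biotype \"lncRNA\";"
)
n_genes <- count_gtf_genes(toy_gtf)
print(n_genes)

# For a real annotation, read the file first (the path is a placeholder):
# n_genes <- count_gtf_genes(readLines("annotation.gtf"))
```

Note that this only counts what the annotation already contains; it does not predict genes from sequence.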
Many people have spent their whole careers working on non-continuous gene prediction in eukaryotes. Many still do, yet the problem remains unsolved. So either you have misunderstood your assignment, or whoever gave it to you does not understand the difficulty. Either way, writing a program that determines the number of human genes is not something anyone should be tackling from scratch. That would still be true even for those with much more experience in human gene prediction.