2.7 years ago
Bioinf_Questions
Hello everyone. I'm trying to filter a GTF file in R so that it keeps only the lines for the 100 genes I'm interested in. The GTF file is already loaded as a data frame with one column holding the gene_id of each line, and the file with the genes of interest likewise has a single column with the gene_id of those genes. I've tried to use filter (package: dplyr), but I get the message:
> gtf_filtered <- filter(gtf2, gene == top3)
Error in `filter()`:
! Problem while computing `..1 = gene == top3`.
x Input `..1` must be of size 1326608 or 1, not size 100.
Run `rlang::last_error()` to see where the error occurred.
Is there a way to solve this, or another package/function I can use to filter the file? Filtering by one gene_id at a time doesn't cause any problem, e.g.:
> gtf_filter_10 <- filter(gtf2, gene == 'gene_id PRX4')
Thanks in advance
Thank you for describing the data frame in detail. Please post some example data and the expected output, or the output of dput() in R.
The data is just a normal gtf file with 9 columns (see http://www.ensembl.org/info/website/upload/gff.html for gtf file details). The list with the genes of interest has just one column with 100 lines with the gene_ids (like 'gene_id PRX4'). The expected output is the same as the original gtf, but with only the wanted genes.
Thank you for your data explanation again.
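A likely cause of the error: `==` compares element by element, so `gene == top3` returns one value per gene of interest (100) instead of one per GTF row (1326608), and filter() rejects the size mismatch. Set membership with `%in%` avoids this. Below is a minimal sketch on toy data, not a tested solution for the real file: the rows and the extra column names (`seqname`, `feature`) are made up for illustration, `wanted` is a hypothetical helper name, and it assumes `top3` is either a one-column data frame or a plain character vector.

```r
library(dplyr)

# Toy stand-in for the real GTF data frame (values are invented for illustration)
gtf2 <- data.frame(
  seqname = c("chr1", "chr1", "chr2", "chr3"),
  feature = c("exon", "CDS", "exon", "exon"),
  gene    = c("gene_id PRX4", "gene_id PRX4", "gene_id ABC1", "gene_id XYZ2"),
  stringsAsFactors = FALSE
)

# Toy stand-in for the list of genes of interest (assumed: one-column data frame)
top3 <- data.frame(
  gene = c("gene_id PRX4", "gene_id XYZ2"),
  stringsAsFactors = FALSE
)

# Pull the IDs out as a plain vector, whether top3 is a data frame or a vector
wanted <- if (is.data.frame(top3)) top3[[1]] else top3

# %in% yields one TRUE/FALSE per row of gtf2, which is what filter() expects
gtf_filtered <- filter(gtf2, gene %in% wanted)

# Base R alternative without dplyr
gtf_filtered_base <- gtf2[gtf2$gene %in% wanted, ]

print(gtf_filtered)
```

If the IDs in the two files differ in whitespace or quoting, `%in%` will silently match nothing, so it is worth comparing `nrow(gtf_filtered)` and `unique(gtf_filtered$gene)` against the list of genes of interest.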