Intersect two files
4.0 years ago
harry ▴ 40

I have two transcript files: one contains only transcript IDs, and the other contains the transcript ID, gene name, and gene description.

The first file looks like this:

ENST00000009589.8
ENST00000084795.9
ENST00000196551.8
ENST00000202773.13
ENST00000211372.9
ENST00000216146.9
ENST00000222247.10
ENST00000225430.9
ENST00000225655.6

The second file looks like this:

ENST00000387314.1   MT-TF   mitochondrially encoded tRNA-Phe (UUU/C) [Source:HGNC Symbol;Acc:HGNC:7481]
ENST00000389680.2   MT-RNR1 mitochondrially encoded 12S rRNA [Source:HGNC Symbol;Acc:HGNC:7470]
ENST00000387342.1   MT-TV   mitochondrially encoded tRNA-Val (GUN) [Source:HGNC Symbol;Acc:HGNC:7500]
ENST00000387347.2   MT-RNR2 mitochondrially encoded 16S rRNA [Source:HGNC Symbol;Acc:HGNC:7471]
ENST00000386347.1   MT-TL1  mitochondrially encoded tRNA-Leu (UUA/G) 1 [Source:HGNC Symbol;Acc:HGNC:7490]
ENST00000361390.2   MT-ND1  mitochondrially encoded NADH:ubiquinone oxidoreductase core subunit 1 [Source:HGNC Symbol;Acc:HGNC:7455]
ENST00000387365.1   MT-TI   mitochondrially encoded tRNA-Ile (AUU/C) [Source:HGNC Symbol;Acc:HGNC:7488]
ENST00000387372.1   MT-TQ   mitochondrially encoded tRNA-Gln (CAA/G) [Source:HGNC Symbol;Acc:HGNC:7495]
ENST00000387377.1   MT-TM   mitochondrially encoded tRNA-Met (AUA/G) [Source:HGNC Symbol;Acc:HGNC:7492]

The second file was downloaded from BioMart (Ensembl). I want to intersect the first file with the second so that I can get the gene names for the transcripts in my first file.

Thanks in advance.

intersect transcript_id • 1.5k views

Thanks for your valuable suggestion.

4.0 years ago

Quick and dirty: use grep -f file1 file2

The right way: use sort to sort both files on the Ensembl ID, and then join to get the intersection.
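
The sort-and-join route can be sketched roughly as follows. This assumes the annotation file is tab-delimited with the transcript ID in the first column; the file names (ids.txt, annot.tsv) and their contents are made up for illustration.

```shell
# Toy inputs (hypothetical file names, shortened descriptions)
printf 'ENST00000387347.2\nENST00000009589.8\nENST00000387314.1\n' > ids.txt
printf 'ENST00000387314.1\tMT-TF\tmitochondrially encoded tRNA-Phe\nENST00000389680.2\tMT-RNR1\tmitochondrially encoded 12S rRNA\nENST00000387347.2\tMT-RNR2\tmitochondrially encoded 16S rRNA\n' > annot.tsv

# join expects both inputs sorted on the join field with the same collation
export LC_ALL=C
TAB="$(printf '\t')"
sort ids.txt > ids.sorted.txt
sort -t "$TAB" -k1,1 annot.tsv > annot.sorted.tsv

# Join on column 1 of both files, using tab as the field separator
join -t "$TAB" ids.sorted.txt annot.sorted.tsv > matched.tsv
cat matched.tsv
```

Only the IDs present in both files come out, each followed by its gene name and description. Setting the tab separator keeps descriptions that contain spaces in one field.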


A simple modification of Pierre's suggestion, building the proper pattern before the query, would be more than enough:

perl -lne 'print "^$_\\s"' file1.txt | grep -f - file2.txt
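
To see why anchoring the pattern matters, here is a small sketch with made-up data (a.txt, b.txt and the FAKE1 row are hypothetical). It builds the same kind of pattern with sed and the POSIX class [[:space:]] in place of \s, which is an assumption made for portability rather than part of the command above.

```shell
# One real-looking ID, plus an invented longer ID that contains it
printf 'ENST00000387314.1\n' > a.txt
printf 'ENST00000387314.1\tMT-TF\nENST00000387314.10\tFAKE1\n' > b.txt

# Unanchored: each ID is a regex that may match anywhere, so both rows match
grep -f a.txt b.txt > loose.txt

# Anchored: ID must start the line and be followed by whitespace
sed 's/^/^/; s/$/[[:space:]]/' a.txt | grep -f - b.txt > strict.txt

wc -l loose.txt strict.txt
```

The unanchored search returns both rows, while the anchored one keeps only the exact ID.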
4.0 years ago
abedkurdi10 ▴ 190

Or you can use the dplyr R package to do the intersection.

4.0 years ago

Base R works as well:

joint = df2[df2$ens_id %in% df1$ens_id,]
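
For comparison with the shell answers in this thread, the same membership filter can be sketched in awk. This assumes tab-delimited input with the ID in column 1; id_list.txt and annotation.tsv are made-up example files.

```shell
# Toy inputs (hypothetical file names)
printf 'ENST00000387347.2\nENST00000387314.1\n' > id_list.txt
printf 'ENST00000387314.1\tMT-TF\nENST00000389680.2\tMT-RNR1\nENST00000387347.2\tMT-RNR2\n' > annotation.tsv

# While reading the first file (NR==FNR), remember every ID.
# For the second file, print rows whose first column is a remembered ID.
awk -F '\t' 'NR==FNR { keep[$1]; next } $1 in keep' id_list.txt annotation.tsv > kept.tsv
cat kept.tsv
```

Like the %in% version, this keeps the rows in the order of the annotation file and needs no sorting.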