How to calculate the number of proteins present in each chromosome?
7.9 years ago
Naresh ▴ 60

Hi all, I have a bacterial species (Gimesia maris) with 5865 protein sequences spread across 14 chromosomes.

I would like to know how many proteins are present on each chromosome.

Kindly help me.

R genome • 1.7k views
7.9 years ago
apa@stowers ▴ 600

You need to specify what format your data are in, because what to do depends entirely on that.

I'll bet you have, or can get, a BED file of protein-coding genes? On the command line:

cut -f1 proteins.bed | sort -u | uniq -c


I think you mean:

cut -f1 proteins.bed | sort | uniq -c

sort -u already removes duplicates, so uniq has nothing left to count afterwards: every chromosome would be reported with a count of 1.
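To make the difference concrete, here is a minimal sketch that runs both pipelines on a tiny made-up BED file. The chromosome names and coordinates are hypothetical, and it assumes one line per protein-coding gene with the chromosome name in the first tab-separated column.

```shell
# Hypothetical BED file: 3 genes on chr1, 1 gene on chr2 (all values made up).
bed=$(mktemp)
printf 'chr1\t100\t400\nchr2\t50\t350\nchr1\t900\t1500\nchr1\t2000\t2600\n' > "$bed"

# sort -u collapses duplicate names first, so uniq -c sees each chromosome once.
wrong=$(cut -f1 "$bed" | sort -u | uniq -c | awk '{print $2 "=" $1}' | paste -sd' ' -)

# Plain sort only groups identical names together, so uniq -c counts the rows.
right=$(cut -f1 "$bed" | sort | uniq -c | awk '{print $2 "=" $1}' | paste -sd' ' -)

echo "sort -u | uniq -c: $wrong"
echo "sort | uniq -c:    $right"
rm -f "$bed"
```

On this toy input the first pipeline gives chr1=1 chr2=1 and the second gives chr1=3 chr2=1, which is the per-chromosome count you are after. The awk and paste steps are only there to print each result on one line; they are not needed on real data.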


You are right, and I consistently make that mistake on the actual command line too! Somehow "sort" always gets "-u", and then I have to go re-run my line... Ahh motor cortex...
