Get The Total Read Count From A Collapsed File
11.3 years ago
liran0921 ▴ 150

Hi,

After adapter trimming, I collapsed my total reads to the following fasta format:

>1-5832000 
AGGGCCCTTT

>2-124000
GGCCCTTTT

But now I want to get the total number of counts, not just the number of unique reads. For example, for the above reads, I want the total (5832000 + 124000 = 5956000), not 2.

Could anybody tell me how to do it? Thanks.

Ran

read counts • 2.7k views
11.3 years ago
Rm 8.3k

Assuming the header and the sequence are on one line with a tab in between, as above:

cut -f 1 inputfile.txt | awk -F"-" '{ SUM += $2 } END { print SUM }'

EDIT: If it is a FASTA file like this:

 >1-5832000 
 AGGGCCCTTT
 >2-124000 
 GGCCCTTTT

Try this:

awk -F"-" '/>/{ SUM += $2} END { print SUM }' input.fasta
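For anyone wondering how the one-liner works, here is a minimal sketch on a tiny test file. The file name `sample_collapsed.fasta` is made up for illustration, and the sketch assumes every header has the form `>ID-COUNT` with no other hyphen in it. It anchors the pattern with `^` and uses `printf` instead of `print`, which are small variations on the command above, not a different method.

```shell
# Build a small sample file matching the format in the question
# (hypothetical file name; header is ">ID-COUNT", sequence on the next line).
printf '>1-5832000\nAGGGCCCTTT\n>2-124000\nGGCCCTTTT\n' > sample_collapsed.fasta

# -F"-"  : split each line on "-", so on a header $1 is ">1" and $2 is "5832000"
# /^>/   : run the action only on header lines, skipping the sequence lines
# sum += : add the count from each header to a running total
# END    : after the last line, print the total as a plain integer
total=$(awk -F"-" '/^>/ { sum += $2 } END { printf "%.0f\n", sum }' sample_collapsed.fasta)

echo "$total"   # prints 5956000
```

If the sequence IDs could themselves contain a hyphen, `$NF` (the last field) would probably be a safer choice than `$2`.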

Hi Rm, I copied the FASTA file here but it didn't display correctly. It is actually a FASTA file, so >1-5832000 should be on a line of its own, where 1 is the sequence name and 5832000 is the frequency of that sequence. Could you show me how to calculate the total for this format? Thanks a lot!


I have edited my answer.


Thank you! I don't know what it means, but it works!


hehe, like magic ;-)
