3.3 years ago
arshad1292
Hello
I have a Kraken 2 output file (k2report.txt) that looks like this:
What I want to do is extract only the number of reads at the genus level, i.e. the rows with rank code G, for example taxids 2316020, 2719313, 207244, 572511 and so on. I am not interested in D, O, R, etc.
I have a large file with many hundreds of genera. Does anyone have a shell/Python script that I could use to extract only the genus abundance (number of reads) for sample1, sample2 and sample3?
I would really appreciate your help.
Many thanks,
Do you mean you want to subset your matrix where lvl_type == "G"? If yes, then you can use grep "\tG\t" input-file.
Yes, that's correct: I want a subset of the matrix that contains only "G".
I tried your script but it produced nothing...
Actually, I assumed tab (\t) as the field separator. If it is the field separator and it is still not working, you should probably add -P to the grep command. Something like this.
Ok, this one works. Thanks a lot!
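For the Python route asked about in the original post, here is a minimal sketch of the same filter. It assumes the report is tab-separated and that the rank code sits in a field of its own, so a genus row has one field that is exactly G. It matches on the field rather than on a regex, so it does not depend on how grep interprets \t. The function name filter_rank and the demo rows are made up for illustration (only the taxid 2316020 comes from the question); the real column layout of the file may differ.

```python
def filter_rank(lines, rank="G"):
    """Yield the lines that contain a tab-separated field exactly equal to `rank`.

    Assumption: fields are separated by tabs and the rank code occupies its
    own field, so "G1" or a name containing "G" does not match.
    """
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if rank in fields:
            yield line


# Hypothetical demo rows: three read-count columns, rank code, taxid, name.
demo = [
    "120\t98\t143\tD\t2\tdomain_placeholder\n",
    "35\t12\t40\tG\t2316020\tgenus_placeholder\n",
    "7\t3\t9\tG1\t99999\tsubgenus_placeholder\n",
]

for row in filter_rank(demo):
    print(row, end="")
```

To run it on a real file, pass an open file handle instead of the demo list, for example `with open("k2report.txt") as handle: sys.stdout.writelines(filter_rank(handle))` after `import sys`. If the header line is needed in the output, it has to be written separately, since it has no field equal to G.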