Forum:Awk in Bioinformatics
6.3 years ago
Shicheng Guo ★ 9.5k

Here are some examples of using awk in powerful combinations with other commands. I will add more examples over time.

  1. Write columns 4 and 5, one per line, to a file named after columns 1 and 3 (the snpset directory must already exist).

    awk '{print $4"\n"$5 > ("./snpset/" $1 "." $3 ".txt")}' GRCH37.SNP150.bed
    
  2. Split column 2 on "_" and append the record to a file named after the first piece.

    awk '{ split($2, a, "_"); print $1"\t"a[2]"\t"$3 >> (a[1]".txt") }' GRCH37.SNP150.bed
    
  3. NF gives the total number of fields in a record, while NR gives the number of the line currently being processed.

    awk '{print NR,"->",NF}' GRCH37.SNP150.bed

  4. FNR gives the line number within the current file, while NR keeps counting across all input files. FILENAME gives the name of the current file.

    awk '{print FILENAME, FNR, NR;}' hg19.snp150.bed hg38.snp150.bed

  5. Use columns 1, 4 and 8 as parameters for plink and submit each command as a PBS job (rows where column 8 is "." are skipped).

    awk '$8!="." {cmd="echo \"plink --bfile ~/1000Genome/"$1" --ld "$4" "$8" --out ./LD/"$4"."$8".r2\" | qsub -N "$4"."$8" -e ./temp/ -o ./temp/"; system(cmd)}' hg19.DMR.bed

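  6. Combine join, sort, uniq and awk (bash syntax). The two files are joined on column 1 of input.txt and column 2 of ref2.txt. For each distinct combination of the first five fields, awk then sums column 6 over the rows where column 7 lies between columns 2 and 3; other rows add 0.

    join -t $'\t' -1 1 -2 2 <(sort -t $'\t' -k1,1 input.txt) <(sort -t $'\t' -k2,2 ref2.txt) | uniq | awk -F '\t' '{line=sprintf("%s\t%s\t%s\t%s\t%s",$1,$2,$3,$4,$5);if($7>=$2 && $7<=$3) {a[line]+=int($6);} else {a[line]+=0;}} END {for(line in a) printf("%s\t%d\n",line,a[line]);}'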

  7. Use several field separators at once (here three: "=", space and "D").

    grep R-sq *log | awk -F'[= D]' '$5>0.1{print}'
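
To make item 6 easier to follow, here is a minimal sketch on made-up data. The file contents are hypothetical (two intervals in input.txt, five count/position rows in ref2.txt), and the sorted inputs go through temporary files instead of process substitution so that it also runs in a plain POSIX shell.

```shell
# Toy data (hypothetical): intervals keyed by column 1, counts keyed by column 2.
tmp=$(mktemp -d)
tab=$(printf '\t')
printf 'chr1\t100\t200\tgeneA\t+\nchr2\t300\t400\tgeneB\t-\n' > "$tmp/input.txt"
printf '3\tchr1\t150\n2\tchr1\t180\n5\tchr1\t250\n4\tchr2\t350\n7\tchr2\t500\n' > "$tmp/ref2.txt"

# join needs both inputs sorted on their join field.
LC_ALL=C sort -t "$tab" -k1,1 "$tmp/input.txt" > "$tmp/a.sorted"
LC_ALL=C sort -t "$tab" -k2,2 "$tmp/ref2.txt" > "$tmp/b.sorted"

# After the join: $1-$5 come from input.txt, $6 is the count, $7 the position.
result=$(LC_ALL=C join -t "$tab" -1 1 -2 2 "$tmp/a.sorted" "$tmp/b.sorted" | uniq |
  awk -F '\t' '
    { key = $1 FS $2 FS $3 FS $4 FS $5        # the five interval columns
      if ($7 >= $2 && $7 <= $3) sum[key] += int($6)
      else sum[key] += 0 }                     # keep intervals with no hits
    END { for (key in sum) printf "%s\t%d\n", key, sum[key] }' | LC_ALL=C sort)
printf '%s\n' "$result"
rm -rf "$tmp"
```

With this toy input, geneA collects 3 + 2 = 5 (position 250 falls outside 100-200) and geneB collects 4. The final sort is only there because the order of `for (key in sum)` is unspecified in awk.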
    
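
The difference between NR and FNR in items 3 and 4 is easiest to see on two small files. The following is a sketch with hypothetical file names (a.bed, b.bed) and made-up contents, not the SNP files used above.

```shell
# Two toy files (hypothetical): a.bed has 2 lines of 3 fields, b.bed has 1 line of 4 fields.
tmp=$(mktemp -d)
printf 'chr1\t100\trs1\nchr1\t200\trs2\n' > "$tmp/a.bed"
printf 'chr2\t300\trs3\textra\n' > "$tmp/b.bed"

# Run inside the temp directory so FILENAME prints without a path prefix.
out=$(cd "$tmp" && awk '{ print FILENAME, FNR, NR, NF }' a.bed b.bed)
printf '%s\n' "$out"
# a.bed 1 1 3
# a.bed 2 2 3
# b.bed 1 3 4
rm -rf "$tmp"
```

FNR restarts at 1 for b.bed while NR continues at 3. This is why the condition `NR==FNR` is commonly used to select lines from the first input file only.
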
perl awk shell • 5.3k views

Some of these commands are very specific, and the descriptions don’t really explain what they do or why they are useful.

If you wish to post useful commands, I would suggest contributing to one of the existing threads, for example here. These kinds of compendiums work best when they are not spread out all over the place.

Also, why have you tagged Perl?


Hi Shicheng Guo,

These commands are potentially useful, but as they lack information (and are for very specific use cases), users will not find your post when they need it. Perhaps you should consider getting your own blog (e.g. Wordpress) and explaining these commands in more detail in a series of posts. "Awk in bioinformatics" already sounds like a good title; perhaps "unix commands in bioinformatics" would broaden your scope.

You labelled this as a "Forum" - which is generally a post type for "a topic for discussion for which no definite answers exist". It's not exactly a "Tutorial" either, since you are not really teaching anything, just showing a couple of commands. It is for example entirely unclear what your sixth command does and why anyone would use it.

We value all contributions to biostars, but right now, this mostly looks like you are trying to get some upvotes.

Cheers,
Wouter


Yes. I just recorded it for myself, and I am happy if it helps others. If not, I am sure it does no harm to others, right?

6.2 years ago
Batu ▴ 290

I can refer you to this link, which includes more examples and other cases: Useful bash one-liners for bioinformatics.
