Add a blank line after a particular line
9.5 years ago
amoltej ▴ 100

Hi,

I have a file as follows:

Mp1087439_TGAC_V1.1_scaffold_1    exon    51615    51678    gene_id "3_g"; transcript_id "3_t";
Mp1087439_TGAC_V1.1_scaffold_1    exon    53627    53777    gene_id "3_g"; transcript_id "3_t";
Mp1087439_TGAC_V1.1_scaffold_1    exon    56113    56171    gene_id "3_g"; transcript_id "3_t";
Mp1087439_TGAC_V1.1_scaffold_1    exon    61779    61841    gene_id "4_g"; transcript_id "4_t";
Mp1087439_TGAC_V1.1_scaffold_1    exon    61942    62137    gene_id "4_g"; transcript_id "4_t";
Mp1087439_TGAC_V1.1_scaffold_1    exon    62322    62513    gene_id "4_g"; transcript_id "4_t";
Mp1087439_TGAC_V1.1_scaffold_1    exon    62596    62762    gene_id "4_g"; transcript_id "4_t";
Mp1087439_TGAC_V1.1_scaffold_1    exon    63136    63331    gene_id "4_g"; transcript_id "4_t";
Mp1087439_TGAC_V1.1_scaffold_1    exon    73319    73368    gene_id "5_g"; transcript_id "5_t";
Mp1087439_TGAC_V1.1_scaffold_1    exon    75842    76266    gene_id "5_g"; transcript_id "5_t";
Mp1087439_TGAC_V1.1_scaffold_1    exon    76572    76766    gene_id "5_g"; transcript_id "5_t";
Mp1087439_TGAC_V1.1_scaffold_1    exon    77576    77751    gene_id "5_g"; transcript_id "5_t";
Mp1087439_TGAC_V1.1_scaffold_1    exon    103158    103301    gene_id "6_g"; transcript_id "6_t";
Mp1087439_TGAC_V1.1_scaffold_1    exon    103375    103576    gene_id "6_g"; transcript_id "6_t";

I want to add a blank line whenever the gene_id changes, so that the output looks like this:

Mp1087439_TGAC_V1.1_scaffold_1    exon    51615    51678    gene_id "3_g"; transcript_id "3_t";
Mp1087439_TGAC_V1.1_scaffold_1    exon    53627    53777    gene_id "3_g"; transcript_id "3_t";
Mp1087439_TGAC_V1.1_scaffold_1    exon    56113    56171    gene_id "3_g"; transcript_id "3_t";

Mp1087439_TGAC_V1.1_scaffold_1    exon    61779    61841    gene_id "4_g"; transcript_id "4_t";
Mp1087439_TGAC_V1.1_scaffold_1    exon    61942    62137    gene_id "4_g"; transcript_id "4_t";
Mp1087439_TGAC_V1.1_scaffold_1    exon    62322    62513    gene_id "4_g"; transcript_id "4_t";
Mp1087439_TGAC_V1.1_scaffold_1    exon    62596    62762    gene_id "4_g"; transcript_id "4_t";
Mp1087439_TGAC_V1.1_scaffold_1    exon    63136    63331    gene_id "4_g"; transcript_id "4_t";

Mp1087439_TGAC_V1.1_scaffold_1    exon    73319    73368    gene_id "5_g"; transcript_id "5_t";
Mp1087439_TGAC_V1.1_scaffold_1    exon    75842    76266    gene_id "5_g"; transcript_id "5_t";
Mp1087439_TGAC_V1.1_scaffold_1    exon    76572    76766    gene_id "5_g"; transcript_id "5_t";
Mp1087439_TGAC_V1.1_scaffold_1    exon    77576    77751    gene_id "5_g"; transcript_id "5_t";

Mp1087439_TGAC_V1.1_scaffold_1    exon    103158    103301    gene_id "6_g"; transcript_id "6_t";
Mp1087439_TGAC_V1.1_scaffold_1    exon    103375    103576    gene_id "6_g"; transcript_id "6_t";

Can somebody please help me?

Thanks

text-formatting awk sed • 1.8k views
9.5 years ago
iraun 6.2k

This awk command should work, assuming the file is tab-separated so that the attributes sit in field 5:

awk -F'\t' '{split($5, a, ";")} NR>1 && a[1]!=prevGen {print ""} {print; prevGen=a[1]}' file
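If the columns turn out to be separated by spaces rather than tabs, a variant that does not depend on the field separator may help. The sketch below pulls the `gene_id "..."` attribute out of each line with awk's `match()` and prints an empty line whenever it differs from the previous line's. The function name `add_gene_breaks` and the temporary sample file are made up for illustration, and it assumes every line carries a `gene_id` attribute.

```shell
# Hypothetical helper: insert a blank line wherever the gene_id value changes.
add_gene_breaks() {
    awk '
        {
            # Find gene_id "<value>" anywhere on the line, whatever the separator.
            id = ""
            if (match($0, /gene_id "[^"]*"/))
                id = substr($0, RSTART, RLENGTH)
        }
        NR > 1 && id != prev { print "" }   # gene changed: print the separator first
        { print; prev = id }                # then the line itself
    ' "$1"
}

# Tiny made-up sample: two exons of gene 3_g followed by one exon of gene 4_g.
sample=$(mktemp)
printf '%s\n' \
    'scaffold_1    exon    51615    51678    gene_id "3_g"; transcript_id "3_t";' \
    'scaffold_1    exon    53627    53777    gene_id "3_g"; transcript_id "3_t";' \
    'scaffold_1    exon    61779    61841    gene_id "4_g"; transcript_id "4_t";' \
    > "$sample"

add_gene_breaks "$sample"
```

On a real file this would be called as `add_gene_breaks file > file_with_breaks`. Because the separator is only printed when `NR > 1`, no blank line appears before the first record.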

Thank you so much. This is much faster than the other answer.

9.5 years ago
5heikki 11k

A rather graceless way: it assumes a tab-separated file and adds a blank line at the top.

TEST="empty"; while IFS=$'\t' read -r NAME TYPE START END GENEID TRANSCRIPTID; do if [ "$TEST" != "$GENEID" ]; then printf '\n'; TEST=$GENEID; fi; printf '%s\t%s\t%s\t%s\t%s\t%s\n' "$NAME" "$TYPE" "$START" "$END" "$GENEID" "$TRANSCRIPTID"; done < inputFile
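The leading blank line can be avoided by remembering whether any line has been printed yet. The following is only a sketch of that idea in plain POSIX shell: the function name `split_by_gene` is hypothetical, it reads tab-separated lines on standard input, and it assumes the attributes start in field 5 with `gene_id` as the first `;`-terminated entry.

```shell
# Hypothetical helper: reads tab-separated lines on stdin and prints a blank
# line between gene groups, but never before the first line.
split_by_gene() {
    tab=$(printf '\t')
    prev=""
    first=1
    # "|| [ -n "$name" ]" keeps a final line that lacks a trailing newline.
    while IFS="$tab" read -r name type start end attrs || [ -n "$name" ]; do
        gene=${attrs%%;*}        # text before the first ';', e.g. gene_id "3_g"
        if [ "$first" -eq 0 ] && [ "$gene" != "$prev" ]; then
            printf '\n'          # separator only between groups
        fi
        printf '%s\t%s\t%s\t%s\t%s\n' "$name" "$type" "$start" "$end" "$attrs"
        prev=$gene
        first=0
    done
}

# Made-up two-line sample: one exon of gene 3_g, one exon of gene 4_g.
printf 'scaffold_1\texon\t51615\t51678\tgene_id "3_g"; transcript_id "3_t";\n' > sample.tsv
printf 'scaffold_1\texon\t61779\t61841\tgene_id "4_g"; transcript_id "4_t";\n' >> sample.tsv

split_by_gene < sample.tsv
```

A shell loop like this is still much slower than awk on large annotation files, so it is mainly useful when the per-line logic needs other shell commands.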

Thank you so much
