Question

How to extract information from the output text file?

18 months ago

sata72 • 0

I have several output files and want to extract some information from them.

sample1.out
sample2.out
sample3.out
..
..

Each sample file contains this information:

  3259390 reads; of these:

   3234126 (99.22%) were paired; of these:
   ----
   292091 pairs aligned concordantly 0 times; of these:
   ----
  59571 pairs aligned 0 times concordantly or discordantly; of these:
  98.90% overall alignment rate

I want the output to contain this information for all the samples, like this:

sample1     sample1     sample1            sample2     sample2     sample2
reads       paired      overall.alignment  reads       paired      overall.alignment
3259390     3234126     98.90              3259398     3234136     98.98

R linux • 1.2k views

You're going to have to write your own code for this, especially since your desired output format is not a straightforward one-row-per-sample table either.

If the line numbers are consistent across files and you need the first word from the 1st, 3rd, and 8th lines, use awk to print the first word of each line whose number (NR) matches one of those three. Then transpose the output so the entries are separated by tabs instead of newlines, and write the header manually.
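A minimal sketch of the line-number approach, assuming every file has exactly the same layout. The function name `first_words_by_line` and the line list passed to it are hypothetical. It uses FNR (the per-file line number) rather than NR so that several files can be read in one call, and it writes each file's values directly as one tab-separated row instead of transposing afterwards.

```shell
# Hypothetical helper: print the first word of the chosen lines,
# one tab-separated row per input file.
# Usage: first_words_by_line "1,3,8" sample*.out
first_words_by_line() {
    lines="$1"; shift
    awk -v lines="$lines" '
        # Turn the comma-separated list into a lookup table of wanted line numbers.
        BEGIN { n = split(lines, a, ","); for (i = 1; i <= n; i++) want[a[i]] = 1 }
        FNR == 1 && NR > 1 { print "" }        # close the previous file row
        FNR == 1 { printf "%s", FILENAME }     # start a new row with the file name
        (FNR in want) { printf "\t%s", $1 }    # append the first word of a wanted line
        END { if (NR > 0) print "" }           # close the last row
    ' "$@"
}
```

The header still has to be written by hand, and the line numbers must be checked against a real file, since the sample in the question elides some lines.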

18 months ago by Ram 45k

Looks like this is bowtie* output? You may be able to run MultiQC on these files to summarize them.

18 months ago by GenoMax 151k

For the sake of future reports, when you get 100 or maybe 1000 samples: flip the columns and rows, and never mix numerical values with labels such as "reads" or "paired" in a column.

Sample names as row names, the specific values as columns.

It is trivial to sort by column values; it is far less so if you want to sort values in a row, but only from selected columns. No need to use a shotgun for a foot self-amputation, methinks.
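A minimal sketch of the one-row-per-sample layout, assuming the files contain the bowtie-style lines shown in the question. The function name `bowtie_summary_table` and the column names are hypothetical. Lines are matched by their text rather than their position, and a value that is not found is reported as NA.

```shell
# Hypothetical helper: build a tab-separated table with one row per sample.
# Usage: bowtie_summary_table sample*.out > summary.tsv
bowtie_summary_table() {
    awk '
        # Print the row collected for the previous file, if there is one.
        function emit_row() { if (name != "") print name, reads, paired, rate }

        BEGIN { OFS = "\t"; print "sample", "reads", "paired", "overall_alignment" }

        FNR == 1 {
            emit_row()
            name = FILENAME
            sub(/^.*\//, "", name)      # drop any directory part
            sub(/\.out$/, "", name)     # drop the .out extension
            reads = paired = rate = "NA"
        }
        /reads; of these/        { reads = $1 }
        /\) were paired/         { paired = $1 }
        /overall alignment rate/ { rate = $1; sub(/%$/, "", rate) }   # strip the % sign

        END { emit_row() }
    ' "$@"
}
```

With the values kept numeric and in columns, the result can be sorted directly, for example with `sort -t "$(printf '\t')" -k4,4n` on the rows below the header.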

18 months ago by Darked89 4.7k

Good idea! thanks

18 months ago by sata72 • 0

score 3 · Accepted Answer · 2023-11-07

18 months ago

Pierre Lindenbaum 166k

Learn awk.

Use something like:

 awk '/reads; of these/ {print FILENAME,"reads",$1} /\) were paired/ {print FILENAME,"paired",$1; print FILENAME,"percent",$2}' sample.txt

to convert the file to a tabular format.
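If the wide layout from the question is still wanted, the long "file key value" lines printed by a command like the one above can be pivoted in a second pass. This is only a sketch: the function name `long_to_wide` is hypothetical, and it assumes space-separated input with no spaces in the file names.

```shell
# Hypothetical helper: read "file key value" lines on stdin and print
# one tab-separated row per file, with one column per key.
long_to_wide() {
    awk '
        # Remember files and keys in the order they first appear.
        !($1 in seenfile) { seenfile[$1] = 1; files[++nf] = $1 }
        !($2 in seenkey)  { seenkey[$2] = 1;  keys[++nk] = $2 }
        { val[$1 SUBSEP $2] = $3 }
        END {
            printf "sample"
            for (k = 1; k <= nk; k++) printf "\t%s", keys[k]
            print ""
            for (f = 1; f <= nf; f++) {
                printf "%s", files[f]
                for (k = 1; k <= nk; k++) {
                    key = files[f] SUBSEP keys[k]
                    printf "\t%s", (key in val ? val[key] : "NA")   # NA when a key is missing
                }
                print ""
            }
        }
    '
}
```

Running the awk command over `sample*.out` and piping its output into `long_to_wide` would give one row per file. A further pattern for the "overall alignment rate" line would be needed to include that value.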

it works, thanks

18 months ago by sata72 • 0