Hi,
I have many .xls files containing my read counts.
I need to combine these files, and while doing so I must remove every row of each file that contains "exon", remove columns 1, 2, 3, 5, 6 and 8, and remove "_mRNA" from the end of each value in column 4.
How can I do that in the terminal?
I think doing this manually would take too much time and would not be accurate enough.
I don't think you can easily read xls files on the command line with basic Unix commands. Try exporting your xls files to csv first.
You can find many threads on the web about what you want to do, but awk is probably the best tool for the job. Take a look at these links:
https://stackoverflow.com/questions/34682182/delete-rows-of-csv-file-based-on-the-value-of-a-column
http://www.tim-dennis.com/data/tech/2016/08/09/using-awk-filter-rows.html
https://unix.stackexchange.com/questions/97070/filter-a-csv-file-based-on-the-5th-column-values-of-a-file-and-print-those-reco
https://stackoverflow.com/questions/44441839/filter-csv-list-with-awk
https://stackoverflow.com/questions/17001849/awk-partly-string-match-if-column-word-partly-matches
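To make the awk suggestion concrete, here is a minimal sketch of the three steps from the question (drop rows containing "exon", drop columns 1, 2, 3, 5, 6 and 8, strip a trailing "_mRNA" from column 4). It assumes the files have already been exported to plain comma-separated text with no quoted fields that contain commas. The function name `filter_counts` and the file names in the usage comment are hypothetical, and you may need to adapt it once you know what your real data looks like.

```shell
#!/bin/sh
# Sketch only. Assumptions: input is simple comma-separated text,
# no quoted fields with embedded commas, and no header handling.
# filter_counts is a hypothetical helper name, not a standard tool.

filter_counts() {
    awk -F',' -v OFS=',' '
        /exon/ { next }              # skip every row that contains "exon"
        {
            sub(/_mRNA$/, "", $4)    # strip a trailing _mRNA from column 4
            out = ""
            sep = ""
            for (i = 1; i <= NF; i++) {
                # drop columns 1, 2, 3, 5, 6 and 8; keep all the others
                if (i == 1 || i == 2 || i == 3 || i == 5 || i == 6 || i == 8)
                    continue
                out = out sep $i
                sep = OFS
            }
            print out
        }
    ' "$@"
}

# Usage (hypothetical file names): awk reads the files one after another,
# so the output is already the combined table.
#   filter_counts counts_*.csv > combined.csv
```

Note that the "exon" test matches anywhere in the row, as the question describes. If it should only apply to one column, test that field instead (for example `$3 ~ /exon/`). If your files are tab-separated, change the two separators to `'\t'`.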
Good description of the data. Please post example data and the expected output.