Merging specific columns from different txt files in a unit file
6.5 years ago

Hi,

I counted reads in each BAM file with featureCounts, and now I have many count .txt files. How can I merge the 7th column of each txt file, plus the first column (gene ID) from one of them, into a single txt file on macOS? I tried the following and got a usage message:

$ join 7 counts24.txt counts25.txt counts26.txt counts27.txt counts28.txt counts29.txt counts30.txt counts31.txt counts32.txt counts33.txt counts34.txt counts35.txt counts36.txt counts37.txt counts38.txt counts43.txt counts44.txt counts45.txt counts46.txt counts47.txt counts48.txt counts49.txt counts50.txt counts51.txt counts52.txt counts53.txt counts54.txt counts55.txt counts56.txt counts57.txt > out.txt
usage: join [-a fileno | -v fileno ] [-e string] [-1 field] [-2 field]
            [-o list] [-t char] file1 file2
software • 3.0k views

A simple search turns up multiple hits on how to run featureCounts on multiple BAM files and generate a single expression matrix covering all samples.

combining quantification (featureCounts) result files into a single dataset

https://support.bioconductor.org/p/64932/

Finally, read the featureCounts manual first. It supports multiple BAM files. There is no harm in reading a software manual; they are written to help you use the tool effectively.


Why is this a "software error"? Please use sensible tags.

What have you tried to solve this issue? Users will help you sooner if you show some effort yourself.


Sorry, I googled and tried the code above, which gave me the error. That is why I tagged it with software error.


Can you provide the first few lines of any two files?

6.5 years ago
venu 7.1k

FYI, featureCounts accepts many bam files at once and generates one count table for all BAMs.

Regarding your error, that is not the proper way to use join: as the usage message shows, it takes exactly two files (file1 file2). I expect the order of genes to be the same in all counts.txt files from featureCounts, so you can simply do paste *counts.txt after a little preprocessing (i.e. keep only the gene_id and counts columns in each file).
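
A rough sketch of that preprocess-then-paste idea, in case it helps. It assumes the files are tab-separated with the gene id in column 1 and the counts in column 7, that every file lists the genes in the same order, and that any leading "#" comment line (featureCounts output usually starts with one) should be dropped. The function name merge_counts is made up for illustration:

```shell
# Hypothetical helper: merge_counts OUT FILE1 [FILE2 ...]
# Writes the gene id column of FILE1, followed by column 7 of every FILE.
# Assumes tab-separated input with identical gene order in all files.
merge_counts() {
    out=$1
    shift
    tmp="$out.tmp"
    # Gene ids (column 1) are taken from the first input file only.
    grep -v '^#' "$1" | cut -f1 > "$out"
    for f in "$@"; do
        # Append this file's 7th column to the right of what we have so far.
        grep -v '^#' "$f" | cut -f7 | paste "$out" - > "$tmp"
        cp "$tmp" "$out"
    done
    rm -f "$tmp"
}
```

Something like merge_counts out.txt counts*.txt should then work, as long as out.txt itself does not match the glob. Since paste only lines rows up by position, it is worth checking that all files have the same number of lines before trusting the result.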

6.5 years ago

One way is to paste together a series of process substitutions, each of which cuts the desired column from its file:

$ paste <(cut -f1 somegeneid.txt) <(cut -f7 counts24.txt) <(cut -f7 counts25.txt) ... <(cut -f7 counts57.txt) > out.txt

Fill in ... with the rest of the substitutions for counts26.txt through counts56.txt. A script could programmatically generate and run this command for you if your files have a consistent naming scheme.
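
As a minimal sketch of such a generator: the function below only builds the command text from a list of file names, so you can inspect it before running it. The name build_paste_cmd is hypothetical, and it assumes file names contain no single quotes or other characters that would need extra escaping:

```shell
# Hypothetical helper: build_paste_cmd GENE_FILE COUNT_FILE...
# Prints a paste command that takes column 1 from GENE_FILE and column 7
# from each COUNT_FILE. It does not run anything itself.
build_paste_cmd() {
    gene_file=$1
    shift
    cmd="paste <(cut -f1 '$gene_file')"
    for f in "$@"; do
        # One process substitution per count file, in the order given.
        cmd="$cmd <(cut -f7 '$f')"
    done
    printf '%s\n' "$cmd"
}
```

In bash (process substitution is not available in plain sh), the generated command could then be run with eval "$(build_paste_cmd somegeneid.txt counts*.txt)" > out.txt. The glob expands in lexicographic order, which matches numeric order here only because all the file numbers have two digits.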
