Hi there,
I have run an alignment by using STAR by using the --readFilesManifest option since I have several samples and I wanted to run it at once:
STAR --runThreadN 20 --genomeDir /path/to/the/folder/ --readFilesManifest /path/to/the/file/samples.txt
Like this I have an unique .bam file with all the samples inside that are now aligned with my reference genome.
The .txt file is a tab limited file with all the samples that belong to 4 different groups .It looks like:
bm-R1_1.fp.fq.gz bm-R1_2.rp.fq.gz bm
bm-R2_1.fp.fq.gz bm-R2_2.rp.fq.gz bm
bm-R3_1.fp.fq.gz bm-R3_2.rp.fq.gz bm
dpi8-R1_1.fp.fq.gz dpi8-R1_2.rp.fq.gz dpi8
dpi8-R2_1.fp.fq.gz dpi8-R2_2.rp.fq.gz dpi8
dpi8-R3_1.fp.fq.gz dpi8-R3_2.rp.fq.gz dpi8
dpi12-R1_1.fp.fq.gz dpi12-R1_2.rp.fq.gz dpi12
dpi12-R2_1.fp.fq.gz dpi12-R2_2.rp.fq.gz dpi12
dpi12-R3_1.fp.fq.gz dpi12-R3_2.rp.fq.gz dpi12
dpi22-R1_1.fp.fq.gz dpi22-R1_2.rp.fq.gz dpi22
dpi22-R2_1.fp.fq.gz dpi22-R2_2.rp.fq.gz dpi22
dpi22-R3_1.fp.fq.gz dpi22-R3_2.rp.fq.gz dpi22
c1_1.fp.fq.gz c1_2.rp.fq.gz control
c2_1.fp.fq.gz c2_2.rp.fq.gz control
c3_1.fp.fq.gz c3_2.rp.fq.gz control
c4_1.fp.fq.gz c4_2.rp.fq.gz control
c5_1.fp.fq.gz c5_2.rp.fq.gz control
c6_1.fp.fq.gz c6_2.rp.fq.gz control
c7_1.fp.fq.gz c7_2.rp.fq.gz control
c8_1.fp.fq.gz c8_2.rp.fq.gz control
c9_1.fp.fq.gz c9_2.rp.fq.gz control
c10_1.fp.fq.gz c10_2.rp.fq.gz control
c11_1.fp.fq.gz c11_2.rp.fq.gz control
c12_1.fp.fq.gz c12_2.rp.fq.gz control
After some quality analyses I have run Feature Counts with the command:
featureCounts -a /path/to/the/folder/genome_reference/genomic.gtf -o Table.count sorted_aligment.bam
The output is:
head Table.count
Geneid Chr Start End Strand Length /Analysis/sorted_aligment.bam
gene-CpipJ_CPIJ020271 NW_001889829.1;NW_001889829.1;NW_001889829.1 7980;8477;12299 7992;8702;12311 +;+;+ 252 0
gene-CpipJ_CPIJ020269 NW_001889825.1;NW_001889825.1;NW_001889825.1;NW_001889825.1;NW_001889825.1;NW_001889825.1;NW_001889825.1 2864;3147;11343;11611;11814;12144;12472 3065;3288;11540;11799;12078;12255;12515 +;+;+;+;+;+;+ 1152 72680
gene-CpipJ_CPIJ039441 NW_001889824.1 242 324 + 83 0
gene-CpipJ_CPIJ039442 NW_001889824.1 976 1048 - 73 0
gene-CpipJ_CPIJ020194 NW_001889821.1 14098 14682 + 585 2952
gene-CpipJ_CPIJ020082 NW_001889820.1;NW_001889820.1 8810;11422 8815;11967 +;+ 552 1081
gene-CpipJ_CPIJ020329 NW_001889809.1;NW_001889809.1 492;8246 806;8257 -;- 327 1
gene-CpipJ_CPIJ020172 NW_001889802.1 9947 10045 + 99 0
But I want the expression not only by gene ID but also by sample. I have cut this file from FeatureCounts in order to keep the gene ID and the expression level by:
cut -f1,7-8 Table.count> Table.count_cut
And I get:
Geneid /Analysis/sorted_aligment.bam
gene-CpipJ_CPIJ020271 0
gene-CpipJ_CPIJ020269 72680
gene-CpipJ_CPIJ039441 0
gene-CpipJ_CPIJ039442 0
gene-CpipJ_CPIJ020194 2952
gene-CpipJ_CPIJ020082 1081
gene-CpipJ_CPIJ020329 1
gene-CpipJ_CPIJ020172 0
gene-CpipJ_CPIJ020291 12
gene-CpipJ_CPIJ039423 0
gene-CpipJ_CPIJ020231 24587
I have read there is a way of have all the info I want together https://www.bioconductor.org/help/course-materials/2016/CSAMA/lab-3-rnaseq/rnaseq_gene_CSAMA2016.html#alternative---counting-with-featurecounts-in-rsubread but I don't understand very well how. I am sure there is a very simple way of doing but I don't find how.
Could anyone help me in including the info from the file sample.txt in the expression Table from FeatureCounts?? Like this I will see the expression of each gene in each sample and I could compare the expression level depending on the moment the sample was taken.
Any help it would be more than welcome.
Thank you so much.
Feed all of your BAM files to
featureCounts
at one time to get a matrix of read counts. Tip: Provide BAM file names in the same order as you want the columns to be (groups etc) to not have to manipulate them later.Thank you for your answer.
The thing is that I have an only BAM file. I run STAR and created an unique BAM file with all the samples inside. Like this I saved time :) So I don't have these BAM file you refer to.
Only way to recognize your individual samples would be to look at the
RG
tag. It appears that you did not run your alignment with the necessary STAR option. So you may have to re-do these alignments.