Hi everyone, I'm a newbie in R and RNA-seq analysis, so please don't judge me too harshly. I'm trying to perform an RNA-seq analysis following this tutorial: http://combine-australia.github.io/RNAseq-R/06-rnaseq-day1.html . I was given a bunch of bam/bam.bai files (C33A and HeLa), 3 scrambled and 3 shoct4 files for each gene, so as a first step I need to compute counts. Here is the first question: should I run featureCounts on all bam files of one type of genes together, or process them separately?
Secondly, I tried processing the files both separately and together and got counts txt files that look different from the one in the tutorial: http://combine-australia.github.io/RNAseq-R/06-rnaseq-day1.html
Tutorial count file: My Count file:
Use all files together when you feed them into featureCounts. That way you get a single matrix (genes in rows, samples in columns) that is easier to use in downstream DE tools.
Hi frozmanik,
welcome to Biostars. What you are asking are valid but very basic questions that have been addressed many times before. Therefore, I recommend that you spend quality time reading this excellent tutorial/workflow on RNA-seq analysis, which also covers the questions you raised here. Please use Google and the search function extensively, because many of the common problems you'll encounter when reproducing the steps covered in the tutorial are described at length here on Biostars as well as in other communities and scientific blogs. Please also check How to add images to a Biostars post, and make sure the link you use is indeed correct. The link to the picture in your question seems to be broken. Cheers!
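To make the advice about feeding all BAM files into featureCounts at once more concrete, here is a minimal sketch using the Rsubread package. The helper name count_bams, the bam/ directory in the usage note, the built-in hg38 annotation, and the single-end default are all assumptions for illustration. C33A and HeLa are human cell lines, but the genome build has to match whatever the reads were actually aligned to.

```r
# Sketch only: count_bams is a hypothetical helper, not part of Rsubread.
count_bams <- function(bam_files, genome = "hg38", paired = FALSE) {
  # Fail early if no files were found or a path is wrong.
  stopifnot(length(bam_files) > 0, all(file.exists(bam_files)))

  # One call for all samples returns one matrix:
  # genes in rows, one column per BAM file.
  fc <- Rsubread::featureCounts(
    files = bam_files,
    annot.inbuilt = genome,   # assumption: built-in annotation matches the alignment
    isPairedEnd = paired      # assumption: set TRUE for paired-end libraries
  )
  fc$counts
}
```

With the files in a hypothetical bam/ directory, this could be called as counts <- count_bams(list.files("bam", pattern = "\\.bam$", full.names = TRUE)), which should give one column per sample (scrambled and shoct4 alike) ready for a DE package.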
Hello,
use the following commands on the bam files to create a count matrix after aligning the reads to the genome:
This might work for your specific situation, but because the row numbers in
counts <- counts[-c(54135:54139),]
are hardcoded, it is not generic, and it is also rather complicated for something that can be solved with a one-liner like:
While this answer may be correct on its own, it does not address the question asked by the OP, so I am moving it to a comment.
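On the point about hardcoded row numbers: the one-liner that comment refers to is not shown in the thread, so the following is not it, only a sketch of one generic alternative. It assumes the five dropped rows are summary rows whose names start with "__" (as htseq-count writes at the end of its output); that is a guess, since the original commands are missing. The toy matrix and its values are made up purely for illustration.

```r
# Toy count matrix standing in for the real one (made-up values).
counts <- matrix(
  c(10, 0, 5, 200, 12, 3, 7, 180),
  nrow = 4,
  dimnames = list(
    c("geneA", "geneB", "__no_feature", "__ambiguous"),
    c("sample1", "sample2")
  )
)

# Assumption: rows to drop are identified by a "__" name prefix.
# Filtering by name keeps working if the number of genes changes,
# unlike counts[-c(54135:54139), ].
counts <- counts[!grepl("^__", rownames(counts)), , drop = FALSE]
```

If the rows to remove are identified some other way, only the grepl pattern needs to change; the idea is to select by name rather than by position.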