Dear all,
I am new to this area and am learning how to do differential expression analysis. I have some BAM files from the mapping tool STAR. I now want to continue the analysis with DESeq2, but DESeq2 asks for a count matrix as input. I therefore used htseq-count to convert each BAM file of reads into a text file of counts, with the command
htseq-count -f bam file genome.gtf >file.txt
But the result looks like this when I read it into R with read.table:
> head(df)
V1 V2.x V2.y
1 __alignment_not_unique 2812420 2666754
2 __ambiguous 14080 15004
3 __no_feature 8863085 8241851
4 __not_aligned 0 0
5 __too_low_aQual 0 0
6 ENSG00000000003.14 0 0
I have tried the following code:
condition <-factor("V1","V2.x","V2.y")
df2 <- data.frame(df,condition = condition)
dds <- DESeqDataSetFromMatrix(df2, DataFrame(condition), ~ condition)
But it doesn't work, and it gives me this error:
Error in DESeqDataSetFromMatrix(df2, DataFrame(condition), ~condition) :
ncol(countData) == nrow(colData) is not TRUE
Could anyone help me with this? How can I use this htseq-count output with DESeq2? Thank you.
Reading the DESeq2 tutorial would be a good start:
http://bioconductor.org/packages/devel/bioc/vignettes/DESeq2/inst/doc/DESeq2.html#htseq-count-input
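For the error: what you passed as countData is not a count matrix with one column per sample, so its number of columns does not match the number of rows in colData, thus the "ncol(countData) == nrow(colData) is not TRUE" error.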
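To make the mismatch concrete: `factor("V1","V2.x","V2.y")` is read as `factor(x, levels, labels)` and returns a factor of length one, while `df2` has four columns (`V1`, `V2.x`, `V2.y`, `condition`). Below is a minimal sketch of a matching countData/colData pair, assuming a merged table shaped like the `head(df)` output in the question. The first six rows are copied from the question; the last gene row, the sample names, and the condition labels are made up for illustration. The DESeq2 call only runs if the package is installed.

```r
# Toy table shaped like head(df) in the question. The last row is a
# hypothetical gene, added so the matrix has more than one feature.
df <- data.frame(
  V1   = c("__alignment_not_unique", "__ambiguous", "__no_feature",
           "__not_aligned", "__too_low_aQual",
           "ENSG00000000003.14", "ENSG_hypothetical.1"),
  V2.x = c(2812420, 14080, 8863085, 0, 0, 0, 25),
  V2.y = c(2666754, 15004, 8241851, 0, 0, 0, 31),
  stringsAsFactors = FALSE
)

# Drop the htseq-count summary rows, which all start with "__".
genes <- df[!startsWith(df$V1, "__"), ]

# Count matrix: genes in rows, one column per sample, gene IDs as row names.
counts <- as.matrix(genes[, c("V2.x", "V2.y")])
rownames(counts) <- genes$V1
colnames(counts) <- c("sample1", "sample2")  # assumed sample names
storage.mode(counts) <- "integer"

# colData: one row per sample, in the same order as the columns of counts.
coldata <- data.frame(
  condition = factor(c("control", "treated")),  # assumed labels
  row.names = colnames(counts)
)

# This is the check that failed in the question.
stopifnot(ncol(counts) == nrow(coldata))

# Building the data set object only; this does not run the analysis.
if (requireNamespace("DESeq2", quietly = TRUE)) {
  dds <- DESeq2::DESeqDataSetFromMatrix(countData = counts,
                                        colData   = coldata,
                                        design    = ~ condition)
}
```

The key point is that the gene IDs go into the row names rather than into a column, and that colData describes samples (columns), not genes (rows).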
I have solved the problem with this code:
sampleFiles <- grep("count",list.files("F:/"),value=TRUE)
sampleCondition <- sub("(.*count).*","\\1",sampleFiles)
sampleTable <- data.frame(sampleName = sampleFiles, fileName = sampleFiles, condition = sampleCondition)
You realize that DESeq2 doesn't really work if every sample ends up as its own condition, i.e. you have no replicates within a condition, right?
Go through the tutorial. Just because your command lines are being parsed doesn't mean the software is going to give you the answer you want.
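As a rough illustration of the point about conditions, here is a sketch of a sample table in which replicates share a condition label. The file names and the naming scheme (`<condition>_<replicate>.count.txt`) are assumptions for the example, not the files from the post; adapt the pattern to however your files are actually named.

```r
# Hypothetical htseq-count output files: two conditions, two replicates each.
sampleFiles <- c("control_1.count.txt", "control_2.count.txt",
                 "treated_1.count.txt", "treated_2.count.txt")

# The pattern from the post keeps everything up to "count", so with these
# names every file ends up with its own condition label.
perFile <- sub("(.*count).*", "\\1", sampleFiles)

# Keeping only the text before the first underscore groups the replicates.
sampleCondition <- sub("_.*", "", sampleFiles)

sampleTable <- data.frame(
  sampleName = sub("\\.count\\.txt$", "", sampleFiles),
  fileName   = sampleFiles,
  condition  = factor(sampleCondition)
)

# Sanity check: each condition should have at least two samples.
print(table(sampleTable$condition))

# With real files on disk, the table could then be passed on, e.g.:
# dds <- DESeq2::DESeqDataSetFromHTSeqCount(sampleTable = sampleTable,
#                                           directory   = "F:/",
#                                           design      = ~ condition)
```

Checking `table(sampleTable$condition)` before building the data set is a cheap way to catch a pattern that silently gives one condition per file.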