I meet some problem for using Deseq2.
3
0
Entering edit mode
5.9 years ago
codezy ▴ 10

Dear all,

I am new in this area and trying to learn how to do the differential analysis now. I have got some bam file from the mapping tool STAR. Now I want to use the DESeq2 to continue the analysis, but DESeq2 ask a matrix as input. Therefore, I use htseq-count to convert the reads bam file into a txt file by the code

 htseq-count -f bam file genome.gtf >file.txt

But the outcome of it is like this when I use read.table in R:

> head(df)
                      V1    V2.x    V2.y
1 __alignment_not_unique 2812420 2666754
2            __ambiguous   14080   15004
3           __no_feature 8863085 8241851
4          __not_aligned       0       0
5        __too_low_aQual       0       0
6     ENSG00000000003.14       0       0

I have try the code

  condition <-factor("V1","V2.x","V2.y")
df2 <- data.frame(df,condition = condition)
dds <- DESeqDataSetFromMatrix(df2, DataFrame(condition), ~ condition)

But it doesn't work, and give me the error

Error in DESeqDataSetFromMatrix(df2, DataFrame(condition), ~condition) : 
ncol(countData) == nrow(colData) is not TRUE

could anyone help me with this? How can I use this htseq outcome for DESeq2? Thank you

RNA-Seq • 1.8k views
ADD COMMENT
0
Entering edit mode

Reading the tutorial of DESeq2 would be a good start.

http://bioconductor.org/packages/devel/bioc/vignettes/DESeq2/inst/doc/DESeq2.html#htseq-count-input

For the error, you didn't import the read count file generated from htseq-count, thus the "cannot open the connection" error.

ADD REPLY
0
Entering edit mode

I have solved the problem with these codessampleFiles <- grep("count",list.files("F:/"),value=TRUE) sampleCondition <- sub("(.*count).*","\\1",sampleFiles) sampleTable <- data.frame(sampleName = sampleFiles, fileName = sampleFiles, condition = sampleCondition)

ADD REPLY
1
Entering edit mode

You realize that DESeq2 doesn't really work if you have one sample-one condition concordance, right?

Go through the tutorial. Just because your command lines are being parsed doesn't mean the software is going to give you the answer you want.

ADD REPLY
1
Entering edit mode
5.9 years ago

The best advice anyone can give you is to walk through the DESeq2 tutorial. Look at the example data files, and google things about R to learn how to make your data look like the data in the tutorial.

ADD COMMENT
1
Entering edit mode
5.9 years ago
leaodel ▴ 190

Hi, codezy. You should really dive into DESeq2's tutorial, it's an essential resource for learning. With that being said, when you have files from HTSeq you should use another function to import your samples. Try this:

Create a variable with the file's path:

dir <- "~/pathtoyourfiles/"

Add the file's name as a column on the table you'll use to feed the sampleTable argument.

myTable$fileName <- grep("count", list.files(dir), value = T)

Finally, import your samples:

ddsHTseq <- DESeqDataSetFromHTSeqCount(sampleTable = myTable, directory = dir, design = ~control+treatment)

You can also find more details about this function with:

?DESeqDataSetFromHTSeqCount()
ADD COMMENT
1
Entering edit mode
5.9 years ago
codezy ▴ 10

Thank you for all your answer. I have solved the problem by these codes:

sampleFiles <- grep("count",list.files("F:/"),value=TRUE) 
sampleCondition <- sub("(.*count).*","\\1",sampleFiles) 
sampleTable <- data.frame(sampleName = sampleFiles, fileName = sampleFiles, condition = sampleCondition)
ADD COMMENT

Login before adding your answer.

Traffic: 1792 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6