6.2 years ago
alexadeanfitz
Hello,
I have count tables from htseq-count and would like to run a differential expression analysis using DESeq2.
My understanding is that I need to also create an associated metadata table. I have 6 fastq files (2 from baseline condition, 2 from condition A and 2 from condition B). I am planning on doing this in R, and have a question regarding this script:
fastqDir <- file.path("/Users/Alexa/fastq")
fastq <- list.files(fastqDir, pattern = "\\.fastq\\.gz$")  # pattern is a regular expression, not a shell glob
sampleNO <- str_sub(fastq, 1, 10)  # str_sub() comes from stringr, so run library(stringr) first
condition <- c(rep("P", 2), rep("A30", 2), rep("B96", 2))  # must follow the order of the files in fastq
libraryName <- paste(condition, "-", sampleNO, sep = "")
metadata <- data.frame(sampleNO = sampleNO,
                       condition = condition,
                       fastq = fastq,
                       libraryName = libraryName)
metadata
- The first line gives me the error "unexpected input". I am not sure what I am doing wrong.
- What do the numbers 1 and 10 represent in the third line?
- By creating this metadata table and grouping datasets from the same experimental condition together, will DESeq2 automatically merge the values of all the datasets within an experimental condition before comparison (i.e. take the average of gene counts for gene A in the baseline condition and compare it to the average of gene counts for gene A in the test condition)?
I am working on a Mac
Thanks
Ignore the first question. It was caused by me copying and pasting the script from a text editor, where the quotation marks were different.
I believe that 1,10 means I want the sampleNO to be taken from the first 10 characters of the fastq file name?
No,
sampleNO <- str_sub(fastq, 1,10)
means you are creating a new vector of strings, composed of the first ten characters of each file name. See ?substr or ?str_sub; there you will find a lot of examples which you can follow to understand what the function does.
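To make the point above concrete, here is a minimal sketch using base R's substr(), which takes the same start and stop positions as str_sub(fastq, 1, 10). The file names are hypothetical, made up only for illustration, and are not the ones from the question.

```r
# Hypothetical file names, invented for this example only
fastq <- c("sample0001_R1.fastq.gz", "sample0002_R1.fastq.gz")

# Keep characters 1 through 10 of each file name.
# substr() is vectorised, so the result has one string per file.
sampleNO <- substr(fastq, 1, 10)
print(sampleNO)

# Each extracted ID is exactly 10 characters long
print(nchar(sampleNO))
```

With these made-up names, sampleNO holds "sample0001" and "sample0002". If your real file names have IDs of a different length, the stop position (10) would need to change accordingly.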