Question

DESeq dataset split into 2 subsets?

0

6.2 years ago

popescuiofelia ▴ 10

I have 210 genomic DNA sequences from different strains of amoeba, and I am trying to run a DESeq analysis on them (I am looking for the lowest log fold changes). I have mapped them, counted the number of reads per gene and loaded the dataset into R. Unfortunately, I can't create the data frame that holds the sequence ID, the level (control or other), and the type of sequencing (paired-end or single-end). Because R shows a + prompt after I enter the command, I was pretty sure it was a punctuation problem, but I have checked repeatedly and can't find anything wrong. I used the command below:

colData <- data.frame("strain"=c("10NC87.1","11NC96.1","12NC99.1","13NC34.2","14NC39.1","15NC52.3","16NC54.2","17NC58.1","18NC60.1","19NC60.2","1NC105.1","20NC63.2","2NC28.1","3NC67.2","3P51S75","4NC69.1","5NC71.1","6NC73.1","7NC76.1","8NC80.1","9NC85.2","A01.311S1merged.bam","A02.486S8merged.bam","A03.488S16merged.bam","A04.571S24merged.bam","A05.582S32merged.bam","A06.593S40merged.bam","A07.670S1merged.bam","A08.700S8merged.bam","A09.728S15merged.bam","A10.734S23merged.bam","A11.363S31merged.bam","A12.667S38merged.bam","AC9S2","B01.579S2merged.bam","B02.532S9merged.bam","B03.655S17merged.bam","B04.672S25merged.bam","B05.505S33merged.bam","B06.786S41merged.bam","B07.487S2merged.bam","B08.530S9merged.bam","B09.544S16merged.bam","B10.576S24merged.bam","B11.577S32merged.bam","B12.578S39merged.bam","B1AS67","B25CS96","B34AS78","B41AS84","BM5AS25","BS3","C01.580S3merged.bam","C02.600S10merged.bam","C03.763S18merged.bam","C04.732S26merged.bam","C05.398S34merged.bam","C06.118S42merged.bam","C09.586S17merged.bam","C10.531S25merged.bam","C11.ws2162S33merged.bam","C12.815S40merged.bam","CF2ddS82","CH14AS81","CT6AS54","CT9AS51","D01.608S4merged.bam","D02.777S11merged.bam","D03.401S19merged.bam","D04.735S27merged.bam","D05.606S35merged.bam","D06.738S43merged.bam","D07.616S3merged.bam","D08.561S10merged.bam","D09.602S18merged.bam","D10.758S26merged.bam","D11.180S34merged.bam","D12.18S41merged.bam","DCB5AS23","DD10C2S22","DD20B2bS49","DD20BS59","DD44S14","DD7S7","E01.642S5merged.bam","E02.744S12merged.bam","E03.448S47merged.bam","E04.375S46merged.bam","E05.805S36merged.bam","E06.ws655S44merged.bam","E07.317S4merged.bam","E08.ws380S1L001","E09.782S19merged.bam","E10.c5aS27merged.bam","E11.413S35merged.bam","E12.417S42merged.bam","E2C2S74","EI10AS57","F01.524S6merged.bam","F02.433S13merged.bam","F03.749S21merged.bam","F04.648S29merged.bam","F05.336S37merged.bam","F06.572S45merged.bam","F07.756S5merged.bam","F08.587S12merged.bam","F09.PJ11S20merged.bam","F10.583S28merged.bam","F11.
483S36merged.bam","F12.438S43merged.bam","FC4CS19","G01.427S7merged.bam","G02.750S14merged.bam","G03.442S22merged.bam","G04.419S30merged.bam","G05.307S38merged.bam","G06.949S46merged.bam","G07.826S6merged.bam","G08.434S13merged.bam","G09.366S21merged.bam","G10.421S29merged.bam","H02.537S15merged.bam","H03.568S23merged.bam","H04.824S31merged.bam","H05.181S39merged.bam","H05.181S45merged.bam","H06.mfdS47merged.bam","H07.ws582S7merged.bam","H08.v12S14merged.bam","H09.21S22merged.bam","H10.9S30merged.bam","H11.1071S2L001","H11.1071S37merged.bam","H11A3S94","H12.304S44merged.bam","H15A1S66","H15B1S85","H20B2S80","H4A1S90","HD45B1S46","HD48D1S83","HD54C1S30","LB10CS70","LL20DS92","M1AS4","M4BS1","MA2A1S27","MA2F1S12","MA4B1S50","NC1011S73","NC21B1S87","NC26C1S34","NC26L1S79","NC26V1S55","NC282S89","NC412S68","NC431S58","NC672S21","NC741S8","NC752S10","NC942S42","OH594S18","OHIOS15","OZK11AS48","PL11AS88","S118S24","S220S53","S25S39","S2AS9","S53S40","SM12AS13","TN34A1S35","TN39C2S6","TN40J3S31","TN45T3AS33","TN50J1S95","TN52E1S65","TN52F1S5","TN52G1S64","TNSC14S28","V301B2S60","V319B3S29","V323C1BS47","V324B3S16","V328A1S32","V329B1S43","V330B1S62","V330D1S72","V330D2S91","V331C1S61","V331D1S71","V331D2S38","V336B1S36","V341A2S26","V341C1S17","V342B2S76","V343D2S44","V348C1S63","V4F4S37","V54C2S52","V55A3S41","V56D1S86","V64D2S56","V72B3S20","WS1956S69","WS2162S93","WS472S11","WS7S45","ZA3AS77","Ax4"),"strain"=c("1","2","3","4","5","6","7","8","9","10","11","12","13","14","15","16","17","18","19","20","21","22","23","24","25","26","27","28","29","30","31","32","33","34","35","36","37","38","39","40","41","42","43","44","45","46","47","48","49","50","51","52","53","54","55","56","57","58","59","60","61","62","63","64","65","66","67","68","69","70","71","72","73","74","75","76","77","78","79","80","81","82","83","84","85","86","87","88","89","90","91","92","93","94","95","96","97","98","99","100","101","102","103","104","105","106","107","108","109","110","111","112","113",
"114","115","116","117","118","119","120","121","122","123","124","125","126","127","128","129","130","131","132","133","134","135","136","137","138","139","140","141","142","143","144","145","146","147","148","149","150","151","152","153","154","155","156","157","158","159","160","161","162","163","164","165","166","167","168","169","170","171","172","173","174","175","176","177","178","179","180","181","182","183","184","185","186","187","188","189","190","191","192","193","194","195","196","197","198","199","200","201","202","203","204","205","206","207","208","209","control"),"type"="paired.end")

I tried to find where R thinks the problem is: up to the 171st strain it gives no error. Beyond that, even if I add different strain names, it still treats the command as wrong. I have read online that R has a limit on the number of rows and columns you can create, but I am far from that threshold. Others mentioned RAM, but I have 7.7 GiB of RAM. It may be a really dumb question, but can somebody please explain what is going on and why I always get that +? If I can't make it work, will splitting the data into 2 sets of 105 sequences affect the log fold change results?

Thanks!

software error R • 2.0k views

updated 6.2 years ago by h.mon 35k • written 6.2 years ago by popescuiofelia ▴ 10

2

You want to run DESeq on genomic DNA? Are you sure that's meaningful?

6.2 years ago by swbarnes2 14k

0

Good catch. But hopefully that is an error since the next sentence says this:

"I have mapped them, counted the number of reads per gene and loaded the dataset into R"

6.2 years ago by GenoMax 150k

0

~~As strain and type are the names of the arguments, it should be strain= and type=, not "strain"= and "type"=.~~

edit: in addition, you have two strain= arguments; you should rename one of them.

edit2: indeed, the command works fine either with or without double quotes.
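
To illustrate both points, here is a minimal sketch with made-up two-row vectors (not the real sample sheet). It suggests that quoting the argument names makes no difference, and that when two columns share a name, data.frame() with its default check.names = TRUE keeps both but renames the second.

```r
# Toy vectors standing in for the real sample sheet (illustrative values only)
quoted   <- data.frame("strain" = c("s1", "s2"), "type" = "paired.end")
unquoted <- data.frame(strain = c("s1", "s2"), type = "paired.end")

# Quoted and unquoted argument names build the same data frame
same <- identical(quoted, unquoted)

# Two arguments named strain: the default check.names = TRUE makes the
# column names unique by appending a suffix to the second one
dup <- data.frame(strain = c("s1", "s2"), strain = c("1", "control"))
dup_names <- names(dup)

print(same)
print(dup_names)
```

So the duplicate name is worth fixing, because the second column silently ends up under a different name, but it should not by itself leave R waiting at the + prompt.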

6.2 years ago by h.mon 35k

0

I agree. But even with the double quotes, the command works fine when the number of samples is below 171. I still have the same problem after these changes:

> colData<-data.frame(strain=c("10NC87.1","11NC96.1","12NC99.1","13NC34.2","14NC39.1","15NC52.3","16NC54.2","17NC58.1","18NC60.1","19NC60.2","1NC105.1","20NC63.2","2NC28.1","3NC67.2","3P51S75","4NC69.1","5NC71.1","6NC73.1","7NC76.1","8NC80.1","9NC85.2","A01.311S1merged.bam","A02.486S8merged.bam","A03.488S16merged.bam","A04.571S24merged.bam","A05.582S32merged.bam","A06.593S40merged.bam","A07.670S1merged.bam","A08.700S8merged.bam","A09.728S15merged.bam","A10.734S23merged.bam","A11.363S31merged.bam","A12.667S38merged.bam","AC9S2","B01.579S2merged.bam","B02.532S9merged.bam","B03.655S17merged.bam","B04.672S25merged.bam","B05.505S33merged.bam","B06.786S41merged.bam","B07.487S2merged.bam","B08.530S9merged.bam","B09.544S16merged.bam","B10.576S24merged.bam","B11.577S32merged.bam","B12.578S39merged.bam","B1AS67","B25CS96","B34AS78","B41AS84","BM5AS25","BS3","C01.580S3merged.bam","C02.600S10merged.bam","C03.763S18merged.bam","C04.732S26merged.bam","C05.398S34merged.bam","C06.118S42merged.bam","C09.586S17merged.bam","C10.531S25merged.bam","C11.ws2162S33merged.bam","C12.815S40merged.bam","CF2ddS82","CH14AS81","CT6AS54","CT9AS51","D01.608S4merged.bam","D02.777S11merged.bam","D03.401S19merged.bam","D04.735S27merged.bam","D05.606S35merged.bam","D06.738S43merged.bam","D07.616S3merged.bam","D08.561S10merged.bam","D09.602S18merged.bam","D10.758S26merged.bam","D11.180S34merged.bam","D12.18S41merged.bam","DCB5AS23","DD10C2S22","DD20B2bS49","DD20BS59","DD44S14","DD7S7","E01.642S5merged.bam","E02.744S12merged.bam","E03.448S47merged.bam","E04.375S46merged.bam","E05.805S36merged.bam","E06.ws655S44merged.bam","E07.317S4merged.bam","E08.ws380S1L001","E09.782S19merged.bam","E10.c5aS27merged.bam","E11.413S35merged.bam","E12.417S42merged.bam","E2C2S74","EI10AS57","F01.524S6merged.bam","F02.433S13merged.bam","F03.749S21merged.bam","F04.648S29merged.bam","F05.336S37merged.bam","F06.572S45merged.bam","F07.756S5merged.bam","F08.587S12merged.bam","F09.PJ11S20merged.bam","F10.583S28merged.bam","F11.48
3S36merged.bam","F12.438S43merged.bam","FC4CS19","G01.427S7merged.bam","G02.750S14merged.bam","G03.442S22merged.bam","G04.419S30merged.bam","G05.307S38merged.bam","G06.949S46merged.bam","G07.826S6merged.bam","G08.434S13merged.bam","G09.366S21merged.bam","G10.421S29merged.bam","H02.537S15merged.bam","H03.568S23merged.bam","H04.824S31merged.bam","H05.181S39merged.bam","H05.181S45merged.bam","H06.mfdS47merged.bam","H07.ws582S7merged.bam","H08.v12S14merged.bam","H09.21S22merged.bam","H10.9S30merged.bam","H11.1071S2L001","H11.1071S37merged.bam","H11A3S94","H12.304S44merged.bam","H15A1S66","H15B1S85","H20B2S80","H4A1S90","HD45B1S46","HD48D1S83","HD54C1S30","LB10CS70","LL20DS92","M1AS4","M4BS1","MA2A1S27","MA2F1S12","MA4B1S50","NC1011S73","NC21B1S87","NC26C1S34","NC26L1S79","NC26V1S55","NC282S89","NC412S68","NC431S58","NC672S21","NC741S8","NC752S10","NC942S42","OH594S18","OHIOS15","OZK11AS48","PL11AS88","S118S24","S220S53","S25S39","S2AS9","S53S40","SM12AS13","TN34A1S35","TN39C2S6","TN40J3S31","TN45T3AS33","TN50J1S95","TN52E1S65","TN52F1S5","TN52G1S64","TNSC14S28","V301B2S60","V319B3S29","V323C1BS47","V324B3S16","V328A1S32","V329B1S43","V330B1S62","V330D1S72","V330D2S91","V331C1S61","V331D1S71","V331D2S38","V336B1S36","V341A2S26","V341C1S17","V342B2S76","V343D2S44","V348C1S63","V4F4S37","V54C2S52","V55A3S41","V56D1S86","V64D2S56","V72B3S20","WS1956S69","WS2162S93","WS472S11","WS7S45","ZA3AS77","Ax4"),level=c("1","2","3","4","5","6","7","8","9","10","11","12","13","14","15","16","17","18","19","20","21","22","23","24","25","26","27","28","29","30","31","32","33","34","35","36","37","38","39","40","41","42","43","44","45","46","47","48","49","50","51","52","53","54","55","56","57","58","59","60","61","62","63","64","65","66","67","68","69","70","71","72","73","74","75","76","77","78","79","80","81","82","83","84","85","86","87","88","89","90","91","92","93","94","95","96","97","98","99","100","101","102","103","104","105","106","107","108","109","110","111","112","113","114"
,"115","116","117","118","119","120","121","122","123","124","125","126","127","128","129","130","131","132","133","134","135","136","137","138","139","140","141","142","143","144","145","146","147","148","149","150","151","152","153","154","155","156","157","158","159","160","161","162","163","164","165","166","167","168","169","170","171","172","173","174","175","176","177","178","179","180","181","182","183","184","185","186","187","188","189","190","191","192","193","194","195","196","197","198","199","200","201","202","203","204","205","206","207","208","209","control"),type="paired.end")+

6.2 years ago by popescuiofelia ▴ 10

0

I can run it on my laptop (same specs as yours):

head(colData)

  strain       level       type
1 10NC87.1     1           paired.end
2 11NC96.1     2           paired.end
3 12NC99.1     3           paired.end
4 13NC34.2     4           paired.end
5 14NC39.1     5           paired.end
6 15NC52.3     6           paired.end
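
Since head() only shows the first rows, it can be worth checking the whole table before handing it to DESeq. A minimal sketch, assuming a count matrix called counts whose column names are the sample names (the object name and the toy values are hypothetical): DESeq2 expects the rows of colData to be in the same order as the columns of the count matrix.

```r
# Toy count matrix: 2 genes x 3 samples (made-up values)
counts <- matrix(
  1:6, nrow = 2,
  dimnames = list(c("geneA", "geneB"), c("s1", "s2", "s3"))
)

# Sample table whose rows are deliberately in a different order
colData <- data.frame(
  level = c("1", "2", "control"),
  type = "paired.end",
  row.names = c("s3", "s1", "s2"),
  stringsAsFactors = FALSE
)

# Every sample in the count matrix must have a row in colData
stopifnot(all(colnames(counts) %in% rownames(colData)))

# Reorder the rows of colData to follow the columns of counts
colData <- colData[colnames(counts), , drop = FALSE]

# Stops with an error if the two still disagree
stopifnot(identical(rownames(colData), colnames(counts)))
print(colData)
```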

6.2 years ago by h.mon 35k

0

Thank you for your replies! I managed to run it when I deleted the sorted.bam and merged.bam endings from the strain names.
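
A possible explanation, which nobody in the thread confirmed: the R manual notes that command lines entered at the console are limited to about 4095 bytes, so a very long single-line command can get cut off and leave R waiting at the + continuation prompt. Shortening the names would bring the line back under that limit. Here is a sketch that avoids long typed literals altogether, using a hypothetical four-name subset of the sample names and assuming the last sample is the control, as in the original command:

```r
# Hypothetical subset of the sample names from the question
raw_names <- c("10NC87.1", "A01.311S1merged.bam", "E08.ws380S1L001", "Ax4")

# Drop a trailing "merged.bam" or "sorted.bam" where present
strain <- sub("(merged|sorted)\\.bam$", "", raw_names)

# Number the samples and mark the last one as the control
# (assumption: the control is the final entry, as in the original command)
level <- c(as.character(seq_len(length(strain) - 1)), "control")

colData <- data.frame(
  strain = strain,
  level = level,
  type = "paired.end",        # a single value is recycled to every row
  stringsAsFactors = FALSE
)

print(colData)
```

In practice the names could come from colnames() of the count matrix or from a file read with read.table(), so no line pasted into the console needs to be that long.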

6.2 years ago by popescuiofelia ▴ 10