DESeq dataset split into 2 subsets?
5.8 years ago

I have 210 genomic DNA sequences from different strains of amoeba, and I am trying to run a DESeq analysis on them (I am looking for the lowest log fold changes). I have mapped them, counted the number of reads per gene, and loaded the dataset into R. Unfortunately, I can't create the data frame that holds the sequence ID, the level (control or other), and the type of sequencing (paired or single end). Because R shows a + prompt after I enter the command, I assumed it was a punctuation problem, but I have checked the command over and over and can't find anything wrong. This is the command I used:

colData <- data.frame("strain"=c("10NC87.1","11NC96.1","12NC99.1","13NC34.2","14NC39.1","15NC52.3","16NC54.2","17NC58.1","18NC60.1","19NC60.2","1NC105.1","20NC63.2","2NC28.1","3NC67.2","3P51S75","4NC69.1","5NC71.1","6NC73.1","7NC76.1","8NC80.1","9NC85.2","A01.311S1merged.bam","A02.486S8merged.bam","A03.488S16merged.bam","A04.571S24merged.bam","A05.582S32merged.bam","A06.593S40merged.bam","A07.670S1merged.bam","A08.700S8merged.bam","A09.728S15merged.bam","A10.734S23merged.bam","A11.363S31merged.bam","A12.667S38merged.bam","AC9S2","B01.579S2merged.bam","B02.532S9merged.bam","B03.655S17merged.bam","B04.672S25merged.bam","B05.505S33merged.bam","B06.786S41merged.bam","B07.487S2merged.bam","B08.530S9merged.bam","B09.544S16merged.bam","B10.576S24merged.bam","B11.577S32merged.bam","B12.578S39merged.bam","B1AS67","B25CS96","B34AS78","B41AS84","BM5AS25","BS3","C01.580S3merged.bam","C02.600S10merged.bam","C03.763S18merged.bam","C04.732S26merged.bam","C05.398S34merged.bam","C06.118S42merged.bam","C09.586S17merged.bam","C10.531S25merged.bam","C11.ws2162S33merged.bam","C12.815S40merged.bam","CF2ddS82","CH14AS81","CT6AS54","CT9AS51","D01.608S4merged.bam","D02.777S11merged.bam","D03.401S19merged.bam","D04.735S27merged.bam","D05.606S35merged.bam","D06.738S43merged.bam","D07.616S3merged.bam","D08.561S10merged.bam","D09.602S18merged.bam","D10.758S26merged.bam","D11.180S34merged.bam","D12.18S41merged.bam","DCB5AS23","DD10C2S22","DD20B2bS49","DD20BS59","DD44S14","DD7S7","E01.642S5merged.bam","E02.744S12merged.bam","E03.448S47merged.bam","E04.375S46merged.bam","E05.805S36merged.bam","E06.ws655S44merged.bam","E07.317S4merged.bam","E08.ws380S1L001","E09.782S19merged.bam","E10.c5aS27merged.bam","E11.413S35merged.bam","E12.417S42merged.bam","E2C2S74","EI10AS57","F01.524S6merged.bam","F02.433S13merged.bam","F03.749S21merged.bam","F04.648S29merged.bam","F05.336S37merged.bam","F06.572S45merged.bam","F07.756S5merged.bam","F08.587S12merged.bam","F09.PJ11S20merged.bam","F10.583S28merged.bam","F11.
483S36merged.bam","F12.438S43merged.bam","FC4CS19","G01.427S7merged.bam","G02.750S14merged.bam","G03.442S22merged.bam","G04.419S30merged.bam","G05.307S38merged.bam","G06.949S46merged.bam","G07.826S6merged.bam","G08.434S13merged.bam","G09.366S21merged.bam","G10.421S29merged.bam","H02.537S15merged.bam","H03.568S23merged.bam","H04.824S31merged.bam","H05.181S39merged.bam","H05.181S45merged.bam","H06.mfdS47merged.bam","H07.ws582S7merged.bam","H08.v12S14merged.bam","H09.21S22merged.bam","H10.9S30merged.bam","H11.1071S2L001","H11.1071S37merged.bam","H11A3S94","H12.304S44merged.bam","H15A1S66","H15B1S85","H20B2S80","H4A1S90","HD45B1S46","HD48D1S83","HD54C1S30","LB10CS70","LL20DS92","M1AS4","M4BS1","MA2A1S27","MA2F1S12","MA4B1S50","NC1011S73","NC21B1S87","NC26C1S34","NC26L1S79","NC26V1S55","NC282S89","NC412S68","NC431S58","NC672S21","NC741S8","NC752S10","NC942S42","OH594S18","OHIOS15","OZK11AS48","PL11AS88","S118S24","S220S53","S25S39","S2AS9","S53S40","SM12AS13","TN34A1S35","TN39C2S6","TN40J3S31","TN45T3AS33","TN50J1S95","TN52E1S65","TN52F1S5","TN52G1S64","TNSC14S28","V301B2S60","V319B3S29","V323C1BS47","V324B3S16","V328A1S32","V329B1S43","V330B1S62","V330D1S72","V330D2S91","V331C1S61","V331D1S71","V331D2S38","V336B1S36","V341A2S26","V341C1S17","V342B2S76","V343D2S44","V348C1S63","V4F4S37","V54C2S52","V55A3S41","V56D1S86","V64D2S56","V72B3S20","WS1956S69","WS2162S93","WS472S11","WS7S45","ZA3AS77","Ax4"),"strain"=c("1","2","3","4","5","6","7","8","9","10","11","12","13","14","15","16","17","18","19","20","21","22","23","24","25","26","27","28","29","30","31","32","33","34","35","36","37","38","39","40","41","42","43","44","45","46","47","48","49","50","51","52","53","54","55","56","57","58","59","60","61","62","63","64","65","66","67","68","69","70","71","72","73","74","75","76","77","78","79","80","81","82","83","84","85","86","87","88","89","90","91","92","93","94","95","96","97","98","99","100","101","102","103","104","105","106","107","108","109","110","111","112","113",
"114","115","116","117","118","119","120","121","122","123","124","125","126","127","128","129","130","131","132","133","134","135","136","137","138","139","140","141","142","143","144","145","146","147","148","149","150","151","152","153","154","155","156","157","158","159","160","161","162","163","164","165","166","167","168","169","170","171","172","173","174","175","176","177","178","179","180","181","182","183","184","185","186","187","188","189","190","191","192","193","194","195","196","197","198","199","200","201","202","203","204","205","206","207","208","209","control"),"type"="paired.end")

I tried to find where R thinks the problem is: up to the 171st strain it gives no error. Beyond that, even if I add different strain names, R still does not accept the command. I have read online that R limits the number of rows and columns you can create, but I am far below that threshold. Others mentioned RAM, but I have 7.7 GiB of RAM. It may be a really dumb question, but can somebody please explain what is going on and why I always get that +? If I can't make it work, will it affect the log fold change results if I split the data into 2 sets of 105 sequences?

Thanks!
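
A + prompt means R is still waiting for the rest of an expression. The R manual states that command lines entered at the console are limited to about 4095 bytes, so one plausible cause here (not confirmed in this thread) is that the single pasted line exceeds that limit. A minimal sketch that sidesteps the issue by keeping every input line short and generating the level labels instead of typing them; the six names below are a shortened stand-in for the real 210:

```r
# Shortened, illustrative list of sample names; the real vector has 210 entries.
# Splitting the vector over several short lines keeps each input line far
# below the console's per-line limit.
strain <- c(
  "10NC87.1", "11NC96.1", "12NC99.1",
  "13NC34.2", "14NC39.1", "Ax4"
)

# Generate "1", "2", ... for the strains and "control" for the last sample
# (Ax4 in the original command) instead of typing every label by hand.
level <- c(as.character(seq_len(length(strain) - 1)), "control")

# A length-one value such as "paired.end" is recycled to every row.
colData <- data.frame(strain = strain, level = level, type = "paired.end")
```

If the counts already sit in a matrix with one column per sample, `colnames()` of that matrix could supply `strain` directly; that assumes the column names are the sample IDs.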

software error R • 1.9k views

You want to run DESeq on genomic DNA? Are you sure that's meaningful?


Good catch. But hopefully that is an error since the next sentence says this:

I have mapped them, counted the number of reads per gene and loaded the dataset into R


As strain and type are the names of the arguments, it should be strain= and type=, not "strain"= and "type"=.

edit: in addition, you have two strain= arguments, you should rename one of them.

edit2: indeed, the command works fine either with or without double quotes.
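
A small sketch of both points, using made-up two-row data: quoted and unquoted argument names give the same data frame, and a repeated column name is silently made unique rather than raising an error, which is why the duplicate is easy to miss.

```r
# Quoted and unquoted argument names are equivalent in a function call.
a <- data.frame("strain" = c("x", "y"), "type" = "paired.end")
b <- data.frame(strain = c("x", "y"), type = "paired.end")
same <- identical(a, b)

# With the default check.names = TRUE, duplicated column names are made
# unique, so the second "strain" column is renamed to "strain.1".
d <- data.frame(strain = c("x", "y"), strain = c("1", "2"))
dup_names <- names(d)
```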


I agree. But even with the quotes, the command works fine when there are fewer than 171 samples. I still have the same problem after these changes:

> colData<-data.frame(strain=c("10NC87.1","11NC96.1","12NC99.1","13NC34.2","14NC39.1","15NC52.3","16NC54.2","17NC58.1","18NC60.1","19NC60.2","1NC105.1","20NC63.2","2NC28.1","3NC67.2","3P51S75","4NC69.1","5NC71.1","6NC73.1","7NC76.1","8NC80.1","9NC85.2","A01.311S1merged.bam","A02.486S8merged.bam","A03.488S16merged.bam","A04.571S24merged.bam","A05.582S32merged.bam","A06.593S40merged.bam","A07.670S1merged.bam","A08.700S8merged.bam","A09.728S15merged.bam","A10.734S23merged.bam","A11.363S31merged.bam","A12.667S38merged.bam","AC9S2","B01.579S2merged.bam","B02.532S9merged.bam","B03.655S17merged.bam","B04.672S25merged.bam","B05.505S33merged.bam","B06.786S41merged.bam","B07.487S2merged.bam","B08.530S9merged.bam","B09.544S16merged.bam","B10.576S24merged.bam","B11.577S32merged.bam","B12.578S39merged.bam","B1AS67","B25CS96","B34AS78","B41AS84","BM5AS25","BS3","C01.580S3merged.bam","C02.600S10merged.bam","C03.763S18merged.bam","C04.732S26merged.bam","C05.398S34merged.bam","C06.118S42merged.bam","C09.586S17merged.bam","C10.531S25merged.bam","C11.ws2162S33merged.bam","C12.815S40merged.bam","CF2ddS82","CH14AS81","CT6AS54","CT9AS51","D01.608S4merged.bam","D02.777S11merged.bam","D03.401S19merged.bam","D04.735S27merged.bam","D05.606S35merged.bam","D06.738S43merged.bam","D07.616S3merged.bam","D08.561S10merged.bam","D09.602S18merged.bam","D10.758S26merged.bam","D11.180S34merged.bam","D12.18S41merged.bam","DCB5AS23","DD10C2S22","DD20B2bS49","DD20BS59","DD44S14","DD7S7","E01.642S5merged.bam","E02.744S12merged.bam","E03.448S47merged.bam","E04.375S46merged.bam","E05.805S36merged.bam","E06.ws655S44merged.bam","E07.317S4merged.bam","E08.ws380S1L001","E09.782S19merged.bam","E10.c5aS27merged.bam","E11.413S35merged.bam","E12.417S42merged.bam","E2C2S74","EI10AS57","F01.524S6merged.bam","F02.433S13merged.bam","F03.749S21merged.bam","F04.648S29merged.bam","F05.336S37merged.bam","F06.572S45merged.bam","F07.756S5merged.bam","F08.587S12merged.bam","F09.PJ11S20merged.bam","F10.583S28merged.bam","F11.48
3S36merged.bam","F12.438S43merged.bam","FC4CS19","G01.427S7merged.bam","G02.750S14merged.bam","G03.442S22merged.bam","G04.419S30merged.bam","G05.307S38merged.bam","G06.949S46merged.bam","G07.826S6merged.bam","G08.434S13merged.bam","G09.366S21merged.bam","G10.421S29merged.bam","H02.537S15merged.bam","H03.568S23merged.bam","H04.824S31merged.bam","H05.181S39merged.bam","H05.181S45merged.bam","H06.mfdS47merged.bam","H07.ws582S7merged.bam","H08.v12S14merged.bam","H09.21S22merged.bam","H10.9S30merged.bam","H11.1071S2L001","H11.1071S37merged.bam","H11A3S94","H12.304S44merged.bam","H15A1S66","H15B1S85","H20B2S80","H4A1S90","HD45B1S46","HD48D1S83","HD54C1S30","LB10CS70","LL20DS92","M1AS4","M4BS1","MA2A1S27","MA2F1S12","MA4B1S50","NC1011S73","NC21B1S87","NC26C1S34","NC26L1S79","NC26V1S55","NC282S89","NC412S68","NC431S58","NC672S21","NC741S8","NC752S10","NC942S42","OH594S18","OHIOS15","OZK11AS48","PL11AS88","S118S24","S220S53","S25S39","S2AS9","S53S40","SM12AS13","TN34A1S35","TN39C2S6","TN40J3S31","TN45T3AS33","TN50J1S95","TN52E1S65","TN52F1S5","TN52G1S64","TNSC14S28","V301B2S60","V319B3S29","V323C1BS47","V324B3S16","V328A1S32","V329B1S43","V330B1S62","V330D1S72","V330D2S91","V331C1S61","V331D1S71","V331D2S38","V336B1S36","V341A2S26","V341C1S17","V342B2S76","V343D2S44","V348C1S63","V4F4S37","V54C2S52","V55A3S41","V56D1S86","V64D2S56","V72B3S20","WS1956S69","WS2162S93","WS472S11","WS7S45","ZA3AS77","Ax4"),level=c("1","2","3","4","5","6","7","8","9","10","11","12","13","14","15","16","17","18","19","20","21","22","23","24","25","26","27","28","29","30","31","32","33","34","35","36","37","38","39","40","41","42","43","44","45","46","47","48","49","50","51","52","53","54","55","56","57","58","59","60","61","62","63","64","65","66","67","68","69","70","71","72","73","74","75","76","77","78","79","80","81","82","83","84","85","86","87","88","89","90","91","92","93","94","95","96","97","98","99","100","101","102","103","104","105","106","107","108","109","110","111","112","113","114"
,"115","116","117","118","119","120","121","122","123","124","125","126","127","128","129","130","131","132","133","134","135","136","137","138","139","140","141","142","143","144","145","146","147","148","149","150","151","152","153","154","155","156","157","158","159","160","161","162","163","164","165","166","167","168","169","170","171","172","173","174","175","176","177","178","179","180","181","182","183","184","185","186","187","188","189","190","191","192","193","194","195","196","197","198","199","200","201","202","203","204","205","206","207","208","209","control"),type="paired.end")+

I can run it on my laptop (same specs as yours):

head(colData)
    strain level       type
1 10NC87.1     1 paired.end
2 11NC96.1     2 paired.end
3 12NC99.1     3 paired.end
4 13NC34.2     4 paired.end
5 14NC39.1     5 paired.end
6 15NC52.3     6 paired.end
  

Thank you for your replies! I managed to run it when I deleted the sorted.bam and merged.bam endings from the strain names.
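
Why this helped is not stated in the thread. One plausible explanation is that the shorter names brought the pasted line under the R console's input limit, which the R manual gives as about 4095 bytes per line. The suffixes can also be stripped inside R rather than by hand; a minimal sketch with three names taken from the question:

```r
# Three sample names from the question, used here for illustration only.
strain <- c("A01.311S1merged.bam", "B1AS67", "E08.ws380S1L001")

# Remove a trailing "merged.bam" or "sorted.bam"; names without such a
# suffix are returned unchanged.
clean <- sub("(merged|sorted)\\.bam$", "", strain)
```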
