Hi,
I have a data set that contains the alleles A, T, C and G. When I read it with the read.table command, the T alleles show up as TRUE. This happens when an entire column contains only T. How can I fix this?
Thanks.
read.table() tries to guess the class of the input data and will sometimes be misled.
> read.table(text=("A C G T"))
  V1 V2 V3   V4
1  A  C  G TRUE
> summary(read.table(text=("A C G T")))
 V1    V2    V3       V4
 A:1   C:1   G:1   Mode:logical
                   TRUE:1
                   NA's:0
It is possible to specify the class of the columns in advance; see ?read.table for details.
> read.table(text=("A C G T"), colClasses = "character")
  V1 V2 V3 V4
1  A  C  G  T
Regarding Chris's answer: in this particular case, stringsAsFactors will not solve the problem.
> read.table(text=("A C G T"), stringsAsFactors = FALSE)
  V1 V2 V3   V4
1  A  C  G TRUE
Note that there are other cases where T may be coerced to TRUE instead of "T". In particular, be aware that there is a gene whose symbol is T!
I suspect that using "read.table( . . . stringsAsFactors=F)" will solve your problem.
(Edit: it will not! See the comprehensive answer above.)
I would argue this is as bad as Excel converting gene names to dates.
I assume there is a better alternative to read.table() without these quirks?
Yes. Use readr and explicitly state the data types for the columns.
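For what it is worth, here is a minimal sketch of the readr suggestion, assuming readr 2.0 or later is installed. The inline data and the variable name alleles are only for illustration. As far as I know, readr's own type guessing also reads a lone T as logical, so the explicit col_types is what does the work here.

```r
library(readr)

# Literal input wrapped in I() so readr treats it as data, not a file path.
# col_types forces every column to character, so "T" is never guessed as TRUE.
alleles <- read_delim(
  I("A C G T\n"),
  delim = " ",
  col_names = FALSE,
  col_types = cols(.default = col_character())
)

alleles$X4   # the fourth column, kept as the string "T"
```

With a real file you would pass the file path instead of I(...) and keep the same col_types argument.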
Not that I know of. However, this only happens when every value in a column looks like a logical (for example T, F, TRUE or FALSE).
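A small sketch to illustrate that point, using made-up two-row inputs (the names only_t, mixed and fixed are just for this example): a column holding nothing but T is converted to logical, while a column that mixes T with another allele stays character.

```r
# Column V2 contains only "T": read.table() guesses logical.
only_t <- read.table(text = "A T\nC T", stringsAsFactors = FALSE)
class(only_t$V2)   # "logical"

# Column V2 mixes "T" with "G": not all values look logical, so it stays character.
mixed <- read.table(text = "A T\nC G", stringsAsFactors = FALSE)
class(mixed$V2)    # "character"

# Forcing character keeps the alleles intact even in the all-T case.
fixed <- read.table(text = "A T\nC T", colClasses = "character")
fixed$V2           # "T" "T"
```

So the safest habit for genotype data is probably to set colClasses up front rather than rely on the guess.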
Thank you very much, all of you!
colClasses="character" solved that problem.
You are welcome. Please click on the "Accept!" button so that my answer appears at the top of the list. This is important since the other answer does not solve the problem.