I am trying to read data from a bedfile into R, but I lose the last two columns when the read.table function is applied.
My bedfile has 6 columns:
$ head CHX_clean_sorted_data_plus.bed
chr1 630725 630752 SN1052:386:C9VFHACXX:1:1305:11494:98114#RB:GTGCC 50 +
chr1 630728 630752 SN1052:386:C9VFHACXX:1:1116:5611:86646#RB:GAGTG 50 +
chr1 630728 630751 SN1052:386:C9VFHACXX:1:1213:17960:21112#RB:TTAGA 50 +
chr1 630728 630752 SN1052:386:C9VFHACXX:1:1312:11292:52265#RB:TCAAA 50 +
chr1 634005 634030 SN1052:386:C9VFHACXX:1:1110:17705:92051#RB:GTGCG 50 +
chr1 634337 634367 SN1052:386:C9VFHACXX:1:2102:4448:4217#RB:TTGGA 50 +
The final two columns are lost when I use read.table to read the data into R:
a <- read.table("CHX_clean_sorted_data_plus.bed", sep="\t", blank.lines.skip=FALSE)
dim(a)
[1] 144712 4
head(a)
V1 V2 V3 V4
1 chr1 630725 630752 SN1052:386:C9VFHACXX:1:1305:11494:98114
2 chr1 630728 630752 SN1052:386:C9VFHACXX:1:1116:5611:86646
3 chr1 630728 630751 SN1052:386:C9VFHACXX:1:1213:17960:21112
4 chr1 630728 630752 SN1052:386:C9VFHACXX:1:1312:11292:52265
5 chr1 634005 634030 SN1052:386:C9VFHACXX:1:1110:17705:92051
6 chr1 634337 634367 SN1052:386:C9VFHACXX:1:2102:4448:4217
I would appreciate any insight into what is causing this and how to fix it.
-Lauren
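See "read.table does not read in all rows!" for a potential solution. The problem in your case seems to be the '#' sign: by default read.table treats '#' as a comment character (comment.char = "#") and discards the rest of the line, so everything after the '#' in column 4 is dropped, including the last two columns. Passing comment.char = "" turns this behaviour off.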
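Here is a minimal sketch of the difference, assuming the file is tab-delimited as in your call. It uses two of the records from your post, written to a temporary file so it can be run as-is. The names bed_lines, bed_file, truncated and full are just placeholders for illustration; with your real data you would pass "CHX_clean_sorted_data_plus.bed" directly.

```r
# Two records copied from the question, written to a temporary file
# so the example is self-contained (bed_file is a placeholder name).
bed_lines <- c(
  "chr1\t630725\t630752\tSN1052:386:C9VFHACXX:1:1305:11494:98114#RB:GTGCC\t50\t+",
  "chr1\t630728\t630752\tSN1052:386:C9VFHACXX:1:1116:5611:86646#RB:GAGTG\t50\t+"
)
bed_file <- tempfile(fileext = ".bed")
writeLines(bed_lines, bed_file)

# Default behaviour: comment.char = "#", so each line is cut at the '#'
truncated <- read.table(bed_file, sep = "\t")
ncol(truncated)  # 4

# comment.char = "" disables comment handling;
# quote = "" additionally stops quote characters in read names
# from being interpreted as string delimiters
full <- read.table(bed_file, sep = "\t", comment.char = "", quote = "",
                   stringsAsFactors = FALSE)
ncol(full)       # 6
```

Setting quote = "" is not needed for the rows shown here, but it is a common precaution for BED-like files whose name column may contain ' or " characters.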
I added code markup to your post for increased readability. You can do this by selecting the text and clicking the 101010 button. When you compose or edit a post that button is in your toolbar, see image below:
Stealing the screenshot for future use since you have already made the effort :)
Don't forget to cite me then, every time you use it ;) Having this as a standard moderation answer would be convenient :-p
Since @Istvan has finally completed the ChIP-seq chapter in Biostars handbook he may have time to revisit our Biostars feature wish list.
Thanks so much for your help. This worked great!
Please use ADD COMMENT/ADD REPLY when responding to existing posts to keep threads logically organized. If an answer was helpful, you should upvote it; if the answer resolved your question, you should mark it as accepted.