How to submit a genomics package with lots of data to CRAN
5.9 years ago
mk ▴ 300

I've got a package that I'd like to submit to CRAN, but its data is large: about 20 MB of compressed internal data (used for unit tests, not exported) and about 7 MB of compressed exported data (used in vignettes).

I could maybe cut this down a bit by altering my compression but not by much. This package offers a pipeline that involves multiple manifold learning, classification, and pathway inference steps, and the testing involves high-dimensional objects (caveat: although currently in the 10 MB range, these could be made much smaller and still serve their purpose).

It's my understanding that package data (both in /data and R/sysdata.rda) should not exceed 5 MB in size. What are my options?
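One way to gauge how much altering the compression would help is to compare the default gzip compression with xz on a representative object. The sketch below is only illustrative: `test_matrix` is a made-up stand-in for the real test data, and the savings on actual objects may be quite different.

```r
# Hypothetical stand-in for a large test object; substitute the real data.
set.seed(1)
test_matrix <- matrix(round(rnorm(2e5), 2), nrow = 1000)

# Save with R's default (gzip) compression to a temporary .rda file.
rda_path <- file.path(tempdir(), "test_matrix.rda")
save(test_matrix, file = rda_path)

# checkRdaFiles() reports the file size and the compression in use.
before <- tools::checkRdaFiles(rda_path)

# Re-save the same file in place with xz compression.
tools::resaveRdaFiles(rda_path, compress = "xz")
after <- tools::checkRdaFiles(rda_path)

# Compare size (bytes) and compression type before and after.
print(rbind(before, after)[, c("size", "compress")])
```

Running `tools::checkRdaFiles()` on the package's own `data/` directory and `R/sysdata.rda` gives the same report for the real files.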

CRAN R genomics • 1.1k views
5.9 years ago

You should create a separate data package to accompany your main package; see http://www.davekleinschmidt.com/r-packages/


Thanks @Satosh Anand. It looks like Dave Kleinschmidt has permanently hosted the data package in that example on GitHub, since it can't be hosted on CRAN.

I've done a bit of digging and found the following in the CRAN Repository Policy:

Packages on which a CRAN package depends should be available from a mainstream repository: if any mentioned in ‘Suggests’ or ‘Enhances’ fields are not from such a repository, where to obtain them at a repository should be specified in an ‘Additional_repositories’ field of the DESCRIPTION file (as a comma-separated list of repository URLs) or for other means of access, described in the ‘Description’ field. A package listed in ‘Suggests’ or ‘Enhances’ should be used conditionally in examples or tests if it cannot straightforwardly be installed on the major R platforms

According to this post on Stack Overflow, it seems that GitHub would qualify as a "mainstream repository" as required by the Policy.
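As a rough sketch of what the quoted policy text asks for: the data package goes in 'Suggests', its location goes in 'Additional_repositories', and every use of it is guarded so that examples and tests still run when it is absent. The package name `mypkgdata`, the dataset name `example_expression`, and the repository URL are all hypothetical. The URL assumes a CRAN-style repository served from GitHub Pages (for example one built with the drat package), since the field expects repository URLs rather than a link to a source repo.

```r
# DESCRIPTION fields for the main package (shown as comments; the package
# name and URL are hypothetical placeholders):
#   Suggests: mypkgdata
#   Additional_repositories: https://example.github.io/drat

# Returns TRUE only when the optional data package is installed.
has_data_pkg <- function(pkg = "mypkgdata") {
  requireNamespace(pkg, quietly = TRUE)
}

# Conditional use in an example, test, or vignette chunk.
if (has_data_pkg()) {
  # Load a (hypothetical) dataset from the data package.
  utils::data(list = "example_expression", package = "mypkgdata")
} else {
  message("Package 'mypkgdata' is not installed; skipping the large-data example.")
}
```

In testthat-based tests, `testthat::skip_if_not_installed("mypkgdata")` serves the same purpose as the `if` guard.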
