Hello,
I would like to use ORFik to determine the coverage of the different ORFs across the maize genome. I have Ribo-seq data, the latest annotation file (a GFF3), and the v5 genome FASTA file for B73.
After running my code, three Large CompressedGRangesList objects are created, and none of them has any values in seqlengths. The missing seqlengths seem to be causing the error. I downloaded the GFF3 from Ensembl Plants, so I expected it to work. Can I manually set the seqlengths of these Large CompressedGRangesList objects directly from the GFF3 file?
Here is my code (the error messages I get are right after):
# Import packages ----
library("ORFik", lib.loc = "~/Rlibs") # Loads the package
library("GenomicRanges", lib.loc = "~/Rlibs")
library("GenomicFeatures", lib.loc = "~/Rlibs")
# Specify files locations
#where_to_save_config <- "~/Bio_data/ORFik_config.csv"
#parent_folder <- "~/Bio_data"
bams <- "~/Documents/R/Ribo-seq/processed_data/"
gtf <- "~/Documents/R/Ribo-seq/references/Zea_mays.Zm-B73-REFERENCE-NAM-5.0.55.chr.gtf"
genome <- "~/Documents/R/Ribo-seq/references/Zm-B73-REFERENCE-NAM-5.0.fa"
exper.name <- "ORF_maize"
# Create the experiment:
template <- create.experiment(dir = bams, # directory of the NGS files for the experiment
  exper.name, # Experiment name
  txdb = gtf, # gtf / gff / gff.db annotation
  fa = genome, # Fasta genome
  organism = "Zea mays", # Scientific naming
  types = "bam",
  stage = c("V12","V12","14d","14d","14d","14d"),
  rep = c("1","2","1","2","1","2"),
  condition = c("ear1", "ear2", "leaf1", "leaf2", "root1", "root2"),
  fraction = c("OTHER","OTHER","OTHER","OTHER","OTHER","OTHER"),
  saveDir = NULL # Create template instead of ready experiment
)
df <- read.experiment(template) # read experiment from template
save.experiment(df, file = "~/Bio_data/ORFik_config.csv")
df
# PATH to bam files
filepath(df, type = "default")
# Loading NGS data to a specified environment
envExp(df) #This will be the environment
# Determining the library names in the ORFik experiment
bamVarName(df) #This will be the names
# Auto-loading the libraries to the environment
outputLibs(df) # With default output.mode = "envir".
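# Loading genomic annotations
txdb <- loadTxdb(df) # transcript annotation
# Presumably the step that creates the three GRangesLists (leaders, cds, trailers) used below
loadRegions(txdb, parts = c("leaders", "cds", "trailers"))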
# Make 100-bases-sized meta window for each library in experiment
transcriptWindow(leaders, cds, trailers, df, outdir = "~/Bio_data/",
  windowSize = 100)
shiftFootprintsByExperiment(df)
I get two error messages, which I believe have the same cause. I get this one when I run transcriptWindow():
transcriptWindow(leaders, cds, trailers, df, outdir = "~/Bio_data/", windowSize = 100)
/home/R/Ribo-seq/processed_data/ear2.unique.bam
Import genomic features from the file as a GRanges object ... OK
Prepare the 'metadata' data frame ... OK
Make the TxDb object ... OK
Sorting shifted footprints...
Error: BiocParallel errors
4 remote errors, element index: 1, 3, 4, 5
1 unevaluated and other errors
first remote error:
Error in covRleFromGR(x, weight = weight, ignore.strand = ignore.strand): Seqlengths of x contains NA values!
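Regarding the question of setting seqlengths by hand: a GFF3/GTF file usually does not carry chromosome lengths, so one option is to take them from the genome FASTA index instead. Below is a minimal sketch of that idea. `fill_seqlengths` is a hypothetical helper (not part of ORFik), and the commented usage assumes that `genome`, `leaders`, `cds` and `trailers` from the script above exist in the session. Whether this alone resolves the `covRleFromGR` error is not confirmed, since the NA seqlengths may sit on the reads rather than on the annotation.

```r
library(GenomicRanges)

# Hypothetical helper: copy chromosome lengths onto a GRanges/GRangesList
# for every sequence name it shares with a named vector of lengths.
fill_seqlengths <- function(x, fasta_lengths) {
  common <- intersect(seqlevels(x), names(fasta_lengths))
  if (length(common) == 0L) {
    stop("No shared sequence names; check for a naming mismatch such as 'chr1' vs '1'.")
  }
  seqlengths(x)[common] <- fasta_lengths[common]
  x
}

# Toy example with made-up lengths
toy <- GRangesList(tx1 = GRanges("1", IRanges(c(10, 50), width = 20), strand = "+"))
toy <- fill_seqlengths(toy, c("1" = 1000L, "2" = 2000L))
seqlengths(toy)

# Assumed usage with the real files (requires Rsamtools):
# library(Rsamtools)
# fa_path <- path.expand(genome)
# if (!file.exists(paste0(fa_path, ".fai"))) indexFa(fa_path)
# fasta_lengths <- seqlengths(FaFile(fa_path))
# leaders  <- fill_seqlengths(leaders,  fasta_lengths)
# cds      <- fill_seqlengths(cds,      fasta_lengths)
# trailers <- fill_seqlengths(trailers, fasta_lengths)
```

The helper stops early when no names overlap, because in that case assigning lengths would silently do nothing.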
I get this error when I run shiftFootprintsByExperiment():
shiftFootprintsByExperiment(df)
Shifting reads in experiment: ORF_maize
Saving ofst files to: /home/R/Ribo-seq/processed_data/pshifted/
Saving wig files to: /home/R/Ribo-seq/processed_data/pshifted/
Import genomic features from the file as a GRanges object ... OK
Prepare the 'metadata' data frame ... OK
Make the TxDb object ... OK
/home/R/Ribo-seq/processed_data/ear1.unique.bam
Import genomic features from the file as a GRanges object ... OK
Prepare the 'metadata' data frame ... OK
Make the TxDb object ... OK
Sorting shifted footprints...
RFP_ear1_V12_r1
Error in x$.self$finalize() : attempt to apply non-function
In addition: Warning message:
In .merge_two_Seqinfo_objects(x, y) :
The 2 combined objects have no sequence levels in common. (Use
suppressWarnings() to suppress this warning.)
RFP_ear2_V12_r2
RFP_leaf1_14d_r1
RFP_leaf2_14d_r2
Error: BiocParallel errors
0 remote errors, element index:
2 unevaluated and other errors
first remote error:
Thank you for your help!
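The warning "The 2 combined objects have no sequence levels in common" in the log above suggests that chromosome names may differ between the inputs (for example `1` in an Ensembl annotation versus `chr1` in the BAM or FASTA headers). This is only a guess based on the log. A minimal sketch for checking it follows. `compare_seqnames` is a hypothetical helper, and the commented usage assumes the `df`, `txdb` and `genome` objects from the script above.

```r
# Hypothetical helper: report which sequence names two sources share.
compare_seqnames <- function(a, b) {
  list(shared = intersect(a, b),
       only_a = setdiff(a, b),
       only_b = setdiff(b, a))
}

# Toy example: a "chr"-prefixed source versus an unprefixed one
res <- compare_seqnames(c("chr1", "chr2", "chr3"), c("1", "2", "3"))
length(res$shared) # 0 would mean the two sources cannot be matched as they are

# Assumed usage with the real files (requires Rsamtools and GenomeInfoDb):
# library(Rsamtools)
# bam_file   <- filepath(df, type = "default")[1]
# bam_names  <- names(scanBamHeader(bam_file)[[1]]$targets)
# txdb_names <- seqlevels(txdb)
# fa_names   <- names(seqlengths(FaFile(path.expand(genome))))
# compare_seqnames(bam_names, txdb_names)
# compare_seqnames(fa_names, txdb_names)
```

If the names do differ, renaming one side consistently (annotation, FASTA, or re-aligning the reads) would be the next thing to try.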
Hello,
I am also working with maize, but with the v4 genome. I tried to perform the same computations and got exactly the same error. Installing the latest ORFik version did not help.
What else can I try?
I would be grateful for any help!
Unfortunately, the link or post referred to by @hauken_heyken in the answer below was removed. Please reach out to them directly via GitHub or their website.