scATAC annotation file for zebrafish
0
1
Entering edit mode
21 months ago
ravihansa82 ▴ 130

Hi all,

I started to analyze scATAC Seq data. I obtained the data from SRA. I have a trouble regarding gene annotation. I used Danio_rerio.GRCz11.109.gtf file to create a GRange object using AcidGenomics package. Here is the r script I used for that:

DanioAnno <-makeGRangesFromGFF("Danio_rerio.GRCz11.109.gtf", level = c("genes", "transcripts"), ignoreVersion = T, synonyms = F)

It created the following output.

EnsemblGenes object with 32520 ranges and 8 metadata columns: (showing only the col names)

                       seqnames      ranges strand | broadClass    geneBiotype             geneId        geneIdVersion
                          <Rle>   <IRanges>  <Rle> |      <Rle>          <Rle>              <Rle>                <Rle>

                           geneName     geneSource         source  type
                              <Rle>          <Rle>          <Rle> <Rle>

Then I performed CreateChromatinAssay while passing the just created annotation file and then created the seurat object (sobj). Now it is giving me an error when I perform TSSEnrichment as the following code.

sobj <- TSSEnrichment(object = sobj, fast = FALSE)

Error in GetTSSPositions(ranges = annotations) : 
  Gene annotation does not contain gene_biotype information.

However, geneBiotype col is there. Is this error due to change of col name (geneBiotype instead of gene_biotype)?

Can someone can help me to figure out this error. I also appreciate it if you could share a method for gene annotation for zebrafish when performing scATAC analysis. Thank you

scRNA scATAC-Seq • 939 views
ADD COMMENT
0
Entering edit mode

Please use the formatting bar (especially the code option) to present your post better. You can use backticks for inline code (`text` becomes text), or select a chunk of text and use the highlighted button to format it as a code block. If your code has long lines with a single command, break those lines into multiple lines with proper escape sequences so they're easier to read and still run when copy-pasted. I've done it for you this time.
code_formatting

ADD REPLY
0
Entering edit mode

Hi @ravihansa82 , I'm the developer the AcidGenomes package. Let me know if this reproducible example helps with your scATAC workflow. Note that the package consistently uses lowerCamelCase formatting for metadata columns, and it looks like the workflow you're using expects snake_case formatting.

suppressPackageStartupMessages({
    library(AcidGenomes) # 0.5.1
    library(syntactic) # 0.6.6
})
file <- "https://ftp.ensembl.org/pub/release-110/gtf/danio_rerio/Danio_rerio.GRCz11.110.gtf.gz"
gr <- makeGRangesFromGFF(file = file, level = "genes", ignoreVersion = TRUE)
print(class(gr))
## [1] "EnsemblGenes"
## attr(,"package")
## [1] "AcidGenomes"
print(colnames(mcols(gr)))
##  [1] "broadClass"    "description"   "geneBiotype"   "geneId"       
##  [5] "geneIdVersion" "geneName"      "geneSource"    "geneSynonyms" 
##  [9] "ncbiGeneId"    "source"        "type"         
##  [1] "broadClass"    "description"   "geneBiotype"   "geneId"       
##  [5] "geneIdVersion" "geneName"      "geneSource"    "geneSynonyms" 
##  [9] "ncbiGeneId"    "source"        "type"
## Need to coerce from EnsemblGenes to GRanges so you can modify the mcols.
gr <- as(gr, "GRanges")
print(class(gr))
## [1] "GRanges"
## attr(,"package")
## [1] "GenomicRanges"
## [1] "GRanges"
## attr(,"package")
## [1] "GenomicRanges"
colnames(mcols(gr)) <- snakeCase(colnames(mcols(gr)))
print(colnames(mcols(gr)))
##  [1] "broad_class"     "description"     "gene_biotype"    "gene_id"        
##  [5] "gene_id_version" "gene_name"       "gene_source"     "gene_synonyms"  
##  [9] "ncbi_gene_id"    "source"          "type"           
##  [1] "broad_class"     "description"     "gene_biotype"    "gene_id"        
##  [5] "gene_id_version" "gene_name"       "gene_source"     "gene_synonyms"  
##  [9] "ncbi_gene_id"    "source"          "type"
ADD REPLY

Login before adding your answer.

Traffic: 2419 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6