Masking before RNA-Seq Alignment and Gene Prediction in Plant Genomes
17 months ago
Ayish • 0

Hello Experts,

I'm interested in conducting gene prediction for a plant genome using RNA sequencing data. To achieve this, I intend to align the RNA-seq reads to the genome and then run BRAKER2 for gene prediction. Before running BRAKER2, I'm considering soft-masking the genome. I would like to clarify whether I should mask the genome before creating the BAM alignment file, or whether I can align the RNA-seq reads to the unmasked genome. I thought it should not matter for RNA-seq data, because the reads come from exons, which I assume contain no long-range repeats.

I would appreciate your suggestions or guidance.

Thank you in advance.

RNA-evidence genome prediction masking • 1.0k views

Soft-masking the genome should not make any difference for STAR (unless something has changed). Depending on how your RNA-Seq was done (Illumina? Paired-end?), you may be able to improve the mapping by checking for overlaps between the forward and reverse reads of each pair. See "Merging and mapping of overlapping paired-end reads" in the STAR manual.
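
As a rough illustration of that manual section, here is a minimal sketch that builds a STAR command line with mate merging switched on. It assumes the `--peOverlapNbasesMin` and `--peOverlapMMp` parameters described in that section of the STAR manual. The index directory, read file names, thread count, and the overlap threshold of 10 are hypothetical placeholders, not recommendations; check the manual for values that suit your library.

```python
import shlex


def star_align_command(genome_dir, read1, read2, threads=8,
                       min_overlap=10, max_mismatch_frac=0.01):
    """Build a STAR argument list for paired-end alignment.

    min_overlap > 0 switches on merging of overlapping mates;
    max_mismatch_frac is the allowed mismatch proportion in the overlap.
    All default values here are illustrative assumptions.
    """
    return [
        "STAR",
        "--runThreadN", str(threads),
        "--genomeDir", genome_dir,
        "--readFilesIn", read1, read2,
        "--outSAMtype", "BAM", "SortedByCoordinate",
        "--peOverlapNbasesMin", str(min_overlap),
        "--peOverlapMMp", str(max_mismatch_frac),
    ]


# Hypothetical paths; replace with your own index and FASTQ files.
cmd = star_align_command("star_index", "reads_R1.fastq", "reads_R2.fastq")
print(shlex.join(cmd))
```

The printed string can be pasted into a shell, or the list can be passed to `subprocess.run(cmd, check=True)` once STAR is installed and the index has been built.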


Thank you for the reply. I have Illumina paired-end reads. Would it be fine to use a hard-masked genome for STAR and a soft-masked genome for BRAKER2? I will take a look at the section you mentioned.


I am not sure what you would gain by using a hard-masked genome for RNA-Seq mapping. Some reads will cover introns in any case. But if you hard-mask repeats, you risk truncating some exons or introducing an NNN gap into them because of, say, a 20 bp CA repeat.
Also, at least in some species, the genome may contain a large number of pseudogenes or gene fragments derived from real genes. I don't know how you obtained the repeat set for your species, but there is a chance that such tricky genes will be classified as repeats, and without any RNA-Seq mappings to at least a subset of them you will miss them.
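
To make the difference concrete, here is a toy sketch of what the two masking styles do to a sequence. The sequence, the repeat coordinates, and the helper names `soft_mask` and `hard_mask` are made up for illustration; real masking would come from a tool such as RepeatMasker. Soft-masking only lowercases the repeat, so the bases are still there for an aligner that ignores case, while hard-masking replaces them with N and the information is gone.

```python
def soft_mask(seq, repeats):
    """Lowercase bases inside each 0-based, half-open (start, end) interval."""
    bases = list(seq.upper())
    for start, end in repeats:
        for i in range(start, end):
            bases[i] = bases[i].lower()
    return "".join(bases)


def hard_mask(seq, repeats):
    """Replace bases inside each 0-based, half-open (start, end) interval with N."""
    bases = list(seq.upper())
    for start, end in repeats:
        for i in range(start, end):
            bases[i] = "N"
    return "".join(bases)


# Toy "exon" containing a 20 bp CA repeat (hypothetical sequence).
genome = "ATGGCG" + "CA" * 10 + "TTGCCTGA"
repeats = [(6, 26)]

soft = soft_mask(genome, repeats)
hard = hard_mask(genome, repeats)

print(soft)                    # repeat shown in lowercase
print(hard)                    # repeat replaced by N
print(soft.upper() == genome)  # soft-masking is reversible
print(hard.count("N"))         # bases lost to hard-masking
```

A read spanning this exon can still be matched base-for-base against the soft-masked version, but against the hard-masked version the middle 20 bases can only line up with Ns.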
