Hi
I am mapping RNAseq reads to the rat genome.
I am building
Each chromosome comes with a corresponding chrX_random.fa FASTA file.
Should I ignore these when building the splice junction libraries? It seems no genes map to the chrX_random.fa files anyway, according to the Ensembl annotation I got from UCSC (though I might be wrong about this?).
I realise I should still keep them for aligning reads against.
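One way to check this rather than guess would be to count annotation records per _random sequence. This is only a minimal sketch: it assumes a tab-delimited UCSC-style table in which the chromosome name sits in the third column (index 2, as in a genePred dump with a leading bin column). The function name and the column index are my own assumptions, so adjust chrom_col to match the actual file.

```python
from collections import Counter


def count_random_chrom_records(lines, chrom_col=2):
    """Count annotation records whose chromosome name ends in '_random'.

    lines     : iterable of tab-delimited text lines (e.g. an open file)
    chrom_col : 0-based index of the chromosome column (assumed, not verified)
    """
    counts = Counter()
    for line in lines:
        # Skip blank lines and header/comment lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) <= chrom_col:
            continue  # malformed or short line
        chrom = fields[chrom_col]
        if chrom.endswith("_random"):
            counts[chrom] += 1
    return counts


if __name__ == "__main__":
    # Made-up example records, for illustration only.
    demo = [
        "#bin\tname\tchrom\tstrand\ttxStart\ttxEnd",
        "585\tgeneA\tchr1\t+\t100\t200",
        "585\tgeneB\tchr1_random\t-\t5\t50",
    ]
    print(count_random_chrom_records(demo))
```

If the result comes back empty for the real annotation file, no transcripts are annotated on the _random sequences and there are no junctions to build from them.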
I am following these instructions:
http://useq.sourceforge.net/usageRNASeq.html
Many thanks.
Also: I am not using e.g. tophat because my reads are 34bp, and tophat states explicitly in the manual "The software is optimized for reads 75bp or longer."
Plus, I am not sure I like the idea of tophat realigning the originally unmapped reads in a "second round". Surely this is problematic with shorter reads, since they are more likely to map ambiguously?
Wouldn't it be better to align against the genome plus junctions in the same round, to give the junctions an "equal chance" of being mapped to as genomic regions? This seems especially important with short reads, which could map erroneously to pseudogenes more easily than longer or paired-end reads.
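To make the "genome plus junctions in the same round" idea concrete, here is a rough sketch of how a junction sequence could be built so that it sits alongside the chromosomes in one reference. It is not the USeq implementation: the function name, the min_overhang parameter and the flank rule (read length minus a minimum overhang) are assumptions chosen for illustration. The point is that a read can only fit on the junction sequence if it actually crosses the splice site by at least min_overhang bases.

```python
def junction_sequence(upstream_exon, downstream_exon, read_len, min_overhang=4):
    """Join the end of one exon to the start of the next.

    Each flank is (read_len - min_overhang) bases long, so any read of
    length read_len placed on the result must cover at least min_overhang
    bases on both sides of the splice site.
    """
    flank = read_len - min_overhang
    if flank <= 0:
        raise ValueError("read_len must be larger than min_overhang")
    # Exons shorter than the flank are simply used in full.
    return upstream_exon[-flank:] + downstream_exon[:flank]


if __name__ == "__main__":
    # Toy exon sequences, for illustration only.
    print(junction_sequence("AAAAACCCCC", "GGGGGTTTTT", read_len=6, min_overhang=2))
```

For 34bp reads and a 4-base minimum overhang this would give 30 bases either side of each junction. These sequences would then be appended to the genome FASTA before building the aligner index, so genomic and junction hits compete in a single pass.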
I want to make an exon junction BED file for the CAST mouse strain. I can't get ALEXA-seq to work because of an "unknown host" error when it connects to Ensembl via CVS. I imagine that is because it's quite old. Do you happen to have a newer version of the
alternativeExpressionDatabase/createExonJunctionDatabase.pl
program? Also, Perl is legit :)