Does anyone know where I can find the DNA sequence for human LINE-1 (L1)?
I understand that it has 2 open reading frames, so that equates in my head to maybe 2 gene symbols I should be looking for?
Querying the EBI ENA with the text 'human line repeat' gives 155 results. Top result is
M19503 Human Line-1 repeat mRNA with 2 open reading frames.
The accession number will fetch the relevant records from GenBank or EMBL.
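If you would rather pull the record by script than through the web page, here is a minimal sketch using the NCBI E-utilities `efetch` endpoint. The function names `efetch_url` and `fetch_record` are my own, not part of any library, and I'm assuming the `nuccore` database is the right one for this accession.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# NCBI E-utilities efetch endpoint
EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def efetch_url(accession, rettype="fasta"):
    """Build an efetch URL for a nucleotide accession (helper name is illustrative)."""
    params = {
        "db": "nuccore",      # assumption: nucleotide database
        "id": accession,
        "rettype": rettype,   # "fasta" for sequence only, "gb" for the GenBank flat file
        "retmode": "text",
    }
    return EFETCH + "?" + urlencode(params)


def fetch_record(accession, rettype="fasta"):
    """Download the record as text. Needs network access."""
    with urlopen(efetch_url(accession, rettype)) as response:
        return response.read().decode("utf-8")
```

Calling `fetch_record("M19503")` should return the FASTA text for the record above; `fetch_record("M19503", rettype="gb")` should give the full flat file with the references and feature table.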
PUBMED; 2454389. Skowronski J., Fanning T.G., Singer M.F.; "Unit-length line-1 transcripts in human teratocarcinoma cells"; Mol. Cell. Biol. 8(4):1385-1397(1988).
PUBMED; 1701022. Swergold G.D.; "Identification, characterization, and cell specificity of a human LINE-1 promoter"; Mol. Cell. Biol. 10(12):6718-6729(1990).
That EMBL record was last updated in 2008, but I don't know what was changed. There's no guarantee that any particular record is definitive in EMBL/GenBank.
Dropping this sequence into Ensembl BLAST hits ENSG00000240563, which is L1TD1 in HGNC nomenclature. From there, there are links to more recent publications.
Since LINEs are repeat elements (non-LTR retrotransposons, to be more exact), there are several families with somewhat differing sequences. They encode reverse transcriptase and apurinic-apyrimidinic endonuclease, but they are a bit more complex than that: http://gydb.org/index.php/Retroelements#Non-LTR_retroelements
There are indeed 2 ORFs, but the first encodes a gag-like region; it's the second that encodes the two functions above.
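To check the two ORFs yourself once you have a sequence, a rough sketch of a forward-strand ORF scan follows. This is a naive ATG-to-stop search written for illustration, not how the GenBank annotation was produced. The function name `find_orfs` and the `min_codons` cutoff are my own choices, and the toy sequence is made up; it is not L1.

```python
STOPS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=2):
    """Return sorted (start, end, frame) tuples for ATG..stop ORFs on the forward strand.

    start is 0-based; end is exclusive and includes the stop codon.
    min_codons counts codons from the ATG up to, but not including, the stop.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        # Step through this reading frame one codon at a time
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOPS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None
    return sorted(orfs)


# Made-up toy sequence with two short ORFs in different frames
toy = "CCATGAAATAGGGATGCCCTTTTGA"
print(find_orfs(toy))
```

On a real full-length element you would raise `min_codons` to something like 100 so that only the long reading frames are reported.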
That 1993 report is probably good for the sequence. You could search in dbRIP for sequences (advanced search, choose subfamily l1hs).
L1TD1 is the HUGO symbol for the above NM_019079, I believe.
I would recommend getting a free RepBase account (http://www.girinst.org/accountservices/register.php) and using the annotations in the EMBL-formatted RepBase records to identify the L1 subfamily relevant to your problem.
The Entrez Gene page for this gene gives two distinct mRNAs. So according to RefSeq there are two distinct transcripts, but they encode the same protein. The Ensembl page gives only one transcript, which is identical to one of the RefSeq transcripts. Since a picture is worth a thousand words, find below the transcripts from RefSeq and Ensembl for your gene of interest: [Figure: RefSeq and Ensembl transcripts for the gene] And if you want to keep track of it, I would suggest using its GeneID (54596) or its Ensembl ID (ENSG00000240563), because they are more stable than the HUGO gene name.
Thanks Keith. I'm a bit concerned that that sequence (GenBank M19503.1) might be a bit ancient - it's from 1993. Do you know if there are official HUGO gene symbols or RefSeqs associated with these genes? I can't seem to find any. Thanks
NM_019079 is a RefSeq accession from 2009 (linked from Ensembl above)