Question

RNA differential expression

9.8 years ago

capleo88 ▴ 20

Hi everyone,

I need help with my project; I don't know how to proceed with my results. Briefly, I had the entire transcriptomes of different strains of Lb. plantarum sequenced on the Illumina HiSeq. I got my results in FASTQ format (each file is about 12 GB) and processed them with the DNASTAR software. I mapped them against another strain and got a list of about 3000 genes with their relative expression levels. Unfortunately, the gene names are in a format (JDM1_RS05500, for example) that does not match anything in any database. If I search for it on KEGG or UniProt I get no results. The only one that returns a result is NCBI, but it leads me to the web page for the entire genome of the strain (I suggest you search for JDM1_RS05500 yourselves, just to get an idea).

This name corresponds to the locus_tag. I am trying to convert each of the 3000 genes to a format like this (agrB) so I can match it against the KEGG database and find the pathway it is involved in. There are two main problems, though: 1) it will take me ages to look up each of my 3000 genes; 2) once I've done it, I wouldn't know how to proceed anyway.

I understand how limited my knowledge of this field is (I'm sure even the slightly experienced among you have noticed), but I'm just asking for some suggestions to better understand where to start.

Thanks a million to everyone

map blast RNA-Seq kegg alignment • 1.7k views

updated 3.0 years ago by Ram 45k • written 9.8 years ago by capleo88 ▴ 20

Ram · Answer 1 · 2015-11-04

9.8 years ago

Antonio R. Franco ★ 5.2k

If you mapped the reads to a reference whose gene IDs are unusual, there is not much you can do with those identifiers as they are.

I managed to find a page where you can download the genome sequence, the associated GFF file describing the gene features, the sequence in GenBank format, and a tabular list with all kinds of details.

Maybe you can use this sequence and the GFF as the reference and map your data again.
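If remapping is not an option, the GFF alone may already be enough to translate the locus tags in bulk. Here is a minimal Python sketch of that idea. It assumes the file follows the usual NCBI RefSeq GFF3 layout, where the ninth column carries `locus_tag`, `gene`, `product` and `old_locus_tag` attributes; check your own file, since not every annotation has all of them. The function names and the demo rows (`EXAMPLE_RS00010`, `abcD`) are invented for illustration and are not real annotation.

```python
from urllib.parse import unquote


def parse_attributes(field):
    """Split a GFF3 attribute column ("key=value;key=value") into a dict."""
    attrs = {}
    for pair in field.strip().split(";"):
        if "=" in pair:
            key, value = pair.split("=", 1)
            # GFF3 percent-encodes reserved characters such as commas.
            attrs[key.strip()] = unquote(value.strip())
    return attrs


def locus_tag_table(gff_lines):
    """Map each locus_tag to its gene name, product and old locus tag.

    Gene and CDS rows share the same locus_tag, so the fields are merged
    across rows; anything the file does not provide stays None.
    """
    table = {}
    for line in gff_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip directives, comments and blank lines
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9:
            continue  # not a feature row
        attrs = parse_attributes(cols[8])
        tag = attrs.get("locus_tag")
        if tag is None:
            continue
        entry = table.setdefault(
            tag, {"gene": None, "product": None, "old_locus_tag": None}
        )
        for key in entry:
            if entry[key] is None and key in attrs:
                entry[key] = attrs[key]
    return table


# Made-up rows in the assumed layout, only to show the shape of the result.
demo_gff = (
    "##gff-version 3\n"
    "chr1\tRefSeq\tgene\t1\t300\t.\t+\t.\t"
    "ID=gene-EXAMPLE_RS00010;Name=abcD;gene=abcD;"
    "locus_tag=EXAMPLE_RS00010;old_locus_tag=EXAMPLE_0001\n"
    "chr1\tProtein Homology\tCDS\t1\t300\t.\t+\t0\t"
    "ID=cds-1;Parent=gene-EXAMPLE_RS00010;locus_tag=EXAMPLE_RS00010;"
    "product=made-up example protein%2C subunit D\n"
    "chr1\tRefSeq\tgene\t400\t900\t.\t-\t.\t"
    "ID=gene-EXAMPLE_RS00020;locus_tag=EXAMPLE_RS00020\n"
)

table = locus_tag_table(demo_gff.splitlines())
for tag, info in table.items():
    print(tag, info["gene"], info["product"], info["old_locus_tag"], sep="\t")
```

With a real file you would pass `open("annotation.gff")` (a hypothetical file name) instead of the demo lines. Many bacterial genes have no gene symbol at all, so expect `None` in the gene column for a good part of the 3000 tags; the product and the old locus tag are often the more useful fields.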

updated 5.7 years ago by Ram 45k • written 9.8 years ago by Antonio R. Franco ★ 5.2k

Thanks for your answer. Actually, I already have a similar table with all the protein products. Even in the genome you suggested there is no direct link to KEGG, so I can only search for each gene individually. I was just wondering how to proceed after that. Is there a way to cross-reference the genes with the pathways, or should I do it manually?

written 9.8 years ago by capleo88 ▴ 20
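On crossing the genes with pathways without doing it by hand: one possible route is the KEGG REST interface, whose `link` operation (`https://rest.kegg.jp/link/pathway/<org>`) returns one tab-separated gene/pathway pair per line. The Python sketch below parses that output and groups an expression table by pathway. Treat the details as assumptions to verify: the organism code for your strain has to be looked up in KEGG (it may be `lpj` for JDM1, but I have not confirmed this), and KEGG gene IDs may use the older locus tags rather than the `JDM1_RS…` style, in which case the tags need converting first. The function names and demo data are made up.

```python
from collections import defaultdict
from urllib.request import urlopen

KEGG_LINK_URL = "https://rest.kegg.jp/link/pathway/{org}"


def fetch_kegg_links(org):
    """Download the gene-to-pathway links for one KEGG organism code."""
    with urlopen(KEGG_LINK_URL.format(org=org)) as response:
        return response.read().decode("utf-8")


def parse_kegg_links(text):
    """Turn "org:gene<TAB>path:orgNNNNN" lines into {gene: {pathway, ...}}."""
    gene_to_pathways = defaultdict(set)
    for line in text.splitlines():
        parts = line.split("\t")
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        gene, pathway = parts
        # Drop the "org:" and "path:" prefixes, keep the bare identifiers.
        gene_to_pathways[gene.split(":", 1)[-1]].add(pathway.split(":", 1)[-1])
    return dict(gene_to_pathways)


def group_by_pathway(expression, gene_to_pathways):
    """Group {gene: expression value} by the pathways each gene belongs to."""
    by_pathway = defaultdict(dict)
    for gene, value in expression.items():
        for pathway in gene_to_pathways.get(gene, ()):
            by_pathway[pathway][gene] = value
    return dict(by_pathway)


# Made-up link text and expression values, only to show the shape of the data.
demo_links = (
    "xyz:GENE_0001\tpath:xyz00010\n"
    "xyz:GENE_0001\tpath:xyz01100\n"
    "xyz:GENE_0002\tpath:xyz00010\n"
)
demo_expression = {"GENE_0001": 2.5, "GENE_0002": -1.0, "GENE_0003": 0.3}

links = parse_kegg_links(demo_links)
grouped = group_by_pathway(demo_expression, links)
for pathway in sorted(grouped):
    print(pathway, grouped[pathway])
```

For real data you would replace `demo_links` with `fetch_kegg_links("<org>")`, using the organism code you found in KEGG. Genes with no pathway assignment (like `GENE_0003` in the demo) simply drop out of the grouped result, so it is worth counting how many of your genes are left unassigned.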