Hello!
I am trying to analyse differential gene expression in 83 samples split into three groups, following the protocol described by Pertea et al., 2016 (doi:10.1038/nprot.2016.095).
Now I have two CSV files for each pairwise comparison: one with the transcript results and the other with the gene results. I added a gene-name column to the gene results file following Freeze's instructions (https://www.biostars.org/p/218136/).
Many rows have a dot "." as the gene name. I suppose this means "unknown gene"; is that correct? However, when I look up one of these genes in the transcript results file, I find that transcripts under the same gene ID carry several different gene names (for MSTRG.18605: HLA-C, HLA-B and MIR6891). What is happening?
A technical question: what does StringTie do when a transcript matches multiple exomes/genes? How does StringTie count it?
Gene results:
      geneNames     feature  id           fc         pval          qval        exp_sig
1385  .             gene     MSTRG.18605  0.3612696  8.836918e-05  0.02780747  Downregulated
2009  .             gene     MSTRG.2251   0.3705158  1.723619e-04  0.03990178  Downregulated
3565  .             gene     MSTRG.31880  0.2855206  2.766333e-05  0.02210375  Downregulated
3577  .             gene     MSTRG.3192   0.4190300  2.500196e-04  0.04590761  Downregulated
7730  .             gene     MSTRG.52616  0.4974403  1.902062e-04  0.04187998  Downregulated
8791  LOC102724999  gene     MSTRG.57635  0.4391518  7.886925e-05  0.02780747  Downregulated
8833  RRM2          gene     MSTRG.5791   0.1491026  9.653982e-06  0.01517076  Downregulated
8941  .             gene     MSTRG.58419  0.4839879  2.913421e-04  0.04883999  Downregulated
9248  .             gene     MSTRG.7286   0.4853837  4.294071e-05  0.02455956  Downregulated
Rows for MSTRG.18605 in the transcript results file:
       geneNames  geneIDs      feature     id     fc         pval         qval
12632  .          MSTRG.18605  transcript  92336  0.8248940  0.348378059  0.8105198
12634  .          MSTRG.18605  transcript  92338  0.5838309  0.180876736  0.7302217
12635  .          MSTRG.18605  transcript  92339  0.4058340  0.051912182  0.5780383
12636  .          MSTRG.18605  transcript  92340  0.3070723  0.002123351  0.2270620
12637  .          MSTRG.18605  transcript  92341  0.8145933  0.510077645  0.8781143
12638  HLA-C      MSTRG.18605  transcript  92342  0.1351463  0.002860136  0.2504040
12639  .          MSTRG.18605  transcript  92343  1.0737243  0.810680044  0.9544281
12640  .          MSTRG.18605  transcript  92344  0.4356676  0.012957412  0.3988734
12641  .          MSTRG.18605  transcript  92345  0.5661032  0.014102596  0.4047605
12642  .          MSTRG.18605  transcript  92346  0.4012696  0.048469168  0.5719337
12643  .          MSTRG.18605  transcript  92347  0.2884399  0.004146397  0.2847510
12644  .          MSTRG.18605  transcript  92348  0.3167699  0.033571700  0.5162445
12645  .          MSTRG.18605  transcript  92349  0.3484434  0.017986318  0.4343761
12650  .          MSTRG.18605  transcript  92355  1.1018119  0.684596561  0.9242470
12656  .          MSTRG.18605  transcript  92362  1.2783697  0.407525370  0.8302648
12657  .          MSTRG.18605  transcript  92363  0.6356763  0.383143717  0.8199931
12658  .          MSTRG.18605  transcript  92364  0.9109425  0.752058633  0.9371627
12659  .          MSTRG.18605  transcript  92365  0.7067905  0.098177182  0.6499381
12661  .          MSTRG.18605  transcript  92367  0.7979155  0.371616108  0.8166484
12663  .          MSTRG.18605  transcript  92372  0.8521393  0.347975902  0.8105198
12664  .          MSTRG.18605  transcript  92373  1.1114767  0.630943495  0.9137496
12665  .          MSTRG.18605  transcript  92374  1.1819176  0.227935366  0.7639951
12667  .          MSTRG.18605  transcript  92376  0.8134280  0.242299301  0.7643093
12668  .          MSTRG.18605  transcript  92377  0.8685366  0.616908528  0.9100701
12669  HLA-B      MSTRG.18605  transcript  92378  0.4077784  0.080955383  0.6381334
12670  .          MSTRG.18605  transcript  92380  1.0450625  0.892268294  0.9766652
12671  HLA-B      MSTRG.18605  transcript  92381  0.8218658  0.667177666  0.9193687
12672  HLA-B      MSTRG.18605  transcript  92382  0.7802102  0.441900921  0.8454283
12673  .          MSTRG.18605  transcript  92383  1.1639513  0.778891293  0.9450891
12675  .          MSTRG.18605  transcript  92386  0.5742808  0.170972054  0.7188115
12676  MIR6891    MSTRG.18605  transcript  92387  0.7063845  0.054794258  0.5844075
12677  MIR6891    MSTRG.18605  transcript  92389  1.3352687  0.217143588  0.7592203
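To see at a glance which gene names sit under one MSTRG gene ID, the transcript rows can be grouped by geneIDs and the distinct names collected. This is only a minimal Python sketch: the function name `names_per_locus` is hypothetical, the rows are a handful copied from the transcript table above, and "." is assumed to mean "no gene name assigned".

```python
from collections import defaultdict

# A few rows copied from the transcript table: (geneNames, geneIDs, transcript id).
rows = [
    (".", "MSTRG.18605", "92336"),
    ("HLA-C", "MSTRG.18605", "92342"),
    ("HLA-B", "MSTRG.18605", "92378"),
    ("HLA-B", "MSTRG.18605", "92381"),
    ("MIR6891", "MSTRG.18605", "92387"),
]

def names_per_locus(rows):
    """Map each gene ID to the sorted distinct gene names of its transcripts.

    Rows whose gene name is "." (assumed to mean "no name assigned") are
    ignored, but their gene ID is still listed in the result.
    """
    names = defaultdict(set)
    for gene_name, gene_id, _transcript_id in rows:
        names[gene_id]  # make sure the gene ID appears even if unnamed
        if gene_name != ".":
            names[gene_id].add(gene_name)
    return {gene_id: sorted(found) for gene_id, found in names.items()}

for gene_id, found in names_per_locus(rows).items():
    print(gene_id, found)
```

A gene ID that maps to more than one name is a locus worth inspecting in the annotation before choosing a single name for the gene-level table.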
Did you check the coordinates of these genes/transcripts?
Hey, did you find an answer to your question?
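One way to check the coordinates is to pull the transcript lines for a single gene ID out of the merged GTF and list their spans next to any gene name. The sketch below is in Python and assumes a standard 9-column, tab-separated GTF whose transcript lines carry `gene_id` and `transcript_id` attributes and, where a reference gene was matched, `gene_name`. The function name `transcript_spans` is hypothetical, and the GTF lines are made-up placeholders, not coordinates from the real data.

```python
import re

# Hypothetical GTF lines: chromosome, coordinates, IDs and gene names are
# placeholders, NOT values from the real dataset.
gtf_lines = [
    'chrN\tStringTie\ttranscript\t1000\t5000\t1000\t-\t.\tgene_id "MSTRG.1"; transcript_id "MSTRG.1.1"; gene_name "GENE_A";',
    'chrN\tStringTie\texon\t1000\t5000\t1000\t-\t.\tgene_id "MSTRG.1"; transcript_id "MSTRG.1.1"; exon_number "1";',
    'chrN\tStringTie\ttranscript\t4500\t9000\t1000\t-\t.\tgene_id "MSTRG.1"; transcript_id "MSTRG.1.2";',
    'chrN\tStringTie\ttranscript\t8500\t12000\t1000\t-\t.\tgene_id "MSTRG.1"; transcript_id "MSTRG.1.3"; gene_name "GENE_B";',
    'chrN\tStringTie\ttranscript\t20000\t21000\t1000\t+\t.\tgene_id "MSTRG.2"; transcript_id "MSTRG.2.1";',
]

# Matches GTF attributes of the form: key "value";
ATTR_RE = re.compile(r'(\S+) "([^"]*)";')

def transcript_spans(lines, gene_id):
    """Return (transcript_id, chrom, start, end, strand, gene_name) tuples
    for the transcript features of one gene ID, sorted by position."""
    spans = []
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 9 or fields[2] != "transcript":
            continue  # keep only well-formed transcript features
        attrs = dict(ATTR_RE.findall(fields[8]))
        if attrs.get("gene_id") == gene_id:
            spans.append((
                attrs.get("transcript_id"),
                fields[0],
                int(fields[3]),
                int(fields[4]),
                fields[6],
                attrs.get("gene_name", "."),  # "." when no name is present
            ))
    return sorted(spans, key=lambda s: (s[1], s[2]))

for span in transcript_spans(gtf_lines, "MSTRG.1"):
    print(*span, sep="\t")
```

With a real file, the list could be replaced by an open file handle, since the function only iterates over lines. Overlapping spans with different gene names would show whether the named and unnamed transcripts occupy the same region.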