3.4 years ago
tiagobellintani
Hi friends, how are you? I need your help with my study project. I have RNA-seq data for 3 species (6 replicates per species); the species form a genus within the bedbugs. I would like to analyze differential expression among these species, but I am not sure whether the results I get are real or a methodological bias. At the moment I am using all assemblies (18 = 6 per species) as the reference for mapping and quantifying expression. The results are interesting, but I don't know whether they could be due to technical bias.
I am sharing my results below.
I used:
# Build the reference index from all assemblies (n = 18); reference.fasta holds all assemblies
kallisto index -i reference.idx reference.fasta
# Quantify each sample (paired-end)
kallisto quant -i reference.idx -o output --rf-stranded -b 100 r1.fasta r2.fasta
## Estimates
Trinity/util/abundance_estimates_to_matrix.pl \
--est_method kallisto --gene_trans_map reference.fasta.gene_trans_map \
--name_sample_by_basedir --cross_sample_norm TMM --out_prefix outdir \
sample1, sample2 ...sample18
Trinity/Analysis/DifferentialExpression/run_DE_analysis.pl --matrix gene.counts.matrix --method edgeR --output out --dispersion 0.1
Trinity/Analysis/DifferentialExpression/analyze_diff_expr.pl --matrix gene.TMM.EXPR.matrix --max_genes_clust 1000000 -P 1e-3 -C 4
Trinity/Analysis/DifferentialExpression/define_clusters_by_cutting_tree.pl -R diffExpr.P1e-3_C4.matrix.RData --Ptree 60
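The per-sample quantification step above has to be repeated for all 18 samples. A minimal sketch of how that loop might look is given here as a dry run that only prints the commands. The sample names (`sp1_rep1` and so on) and the read-file naming pattern are hypothetical placeholders, not the real file names. The kallisto options are copied from the command above.

```shell
#!/usr/bin/env bash
# Sketch: print one kallisto quant command per sample (dry run, nothing is executed).
# ASSUMPTION: sample names and read-file names below are hypothetical placeholders.

# Build the kallisto quant command line for a single sample.
build_quant_cmd() {
    local sample="$1"
    echo "kallisto quant -i reference.idx -o ${sample} --rf-stranded -b 100 ${sample}_r1.fasta ${sample}_r2.fasta"
}

# Print the commands for every sample: 3 species x 6 replicates = 18.
list_quant_cmds() {
    local species rep
    for species in sp1 sp2 sp3; do
        for rep in 1 2 3 4 5 6; do
            build_quant_cmd "${species}_rep${rep}"
        done
    done
}

list_quant_cmds
```

Once the printed commands look right, the output can be piped to `bash` to run them. Writing each sample to its own output directory matters here because `--name_sample_by_basedir` takes the sample name from that directory.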
What do you think the next steps should be so that I can trust my results, or improve them?
Do you know whether each taxon was sequenced as a separate batch, or whether all the samples were sequenced together? The samples of each taxon cluster together; this can be biologically meaningful, but it could also be due to a batch effect.
Hi, buddy, sorry for the delay. All samples were sequenced in the same batch and under the same conditions. My concern is that, because I don't have a reference genome or transcriptome, I may be getting expression estimates that are not real. I chose to combine all assemblies (n = 18, 6 per species) into one reference and analyze every sample against this "super reference". My collaborators are unsure about the results, although the methodology is consistent with "good practices".
I would like to be certain that I can continue with this study.