Hello,
I am the author who developed the code of the iso-kTSP software you were trying to use. I developed it as the final project of my bachelor's degree in computer engineering, and once it was done and working, I ended my collaboration with that research group and stopped paying attention to this software. So, I am sorry for taking so long to answer you; hopefully I can still be helpful.
I've managed to reproduce your error with the test dataset you provided and I found the reason why it gives you this error. Let me explain.
The "normal" mode of the algorithm has two steps: a first step that runs over partitions of the samples to find the best k (the optimal number of pairs for the predictive model), and a final step that runs over all the data to find the final model. It is the first step that is giving you problems. You only have 3 tumor samples, whereas the default number of iterations to find the best k is 10, and you can't partition 3 samples into 10 iterations. The number of iterations should not be greater than the number of samples in the least represented class. The option -n lets you set the number of iterations for the first step, so if you run the program with "-n 3" it works with your dataset. Still, 3 tumor samples and 8 normal ones is a small sample size for this algorithm, so don't expect great results.
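The rule above (iterations no greater than the smallest class) can be checked before running the program. The following is only a sketch, not part of iso-kTSP: it assumes the class suffixes are "tumor" and "normal" and that sample labels in the header row end with those suffixes; the function name `max_iterations` and the example labels are made up for illustration.

```python
from collections import Counter


def max_iterations(header_line, suffixes=("tumor", "normal")):
    """Return the size of the least represented class in the header row.

    Assumption: each sample label ends with its class suffix. Cells that
    match no suffix (for example an ID column label) are ignored.
    """
    labels = header_line.rstrip("\n").split("\t")
    counts = Counter()
    for label in labels:
        for suffix in suffixes:
            if label.endswith(suffix):
                counts[suffix] += 1
                break
    missing = [s for s in suffixes if counts[s] == 0]
    if missing:
        raise ValueError(f"no samples found for class suffix(es): {missing}")
    return min(counts[s] for s in suffixes)


# Hypothetical header with 3 tumor and 8 normal samples, as in the question.
header = "\t".join(
    [f"T{i}_tumor" for i in range(1, 4)] + [f"N{i}_normal" for i in range(1, 9)]
)
n = max_iterations(header)
print(f"java -jar iso-kTSP_v1.0.3.jar -n {n} input_file")
# prints: java -jar iso-kTSP_v1.0.3.jar -n 3 input_file
```

Any value of -n up to the number returned here should avoid the partitioning error; with very small classes the results will still be of limited value.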
I did program a check for this error, and it was intended to give a meaningful message instead of the Java exceptions; however, it seems that the check isn't working. I apologise for the inconvenience.
Also, I should warn you that the example dataset you provided here won't work as intended with the isoform version of the algorithm (the -i option), because you only have one isoform for each gene. The isoform version only considers pairs formed by two isoforms of the same gene, so if there are no genes with at least 2 isoforms, it won't be able to form any pairs.
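To see in advance whether a dataset can produce any isoform pairs, one can list the genes that have at least two isoforms. This is a sketch under assumptions, not code from iso-kTSP: it assumes the first column holds "gene_id,isoform_id" as the manual's input format describes and that the first line is the header row; the function name and the sample rows are hypothetical.

```python
from collections import defaultdict


def genes_with_multiple_isoforms(lines):
    """Return the sorted gene IDs that appear with two or more distinct isoforms.

    Assumptions: lines[0] is the header row, and each data row starts with
    "gene_id,isoform_id" followed by tab-separated values.
    """
    isoforms = defaultdict(set)
    for line in lines[1:]:  # skip the header row
        if not line.strip():
            continue  # ignore blank lines
        first_column = line.split("\t", 1)[0]
        gene_id, separator, isoform_id = first_column.partition(",")
        if separator:  # only rows that actually carry an isoform ID
            isoforms[gene_id].add(isoform_id)
    return sorted(gene for gene, found in isoforms.items() if len(found) >= 2)


# Hypothetical rows: g1 has two isoforms, g2 only one.
example = [
    "id\tA_tumor\tB_normal",
    "g1,iso1\t1\t2",
    "g1,iso2\t3\t4",
    "g2,iso1\t5\t6",
]
print(genes_with_multiple_isoforms(example))  # prints: ['g1']
```

If the returned list is empty, the -i mode has no same-gene pairs to work with.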
In case you want examples of the full datasets used in the article we published, you can find them here: https://figshare.com/articles/TCGA_Iso_kTSP_analysis_dataset/1061917 . For gene datasets, look for the files named "_gene_read_paired", and for isoform datasets check the ones named "_iso_read_paired".
Once again, I am sorry for being so late with this reply. If you have any other questions, please do ask.
Hi, can you give us your command line? By the way, the documentation doesn't give the Java version or the dependencies, at least not at the links you gave.
Yes, that is the problem. The manual is too simple! My command line is:
java -jar iso-kTSP_v1.0.3.jar input_file
Below is part of my input file; you may check and test with it.
Please use ADD REPLY/ADD COMMENT when responding to existing posts to keep the threads logically organized.
Is there any example of input data in the documentation? You could run a first test with it to check that the software runs properly. I don't see a header in your input example, and there are blank lines.
There is no example file. At least I didn't find one. There is a paragraph which describes the basic structure of the input file. Below is this description:
Examples of calls:
java -jar iso-kTSP gene_seq.txt
java -jar iso-kTSP -o out_iso_analysis.txt -i -k 12 iso_data.tab
java -jar iso-kTSP -o out_iso_analysis.txt /home/user/iso_data.tab -c tumor normal -i -n 15 -s 40 -k 4
Input format: The expected format for the input dataset is a tab-separated plain text file (with any extension), where the first row contains the sample labels with suffixes to differentiate between samples belonging to different classes, not necessarily paired. Subsequent lines contain the "gene_id", or "gene_id,isoform_id" for isoforms, in the first column followed by the sample data values (in any numerical format that java can parse), in the same order as in the first row.
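The format described above can be sanity-checked before calling the program, which would catch the missing header and the blank lines mentioned earlier in the thread. This is a rough sketch, not part of iso-kTSP: it assumes the class suffixes are "tumor" and "normal", it uses Python's `float` as a stand-in for "any numerical format that java can parse" (the two are not identical), and the function name `check_input` is made up for illustration.

```python
def check_input(lines, suffixes=("tumor", "normal")):
    """Return a list of human-readable problems found in the dataset lines."""
    if not lines:
        return ["file is empty"]
    problems = []
    header = lines[0].rstrip("\n").split("\t")
    # Sample labels are the header cells ending with a known class suffix
    # (assumption); any other cell, such as an ID column label, is ignored.
    labels = [cell for cell in header if cell.endswith(tuple(suffixes))]
    if not labels:
        problems.append(
            "line 1: no sample labels with a class suffix (missing header row?)"
        )
    for number, line in enumerate(lines[1:], start=2):
        if not line.strip():
            problems.append(f"line {number}: blank line")
            continue
        row_id, *values = line.rstrip("\n").split("\t")
        if labels and len(values) != len(labels):
            problems.append(
                f"line {number}: {len(values)} values for {len(labels)} samples"
            )
        for value in values:
            try:
                float(value)  # approximation of what Java would accept
            except ValueError:
                problems.append(f"line {number}: non-numeric value {value!r}")
    return problems


# Hypothetical file content: no header row, one blank line, one bad value.
bad = ["g1\t1.0\t2.5\n", "\n", "g2\tx\t3\n"]
for problem in check_input(bad):
    print(problem)
```

An empty list means none of these particular problems were found; it does not guarantee that the program will accept the file.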