Conversation
PS: I've added two warnings while generating the command line: one about absolute paths being used (e.g., for the [...]).
With the logic pulled over from [...], I've not yet checked all the failed tests; however, here is a short summary that already accounts for a substantial portion of the 148 failures I saw:
Loading saved models requires the full path. In the current regression, the saved model files are at smlp/repo/smlp/regr_smlp/models/. Here is an example of a generated command:

../../src/run_smlp.py -model_name "../models/Test5_smlp_toy_num_resp_mult" -out_dir ./ -pref Test15 -mode predict -resp y1 -feat x,p1,p2 -model dt_caret -save_model f -use_model t -mrmr_pred 0 -plots f -pred_plots f -resp_plots f -seed 10 -log_time f -new_dat "../data/smlp_toy_num_resp_mult_pred_labeled.csv"

The path ../models/Test5_smlp_toy_num_resp_mult is created using the value in the first column of the smlp_regr.csv file at the corresponding row (row 15 in this case), which looks like this: [...] Commenting just in case this clarifies/helps.
Is this PR intended for the current smlp repo? In case the recovered test data is not yet in master, I think it is higher priority -- I mean the test input data that Franz recovered, after my renaming of input, output and knob names to use the same convention (x, y, p). Also, I am waiting for the new repo (with limited license) to be opened.
Probably for the future, but in place of -model_name I would use -model, referring to the model file. The model name can easily be extracted from the file name.
…On Thu, 19 Mar 2026 at 11:05, Franz Brauße ***@***.***> wrote:
*fbrausse* left a comment (SMLP-Systems/smlp#61)
<#61 (comment)>
> Loading saved models requires full path.

Does "full path" mean "absolute path"? I am observing that the -out_dir value is prepended to the -model_name. That means those options are not independent of one another, and basically only work when -out_dir is set to ./. See the error message I posted for the generated command of Test22, where *only* the -out_dir is different.

> The path ../models/Test5_smlp_toy_num_resp_mult is created using the value in the smlp_regr.csv file in the first column and corresponding row (row 15 in this case), [...]

Indeed, the command line is generated this way. But running SMLP somehow expects -model_name to be relative to -out_dir, which I consider to be a bug that prevents this PR from progressing.

> Is this PR intended for current smlp repo?

Yes, a proper way to test that doesn't take hours and is a bit more flexible, while also using standard tools such as pytest, is essential for the more invasive changes regarding pySMT, etc. I cannot use smlp_regr.py reliably here because it randomly hangs with -w 16 and I don't have the time to wait for hours until it (hopefully) finishes with -w 1.

> In case the recovered test data is not yet in master, I think it is higher priority -- I mean the test input data that Franz recovered, after my renaming of input, output and knob names to use the same convention (x, y, p).

This will be a next step after I can reliably run the tests.
model_name is used both for saving models and loading models. A saved model consists of many files, and model_name is used to identify these files when loading the model, and to name them when saving the model. In SMLP, out_dir is used for all files generated by a run. These files include the saved model files; that is why the output directory is prefixed to the saved model file names to generate a full path. The full path can be absolute or relative, depending of course on whether one supplies an absolute or relative path as the argument to -out_dir.
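The prefixing behaviour can be illustrated with a small sketch (the join rule is an assumption inferred from this discussion, not the actual SMLP code), which also shows why the Test22 command fails while the same -model_name works with -out_dir ./:

```python
import os

def resolve_model_path(out_dir: str, model_name: str) -> str:
    # -out_dir is prepended to -model_name, so a relative model_name is
    # interpreted relative to out_dir, not to the current working directory.
    return os.path.join(out_dir, model_name)

# Works as intended when out_dir is "./":
print(resolve_model_path("./", "../models/Test5_smlp_toy_num_resp_mult"))
# With any other out_dir, the same relative model_name points elsewhere:
print(resolve_model_path("/tmp/pytest-of-kane/pytest-1223/test0", "../models/test22_model"))
```

In the second case the resolved path is relative to the pytest temporary directory, where no models/ directory exists.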
Is there a way to avoid overwriting those model files?
Franz, can you clarify your question:
Sure. The problem I'm facing is that there are regression tests that read models and expect to find them in a location determined by [...]
Loading/reading models in regression is done from the saved models directory -- directory smlp/repo/smlp/regr_smlp/models/, located in parallel to the code, data, master (and several other) directories. When SMLP is run and saves models, the saved model goes into the output directory. When SMLP runs a saved model, it takes the models from the models directory, so if the same command also writes/saves a model there will be no clash. In fact, as far as I remember, SMLP does not permit having -save_model t and -use_model t in one command, because it does not make sense.
If for some reason a regression test model changes and the model is saved, and it is reused in some other test that reads that model, the models must be updated in the models/ directory. The smlp_regr.py script supports that: when the newly saved model in the output directory is compared to the saved model in the models directory and a mismatch is found, the regression script asks what to do with the diff -- accept the diff and update the models/ and/or master/ directories, or ignore the diff.
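A minimal sketch of that compare-and-prompt loop (purely illustrative; the function name, file matching and prompt wording are assumptions, not the actual smlp_regr.py code):

```python
import filecmp
import shutil
from pathlib import Path

def review_model_diffs(out_dir: Path, models_dir: Path, ask=input) -> list:
    """Compare freshly saved model files against the golden copies in
    models/; on mismatch, ask whether to accept (update the golden copy)
    or ignore the diff. Returns the list of updated golden files."""
    updated = []
    for new_file in sorted(p for p in out_dir.iterdir() if p.is_file()):
        golden = models_dir / new_file.name
        if golden.exists() and filecmp.cmp(new_file, golden, shallow=False):
            continue  # identical to the golden copy, nothing to do
        answer = ask(f"{new_file.name} differs from {golden} -- accept diff? [y/N] ")
        if answer.strip().lower() == "y":
            shutil.copy2(new_file, golden)  # accept: update the golden model
            updated.append(golden)
    return updated
```

Passing a callable for `ask` keeps the prompt testable; the real script may also offer updating the master/ directory, which is omitted here.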
I think we're still not talking about the same thing. You describe what happens when running [...]. I am modifying only the [...]. This is what happens: [...] Note the path in the last line of this error message. It is constructed from putting the [...]
Working around this [...]

Edit: Out of the 41 failing, 13 already fail on master for me. Investigating the remaining failed ones...
Hi Franz, when you run:

../../src/run_smlp.py -model_name ../models/test22_model -out_dir /tmp/pytest-of-kane/pytest-1223/test0 -pref Test22 -mode predict -resp 'PF ,|PF |' -model poly_sklearn -save_model f -use_model t -pred_plots t -resp_plots t -data_scaler none -mrmr_pred 0 -plots f -seed 10 -log_time f -new_dat ../data/smlp_toy_num_metasymbol_mult_reg_pred_labeled.csv

does the directory /tmp/pytest-of-kane/pytest-1223/test0 exist?
Yes, it does exist before I run the command (and still does after).
The same error, except the FileNotFoundError's message changes: [...]
Outcome: All 41 that fail with this PR already fail on master for me. Here is a summary:

Why the [...] Thus, this PR performs no worse than the current master, at least on my installation.

Edit: 104 and 107 are expected to fail; this is handled as intended by pytest now.
Thanks, Zurab! I've marked 104 and 107 as expected to fail.
Do you remember what changes fixed this keras "(0,) in the loss argument" issue?
I have not finished recovering all those missing files. It has been quite some time by now since I worked on that. I plan to return to it once I can reliably run the tests.
The fixes for keras version compatibility are in src/smlp_py/train_keras.py -- the changes are conditional with respect to: [...] It makes sense to grep for "3.0.0" or for keras.version to make sure all relevant changes are in train_keras.py.
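An illustrative helper for the kind of version guard described here (the actual conditionals in train_keras.py may look different):

```python
def is_keras3(keras_version: str) -> bool:
    # True for keras >= 3.0.0, judged by the major component of a
    # version string such as keras.__version__ (e.g. "3.13.2" or "2.15.0").
    return int(keras_version.split(".")[0]) >= 3
```

For example, `is_keras3("3.13.2")` is true while `is_keras3("2.15.0")` is false, matching the two keras environments tried in this thread.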
With keras-3.13.2 installed (as before), I now get [...]. Will try the keras-3 patches from your nlp_text branch next.

Edit: However, they are not in master yet, so we'll have to see whether it might make sense to extract them from your branch as a separate PR.
With keras-2.15.0 and tensorflow<2.16: [...] (I skipped Tests 154, 155 and 157 manually because they were taking too long).
Result with this: 41 failed, 150 passed, 36 skipped, 43 warnings in 483.56s (0:08:03)
Fixing access to the non-existing private attribute '_metrics', occurring e.g. when running regression test #4:

Traceback (most recent call last):
  File "/home/kane/intel/smlp/regr_smlp/code/../../src/run_smlp.py", line 18, in <module>
    main(sys.argv)
  File "/home/kane/intel/smlp/regr_smlp/code/../../src/run_smlp.py", line 15, in main
    smlpInst.smlp_flow()
  File "/home/kane/intel/smlp/src/smlp_py/smlp_flows.py", line 317, in smlp_flow
    model = self.modelInst.build_models(args.model, X, y, X_train, y_train, X_test, y_test, X_new, y_new,
  File "/home/kane/intel/smlp/src/smlp_py/smlp_models.py", line 472, in build_models
    model = self.model_train(feat_names_dict, resp_names, algo, X_train, X_test, y_train, y_test,
  File "/home/kane/intel/smlp/src/smlp_py/smlp_models.py", line 353, in model_train
    model = self._instKeras.keras_main(resp_names, keras_algo, X_train, X_test, y_train, y_test, hparams_dict, plots,
  File "/home/kane/intel/smlp/src/smlp_py/train_keras.py", line 924, in keras_main
    model = self._keras_train_multi_response(resp_names, algo,
  File "/home/kane/intel/smlp/src/smlp_py/train_keras.py", line 865, in _keras_train_multi_response
    history = self._nn_train(model, epochs, batch_size, weights_precision, self.model_checkpoint_pattern,
  File "/home/kane/intel/smlp/src/smlp_py/train_keras.py", line 475, in _nn_train
    self._log_model_summary(model, epochs, batch_size, sample_weights_dict, callbacks)
  File "/home/kane/intel/smlp/src/smlp_py/train_keras.py", line 380, in _log_model_summary
    compiled_metrics = model.compiled_metrics._metrics  # Access the private _metrics attribute
AttributeError: 'DeprecatedCompiledMetric' object has no attribute '_metrics'
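One defensive way to rewrite the failing access (a sketch; the fallback to the public `metrics` attribute is an assumption about newer keras, not verified against train_keras.py):

```python
def compiled_metric_names(model) -> list:
    # Prefer the private attribute used by older keras/tensorflow versions,
    # fall back to the public `metrics` list, and tolerate both being absent.
    cm = getattr(model, "compiled_metrics", None)
    metrics = getattr(cm, "_metrics", None)  # old private attribute
    if metrics is None:
        metrics = getattr(model, "metrics", None) or []  # public fallback
    return [getattr(m, "name", str(m)) for m in metrics]
```

This keeps _log_model_summary working on both keras versions instead of raising AttributeError on 'DeprecatedCompiledMetric'.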
The test script "test_cmdline.py" was generated by a script from the smlp_regr.csv file. The logic for constructing the arguments for the command line tests has been reverse engineered from the original smlp_regr.py script.

For now, this is just the beginning. Comparing the generated files and the outputs on stdout/stderr for a test run remains to be done. The commands generated by pytest exactly match those smlp_regr.py produces (modulo -out_dir, which is produced by pytest now and no longer defaults to ./). The script convert-csv-pytest.py was used to convert the .csv to the new "test_cmdline.py", and cmp.py was used to check that the generated command lines exactly match those of smlp_regr.py. Those files are not part of this PR.

I intend to finish implementing the check of the result files according to the logic of the current regression script in this PR and will mark it as draft until then.

In the long run, I think it would be better to replace the "compare log and output files for changes" approach with a more robust check for the properties actually supposed to be tested -- we discussed this earlier. The approach taken in this PR, putting the base functionality into a separate class living in lib.py, looks extensible enough to me to support those future checks.
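The CSV-to-pytest generation could look roughly like this (an illustrative sketch; the column layout, with the test name in column 0 and raw arguments in column 1, and the helper names are assumptions based on this thread, not the actual convert-csv-pytest.py output):

```python
import csv
import shlex

def load_cases(csv_path: str) -> list:
    # One candidate test case per non-empty row of the regression CSV.
    with open(csv_path, newline="") as f:
        return [row for row in csv.reader(f) if row]

def build_command(row: list, out_dir: str) -> list:
    # Construct the run_smlp.py invocation for one CSV row. pytest would
    # supply a per-test out_dir (tmp_path), so -out_dir no longer defaults
    # to ./ as it did with smlp_regr.py.
    return ["../../src/run_smlp.py", "-out_dir", out_dir, *shlex.split(row[1])]
```

The generated test_cmdline.py would then wrap this in @pytest.mark.parametrize and run each command with subprocess, asserting on the exit code and, eventually, on the produced files.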