Conversation
Hi Dylan, thanks for your PR.
The tests run well on my end. I made a small change to the unit test to prevent it from running indefinitely, and added code to calculate processing time. I'm not sure whether the results need to be saved to Tiled instead of the local .db; a Tiled publisher could be added in the future if needed.
A companion PR is available at als-computing/arroyosas#19, which reads the tiled_url from the local .db file and sends it to LSE.
I made some changes to this file to fix a Python unit test that wouldn’t stop running.
I added code to calculate processing time for both the autoencoder and the dimension reduction.
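The PR itself isn't shown here, but a minimal sketch of this kind of timing instrumentation, using a generic helper (the function and pipeline names below are illustrative, not the actual project API):

```python
import time

def timed(fn, *args, **kwargs):
    """Run fn and return (result, elapsed_seconds) using a monotonic clock."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start

# Hypothetical usage; autoencoder/reducer are stand-ins for the real pipeline:
# latent, ae_seconds = timed(autoencoder.encode, frame)
# coords, dr_seconds = timed(reducer.transform, latent)
```

`time.perf_counter()` is preferable to `time.time()` here because it is monotonic and has the highest available resolution for interval measurement.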
@xiaoyachong, thanks. This feels mergeable to me. Do you agree?
Let's see if @taxe10 has any comments, and then we can merge it. Tanny and I have scheduled a commit day next Monday to merge all the changes made for SMI. We can include this then if that works for you.
Great.
Looks great! One thought - should this storage live in Tiled instead of a separate DB? A custom DB gives nice querying options (like by timestamp), but what we really need right now is experiment replay. Are the feature vectors meant to persist across experiments, or just capture the latest results? Since the offline version already uses Tiled, it might be cleaner to keep things consistent across both modes. My only concern is whether Tiled can handle complex queries as @dylanmcreynolds mentioned before, e.g., timestamps, energy ranges, or other metadata, as flexibly as a DB.
@taxe10 yeah, I've thought about this a lot. I think I've landed on "the two aren't mutually exclusive". While the vectors probably more appropriately belong in Tiled, they don't take much space here. This publisher and a publisher that pushes results to Tiled can happily co-exist. Tiled has pretty good query options, but right now it would take customization of the Tiled database to add indexes where appropriate if we wanted to ask complicated questions like "show me all runs where the second dim reduction took over 100 milliseconds".
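For illustration, the kind of query a local .db makes easy. The schema below is entirely hypothetical (the actual .db layout isn't shown in this thread); it just demonstrates the "second dim reduction over 100 ms" example:

```python
import sqlite3

# Hypothetical schema for illustration only; the real .db layout may differ.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE runs (uuid TEXT, stage TEXT, elapsed_ms REAL, ts REAL)"
)
conn.executemany(
    "INSERT INTO runs VALUES (?, ?, ?, ?)",
    [
        ("a1", "dim_reduction_2", 120.0, 1.0),
        ("b2", "dim_reduction_2", 80.0, 2.0),
    ],
)

# "Show me all runs where the second dim reduction took over 100 milliseconds"
slow = conn.execute(
    "SELECT uuid FROM runs WHERE stage = 'dim_reduction_2' AND elapsed_ms > 100"
).fetchall()
print(slow)  # [('a1',)]
```

With an index on `(stage, elapsed_ms)` this stays fast as the table grows, which is the flexibility being weighed against Tiled's built-in query options.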
@taxe10 @dylanmcreynolds |
dylanmcreynolds left a comment
The new tiled publisher is great!! I would like to think of ways to make an abstract class for tiled publishers that does some of this work...especially all the handing off of async/threads that needs to be done to talk to the non-async tiled client. But we can do that refactor in the future.
For now, though, we need to change things to not call the Tiled client or numpy from within async code directly.
import traceback
logger.error(traceback.format_exc())
async def _write_table_to_tiled(self, table_key):
Should be in a thread
I modified the threading as you suggested. Meanwhile, I'd like to mention that I save data to Tiled at the scan level rather than for each individual message, which differs from writing to the local .db. A write occurs only when a new scan (new UUID) is received, with each UUID typically containing 30–200 messages. This simplifies the process since we don't need to look up the current table and append to it. Do you think this is a good approach?
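A minimal sketch of the hand-off pattern being discussed: the blocking (non-async) Tiled client call is pushed to a worker thread via `asyncio.to_thread`, so the event loop isn't blocked. The function and argument names are illustrative, not the actual publisher code:

```python
import asyncio

def _blocking_write(container, table_key, table):
    """Blocking call into the non-async Tiled client (illustrative stub)."""
    # In the real publisher this would be something like a write into the
    # Tiled container; here we just return the key so the sketch is runnable.
    return table_key

async def write_table_to_tiled(container, table_key, table):
    # Hand the blocking client call off to a worker thread; the coroutine
    # awaits the result without stalling the event loop.
    return await asyncio.to_thread(_blocking_write, container, table_key, table)
```

Batching the write per scan UUID (as described above) also means this thread hop happens once per scan rather than once per message, which keeps the overhead modest.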
Changes look good to me.
@xiaoyachong is this mergeable?
Overall, the changes look good to me. Quick question - does
Yes. This is the key for the remote NSLS-II Tiled server.
Hi Dylan. I think so. |
This PR introduces changes to the arroyo pieces of the latent space explorer.
pyproject.toml to add asyncio unit tests
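The exact `pyproject.toml` change isn't shown in this thread. Assuming the tests use pytest with the pytest-asyncio plugin, the enabling configuration typically looks like:

```toml
# Hypothetical snippet, assuming pytest + pytest-asyncio are the test stack.
[tool.pytest.ini_options]
asyncio_mode = "auto"  # collect `async def` tests without per-test markers
```

With `asyncio_mode = "auto"`, plain `async def test_*` functions are run on an event loop automatically; in the default `strict` mode each one would need an explicit `@pytest.mark.asyncio` marker.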