Skip to content

Add generic task_config and task_activity#542

Merged
GianlucaFicarelli merged 25 commits intomainfrom
generic_task
Mar 5, 2026
Merged

Add generic task_config and task_activity#542
GianlucaFicarelli merged 25 commits intomainfrom
generic_task

Conversation

@GianlucaFicarelli
Copy link
Collaborator

This PR adds more generic models compared to #532.
Only TaskConfig and TaskActivity are defined, and can replace the previous models according to this mapping:

  • Campaign -> TaskConfig[xxx__campaign]
  • TaskConfig -> TaskConfig[xxx__config]
  • TaskConfigGeneration -> TaskActivity[xxx__config_generation]
  • TaskExecution -> TaskActivity[xxx__execution]

where:

task_config_type (enum):

  • circuit_simulation__campaign
  • circuit_simulation__config
    ...

task_activity_type (enum):

  • circuit_simulation__config_generation
  • circuit_simulation__execution
    ...

This would allow to define workflows of task_config with more than 2 activities.
This generalization is possible only if:
- all the task_configs (campaign and config) share the same attributes (scan_parameters, parent_id, input)
- all the task_activities (config_generation and execution) share the same attributes (executor, execution_id)

@GianlucaFicarelli GianlucaFicarelli self-assigned this Mar 1, 2026
@codecov
Copy link

codecov bot commented Mar 1, 2026

Codecov Report

❌ Patch coverage is 98.70130% with 4 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
app/queries/utils.py 90.24% 2 Missing and 2 partials ⚠️
Flag Coverage Δ
pytest 97.91% <98.70%> (+0.05%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files with missing lines Coverage Δ
app/db/auth.py 95.00% <100.00%> (+8.88%) ⬆️
app/db/model.py 99.10% <100.00%> (+0.01%) ⬆️
app/db/triggers.py 100.00% <ø> (ø)
app/db/types.py 99.46% <100.00%> (+0.04%) ⬆️
app/filters/task_activity.py 100.00% <100.00%> (ø)
app/filters/task_config.py 100.00% <100.00%> (ø)
app/queries/common.py 97.94% <100.00%> (-0.15%) ⬇️
app/queries/constants.py 100.00% <100.00%> (ø)
app/queries/factory.py 100.00% <100.00%> (ø)
app/queries/types.py 100.00% <100.00%> (ø)
... and 11 more
🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

app/db/model.py Outdated
Comment on lines +2288 to +2290
parent_id: Mapped[uuid.UUID | None] = mapped_column(
ForeignKey(f"{EntityType.task_config}.id"), index=True
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

parent_id seems a bit misleading as it implies a hierarchy. Could we use a name like activity_id/generator_activity_id or something along these lines.?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should be clear that parent_id isn't the activity, but it's an optional task_config that was used by the activity to generate this task_config.
In theory it's not strictly required, because it's possible to get it from task_activity.used[0], but it can be handy for filtering or to retrieve the full parent in a simpler way, and it would ensure that it's unique.
For the existing campaigns, it would be the equivalent of:

  • Simulation.simulation_campaign_id
  • SkeletonizationConfig.skeletonization_campaign_id
  • IonChannelModelingConfig.ion_channel_modeling_campaign_id

Any other suggestion about the name if we keep it? task_config_generator_id?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that the attribute in opposite direction (parent task -> children tasks) isn't defined, since it's not ensured that all the children belongs to the same activity (unless we allow only one activity per config)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am good with task_config_generator_id

app/db/model.py Outdated
parent_id: Mapped[uuid.UUID | None] = mapped_column(
ForeignKey(f"{EntityType.task_config}.id"), index=True
)
input: Mapped[list[Entity]] = relationship(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that this is a list, I would go with "inputs"

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed that input (singular) is misleading, although the attribute is called input_ids so it would match the prefix.
I'll rename it to inputs unless we have a better name.

@james-isbister
Copy link
Contributor

There are some conceptual differences between how we planned to use Campaign and TaskConfig in the other PR which are worth mentioning:

A) Campaign->scan_parameters is intended to hold a dictionary where the keys are the names of the variables to be scanned over (i.e. stimulus_strength), and the values are lists of values to be scanned over in that dimension (i.e. [0.1, 0.2, 0.3]).
B) TaskConfig->scan_parameters holds a dictionary where the keys are the names of the variables to be scanned over (i.e. stimulus_strength) and the values are single values (i.e. 0.2).
C) TaskConfig should probably eventually also hold a dictionary which indexes which value it represents in the parent Campaign->scan_parameters. For example, there might be a dictionary where the keys are again variables (i.e. stimulus_strength) and the values are indices in the lists of A (i.e. index 0). It wouldn't make sense for Campaign to have this attribute. It's needed though because there might not always be a bijection between A and B, and we need to infer the location in the scan. B becomes becomes redundant but is maybe still useful.

Perhaps it's possible that a second or third task in a workflow might also scan over several values though, so maybe it's not just the top Campaign config that needs to hold a scan_parameters with lists.

Maybe a generic TaskConfig object could have at least one of the two optional attributes:

  • scan_parameters: Like A) described above
  • scan_index, Like C) described above

We might have a third variable for convenience which is essentially B, called something like coordinate_parameters

Let me know if I can explain that better

Tagging @chr-pok who might also have some thoughts on this

@GianlucaFicarelli
Copy link
Collaborator Author

There are some conceptual differences between how we planned to use Campaign and TaskConfig in the other PR which are worth mentioning: ...

This is important to discuss, but from the point of view of entitycore, the JSON dicts are currently arbitrary json blobs that can contain any data, that is interpreted only by the client (obi-one).

We could also rename the attribute to something more generic instead of scan_parameters.

To keep the model simpler, it would be better to avoid adding more jsonb fields, and consolidate these details in a single json that can depend on the task_config_type.

For reference, these might be related to the point C above: #462 and the corresponding PR #441

@eleftherioszisis
Copy link
Contributor

To keep the model simpler, it would be better to avoid adding more jsonb fields, and consolidate these details in a single json that can depend on the task_config_type.

+1 For this. There is no need to have scan_parameters + scan_index. We can aim for one field with the desired schema.

@chr-pok
Copy link
Contributor

chr-pok commented Mar 3, 2026

This PR adds more generic models compared to #532.
...

Thanks for this PR! Since campaigns and individual configs are conceptually different, I am not quite sure if this level of generalization is really needed. I personally would prefer to keep them (as well as the corresponding activities) separate, as in the alternative PR#532. This would allow us to be more flexible w.r.t. specific attributes required for campaigns vs. single configs (which will probably be displayed/filtered differently in the UI, etc.). Or is there a compelling reason why these even more generic models would be beneficial?

@GianlucaFicarelli
Copy link
Collaborator Author

GianlucaFicarelli commented Mar 4, 2026

This PR adds more generic models compared to #532.
...

Thanks for this PR! Since campaigns and individual configs are conceptually different, I am not quite sure if this level of generalization is really needed. I personally would prefer to keep them (as well as the corresponding activities) separate, as in the alternative PR#532.

From the point of view of entitycore, the logic to handle campaigns and individual configs is the same, with the only difference that the campaign is the first step of the workflow, while configs are the second step.
They are both entities with the same attributes:

  • id, name, description
  • scan_parameters (or it can be renamed to a generic "config")
  • task_config_generator_id (optional, it would be null for the first task/campaign, and it could be also obtained through the activity that generated the config)
  • inputs (list of generic entities associated as input to the config)
  • no configs (outputs) anymore (like simulation_campaign.simulations), because nothing prevent the same campaign to be used in multiple activities, each one with different generated config.

With the approach in this PR, it means that they are stored in entitycore in the same table, and can be created or retrieved using the same endpoint, with just different parameters.
In particular task_config_type is required during the creation, and can be specified when searching for a specific type, for example it can be circuit_simulation__campaign or circuit_simulation__config.
This doesn't prevent the consumer of these entities (obi-one, notebooks, or even the web ui) to define separate classes, when the business logic implemented in the client is different, although I would expose the same generic classes in entitysdk.

This would allow us to be more flexible w.r.t. specific attributes required for campaigns vs. single configs (which will probably be displayed/filtered differently in the UI, etc.).

Do you have in mind specific attributes that need to be specified for the campaign and not for single configs (or the other way around), that cannot be part of the json configuration?

Or is there a compelling reason why these even more generic models would be beneficial?

It's not a compelling reason, but the generalization:

  • simplifies the code in entitycore: one entity model, one activity model, and the corresponding endpoints, instead of duplicating everything.
  • it allows to register arbitrary workflows with additional steps, as long as each step can be represented as a generic task_config. James mentioned that there could be "a second or third task in a workflow", that shouldn't require changes with the generic approach in this PR, but it might require changes with the previous PR (for example task_config.campaign_id wouldn't make much sense if it's a second task, so it should be renamed).

Note that the json config called scan_parameters in this PR can be renamed to a more generic name config that can be different depending on the type of entity, and the schema could be something like:

{
  "version": 1,
  "type": "circuit_simulation__campaign",
  "scan_parameters": {}
}
{
  "version": 1,
  "type": "circuit_simulation__config",
  "scan_parameters": {},
  "scan_index": {}
}

entitycore doesn't enforce any schema at the moment, but it could be done if needed even with the generic approach, as long as there is a type that can be used as a discriminator.

@james-isbister
Copy link
Contributor

james-isbister commented Mar 4, 2026

The only thing which feels strange to me is that there is no way to programmatically distinguish whether TaskConfigs or TaskActvities of a specific task_config_type are intended as a single coordinate or "scan", when we are indeed deciding this when we make a new task_config_type (i.e. circuit_simulation__campaign). Looking at the contents of scan_parameters to determine this, is putting the burden on the client (i.e. obi-one), when I think we already know this when we define new task_config_types

Maybe this could come in useful, for example when visualising (or recursing through) a graph of a complex multi-stage workflow (i.e. a long chain of configs and tasks).

Maybe just a simple representation of which task_config_types are intended as a scan configs or coordinates. I'm not sure what form this would take

Another thing to consider is campaign vs single coordinate analysis. Eventually we might want to link an analysis result to a campaign or coordinate. I'm not sure how best to do that, but another thing to bear in mind. Maybe it's ok for this to be generic between campaigns and coordinates. Or maybe a campaign analysis result should be linked directly to the coordinates or generated entities (or a subset of them when the analysis is of a subset of coordinates or entities in the "campaign". It was a common pattern at bbp to only analyse a subset of simulations for example

@GianlucaFicarelli
Copy link
Collaborator Author

The only thing which feels strange to me is that there is no way to programmatically distinguish whether TaskConfigs or TaskActvities of a specific task_config_type are intended as a single coordinate or "scan",

I don't follow what is needed to be distinguished, and how this PR is different from the previous one in this regard.
For searching simulation campaigns:

  • with the previous PR it would be GET /api/entitycore/campaign?task_type=circuit_simulation
  • while with the current PR it's GET /api/entitycore/task-config?task_config_type=circuit_simulation__campaign

I think we can take the discussion offline for more clarifications.

@james-isbister
Copy link
Contributor

Say we want a generic tool for visualising (or operating on) the graph of any workflow, and we want to visualise (or operate on) the campaign configs differently from the single configs. We would just need a list somewhere of which task_config_types are "campaigns" or "single" configs. That could go in entitycore somewhere perhaps, rather than detecting whether __campaign is in the string

Can discuss further offline

@GianlucaFicarelli GianlucaFicarelli merged commit de94005 into main Mar 5, 2026
2 checks passed
@GianlucaFicarelli GianlucaFicarelli deleted the generic_task branch March 5, 2026 13:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants