Metacello re-fetches baselines even if they were fetched before #539

@syrel

Hello 👋

Recently, we decided to refactor the baselines in a lower part of our decently sized project. The refactoring split baselines that declare a lot of packages into multiple baselines, each of which clearly specifies the dependencies of a smaller set of packages by referencing other baselines from other repositories. This resulted in significantly increased loading times, in particular in the fetch step, which creates a linear list of loading directives. Upon closer inspection it turned out that dependent baselines are analysed over and over again, even if Metacello has supposedly visited them already.

Project structure

To simplify the debugging process we recreated our project structure in a playground organization https://github.com/bugginrack.

In that project we have a bunch of libraries (https://github.com/bugginrack/MyLibrary) with the following baseline dependencies:
Dependencies-MyLybraryD

On top of that there is a framework (https://github.com/bugginrack/MyFramework):
Dependencies-MyFramework

Next, we have a few projects (https://github.com/bugginrack/MyProject), some of them depend on each other (A, B and C):
Dependencies-MyProject

Code size independence

Our thesis is that the size of the code base does not have a significant influence on the fetching performance. To test this, we loaded the same baseline structure with a large (generated) code base (video) and without any code (video).

It took ~46s to finish the fetching phase for a project with a significant amount of code:
FetchingEnd-GitHub-WithCode
and about the same, ~47s, for a project without code:
FetchingEnd-GitHub-WithoutCode

Connection independence

To show that the fetching time does not depend on the connection either, we repeated the same experiment while loading the code locally.
With code (video):
FetchingEnd-Local-WithCode
Without code (video):
FetchingEnd-Local-WithoutCode

The problem

The issue is that doubling the number of same-level projects doubles the time it takes to fetch, while increasing the dependency depth increases the fetching time exponentially.
For our real system the loading times exceeded 2 hours.
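
To make this growth pattern concrete, here is a minimal Python sketch. It is only a model and not Metacello code: `layered_graph` and `count_visits` are hypothetical names, and the graph shape (every baseline in one layer depends on every baseline in the next layer) is an assumption chosen to mimic shared dependencies. It counts how many times a baseline is analysed when shared dependencies are re-visited, compared with a traversal that remembers what it has already seen.

```python
from typing import Dict, List, Optional, Set


def layered_graph(depth: int, width: int) -> Dict[str, List[str]]:
    """Build a hypothetical dependency graph with `depth` layers.

    Every baseline in layer i depends on all `width` baselines in layer i + 1.
    This shape is an assumption for illustration, not our real project layout.
    """
    graph: Dict[str, List[str]] = {}
    for level in range(depth):
        for i in range(width):
            name = f"L{level}_{i}"
            if level + 1 < depth:
                graph[name] = [f"L{level + 1}_{j}" for j in range(width)]
            else:
                graph[name] = []  # the last layer has no dependencies
    return graph


def count_visits(graph: Dict[str, List[str]], root: str,
                 visited: Optional[Set[str]] = None) -> int:
    """Count how many baselines are analysed when starting from `root`.

    With visited=None every dependency is analysed again each time it is
    referenced. Passing a set makes the traversal skip baselines it has
    already seen.
    """
    if visited is not None:
        if root in visited:
            return 0
        visited.add(root)
    return 1 + sum(count_visits(graph, dep, visited) for dep in graph[root])


if __name__ == "__main__":
    for depth in range(2, 7):
        graph = layered_graph(depth, width=2)
        naive = count_visits(graph, "L0_0")
        cached = count_visits(graph, "L0_0", visited=set())
        print(f"depth={depth}: re-visiting={naive}, skipping visited={cached}")
```

In this model, with two baselines per layer and four layers, the re-visiting traversal analyses 15 baselines while the one that skips visited baselines analyses 7. Each extra layer roughly doubles the first number but adds only two to the second, which is the same kind of behaviour we observe when increasing the dependency depth.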

Solution

The intermediate solution is, of course, to uglify, flatten and merge the baselines, reducing the number of interconnections to a minimum.

Q: Would it be possible to improve Metacello baseline fetching to skip already processed baselines?
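
To illustrate what we mean by skipping already processed baselines, here is a rough Python sketch of building a linear load order in which each baseline is processed once. It is not how Metacello is implemented; `linear_load_order` is a hypothetical helper and the dependency edges in the example are made up for illustration (only the repository names come from our playground organization).

```python
from typing import Dict, List, Set


def linear_load_order(graph: Dict[str, List[str]], root: str) -> List[str]:
    """Return baselines in an order where dependencies come before dependents.

    Each baseline is processed at most once: a `seen` set records the
    baselines that were already visited, so shared dependencies are skipped.
    """
    order: List[str] = []
    seen: Set[str] = set()

    def visit(name: str) -> None:
        if name in seen:
            return  # already processed, do not analyse it again
        seen.add(name)  # mark before recursing so cycles also terminate
        for dep in graph.get(name, []):
            visit(dep)
        order.append(name)  # emit only after all dependencies were emitted

    visit(root)
    return order


if __name__ == "__main__":
    # Hypothetical edges, assumed for this example only.
    example = {
        "MyProjectA": ["MyFramework", "MyProjectB"],
        "MyProjectB": ["MyFramework"],
        "MyFramework": ["MyLibraryD"],
        "MyLibraryD": [],
    }
    print(linear_load_order(example, "MyProjectA"))
```

In this example `MyFramework` is referenced by both `MyProjectA` and `MyProjectB`, but it and its own dependency appear in the resulting list only once.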

Thank you!
