Support Production Workloads #39

@fabiante

Description

I want Persurl to support production workloads. This issue defines what that means in terms of concrete numbers.

Numbers

PURL creation

Parameters

A rough estimate that I have used repeatedly, adjusted for nice fractions:

  • 1.000 distinct users creating purls
  • 1.000 purls created / day * user (~ 40 / hour ~ 0.7 / minute)
  • 86.400 purls created / day * user
    • breaks down into 3.600 / hour = 60 / minute = 1 / second
  • purls have a minimum lifespan of 10 years
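The rate breakdowns above can be checked with a few lines of arithmetic. This is only a sketch: it treats the two daily figures from the list (1.000 and 86.400) as independent inputs and does not decide which of them applies per user. The function name `breakdown` is made up for illustration.

```python
# Convert a daily creation rate into per-hour, per-minute and per-second rates.
SECONDS_PER_DAY = 24 * 60 * 60  # 86_400


def breakdown(per_day: float) -> dict[str, float]:
    """Return the hourly, per-minute and per-second equivalents of a daily rate."""
    return {
        "per_hour": per_day / 24,
        "per_minute": per_day / (24 * 60),
        "per_second": per_day / SECONDS_PER_DAY,
    }


low = breakdown(1_000)    # the 1.000 / day figure from the list
high = breakdown(86_400)  # the 86.400 / day figure from the list

print(low)
print(high)
```

The exact values for 1.000 per day are about 41,67 per hour and 0,69 per minute, so the figures in the list are rounded. 86.400 per day is exactly one purl per second.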

Deductions

Given these numbers ...

  • 86.400.000 purls would be created every year
  • 864.000.000 purls would have been created after 10 years
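A sketch of the arithmetic behind these totals: the yearly figure equals 86.400 multiplied by 1.000. For comparison, and as an assumption that the issue does not make, the snippet also shows what a 365-day year at a sustained 86.400 purls per day in total would give. All constant names are made up for illustration.

```python
USERS = 1_000
PURLS_PER_DAY = 86_400
LIFESPAN_YEARS = 10
DAYS_PER_YEAR = 365  # assumption, not stated in the issue

# Yearly figure as listed in the issue: 86.400 * 1.000.
issue_per_year = PURLS_PER_DAY * USERS
issue_after_lifespan = issue_per_year * LIFESPAN_YEARS

# Alternative reading (assumption): 86.400 purls per day in total,
# sustained over a 365-day year.
sustained_per_year = PURLS_PER_DAY * DAYS_PER_YEAR

print(f"{issue_per_year:,} per year, {issue_after_lifespan:,} after 10 years")
print(f"{sustained_per_year:,} per year at 1 purl / second in total")
```

Either way the 10-year total stays below one billion, which is the order of magnitude that matters for the data type discussion.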

Side note on data types:

  • 32-bit integer IDs can address ~4 billion PURLs if unsigned, or ~2.1 billion if signed.
  • With 64 bits that goes up to 18.446.744.073.709.551.615 (unsigned).
  • UUIDs are 128 bits wide and have an address space large enough that I won't bother writing down the number. Don't quote me on that 😉
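To put the ID sizes next to the 10-year total, here is a small sketch that prints the largest value of each integer width and how many times the projected 864.000.000 PURLs fit into it. Whether a column is signed depends on the database. PostgreSQL's `integer` type, for example, is a signed 32-bit value, so the signed rows are the relevant ones there. The variable names are illustrative only.

```python
PURLS_AFTER_10_YEARS = 864_000_000  # figure from the deductions above

# Largest representable value per ID width.
capacities = {
    "signed 32-bit": 2**31 - 1,
    "unsigned 32-bit": 2**32 - 1,
    "signed 64-bit": 2**63 - 1,
    "unsigned 64-bit": 2**64 - 1,
    "128-bit (UUID width)": 2**128 - 1,
}

for name, cap in capacities.items():
    headroom = cap / PURLS_AFTER_10_YEARS
    print(f"{name:>22}: max {cap:,} ({headroom:,.1f}x the 10-year total)")
```

With a signed 32-bit ID the projected total fits only about 2,5 times, which leaves little room if the estimates turn out to be too low.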

PURL resolution

In my use case I expect that PURLs will be resolved infrequently, potentially never.

Due to that expectation I don't yet have detailed numbers, except these:

  • Only 10% of all PURLs are ever resolved. The remaining 90% are created but never resolved.
  • Users expect a PURL to resolve within 1 second on average. This does not include the time browsers take to load the resolved URL, since that is not under Persurl's control.
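As a sketch, the two resolution numbers can be turned into something checkable: the expected count of PURLs that are ever resolved, and a helper that compares the mean of measured latencies with the 1-second budget. The latency samples below are invented for illustration, and `meets_budget` is a hypothetical helper, not part of Persurl.

```python
import statistics

TOTAL_PURLS = 864_000_000  # 10-year total from the deductions above
RESOLVED_PERCENT = 10      # share of PURLs that are ever resolved
LATENCY_BUDGET_S = 1.0     # "within 1 second on average"

# Integer arithmetic avoids floating-point rounding on the large total.
expected_resolved = TOTAL_PURLS * RESOLVED_PERCENT // 100


def meets_budget(latencies_s: list[float]) -> bool:
    """Return True if the mean latency stays within the budget."""
    return statistics.mean(latencies_s) <= LATENCY_BUDGET_S


# Invented sample, in seconds; a real run would collect these from a load test.
sample = [0.12, 0.30, 0.25, 2.10, 0.08]

print(expected_resolved, meets_budget(sample))
```

Because the target is an average, a single slow request (2,10 s in the sample) does not break the budget as long as the mean stays at or below one second.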

Subtasks

  • Run purl creation load tests on SQLite
  • Run purl creation load tests on Postgres
  • Run resolution load tests on Postgres
  • Run mixed load tests on Postgres: how does the application handle larger amounts of concurrent writes and reads?
  • Consider upgrading to a 64-bit integer for the primary ID in the purls table
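A very rough idea of what a creation load test against SQLite could measure, using Python's built-in `sqlite3` module. Everything here is an assumption made for illustration: the table layout, the column names, and the function `run_creation_load_test` are hypothetical and do not reflect Persurl's actual schema or code. The test also writes directly to an in-memory database in a single transaction, so it says nothing about the application's HTTP layer.

```python
import sqlite3
import time
import uuid


def run_creation_load_test(n: int) -> tuple[int, float]:
    """Insert n rows into a hypothetical purls table; return (row count, seconds)."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema, not Persurl's real one.
    conn.execute(
        "CREATE TABLE purls ("
        "id INTEGER PRIMARY KEY, "
        "name TEXT UNIQUE NOT NULL, "
        "target TEXT NOT NULL)"
    )
    start = time.perf_counter()
    with conn:  # commits all inserts as one transaction
        for _ in range(n):
            conn.execute(
                "INSERT INTO purls (name, target) VALUES (?, ?)",
                (uuid.uuid4().hex, "https://example.com/"),
            )
    elapsed = time.perf_counter() - start
    (count,) = conn.execute("SELECT COUNT(*) FROM purls").fetchone()
    conn.close()
    return count, elapsed


count, elapsed = run_creation_load_test(10_000)
print(f"{count} purls in {elapsed:.3f} s ({count / elapsed:.0f} / second)")
```

A realistic test would go through the application itself and use a database file on disk, since both change the achievable write rate considerably.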

Metadata

Labels

  • epic: A large goal generally guiding development
  • testing: Related to testing / writing tests
