Compare our learning with original flwr #2

@MxmUrw

Description

DPSA

With our approach (10 rounds, 3 epochs of training per round, gradient norm scaled to 1) we get the following:

INFO flower 2022-12-22 08:37:50,580 | server.py:144 | FL finished in 1010.8763513350004
INFO flower 2022-12-22 08:37:50,581 | app.py:192 | app_fit: losses_distributed [(1, 2.3027536869049072), (2, 2.2576515674591064), (3, 2.2854557037353516), (4, 2.2913222312927246), (5, 2.290480613708496), (6, 2.290785312652588), (7, 2.291123151779175), (8, 2.2913451194763184), (9, 2.2919833660125732), (10, 2.291351795196533)]
INFO flower 2022-12-22 08:37:50,581 | app.py:193 | app_fit: metrics_distributed {'accuracy': [(1, 0.1), (2, 0.1906), (3, 0.164), (4, 0.1816), (5, 0.184), (6, 0.1847), (7, 0.1825), (8, 0.1815), (9, 0.1784), (10, 0.1817)]}

Original

Executing the original flwr code without scaling gives us the following:

INFO flower 2022-12-22 12:07:14,295 | server.py:144 | FL finished in 641.0084569230003
INFO flower 2022-12-22 12:07:14,295 | app.py:192 | app_fit: losses_distributed [(1, 1.5546590089797974), (2, 1.3022336959838867), (3, 1.1738433837890625), (4, 1.1083922386169434), (5, 1.0485312938690186), (6, 1.0243991613388062), (7, 1.0081266164779663), (8, 1.0220357179641724), (9, 1.0120254755020142), (10, 1.0466431379318237)]
INFO flower 2022-12-22 12:07:14,295 | app.py:193 | app_fit: metrics_distributed {'accuracy': [(1, 0.4341), (2, 0.5336), (3, 0.5827), (4, 0.6033), (5, 0.6319), (6, 0.6384), (7, 0.6497), (8, 0.6497), (9, 0.6523), (10, 0.6544)]}
INFO flower 2022-12-22 12:07:14,295 | app.py:194 | app_fit: losses_centralized []
INFO flower 2022-12-22 12:07:14,295 | app.py:195 | app_fit: metrics_centralized {}

Original - with scaling

With the original flwr code, scaling the norm to 2 (because some kind of averaging of the gradients is done, see below), we get the following:

INFO flower 2022-12-22 11:39:56,416 | server.py:144 | FL finished in 249.8099209640004
INFO flower 2022-12-22 11:39:56,416 | app.py:192 | app_fit: losses_distributed [(1, 2.3029253482818604), (2, 2.258068799972534), (3, 2.2848613262176514), (4, 2.2887682914733887)]
INFO flower 2022-12-22 11:39:56,417 | app.py:193 | app_fit: metrics_distributed {'accuracy': [(1, 0.1), (2, 0.2153), (3, 0.1835), (4, 0.1833)]}
INFO flower 2022-12-22 11:39:56,417 | app.py:194 | app_fit: losses_centralized []
INFO flower 2022-12-22 11:39:56,417 | app.py:195 | app_fit: metrics_centralized {}

Same parameters, more rounds:

INFO flower 2022-12-22 11:54:06,915 | server.py:144 | FL finished in 659.7055658550007
INFO flower 2022-12-22 11:54:06,915 | app.py:192 | app_fit: losses_distributed [(1, 2.3022499084472656), (2, 2.2597239017486572), (3, 2.2851016521453857), (4, 2.291539430618286), (5, 2.291046619415283), (6, 2.291619062423706), (7, 2.2908899784088135), (8, 2.2913625240325928), (9, 2.291919469833374), (10, 2.2917206287384033)]
INFO flower 2022-12-22 11:54:06,915 | app.py:193 | app_fit: metrics_distributed {'accuracy': [(1, 0.1), (2, 0.2074), (3, 0.1353), (4, 0.1557), (5, 0.186), (6, 0.1833), (7, 0.1848), (8, 0.1858), (9, 0.186), (10, 0.1751)]}
INFO flower 2022-12-22 11:54:06,915 | app.py:194 | app_fit: losses_centralized []
INFO flower 2022-12-22 11:54:06,915 | app.py:195 | app_fit: metrics_centralized {}

In the original code, the following function is called to average the gradients:

from functools import reduce
from typing import List, Tuple

import numpy as np
from flwr.common import NDArrays


def aggregate(results: List[Tuple[NDArrays, int]]) -> NDArrays:
    """Compute weighted average."""
    # Calculate the total number of examples used during training
    num_examples_total = sum(num_examples for _, num_examples in results)

    # Multiply each client's weights by its number of training examples
    weighted_weights = [
        [layer * num_examples for layer in weights] for weights, num_examples in results
    ]

    # Compute the example-weighted average of each layer
    weights_prime: NDArrays = [
        reduce(np.add, layer_updates) / num_examples_total
        for layer_updates in zip(*weighted_weights)
    ]
    return weights_prime

Question: What does this do? How does it compare to our simple addition of gradient vectors?
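To make the difference concrete, here is a minimal sketch in plain NumPy (no flwr dependency; the helper names `weighted_average` and `plain_sum` are made up for this example). It contrasts flwr's example-count-weighted mean with an unweighted sum of client updates:

```python
from functools import reduce
from typing import List, Tuple

import numpy as np


def weighted_average(results: List[Tuple[List[np.ndarray], int]]) -> List[np.ndarray]:
    """Example-count-weighted mean, like flwr's aggregate()."""
    total = sum(n for _, n in results)
    weighted = [[layer * n for layer in weights] for weights, n in results]
    return [reduce(np.add, layers) / total for layers in zip(*weighted)]


def plain_sum(results: List[Tuple[List[np.ndarray], int]]) -> List[np.ndarray]:
    """Unweighted sum of the client updates."""
    return [reduce(np.add, layers) for layers in zip(*(w for w, _ in results))]


# Two clients with one "layer" each: client A trained on 100 examples,
# client B on 300, so B contributes 3x as much to the weighted mean.
results = [
    ([np.array([1.0, 1.0])], 100),
    ([np.array([3.0, 3.0])], 300),
]

print(weighted_average(results))  # [array([2.5, 2.5])] - pulled toward client B
print(plain_sum(results))         # [array([4., 4.])] - magnitude grows with the number of clients
```

So flwr produces a convex combination whose magnitude stays on the scale of a single client's parameters, while a plain sum scales linearly with the number of clients — which is presumably why scaling the norm differently is needed in the two settings.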

General Question: Are we happy with our current state?
