🔨Work Item
When two separate test runs execute on the same machine, the tests can crash because both runs try to use the same hard-coded TCP port. See:
rg "tcp://127.0.0.1:12345"
...
tests/python/pytorch/graphbolt/test_item_sampler.py
886: else "tcp://127.0.0.1:12345"
tests/python/pytorch/graphbolt/test_dataloader.py
92: else "tcp://127.0.0.1:12345"
tests/python/pytorch/cuda/test_nccl.py
17: init_method="tcp://127.0.0.1:12345",
40: init_method="tcp://127.0.0.1:12345",
63: init_method="tcp://127.0.0.1:12345",
89: init_method="tcp://127.0.0.1:12345",
...
Description
We should pick the TCP port at random from a fixed range, so that a port collision between concurrent test runs becomes very unlikely.
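A minimal sketch of what this could look like, assuming a hypothetical helper named `pick_tcp_port` (not an existing function in the repo) and an arbitrary range of 20000 to 60000. Besides drawing a random port, it also tries to bind it once so that a port already in use is skipped. This only lowers the chance of a collision: another process can still grab the port between the check and the actual use.

```python
import random
import socket


def pick_tcp_port(low=20000, high=60000, max_tries=50):
    """Return a random port in [low, high) that was bindable when checked.

    Hypothetical helper for illustration; the range and retry count are
    assumptions, not values taken from the test suite.
    """
    for _ in range(max_tries):
        port = random.randrange(low, high)
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            try:
                sock.bind(("127.0.0.1", port))
            except OSError:
                continue  # Port is already in use; draw another one.
        # The probe socket is closed here, so the port is free again.
        return port
    raise RuntimeError("no free TCP port found in the requested range")


# Replaces the hard-coded "tcp://127.0.0.1:12345" string.
init_method = f"tcp://127.0.0.1:{pick_tcp_port()}"
```

For the multi-process tests, the port would presumably need to be chosen once in the parent process and passed to every worker, since all ranks have to agree on the same `init_method`.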