When run on wheatley with 32 threads we get a deadlock with 450G memory used by various processes during the pixcal step:
Ctrl+C gives garbled lines that look like this:
KeyboardInterrupt
File "/usr/lib/python3.12/multiprocessing/process.py", line 314, in _bootstrap
self.run()
File "/usr/lib/python3.12/multiprocessing/process.py", line 108, in run
self._target(*self._args, **self._kwargs)
File "/usr/lib/python3.12/multiprocessing/pool.py", line 114, in worker
task = get()
^^^^^
File "/usr/lib/python3.12/multiprocessing/queues.py", line 386, in get
with self._rlock:
File "/usr/lib/python3.12/multiprocessing/synchronize.py", line 95, in __enter__
return self._semlock.__enter__()
^^^^^^^^^^^^^^^^^^^^^^^^^
KeyboardInterrupt
File "/usr/lib/python3.12/multiprocessing/process.py", line 314, in _bootstrap
self.run()
File "/usr/lib/python3.12/multiprocessing/process.py", line 108, in run
self._target(*self._args, **self._kwargs)
File "/usr/lib/python3.12/multiprocessing/pool.py", line 114, in worker
task = get()
^^^^^
File "/usr/lib/python3.12/multiprocessing/queues.py", line 386, in get
with self._rlock:
File "/usr/lib/python3.12/multiprocessing/synchronize.py", line 95, in __enter__
return self._semlock.__enter__()
^^^^^^^^^^^^^^^^^^^^^^^^^
KeyboardInterrupt
Last lines of log before and after I kill it:
2024-08-24 18:00:56 mkidpipeline.steps.pixcal :INFO (apply:432[pid=3835464]) Mask applied in 0.046s
2024-08-24 18:01:11 mkidpipeline.steps.pixcal :INFO (threshold:177[pid=3835459]) Masked 882 hot pixels, 767 cold pixels, 0 dead pixels
2024-08-24 18:01:11 mkidpipeline.utils.memory :DEBUG (release_ram:81[pid=3835459]) ./out/1602047869.h5 (tid=127117875261568) released 10.2 GB, total reserved 74.3 GB
2024-08-24 18:01:11 mkidpipeline.steps.pixcal :INFO (apply:417[pid=3835459]) Applying pixel mask to dither0_3/36 (MKIDObservation): 25.00159001350403s @ 1602047869.057722
2024-08-24 18:01:11 mkidpipeline.photontable :INFO (update_header:1071[pid=3835459]) Adding new header key: pixcal.method
2024-08-24 18:01:11 mkidpipeline.photontable :INFO (update_header:1071[pid=3835459]) Adding new header key: pixcal.step
2024-08-24 18:01:11 mkidpipeline.steps.pixcal :INFO (apply:432[pid=3835459]) Mask applied in 0.060s
2024-08-24 18:01:11 mkidpipeline.steps.pixcal :INFO (threshold:177[pid=3835450]) Masked 780 hot pixels, 810 cold pixels, 0 dead pixels
2024-08-24 18:01:11 mkidpipeline.utils.memory :DEBUG (release_ram:81[pid=3835450]) ./out/1602047895.h5 (tid=127117875261568) released 10.0 GB, total reserved 64.3 GB
2024-08-24 18:01:11 mkidpipeline.steps.pixcal :INFO (apply:417[pid=3835450]) Applying pixel mask to dither0_4/36 (MKIDObservation): 25.000959634780884s @ 1602047895.0989745
2024-08-24 18:01:12 mkidpipeline.photontable :INFO (update_header:1071[pid=3835450]) Adding new header key: pixcal.method
2024-08-24 18:01:12 mkidpipeline.photontable :INFO (update_header:1071[pid=3835450]) Adding new header key: pixcal.step
2024-08-24 18:01:12 mkidpipeline.steps.pixcal :INFO (apply:432[pid=3835450]) Mask applied in 0.059s
2024-08-24 18:01:22 mkidpipeline.steps.pixcal :INFO (threshold:177[pid=3835465]) Masked 989 hot pixels, 814 cold pixels, 0 dead pixels
2024-08-24 18:01:22 mkidpipeline.utils.memory :DEBUG (release_ram:81[pid=3835465]) ./out/1602047843.h5 (tid=127117875261568) released 10.5 GB, total reserved 53.8 GB
2024-08-24 18:01:22 mkidpipeline.steps.pixcal :INFO (apply:417[pid=3835465]) Applying pixel mask to dither0_2/36 (MKIDObservation): 25.00138235092163s @ 1602047843.0487542
2024-08-24 18:01:22 mkidpipeline.photontable :INFO (update_header:1071[pid=3835465]) Adding new header key: pixcal.method
2024-08-24 18:01:22 mkidpipeline.photontable :INFO (update_header:1071[pid=3835465]) Adding new header key: pixcal.step
2024-08-24 18:01:22 mkidpipeline.steps.pixcal :INFO (apply:432[pid=3835465]) Mask applied in 0.058s
^CProcess ForkPoolWorker-171:
Process ForkPoolWorker-166:
Process ForkPoolWorker-198:
Process ForkPoolWorker-178:
Process ForkPoolWorker-186:
Process ForkPoolWorker-197:
Process ForkPoolWorker-161:
Process ForkPoolWorker-194:
Process ForkPoolWorker-192:
Process ForkPoolWorker-175:
Process ForkPoolWorker-193:
Process ForkPoolWorker-185:
Process ForkPoolWorker-196:
Process ForkPoolWorker-174:
Process ForkPoolWorker-164:
Process ForkPoolWorker-173:
Process ForkPoolWorker-176:
Process ForkPoolWorker-195:
Process ForkPoolWorker-189:
Process ForkPoolWorker-190:
Process ForkPoolWorker-170:
Process ForkPoolWorker-172:
Process ForkPoolWorker-177:
Fortunately I was profiling when this happened so we should have some callgraph data
deadlock.log
oom.log
When run on wheatley with 32 threads we get a deadlock with 450G memory used by various processes during the pixcal step:
Ctrl+C gives garbled lines that look like this:
Last lines of log before and after I kill it:
Fortunately I was profiling when this happened so we should have some callgraph data
deadlock.log
oom.log