I've been experimenting with using `pyinstrument` to profile PINT and PINT Pal code in the NANOGrav 20-year timing notebook. It produces these nice timelines (see below). One of the things I've learned is that most of the time in the PINT fitter is spent in repeated calls to `TimingModel.delay()`, as part of the design matrix computation. It seems like caching some of these results could speed up the fitter substantially.
More explicitly, here's the F-test cell for J0125-2327:

If you drill down into one of the fits, you can see each of the steps the fitter is taking, and that the design matrix calculation takes up most of each step:

Inside the design matrix calculation, most of the time is taken up by calculating derivatives of the delay with respect to each parameter:

But then if you drill down even further, you can see that most of the time is taken up by repeated calculations of the delay, with calculating the altitude (needed for tropospheric delays) taking up a disproportionate fraction of that time.

To me this suggests that caching the delay, or even just the altitude (only for a short time, during a single design matrix calculation) could speed up the PINT fitter substantially.
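
As a rough illustration of the idea, here is a toy sketch of a cache that only lives for the duration of one design matrix calculation. This is not PINT code: `ToyModel`, `cached_delays()`, the `delay_calls` counter, and the placeholder formulas are all hypothetical stand-ins, and the only assumption carried over from the profile is that each per-parameter derivative re-requests the same total delay.

```python
from contextlib import contextmanager


class ToyModel:
    """Hypothetical stand-in for a timing model; not PINT's API."""

    def __init__(self, params):
        self.params = dict(params)
        self.delay_calls = 0      # how many times the expensive work really ran
        self._delay_cache = None  # None means caching is switched off

    def delay(self, toas):
        # Key on both the TOAs and the current parameter values, so a
        # cached entry is never reused after a parameter changes.
        key = (tuple(toas), tuple(sorted(self.params.items())))
        if self._delay_cache is not None and key in self._delay_cache:
            return self._delay_cache[key]
        self.delay_calls += 1
        total = sum(self.params.values())
        result = [total * t for t in toas]  # placeholder for the real delay
        if self._delay_cache is not None:
            self._delay_cache[key] = result
        return result

    @contextmanager
    def cached_delays(self):
        """Enable the cache, then discard it on exit, even on error."""
        self._delay_cache = {}
        try:
            yield
        finally:
            self._delay_cache = None

    def d_delay_d_param(self, toas, name):
        # Assumed pattern: every derivative starts from the same total delay.
        base = self.delay(toas)
        return [t - d for t, d in zip(toas, base)]  # placeholder derivative

    def designmatrix(self, toas):
        return [self.d_delay_d_param(toas, name) for name in self.params]


toas = [0.0, 1.0, 2.0]
model = ToyModel({"A": 1e-3, "B": 2e-6, "C": 3e-9})

uncached = model.designmatrix(toas)
uncached_calls = model.delay_calls  # one delay() evaluation per parameter

model.delay_calls = 0
with model.cached_delays():
    cached = model.designmatrix(toas)
cached_calls = model.delay_calls    # a single evaluation, reused afterwards
```

Scoping the cache to a context manager means nothing stale survives between fitter steps, and keying on the parameter values keeps it safe if a parameter is updated while the cache is active. Caching only the altitude would follow the same pattern with a smaller key.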