Description
Hello,
While running an experiment with automl in Databricks RT 11.3ML I get the error:
Unable to generate notebook at [workspace location] using format JUPYTER: {"error_code": "MAX_NOTEBOOK_SIZE_EXCEEDED", "message": "File size imported is 34974148 bytes), exceeded max size (10485760 bytes)"}
The exact same code runs smoothly on datasets with more variables and more training instances in other Databricks environments. In one particular environment, however, this error always comes up.
The learning task is a regression, and I have tried reducing the number of training instances from 20M (which I know are automatically sampled during AutoML's initial steps) to 2K, but it still generates a Jupyter notebook of 12MB (apparently bigger than the allowed maximum).
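For context, the cap in the error message is 10485760 bytes (10 MiB), and the generated notebook was 34974148 bytes. A minimal, self-contained sketch of that check (the helper names and the toy notebook are my own, not Databricks code) — it measures a notebook's serialized JSON size against the cap, which is roughly what the import step enforces:

```python
import json

# 10 MiB cap from the error message (10485760 bytes)
MAX_NOTEBOOK_SIZE = 10 * 1024 * 1024

def notebook_size_bytes(notebook: dict) -> int:
    # Size of the notebook as serialized JSON, i.e. what an import would see
    return len(json.dumps(notebook).encode("utf-8"))

def exceeds_limit(notebook: dict, limit: int = MAX_NOTEBOOK_SIZE) -> bool:
    return notebook_size_bytes(notebook) > limit

# Hypothetical minimal notebook with one large inlined HTML output,
# mimicking a profiling report embedded in a cell
nb = {
    "cells": [{
        "cell_type": "code",
        "outputs": [{"data": {"text/html": "x" * 34_974_148}}],
    }],
    "nbformat": 4,
}

print(exceeds_limit(nb))  # True: the embedded payload alone exceeds 10 MiB
```

This is why trimming the training set may not help: the notebook's size is dominated by inlined report output, not by the number of rows profiled.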
My first guess was that the pandas-profiling step causes the error when rendering the output for a "big" dataset, but I did manage to manually run the exact same pandas-profiling notebook using the same training DataFrame that was input to the AutoML task.
Any help is appreciated, because I'm not sure what else to do: the error occurs in a phase of the process that I haven't accessed or modified.
