
Set up binder and installation support #14

Merged
espg merged 15 commits into main from binder-fperez
May 13, 2021

Conversation

@fperez
Contributor

@fperez fperez commented May 13, 2021

This is the Binder Link to use to test this PR, as it points to the correct branch, NOT the one in the main README.

I made a binder folder and setup.py file so we can install geostacks normally. We can document this better later, but for now I think it will do.

Note that I changed the EC notebook to remove the path hacks, opting instead for a proper dev install (pip install -e .). That's in the binder, and we can add that to the README later...

I'm leaving this as a PR for @espg @whyjz to be able to see my changes, but I suggest merging it ASAP so we have a working binder quickly...

fperez added 4 commits May 12, 2021 17:43
Pinning for now env to python 3.7 - that's the default used by Binder right
now, and I'm seeing some odd errors with 3.8 and the widgetsnbextension conda
package, that I don't have time to debug.  We can relax this later.
@whyjz
Contributor

whyjz commented May 13, 2021

Thanks @fperez this is super helpful! Just to confirm: binder uses binder/environment.yml for the dependencies and pip install -e . to install GeoStacks, right?

I tried to run EarthCube_meeting.ipynb and hit an error that seems to be associated with pickle(?); I'm posting a screenshot here for your reference. Looks good to me for a merge, but I would like to leave this open for now so @espg can take a look at it!

[Screenshot: ScrSelection_017]

@whyjz whyjz mentioned this pull request May 13, 2021
@espg
Contributor

espg commented May 13, 2021

@fperez does joblib have the same issues as pickle? I used to use that for object persistence, and it's a light dependency

@espg
Contributor

espg commented May 13, 2021

Seems close to working... running into a strange scipy error. Trying to import scipy gives:

ImportError: cannot import name '_ccallback_c'

Incompatibility between scipy and numpy versions maybe?? Trying to downgrade scipy to see if things change...

@espg
Contributor

espg commented May 13, 2021

no idea on this-- the import errors all trace back to scipy. Minimal test of building the binder, launching, and then trying to import scipy fails. Anything that depends on scipy will also fail... so sklearn can't be imported either.

Maybe a collision between /srv/conda/envs/notebook/site-packages and another site-packages like /srv/conda/lib/python3.8/site-packages ... but I don't see a duplicate scipy package, so even that seems unlikely.

@fperez
Contributor Author

fperez commented May 13, 2021

@espg curious why you removed the yaml install? It's needed for pip install -e . to run, as I used yaml in setup.py to avoid duplicating the deps declared in environment.yml and instead read them from that file...
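For reference, the idea described above (setup.py reading its dependencies from environment.yml rather than repeating them) might look roughly like the sketch below. The function names `load_environment` and `conda_to_pip`, the default path, and the conversion rules are assumptions for illustration, not the code in this PR. The sketch also ignores that conda and pip package names do not always match.

```python
import re


def load_environment(path="binder/environment.yml"):
    """Parse the conda environment file (the default path is an assumption)."""
    # Imported lazily: this is the PyYAML dependency that has to be installed
    # before `pip install -e .` can execute setup.py.
    import yaml

    with open(path) as fh:
        return yaml.safe_load(fh)


def conda_to_pip(dependencies, skip=("python", "pip")):
    """Translate a conda `dependencies` list into pip requirement strings."""
    requirements = []
    for dep in dependencies:
        if isinstance(dep, dict):
            # Nested `- pip: [...]` section: entries are already pip-style.
            requirements.extend(dep.get("pip", []))
            continue
        match = re.match(r"^([A-Za-z0-9_.\-]+)\s*(.*)$", dep.strip())
        if match is None:
            continue
        name, constraint = match.group(1), match.group(2).strip()
        if name.lower() in skip:
            continue
        if constraint.startswith("=") and not constraint.startswith("=="):
            # conda `name=1.2` is a prefix match; drop any build string and
            # express the same thing as pip's `==1.2.*`.
            version = constraint[1:].split("=")[0].strip()
            if version in ("", "*"):
                constraint = ""
            elif version.endswith("*"):
                constraint = "==" + version
            else:
                constraint = "==" + version + ".*"
        requirements.append(name + constraint)
    return requirements
```

In a setup.py this would be used along the lines of `install_requires=conda_to_pip(load_environment()["dependencies"])`, which is why yaml has to be present before the editable install runs.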

fperez added 2 commits May 12, 2021 23:05
I don't think that's the issue... The older scipy version pin is causing weird conflicts with numpy.
@fperez
Contributor Author

fperez commented May 13, 2021

This scipy issue is extremely puzzling and hard to debug...

I can manually fix the problem by updating scipy, but that shouldn't be needed. No clue why we're getting a seemingly broken scipy from the install, but so be it...

I'm going to try to add the scipy update to the postBuild, crossing fingers.

An attempt to fix the strange scipy import problem we're seeing.
@fperez
Contributor Author

fperez commented May 13, 2021

Ok, just recording the problem here - reading the binder build log line by line, I realized that #$$%#% pypy is getting installed!! For reasons I can't fathom, we're getting pypy 3.7 replacing the system python!!!

That changes things in a bad way - pypy doesn't fully support C extensions as complex as Scipy yet (they've made progress, but it's just not there) and so this is causing our problems.
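A quick way to confirm which interpreter a kernel is actually running is the standard library's `platform.python_implementation()`. The helper names below are hypothetical; this is only a sketch of a guard that could go at the top of a notebook, not something that is part of this PR.

```python
import platform
import sys


def interpreter_name():
    """Return the implementation name, e.g. 'CPython' or 'PyPy'."""
    return platform.python_implementation()


def require_cpython():
    """Fail early with a readable message if the kernel is not CPython."""
    name = interpreter_name()
    if name != "CPython":
        version = sys.version.split()[0]
        raise RuntimeError(
            f"Expected CPython but found {name} (Python {version}); "
            "C-extension packages such as scipy may fail to import."
        )
    return name
```

Calling `require_cpython()` before `import scipy` turns the opaque import error into a direct statement of the cause.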

Now we just need to figure out why we're getting pypy in there. Very bizarre...

The mamba update command that's part of the default docker build is pulling this in at update time:

pypy3.7                      7.3.4  hc4e864a_4            conda-forge/linux-64      34 MB

and with that, we're screwed. That pulls in the following changes (each line pair is the before/after):

  python                      3.7.10  hffdb5ce_100_cpython  installed
  python                      3.7.10  0_73_pypy             conda-forge/linux-64       5 KB
  python_abi                     3.7  1_cp37m               installed
  python_abi                     3.7  1_pypy37_pp73         conda-forge/linux-64       4 KB

We're losing cpython 3.7.10 and getting instead the pypy build!
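Spotting this kind of swap by eye in a long build log is tedious. As a rough sketch, assuming the log rows are whitespace-separated as package, version, build, source (as in the lines quoted above), a small scanner can flag any row that mentions pypy. The function name `find_pypy_builds` is made up for this example.

```python
def find_pypy_builds(log_text):
    """Return (package, version, build) for each row that mentions pypy."""
    hits = []
    for line in log_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        package, version, build = fields[:3]
        # Either the package itself is pypy, or the build string is a pypy build.
        if "pypy" in package.lower() or "pypy" in build.lower():
            hits.append((package, version, build))
    return hits


# Rows copied from the build log quoted in this comment.
SAMPLE_LOG = """
pypy3.7                      7.3.4  hc4e864a_4            conda-forge/linux-64      34 MB
  python                      3.7.10  hffdb5ce_100_cpython  installed
  python                      3.7.10  0_73_pypy             conda-forge/linux-64       5 KB
  python_abi                     3.7  1_cp37m               installed
  python_abi                     3.7  1_pypy37_pp73         conda-forge/linux-64       4 KB
"""

for row in find_pypy_builds(SAMPLE_LOG):
    print(row)
```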

We need to prevent that from happening, not sure how yet. Just leaving this here as breadcrumbs.

@fperez
Contributor Author

fperez commented May 13, 2021

Pinging @consideRatio in case you have any ideas here?

@fperez
Contributor Author

fperez commented May 13, 2021

Note that @wolfv kindly pointed out in the binder gitter channel that this is actually a bug in mamba :)

We'll work around it for now...

I'm puzzled by this - installing yaml works fine at the cmd line, and keeping it in the environment file works intermittently. No idea what's going on.
@fperez
Contributor Author

fperez commented May 13, 2021

Ok folks, I think I got it to work... Whew. The build is super slow right now with all those updates, but at least it works. Crashing now :)

@espg
Contributor

espg commented May 13, 2021

@fperez huge thank you for getting things to build! I'm equally confused by the yaml strangeness, and why it needs to be removed... especially since environment.yml is itself a yaml file, and it apparently still runs without yaml?

@espg espg merged commit 412a101 into main May 13, 2021
@fperez
Contributor Author

fperez commented May 13, 2021

Yes, I don't understand the yaml issue either. Sometimes it passes, sometimes it fails, and I have no idea why... There are also two yaml packages for python: the one you get with import yaml, which is actually called pyyaml, and a fork of that whose maintenance and evolution aren't clear to me.

BTW, recording here @wolfv's suggestion on gitter that it's possible to pin CPython by using the syntax python=*=*_cpython. I wasn't aware of that, so at least I wanted to leave the note for our records.
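To make that note concrete, here is a small sketch that rewrites a conda dependency list so the python entry carries the `python=*=*_cpython` pin. Only that spec string comes from the suggestion above; the helper name `pin_cpython` and the `version` parameter are assumptions, and whether a partial version behaves as intended in the three-part form has not been checked here.

```python
import re


def _is_python_spec(dep):
    """True for 'python', 'python=3.7', 'python>=3.7', but not 'python_abi'."""
    return isinstance(dep, str) and re.match(r"^python(\s|[=<>!]|$)", dep.strip()) is not None


def pin_cpython(dependencies, version="*"):
    """Return a copy of the list with python pinned to a CPython build."""
    pin = f"python={version}=*_cpython"  # name=version=build match spec
    others = [dep for dep in dependencies if not _is_python_spec(dep)]
    return [pin] + others
```

For example, `pin_cpython(["numpy", "python=3.7"])` gives `["python=*=*_cpython", "numpy"]`; the resulting list would then be written back into the `dependencies` section of environment.yml.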

@fperez fperez mentioned this pull request May 13, 2021
@wolfv

wolfv commented May 13, 2021

Sorry for the trouble with mamba, hopefully we can sort out this slight incompatibility soon. Basically conda uses a "track_feature" to de-prioritize the pypy packages. However that mechanism hasn't been fully integrated into libsolv yet. We'll have to tackle this asap
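As a toy illustration of what de-prioritizing means here (not how conda or libsolv actually implement it), candidates that carry a tracked feature can simply be ranked behind those that do not. The dictionary layout and the `"pypy"` feature name below are assumptions made up for the example; the build strings are the ones from the log earlier in this thread.

```python
def rank_candidates(candidates):
    """Order candidate builds so those with track_features sort last."""
    # sorted() is stable, so candidates with the same number of tracked
    # features keep their original relative order.
    return sorted(candidates, key=lambda c: len(c.get("track_features", ())))


candidates = [
    {"name": "python", "build": "0_73_pypy", "track_features": ["pypy"]},
    {"name": "python", "build": "hffdb5ce_100_cpython", "track_features": []},
]

preferred = rank_candidates(candidates)[0]
print(preferred["build"])
```

A solver that skips this ranking step has no reason to prefer the CPython build over the pypy one, which is consistent with the behaviour seen in the binder build.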

@wolfv

wolfv commented Jun 30, 2021

Hi folks, just wanted to get back to you to mention that I've been trying to address the pypy trackfeature situation with openSUSE/libsolv#457

Also I tried your environment. If attempting to create a fresh environment from your yaml file right now, mamba will (unfortunately) still trip up. I think the reason is some subtle incompatibility between defaults and conda-forge (all conda-forge python packages require an extra "python_abi" package) and somehow that makes libsolv spin in infinite circles.

There are some ways to fix that, though:

  • add python to the top of the requirements so that the solver fixes a python version first
  • remove the defaults channel (it's not needed anyways)
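Both adjustments can be applied mechanically to an already-parsed environment file. The sketch below assumes the file has been loaded into a dict with `channels` and `dependencies` keys (the usual layout of a conda environment.yml); the function name `apply_solver_workarounds` is hypothetical.

```python
import copy
import re


def _is_python_spec(dep):
    """True for 'python' or 'python=3.7', but not 'python_abi' or 'python-dateutil'."""
    return isinstance(dep, str) and re.match(r"^python(\s|[=<>!]|$)", dep.strip()) is not None


def apply_solver_workarounds(env):
    """Return a copy of `env` with python listed first and `defaults` removed."""
    env = copy.deepcopy(env)  # leave the caller's dict untouched
    env["channels"] = [ch for ch in env.get("channels", []) if ch != "defaults"]
    deps = env.get("dependencies", [])
    python_first = [d for d in deps if _is_python_spec(d)]
    the_rest = [d for d in deps if not _is_python_spec(d)]
    env["dependencies"] = python_first + the_rest
    return env
```

The result can be dumped back to YAML with whatever loader was used to read the file.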

I think for your situation with binder it's fine though, since the python version is already "fixed" by the previous installation.

I've opened an issue on mamba to track this infinite looping: mamba-org/mamba#1044 -- hopefully we can resolve that at some point, too :)

4 participants