Before you can use EasyML Studio, you must configure the environment which include your development environment and runtime server environment which contains Preparation, Run Docker containers and Start services three steps to run it successfully.
- Get code from Git repository https://github.com/ICT-BDA/EasyML
- Import the code to your IDE via maven project
- Support java version 1.7 by default
- Make sure your IDE have install Maven and GWT plugin
- Please remove or comment
jdk.toolsdependency properties in pom.xml(This dependency is not used for IDEA, but for eclipse IDE)
- Use Maven clean and compile goal to download all related dependency package
- Make sure you can see the whole External Libraries on the Project list
- Use
gwt:compileplugin to compile gwt project.
After you have get all dependencies and finish all compilation, you can start building your EasyML Studio web app for the following steps:
- Edit run configurations
- Add GWT web app via
+at upper left corner, if you do not seeGWT Configurationin the list, you might have something wrong in the step of Configure GWT Lib Path. Go back to the last step, resolve the problem.
Use Super Dev Modecan allow you debugging your web app at the browser side. It is make debugging more effient, for which remember choosing it.- When you have finished all the steps above, you can click the green run button to make and debug the EasyML. After a while, you can browse EasyML in your Chrome to access it.
-
Make sure your eclipse have install maven and GWT plugins.(As the GWT plugins can't be downloaded easily now, we supply GWT plugins for Eclipse 4.4 Luna, you can install the plugin to eclipse by offline mode)
-
Import the code into your IDE via maven project
- Click project peoperties by right key to modify project facets. Keep these facets as bellows will make your project structure more clearly.
- The project support java version 1.7 by default, you can switch to other java version by project configuration modification.
- Use maven clean and maven compile goal to download all related dependency package. Right click project
Run as, selectmaven cleanto clean the project and then selectmaven build...to configure goal for project compiling.
- If you meet tools.jar can not be found error, you should modify your project JRE System Library from JRE to JDK path. The eclipse import project use JRE path by default.
-
When you finish compiling the project, please refresh it.
-
Use gwt-maven plugin goal to compile the gwt project. If you don't do this step, you can't see some images in your browser when you login the system.
- Right click the project
Run as, selectWeb Application(GWT Super Dev Mode). Then selectindex.htmlpage to run the project.
- After a while for above steps, you can see the finished
Consolestate.
- Switch to the
Development Mode, and you can visit the project through the url.
Our server cluster is based on Docker, thus you can build run time environment on your own computer. It is convenient for you to develop project without any remote connections. Furthermore, you can also contribute to the server environments. The docker version server cluster is not stable and efficient, for which we can do a series of things on it. However, you first step to access it is installing Docker.
- Just follow the official guide to install Docker. If you install the
docker in windows, at leastMinimum Memory Requirement (RAM) = 8GBfor standalone computer. Hard drive must containenough space (10GB)in which you install Docker. Otherwise it will be very very slow. - Make sure your docker service runs correctly via
Docker infoandDocker version - No matter which system your computer is, stop the Firewall of your system
- If you are using centos 7, you also should stop the selinux, in order to avoiding run Docker container error.
- Run
sudo docker run hello-worldto see if we have install docker successfully.
-
Pull our mysql server images from our docker hub:
docker pull nkxujun/mysql_eml:version (latest default)
-
Our Eml server images is based on ubuntu, so pull it first:
docker pull nkxujun/ubuntu_eml:version(latest default) -
You can use
docker imagesto see if you have pulled these two images successfully:
Every single server in our cluster is created by one docker image, and this image can be built via a Dockerfile which has defined by us and includes all utilities we need such as hadoop. Thus we need to download the Dockerfile and all dependent files and configuration files from our google drive disk or Baidu Cloud.
-
Enter the path of the files which you have download last step
-
Use build.sh to build our image, this process will last for a few minute
sh build.sh -
You can use
docker imagesto see if you have built successfully:
-
You can use
sh run_containers.shto run all servers:
-
If your have run all these four server successed: mysql, hadoop-master, hadoop-slave1,hadoop-slave2, you can use
docker psto check
Because the hadoop cluster network communication depend on ssh, we need to confirm that the three server can do ssh log-in without password.
-
We can use
docker exec -it hadoop-master /bin/bashto enter the container named Hadoop-master -
In Hadoop-master,use
ssh localhost、ssh hadoop-slave1andssh hadoop-slave2to test the ssh function and do not forget exit after each ssh test, check all the hadoop containers ssh service is working:
-
If the ssh does not work, enter each container and execute this:
/etc/init.d/ssh start -
If you meet Bad owner or permissions on ~/.ssh/config. exception when you test
ssh localhost, you can executechmod 600 ~/.ssh/configcommand and try again.
- Add your Localhost(Linux) or Docker IP(Windows, if you use DockerToolBox to install, the default virtual IP is
192.168.99.100) ashadoop-master,hadoop-slave1,hadoop-slave2andmysqlto your hosts file, for example:
10.20.11.7 hadoop-master 10.20.11.7 mysql 10.20.11.7 hadoop-slave1 10.20.11.7 hadoop-slave2
- Run
sh init_mysql.shto prepare the databases for Oozie and EasyML Studio
Our EML installation package in version 1.2.0 or above could support Tensorflow mode.Follow the steps bellow If you want to have a try.
- Enter the
hadoop-mastercontainer via docker exec -it hadoop-master /bin/bash - Run
sh /root/config-tensorflow.shto config the environment that tensorflow need. - If you want to config the environment yourself, you can do use the tensorflow folder scripts in EasyML Project.
- Enter the hadoop-master container via
docker exec -it hadoop-master /bin/bash(a vital important command to enter every container) - Run
sh /root/start-hadoop.shto start hadoop and spark service - You can use
sh /root/run-wordcount.shto test the hadoop service - Visit http://hadoop-master:50070/ in your browser to check namenode and every datanode's status:
- Enter the hadoop-master container via
docker exec -it hadoop-master /bin/bash - Run
sh /root/start-oozie.shto start Oozie and Tomcat service, it will spend some time:
- We have started an ooize task example in the start shell, you can visit http://hadoop-master:11000/oozie/ in your browser and refresh for the detail of the task.
oozie admin -oozie http://hadoop-master:11000/oozie -status //Meet below exception Error: IO_ERROR:java.net.ConnectException: Connection refused
then you can try to restart oozie again in hadoop-master container like belows:
rm -rf $OOZIE_HOME/logs/* //Clear log directory rm -rf $OOZIE_HOME/oozie-server/logs/* //Clear log directory rm -rf $OOZIE_HOME/oozie-server/temp/* //Clear temp directory rm -rf $OOZIE_HOME/oozie-server/webapps/oozie/ //Delete oozie rm -rf $OOZIE_HOME/oozie-server/webapps/oozie.war //Delete oozie.war ./root/start-oozie.sh //Rerun the start script about oozie
- Visit http://hadoop-master:18080/EMLStudio in your browser and log in via username:
bdaict@hotmail.comand password:bdaict, you can find a example in the task list. - Clone it and submit to the server. If the task can run correctly, congratulations on your successful configuration.
- If you want to rebuild the cluster, you can use
sh stop_containers.shto stop and remove hadoop-master, hadoop-slave1 and hadoop-slave2 containers and usesh rm_images.shto remove cluster image. - If you have restart your docker or entity machine, the containers need to be restarted, you can use
sh restart_service.shto restart all containers. But you need to note that after you have executed thesh restart_service.sh, you will enter the hadoop-master container, meanwhile you also need to execute thesh restart.shin hadoop-master container to restart hadoop, spark and oozie service.





















