Releases: mbaityje/plankifier
Final classifier used for publication
Script used for training the models used in publication: https://www.frontiersin.org/articles/10.3389/fmicb.2021.746297/full
Added Ciliates classifier
This version adds a new classifier: an EfficientNet-B0 network trained on Ciliates and Non-ciliates data.
Note: the classifier labels Ciliates as Non-negatives and Non-ciliates as Negatives.
To predict new data, use the following files in the directory ./trained_model/Final_ciliates_eff0_classifier/:
a) model file --> keras_model_finetune.h5
b) weight file --> bestweights_finetune.hdf5
Please note that the same commands from the earlier version v1.2.1 work here as well.
Binary, multi and versusall classifiers with option for hypertuning
The syntax for predicting images is the same as in previous releases, but there are some changes to the training script.
During training, the user can now choose between binary, multi and versus-all classifiers. When hp_tuning is set to yes (to perform hyperparameter tuning), more than one model can be selected for training. Two image pre-processing options are available: resize the image to the desired size while keeping the original proportions, or resize without keeping the proportions. The user can also save the data and the filenames. Finally, the user can choose models for either average ensembling or stacking ensembling. The classification report, confusion matrix and plots are saved in the output directory.
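The two pre-processing options can be sketched as follows. This is a minimal illustration using Pillow, with hypothetical function names of my choosing; it is not the repository's actual implementation.

```python
from PIL import Image

def resize_keep_proportions(img, size, fill=(255, 255, 255)):
    """Shrink so the longer side equals `size`, then pad to size x size."""
    img = img.copy()
    img.thumbnail((size, size))           # keeps the aspect ratio
    canvas = Image.new("RGB", (size, size), fill)
    # paste the shrunk image centered on the padded canvas
    canvas.paste(img, ((size - img.width) // 2, (size - img.height) // 2))
    return canvas

def resize_ignore_proportions(img, size):
    """Stretch to size x size, distorting the aspect ratio."""
    return img.resize((size, size))
```

With the first option an elongated plankton image keeps its shape and gains padding; with the second it is distorted but no pixels are padding.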
For the end user who wants to use the trained models for predictions,
Please install these important libraries (the list is not exhaustive):
pip install git+https://github.com/qubvel/efficientnet
pip install -U scikit-learn
pip install tensorflow==2.3.0
pip install keras==2.4.3
pip install imutils==0.5.3
Then to predict your images, use:
python predict.py -modelfullnames './trained-models/eff0_for_DiatomCentrics/keras_model_finetune.h5' -weightnames './trained-models/eff0_for_DiatomCentrics/bestweights_finetune.hdf5' -ensMethods 'leader' -testdirs './data/Centric_Diatoms_latest/Validation_from_experts/Centric_diatoms/' './data/Centric_Diatoms_latest/Validation_from_experts/Centric_negatives/' -predname './out/predictions/Centric_diatoms' -thresholds 0.99 0.8 2>/dev/null
If you have images for validation, i.e. images separated by label into specified folders, and want to see how the model performs, use the command below. You can also copy the misclassified images into a separate directory for further investigation by setting the -save_misclassified parameter.
python validation.py -modelnames ./trained-models/eff0_for_DiatomCentrics/keras_model_finetune.h5 -weightnames ./trained-models/eff0_for_DiatomCentrics/bestweights_finetune.hdf5 -datapaths ./data/Centric_Diatoms_latest/Validation_from_experts/ -classifier binary -outpath ./out/misclassified/ -save_misclassified yes
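Conceptually, the -save_misclassified option does something like the sketch below. The function name, signature and the shape of `results` are illustrative assumptions of mine, not the script's actual internals.

```python
import shutil
from pathlib import Path

def save_misclassified(results, outpath):
    """Copy misclassified images to `outpath` for inspection.

    results: list of (image_path, true_label, predicted_label) tuples.
    Returns the list of copied destination paths.
    """
    out = Path(outpath)
    out.mkdir(parents=True, exist_ok=True)
    copied = []
    for img, true_label, pred_label in results:
        if true_label != pred_label:            # a misclassification
            dest = out / Path(img).name
            shutil.copy(img, dest)
            copied.append(dest)
    return copied
```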
More classes
The current release includes models trained on more classes. The syntax is the same as previous releases, but the models in ./trained-models are now trained on 34 classes.
Multiple arguments on predictions
Changes to predict.py
This patch improves usability. A single run can now make predictions on several directories. The ensembling method and the abstention threshold can also be given as multiple arguments, and the program will loop through them.
Note that the keyword for the ensembling method was changed from em to ensMethod, absthres was changed to thresholds, and most of the other keywords were changed to the plural. For a summary, run python predict.py -h.
Example of the new usage:
python predict.py -ensMethods 'leader' -testdirs 'data/1_zooplankton_0p5x/validation/tommy_validation/images/asterionella/' 'data/1_zooplankton_0p5x/validation/tommy_validation/images/uroglena/' 'data/1_zooplankton_0p5x/validation/tommy_validation/images/asplanchna/' -thresholds 0.99 0.8 2>/dev/null
In addition to the screen output, this example produces two files, predict_leaderabs0.8.txt and predict_leaderabs0.99.txt, with the prediction from each set of options.
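The looping over the argument combinations, and the naming pattern of the output files, can be sketched like this (a minimal illustration; the helper name is mine, not the script's):

```python
from itertools import product

def output_names(methods, thresholds, stem="predict"):
    """One output file per (ensembling method, abstention threshold) pair,
    following the predict_<method>abs<threshold>.txt naming pattern."""
    return [f"{stem}_{m}abs{t}.txt" for m, t in product(methods, thresholds)]
```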
Changed core and introduced ensembling
What's new
I completely modified the core of the program, but you won't notice this. What you will notice are the improvements to the classifier. The main user-level feature is that I implemented some simple forms of ensembling and open-set recognition. In this release, you have 4 classifiers, which you can use by themselves or together. Also, when hesitating, the classifier will refrain from making a prediction (which is what the taxonomist does with the unknown classes).
Usage
The only file that practitioners should look at is predict.py.
To display help with the command line arguments, launch as
python predict.py -h
Here is an example of usage that implements a weighted-majority rule with a 1.0 abstention threshold (see explanations at the bottom of the page):
python predict.py -testdir ./data/1_zooplankton_0p5x/validation/tommy_validation/images/keratella_quadrata/ -absthres=1.0 -em 'weighted-majority' 2>/dev/null
Here is a unanimity rule with a 0.8 abstention threshold:
python predict.py -testdir ./data/1_zooplankton_0p5x/validation/tommy_validation/images/keratella_quadrata/ -absthres=0.8 -em 'unanimity' 2>/dev/null
The screen output should look like this
./data/1_zooplankton_0p5x/validation/tommy_validation/images/keratella_quadrata/SPC-EAWAG-0P5X-1563026664165210-9932037279651-002549-028-1074-1754-64-36.jpeg keratella_quadrata
./data/1_zooplankton_0p5x/validation/tommy_validation/images/keratella_quadrata/SPC-EAWAG-0P5X-1575360491907312-8542395764139-004819-014-870-1900-56-52.jpeg keratella_quadrata
./data/1_zooplankton_0p5x/validation/tommy_validation/images/keratella_quadrata/SPC-EAWAG-0P5X-1575360195859503-8542099739532-001859-020-2384-1318-64-48.jpeg keratella_quadrata
./data/1_zooplankton_0p5x/validation/tommy_validation/images/keratella_quadrata/SPC-EAWAG-0P5X-1530835631817821-2549602643688-004219-000-1812-2554-40-45.jpeg keratella_quadrata
./data/1_zooplankton_0p5x/validation/tommy_validation/images/keratella_quadrata/SPC-EAWAG-0P5X-1572948291990397-6130232224429-002829-077-2144-442-72-32.jpeg keratella_quadrata
and it should also be printed in the file ./predict/predict.txt. If the folder ./predict/ does not exist, it is automatically created.
If you suspect something is going wrong, remove 2>/dev/null from the command (it hides the annoying messages that Keras prints, but it also hides warnings and errors!).
Some more explanation
Ensemble rules
I implemented 4 rules to obtain a collaboration between the classifiers:
- 'unanimity': We only accept guesses where all the classifiers agree. Disagreements result in 'Unclassified'.
- 'majority': Takes the choice of the majority of classifiers. I didn't focus on dealing with ties, because I don't like this method.
- 'leader': Takes the choice of the most confident model.
- 'weighted-majority': Sums the confidences of all the classifiers, and chooses the class with the highest summed confidence.
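The four rules above can be sketched in a few lines. This is an illustrative re-implementation under the assumption that each model returns a dict mapping class name to confidence; it is not the repository's actual code.

```python
from collections import Counter

def unanimity(preds):
    """preds: list of per-model {class: confidence} dicts."""
    tops = [max(p, key=p.get) for p in preds]
    return tops[0] if len(set(tops)) == 1 else "Unclassified"

def majority(preds):
    tops = [max(p, key=p.get) for p in preds]
    return Counter(tops).most_common(1)[0][0]   # ties broken arbitrarily

def leader(preds):
    # the model whose top confidence is highest decides
    best = max(preds, key=lambda p: max(p.values()))
    return max(best, key=best.get)

def weighted_majority(preds):
    summed = {}
    for p in preds:
        for cls, conf in p.items():
            summed[cls] = summed.get(cls, 0.0) + conf
    return max(summed, key=summed.get)
```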
Abstention
Abstention is a way of having the classifiers state that they don't know enough. The normal behavior is that if a classifier's confidence is smaller than absthres, then the classifier will abstain from making predictions.
With 'weighted-majority' the behavior is different: instead of filtering on the single models, we filter on the sum of the confidences, so the confidence can be larger than one.
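The two abstention behaviors can be sketched like this (illustrative names and signatures, not the script's code):

```python
def predict_or_abstain(confidences, absthres):
    """Single model: abstain if its top confidence is below absthres."""
    top = max(confidences, key=confidences.get)
    return top if confidences[top] >= absthres else "Unclassified"

def weighted_majority_or_abstain(per_model, absthres):
    """Weighted majority: the threshold applies to the *summed* confidence,
    which can exceed one when several models are summed."""
    summed = {}
    for p in per_model:
        for cls, conf in p.items():
            summed[cls] = summed.get(cls, 0.0) + conf
    top = max(summed, key=summed.get)
    return top if summed[top] >= absthres else "Unclassified"
```

This is why a threshold like 1.0 makes sense for 'weighted-majority' even though a single model's confidence never exceeds 1.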
Output
You can specify the output directory with the -outpath option. You can also specify the output filename with -predname option.
Classifier loading
The classifiers must be loaded. By default, they are assumed to be in the directory ./out/trained-models/, and all four of them are loaded. If the paths change, or if you do not want to use all four, specify their paths through the option -modelfullname, separated by spaces. The model is the keras_model.h5 file.
Modifications on output
A pre-release for playing around with zooplankton data and feedback on API
The only file that practitioners should look at is predict.py.
Main changes
- Undesired prints were removed
- Added the -preddir and -predname options, which allow specifying the path and name of the output file
Minor improvements in input/output
A pre-release for playing around with zooplankton data and feedback on API
The only file that practitioners should look at is predict.py.
Main changes
- The script predict.py now produces a summary classification file, called predict.txt.
- The target argument was removed, and testdir is now directly the folder containing the images (instead of containing folders with images).
- Error messages slightly improved.
Launch as:
python predict.py -testdir='path-to-images'
To query all options:
python predict.py -h
Patch: handle exception when training dataset unavailable
A consistency check was performed comparing the read data with the original dataset.
Users do not have access to this dataset, so the exception raised when the dataset is unavailable must be handled. This is done in the current patch.
(This same release was done earlier, but was targeted on the wrong branch).
Patch: added loadable model
Previous release did not include the path to a loadable model.
Here, we provide the model in the folder util-files/trained-conv2/