• An automatic zooplankton identification model has been developed for 114 taxonomic categories.

• The model successfully distinguishes species and stages.

• Various model validations show high model performance for identifying key zooplankton taxa.

• The model makes possible unprecedented insights into the fine-scale vertical distribution of taxa.


We deployed the Lightframe On-sight Keyspecies Investigation (LOKI) system, a novel underwater imaging system with cutting-edge image quality, in the Canadian Arctic during fall 2013. A Random Forests machine learning model was built to automatically identify zooplankton in LOKI images. The model successfully distinguished between 114 categories of zooplankton and particles. The high-resolution taxonomic tree included many species and stages, as well as sub-groups based on animal orientation or condition in the images. Predictions from a machine learning regression model of prosome length (R² = 0.97) were used as a key predictor in the automatic identification model. Internal validation of the identification model on test data showed high overall accuracy (86%) and specificity (86%). This was confirmed by confusion matrices from external testing, based on automatic identifications for two complete stations. For station 101, from which images had also been used for training, accuracy and specificity were both 85%. For station 126, from which no images had been used to train the model, accuracy and specificity were both 81%. Model results also agreed well with microscope identifications of zooplankton in samples from the two test stations for most taxa. LOKI’s image quality makes it possible to build accurate automatic identification models with very high taxonomic detail, which will play a critical role in future studies of zooplankton dynamics and of zooplankton coupling with other trophic levels.
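For readers unfamiliar with how accuracy and specificity are read off a confusion matrix, here is a minimal sketch in plain Python. It assumes the common one-vs-rest definition of specificity, TN / (TN + FP); the abstract does not state which definition the paper uses. The function names and the small 3-class matrix are hypothetical illustrations, not data or code from the study.

```python
def accuracy(cm):
    """Overall accuracy: correctly classified images divided by all images.

    cm is a square confusion matrix (list of lists) where rows are true
    classes and columns are predicted classes.
    """
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total


def specificity_per_class(cm):
    """One-vs-rest specificity TN / (TN + FP) for each class.

    Assumption: this is the standard definition, which may differ from
    the one used in the paper.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    result = []
    for k in range(n):
        tp = cm[k][k]
        fn = sum(cm[k]) - tp                         # true class k, predicted as other
        fp = sum(cm[i][k] for i in range(n)) - tp    # other classes predicted as k
        tn = total - tp - fn - fp
        result.append(tn / (tn + fp) if (tn + fp) else float("nan"))
    return result


# Hypothetical 3-class confusion matrix (illustrative only).
cm = [
    [50, 3, 2],
    [4, 40, 6],
    [1, 5, 39],
]

acc = accuracy(cm)
spec = specificity_per_class(cm)
macro_spec = sum(spec) / len(spec)  # unweighted mean across classes
```

With many categories, as in the 114-class model described above, the same functions apply unchanged. The per-class values also show which taxa are most often confused with others.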

Also see ResearchGate for the publication!


November 23rd, 2016

#ggforce for accelerating #ggplot2 in #dataviz

November 20th, 2016

November 16th, 2016

ggplot2 2.2.0 #dataviz #R #datascience @rstudio

November 16th, 2016

ggedit add-on for #ggplot2 #dataviz #R #datascience

August 27th, 2016

#R Packages for #Data Access

August 16th, 2016

Getting Your Colleagues Hooked on #R

August 14th, 2016

Convolutional #neuralnetwork in #R (MXNet package) #MachineLearning #DataScience

June 3rd, 2016

Mad Hatter Explains Support Vector Machines #scicomm #machinelearning #SVM #datascience

May 11th, 2016

What’s the difference between machine learning, statistics, and data mining?

April 6th, 2016

Plotter app for interactive plotting of ggplots, on or locally.