
🏃 Active Learning 🏃

[GIF: querying the active learner in the FiftyOne App]

When it comes to machine learning, one of the most time-consuming and costly parts of the process is data annotation. Especially in the realm of computer vision, labeling images or videos can be an incredibly laborious task, often requiring a team of annotators and hours of meticulous work to generate high-quality labels.

What if you could make this process smarter and more efficient? Enter Active Learning — a paradigm that iteratively selects the most "informative" or "ambiguous" examples for labeling, thereby reducing the amount of manual annotation needed. In practical terms, this means your model gets better, faster, and with fewer labeled samples.

This FiftyOne plugin brings Active Learning to your computer vision data using the modAL library, allowing you to integrate this accelerant directly into your annotation workflow. Now you can prioritize, query, and annotate the most crucial data points, all within the FiftyOne App—no coding necessary.

The best part? You can use this in tandem with your traditional annotation service providers (via FiftyOne’s integrations with CVAT, Labelbox and Label Studio), or even with the FiftyOne Zero-shot Prediction plugin!

Watch On Youtube

[Video thumbnail]

Installation

fiftyone plugins download https://github.com/jacobmarks/active-learning-plugin

Then install the requirements:

fiftyone plugins requirements @jacobmarks/active_learning --install

Operators

create_active_learner

Creates an active learning model and environment. The learner is initialized from a set of initial labels and input features.

We can choose:

  • The field or fields to use as a feature vector
  • The label field in which to store predictions
  • The default batch size — the number of samples per query
  • The Active Learner

For the last of these, we can select from a variety of ensemble methods, including Random Forest, Gradient Boosting, Bagging, and AdaBoost. Once this top-level selection is made, the rest of the form dynamically updates with the appropriate hyperparameter options.

Executing this operator creates a modAL ActiveLearner that uses uncertainty-based batch sampling. Execution also generates the initial predictions and triggers a reload of the dataset.
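
Under the hood, this amounts to something like the following modAL setup. This is only an illustrative sketch, not the plugin's actual code; the array shapes, the RandomForestClassifier choice, and the random data are assumptions for the example:

import numpy as np
from sklearn.ensemble import RandomForestClassifier
from modAL.models import ActiveLearner
from modAL.batch import uncertainty_batch_sampling

# X_initial: (n_samples, n_features) feature vectors; y_initial: initial labels
X_initial = np.random.rand(20, 1280)          # e.g. embeddings (hypothetical shape)
y_initial = np.random.randint(0, 2, size=20)  # e.g. zero-shot labels encoded as ints

learner = ActiveLearner(
    estimator=RandomForestClassifier(),        # or Gradient Boosting, Bagging, AdaBoost
    query_strategy=uncertainty_batch_sampling, # uncertainty-based batch sampling
    X_training=X_initial,
    y_training=y_initial,
)

# Initial predictions over the unlabeled pool
X_pool = np.random.rand(100, 1280)
initial_preds = learner.predict(X_pool)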

query_learner

Queries the active learner for the next samples to label. If you'd like, you can override the default query batch size.

Tag the samples whose predicted labels are incorrect. Untagged samples will be treated as correct predictions.
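
In raw modAL terms, a query looks roughly like this (a sketch reusing learner and X_pool from the previous snippet; the batch size of 8 is arbitrary):

# Ask the learner for the most informative batch of unlabeled samples
query_idx, query_samples = learner.query(X_pool, n_instances=8)
print(query_idx)  # indices of the most uncertain samples to review next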

update_learner_predictions

After correcting the incorrect query labels, we can update our active learner by “teaching” it this new information. Running this operator updates our active learning model, updates the label field with new predictions, and reloads the app.
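
Conceptually, the teach step boils down to something like the following modAL calls (again a sketch building on the snippets above; y_corrected stands in for the labels after your tag-based corrections):

import numpy as np

y_corrected = np.asarray([0, 1, 1, 0, 1, 0, 0, 1])  # hypothetical corrected labels

learner.teach(X_pool[query_idx], y_corrected)  # refit on the newly labeled batch
new_preds = learner.predict(X_pool)            # refresh the prediction field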

Usage

0. Generate Initial Labels

Before we can create an active learner, we need to generate some initial labels. We can do this using the Zero-shot Prediction plugin:

[GIF: generating zero-shot labels]

Alternatively, we can use tags on some of our samples as labels, so long as they are mutually exclusive.
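
For example, if your tags are mutually exclusive class names, something along these lines could convert them into a label field (a rough sketch, not part of the plugin; the dataset name, tag names, and initial_labels field are hypothetical):

import fiftyone as fo

dataset = fo.load_dataset("your-dataset")  # replace with your dataset's name
label_tags = {"cat", "dog"}                # hypothetical, mutually exclusive tags

for sample in dataset.match_tags(list(label_tags)):
    tag = next(t for t in sample.tags if t in label_tags)
    sample["initial_labels"] = fo.Classification(label=tag)
    sample.save()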

1. Create Input Features

Next, we need to populate fields on our samples with numerical attributes (floats or arrays) that we can use as input features for our active learner.

A common choice is model embeddings, which can be computed either in the FiftyOne App, or in Python:

import fiftyone as fo
import fiftyone.zoo as foz

dataset = fo.load_dataset("your-dataset")  # replace with your dataset's name

# Compute MobileNet embeddings and store them in a vector field on each sample
mobilenet = foz.load_zoo_model("mobilenet-v2-imagenet-torch")
dataset.compute_embeddings(mobilenet, embeddings_field="mobilenet_embeddings")

You can also add float-valued fields. For example, using the Image Quality Issues Plugin you can compute the brightness, contrast, and saturation of your images!
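
If you prefer to stay in Python, a float-valued field can also be populated by hand. The snippet below is only an illustration (not the plugin's method) of computing a simple brightness score, assuming local image files and a hypothetical brightness field name:

import numpy as np
from PIL import Image

for sample in dataset:
    with Image.open(sample.filepath) as img:
        gray = np.asarray(img.convert("L"), dtype=float)
    sample["brightness"] = float(gray.mean() / 255.0)  # mean pixel intensity in [0, 1]
    sample.save()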

2. Create an Active Learner

Now we're ready to create an active learner. We can do this using the create_active_learner operator:

[GIF: creating the active learner]

3. Query the Active Learner

Once we've created an active learner, we can query it for the next batch of samples to label. We can do this using the query_learner operator:

[GIF: querying the learner for the first batch]

We then tag the samples whose predicted labels are incorrect. Untagged samples will be treated as correct predictions:

[GIF: tagging incorrect predictions from the first query]

4. Update the Active Learner

After correcting the incorrect query labels, we can update our active learner by “teaching” it this new information. We can do this using the update_learner_predictions operator:

[GIF: teaching the learner with corrected labels]

5. Repeat!

Now we can repeat steps 3 and 4 until we're satisfied with our model's performance.
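
For reference, the loop the plugin automates looks roughly like this in raw modAL terms (a sketch building on the earlier snippets; get_corrected_labels is a hypothetical stand-in for the human review step):

def get_corrected_labels(indices):
    # Stand-in for the human review step: in the App you tag incorrect
    # predictions, and the plugin derives the corrected labels for you
    return learner.predict(X_pool[indices])

for _ in range(5):  # hypothetical number of annotation rounds
    query_idx, _ = learner.query(X_pool, n_instances=8)  # step 3: query
    y_corrected = get_corrected_labels(query_idx)        # review/correct labels
    learner.teach(X_pool[query_idx], y_corrected)        # step 4: teach
    predictions = learner.predict(X_pool)                 # refresh predictions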
