Using Adapters and the DeploymentModel class

Alongside full finetuning, finetune also supports the adapter finetuning strategy from “Parameter-Efficient Transfer Learning for NLP”. Because only the small adapter layers are trained and saved, serialized model files shrink to roughly 30 MB. When used with the DeploymentModel class at inference time, this makes it possible to switch quickly between target models.

# First we train and save a model using the adapter finetuning strategy
from finetune import Classifier, DeploymentModel
from finetune.base_models import GPT
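# X and Y are assumed to be already-defined training texts and their labels
model = Classifier(adapter_size=64)
model.fit(X, Y)
model.save('adapter-model.jl')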

# Then we load it using the DeploymentModel wrapper
# adapter_size must match the value the saved model was trained with
deployment_model = DeploymentModel(featurizer=GPT, adapter_size=64)

# Loading the featurizer only needs to be done once
deployment_model.load_featurizer()

# You can then cheaply load + predict with any adapter model that uses the
# same base_model and adapter_size
deployment_model.load_custom_model('adapter-model.jl')
predictions = deployment_model.predict(testX)

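# Switching to another model takes only 2 seconds now rather than 20
# ('other-adapter-model.jl' is a placeholder path for a second saved adapter model)
deployment_model.load_custom_model('other-adapter-model.jl')
predictions = deployment_model.predict(testX)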
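
To check the switching time on your own hardware, a small timing helper can be wrapped around whichever call loads or switches the model. The sketch below is illustrative only: time_call and fake_load are hypothetical names, not part of finetune, and fake_load merely stands in for a real model-loading call.

```python
import time


def time_call(fn, *args, **kwargs):
    """Run fn(*args, **kwargs) and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


# Hypothetical stand-in for an expensive load step; replace it with the
# call that loads or switches your model.
def fake_load(path):
    time.sleep(0.01)
    return path


loaded, seconds = time_call(fake_load, 'adapter-model.jl')
print(f"loaded {loaded} in {seconds:.3f}s")
```

Timing the first load and a later switch separately shows how much of the cost comes from the one-time featurizer load.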