How do I assign different weights to training samples?
Learn how to use instance-level weights to influence model training in Kumo.
Why assign different weights to training samples?
In some use cases, you may want to assign different levels of importance to samples in the training table. For instance, on an e-commerce platform, it can be valuable to emphasize high-value customers more during training. This can be done by assigning larger weights to those training instances.
Kumo supports sample-level weighting via the training table. Using the kumo-sdk, you can add a weight column to scale each row’s relative contribution to the loss function. These weights directly influence the model’s optimization process and should be used thoughtfully.
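As a minimal sketch of what this looks like in practice, the snippet below builds a training table with a weight column using pandas. The column name `weight`, the threshold, and the weighting scheme are all illustrative assumptions; check the kumo-sdk documentation for how your version expects the weight column to be named and registered.

```python
import pandas as pd

# Hypothetical training table: one row per customer, with a binary target.
train = pd.DataFrame({
    "customer_id": [101, 102, 103, 104],
    "target": [1, 0, 1, 0],
    "lifetime_value": [5000.0, 120.0, 8000.0, 90.0],
})

# Up-weight high-value customers (illustrative rule: 2x weight above
# a $1,000 lifetime-value threshold). The column name "weight" is an
# assumption -- consult the kumo-sdk docs for your version.
train["weight"] = (train["lifetime_value"] > 1000.0).map({True: 2.0, False: 1.0})

print(train[["customer_id", "weight"]])
```

Rows for customers 101 and 103 then contribute twice as much to the loss as the others.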
Guidelines for Setting Weights
- Negative weights: Kumo allows you to assign negative weights to some (but not all) training instances, and only for non-link-prediction tasks. This can be useful in advanced setups such as hard negative mining. However, large or excessive negative weights can severely degrade model performance, since they can make the loss unbounded and the training process unstable.
- Zero weights: Kumo allows you to assign zero weight to some (but not all) training instances. Setting a weight of zero effectively removes that sample from training, which can be helpful for excluding noisy or irrelevant instances. That said, assigning zero weights to too many samples shrinks the effective loss signal and can slow convergence.
- Highly skewed weights: Moderate variation in weights is usually fine, but extreme disparities between sample weights can destabilize training, slow convergence, and lead to poor generalization. If you use a skewed weight distribution, monitor its effect on the training loss curve closely.
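The guidelines above can be made concrete with a small standalone calculation (this is an illustration of weighted loss in general, not Kumo's internal implementation). The sketch below computes a weight-normalized binary cross-entropy and shows two effects: a zero weight drops a sample from the loss entirely, and a heavily skewed weight lets one sample dominate the average.

```python
import numpy as np

def weighted_bce(y_true, p_pred, w):
    """Weighted binary cross-entropy, normalized by the total weight.

    Note: with this normalization, negative weights can drive the
    denominator toward zero, which is one way the loss becomes unbounded.
    """
    per_sample = -(y_true * np.log(p_pred) + (1 - y_true) * np.log(1 - p_pred))
    return float(np.sum(w * per_sample) / np.sum(w))

y = np.array([1.0, 0.0, 1.0])
p = np.array([0.9, 0.2, 0.6])   # model's predicted probabilities

uniform = weighted_bce(y, p, np.array([1.0, 1.0, 1.0]))
# Zero weight: the third (hardest) sample no longer contributes at all.
zeroed = weighted_bce(y, p, np.array([1.0, 1.0, 0.0]))
# Skewed weight: the first sample dominates, pulling the loss toward
# its own per-sample value and hiding the signal from the other two.
skewed = weighted_bce(y, p, np.array([100.0, 1.0, 1.0]))
```

Here `zeroed < uniform` because the excluded sample was the hardest one, and `skewed` sits close to the first sample's individual loss, illustrating how extreme weights can mask the rest of the training signal.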
Instance weighting can better align training with business value, but carefully evaluate the effect of your chosen weights on loss stability and model performance before deploying your model.