Hypertuning for TensorFlow & PyTorch



Red is quite popular despite less functionality and fewer layer types — Image by Author

Tensors of a feather flock together.

One year ago, I set out to build a library that made experiment tracking & parameter tuning easier for both TensorFlow and PyTorch. These libraries shared the same underlying concepts, so it was easy enough to wrap them:

  • fn_build — for the model architecture.
  • fn_train — to define the training loop.
  • fn_lose — to calculate the loss.
  • fn_optimize — for the optimizer and learning settings.
  • fn_predict — for running inference.

It was fun! I was testing entirely different architectures with simple if statements (e.g. concave_convex:[True,False]), and even mixing Torch loss functions with Keras models.
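To give a feel for that, here is a minimal sketch of such a toggle. The concave_convex key comes from the example above, but the layer stacks on each branch are my own guess at what it might control, not AIQC code:

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Input

def fn_build(features_shape, label_shape, **hp):
    model = Sequential()
    model.add(Input(shape=features_shape))
    if hp['concave_convex']:
        # One branch of the architecture: widen out before narrowing back in.
        model.add(Dense(units=hp['neuron_count'] * 2, activation='relu'))
        model.add(Dense(units=hp['neuron_count'], activation='relu'))
    else:
        # The other branch: a single hidden layer.
        model.add(Dense(units=hp['neuron_count'], activation='relu'))
    model.add(Dense(units=label_shape[0], activation='softmax'))
    return model

Because concave_convex:[True,False] sits in the hyperparameter grid like any other value, both branches get trained and compared automatically.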

Based on the type of analysis I was conducting (e.g. regression, binary classification, or multi-label classification), I began to notice that I was using the same lose-optimize-predict combinations. So I was able to set overridable defaults for most of these components.
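When a default does not fit, you can pass your own function instead. As a rough sketch for Keras (my own example; the bare **hp signatures are an assumption rather than verbatim AIQC documentation):

import tensorflow as tf

def fn_lose(**hp):
    # e.g. replace the default loss with crossentropy plus label smoothing.
    return tf.keras.losses.CategoricalCrossentropy(label_smoothing=0.1)

def fn_optimize(**hp):
    # e.g. replace the default optimizer with Adam and a tunable learning rate.
    return tf.keras.optimizers.Adam(learning_rate=hp.get('learning_rate', 0.001))

These would then be passed as fn_lose and fn_optimize in the Experiment below, instead of None.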

♻️ Similar to how Keras abstracts a training loop, I had abstracted a loop of training loops, while maintaining the customizability of the workflow.

Here is a basic multi-label classification example:

# Keras building blocks used by the model functions below.
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Input
from tensorflow.keras.callbacks import History

# Each unique param combo gets passed into the functions as `**hp`.
hyperparameters = {
    "neuron_count": [9, 12]
    , "batch_size": [3, 5]
    , "epoch_count": [30, 60]
}

def fn_build(features_shape, label_shape, **hp):
    model = Sequential()
    model.add(Input(shape=features_shape))
    model.add(Dense(units=hp['neuron_count'], activation='relu'))
    model.add(Dense(units=label_shape[0], activation='softmax'))
    return model

def fn_train(
    model, loser, optimizer,
    samples_train, samples_evaluate, **hp
):
    model.compile(
        loss = loser
        , optimizer = optimizer
        , metrics = ['accuracy']
    )
    model.fit(
        samples_train["features"]
        , samples_train["labels"]
        , validation_data = (
            samples_evaluate["features"]
            , samples_evaluate["labels"]
        )
        , verbose = 0
        , batch_size = hp['batch_size']
        , epochs = hp['epoch_count']
        , callbacks = [History()]
    )
    return model

These components are used to assemble a Queue of training jobs:

queue = aiqc.Experiment.make(
    # --- Analysis type ---
    library = "keras"
    , analysis_type = "classification_multi"

    # --- Model functions ---
    , fn_build = fn_build
    , fn_train = fn_train
    , fn_lose = None # auto-default: categorical crossentropy.
    , fn_optimize = None # auto-default: Adamax <3.
    , fn_predict = None # auto-default: returns `preds, probs`.

    # --- Training options ---
    , repeat_count = 2
    , hyperparameters = hyperparameters

    # --- Data source ---
    , splitset_id = splitset.id # splitset prepared from the dataset (not shown here).
    , hide_test = False
)
queue.run_jobs()
# 🔮 Training Models 🔮: 100%|███████████████████████| 16/16
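The job count is easy to sanity-check: the grid has 2 × 2 × 2 = 8 unique hyperparameter combinations, and repeat_count = 2 trains each of them twice, which is where the 16 comes from. A quick way to verify the combinatorics yourself (plain Python, not part of AIQC):

from itertools import product

combos = list(product(*hyperparameters.values()))
print(len(combos))      # 8 unique combinations
print(len(combos) * 2)  # 16 jobs with repeat_count = 2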

However, when it came time to evaluate each model with metrics & charts, I realized that several critical problems were being swept under the rug.
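For context, here is roughly what evaluating a single model by hand looks like, assuming a trained Keras model as returned by fn_train above (my sketch, not AIQC's built-in charts). Now imagine repeating this, consistently, for all 16 jobs and every metric and split you care about:

import matplotlib.pyplot as plt

history = model.history.history  # per-epoch metrics recorded by model.fit()

fig, ax = plt.subplots()
ax.plot(history['loss'], label='train loss')
ax.plot(history['val_loss'], label='validation loss')
ax.set_xlabel('epoch')
ax.set_ylabel('loss')
ax.legend()
plt.show()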

AIQC’s Built-In Visualizations — Image by Author
