I have organized some notes on things I had been confused about or only vaguely understood.
Training, validation ordering
```python
def train():
    for epoch in range(epochs):
        training()
        validate()
```

This ordering is correct: validation runs after each training epoch. The version below also trains the model on the input data:
```python
def train():
    for epoch in range(epochs):
        training()
    for epoch in range(epochs):
        validate()
```

The problem is that validation happens only after all training is complete: it just validates the final trained model repeatedly, once per epoch. A waste of resources.
With the proper ordering, you can validate the result of each epoch and reflect it in the evaluation.
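As a concrete sketch of the correct per-epoch ordering, here is a self-contained toy version. The "model" is a single weight fitted to y = 2x so it runs without any ML framework; `training()` and `validate()` are illustrative stand-ins, not real framework calls.

```python
# Toy dataset: y = 2x, split into train and validation parts.
train_data = [(x, 2.0 * x) for x in range(1, 6)]
val_data = [(x, 2.0 * x) for x in range(6, 9)]

w = 0.0    # the single model parameter
lr = 0.01  # learning rate

def training():
    """One epoch of SGD on squared error (w*x - y)^2."""
    global w
    for x, y in train_data:
        grad = 2 * (w * x - y) * x
        w -= lr * grad

def validate():
    """Mean squared error on the held-out validation data."""
    return sum((w * x - y) ** 2 for x, y in val_data) / len(val_data)

val_losses = []
for epoch in range(20):
    training()                     # train one epoch...
    val_losses.append(validate())  # ...then validate that epoch's result
```

Because `validate()` runs after every epoch, `val_losses` records how the model improves over time, which is exactly what the "train everything first, validate afterwards" version cannot show.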
K-fold cross validation
As the name says, it is a validation technique, so it should not be used for training; it is used like this:
```python
def train():
    for epoch in range(epochs):
        training()
        validate()

def kfoldvalidate():
    # do something...
    ...

train()
kfoldvalidate()
```

I think it could be used in training, but if so, it would probably look something like this with ensemble learning (just my speculation…):
```python
def train():
    model_list = MakeManyModel()
    for idx, (train_set, validate_set) in enumerate(kfold(dataset)):
        for epoch in range(epochs):
            training(model_list[idx])
            validate(model_list[idx])
    return model_list

def kfoldvalidate(model_list):
    return SelectBestModel(model_list)

model_list = train()
kfoldvalidate(model_list)
```

One model exists per fold, and the best among the k models is selected. You could also use voting, as in actual ensemble learning, instead of picking the single best model. I did not try it because it would consume too many resources.
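To make the speculative "one model per fold" idea concrete, here is a minimal runnable sketch. Everything in it is my own stand-in: `kfold()` is a hand-rolled splitter, each "model" is a single weight fitted to a noisy y = 2x dataset, and "best" means lowest validation loss on that model's held-out fold.

```python
# Toy dataset: y = 2x with small alternating noise.
dataset = [(x, 2.0 * x + (0.1 if x % 2 else -0.1)) for x in range(1, 13)]
K = 3
EPOCHS = 50
LR = 0.005

def kfold(data, k=K):
    """Yield (train_set, validate_set) pairs, one per fold."""
    fold_size = len(data) // k
    for i in range(k):
        val = data[i * fold_size:(i + 1) * fold_size]
        train = data[:i * fold_size] + data[(i + 1) * fold_size:]
        yield train, val

def training(w, train_set):
    """One epoch of SGD on squared error; returns the updated weight."""
    for x, y in train_set:
        w -= LR * 2 * (w * x - y) * x
    return w

def validate(w, validate_set):
    """Mean squared error on the held-out fold."""
    return sum((w * x - y) ** 2 for x, y in validate_set) / len(validate_set)

def train():
    """Train one model per fold; return all models and their fold losses."""
    model_list, loss_list = [], []
    for train_set, validate_set in kfold(dataset):
        w = 0.0
        for epoch in range(EPOCHS):
            w = training(w, train_set)
        model_list.append(w)
        loss_list.append(validate(w, validate_set))
    return model_list, loss_list

def select_best_model(model_list, loss_list):
    """Pick the model with the lowest held-out loss."""
    return model_list[loss_list.index(min(loss_list))]

models, losses = train()
best = select_best_model(models, losses)
```

Each of the k models ends up close to the true slope of 2, and `best` is the one whose held-out fold it fit most closely. A voting or averaging variant would combine all k models' predictions instead of discarding k-1 of them.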