.yaml configuration file
Code for this section:
Composition of the configuration file in LibFewShot
LibFewShot configuration files are written in YAML, and global configuration changes can also be read from the command line. A default configuration is pre-defined in core/config/default.yaml. Users can put their custom configuration into the config/ directory and save it in YAML format. When the configuration is parsed, definitions are applied in the order default.yaml -> config/ -> console, and a later definition overrides the same key in an earlier one.
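The override order described above can be illustrated with a minimal sketch. It assumes each source has already been loaded into a flat Python dict; the function name merge_configs and the example values are hypothetical and are not taken from LibFewShot itself.

```python
def merge_configs(default_cfg, file_cfg, console_cfg):
    """Apply the three layers in order; later layers override earlier ones."""
    merged = dict(default_cfg)   # 1. values from default.yaml
    merged.update(file_cfg)      # 2. values from the file in config/
    merged.update(console_cfg)   # 3. values given on the command line
    return merged


# Hypothetical values, for illustration only.
default_cfg = {"way_num": 5, "shot_num": 1, "image_size": 84}
file_cfg = {"shot_num": 5}
console_cfg = {"way_num": 10}

final_cfg = merge_configs(default_cfg, file_cfg, console_cfg)
print(final_cfg)
```

Keys that appear only in an earlier layer (here image_size) survive unchanged, which is why a method's own file only needs to list what differs from the defaults.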
Although default.yaml already sets most of the basic configurations, a program cannot be run with default.yaml alone. Before running the code, users must define a configuration file for one of the methods implemented in LibFewShot in the
FSL methods usually have some basic parameters, such as way, shot or device id, that often need to be changed, so LibFewShot also supports changing some simple configurations on the command line without modifying the yaml file. Similarly, because many training and test parameters are the same across different methods, these shared parameters are grouped together and placed in config/headers for brevity. The yaml files of custom methods can then be written succinctly by importing them.
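As a rough sketch of how command-line changes can override values loaded from a yaml file, the example below uses argparse with None defaults so that only the options actually supplied replace existing values. The flag names --way_num, --shot_num and --device_id are hypothetical; the actual options accepted by LibFewShot may differ.

```python
import argparse


def parse_overrides(argv):
    """Return only the options that were explicitly given on the command line."""
    parser = argparse.ArgumentParser()
    # Hypothetical flags; a default of None marks "not supplied".
    parser.add_argument("--way_num", type=int, default=None)
    parser.add_argument("--shot_num", type=int, default=None)
    parser.add_argument("--device_id", type=str, default=None)
    args = parser.parse_args(argv)
    return {key: value for key, value in vars(args).items() if value is not None}


# Values as they might look after reading the yaml files (illustrative only).
cfg = {"way_num": 5, "shot_num": 5}

overrides = parse_overrides(["--way_num", "10"])
cfg.update(overrides)
print(cfg)
```

Leaving unsupplied options as None avoids accidentally overwriting a yaml value with an argparse default.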
The following is the composition of the files in the
data.yaml: The relevant configuration of the data is defined in this file.
device.yaml: The relevant configuration of GPU is defined in this file.
losses.yaml: The relevant configuration of the loss used for training is defined in this file.
misc.yaml: The miscellaneous configuration is defined in this file.
model.yaml: The relevant configuration of the model is defined in this file.
optimizer.yaml: The relevant configuration of the optimizer used for training is defined in this file.
The settings of the configuration file in LibFewShot
The following describes each part of the configuration file and explains how to write it. An example of how the DN4 method is configured is also presented.
The settings for data
data_root: The storage path of the dataset.
image_size: The size of the input image.
use_memory: Whether to use memory to accelerate reading.
augment: Whether to use data augmentation.
augment_times: The number of data augmentations/transformations applied to the support set. This expands the support set data multiple times.
augment_times_query: The number of data augmentations/transformations applied to the query set. This expands the query set data multiple times.
data_root: /data/miniImageNet--ravi
image_size: 84
use_memory: False
augment: True
augment_times: 1
augment_times_query: 1
The settings for model
backbone: The backbone information used in the method.
name: The name of the backbone; it must match the case of the backbone implemented in LibFewShot.
kwargs: The parameters passed to the backbone; their names must match the parameter names in the code.
is_flatten: Defaults to False; if True, the flattened feature vector is returned.
avg_pool: Defaults to False; if True, the feature vector after global average pooling is returned.
is_feature: The default is False, and if
True, the output of each
backbone:
  name: Conv64FLeakyReLU
  kwargs:
    is_flatten: False
classifier: The classifier information used in the method.
name: The name of the classifier; it must match the case of the classifier implemented in LibFewShot.
kwargs: The parameters used to initialize the classifier; their names must match the parameter names in the code.
classifier:
  name: DN4
  kwargs:
    n_k: 3
The settings for training
epoch: The number of
test_epoch: The number of
pretrain_path: The path of the pre-trained weights. This setting is checked first at the beginning of training. If it is not empty, the pre-trained weights at the target path are loaded into the backbone of the current training run.
resume: If set to True, the training state is read from the default location so that an interrupted training run can continue.
way_num: The number of
shot_num: The number of
query_num: The number of
test_way: The number of ways during testing. If not specified, the value of way_num is used.
test_shot: The number of shots during testing. If not specified, the value of shot_num is used.
test_query: The number of queries during testing. If not specified, the value of query_num is used.
episode_size: The number of tasks/episodes used for network training at a time.
batch_size: The batch size used during pre-training. In some kinds of methods, this setting has no effect.
train_episode: The number of tasks per
test_episode: The number of tasks per
epoch: 50
test_epoch: 5
pretrain_path: ~
resume: False
way_num: 5
shot_num: 5
query_num: 15
test_way: ~
test_shot: ~
test_query: ~
episode_size: 1
# batch_size only works in pre-train
batch_size: 128
train_episode: 10000
test_episode: 1000
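The fallback for test_way, test_shot and test_query can be sketched as follows. This is a minimal illustration rather than LibFewShot's actual code: it assumes the configuration has been loaded into a dict, where a yaml value of ~ becomes None, and the helper name resolve_test_settings is hypothetical.

```python
def resolve_test_settings(cfg):
    """Fill unspecified test-time settings from their training counterparts."""
    resolved = dict(cfg)
    pairs = (
        ("test_way", "way_num"),
        ("test_shot", "shot_num"),
        ("test_query", "query_num"),
    )
    for test_key, train_key in pairs:
        # A missing key or a None value (yaml "~") means "not specified".
        if resolved.get(test_key) is None:
            resolved[test_key] = resolved[train_key]
    return resolved


cfg = {
    "way_num": 5,
    "shot_num": 5,
    "query_num": 15,
    "test_way": None,    # falls back to way_num
    "test_shot": 1,      # explicitly set, kept as is
    "test_query": None,  # falls back to query_num
}
resolved = resolve_test_settings(cfg)
print(resolved["test_way"], resolved["test_shot"], resolved["test_query"])
```

Checking for None rather than general falsiness keeps an explicitly configured value from being replaced by accident.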
The settings for optimizer
optimizer: Optimizer information used during training.
name: The name of the Optimizer, only temporarily supports all Optimizers provided by
kwargs: The parameters passed to the optimizer; their names must match the parameter names required by the PyTorch optimizer.
other: Currently, the framework only supports specifying a separate learning rate for each part of a method; each name must match the variable name used in the method.
#For demonstration purposes, there are no additional training parameters for dn4.
lr_scheduler: The learning rate adjustment strategy used during training, only temporarily supports all the learning rate adjustment strategies provided by
name: The name of the learning rate adjustment strategy.
kwargs: Other parameters used in the learning rate adjustment strategy in
The settings for Hardware
gpu number, which is the same as the
n_gpu: The number of GPUs used in parallel during training; if set to 1, parallel training is not used.
deterministic: Whether to turn on torch.backends.cudnn.deterministic and whether to fix the random seeds during training.
seed: Seed points used in
The settings for Miscellaneous
log_name: If empty, the auto-generated directory name classifier.name-data_root-backbone-way_num-shot_num is used.
log_level: The log output level during training.
log_interval: The number of tasks for the log output interval.
result_root: The root of the result.
save_interval: The epoch interval to save weights.
save_part: The name of the variable in the method that needs to be saved. Variables with these names are saved separately when the model is saved. The parts that need to be saved are given as a list under
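The auto-generated log_name pattern described above, classifier.name-data_root-backbone-way_num-shot_num, could be assembled along the following lines. This is a sketch under stated assumptions: the helper default_log_name is hypothetical, and using only the last path component of data_root is an assumption made here so that the result is a valid directory name.

```python
import os


def default_log_name(cfg):
    """Build a directory name from the pattern
    classifier.name-data_root-backbone-way_num-shot_num."""
    # Assumption: only the final component of data_root is used.
    dataset = os.path.basename(cfg["data_root"].rstrip("/"))
    parts = [
        cfg["classifier"]["name"],
        dataset,
        cfg["backbone"]["name"],
        str(cfg["way_num"]),
        str(cfg["shot_num"]),
    ]
    return "-".join(parts)


# Values taken from the DN4 examples in this section.
cfg = {
    "classifier": {"name": "DN4"},
    "backbone": {"name": "Conv64FLeakyReLU"},
    "data_root": "/data/miniImageNet--ravi",
    "way_num": 5,
    "shot_num": 5,
}
log_name = default_log_name(cfg)
print(log_name)  # DN4-miniImageNet--ravi-Conv64FLeakyReLU-5-5
```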