Write a `.yaml` configuration file

Code for this section:
- `core/config/config.py`
- `config/*`
Composition of the configuration file in LibFewShot
The configuration file of LibFewShot uses the YAML format, and LibFewShot also supports overriding configuration values from the command line. We have pre-defined a default configuration, `core/config/default.yaml`. Users can put their custom configuration into the `config/` directory and save the file in YAML format. At parsing time, the order in which a method's configuration is resolved is `default.yaml` -> `config/` -> console. A later definition overrides the same value in an earlier definition.
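For example, suppose all three sources set `way_num`. A hypothetical resolution would look like this (the file excerpts and the console invocation are illustrative only; see `core/config/config.py` for the exact command-line options):

```yaml
# core/config/default.yaml (excerpt): the pre-defined default
way_num: 5

# config/dn4.yaml (excerpt): overrides default.yaml at parse time
way_num: 10

# Console: overrides both of the above, e.g. (flag name is illustrative)
#   python run_trainer.py --way_num 10
```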
Although most of the basic configurations have been set in `default.yaml`, you cannot run a program directly using only `default.yaml`. Before running the code, users are required to define, in the `config/` directory, a configuration file for one of the methods implemented in LibFewShot.
Considering that FSL methods usually have some basic parameters, such as `way`, `shot`, or the device ID, which often need to be changed, LibFewShot also supports changing some simple configurations on the command line without modifying the YAML file. Similarly, because many parameters are the same across different methods during training and testing, we wrap these shared parameters together and put them into `config/headers` for brevity. In this way, the YAML files of custom methods can be written succinctly by importing them, as illustrated after the following list.

The following is the composition of the files in the `config/headers` directory.
- `data.yaml`: The configuration related to the data is defined in this file.
- `device.yaml`: The configuration related to the GPU is defined in this file.
- `losses.yaml`: The configuration of the loss used for training is defined in this file.
- `misc.yaml`: The miscellaneous configuration is defined in this file.
- `model.yaml`: The configuration related to the model is defined in this file.
- `optimizer.yaml`: The configuration of the optimizer used for training is defined in this file.
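For instance, a custom method configuration can pull in these shared headers and keep only method-specific settings. A minimal sketch, assuming the `includes` mechanism handled by `core/config/config.py` (the exact list of imported files may vary by method):

```yaml
includes:
  - headers/data.yaml
  - headers/device.yaml
  - headers/losses.yaml
  - headers/misc.yaml
  - headers/model.yaml
  - headers/optimizer.yaml
```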
The settings of the configuration file in LibFewShot
The following details each part of the configuration file and explains how to write it. An example of how the DN4 method is configured is also presented.
The settings for data
- `data_root`: The storage path of the dataset.
- `image_size`: The size of the input image.
- `use_memory`: Whether to use memory to accelerate reading.
- `augment`: Whether to use data augmentation.
- `augment_times`: The number of data augmentations/transformations applied to the `support set`, expanding the support set data multiple times.
- `augment_times_query`: The number of data augmentations/transformations applied to the `query set`, expanding the query set data multiple times.

```yaml
data_root: /data/miniImageNet--ravi
image_size: 84
use_memory: False
augment: True
augment_times: 1
augment_times_query: 1
```
The settings for model
- `backbone`: The `backbone` information used in the method.
  - `name`: The name of the `backbone`; it needs to match the case of the `backbone` implemented in LibFewShot.
  - `kwargs`: The parameters used by the `backbone`; the names must be kept consistent with those in the code.
    - `is_flatten`: The default is False; if True, the flattened feature vector is returned.
    - `avg_pool`: The default is False; if True, the feature vector after `global average pooling` is returned.
    - `is_feature`: The default is False; if True, the output of each `block` in the `backbone` is returned.

```yaml
backbone:
  name: Conv64FLeakyReLU
  kwargs:
    is_flatten: False
```
- `classifier`: The `classifier` information used in the method.
  - `name`: The name of the `classifier`; it needs to match the case of the `classifier` implemented in LibFewShot.
  - `kwargs`: The parameters used in the `classifier` initialization; the names must be kept consistent with those in the code.

```yaml
classifier:
  name: DN4
  kwargs:
    n_k: 3
```
The settings for training
- `epoch`: The number of epochs during training.
- `test_epoch`: The number of epochs during testing.
- `pretrain_path`: The path of the pre-trained weights. At the beginning of training, this setting is checked first. If it is not empty, the pre-trained weights at the target path are loaded into the `backbone` of the current training.
- `resume`: If set to True, the training state is read from the default address to support resuming training.
- `way_num`: The number of `way`s during training.
- `shot_num`: The number of `shot`s during training.
- `query_num`: The number of `query` samples during training.
- `test_way`: The number of `way`s during testing. If not specified, `way_num` is assigned to `test_way`.
- `test_shot`: The number of `shot`s during testing. If not specified, `shot_num` is assigned to `test_shot`.
- `test_query`: The number of `query` samples during testing. If not specified, `query_num` is assigned to `test_query`.
- `episode_size`: The number of tasks/episodes used for each network update during training.
- `batch_size`: The `batch size` used when the model is `pre-trained`. In some kinds of methods, this property is unused.
- `train_episode`: The number of tasks per epoch during training.
- `test_episode`: The number of tasks per epoch during testing.

```yaml
epoch: 50
test_epoch: 5

pretrain_path: ~
resume: False

way_num: 5
shot_num: 5
query_num: 15
test_way: ~
test_shot: ~
test_query: ~
episode_size: 1
# batch_size only works in pre-train
batch_size: 128
train_episode: 10000
test_episode: 1000
```
The settings for optimizer
- `optimizer`: Optimizer information used during training.
  - `name`: The name of the optimizer. For now, only the optimizers provided by `PyTorch` are supported.
  - `kwargs`: The parameters passed to the optimizer; the names need to be the same as the parameter names required by the PyTorch optimizer.
  - `other`: Currently, the framework only supports specifying a separate learning rate for each part of the method; the names need to be the same as the variable names used in the method.

```yaml
optimizer:
  name: Adam
  kwargs:
    lr: 0.01
  other:
    emb_func: 0.01
    # For demonstration purposes only; DN4 has no additional training parameters.
    dn4_layer: 0.001
```
- `lr_scheduler`: The learning rate adjustment strategy used during training. For now, only the strategies provided by `PyTorch` are supported.
  - `name`: The name of the learning rate adjustment strategy.
  - `kwargs`: Other parameters of the learning rate adjustment strategy in `PyTorch`.
```yaml
lr_scheduler:
  name: StepLR
  kwargs:
    gamma: 0.5
    step_size: 10
```
The settings for Hardware
- `device_ids`: The GPU IDs, the same as reported by the `nvidia-smi` command.
- `n_gpu`: The number of GPUs used for parallel training; if `1`, parallel training is not applied.
- `deterministic`: Whether to turn on `torch.backends.cudnn.benchmark` and `torch.backends.cudnn.deterministic`, and whether to fix the random seeds during training.
- `seed`: The seed used in `numpy`, `torch`, and `cuda`.
```yaml
device_ids: 0,1,2,3,4,5,6,7
n_gpu: 4
seed: 0
deterministic: False
```
The settings for Miscellaneous
- `log_name`: If empty, the auto-generated `classifier.name-data_root-backbone-way_num-shot_num` directory name is used.
- `log_level`: The log output level during training.
- `log_interval`: The interval (in number of tasks) between log outputs.
- `result_root`: The root directory where results are saved.
- `save_interval`: The epoch interval at which weights are saved.
- `save_part`: The names of the variables in the method that need to be saved. Variables with these names are saved separately when the model is saved. The parts that need to be saved are given as a list under `save_part`.
```yaml
log_name: ~
log_level: info
log_interval: 100
result_root: ./results
save_interval: 10
save_part:
  - emb_func
  - dn4_layer
```
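Putting the pieces together, a succinct configuration for DN4 in the `config/` directory could combine the shared headers with the method-specific settings shown above. This is a sketch assembled from the snippets in this section, not a verbatim copy of the bundled file:

```yaml
includes:
  - headers/data.yaml
  - headers/device.yaml
  - headers/misc.yaml
  - headers/model.yaml
  - headers/optimizer.yaml

way_num: 5
shot_num: 5
query_num: 15

backbone:
  name: Conv64FLeakyReLU
  kwargs:
    is_flatten: False

classifier:
  name: DN4
  kwargs:
    n_k: 3
```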