Protocol Documentation
deepomatic/oef/protos/dataoperation.proto
DataOperation
An operation that can be applied to the whole dataset before the training.
Field |
Type |
Label |
Description |
loss_based_balancing |
LossBasedBalancing |
optional |
The loss based balancing operation. |
LossBasedBalancing
A class-balancing operation that duplicates elements of under-represented
classes.
Field |
Type |
Label |
Description |
batch_size |
float |
optional |
Number of datapoints considered before checking for entropy stability. If lower than or equal to 1, this is the fraction of the dataset to consider. Default: 0.01 |
epsilon |
float |
optional |
The value below which entropy is considered stable. Default: 0.0001 |
max_dataset_expansion |
float |
optional |
Multiple of the dataset size allowed. Default: 2 |
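The dual meaning of `batch_size` (absolute count above 1, fraction of the dataset otherwise) can be sketched as follows. This is only an illustration of the rule stated in the table, not the library's implementation: the helper name `resolve_balancing_batch_size`, the rounding up, and the floor of one datapoint are all assumptions.

```python
import math


def resolve_balancing_batch_size(batch_size: float, dataset_size: int) -> int:
    """Turn LossBasedBalancing.batch_size into an absolute number of datapoints.

    Hypothetical helper: the name and rounding behaviour are assumptions.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    if batch_size <= 1:
        # Fractional form: a share of the dataset (rounding up is an assumption).
        return max(1, math.ceil(batch_size * dataset_size))
    # Absolute form: already a number of datapoints.
    return int(batch_size)
```

Note that a value of exactly 1 falls in the fractional branch, so it selects the whole dataset rather than a single datapoint.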
deepomatic/oef/protos/dataset.proto
Dataset related messages.
Dataset
A dataset pointer: you give the path to the root of all data sources and the path to an object of type DatasetDump.
Field |
Type |
Label |
Description |
root |
string |
optional |
Root of all the data sources. May be a Google Storage path. |
config_path |
string |
required |
Path of the dataset config file. |
operations |
deepomatic.oef.dataoperation.DataOperation |
repeated |
A set of optional operations to be applied to the dataset (filter, balancing, …). |
margin_crop |
float |
optional |
Margin added around images when they are cropped, expressed as a fraction of the minimum dimension, in [0.0, 1.0]. Default: 0 |
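One possible reading of `margin_crop` is sketched below. The source does not say whether the "minimum dimension" is that of the cropped box or of the whole image; this sketch assumes the box, works in pixel coordinates, and clamps to the image borders. The function name `crop_with_margin` is hypothetical.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax) in pixels


def crop_with_margin(box: Box, image_width: float, image_height: float,
                     margin_crop: float = 0.0) -> Box:
    """Expand a crop box by margin_crop times its smallest side.

    Assumption: the margin is relative to the box, not to the image.
    """
    if not 0.0 <= margin_crop <= 1.0:
        raise ValueError("margin_crop must be in [0.0, 1.0]")
    xmin, ymin, xmax, ymax = box
    margin = margin_crop * min(xmax - xmin, ymax - ymin)
    # Clamp so the expanded box never leaves the image.
    return (max(0.0, xmin - margin), max(0.0, ymin - margin),
            min(image_width, xmax + margin), min(image_height, ymax + margin))
```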
deepomatic/oef/protos/dataset_dump_v1.proto
Dataset related messages.
DEPRECATED! Please use dataset_dump_v2
Annotation
Stores the value of an annotation.
Field |
Type |
Label |
Description |
name |
string |
|
Refers to the name of the annotation specification. |
bool_value |
bool |
|
A boolean value. |
string_value |
string |
|
A string value. |
int_value |
int32 |
|
An integer value. |
float_value |
float |
|
A float value. |
distribution_value |
LabelDistribution |
|
A list of probabilities per label. |
AnnotationSpecification
Specification for a concept.
DataPoint
A data-point is a data source augmented with its annotations.
DatasetDump
A dataset
DatasetDump.DataPointsEntry
ImageBBox
A bounding box region.
Field |
Type |
Label |
Description |
xmin |
float |
|
Left x-coordinate. |
ymin |
float |
|
Top y-coordinate. |
xmax |
float |
|
Right x-coordinate. |
ymax |
float |
|
Bottom y-coordinate. |
A data-source that would be available at inference time.
Field |
Type |
Label |
Description |
name |
string |
|
The data source name (must match one of the data source names declared in the specifications). |
value |
string |
|
Unused for now, it should rather be of type Annotation. |
value_from |
string |
|
Use this to load the data source from a file. The path should be relative to the dataset root. |
regions |
Region |
repeated |
List of data regions. |
InputDataSpecification
Specification for a data source (e.g. an image).
LabelDistribution
A probability distribution over labels.
LabelDistributionEntry
An entry of a probabilistic distribution.
Field |
Type |
Label |
Description |
name |
string |
|
The label name. |
index |
int32 |
|
The label index. It does not have to be 0-based, it might be anything that suits your needs. |
proba |
float |
|
Probability of this label. |
NoRegion
A dummy region that refers to the whole data source.
Region
A data-source region (including the dummy NoRegion).
Field |
Type |
Label |
Description |
type |
Region.RegionType |
|
Type of region (depends on the data source and the Model.model_type) (TODO: should be a one-of) |
annotations |
Annotation |
repeated |
Stores all the annotations of the region. |
none |
NoRegion |
|
A dummy region that actually refers to the whole data source. |
image_bbox |
ImageBBox |
|
A 2D bounding box. |
id |
string |
|
The id of that Region |
parent_id |
string |
|
Refers to the id of a parent Region |
Specification
Specification for a data point.
Field |
Type |
Label |
Description |
input_data |
InputDataSpecification |
repeated |
List of all the possible data sources for a single data point. |
annotations |
AnnotationSpecification |
repeated |
List of all the possible concepts for a single data point. |
Split
Stores a dataset split.
Field |
Type |
Label |
Description |
ids |
SplitByIds |
|
Use this to define a split by IDs. |
@exclude Unused for now - should be used in conjunction with reproducible below. float pct = 2; |
SplitByIds
Defines a dataset split with the list of data point IDs.
Field |
Type |
Label |
Description |
ids |
string |
repeated |
List of IDs that belong to this split. |
Splits
Stores all the dataset splits.
@exclude Unused for now - should be used in conjunction with pct above. bool reproducible = 2; |
Splits.SplitMapEntry
TagAnnotationSpecification
Specifies a tag ID and name for a LABEL related annotation.
Field |
Type |
Label |
Description |
id |
int32 |
|
Unique deepomatic studio ID. |
tag_name |
string |
|
Tag name. |
Name |
Number |
Description |
IMAGE |
0 |
|
Region.RegionType
Type of region (depends on the data source and the Model.model_type).
Name |
Number |
Description |
NONE |
0 |
|
IMAGE_BBOX |
1 |
|
deepomatic/oef/protos/dataset_dump_v2.proto
Dataset related messages.
Annotation
Stores the annotation.
Field |
Type |
Label |
Description |
view_id |
string |
|
view ID to which this annotation belongs |
concepts |
Concepts |
|
Concept annotation |
text |
string |
|
Text annotation |
mask |
Mask |
|
Mask annotation |
keypoints |
Keypoints |
|
Keypoints annotation |
ConceptAnnotation
Specifies list of concepts.
ConceptAnnotation.Concept
Specifies a tag ID and name for a LABEL related annotation.
Field |
Type |
Label |
Description |
id |
string |
|
Unique deepomatic studio ID (needed for studio and vulcan). |
name |
string |
|
Concept name. |
Concepts
A concept annotation.
Field |
Type |
Label |
Description |
concept_ids |
string |
repeated |
list of concept IDs. Should be of length 1 for single-label annotations |
DataPoint
A data-point is a data source augmented with its region annotations.
Field |
Type |
Label |
Description |
id |
string |
|
data point ID |
url |
string |
|
URL to a file. The path should be relative to the dataset root. |
metadata |
google.protobuf.Struct |
|
User-defined metadata. |
regions |
Region |
repeated |
List of data regions. |
splits |
string |
repeated |
Splits |
DatasetHeader
A dataset header. Contains a list of views and splits.
Field |
Type |
Label |
Description |
views |
View |
repeated |
The dataset views. |
splits |
string |
repeated |
The dataset splits. |
name |
string |
|
Name of the dataset |
KeypointAnnotation
Specifies a list of keypoint nodes and skeleton (node: [edges] for each node).
KeypointAnnotation.Keypoint
Keypoint definition
Field |
Type |
Label |
Description |
id |
string |
|
node id |
name |
string |
|
node name |
edges |
string |
repeated |
node edges |
Keypoints
A keypoint annotation.
Keypoints.Keypoint
A keypoint entry.
Field |
Type |
Label |
Description |
concept_id |
string |
|
Concept ID |
x |
float |
|
x-position |
y |
float |
|
y-position |
is_visible |
bool |
|
Is the keypoint visible. The coordinates can be non-zero (labeled) but the point is not visible. |
Mask
A mask annotation.
Field |
Type |
Label |
Description |
concept_id |
string |
|
Concept ID |
polygons |
Mask.Polygons |
|
polygons |
is_thing |
bool |
|
Is it a thing or stuff? |
Mask.Polygons
A polygon annotation
Field |
Type |
Label |
Description |
polygons |
Mask.Polygons.Polygon |
repeated |
A list of polygons. An object may be partially occluded and need more than one polygon to be described. |
Mask.Polygons.Polygon
Single Polygon with a list of vertices
Mask.Polygons.Polygon.Vertex
Vertex - a point of the polygon contour
Field |
Type |
Label |
Description |
x |
float |
|
x position |
y |
float |
|
y position |
Region
A data-source region (including the dummy WholeData).
Field |
Type |
Label |
Description |
annotations |
Annotation |
repeated |
Stores all the annotations of the region. |
whole_data |
Region.WholeData |
|
A dummy region that actually refers to the whole data source. |
image_bbox |
Region.ImageBBox |
|
A 2D bounding box. |
id |
string |
|
The id of that Region |
parent_id |
string |
|
Refers to the id of a parent Region |
Region.ImageBBox
A bounding box region.
Field |
Type |
Label |
Description |
xmin |
float |
|
Left x-coordinate. |
ymin |
float |
|
Top y-coordinate. |
xmax |
float |
|
Right x-coordinate. |
ymax |
float |
|
Bottom y-coordinate. |
Region.WholeData
A dummy region that refers to the whole data source.
TextAnnotation
Specifies a list of allowed characters and if word recognition is wanted.
Field |
Type |
Label |
Description |
allowed_characters |
string |
repeated |
allowed characters. |
use_word_recognition |
bool |
|
use auto-regression (for word recognition) |
View
Field |
Type |
Label |
Description |
id |
string |
|
Unique deepomatic studio ID. It should match experiment/view_ids. |
parent_id |
string |
|
Unique deepomatic studio ID of the parent view. |
concepts |
ConceptAnnotation |
|
Annotation of type LABEL. |
text |
TextAnnotation |
|
Annotation of type TEXT |
keypoints |
KeypointAnnotation |
|
Annotation of type KEYPOINTS |
type |
View.ViewType |
|
View type |
conditions |
View.Condition |
repeated |
List of conditions, as a list of lists: [] means no condition; [[]] means without concept; [[a], [b]] means a OR b; [[a, b]] means a AND b. |
name |
string |
|
Name of view |
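The `conditions` field of a view reads as a disjunction of conjunctions: the outer list is an OR over clauses, and each inner list is an AND over concept names. A minimal sketch of that evaluation follows. The function name `matches_conditions` is hypothetical, and the handling of an empty clause (`[[]]`, "without concept") as "matches only when no concept is present" is an assumption, since the source does not spell it out.

```python
from typing import Iterable, List, Set


def matches_conditions(conditions: List[List[str]], concepts: Iterable[str]) -> bool:
    """Evaluate View.conditions against the concepts present on a region."""
    present: Set[str] = set(concepts)
    if not conditions:
        return True  # [] => no condition
    for clause in conditions:
        if not clause:
            # [[]] => "without concept"; read here as "no concept present" (assumption).
            if not present:
                return True
        elif all(concept in present for concept in clause):
            return True  # every concept of one clause is present (AND)
    return False  # no clause matched (OR over clauses)
```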
View.Condition
FIXME: This might handle other types as strings
Field |
Type |
Label |
Description |
condition |
string |
repeated |
View condition |
View.ViewType
Specifies the type of the view
Name |
Number |
Description |
undefined |
0 |
Default value and unsupported: ViewType has to be set |
classification |
1 |
Multi-class classification view |
tagging |
2 |
Multi-label classification view |
detection |
3 |
Detection view |
ocr |
4 |
OCR view |
segmentation |
5 |
Segmentation view |
keypoint |
6 |
Keypoint view |
deepomatic/oef/protos/experiment.proto
Experiment related messages.
Experiment
This is the main message to define an experiment.
Field |
Type |
Label |
Description |
seed |
int32 |
optional |
Seed used to make the experiment reproducible. Default: 0 |
dataset |
deepomatic.oef.dataset.Dataset |
required |
The dataset |
trainer |
deepomatic.oef.trainer.Trainer |
required |
The model |
hyperparameters |
Experiment.HyperparametersEntry |
repeated |
The hyper-parameters: maps experiment parameters to hyper-parameter sampling distributions. Each key has to start with trainer., e.g. trainer.initial_learning_rate |
max_hp_runs |
int32 |
optional |
Maximum number of hyper-parameter runs (default = 1 for standard training) Default: 1 |
view_ids |
string |
repeated |
List of views on which to train. It should match DatasetHeader/views ids. If this parameter is empty, the default view id used will be the first of DatasetHeader/views. |
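The key convention for `hyperparameters` (every key starts with `trainer.` and names a field of the trainer) can be illustrated with plain dictionaries standing in for the Trainer message. This is a sketch under stated assumptions: the function name `apply_sampled_hyperparameters` is hypothetical, and support for nested dotted paths is an assumption rather than something the source confirms.

```python
import copy
from typing import Any, Dict

PREFIX = "trainer."


def apply_sampled_hyperparameters(trainer: Dict[str, Any],
                                  sampled: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of `trainer` with each sampled value written at its dotted path."""
    result = copy.deepcopy(trainer)
    for key, value in sampled.items():
        if not key.startswith(PREFIX):
            raise ValueError(f"hyper-parameter key must start with {PREFIX!r}: {key!r}")
        path = key[len(PREFIX):].split(".")
        if not all(path):
            raise ValueError(f"empty path component in key: {key!r}")
        node = result
        for part in path[:-1]:
            # Nested paths are an assumption; create intermediate dicts as needed.
            node = node.setdefault(part, {})
        node[path[-1]] = value
    return result
```

For example, applying `{"trainer.initial_learning_rate": 0.001}` overrides only that field and leaves the input dictionary untouched.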
Experiment.HyperparametersEntry
deepomatic/oef/protos/hyperparameter.proto
CategoricalDistribution
Field |
Type |
Label |
Description |
values |
Value |
repeated |
|
HyperParameter
Field |
Type |
Label |
Description |
min |
float |
required |
|
max |
float |
required |
|
NormalDistribution
Field |
Type |
Label |
Description |
mu |
float |
required |
|
sigma |
float |
required |
|
Field |
Type |
Label |
Description |
min |
float |
required |
|
max |
float |
required |
|
Value
Field |
Type |
Label |
Description |
integer_value |
int32 |
optional |
|
float_value |
float |
optional |
|
boolean_value |
bool |
optional |
|
string_value |
string |
optional |
|
File-level Extensions
| Extension | Type | Base | Number | Description |
| --- | --- | --- | --- | --- |
| field_option | HyperParameter | .google.protobuf.FieldOptions | 1000 | |
| oneof_option | HyperParameter | .google.protobuf.OneofOptions | 1000 | |
deepomatic/oef/protos/losses.proto
BootstrappedSigmoidClassificationLoss
Classification loss using a sigmoid function over the class prediction with
the highest prediction score.
Field |
Type |
Label |
Description |
alpha |
float |
optional |
Interpolation weight between 0 and 1. |
hard_bootstrap |
bool |
optional |
Whether hard bootstrapping should be used or not. If true, will only use the one class favored by the model. Otherwise, will use all predicted class probabilities. Default: false |
anchorwise_output |
bool |
optional |
DEPRECATED, do not use. Output loss per anchor. Default: false |
ClassificationLoss
Configuration for class prediction loss function.
HardExampleMiner
Configuration for hard example miner.
Field |
Type |
Label |
Description |
num_hard_examples |
int32 |
optional |
Maximum number of hard examples to be selected per image (prior to enforcing max negative to positive ratio constraint). If set to 0, all examples obtained after NMS are considered. Default: 64 |
iou_threshold |
float |
optional |
Minimum intersection over union for an example to be discarded during NMS. Default: 0.7 |
loss_type |
HardExampleMiner.LossType |
optional |
Default: BOTH |
max_negatives_per_positive |
int32 |
optional |
Maximum number of negatives to retain for each positive anchor. If max_negatives_per_positive is 0, no prespecified negative:positive ratio is enforced. Default: 0 |
min_negatives_per_image |
int32 |
optional |
Minimum number of negative anchors to sample for a given image. Setting this to a positive number samples negatives in an image without any positive anchors and thus does not bias the model towards having at least one detection per image. Default: 0 |
LocalizationLoss
Configuration for bounding box localization loss function.
Loss
Message for configuring the localization loss, classification loss and hard
example miner used for training object detection models. See core/losses.py
for details
Field |
Type |
Label |
Description |
localization_loss |
LocalizationLoss |
optional |
Localization loss to use. |
classification_loss |
ClassificationLoss |
optional |
Classification loss to use. |
hard_example_miner |
HardExampleMiner |
optional |
If not left to default, applies hard example mining. |
classification_weight |
float |
optional |
Classification loss weight. Default: 1 |
localization_weight |
float |
optional |
Localization loss weight. Default: 1 |
random_example_sampler |
RandomExampleSampler |
optional |
If not left to default, applies random example sampling. |
equalization_loss |
Loss.EqualizationLoss |
optional |
|
expected_loss_weights |
Loss.ExpectedLossWeights |
optional |
Method to compute expected loss weights with respect to balanced positive/negative sampling scheme. If NONE, use explicit sampling. TODO(birdbrain): Move under ExpectedLossWeights. Default: NONE |
min_num_negative_samples |
float |
optional |
Minimum number of effective negative samples. Only applies if expected_loss_weights is not NONE. TODO(birdbrain): Move under ExpectedLossWeights. Default: 0 |
desired_negative_sampling_ratio |
float |
optional |
Desired number of effective negative samples per positive sample. Only applies if expected_loss_weights is not NONE. TODO(birdbrain): Move under ExpectedLossWeights. Default: 3 |
Loss.EqualizationLoss
Equalization loss.
Field |
Type |
Label |
Description |
weight |
float |
optional |
Weight equalization loss strength. Default: 0 |
exclude_prefixes |
string |
repeated |
When computing equalization loss, ops whose names start with one of exclude_prefixes are ignored. Only used when weight > 0. |
RandomExampleSampler
Configuration for random example sampler.
Field |
Type |
Label |
Description |
positive_sample_fraction |
float |
optional |
The desired fraction of positive samples in batch when applying random example sampling. Default: 0.01 |
SigmoidFocalClassificationLoss
Sigmoid Focal cross entropy loss as described in
https://arxiv.org/abs/1708.02002
Field |
Type |
Label |
Description |
anchorwise_output |
bool |
optional |
DEPRECATED, do not use. Default: false |
gamma |
float |
optional |
modulating factor for the loss. Default: 2 |
alpha |
float |
optional |
alpha weighting factor for the loss. |
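For reference, the roles of `gamma` and `alpha` can be seen in a scalar version of the focal loss from the paper linked above: the cross-entropy is multiplied by `(1 - p_t) ** gamma`, and optionally by a class weight derived from `alpha`. This is a sketch for a single binary prediction, not the trainer's implementation; the function name is hypothetical.

```python
import math
from typing import Optional


def sigmoid_focal_loss(logit: float, target: int, gamma: float = 2.0,
                       alpha: Optional[float] = None) -> float:
    """Focal loss for one binary prediction (target is 0 or 1)."""
    # Numerically stable sigmoid.
    if logit >= 0:
        p = 1.0 / (1.0 + math.exp(-logit))
    else:
        e = math.exp(logit)
        p = e / (1.0 + e)
    # Stable binary cross-entropy on logits: max(x, 0) - x * t + log(1 + exp(-|x|)).
    cross_entropy = max(logit, 0.0) - logit * target + math.log1p(math.exp(-abs(logit)))
    # p_t is the probability the model assigns to the true class.
    p_t = p if target == 1 else 1.0 - p
    loss = (1.0 - p_t) ** gamma * cross_entropy
    if alpha is not None:
        # alpha weights positives, (1 - alpha) weights negatives.
        loss *= alpha if target == 1 else 1.0 - alpha
    return loss
```

With `gamma = 0` and no `alpha`, the expression reduces to the plain sigmoid cross-entropy; larger `gamma` values shrink the contribution of well-classified examples.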
WeightedIOULocalizationLoss
Intersection over union location loss: 1 - IOU
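The `1 - IOU` expression can be made concrete for a pair of axis-aligned boxes. This is a minimal sketch assuming boxes given as `(xmin, ymin, xmax, ymax)` and a scalar per-box weight; the function names are hypothetical and the real loss operates on batches of anchors.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    intersection = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - intersection
    # Degenerate boxes have no area; treat their IOU as 0.
    return intersection / union if union > 0 else 0.0


def weighted_iou_loss(prediction: Box, target: Box, weight: float = 1.0) -> float:
    """Weighted IOU localization loss: weight * (1 - IOU)."""
    return weight * (1.0 - iou(prediction, target))
```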
WeightedL2LocalizationLoss
L2 location loss: 0.5 * ||weight * (a - b)|| ^ 2
Field |
Type |
Label |
Description |
anchorwise_output |
bool |
optional |
DEPRECATED, do not use. Output loss per anchor. Default: false |
WeightedSigmoidClassificationLoss
Classification loss using a sigmoid function over class predictions.
Field |
Type |
Label |
Description |
anchorwise_output |
bool |
optional |
DEPRECATED, do not use. Output loss per anchor. Default: false |
WeightedSmoothL1LocalizationLoss
SmoothL1 (Huber) location loss.
The smooth L1_loss is defined elementwise as .5 x^2 if |x| <= delta and
delta * (|x|-0.5*delta) otherwise, where x is the difference between
predictions and target.
Field |
Type |
Label |
Description |
anchorwise_output |
bool |
optional |
DEPRECATED, do not use. Output loss per anchor. Default: false |
delta |
float |
optional |
Delta value for huber loss. Default: 1 |
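The elementwise definition above translates directly into code. The sketch below handles a single scalar and omits the per-anchor weight; the function name is hypothetical.

```python
def smooth_l1(prediction: float, target: float, delta: float = 1.0) -> float:
    """Smooth L1 (Huber) loss for one scalar prediction."""
    x = abs(prediction - target)
    if x <= delta:
        # Quadratic region near zero.
        return 0.5 * x * x
    # Linear region for large errors.
    return delta * (x - 0.5 * delta)
```

The two branches agree at `|x| = delta`, so the loss is continuous; `delta` sets where it switches from quadratic to linear growth.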
WeightedSoftmaxClassificationAgainstLogitsLoss
Classification loss using a softmax function over class predictions and
a softmax function over the groundtruth labels (assumed to be logits).
Field |
Type |
Label |
Description |
anchorwise_output |
bool |
optional |
DEPRECATED, do not use. Default: false |
logit_scale |
float |
optional |
Scale and softmax groundtruth logits before calculating softmax classification loss. Typically used for softmax distillation with teacher annotations stored as logits. Default: 1 |
WeightedSoftmaxClassificationLoss
Classification loss using a softmax function over class predictions.
Field |
Type |
Label |
Description |
anchorwise_output |
bool |
optional |
DEPRECATED, do not use. Output loss per anchor. Default: false |
logit_scale |
float |
optional |
Scale logit (input) value before calculating softmax classification loss. Typically used for softmax distillation. Default: 1 |
HardExampleMiner.LossType
Whether to use classification losses (CLASSIFICATION), localization losses
(LOCALIZATION) or both losses (BOTH, the default). In the case of BOTH, cls_loss_weight and
loc_loss_weight are used to compute the weighted sum of the two losses.
Name |
Number |
Description |
BOTH |
0 |
|
CLASSIFICATION |
1 |
|
LOCALIZATION |
2 |
|
Loss.ExpectedLossWeights
Name |
Number |
Description |
NONE |
0 |
|
EXPECTED_SAMPLING |
1 |
Use expected_classification_loss_by_expected_sampling from third_party/tensorflow_models/object_detection/utils/ops.py |
REWEIGHTING_UNMATCHED_ANCHORS |
2 |
Use expected_classification_loss_by_reweighting_unmatched_anchors from third_party/tensorflow_models/object_detection/utils/ops.py |
deepomatic/oef/protos/optimizer.proto
AdamOptimizer
Configuration message for the AdamOptimizer
See: https://www.tensorflow.org/api_docs/python/tf/train/AdamOptimizer
Field |
Type |
Label |
Description |
beta_1 |
float |
optional |
Default: 0.9 |
beta_2 |
float |
optional |
Default: 0.999 |
epsilon |
float |
optional |
Default: 1e-08 |
ConstantLearningRate
Configuration message for a constant learning rate.
CosineDecayLearningRate
Configuration message for a cosine decaying learning rate as defined in
object_detection/utils/learning_schedules.py
Field |
Type |
Label |
Description |
total_steps_pct |
float |
optional |
Default: 1.07 |
warmup_learning_rate |
float |
optional |
Default: 0.0002 |
warmup_steps_pct |
float |
optional |
Default: 0.0025 |
hold_base_rate_steps_pct |
float |
optional |
Default: 0 |
ExponentialDecayLearningRate
Configuration message for an exponentially decaying learning rate.
See https://www.tensorflow.org/versions/master/api_docs/python/train/decaying_the_learning_rate#exponential_decay
Field |
Type |
Label |
Description |
decay_steps_pct |
float |
optional |
Default: 0.006 |
decay_factor |
float |
optional |
Default: 0.95 |
staircase |
bool |
optional |
Default: true |
burnin_learning_rate |
float |
optional |
Default: 0 |
burnin_steps_pct |
float |
optional |
Default: 0 |
min_learning_rate |
float |
optional |
Default: 0 |
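A rough sketch of how these fields could combine into a schedule is given below. Two assumptions are made that the source does not confirm: the `_pct` fields are read as fractions of the total number of training steps, and the burn-in phase is left out for brevity. The function name and the `total_steps` and `initial_learning_rate` arguments are hypothetical.

```python
import math


def exponential_decay_lr(step: int, total_steps: int, initial_learning_rate: float,
                         decay_steps_pct: float = 0.006, decay_factor: float = 0.95,
                         staircase: bool = True, min_learning_rate: float = 0.0) -> float:
    """Exponentially decayed learning rate at a given step (burn-in omitted)."""
    # Assumption: decay_steps_pct is a fraction of the total training steps.
    decay_steps = max(1, round(decay_steps_pct * total_steps))
    exponent = step / decay_steps
    if staircase:
        # Decay in discrete jumps rather than continuously.
        exponent = math.floor(exponent)
    return max(min_learning_rate, initial_learning_rate * decay_factor ** exponent)
```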
LearningRatePolicy
Configuration message for optimizer learning rate.
ManualStepLearningRate
Configuration message for a manually defined learning rate schedule.
ManualStepLearningRate.LearningRateSchedule
Field |
Type |
Label |
Description |
step_pct |
float |
optional |
|
learning_rate_factor |
float |
optional |
Default: 0.1 |
MomentumOptimizer
Configuration message for the MomentumOptimizer
See: https://www.tensorflow.org/api_docs/python/tf/train/MomentumOptimizer
Field |
Type |
Label |
Description |
momentum_optimizer_value |
float |
optional |
Default: 0.9 |
NadamOptimizer
Configuration message for the NadamOptimizer
See: https://www.tensorflow.org/versions/r2.3/api_docs/python/tf/keras/optimizers/Nadam
Field |
Type |
Label |
Description |
beta_1 |
float |
optional |
Default: 0.9 |
beta_2 |
float |
optional |
Default: 0.999 |
epsilon |
float |
optional |
Default: 1e-07 |
OnceCycleLearningRate
Field |
Type |
Label |
Description |
cycle_steps |
float |
required |
Default: 1 |
min_max_lr_ratio |
float |
optional |
Default: 10 |
Optimizer
Top level optimizer message.
RMSPropOptimizer
Configuration message for the RMSPropOptimizer
See: https://www.tensorflow.org/api_docs/python/tf/train/RMSPropOptimizer
Field |
Type |
Label |
Description |
momentum_optimizer_value |
float |
optional |
Default: 0.9 |
decay |
float |
optional |
Default: 0.9 |
epsilon |
float |
optional |
Default: 1e-07 |
RectifiedAdamOptimizer
Configuration message for the RAdamOptimizer
https://www.tensorflow.org/addons/api_docs/python/tfa/optimizers/RectifiedAdam
Field |
Type |
Label |
Description |
beta_1 |
float |
optional |
Default: 0.9 |
beta_2 |
float |
optional |
Default: 0.999 |
epsilon |
float |
optional |
Default: 1e-07 |
TriangularCyclicalLearningRatePatched
Field |
Type |
Label |
Description |
cycle_steps |
float |
required |
Default: 1 |
min_max_lr_ratio |
float |
optional |
Default: 10 |
YogiOptimizer
Configuration message for Yogi optimizer
https://www.tensorflow.org/addons/api_docs/python/tfa/optimizers/Yogi
Field |
Type |
Label |
Description |
beta_1 |
float |
optional |
Default: 0.9 |
beta_2 |
float |
optional |
Default: 0.999 |
epsilon |
float |
optional |
Default: 0.001 |
deepomatic/oef/protos/trainer.proto
Experiment related messages.
Quantization
Quantizes the model weights to int8.
Field |
Type |
Label |
Description |
delay |
int32 |
optional |
Number of steps to delay before quantization takes effect during training. Default: 500000 |
weight_bits |
int32 |
optional |
Number of bits to use for quantizing weights. Only 8 bit is supported for now. Default: 8 |
activation_bits |
int32 |
optional |
Number of bits to use for quantizing activations. Only 8 bit is supported for now. Default: 8 |
Trainer
This is the main message to define an experiment.
Field |
Type |
Label |
Description |
inputs |
deepomatic.oef.models.image.preprocessing.Input |
repeated |
Data augmentation options for each input |
image_classification |
deepomatic.oef.models.image.classification.Classification |
optional |
A classification model. |
image_detection |
deepomatic.oef.models.image.detection.Detection |
optional |
A detection model. |
image_ocr |
deepomatic.oef.models.image.ocr.OCR |
optional |
An OCR model |
image_segmentation |
deepomatic.oef.models.image.segmentation.Segmentation |
optional |
A segmentation model. |
batch_size |
int32 |
required |
Batch size: set by default according to the chosen model if not set. |
eval_batch_size |
int32 |
optional |
Batch size for evaluation: set to batch_size if set to non-positive value. Default: 0 |
num_train_epochs |
float |
optional |
Number of batches processed during training (in epochs). Used only if num_train_steps is zero. Default: 6 |
num_train_steps |
int32 |
optional |
Number of batches processed during training. If zero, the trainer will use num_train_epochs instead. Default: 0 |
num_eval_steps |
int32 |
optional |
Number of batches processed during evaluation (use zero to run on the whole validation set). Default: 0 |
add_regularization_loss |
bool |
optional |
Additional loss generated by the regularization function Default: true |
freeze_variables |
string |
repeated |
Variables that should not be updated during training. If update_trainable_variables is not empty, only eliminates the included variables according to freeze_variables patterns. |
update_trainable_variables |
string |
repeated |
Variables that should be updated during training. Note that variables which also match the patterns in freeze_variables will be excluded. |
pretrained_parameters |
string |
optional |
URL to pretrained parameters |
keep_checkpoint_every_n_hours |
float |
optional |
Time interval between two parameter checkpoints, in hours. Default: 1 |
resume_training |
bool |
optional |
Whether to load all variables, or only those within the feature extractor scopes. If true, the global step will be reset to zero. Default: false |
do_not_restore_variables |
string |
repeated |
Variables that should not be restored from a checkpoint when fine-tuning. Typically useful for some convolutions close to the label, where it may be better to initialize them with random weights. This is not used when resuming training. You can indicate prefixes of parameter names: all parameters starting with any of the prefixes will be skipped. |
restore_backbone_weights_only |
bool |
optional |
Set it to true to restore only the weights of the backbone when working with complex meta-architectures like detection. When set to true, it will restore the backbone weights but not the meta-architecture weights. Default: false |
initial_learning_rate |
float |
optional |
Initial learning rate |
learning_rate_policy |
deepomatic.oef.optimizer.LearningRatePolicy |
optional |
Learning rate policy |
optimizer |
deepomatic.oef.optimizer.Optimizer |
optional |
Optimizer type |
gradient_clipping_by_norm |
float |
optional |
Max value for a gradient; prevents exploding gradients when backpropagating. Use 0 to deactivate. Default: 10 |
use_float16 |
bool |
optional |
Use float16 instead of float32 Default: false |
quantization |
Quantization |
optional |
Parameters for quantization. |
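The interplay between `num_train_steps`, `num_train_epochs` and `eval_batch_size` described in the table can be sketched as two small helpers. The helper names and the `dataset_size` argument are hypothetical, and defining one epoch as `dataset_size / batch_size` steps (rounded up at the end) is an assumption.

```python
import math


def resolve_num_train_steps(num_train_steps: int, num_train_epochs: float,
                            batch_size: int, dataset_size: int) -> int:
    """Number of training batches, falling back to epochs when steps is zero."""
    if num_train_steps > 0:
        return num_train_steps
    # Assumption: one epoch is dataset_size / batch_size batches.
    return math.ceil(num_train_epochs * dataset_size / batch_size)


def resolve_eval_batch_size(eval_batch_size: int, batch_size: int) -> int:
    """Evaluation batch size, falling back to batch_size when non-positive."""
    return eval_batch_size if eval_batch_size > 0 else batch_size
```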
deepomatic/oef/protos/models/image/backbones.proto
Backbone
The list of allowed backbones
DarknetBackbone
Field |
Type |
Label |
Description |
depth |
DarknetBackbone.Depth |
required |
The backbone variant: Darknet-19 or Darknet-53 |
EfficientNetBackbone
Field |
Type |
Label |
Description |
version |
EfficientNetBackbone.Version |
required |
The backbone variant: EfficientNet-B[0-8] or EfficientNet-L2 |
survival_prob |
float |
optional |
Default: 0.8 |
InceptionBackbone
InceptionResNetBackbone
MobileNetBackbone
NasNetBackbone
ResNetBackbone
VGGBackbone
Field |
Type |
Label |
Description |
depth |
VGGBackbone.Depth |
required |
The backbone variant: VGG-16, VGG-19, etc… |
YoloV8Backbone
DarknetBackbone.Depth
Name |
Number |
Description |
DEPTH_19 |
19 |
|
DEPTH_53 |
53 |
|
EfficientNetBackbone.Version
Name |
Number |
Description |
B0 |
0 |
https://arxiv.org/abs/1905.11946 |
B1 |
1 |
https://arxiv.org/abs/1905.11946 |
B2 |
2 |
https://arxiv.org/abs/1905.11946 |
B3 |
3 |
https://arxiv.org/abs/1905.11946 |
B4 |
4 |
https://arxiv.org/abs/1905.11946 |
B5 |
5 |
https://arxiv.org/abs/1905.11946 |
B6 |
6 |
https://arxiv.org/abs/1905.11946 |
B7 |
7 |
https://arxiv.org/abs/1905.11946 |
B8 |
8 |
https://arxiv.org/abs/1911.09665 |
L2 |
10 |
https://arxiv.org/abs/1911.04252 |
InceptionBackbone.Version
Name |
Number |
Description |
V1 |
1 |
|
V2 |
2 |
|
V3 |
3 |
|
V4 |
4 |
|
InceptionResNetBackbone.Version
Name |
Number |
Description |
V2 |
2 |
|
MobileNetBackbone.Version
Name |
Number |
Description |
V1 |
1 |
|
V2 |
2 |
|
NasNetBackbone.Depth
Name |
Number |
Description |
LARGE |
0 |
|
MOBILE |
1 |
|
NasNetBackbone.Version
Name |
Number |
Description |
NasNet |
1 |
The casing of these values is used to generate human-readable strings in experiment_to_display_name.py: keep the mixed case. See https://arxiv.org/abs/1707.07012 |
PNasNet |
2 |
https://arxiv.org/abs/1712.00559 |
ResNetBackbone.Depth
Name |
Number |
Description |
DEPTH_50 |
50 |
|
DEPTH_101 |
101 |
|
DEPTH_152 |
152 |
DEPTH_200 = 200; |
ResNetBackbone.Version
Name |
Number |
Description |
V1 |
1 |
|
V2 |
2 |
|
VGGBackbone.Depth
Name |
Number |
Description |
DEPTH_11 |
11 |
|
DEPTH_16 |
16 |
|
DEPTH_19 |
19 |
|
YoloV8Backbone.Version
Name |
Number |
Description |
Nano |
1 |
|
Small |
2 |
|
Medium |
3 |
|
Large |
4 |
|
Extra |
5 |
|
deepomatic/oef/protos/models/image/classification.proto
Image related models.
Classification
Classification model
deepomatic/oef/protos/models/image/detection.proto
Image related models.
Detection
Detection model
EfficientDetMetaArchitecture
Training parameters for EfficientDet
Field |
Type |
Label |
Description |
activation_function |
string |
optional |
activation function to be used ('swish', 'swish_native', 'relu', 'relu6') Default: swish |
min_level |
int32 |
optional |
integer number of minimum level of the output feature pyramid Default: 3 |
max_level |
int32 |
optional |
integer number of maximum level of the output feature pyramid Default: 7 |
num_scales |
int32 |
optional |
integer number of intermediate anchor scales added on each level Default: 3 |
aspect_ratios |
EfficientDetMetaArchitecture.AspectRatio |
repeated |
list of aspect ratio anchors added on each level |
anchor_scale |
float |
optional |
float number scale of size of the base anchor to the feature stride 2^level Default: 4 |
alpha |
float |
optional |
classification loss: focal loss float number weighting factor alpha Default: 0.25 |
gamma |
float |
optional |
classification loss: focal loss float number focusing parameter gamma Default: 1.5 |
delta |
float |
optional |
localization loss: huber loss float number transition parameter delta from quadratic to linear function Default: 0.1 |
box_loss_weight |
float |
optional |
localization loss weight: huber loss Default: 50 |
iou_loss_type |
EfficientDetMetaArchitecture.IOULossType |
optional |
Default: NONE |
iou_loss_weight |
float |
optional |
localization loss weight: IoU loss Default: 1 |
weight_decay |
float |
optional |
float number regularization weight decay Default: 4e-05 |
box_class_repeats |
int32 |
optional |
integer number of layers in classification / box net Default: 3 |
fpn_cell_repeats |
int32 |
optional |
integer number of layers in BiFPN Default: 3 |
fpn_num_filters |
int32 |
optional |
integer number of filters in BiFPN layers Default: 88 |
fpn_name |
string |
optional |
configuration name of BiFPN ('bifpn_sum', 'bifpn_fa', 'bifpn_dyn') Default: bifpn_fa |
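To show how `min_level`, `max_level`, `num_scales`, `aspect_ratios` and `anchor_scale` relate, the sketch below enumerates anchor sizes the way EfficientDet-style anchor generators usually do: the base size at a level is `anchor_scale * 2^level`, multiplied by `2^(i / num_scales)` for each intermediate scale. Whether this project computes anchors exactly this way is an assumption, and the function name is hypothetical.

```python
from typing import List, Sequence, Tuple


def anchor_sizes(min_level: int = 3, max_level: int = 7, num_scales: int = 3,
                 anchor_scale: float = 4.0,
                 aspect_ratios: Sequence[Tuple[float, float]] = ((1.0, 1.0),),
                 ) -> List[Tuple[int, float, float]]:
    """List (level, width, height) for every anchor shape of the pyramid.

    aspect_ratios holds (width_ratio, height_ratio) pairs.
    """
    sizes = []
    for level in range(min_level, max_level + 1):
        stride = 2 ** level  # feature stride at this pyramid level
        for octave in range(num_scales):
            base = anchor_scale * stride * 2 ** (octave / num_scales)
            for width_ratio, height_ratio in aspect_ratios:
                sizes.append((level, base * width_ratio, base * height_ratio))
    return sizes
```

With the defaults from the table this gives 5 levels times 3 scales per aspect ratio, starting at 32 pixels on level 3.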
EfficientDetMetaArchitecture.AspectRatio
Field |
Type |
Label |
Description |
height_ratio |
float |
required |
|
width_ratio |
float |
required |
|
Training parameters for Faster-RCNN
Field |
Type |
Label |
Description |
parameters |
RCNNParameters |
required |
|
initial_crop_size |
int32 |
required |
Output size (width and height are set to be the same) of the initial bilinear interpolation based cropping during ROI pooling. |
maxpool_kernel_size |
int32 |
required |
Kernel size of the max pool op on the cropped feature map during ROI pooling. |
maxpool_stride |
int32 |
required |
Stride of the max pool op on the cropped feature map during ROI pooling. |
FeaturePyramidNetworks
Configuration for Feature Pyramid Networks.
We recommend using multi_resolution_feature_map_generator with FPN; the
levels there must match the levels defined below for better
performance.
Correspondence from FPN levels to Resnet/Mobilenet V1 feature maps:
| FPN Level | Resnet Feature Map | Mobilenet-V1 Feature Map |
| --- | --- | --- |
| 2 | Block 1 | Conv2d_3_pointwise |
| 3 | Block 2 | Conv2d_5_pointwise |
| 4 | Block 3 | Conv2d_11_pointwise |
| 5 | Block 4 | Conv2d_13_pointwise |
| 6 | Bottomup_5 | bottom_up_Conv2d_14 |
| 7 | Bottomup_6 | bottom_up_Conv2d_15 |
| 8 | Bottomup_7 | bottom_up_Conv2d_16 |
| 9 | Bottomup_8 | bottom_up_Conv2d_17 |
Field |
Type |
Label |
Description |
min_level |
int32 |
optional |
minimum level in feature pyramid Default: 3 |
max_level |
int32 |
optional |
maximum level in feature pyramid Default: 7 |
additional_layer_depth |
int32 |
optional |
channel depth for additional coarse feature layers. Default: 256 |
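The level correspondence above can be sketched as a simple lookup in Python. The dictionary and the helper name fpn_feature_maps are hypothetical and written only for this illustration; the feature map names are copied from the correspondence listed above, and the defaults mirror min_level and max_level.

```python
# Hypothetical lookup built from the FPN level correspondence listed above.
# Each entry is (Resnet feature map, Mobilenet-V1 feature map).
FPN_LEVEL_TO_FEATURE_MAP = {
    2: ("Block 1", "Conv2d_3_pointwise"),
    3: ("Block 2", "Conv2d_5_pointwise"),
    4: ("Block 3", "Conv2d_11_pointwise"),
    5: ("Block 4", "Conv2d_13_pointwise"),
    6: ("Bottomup_5", "bottom_up_Conv2d_14"),
    7: ("Bottomup_6", "bottom_up_Conv2d_15"),
    8: ("Bottomup_7", "bottom_up_Conv2d_16"),
    9: ("Bottomup_8", "bottom_up_Conv2d_17"),
}

# Column index per backbone family (names chosen for this example only).
BACKBONE_COLUMN = {"resnet": 0, "mobilenet_v1": 1}


def fpn_feature_maps(min_level=3, max_level=7, backbone="resnet"):
    """Return the feature maps used for levels min_level..max_level."""
    if min_level > max_level:
        raise ValueError("min_level must not exceed max_level")
    if min_level not in FPN_LEVEL_TO_FEATURE_MAP or max_level not in FPN_LEVEL_TO_FEATURE_MAP:
        raise ValueError("levels must be between 2 and 9")
    column = BACKBONE_COLUMN[backbone]
    return [
        FPN_LEVEL_TO_FEATURE_MAP[level][column]
        for level in range(min_level, max_level + 1)
    ]
```

With the default levels (3 to 7) and a Resnet backbone, this yields Block 2 through Block 4 followed by the two bottom-up layers Bottomup_5 and Bottomup_6.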
RCNNParameters
Faster-RCNN and RFCN as described in https://arxiv.org/abs/1506.01497
Field |
Type |
Label |
Description |
number_of_stages |
int32 |
optional |
Number of stages to construct. Set to 1 to construct only the Region Proposal Network (RPN). Default: 2 |
first_stage_features_stride |
int32 |
optional |
Output stride of extracted RPN feature map. Default: 16 |
batch_norm_trainable |
bool |
optional |
Whether to update batch norm parameters during training or not. When training with a relatively large batch size (e.g. 8), it could be desirable to enable batch norm update. Default: false |
first_stage_anchor_generator |
AnchorGenerator |
required |
Anchor generator to compute RPN anchors. |
first_stage_atrous_rate |
int32 |
optional |
Atrous rate for the convolution op applied to the first_stage_features_to_crop tensor to obtain box predictions. Default: 1 |
first_stage_box_predictor_conv_hyperparams |
deepomatic.oef.models.image.utils.hyperparameters.Hyperparams |
required |
Hyperparameters for the convolutional RPN box predictor. |
first_stage_box_predictor_kernel_size |
int32 |
optional |
Kernel size to use for the convolution op just prior to RPN box predictions. Default: 3 |
first_stage_box_predictor_depth |
int32 |
optional |
Output depth for the convolution op just prior to RPN box predictions. Default: 512 |
first_stage_minibatch_size |
int32 |
optional |
The batch size to use for computing the first stage objectness and location losses. Default: 256 |
first_stage_positive_balance_fraction |
float |
optional |
Fraction of positive examples per image for the RPN. Default: 0.5 |
first_stage_nms_score_threshold |
float |
optional |
Non max suppression score threshold applied to first stage RPN proposals. Default: 0 |
first_stage_nms_iou_threshold |
float |
optional |
Non max suppression IOU threshold applied to first stage RPN proposals. Default: 0.7 |
first_stage_max_proposals |
int32 |
optional |
Maximum number of RPN proposals retained after first stage postprocessing. Default: 300 |
first_stage_localization_loss_weight |
float |
optional |
First stage RPN localization loss weight. Default: 2 |
first_stage_objectness_loss_weight |
float |
optional |
First stage RPN objectness loss weight. Default: 1 |
second_stage_box_predictor |
BoxPredictor |
required |
Hyperparameters for the second stage box predictor. If box predictor type is set to rfcn_box_predictor, a R-FCN model is constructed, otherwise a Faster R-CNN model is constructed. |
second_stage_batch_size |
int32 |
optional |
The batch size per image used for computing the classification and refined location loss of the box classifier. Note that this field is ignored if hard_example_miner is configured. Default: 64 |
second_stage_balance_fraction |
float |
optional |
Fraction of positive examples to use per image for the box classifier. Default: 0.25 |
second_stage_post_processing |
PostProcessing |
required |
Post processing to apply on the second stage box classifier predictions. Note: the score_converter provided to the FasterRCNNMetaArch constructor is taken from this second_stage_post_processing proto. |
second_stage_localization_loss_weight |
float |
optional |
Second stage refined localization loss weight. Default: 2 |
second_stage_classification_loss_weight |
float |
optional |
Second stage classification loss weight Default: 1 |
second_stage_mask_prediction_loss_weight |
float |
optional |
Second stage instance mask loss weight. Note that this is only applicable when MaskRCNNBoxPredictor is selected for second stage and configured to predict instance masks. Default: 1 |
hard_example_miner |
deepomatic.oef.losses.HardExampleMiner |
optional |
If not left to default, applies hard example mining only to classification and localization loss. |
second_stage_classification_loss |
deepomatic.oef.losses.ClassificationLoss |
required |
Loss for second stage box classifiers, supports Softmax and Sigmoid. Note that the score converter must be consistent with the loss type. When multiple labels are assigned to the same boxes, we recommend using sigmoid loss and enabling merge_multiple_label_boxes. If not specified, Softmax loss is used as default. |
inplace_batchnorm_update |
bool |
optional |
Whether to update batch_norm inplace during training. This is required for batch norm to work correctly on TPUs. When this is false, user must add a control dependency on tf.GraphKeys.UPDATE_OPS for train/loss op in order to update the batch norm moving average parameters. Default: false |
use_matmul_crop_and_resize |
bool |
optional |
Force the use of matrix multiplication based crop and resize instead of standard tf.image.crop_and_resize while computing second stage input feature maps. Default: false |
clip_anchors_to_image |
bool |
optional |
Normally, anchors generated for a given image size are pruned during training if they lie outside the image window. Setting this option to true clips the anchors to be within the image instead of pruning them. Default: false |
use_matmul_gather_in_matcher |
bool |
optional |
After performing matching between anchors and targets, in order to pull out targets for training the Faster R-CNN meta architecture we perform a gather operation. This option specifies whether to use an alternate implementation of tf.gather that is faster on TPUs. Default: false |
use_static_balanced_label_sampler |
bool |
optional |
Whether to use the balanced positive negative sampler implementation with static shape guarantees. Default: false |
use_static_shapes |
bool |
optional |
If True, uses implementation of ops with static shape guarantees. Default: false |
use_static_shapes_for_eval |
bool |
optional |
If True, uses implementation of ops with static shape guarantees when running evaluation (that is, when is_training is False). Default: false |
use_partitioned_nms_in_first_stage |
bool |
optional |
If true, uses implementation of partitioned_non_max_suppression in first stage. Default: true |
return_raw_detections_during_predict |
bool |
optional |
Whether to return raw detections (pre NMS). Default: false |
use_combined_nms_in_first_stage |
bool |
optional |
Whether to use tf.image.combined_non_max_suppression. Default: false |
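The sampling fields above (first_stage_minibatch_size with first_stage_positive_balance_fraction, and second_stage_batch_size with second_stage_balance_fraction) bound how many positive examples enter each loss. The Python sketch below illustrates that arithmetic. It assumes the sampler caps positives at fraction * batch size and fills the remainder with negatives; this behavior and the function name are assumptions made for illustration, not something this documentation confirms.

```python
def balanced_sample_counts(batch_size, positive_fraction, num_positives, num_negatives):
    """Return (sampled_positives, sampled_negatives) for one image.

    Assumption: positives are capped at positive_fraction * batch_size and
    the rest of the batch is filled with negatives when enough are available.
    """
    max_positives = int(positive_fraction * batch_size)
    sampled_positives = min(num_positives, max_positives)
    sampled_negatives = min(num_negatives, batch_size - sampled_positives)
    return sampled_positives, sampled_negatives
```

Under this assumption, the first stage defaults (256 and 0.5) allow at most 128 positive anchors per image, and the second stage defaults (64 and 0.25) allow at most 16 positive proposals.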
Training parameters for RFCN
Field |
Type |
Label |
Description |
conv_hyperparams |
deepomatic.oef.models.image.utils.hyperparameters.Hyperparams |
optional |
Hyperparameters that affect the layers of feature extractor added on top of the base feature extractor. |
pad_to_multiple |
int32 |
optional |
The nearest multiple to zero-pad the input height and width dimensions to. For example, if pad_to_multiple = 2, input dimensions are zero-padded until the resulting dimensions are even. Default: 1 |
use_explicit_padding |
bool |
optional |
Whether to use explicit padding when extracting SSD multiresolution features. This will also apply to the base feature extractor if a MobileNet architecture is used. @vdel: this seems to have been added to make backbones compatible with some runtimes and seems deprecated now. Default: false |
use_depthwise |
bool |
optional |
Whether to use depthwise separable convolutions to extract additional feature maps added by SSD. Default: false |
fpn |
FeaturePyramidNetworks |
optional |
Feature Pyramid Networks config. |
num_layers |
int32 |
optional |
The number of SSD layers. Default: 6 |
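The effect of pad_to_multiple described above can be sketched in a few lines of Python. The helper name padded_size is hypothetical; it only computes the padded dimensions and does not pad any tensor.

```python
def padded_size(height, width, pad_to_multiple=1):
    """Return (height, width) rounded up to the nearest multiple."""
    if pad_to_multiple < 1:
        raise ValueError("pad_to_multiple must be >= 1")

    def round_up(value):
        # Integer ceiling division, then scale back to a multiple.
        return -(-value // pad_to_multiple) * pad_to_multiple

    return round_up(height), round_up(width)
```

For example, with pad_to_multiple = 2 an input of 301 x 300 becomes 302 x 300, and with the default of 1 the dimensions are unchanged.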
SSD as described in https://arxiv.org/abs/1512.02325.
SSD-Lite as described in https://arxiv.org/pdf/1801.04381.pdf
Next id: 27
Field |
Type |
Label |
Description |
feature_extractor |
SSDFeatureExtractor |
required |
Feature extractor config. |
box_coder |
BoxCoder |
required |
Box coder to encode the boxes. |
matcher |
Matcher |
required |
Matcher to match groundtruth with anchors. |
similarity_calculator |
RegionSimilarityCalculator |
required |
Region similarity calculator to compute similarity of boxes. |
encode_background_as_zeros |
bool |
optional |
Whether background targets are to be encoded as an all zeros vector or a one-hot vector (where background is the 0th class). Default: false |
negative_class_weight |
float |
optional |
Classification weight associated with negative anchors. The weight must be in [0., 1.]. Default: 1 |
box_predictor |
BoxPredictor |
optional |
Box predictor to attach to the features. |
anchor_generator |
AnchorGenerator |
required |
Anchor generator to compute anchors. |
post_processing |
PostProcessing |
required |
Post processing to apply on the predictions. |
normalize_loss_by_num_matches |
bool |
optional |
Whether to normalize the loss by number of groundtruth boxes that match to the anchors. Default: true |
normalize_loc_loss_by_codesize |
bool |
optional |
Whether to normalize the localization loss by the code size of the box encodings. This is applied along with other normalization factors. Default: false |
losses |
deepomatic.oef.losses.Loss |
required |
Loss configuration for training. |
freeze_batchnorm |
bool |
optional |
Whether to update batch norm parameters during training or not. When training with a relatively small batch size (e.g. 1), it is desirable to disable batch norm update and use pretrained batch norm params. Note: Some feature extractors are used with canned arg_scopes (e.g. resnet arg scopes). In these cases the training behavior of batch norm variables may depend on both values of batch_norm_trainable and is_training. When canned arg_scopes are used with feature extractors, conv_hyperparams will apply only to the additional layers that are added and are outside the canned arg_scope. Default: false |
inplace_batchnorm_update |
bool |
optional |
Whether to update batch_norm inplace during training. This is required for batch norm to work correctly on TPUs. When this is false, user must add a control dependency on tf.GraphKeys.UPDATE_OPS for train/loss op in order to update the batch norm moving average parameters. Default: false |
add_background_class |
bool |
optional |
Whether to add an implicit background class to one-hot encodings of groundtruth labels. Set to false if training a single class model or using an explicit background class. Default: true |
explicit_background_class |
bool |
optional |
Whether to use an explicit background class. Set to true if using groundtruth labels with an explicit background class, as in multiclass scores. Default: false |
use_confidences_as_targets |
bool |
optional |
Default: false |
implicit_example_weight |
float |
optional |
Default: 1 |
return_raw_detections_during_predict |
bool |
optional |
Default: false |
mask_head_config |
SSDMetaArchitecture.MaskHead |
optional |
Configs for mask head. |
Configuration proto for MaskHead.
Next id: 11
Field |
Type |
Label |
Description |
mask_height |
int32 |
optional |
The height and the width of the predicted mask. Only used when predict_instance_masks is true. Default: 15 |
mask_width |
int32 |
optional |
Default: 15 |
masks_are_class_agnostic |
bool |
optional |
Whether to predict class agnostic masks. Only used when predict_instance_masks is true. Default: true |
mask_prediction_conv_depth |
int32 |
optional |
The depth for the first conv2d_transpose op applied to the image_features in the mask prediction branch. If set to 0, the value will be set automatically based on the number of channels in the image features and the number of classes. Default: 256 |
mask_prediction_num_conv_layers |
int32 |
optional |
The number of convolutions applied to image_features in the mask prediction branch. Default: 2 |
convolve_then_upsample_masks |
bool |
optional |
Whether to apply convolutions on mask features before upsampling using nearest neighbor resizing. By default, mask features are resized to [mask_height , mask_width ] before applying convolutions and predicting masks. Default: false |
mask_loss_weight |
float |
optional |
Mask loss weight. Default: 5 |
mask_loss_sample_size |
int32 |
optional |
Number of boxes to be generated at training time for computing mask loss. Default: 16 |
conv_hyperparams |
deepomatic.oef.models.image.utils.hyperparameters.Hyperparams |
optional |
Hyperparameters for convolution ops used in the box predictor. |
initial_crop_size |
int32 |
optional |
Output size (width and height are set to be the same) of the initial bilinear interpolation based cropping during ROI pooling. Only used when we have second stage prediction head enabled (e.g. mask head). Default: 15 |
YoloParameters
Yolo V2 & V3 as described in https://arxiv.org/abs/1612.08242
Field |
Type |
Label |
Description |
subdivisions |
int32 |
required |
The number of splits applied to each mini-batch so that it fits in GPU memory. Using 1 computes the whole mini-batch in a single pass and may use a lot of RAM. |
classification_loss |
deepomatic.oef.losses.ClassificationLoss |
required |
Loss for classifiers, supports Softmax and Sigmoid. |
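The role of subdivisions can be illustrated with a short Python sketch that splits one mini-batch into equally sized chunks, each processed in its own pass. The helper name is hypothetical, and the requirement that the batch size be divisible by subdivisions is an assumption of this example rather than a documented constraint.

```python
def split_into_subdivisions(batch, subdivisions):
    """Split a mini-batch (a list of samples) into `subdivisions` chunks."""
    if subdivisions < 1:
        raise ValueError("subdivisions must be >= 1")
    if len(batch) % subdivisions != 0:
        # Assumption made for this sketch: chunks must have equal size.
        raise ValueError("batch size must be divisible by subdivisions")
    chunk_size = len(batch) // subdivisions
    return [
        batch[i * chunk_size:(i + 1) * chunk_size]
        for i in range(subdivisions)
    ]
```

A larger subdivisions value means smaller chunks and lower peak memory, at the cost of more passes per mini-batch.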
Training parameters for Yolo v2
Training parameters for Yolo v3
Training parameters for Yolo v8
localization loss: IoU loss type
We use lower case names because the name of the enum is directly used
in third-party code.
Name |
Number |
Description |
NONE |
0 |
|
iou |
1 |
|
ciou |
2 |
|
diou |
3 |
|
giou |
4 |
|
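The enum above only names the loss variants. As a rough illustration of what two of them measure, the Python sketch below computes IoU and GIoU for axis-aligned boxes using the commonly published definitions. The formulas, the (ymin, xmin, ymax, xmax) box layout and the function names are assumptions of this example and are not taken from this library; ciou and diou are not shown.

```python
def box_area(box):
    """Area of a (ymin, xmin, ymax, xmax) box; empty boxes have area 0."""
    ymin, xmin, ymax, xmax = box
    return max(0.0, ymax - ymin) * max(0.0, xmax - xmin)


def _intersection_and_union(a, b):
    inter_box = (max(a[0], b[0]), max(a[1], b[1]), min(a[2], b[2]), min(a[3], b[3]))
    inter = box_area(inter_box)
    union = box_area(a) + box_area(b) - inter
    return inter, union


def iou(a, b):
    """Intersection over union of two boxes."""
    inter, union = _intersection_and_union(a, b)
    return inter / union if union > 0 else 0.0


def giou(a, b):
    """Generalized IoU: IoU minus the unused share of the enclosing box."""
    inter, union = _intersection_and_union(a, b)
    enclosing = box_area((min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3])))
    if union <= 0 or enclosing <= 0:
        return 0.0
    return inter / union - (enclosing - union) / enclosing
```

A loss is typically derived as 1 minus the chosen measure, scaled by iou_loss_weight; that combination is likewise an assumption here.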
deepomatic/oef/protos/models/image/ocr.proto
Image related models.
Attention
Attention-based OCR as described in https://arxiv.org/abs/1704.03549
Field |
Type |
Label |
Description |
use_autoregression |
bool |
optional |
Whether or not we should base the prediction of the next character also on previous characters. This is typically used to recognize words. This is typically NOT used for license plate prediction. Default: false |
num_lstm_units |
int32 |
optional |
The size of the hidden state vector Default: 256 |
use_coordinate_feature |
bool |
optional |
Whether to add one-hot vectors encoding the location to the features as a prior. Default: false |
feature_map_ratio |
int32 |
optional |
Feature map size ratio to input size Default: 8 |
weight_decay |
float |
optional |
Weight decay coefficient used for regularization. Default: 4e-05 |
lstm_state_clip_value |
float |
optional |
The LSTM cell state is clipped by this value prior to the cell output activation. Default: 10 |
label_smoothing |
float |
optional |
Smoothing factor that moves labels towards 1/num_classes. Default: 0.1 |
use_attention |
bool |
optional |
Whether the OCR uses an attention mask to focus on each letter |
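The label_smoothing field above moves each target towards 1/num_classes. A minimal Python sketch of that idea follows, assuming the usual formulation in which the smoothed target is (1 - label_smoothing) * one_hot + label_smoothing / num_classes. The function name is hypothetical and the exact formula used by the model is an assumption.

```python
def smooth_one_hot(class_index, num_classes, label_smoothing=0.1):
    """Return a smoothed one-hot target as a list of floats."""
    uniform = label_smoothing / num_classes
    return [
        (1.0 - label_smoothing) + uniform if i == class_index else uniform
        for i in range(num_classes)
    ]
```

The smoothed targets still sum to 1, and the true class keeps most of the probability mass.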
OCR
OCR model
deepomatic/oef/protos/models/image/preprocessing.proto
Image related models.
AutoAugmentImage
Apply an Autoaugment policy to the image and bounding boxes.
Field |
Type |
Label |
Description |
policy_name |
string |
required |
What AutoAugment policy to apply to the Image. The available options are v0 , v1 , v2 , v3 for a detection task, and v4 for a classification/tagging task. v0 is the policy used for all of the results in the "detection" paper [1] and was found to achieve the best results on the COCO dataset. v1 , v2 and v3 are additional good policies found on the COCO dataset that have slight variation in what operations were used during the search procedure along with how many operations are applied in parallel to a single image (2 vs 3). v4 corresponds to the best policy found in the original AutoAugment paper for classification [2] 'AutoAugment: Learning Augmentation Strategies from Data' on reduced ImageNet dataset (see arxiv link, table 9 in the appendix). [1] Object detection: https://arxiv.org/pdf/1906.11172.pdf [2] Classification: https://arxiv.org/pdf/1805.09501.pdf |
ConvertClassLogitsToSoftmax
Converts class logits to softmax optionally scaling the values by temperature
first.
Field |
Type |
Label |
Description |
temperature |
float |
optional |
Scale to use on logits before applying softmax. Default: 1 |
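As an illustration of temperature scaling, the Python sketch below divides the logits by temperature before applying softmax. Dividing (rather than multiplying) is the usual convention and is an assumption here; the function name is hypothetical.

```python
import math


def softmax_with_temperature(logits, temperature=1.0):
    """Softmax over logits / temperature, computed in a numerically stable way."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    # Subtract the maximum so that exp() cannot overflow.
    max_scaled = max(scaled)
    exps = [math.exp(x - max_scaled) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]
```

Under this convention, a temperature above 1 flattens the distribution and a temperature below 1 sharpens it.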
DropLabelProbabilistically
Randomly drops ground truth boxes for a label with some probability.
Field |
Type |
Label |
Description |
label |
int32 |
optional |
The label that should be dropped. This corresponds to one of the entries in the label map. |
drop_probability |
float |
optional |
Probability of dropping the label. Default: 1 |
FixedShapeResizer
Configuration proto for image resizer that resizes to a fixed shape.
Field |
Type |
Label |
Description |
height |
int32 |
optional |
Desired height of image in pixels. Default: 300 |
width |
int32 |
optional |
Desired width of image in pixels. Default: 300 |
resize_method |
ResizeType |
optional |
Desired method when resizing image. Default: BILINEAR |
convert_to_grayscale |
bool |
optional |
Whether to also resize the image channels from 3 to 1 (RGB to grayscale). Default: false |
ImageResizer
Configuration proto for image resizing operations.
See builders/image_resizer_builder.py for details.
Field |
Type |
Label |
Description |
image_resizer |
ImageResizer |
required |
The input image resizer |
data_augmentation_options |
PreprocessingStep |
repeated |
Data augmentation options. |
KeepAspectRatioResizer
Configuration proto for image resizer that keeps aspect ratio.
Field |
Type |
Label |
Description |
min_dimension |
int32 |
optional |
Desired size of the smaller image dimension in pixels. Default: 0 |
max_dimension |
int32 |
required |
Desired size of the larger image dimension in pixels. |
resize_method |
ResizeType |
optional |
Desired method when resizing image. Default: BILINEAR |
pad_to_max_dimension |
bool |
optional |
Whether to pad the image with zeros so the output spatial size is [max_dimension, max_dimension]. Note that the zeros are padded to the bottom and the right of the resized image. Default: true |
convert_to_grayscale |
bool |
optional |
Whether to also resize the image channels from 3 to 1 (RGB to grayscale). Default: false |
per_channel_pad_value |
float |
repeated |
Per-channel pad value. This is only used when pad_to_max_dimension is True. If unspecified, a default pad value of 0 is applied to all channels. |
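The interplay of min_dimension and max_dimension can be sketched as follows in Python. The sketch assumes the resizer first scales the smaller side to min_dimension and falls back to scaling the larger side to max_dimension when the first result would exceed it; this rule, the rounding and the function name are assumptions for illustration, and padding is not modeled.

```python
def keep_aspect_ratio_size(height, width, min_dimension, max_dimension):
    """Return the resized (height, width), preserving the aspect ratio."""
    smaller, larger = min(height, width), max(height, width)
    # First attempt: make the smaller side equal to min_dimension.
    new_height = round(height * min_dimension / smaller)
    new_width = round(width * min_dimension / smaller)
    if max(new_height, new_width) > max_dimension:
        # Fallback: make the larger side equal to max_dimension instead.
        new_height = round(height * max_dimension / larger)
        new_width = round(width * max_dimension / larger)
    return new_height, new_width
```

For a 480 x 640 image with min_dimension 600 and max_dimension 1024 this gives 600 x 800, while a 300 x 1200 image is limited by max_dimension and becomes 256 x 1024.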
NormalizeImage
Normalizes pixel values in an image
For every channel in the image, moves the pixel values from the range
[original_minval, original_maxval] to [target_minval, target_maxval]
Field |
Type |
Label |
Description |
original_minval |
float |
optional |
|
original_maxval |
float |
optional |
|
target_minval |
float |
optional |
Default: 0 |
target_maxval |
float |
optional |
Default: 1 |
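The range mapping described for NormalizeImage is a linear rescaling. A minimal Python sketch for a single pixel value, using the field names as parameter names (the function name itself is hypothetical):

```python
def normalize_pixel(value, original_minval, original_maxval,
                    target_minval=0.0, target_maxval=1.0):
    """Map value from [original_minval, original_maxval] to the target range."""
    if original_maxval == original_minval:
        raise ValueError("original range must not be empty")
    fraction = (value - original_minval) / (original_maxval - original_minval)
    return fraction * (target_maxval - target_minval) + target_minval
```

For example, mapping 8-bit pixel values from [0, 255] to [-1, 1] sends 127.5 to 0.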
PreprocessingStep
Message for defining a preprocessing operation on input data.
See: //third_party/tensorflow_models/object_detection/core/preprocessor.py
Next ID: 39
RGBtoGray
Converts the RGB image to a grayscale image. This also converts the image
depth from 3 to 1, unlike RandomRGBtoGray which does not change the image
depth.
RandomAbsolutePadImage
Randomly adds a padding of size [0, max_height_padding), [0, max_width_padding).
Field |
Type |
Label |
Description |
max_height_padding |
int32 |
optional |
Height will be padded uniformly at random from [0, max_height_padding). |
max_width_padding |
int32 |
optional |
Width will be padded uniformly at random from [0, max_width_padding). |
pad_color |
float |
repeated |
Color of the padding. If unset, will pad using average color of the input image. |
RandomAdjustBrightness
Randomly changes image brightness by up to max_delta. Image outputs will be
saturated between 0 and 1.
Field |
Type |
Label |
Description |
max_delta |
float |
optional |
Default: 0.2 |
RandomAdjustContrast
Randomly scales contrast by a value between [min_delta, max_delta].
Field |
Type |
Label |
Description |
min_delta |
float |
optional |
Default: 0.8 |
max_delta |
float |
optional |
Default: 1.25 |
RandomAdjustHue
Randomly alters hue by a value of up to max_delta.
Field |
Type |
Label |
Description |
max_delta |
float |
optional |
Default: 0.02 |
RandomAdjustSaturation
Randomly changes saturation by a value between [min_delta, max_delta].
Field |
Type |
Label |
Description |
min_delta |
float |
optional |
Default: 0.8 |
max_delta |
float |
optional |
Default: 1.25 |
RandomBlackPatches
Randomly adds black square patches to an image.
Field |
Type |
Label |
Description |
max_black_patches |
int32 |
optional |
The maximum number of black patches to add. Default: 10 |
probability |
float |
optional |
The probability of a black patch being added to an image. Default: 0.5 |
size_to_image_ratio |
float |
optional |
Ratio between the dimension of the black patch and the minimum dimension of the image (patch_width = patch_height = size_to_image_ratio * min(image_height, image_width)). Default: 0.1 |
RandomCropImage
Randomly crops the image and bounding boxes.
Field |
Type |
Label |
Description |
min_object_covered |
float |
optional |
Cropped image must cover at least one box by this fraction. Default: 1 |
min_aspect_ratio |
float |
optional |
Aspect ratio bounds of cropped image. Default: 0.75 |
max_aspect_ratio |
float |
optional |
Default: 1.33 |
min_area |
float |
optional |
Allowed area ratio of cropped image to original image. Default: 0.1 |
max_area |
float |
optional |
Default: 1 |
overlap_thresh |
float |
optional |
Minimum overlap threshold of cropped boxes to keep in new image. If the ratio between a cropped bounding box and the original is less than this value, it is removed from the new image. Default: 0.3 |
clip_boxes |
bool |
optional |
Whether to clip the boxes to the cropped image. Default: true |
random_coef |
float |
optional |
Probability of keeping the original image. Default: 0 |
RandomCropPadImage
Randomly crops an image followed by a random pad.
Field |
Type |
Label |
Description |
min_object_covered |
float |
optional |
Cropping operation must cover at least one box by this fraction. Default: 1 |
min_aspect_ratio |
float |
optional |
Aspect ratio bounds of image after cropping operation. Default: 0.75 |
max_aspect_ratio |
float |
optional |
Default: 1.33 |
min_area |
float |
optional |
Allowed area ratio of image after cropping operation. Default: 0.1 |
max_area |
float |
optional |
Default: 1 |
overlap_thresh |
float |
optional |
Minimum overlap threshold of cropped boxes to keep in new image. If the ratio between a cropped bounding box and the original is less than this value, it is removed from the new image. Default: 0.3 |
clip_boxes |
bool |
optional |
Whether to clip the boxes to the cropped image. Default: true |
random_coef |
float |
optional |
Probability of keeping the original image during the crop operation. Default: 0 |
min_padded_size_ratio |
float |
repeated |
Maximum dimensions for padded image. If unset, will use double the original image dimension as a lower bound. Both of the following fields should be length 2. |
max_padded_size_ratio |
float |
repeated |
|
pad_color |
float |
repeated |
Color of the padding. If unset, will pad using average color of the input image. This field should be of length 3. |
RandomCropToAspectRatio
Randomly crops an image to a given aspect ratio.
Field |
Type |
Label |
Description |
aspect_ratio |
float |
optional |
Aspect ratio. Default: 1 |
overlap_thresh |
float |
optional |
Minimum overlap threshold of cropped boxes to keep in new image. If the ratio between a cropped bounding box and the original is less than this value, it is removed from the new image. Default: 0.3 |
clip_boxes |
bool |
optional |
Whether to clip the boxes to the cropped image. Default: true |
RandomDistortColor
Performs a random color distortion. color_orderings should either be 0 or 1.
Field |
Type |
Label |
Description |
color_ordering |
int32 |
optional |
|
RandomDownscaleToTargetPixels
Randomly shrinks image (keeping aspect ratio) to a target number of pixels.
If the image contains less than the chosen target number of pixels, it will
not be changed.
Field |
Type |
Label |
Description |
random_coef |
float |
optional |
Probability of keeping the original image. Default: 0 |
min_target_pixels |
int32 |
optional |
The target number of pixels will be chosen to be in the range [min_target_pixels, max_target_pixels] Default: 300000 |
max_target_pixels |
int32 |
optional |
Default: 500000 |
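The downscaling rule described above can be sketched in Python: an image with no more pixels than the target is left unchanged, otherwise both sides are scaled by the same factor so that the pixel count approaches the target. The rounding and the function name are assumptions of this example; the random choice of the target within [min_target_pixels, max_target_pixels] is left to the caller.

```python
import math


def downscale_to_target_pixels(height, width, target_pixels):
    """Return the (height, width) after shrinking to about target_pixels."""
    num_pixels = height * width
    if num_pixels <= target_pixels:
        # Images that are already small enough are not changed.
        return height, width
    # Scaling both sides by sqrt(ratio) scales the area by ratio.
    scale = math.sqrt(target_pixels / num_pixels)
    return max(1, round(height * scale)), max(1, round(width * scale))
```

For example, a 1000 x 2000 image with a target of 500000 pixels is halved on each side.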
RandomHorizontalFlip
Randomly flips the image and detections horizontally with the specified
probability (by default, 50% of the time).
Field |
Type |
Label |
Description |
keypoint_flip_permutation |
int32 |
repeated |
Specifies a mapping from the original keypoint indices to horizontally flipped indices. This is used in the event that keypoints are specified, in which case when the image is horizontally flipped the keypoints will need to be permuted. E.g. for keypoints representing left_eye, right_eye, nose_tip, mouth, left_ear, right_ear (in that order), one might specify the keypoint_flip_permutation below: keypoint_flip_permutation: 1 keypoint_flip_permutation: 0 keypoint_flip_permutation: 2 keypoint_flip_permutation: 3 keypoint_flip_permutation: 5 keypoint_flip_permutation: 4 If nothing is specified the order of keypoints will be maintained. |
probability |
float |
optional |
The probability of running this augmentation for each image. Default: 0.5 |
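To show why keypoint_flip_permutation is needed, the Python sketch below flips normalized (y, x) keypoints horizontally and then reorders them with the permutation from the example above. The coordinate convention and the function name are assumptions made for this illustration.

```python
def flip_keypoints_horizontally(keypoints, permutation):
    """Mirror normalized (y, x) keypoints and reorder them.

    After mirroring, the point that used to be the left eye sits where a
    right eye is expected, so index i of the result takes the mirrored
    keypoint at index permutation[i].
    """
    mirrored = [(y, 1.0 - x) for (y, x) in keypoints]
    return [mirrored[index] for index in permutation]


# Order: left_eye, right_eye, nose_tip, mouth, left_ear, right_ear.
face = [(0.3, 0.25), (0.3, 0.75), (0.5, 0.5), (0.7, 0.5), (0.4, 0.125), (0.4, 0.875)]
flipped = flip_keypoints_horizontally(face, [1, 0, 2, 3, 5, 4])
```

Because this example face is symmetric, the flipped and permuted keypoints equal the original ones; without the permutation, the left and right labels would be swapped.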
RandomImageScale
Randomly enlarges or shrinks image (keeping aspect ratio).
Field |
Type |
Label |
Description |
min_scale_ratio |
float |
optional |
Default: 0.5 |
max_scale_ratio |
float |
optional |
Default: 2 |
RandomJitterBoxes
Randomly jitters corners of boxes in the image determined by ratio.
For example, if a box is [100, 200] and ratio is 0.02, the corners can move by [1, 4].
Field |
Type |
Label |
Description |
ratio |
float |
optional |
Default: 0.05 |
RandomJpegQuality
Applies a jpeg encoding with a random quality factor.
Field |
Type |
Label |
Description |
random_coef |
float |
optional |
Probability of keeping the original image. Default: 0 |
min_jpeg_quality |
int32 |
optional |
Minimum jpeg quality to use. Default: 0 |
max_jpeg_quality |
int32 |
optional |
Maximum jpeg quality to use. Default: 100 |
RandomPadImage
Randomly adds padding to the image.
Field |
Type |
Label |
Description |
min_image_height |
int32 |
optional |
Minimum dimensions for padded image. If unset, will use original image dimension as a lower bound. |
min_image_width |
int32 |
optional |
|
max_image_height |
int32 |
optional |
Maximum dimensions for padded image. If unset, will use double the original image dimension as an upper bound. |
max_image_width |
int32 |
optional |
|
pad_color |
float |
repeated |
Color of the padding. If unset, will pad using average color of the input image. |
RandomPatchGaussian
Field |
Type |
Label |
Description |
random_coef |
float |
optional |
Probability of keeping the original image. Default: 0 |
min_patch_size |
int32 |
optional |
The patch size will be chosen to be in the range [min_patch_size, max_patch_size). Default: 1 |
max_patch_size |
int32 |
optional |
Default: 250 |
min_gaussian_stddev |
float |
optional |
The standard deviation of the gaussian noise applied within the patch will be chosen to be in the range [min_gaussian_stddev, max_gaussian_stddev). Default: 0 |
max_gaussian_stddev |
float |
optional |
Default: 1 |
RandomPixelValueScale
Randomly scales the values of all pixels in the image by some constant value
between [minval,maxval], then clip the value to a range between [0, 1.0].
Field |
Type |
Label |
Description |
minval |
float |
optional |
Default: 0.9 |
maxval |
float |
optional |
Default: 1.1 |
RandomRGBtoGray
Randomly convert entire image to grey scale.
Field |
Type |
Label |
Description |
probability |
float |
optional |
Default: 0.1 |
RandomResizeMethod
Randomly resizes the image up to [target_height, target_width].
Field |
Type |
Label |
Description |
target_height |
int32 |
optional |
|
target_width |
int32 |
optional |
|
RandomRotation90
Randomly rotates the image and detections by 90 degrees counter-clockwise
with the specified probability (by default, 50% of the time).
Field |
Type |
Label |
Description |
keypoint_rot_permutation |
int32 |
repeated |
Specifies a mapping from the original keypoint indices to 90 degree counter clockwise indices. This is used in the event that keypoints are specified, in which case when the image is rotated the keypoints might need to be permuted. |
probability |
float |
optional |
The probability of running this augmentation for each image. Default: 0.5 |
RandomScaleCropAndPadToSquare
Randomly scale, crop, and then pad an image to the desired square output
dimensions. Specifically, this method first samples a random_scale factor
from a uniform distribution between scale_min and scale_max, and then resizes
the image such that its maximum dimension is (output_size * random_scale).
Secondly, a square output_size crop is extracted from the resized image, and
finally the cropped region is padded to the desired square output_size.
The augmentation is borrowed from [1]
[1]: https://arxiv.org/abs/1911.09070
Field |
Type |
Label |
Description |
output_size |
int32 |
optional |
The (square) output image size Default: 512 |
scale_min |
float |
optional |
The minimum and maximum values from which to sample the random scale. Default: 0.1 |
scale_max |
float |
optional |
Default: 2 |
RandomSelfConcatImage
Randomly concatenates the image with itself horizontally and/or vertically.
Field |
Type |
Label |
Description |
concat_vertical_probability |
float |
optional |
Probability of concatenating the image vertically. Default: 0.1 |
concat_horizontal_probability |
float |
optional |
Probability of concatenating the image horizontally. Default: 0.1 |
RandomSquareCropByScale
Extract a square sized crop from an image whose side length is sampled by
randomly scaling the maximum spatial dimension of the image. If part of the
crop falls outside the image, it is filled with zeros.
The augmentation is borrowed from [1]
[1]: https://arxiv.org/abs/1904.07850
Field |
Type |
Label |
Description |
max_border |
int32 |
optional |
The maximum size of the border. The border defines distance in pixels to the image boundaries that will not be considered as a center of a crop. To make sure that the border does not go over the center of the image, we chose the border value by computing the minimum k, such that (max_border / (2**k)) < image_dimension/2 Default: 128 |
scale_min |
float |
optional |
The minimum and maximum values of scale. Default: 0.6 |
scale_max |
float |
optional |
Default: 1.3 |
num_scales |
int32 |
optional |
The number of discrete scale values to randomly sample between [scale_min, scale_max]. Default: 8 |
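The border rule stated for max_border (the minimum k such that max_border / (2**k) < image_dimension / 2) can be written out directly. The Python helper below is a hypothetical sketch of that rule for a single image dimension; returning an integer border is an assumption of this example.

```python
def effective_border(max_border, image_dimension):
    """Halve max_border until it is smaller than half the image dimension."""
    if image_dimension <= 0:
        raise ValueError("image_dimension must be positive")
    k = 0
    while max_border / (2 ** k) >= image_dimension / 2:
        k += 1
    return max_border // (2 ** k)
```

With the default max_border of 128, a 512 pixel dimension keeps the full border, while a 200 pixel dimension reduces it to 64.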
RandomVerticalFlip
Randomly flips the image and detections vertically with the specified
probability (by default, 50% of the time).
Field |
Type |
Label |
Description |
keypoint_flip_permutation |
int32 |
repeated |
Specifies a mapping from the original keypoint indices to vertically flipped indices. This is used in the event that keypoints are specified, in which case when the image is vertically flipped the keypoints will need to be permuted. E.g. for keypoints representing left_eye, right_eye, nose_tip, mouth, left_ear, right_ear (in that order), one might specify the keypoint_flip_permutation below: keypoint_flip_permutation: 1 keypoint_flip_permutation: 0 keypoint_flip_permutation: 2 keypoint_flip_permutation: 3 keypoint_flip_permutation: 5 keypoint_flip_permutation: 4 |
probability |
float |
optional |
The probability of running this augmentation for each image. Default: 0.5 |
RemapLabels
Remap a set of labels to a new label.
Field |
Type |
Label |
Description |
original_labels |
int32 |
repeated |
Labels to be remapped. |
new_label |
int32 |
optional |
Label to map to. |
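A minimal Python sketch of the remapping described above: every label listed in original_labels is replaced by new_label and all other labels are left untouched. The function name is hypothetical.

```python
def remap_labels(labels, original_labels, new_label):
    """Replace every label found in original_labels with new_label."""
    to_remap = set(original_labels)
    return [new_label if label in to_remap else label for label in labels]
```

For example, remapping labels 2 and 3 to 9 turns [1, 2, 3, 2, 5] into [1, 9, 9, 9, 5].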
ResizeImage
Resizes images to [new_height, new_width].
SSDRandomCrop
Randomly crops an image according to:
Liu et al., SSD: Single shot multibox detector.
This preprocessing step defines multiple SSDRandomCropOperations. Only one
operation (chosen at random) is actually performed on an image.
SSDRandomCropFixedAspectRatio
Randomly crops an image to a fixed aspect ratio according to:
Liu et al., SSD: Single shot multibox detector.
Multiple SSDRandomCropFixedAspectRatioOperations are defined by this
preprocessing step. Only one operation (chosen at random) is actually
performed on an image.
SSDRandomCropFixedAspectRatioOperation
Field |
Type |
Label |
Description |
min_object_covered |
float |
optional |
Cropped image must cover at least this fraction of one original bounding box. |
min_area |
float |
optional |
The area of the cropped image must be within the range of [min_area, max_area]. |
max_area |
float |
optional |
|
overlap_thresh |
float |
optional |
Cropped box area ratio must be above this threshold to be kept. |
clip_boxes |
bool |
optional |
Whether to clip the boxes to the cropped image. Default: true |
random_coef |
float |
optional |
Probability a crop operation is skipped. |
SSDRandomCropOperation
Field |
Type |
Label |
Description |
min_object_covered |
float |
optional |
Cropped image must cover at least this fraction of one original bounding box. |
min_aspect_ratio |
float |
optional |
The aspect ratio of the cropped image must be within the range of [min_aspect_ratio, max_aspect_ratio]. |
max_aspect_ratio |
float |
optional |
|
min_area |
float |
optional |
The area of the cropped image must be within the range of [min_area, max_area]. |
max_area |
float |
optional |
|
overlap_thresh |
float |
optional |
Cropped box area ratio must be above this threshold to be kept. |
clip_boxes |
bool |
optional |
Whether to clip the boxes to the cropped image. Default: true |
random_coef |
float |
optional |
Probability a crop operation is skipped. |
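The constraints in SSDRandomCropOperation can be read as a validity test on a candidate crop. The sketch below checks only min_object_covered and the aspect-ratio and area ranges; the class and function names are hypothetical, boxes are assumed to be normalized (ymin, xmin, ymax, xmax) tuples, the aspect ratio is assumed to be width / height, and the area is assumed to be a fraction of the full image.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple

Box = Tuple[float, float, float, float]  # (ymin, xmin, ymax, xmax), normalized


@dataclass
class CropConstraints:
    min_object_covered: float
    min_aspect_ratio: float
    max_aspect_ratio: float
    min_area: float
    max_area: float


def box_area(box: Box) -> float:
    return max(0.0, box[2] - box[0]) * max(0.0, box[3] - box[1])


def covered_fraction(crop: Box, box: Box) -> float:
    """Fraction of `box` that lies inside `crop`."""
    inter = (max(crop[0], box[0]), max(crop[1], box[1]),
             min(crop[2], box[2]), min(crop[3], box[3]))
    area = box_area(box)
    return box_area(inter) / area if area > 0 else 0.0


def crop_satisfies(crop: Box, boxes: Sequence[Box], c: CropConstraints) -> bool:
    height, width = crop[2] - crop[0], crop[3] - crop[1]
    if height <= 0 or width <= 0:
        return False
    if not c.min_aspect_ratio <= width / height <= c.max_aspect_ratio:
        return False
    if not c.min_area <= width * height <= c.max_area:
        return False
    # At least one original box must be covered by the required fraction.
    return any(covered_fraction(crop, b) >= c.min_object_covered for b in boxes)
```

A sampler would typically draw candidate crops until one passes this check or a retry budget runs out; overlap_thresh, clip_boxes and random_coef are not modeled here.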
SSDRandomCropPad
Randomly crops and pads an image according to:
Liu et al., SSD: Single shot multibox detector.
This preprocessing step defines multiple SSDRandomCropPadOperations. Only one
operation (chosen at random) is actually performed on an image.
SSDRandomCropPadFixedAspectRatio
Randomly crops and pads an image to a fixed aspect ratio according to:
Liu et al., SSD: Single shot multibox detector.
Multiple SSDRandomCropPadFixedAspectRatioOperations are defined by this
preprocessing step. Only one operation (chosen at random) is actually
performed on an image.
Field |
Type |
Label |
Description |
operations |
SSDRandomCropPadFixedAspectRatioOperation |
repeated |
|
aspect_ratio |
float |
optional |
Aspect ratio to pad to. This value is used for all crop and pad operations. Default: 1 |
min_padded_size_ratio |
float |
repeated |
Min ratio of padded image height and width to the input image's height and width. Two entries per operation. |
max_padded_size_ratio |
float |
repeated |
Max ratio of padded image height and width to the input image's height and width. Two entries per operation. |
SSDRandomCropPadFixedAspectRatioOperation
Field |
Type |
Label |
Description |
min_object_covered |
float |
optional |
Cropped image must cover at least this fraction of one original bounding box. |
min_aspect_ratio |
float |
optional |
The aspect ratio of the cropped image must be within the range of [min_aspect_ratio, max_aspect_ratio]. |
max_aspect_ratio |
float |
optional |
|
min_area |
float |
optional |
The area of the cropped image must be within the range of [min_area, max_area]. |
max_area |
float |
optional |
|
overlap_thresh |
float |
optional |
Cropped box area ratio must be above this threshold to be kept. |
clip_boxes |
bool |
optional |
Whether to clip the boxes to the cropped image. Default: true |
random_coef |
float |
optional |
Probability a crop operation is skipped. |
SSDRandomCropPadOperation
Field |
Type |
Label |
Description |
min_object_covered |
float |
optional |
Cropped image must cover at least this fraction of one original bounding box. |
min_aspect_ratio |
float |
optional |
The aspect ratio of the cropped image must be within the range of [min_aspect_ratio, max_aspect_ratio]. |
max_aspect_ratio |
float |
optional |
|
min_area |
float |
optional |
The area of the cropped image must be within the range of [min_area, max_area]. |
max_area |
float |
optional |
|
overlap_thresh |
float |
optional |
Cropped box area ratio must be above this threshold to be kept. |
clip_boxes |
bool |
optional |
Whether to clip the boxes to the cropped image. Default: true |
random_coef |
float |
optional |
Probability a crop operation is skipped. |
min_padded_size_ratio |
float |
repeated |
Min ratio of padded image height and width to the input image's height and width. Two entries per operation. |
max_padded_size_ratio |
float |
repeated |
Max ratio of padded image height and width to the input image's height and width. Two entries per operation. |
pad_color_r |
float |
optional |
Padding color. |
pad_color_g |
float |
optional |
|
pad_color_b |
float |
optional |
|
ScaleBoxesToPixelCoordinates
Scales boxes from normalized coordinates to pixel coordinates.
SubtractChannelMean
Normalizes an image by subtracting a mean from each channel.
Field |
Type |
Label |
Description |
means |
float |
repeated |
The mean to subtract from each channel. Should contain one value per channel of the input image. |
ResizeImage.Method
Name |
Number |
Description |
AREA |
1 |
|
BICUBIC |
2 |
|
BILINEAR |
3 |
|
NEAREST_NEIGHBOR |
4 |
|
ResizeType
Enumeration type for image resizing methods provided in TensorFlow.
Name |
Number |
Description |
BILINEAR |
0 |
Corresponds to tf.image.ResizeMethod.BILINEAR |
NEAREST_NEIGHBOR |
1 |
Corresponds to tf.image.ResizeMethod.NEAREST_NEIGHBOR |
BICUBIC |
2 |
Corresponds to tf.image.ResizeMethod.BICUBIC |
AREA |
3 |
Corresponds to tf.image.ResizeMethod.AREA |
deepomatic/oef/protos/models/image/segmentation.proto
Segmentation models
Training parameters for Mask-RCNN
Note: it's similar to FasterRCNNMetaArchitecture, but for readability and potential future
changes it's better to have it as a separate message definition.
Field |
Type |
Label |
Description |
parameters |
deepomatic.oef.models.image.detection.RCNNParameters |
required |
|
initial_crop_size |
int32 |
required |
Output size (width and height are set to be the same) of the initial bilinear interpolation based cropping during ROI pooling. |
maxpool_kernel_size |
int32 |
required |
Kernel size of the max pool op on the cropped feature map during ROI pooling. |
maxpool_stride |
int32 |
required |
Stride of the max pool op on the cropped feature map during ROI pooling. |
Segmentation
deepomatic/oef/protos/models/image/detection/anchor_generator.proto
AnchorGenerator
Configuration proto for the anchor generator to use in the object detection
pipeline. See core/anchor_generator.py for details.
deepomatic/oef/protos/models/image/detection/argmax_matcher.proto
ArgMaxMatcher
Configuration proto for ArgMaxMatcher. See
matchers/argmax_matcher.py for details.
Field |
Type |
Label |
Description |
matched_threshold |
float |
optional |
Threshold for positive matches. Default: 0.5 |
unmatched_threshold |
float |
optional |
Threshold for negative matches. Default: 0.5 |
ignore_thresholds |
bool |
optional |
Whether to construct ArgMaxMatcher without thresholds. Default: false |
negatives_lower_than_unmatched |
bool |
optional |
If True then negative matches are the ones below the unmatched_threshold, whereas ignored matches are in between the matched and unmatched threshold. If False, then negative matches are in between the matched and unmatched threshold, and everything lower than unmatched is ignored. Default: true |
force_match_for_each_row |
bool |
optional |
Whether to ensure each row is matched to at least one column. Default: false |
use_matmul_gather |
bool |
optional |
Force constructed match objects to use matrix multiplication based gather instead of standard tf.gather Default: false |
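The interplay of matched_threshold, unmatched_threshold and negatives_lower_than_unmatched can be sketched as below. This is a simplified illustration, not the code in matchers/argmax_matcher.py: the function name and the sentinel values -1 and -2 are assumptions, the similarity matrix is assumed to have one row per ground-truth box and one column per anchor, and force_match_for_each_row and ignore_thresholds are not modeled.

```python
from typing import List, Sequence

NEGATIVE = -1  # assumed sentinel for a negative (unmatched) column
IGNORED = -2   # assumed sentinel for a column that is ignored


def argmax_match(
    similarity: Sequence[Sequence[float]],
    matched_threshold: float = 0.5,
    unmatched_threshold: float = 0.5,
    negatives_lower_than_unmatched: bool = True,
) -> List[int]:
    """Return, per column, the best row index or a NEGATIVE/IGNORED sentinel."""
    if unmatched_threshold > matched_threshold:
        raise ValueError("unmatched_threshold must not exceed matched_threshold")
    if not similarity or not similarity[0]:
        return []
    matches = []
    for col in range(len(similarity[0])):
        column = [row[col] for row in similarity]
        best_row = max(range(len(column)), key=column.__getitem__)
        best = column[best_row]
        if best >= matched_threshold:
            matches.append(best_row)  # positive match
        elif best < unmatched_threshold:
            matches.append(NEGATIVE if negatives_lower_than_unmatched else IGNORED)
        else:  # between the two thresholds
            matches.append(IGNORED if negatives_lower_than_unmatched else NEGATIVE)
    return matches
```

With the default thresholds (both 0.5) the band between them is empty, so every column is either a positive match or falls below the unmatched threshold.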
deepomatic/oef/protos/models/image/detection/bipartite_matcher.proto
BipartiteMatcher
Configuration proto for bipartite matcher. See
matchers/bipartite_matcher.py for details.
Field |
Type |
Label |
Description |
use_matmul_gather |
bool |
optional |
Force constructed match objects to use matrix multiplication based gather instead of standard tf.gather Default: false |
deepomatic/oef/protos/models/image/detection/box_coder.proto
BoxCoder
Configuration proto for the box coder to be used in the object detection
pipeline. See core/box_coder.py for details.
deepomatic/oef/protos/models/image/detection/box_predictor.proto
BoxPredictor
Configuration proto for box predictor. See core/box_predictor.py for details.
ConvolutionalBoxPredictor
Configuration proto for Convolutional box predictor.
Next id: 13
Field |
Type |
Label |
Description |
conv_hyperparams |
deepomatic.oef.models.image.utils.hyperparameters.Hyperparams |
optional |
Hyperparameters for convolution ops used in the box predictor. |
min_depth |
int32 |
optional |
Minimum feature depth prior to predicting box encodings and class predictions. Default: 0 |
max_depth |
int32 |
optional |
Maximum feature depth prior to predicting box encodings and class predictions. If max_depth is set to 0, no additional feature map will be inserted before location and class predictions. Default: 0 |
num_layers_before_predictor |
int32 |
optional |
Number of the additional conv layers before the predictor. Default: 0 |
use_dropout |
bool |
optional |
Whether to use dropout for class prediction. Default: true |
dropout_keep_probability |
float |
optional |
Keep probability for dropout Default: 0.8 |
kernel_size |
int32 |
optional |
Size of final convolution kernel. If the spatial resolution of the feature map is smaller than the kernel size, then the kernel size is set to min(feature_width, feature_height). Default: 1 |
box_code_size |
int32 |
optional |
Size of the encoding for boxes. Default: 4 |
apply_sigmoid_to_scores |
bool |
optional |
Whether to apply sigmoid to the output of class predictions. TODO(jonathanhuang): Do we need this, given that we have a post-processing module? Default: false |
class_prediction_bias_init |
float |
optional |
Default: 0 |
use_depthwise |
bool |
optional |
Whether to use depthwise separable convolution for box predictor layers. Default: false |
box_encodings_clip_range |
ConvolutionalBoxPredictor.BoxEncodingsClipRange |
optional |
|
ConvolutionalBoxPredictor.BoxEncodingsClipRange
If specified, apply clipping to box encodings.
Field |
Type |
Label |
Description |
min |
float |
optional |
|
max |
float |
optional |
|
MaskRCNNBoxPredictor
TODO(alirezafathi): Refactor the proto file to be able to configure mask rcnn
head easily.
Next id: 15
Field |
Type |
Label |
Description |
fc_hyperparams |
deepomatic.oef.models.image.utils.hyperparameters.Hyperparams |
optional |
Hyperparameters for fully connected ops used in the box predictor. |
use_dropout |
bool |
optional |
Whether to use a dropout op prior to both the box and class predictions. Default: false |
dropout_keep_probability |
float |
optional |
Keep probability for dropout. This is only used if use_dropout is true. Default: 0.5 |
box_code_size |
int32 |
optional |
Size of the encoding for the boxes. Default: 4 |
conv_hyperparams |
deepomatic.oef.models.image.utils.hyperparameters.Hyperparams |
optional |
Hyperparameters for convolution ops used in the box predictor. |
predict_instance_masks |
bool |
optional |
Whether to predict instance masks inside detection boxes. Default: false |
mask_prediction_conv_depth |
int32 |
optional |
The depth for the first conv2d_transpose op applied to the image_features in the mask prediction branch. If set to 0, the value will be set automatically based on the number of channels in the image features and the number of classes. Default: 256 |
predict_keypoints |
bool |
optional |
Whether to predict keypoints inside detection boxes. Default: false |
mask_height |
int32 |
optional |
The height and the width of the predicted mask. Default: 15 |
mask_width |
int32 |
optional |
Default: 15 |
mask_prediction_num_conv_layers |
int32 |
optional |
The number of convolutions applied to image_features in the mask prediction branch. Default: 2 |
masks_are_class_agnostic |
bool |
optional |
Default: false |
share_box_across_classes |
bool |
optional |
Whether to use one box for all classes rather than a different box for each class. Default: false |
convolve_then_upsample_masks |
bool |
optional |
Whether to apply convolutions on mask features before upsampling using nearest neighbor resizing. By default, mask features are resized to [mask_height, mask_width] before applying convolutions and predicting masks. Default: false |
RfcnBoxPredictor
Field |
Type |
Label |
Description |
conv_hyperparams |
deepomatic.oef.models.image.utils.hyperparameters.Hyperparams |
optional |
Hyperparameters for convolution ops used in the box predictor. |
num_spatial_bins_height |
int32 |
optional |
Bin sizes for RFCN crops. Default: 3 |
num_spatial_bins_width |
int32 |
optional |
Default: 3 |
depth |
int32 |
optional |
Target depth to reduce the input image features to. Default: 1024 |
box_code_size |
int32 |
optional |
Size of the encoding for the boxes. Default: 4 |
crop_height |
int32 |
optional |
Size to resize the rfcn crops to. Default: 12 |
crop_width |
int32 |
optional |
Default: 12 |
WeightSharedConvolutionalBoxPredictor
Configuration proto for weight shared convolutional box predictor.
Next id: 19
Field |
Type |
Label |
Description |
conv_hyperparams |
deepomatic.oef.models.image.utils.hyperparameters.Hyperparams |
optional |
Hyperparameters for convolution ops used in the box predictor. |
num_layers_before_predictor |
int32 |
optional |
Number of the additional conv layers before the predictor. Default: 0 |
depth |
int32 |
optional |
Output depth for the convolution ops prior to predicting box encodings and class predictions. Default: 0 |
kernel_size |
int32 |
optional |
Size of final convolution kernel. If the spatial resolution of the feature map is smaller than the kernel size, then the kernel size is set to min(feature_width, feature_height). Default: 3 |
box_code_size |
int32 |
optional |
Size of the encoding for boxes. Default: 4 |
class_prediction_bias_init |
float |
optional |
Bias initialization for class prediction. It has been shown to stabilize training when there is a large number of negative boxes. See https://arxiv.org/abs/1708.02002 for details. Default: 0 |
use_dropout |
bool |
optional |
Whether to use dropout for class prediction. Default: false |
dropout_keep_probability |
float |
optional |
Keep probability for dropout. Default: 0.8 |
share_prediction_tower |
bool |
optional |
Whether to share the multi-layer tower between box prediction and class prediction heads. Default: false |
use_depthwise |
bool |
optional |
Whether to use depthwise separable convolution for box predictor layers. Default: false |
score_converter |
WeightSharedConvolutionalBoxPredictor.ScoreConverter |
optional |
Callable elementwise score converter at inference time. Default: IDENTITY |
box_encodings_clip_range |
WeightSharedConvolutionalBoxPredictor.BoxEncodingsClipRange |
optional |
|
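For class_prediction_bias_init, the paper linked in the field description initializes the bias so that every anchor starts with a small foreground probability. The sketch below computes such a bias from a chosen prior; the function names are hypothetical, and the prior value 0.01 is the one used in that paper rather than anything specified by this proto, whose field takes the bias value directly.

```python
import math


def prior_to_bias(prior_probability: float) -> float:
    """Bias b such that sigmoid(b) equals the desired prior probability."""
    if not 0.0 < prior_probability < 1.0:
        raise ValueError("prior_probability must be strictly between 0 and 1")
    return -math.log((1.0 - prior_probability) / prior_probability)


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


# A prior of 0.01 yields a clearly negative bias, so that initial class
# scores are close to 0 for the many background anchors.
bias = prior_to_bias(0.01)
```

The resulting value could then be supplied as class_prediction_bias_init.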
WeightSharedConvolutionalBoxPredictor.BoxEncodingsClipRange
If specified, apply clipping to box encodings.
Field |
Type |
Label |
Description |
min |
float |
optional |
|
max |
float |
optional |
|
WeightSharedConvolutionalBoxPredictor.ScoreConverter
Enum to specify how to convert the detection scores at inference time.
Name |
Number |
Description |
IDENTITY |
0 |
Output scores equal input scores. |
SIGMOID |
1 |
Applies a sigmoid on input scores. |
deepomatic/oef/protos/models/image/detection/calibration.proto
CalibrationConfig
Message wrapper for various calibration configurations.
ClassIdFunctionApproximations
Message for class-specific domain/range mapping for function
approximations.
ClassIdFunctionApproximations.ClassIdXyPairsMapEntry
Field |
Type |
Label |
Description |
key |
int32 |
optional |
|
value |
XYPairs |
optional |
|
ClassIdSigmoidCalibrations
Message for class-specific Sigmoid Calibration.
ClassIdSigmoidCalibrations.ClassIdSigmoidParametersMapEntry
FunctionApproximation
Message for class-agnostic domain/range mapping for function
approximations.
Field |
Type |
Label |
Description |
x_y_pairs |
XYPairs |
optional |
Message mapping class labels to indices |
SigmoidCalibration
Message for class-agnostic Sigmoid Calibration.
Field |
Type |
Label |
Description |
sigmoid_parameters |
SigmoidParameters |
optional |
Message mapping class index to Sigmoid Parameters |
SigmoidParameters
Message defining parameters for sigmoid calibration.
Field |
Type |
Label |
Description |
a |
float |
optional |
Default: -1 |
b |
float |
optional |
Default: 0 |
TemperatureScalingCalibration
Message for Temperature Scaling Calibration.
Field |
Type |
Label |
Description |
scaler |
float |
optional |
|
XYPairs
Message to store a domain/range pair for function to be approximated.
Field |
Type |
Label |
Description |
x_y_pair |
XYPairs.XYPair |
repeated |
Sequence of x/y pairs for function approximation. |
training_data_type |
TrainingDataType |
optional |
Description of data used to fit the calibration model. |
XYPairs.XYPair
Field |
Type |
Label |
Description |
x |
float |
optional |
|
y |
float |
optional |
|
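One plausible way to use a sequence of x/y pairs for function approximation is piecewise-linear interpolation, sketched below. This is an assumption about how XYPairs might be consumed, not a description of the actual calibration code; the function name is hypothetical, and inputs outside the covered domain are clamped to the end values.

```python
from bisect import bisect_right
from typing import Sequence, Tuple


def interpolate(pairs: Sequence[Tuple[float, float]], x: float) -> float:
    """Piecewise-linear interpolation through (x, y) pairs, clamped at the ends."""
    if not pairs:
        raise ValueError("at least one (x, y) pair is required")
    points = sorted(pairs)
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    hi = bisect_right(xs, x)  # first index whose x is strictly greater
    lo = hi - 1
    t = (x - xs[lo]) / (xs[hi] - xs[lo])
    return ys[lo] + t * (ys[hi] - ys[lo])
```

In a calibration setting, x would be a raw detection score and the returned y the calibrated score.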
TrainingDataType
Description of data used to fit the calibration model. CLASS_SPECIFIC
indicates that the calibration parameters are derived from detections
pertaining to a single class. ALL_CLASSES indicates that parameters were
obtained by fitting a model on detections from all classes (including the
background class).
Name |
Number |
Description |
DATA_TYPE_UNKNOWN |
0 |
|
ALL_CLASSES |
1 |
|
CLASS_SPECIFIC |
2 |
|
deepomatic/oef/protos/models/image/detection/faster_rcnn_box_coder.proto
FasterRcnnBoxCoder
Configuration proto for FasterRCNNBoxCoder. See
box_coders/faster_rcnn_box_coder.py for details.
Field |
Type |
Label |
Description |
y_scale |
float |
optional |
Scale factor for anchor encoded box center. Default: 10 |
x_scale |
float |
optional |
Default: 10 |
height_scale |
float |
optional |
Scale factor for anchor encoded box height. Default: 5 |
width_scale |
float |
optional |
Scale factor for anchor encoded box width. Default: 5 |
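The four scale factors can be illustrated with the usual Faster R-CNN box parameterization: center offsets are normalized by the anchor size and sizes are encoded as log ratios. The sketch below assumes that parameterization and a (ymin, xmin, ymax, xmax) box layout; the function names are hypothetical and box_coders/faster_rcnn_box_coder.py remains the authoritative reference.

```python
import math
from typing import Tuple

Box = Tuple[float, float, float, float]  # (ymin, xmin, ymax, xmax)


def _center_size(box: Box) -> Tuple[float, float, float, float]:
    ymin, xmin, ymax, xmax = box
    height, width = ymax - ymin, xmax - xmin
    if height <= 0 or width <= 0:
        raise ValueError("boxes must have positive height and width")
    return ymin + height / 2, xmin + width / 2, height, width


def encode(box: Box, anchor: Box, y_scale: float = 10.0, x_scale: float = 10.0,
           height_scale: float = 5.0, width_scale: float = 5.0) -> Box:
    cy, cx, h, w = _center_size(box)
    cya, cxa, ha, wa = _center_size(anchor)
    return (y_scale * (cy - cya) / ha,
            x_scale * (cx - cxa) / wa,
            height_scale * math.log(h / ha),
            width_scale * math.log(w / wa))


def decode(code: Box, anchor: Box, y_scale: float = 10.0, x_scale: float = 10.0,
           height_scale: float = 5.0, width_scale: float = 5.0) -> Box:
    ty, tx, th, tw = code
    cya, cxa, ha, wa = _center_size(anchor)
    h = math.exp(th / height_scale) * ha
    w = math.exp(tw / width_scale) * wa
    cy = ty / y_scale * ha + cya
    cx = tx / x_scale * wa + cxa
    return (cy - h / 2, cx - w / 2, cy + h / 2, cx + w / 2)
```

Larger scale factors amplify the encoded values, which changes the relative weight of center and size errors in a regression loss.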
deepomatic/oef/protos/models/image/detection/flexible_grid_anchor_generator.proto
AnchorGrid
Field |
Type |
Label |
Description |
base_sizes |
float |
repeated |
The base sizes in pixels for each anchor in this anchor layer. |
aspect_ratios |
float |
repeated |
The aspect ratios for each anchor in this anchor layer. |
height_stride |
uint32 |
optional |
The anchor height stride in pixels. |
width_stride |
uint32 |
optional |
The anchor width stride in pixels. |
height_offset |
uint32 |
optional |
The anchor height offset in pixels. Default: 0 |
width_offset |
uint32 |
optional |
The anchor width offset in pixels. Default: 0 |
FlexibleGridAnchorGenerator
Field |
Type |
Label |
Description |
anchor_grid |
AnchorGrid |
repeated |
|
normalize_coordinates |
bool |
optional |
Whether to produce anchors in normalized coordinates. Default: true |
deepomatic/oef/protos/models/image/detection/grid_anchor_generator.proto
GridAnchorGenerator
Configuration proto for GridAnchorGenerator. See
anchor_generators/grid_anchor_generator.py for details.
Field |
Type |
Label |
Description |
height |
int32 |
optional |
Anchor height in pixels. Default: 256 |
width |
int32 |
optional |
Anchor width in pixels. Default: 256 |
height_stride |
int32 |
optional |
Anchor stride in height dimension in pixels. Default: 16 |
width_stride |
int32 |
optional |
Anchor stride in width dimension in pixels. Default: 16 |
height_offset |
int32 |
optional |
Anchor height offset in pixels. Default: 0 |
width_offset |
int32 |
optional |
Anchor width offset in pixels. Default: 0 |
scales |
float |
repeated |
List of scales for the anchors. |
aspect_ratios |
float |
repeated |
List of aspect ratios for the anchors. |
deepomatic/oef/protos/models/image/detection/keypoint_box_coder.proto
KeypointBoxCoder
Configuration proto for KeypointBoxCoder. See
box_coders/keypoint_box_coder.py for details.
Field |
Type |
Label |
Description |
num_keypoints |
int32 |
optional |
|
y_scale |
float |
optional |
Scale factor for anchor encoded box center and keypoints. Default: 10 |
x_scale |
float |
optional |
Default: 10 |
height_scale |
float |
optional |
Scale factor for anchor encoded box height. Default: 5 |
width_scale |
float |
optional |
Scale factor for anchor encoded box width. Default: 5 |
deepomatic/oef/protos/models/image/detection/matcher.proto
Matcher
Configuration proto for the matcher to be used in the object detection
pipeline. See core/matcher.py for details.
deepomatic/oef/protos/models/image/detection/mean_stddev_box_coder.proto
MeanStddevBoxCoder
Configuration proto for MeanStddevBoxCoder. See
box_coders/mean_stddev_box_coder.py for details.
Field |
Type |
Label |
Description |
stddev |
float |
optional |
The standard deviation used to encode and decode boxes. Default: 0.01 |
deepomatic/oef/protos/models/image/detection/multiscale_anchor_generator.proto
MultiscaleAnchorGenerator
Configuration proto for RetinaNet anchor generator described in
https://arxiv.org/abs/1708.02002. See
anchor_generators/multiscale_grid_anchor_generator.py for details.
Field |
Type |
Label |
Description |
min_level |
int32 |
optional |
Minimum level in the feature pyramid. Default: 3 |
max_level |
int32 |
optional |
Maximum level in the feature pyramid. Default: 7 |
anchor_scale |
float |
optional |
Scale of the anchor relative to the feature stride. Default: 4 |
aspect_ratios |
float |
repeated |
Aspect ratios for anchors at each grid point. |
scales_per_octave |
int32 |
optional |
Number of intermediate scales per scale octave. Default: 2 |
normalize_coordinates |
bool |
optional |
Whether to produce anchors in normalized coordinates. Default: true |
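The following sketch shows how these fields could combine into anchor shapes per pyramid level. It assumes the RetinaNet convention from the linked paper: level l has feature stride 2**l, the base anchor size is anchor_scale times the stride, and each octave is split into scales_per_octave steps. The function name is hypothetical, aspect ratio is assumed to mean width / height, and the aspect ratios in the usage note are example values, not defaults from the proto.

```python
import math
from typing import Dict, List, Sequence, Tuple


def multiscale_anchor_shapes(
    aspect_ratios: Sequence[float],
    min_level: int = 3,
    max_level: int = 7,
    anchor_scale: float = 4.0,
    scales_per_octave: int = 2,
) -> Dict[int, List[Tuple[float, float]]]:
    """Return {level: [(height, width), ...]} in pixels."""
    shapes: Dict[int, List[Tuple[float, float]]] = {}
    for level in range(min_level, max_level + 1):
        stride = 2 ** level
        base_size = anchor_scale * stride
        level_shapes = []
        for step in range(scales_per_octave):
            # Intermediate scales split one octave (a factor of 2) evenly in log space.
            size = base_size * 2 ** (step / scales_per_octave)
            for ratio in aspect_ratios:
                root = math.sqrt(ratio)
                level_shapes.append((size / root, size * root))
        shapes[level] = level_shapes
    return shapes
```

For example, multiscale_anchor_shapes([1.0, 2.0, 0.5]) gives six anchor shapes per level, starting from 32 x 32 pixels at level 3.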
deepomatic/oef/protos/models/image/detection/post_processing.proto
BatchNonMaxSuppression
Configuration proto for non-max-suppression operation on a batch of
detections.
Field |
Type |
Label |
Description |
score_threshold |
float |
optional |
Scalar threshold for score (low scoring boxes are removed). Default: 0 |
iou_threshold |
float |
optional |
Scalar threshold for IOU (boxes that have high IOU overlap with previously selected boxes are removed). Default: 0.6 |
max_detections_per_class |
int32 |
optional |
Maximum number of detections to retain per class. Default: 100 |
max_total_detections |
int32 |
optional |
Maximum number of detections to retain across all classes. Default: 100 |
use_static_shapes |
bool |
optional |
Whether to use the implementation of NMS that guarantees static shapes. Default: false |
use_class_agnostic_nms |
bool |
optional |
Whether to use class agnostic NMS. Class-agnostic NMS function implements a class-agnostic version of Non Maximal Suppression where if max_classes_per_detection=k, 1) we keep the top-k scores for each detection and 2) during NMS, each detection only uses the highest class score for sorting. 3) Compared to regular NMS, the worst runtime of this version is O(N^2) instead of O(KN^2) where N is the number of detections and K the number of classes. Default: false |
use_combined_nms |
bool |
optional |
Whether to use tf.image.combined_non_max_suppression. Default: false |
change_coordinate_frame |
bool |
optional |
Whether to change coordinate frame of the boxlist to be relative to window's frame. Default: true |
use_hard_nms |
bool |
optional |
Use hard NMS. Note that even if this field is set to false, the behavior of NMS will be equivalent to hard NMS. When set to true, this field forces the tf.image.non_max_suppression function to be called instead of tf.image.non_max_suppression_with_scores and can be used to export models for older versions of TF. Default: false |
use_cpu_nms |
bool |
optional |
Use CPU NMS. NMSV3/NMSV4 by default runs on GPU, which may cause OOM issues if the model is large and/or the batch size is large during training. Setting this flag to true moves the NMS op to the CPU, which helps when OOM happens. The flag is not needed if use_hard_nms = false, as soft NMS currently runs on CPU by default. Default: false |
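To make score_threshold, iou_threshold and the detection cap concrete, here is a minimal single-class greedy (hard) NMS sketch in plain Python. It is not the TensorFlow implementation: the function names are hypothetical, boxes are assumed to be (ymin, xmin, ymax, xmax), and the strict comparisons (score > threshold kept, IOU > threshold suppressed) are assumptions.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (ymin, xmin, ymax, xmax)


def iou(a: Box, b: Box) -> float:
    inter_h = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_w = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_h * inter_w
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def hard_nms(
    boxes: Sequence[Box],
    scores: Sequence[float],
    score_threshold: float = 0.0,
    iou_threshold: float = 0.6,
    max_detections: int = 100,
) -> List[int]:
    """Return indices of kept boxes, highest score first."""
    # Drop low-scoring boxes, then visit the rest in descending score order.
    order = [i for i, s in enumerate(scores) if s > score_threshold]
    order.sort(key=lambda i: scores[i], reverse=True)
    kept: List[int] = []
    for i in order:
        if len(kept) >= max_detections:
            break
        # Keep a box only if it does not overlap too much with any kept box.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in kept):
            kept.append(i)
    return kept
```

Per-class NMS would run this once per class and then trim the combined result to max_total_detections.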
PostProcessing
Configuration proto for post-processing predicted boxes and
scores.
Field |
Type |
Label |
Description |
batch_non_max_suppression |
BatchNonMaxSuppression |
optional |
Non max suppression parameters. |
score_converter |
PostProcessing.ScoreConverter |
optional |
Score converter to use. Default: IDENTITY |
logit_scale |
float |
optional |
Scale logit (input) value before conversion in post-processing step. Typically used for softmax distillation, though can be used to scale for other reasons. Default: 1 |
calibration_config |
CalibrationConfig |
optional |
Calibrate score outputs. Calibration is applied after score converter and before non max suppression. |
PostProcessing.ScoreConverter
Enum to specify how to convert the detection scores.
Name |
Number |
Description |
IDENTITY |
0 |
Output scores equal input scores. |
SIGMOID |
1 |
Applies a sigmoid on input scores. |
SOFTMAX |
2 |
Applies a softmax on input scores. |
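The three score converters, combined with logit_scale, can be sketched as follows. The function name is hypothetical, and dividing the logits by logit_scale before conversion (a temperature-style scaling) is an assumption about what "scale logit value before conversion" means.

```python
import math
from typing import List, Sequence


def convert_scores(
    logits: Sequence[float], converter: str = "IDENTITY", logit_scale: float = 1.0
) -> List[float]:
    """Scale the logits, then apply the selected score converter."""
    if logit_scale == 0:
        raise ValueError("logit_scale must be non-zero")
    scaled = [x / logit_scale for x in logits]  # assumed: division by the scale
    if converter == "IDENTITY":
        return scaled
    if converter == "SIGMOID":
        return [1.0 / (1.0 + math.exp(-x)) for x in scaled]
    if converter == "SOFTMAX":
        if not scaled:
            return []
        # Subtract the maximum for numerical stability.
        peak = max(scaled)
        exps = [math.exp(x - peak) for x in scaled]
        total = sum(exps)
        return [e / total for e in exps]
    raise ValueError(f"unknown score converter: {converter}")
```

SIGMOID treats each class independently, while SOFTMAX produces scores that sum to 1 across the classes of one detection.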
deepomatic/oef/protos/models/image/detection/region_similarity_calculator.proto
IoaSimilarity
Configuration for intersection-over-area (IOA) similarity calculator.
IouSimilarity
Configuration for intersection-over-union (IOU) similarity calculator.
NegSqDistSimilarity
Configuration for negative squared distance similarity calculator.
RegionSimilarityCalculator
Configuration proto for region similarity calculators. See
core/region_similarity_calculator.py for details.
ThresholdedIouSimilarity
Configuration for thresholded-intersection-over-union similarity calculator.
Field |
Type |
Label |
Description |
iou_threshold |
float |
optional |
IOU threshold used for filtering scores. Default: 0.5 |
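The IOU and IOA similarities named in this file differ only in their denominator, as the sketch below shows. The function names and the (ymin, xmin, ymax, xmax) box layout are assumptions, and for IOA the choice of the second box's area as denominator is also an assumption; core/region_similarity_calculator.py remains the reference.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (ymin, xmin, ymax, xmax)


def area(box: Box) -> float:
    return max(0.0, box[2] - box[0]) * max(0.0, box[3] - box[1])


def intersection(a: Box, b: Box) -> float:
    return area((max(a[0], b[0]), max(a[1], b[1]), min(a[2], b[2]), min(a[3], b[3])))


def iou(a: Box, b: Box) -> float:
    """Intersection over union: symmetric in a and b."""
    inter = intersection(a, b)
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0


def ioa(a: Box, b: Box) -> float:
    """Intersection over the area of b (assumed convention): not symmetric."""
    denom = area(b)
    return intersection(a, b) / denom if denom > 0 else 0.0
```

A small box fully contained in a large one has an IOA of 1 with respect to the small box, while its IOU stays well below 1.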
deepomatic/oef/protos/models/image/detection/square_box_coder.proto
SquareBoxCoder
Configuration proto for SquareBoxCoder. See
box_coders/square_box_coder.py for details.
Field |
Type |
Label |
Description |
y_scale |
float |
optional |
Scale factor for anchor encoded box center. Default: 10 |
x_scale |
float |
optional |
Default: 10 |
length_scale |
float |
optional |
Scale factor for anchor encoded box length. Default: 5 |
deepomatic/oef/protos/models/image/detection/ssd_anchor_generator.proto
SsdAnchorGenerator
Configuration proto for SSD anchor generator described in
https://arxiv.org/abs/1512.02325. See
anchor_generators/multiple_grid_anchor_generator.py for details.
Field |
Type |
Label |
Description |
num_layers |
int32 |
optional |
Number of grid layers to create anchors for. Default: 6 |
min_scale |
float |
optional |
Scale of anchors corresponding to finest resolution. Default: 0.2 |
max_scale |
float |
optional |
Scale of anchors corresponding to coarsest resolution Default: 0.95 |
scales |
float |
repeated |
Can be used to override min_scale->max_scale, with an explicitly defined set of scales. If empty, then min_scale->max_scale is used. |
aspect_ratios |
float |
repeated |
Aspect ratios for anchors at each grid point. |
interpolated_scale_aspect_ratio |
float |
optional |
When this aspect ratio is greater than 0, then an additional anchor, with an interpolated scale is added with this aspect ratio. Default: 1 |
reduce_boxes_in_lowest_layer |
bool |
optional |
Whether to use the following aspect ratio and scale combinations for the layer with the finest resolution: (scale=0.1, aspect_ratio=1.0), (scale=min_scale, aspect_ratio=2.0), (scale=min_scale, aspect_ratio=0.5). Default: true |
base_anchor_height |
float |
optional |
The base anchor size in height dimension. Default: 1 |
base_anchor_width |
float |
optional |
The base anchor size in width dimension. Default: 1 |
height_stride |
int32 |
repeated |
Anchor stride in height dimension in pixels for each layer. The length of this field is expected to be equal to the value of num_layers. |
width_stride |
int32 |
repeated |
Anchor stride in width dimension in pixels for each layer. The length of this field is expected to be equal to the value of num_layers. |
height_offset |
int32 |
repeated |
Anchor height offset in pixels for each layer. The length of this field is expected to be equal to the value of num_layers. |
width_offset |
int32 |
repeated |
Anchor width offset in pixels for each layer. The length of this field is expected to be equal to the value of num_layers. |
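A short sketch of how num_layers, min_scale, max_scale and scales might translate into one anchor scale per layer. It assumes the scales are spaced linearly from min_scale to max_scale, as in the linked SSD paper, and that the interpolated scale is the geometric mean of a layer's scale and the next one (with 1.0 after the last layer). Both function names are hypothetical, and the details may differ from anchor_generators/multiple_grid_anchor_generator.py.

```python
import math
from typing import List, Sequence


def ssd_layer_scales(
    num_layers: int = 6,
    min_scale: float = 0.2,
    max_scale: float = 0.95,
    scales: Sequence[float] = (),
) -> List[float]:
    """One anchor scale per layer, from finest to coarsest resolution."""
    if scales:
        # An explicit list overrides the min_scale -> max_scale range.
        return list(scales)
    if num_layers < 1:
        raise ValueError("num_layers must be at least 1")
    if num_layers == 1:
        return [min_scale]
    step = (max_scale - min_scale) / (num_layers - 1)
    return [min_scale + step * i for i in range(num_layers)]


def interpolated_scales(layer_scales: Sequence[float]) -> List[float]:
    """Geometric mean of each scale and the next one (assumed: 1.0 after the last)."""
    following = list(layer_scales[1:]) + [1.0]
    return [math.sqrt(s * n) for s, n in zip(layer_scales, following)]
```

With the defaults this yields scales of roughly 0.2, 0.35, 0.5, 0.65, 0.8 and 0.95 for the six layers.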
deepomatic/oef/protos/models/image/utils/hyperparameters.proto
BatchNorm
Configuration proto for batch norm to apply after convolution op. See
https://www.tensorflow.org/api_docs/python/tf/contrib/layers/batch_norm
Field |
Type |
Label |
Description |
decay |
float |
optional |
Default: 0.999 |
center |
bool |
optional |
Default: true |
scale |
bool |
optional |
Default: false |
epsilon |
float |
optional |
Default: 0.001 |
train |
bool |
optional |
Whether to train the batch norm variables. If this is set to false during training, the current values of the batch_norm variables are used for the forward pass but they are never updated. Default: true |
GroupNorm
Configuration proto for group normalization to apply after convolution op.
https://arxiv.org/abs/1803.08494
Hyperparams
Configuration proto for the convolution op hyperparameters
Field |
Type |
Label |
Description |
op |
Hyperparams.Op |
optional |
Default: CONV |
regularizer |
Regularizer |
optional |
Regularizer for the weights of the convolution op. |
initializer |
Initializer |
optional |
Initializer for the weights of the convolution op. |
activation |
Hyperparams.Activation |
optional |
Default: RELU |
batch_norm |
BatchNorm |
optional |
BatchNorm hyperparameters. Note that if neither batch_norm nor group_norm is selected, no normalization is applied. |
group_norm |
GroupNorm |
optional |
GroupNorm hyperparameters. This is only supported on a subset of models. Note that the current implementation of group norm, instantiated in tf.contrib.layers.group_norm(), only supports fixed_size_resizer for image preprocessing. |
regularize_depthwise |
bool |
optional |
Whether depthwise convolutions should be regularized. If this parameter is NOT set then the conv hyperparams will default to the parent scope. Default: false |
Initializer
Proto with one-of field for initializers.
L1Regularizer
Configuration proto for L1 Regularizer.
See https://www.tensorflow.org/api_docs/python/tf/contrib/layers/l1_regularizer
Field |
Type |
Label |
Description |
weight |
float |
optional |
Default: 1 |
L2Regularizer
Configuration proto for L2 Regularizer.
See https://www.tensorflow.org/api_docs/python/tf/contrib/layers/l2_regularizer
Field |
Type |
Label |
Description |
weight |
float |
optional |
Default: 1 |
RandomNormalInitializer
Configuration proto for random normal initializer. See
https://www.tensorflow.org/api_docs/python/tf/random_normal_initializer
Field |
Type |
Label |
Description |
mean |
float |
optional |
Default: 0 |
stddev |
float |
optional |
Default: 1 |
Regularizer
Proto with one-of field for regularizers.
TruncatedNormalInitializer
Configuration proto for truncated normal initializer. See
https://www.tensorflow.org/api_docs/python/tf/truncated_normal_initializer
Field |
Type |
Label |
Description |
mean |
float |
optional |
Default: 0 |
stddev |
float |
optional |
Default: 1 |
VarianceScalingInitializer
Configuration proto for variance scaling initializer. See
https://www.tensorflow.org/api_docs/python/tf/contrib/layers/variance_scaling_initializer
Hyperparams.Activation
Type of activation to apply after convolution.
Name |
Number |
Description |
NONE |
0 |
Use None (no activation) |
RELU |
1 |
Use tf.nn.relu |
RELU_6 |
2 |
Use tf.nn.relu6 |
Hyperparams.Op
Operations affected by hyperparameters.
Name |
Number |
Description |
CONV |
1 |
Convolution, Separable Convolution, Convolution transpose. |
FC |
2 |
Fully connected |
VarianceScalingInitializer.Mode
Name |
Number |
Description |
FAN_IN |
0 |
|
FAN_OUT |
1 |
|
FAN_AVG |
2 |
|
Scalar Value Types
.proto Type |
Notes |
C++ |
Java |
Python |
Go |
C# |
PHP |
Ruby |
double |
|
double |
double |
float |
float64 |
double |
float |
Float |
float |
|
float |
float |
float |
float32 |
float |
float |
Float |
int32 |
Uses variable-length encoding. Inefficient for encoding negative numbers – if your field is likely to have negative values, use sint32 instead. |
int32 |
int |
int |
int32 |
int |
integer |
Bignum or Fixnum (as required) |
int64 |
Uses variable-length encoding. Inefficient for encoding negative numbers – if your field is likely to have negative values, use sint64 instead. |
int64 |
long |
int/long |
int64 |
long |
integer/string |
Bignum |
uint32 |
Uses variable-length encoding. |
uint32 |
int |
int/long |
uint32 |
uint |
integer |
Bignum or Fixnum (as required) |
uint64 |
Uses variable-length encoding. |
uint64 |
long |
int/long |
uint64 |
ulong |
integer/string |
Bignum or Fixnum (as required) |
sint32 |
Uses variable-length encoding. Signed int value. These more efficiently encode negative numbers than regular int32s. |
int32 |
int |
int |
int32 |
int |
integer |
Bignum or Fixnum (as required) |
sint64 |
Uses variable-length encoding. Signed int value. These more efficiently encode negative numbers than regular int64s. |
int64 |
long |
int/long |
int64 |
long |
integer/string |
Bignum |
fixed32 |
Always four bytes. More efficient than uint32 if values are often greater than 2^28. |
uint32 |
int |
int |
uint32 |
uint |
integer |
Bignum or Fixnum (as required) |
fixed64 |
Always eight bytes. More efficient than uint64 if values are often greater than 2^56. |
uint64 |
long |
int/long |
uint64 |
ulong |
integer/string |
Bignum |
sfixed32 |
Always four bytes. |
int32 |
int |
int |
int32 |
int |
integer |
Bignum or Fixnum (as required) |
sfixed64 |
Always eight bytes. |
int64 |
long |
int/long |
int64 |
long |
integer/string |
Bignum |
bool |
|
bool |
boolean |
boolean |
bool |
bool |
boolean |
TrueClass/FalseClass |
string |
A string must always contain UTF-8 encoded or 7-bit ASCII text. |
string |
String |
str/unicode |
string |
string |
string |
String (UTF-8) |
bytes |
May contain any arbitrary sequence of bytes. |
string |
ByteString |
str |
[]byte |
ByteString |
string |
String (ASCII-8BIT) |