Protocol Documentation

Table of Contents

Top

deepomatic/oef/protos/dataoperation.proto

DataOperation

An operation that can be applied to the whole dataset before training.

Field Type Label Description
loss_based_balancing LossBasedBalancing optional The loss based balancing operation.

LossBasedBalancing

A class-balancing operation that will duplicate elements of less represented classes.

Field Type Label Description
batch_size float optional Number of data points considered before checking for entropy stability. If lower than or equal to 1, this represents the fraction of the dataset to consider. Default: 0.01
epsilon float optional The value below which entropy is considered stable. Default: 0.0001
max_dataset_expansion float optional Multiple of the dataset size allowed. Default: 2
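For illustration, a DataOperation enabling loss-based balancing could be written in protobuf text format as the sketch below; the numeric values simply restate the defaults above and are not a recommendation.

    # DataOperation sketch (hypothetical values, mirroring the defaults)
    loss_based_balancing {
      batch_size: 0.01            # <= 1, so interpreted as a fraction of the dataset
      epsilon: 0.0001             # entropy is considered stable below this value
      max_dataset_expansion: 2.0  # never grow the dataset beyond 2x its original size
    }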

Top

deepomatic/oef/protos/dataset.proto

Dataset related messages.

Dataset

A dataset pointer: you give the path to the root of all data sources and the path to an object of type DatasetDump.

Field Type Label Description
root string optional Root of all the data sources. May be a Google Storage path.
config_path string required Path of the dataset config file.
operations deepomatic.oef.dataoperation.DataOperation repeated A set of optional operations to be applied to the dataset (filter, balancing, …).
margin_crop float optional Option to add a margin when images are cropped; the value added is a fraction of the minimum dimension, in [0.0, 1.0]. Default: 0
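As a hedged sketch, a Dataset message in protobuf text format might look like the following; the root and config path are hypothetical.

    root: "gs://my-bucket/datasets/fruits"     # hypothetical Google Storage root
    config_path: "dataset_header.json"         # hypothetical dataset config file
    margin_crop: 0.1                           # add 10% of the minimum dimension as margin on crops
    operations {
      loss_based_balancing { }                 # apply class balancing with default settings
    }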

Top

deepomatic/oef/protos/dataset_dump_v1.proto

Dataset related messages. DEPRECATED! Please use dataset_dump_v2

Annotation

Stores the value of an annotation.

Field Type Label Description
name string   Refers to the name of the annotation specification.
bool_value bool   A boolean value.
string_value string   A string value.
int_value int32   An integer value.
float_value float   A float value.
distribution_value LabelDistribution   A list of probabilities per label.

AnnotationSpecification

Specification for a concept.

Field Type Label Description
tag_spec TagAnnotationSpecification   Annotation of type LABEL.

DataPoint

A data-point is a data source augmented with its annotations.

Field Type Label Description
input_data InputData repeated The data sources.
meta google.protobuf.Struct   User-defined meta information.

DatasetDump

A dataset

Field Type Label Description
specifications Specification   The dataset specification.
splits Splits   The dataset splits.
data_points DatasetDump.DataPointsEntry repeated All the data points associated with their ID.

DatasetDump.DataPointsEntry

Field Type Label Description
key string    
value DataPoint    

ImageBBox

A bounding box region.

Field Type Label Description
xmin float   Left x-coordinate.
ymin float   Top y-coordinate.
xmax float   Right x-coordinate.
ymax float   Bottom y-coordinate.

InputData

A data-source that would be available at inference time.

Field Type Label Description
name string   The data source name (must match one of the data source names declared in the specifications).
value string   Unused for now, it should rather be of type Annotation.
value_from string   Use this to load the data source from a file. The path should be relative to the dataset root.
regions Region repeated List of data regions.

InputDataSpecification

Specification for a data source (e.g. an image).

Field Type Label Description
name string   Data source name (in order to match the specification to the data annotation).
type InputDataSpecification.InputDataType   Data source type.
region_annotations AnnotationSpecification repeated List of region related specifications [deprecated].

LabelDistribution

A probability distribution over labels.

Field Type Label Description
probas LabelDistributionEntry repeated a dict of probabilities: {label: value}.

LabelDistributionEntry

An entry of a probability distribution.

Field Type Label Description
name string   The label name.
index int32   The label index. It does not have to be 0-based, it might be anything that suits your needs.
proba float   Probability of this label.

NoRegion

A dummy region that refers to the whole data source.

Region

A data-source region (including the dummy NoRegion).

Field Type Label Description
type Region.RegionType   Type of region (depends on the data source and the Model.model_type) (TODO: should be a one-of)
annotations Annotation repeated Stores all the annotations of the region.
none NoRegion   A dummy region that actually refers to the whole data source.
image_bbox ImageBBox   A 2D bounding box.
id string   The id of that Region
parent_id string   Refers to the id of a parent Region

Specification

Specification for a data point.

Field Type Label Description
input_data InputDataSpecification repeated List of all the possible data source for a single data point.
annotations AnnotationSpecification repeated List of all the possible concepts for a single data point.

Split

Stores a dataset split.

Field Type Label Description
ids SplitByIds   Use this to define a split by IDs.
@exclude Unused for now - should be used in conjunction with reproducible below. float pct = 2;

SplitByIds

Defines a dataset split with the list of data point IDs.

Field Type Label Description
ids string repeated List of IDs that belong to this split.

Splits

Stores all the dataset splits.

Field Type Label Description
split_map Splits.SplitMapEntry repeated Map of all the dataset splits.
@exclude Unused for now - should be used in conjunction with pct above. bool reproducible = 2;

Splits.SplitMapEntry

Field Type Label Description
key string    
value Split    

TagAnnotationSpecification

Specifies a tag ID and name for a LABEL related annotation.

Field Type Label Description
id int32   Unique deepomatic studio ID.
tag_name string   Tag name.

InputDataSpecification.InputDataType

Name Number Description
IMAGE 0  

Region.RegionType

Type of region (depends on the data source and the Model.model_type).

Name Number Description
NONE 0  
IMAGE_BBOX 1  

Top

deepomatic/oef/protos/dataset_dump_v2.proto

Dataset related messages.

Annotation

Stores the annotation.

Field Type Label Description
view_id string   view ID to which this annotation belongs
concepts Concepts   Concept annotation
text string   Text annotation
mask Mask   Mask annotation
keypoints Keypoints   Keypoints annotation

ConceptAnnotation

Specifies list of concepts.

Field Type Label Description
concepts ConceptAnnotation.Concept repeated List of concepts

ConceptAnnotation.Concept

Specifies a tag ID and name for a LABEL related annotation.

Field Type Label Description
id string   Unique deepomatic studio ID (needed for studio and vulcan).
name string   Concept name.

Concepts

A concept annotation.

Field Type Label Description
concept_ids string repeated list of concept IDs. Should be of length 1 for single-label annotations

DataPoint

A data-point is a data source augmented with its region annotations.

Field Type Label Description
id string   data point ID
url string   URL to a file. The path should be relative to the dataset root.
metadata google.protobuf.Struct   User-defined metadata information.
regions Region repeated List of data regions.
splits string repeated Splits

DatasetHeader

A dataset header. Contains a list of views and splits.

Field Type Label Description
views View repeated The dataset views.
splits string repeated The dataset splits.
name string   Name of the dataset

KeypointAnnotation

Specifies a list of keypoint nodes and skeleton (node: [edges] for each node).

Field Type Label Description
keypoints KeypointAnnotation.Keypoint repeated List of keypoints

KeypointAnnotation.Keypoint

Keypoint definition

Field Type Label Description
id string   node id
name string   node name
edges string repeated node edges

Keypoints

A keypoint annotation.

Field Type Label Description
keypoints Keypoints.Keypoint repeated list of keypoints

Keypoints.Keypoint

A keypoint entry.

Field Type Label Description
concept_id string   Concept ID
x float   x-position
y float   y-position
is_visible bool   Whether the keypoint is visible. The coordinates can be non-zero (labeled) even if the point is not visible.

Mask

A mask annotation.

Field Type Label Description
concept_id string   Concept ID
polygons Mask.Polygons   polygons
is_thing bool   Whether the mask is a "thing" (countable object) or "stuff" (amorphous region).

Mask.Polygons

A polygon annotation

Field Type Label Description
polygons Mask.Polygons.Polygon repeated A list of polygons. An object may be partially occluded and need more than one polygon to be described.

Mask.Polygons.Polygon

Single Polygon with a list of vertices

Field Type Label Description
vertices Mask.Polygons.Polygon.Vertex repeated List of vertices of the polygon

Mask.Polygons.Polygon.Vertex

Vertex - a point of the polygon contour

Field Type Label Description
x float   x position
y float   y position

Region

A data-source region (including the dummy WholeData).

Field Type Label Description
annotations Annotation repeated Stores all the annotations of the region.
whole_data Region.WholeData   A dummy region that actually refers to the whole data source.
image_bbox Region.ImageBBox   A 2D bounding box.
id string   The id of that Region
parent_id string   Refers to the id of a parent Region

Region.ImageBBox

A bounding box region.

Field Type Label Description
xmin float   Left x-coordinate.
ymin float   Top y-coordinate.
xmax float   Right x-coordinate.
ymax float   Bottom y-coordinate.

Region.WholeData

A dummy region that refers to the whole data source.
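Putting the v2 messages together, a single annotated data point with one bounding-box region could be sketched in protobuf text format as follows; IDs, URL, view and concept names are hypothetical.

    # DataPoint sketch (hypothetical example)
    id: "dp-0001"
    url: "images/0001.jpg"                     # relative to the dataset root
    splits: "train"
    regions {
      id: "r1"
      image_bbox { xmin: 0.1 ymin: 0.2 xmax: 0.6 ymax: 0.8 }
      annotations {
        view_id: "view-1"                      # must match a View id in the DatasetHeader
        concepts { concept_ids: "concept-apple" }
      }
    }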

TextAnnotation

Specifies a list of allowed characters and if word recognition is wanted.

Field Type Label Description
allowed_characters string repeated allowed characters.
use_word_recognition bool   use auto-regression (for word recognition)

View

Field Type Label Description
id string   Unique deepomatic studio ID. It should match experiment/view_ids.
parent_id string   Unique deepomatic studio ID of the parent view.
concepts ConceptAnnotation   Annotation of type LABEL.
text TextAnnotation   Annotation of type TEXT
keypoints KeypointAnnotation   Annotation of type KEYPOINTS
type View.ViewType   View type
conditions View.Condition repeated Condition as a list of lists: condition = [] => no condition; condition = [[]] => without concept; condition = [[a], [b]] => a OR b; condition = [[a, b]] => a AND b
name string   Name of view

View.Condition

FIXME: This might handle other types as strings

Field Type Label Description
condition string repeated View condition

View.ViewType

Specifies the type of the view

Name Number Description
undefined 0 Default value and unsupported: ViewType has to be set
classification 1 Multi-class classification view
tagging 2 Multi-label classification view
detection 3 Detection view
ocr 4 OCR view
segmentation 5 Segmentation view
keypoint 6 Keypoint view
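For example, a DatasetHeader describing a single detection view with two concepts could be sketched as follows (names and IDs are hypothetical):

    name: "fruits"
    splits: "train"
    splits: "val"
    views {
      id: "view-1"
      name: "fruit detection"
      type: detection
      concepts {
        concepts { id: "concept-apple" name: "apple" }
        concepts { id: "concept-pear" name: "pear" }
      }
    }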

Top

deepomatic/oef/protos/experiment.proto

Experiment related messages.

Experiment

This is the main message to define an experiment.

Field Type Label Description
seed int32 optional Seed used to make the experiment reproducible. Default: 0
dataset deepomatic.oef.dataset.Dataset required The dataset
trainer deepomatic.oef.trainer.Trainer required The model
hyperparameters Experiment.HyperparametersEntry repeated The hyper-parameters. Maps experiment parameters to a hyper-parameter sampling distribution. The key has to start with trainer., e.g., trainer.initial_learning_rate
max_hp_runs int32 optional Maximum number of hyper-parameter runs (default = 1 for standard training) Default: 1
view_ids string repeated List of views on which to train. It should match DatasetHeader/views ids. If this parameter is empty, the default view id used will be the first of DatasetHeader/views.
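As a hedged illustration, a minimal Experiment in protobuf text format could look like the sketch below (paths and IDs are hypothetical; the hyperparameters map is illustrated after hyperparameter.proto below):

    seed: 42
    dataset {
      root: "gs://my-bucket/datasets/fruits"   # hypothetical root
      config_path: "dataset_header.json"
    }
    trainer {
      batch_size: 32
      num_train_epochs: 6
    }
    view_ids: "view-1"                         # train on a single view
    max_hp_runs: 1                             # standard training, no hyper-parameter search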

Experiment.HyperparametersEntry

Field Type Label Description
key string optional  
value deepomatic.oef.hyperparameter.HyperParameter optional  

Top

deepomatic/oef/protos/hyperparameter.proto

CategoricalDistribution

Field Type Label Description
values Value repeated  

HyperParameter

Field Type Label Description
categorical CategoricalDistribution optional  
uniform UniformDistribution optional  
log_uniform LogarithmicUniformDistribution optional  
normal NormalDistribution optional  

LogarithmicUniformDistribution

Field Type Label Description
min float required  
max float required  

NormalDistribution

Field Type Label Description
mu float required  
sigma float required  

UniformDistribution

Field Type Label Description
min float required  
max float required  

Value

Field Type Label Description
integer_value int32 optional  
float_value float optional  
boolean_value bool optional  
string_value string optional  
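For example, Experiment.hyperparameters entries sampling the initial learning rate from a log-uniform distribution and the batch size from a categorical distribution could be sketched as follows (trainer.batch_size is a hypothetical example of the trainer.-prefixed key format):

    hyperparameters {
      key: "trainer.initial_learning_rate"
      value { log_uniform { min: 0.0001 max: 0.1 } }
    }
    hyperparameters {
      key: "trainer.batch_size"                # hypothetical key
      value {
        categorical {
          values { integer_value: 16 }
          values { integer_value: 32 }
        }
      }
    }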

File-level Extensions

Extension Type Base Number Description
field_option HyperParameter .google.protobuf.FieldOptions 1000
oneof_option HyperParameter .google.protobuf.OneofOptions 1000

Top

deepomatic/oef/protos/losses.proto

BootstrappedSigmoidClassificationLoss

Classification loss using a sigmoid function over the class prediction with the highest prediction score.

Field Type Label Description
alpha float optional Interpolation weight between 0 and 1.
hard_bootstrap bool optional Whether hard bootstrapping should be used or not. If true, only the class favored by the model will be used. Otherwise, all predicted class probabilities will be used. Default: false
anchorwise_output bool optional DEPRECATED, do not use. Output loss per anchor. Default: false

ClassificationLoss

Configuration for class prediction loss function.

Field Type Label Description
weighted_sigmoid WeightedSigmoidClassificationLoss optional  
weighted_softmax WeightedSoftmaxClassificationLoss optional  
weighted_logits_softmax WeightedSoftmaxClassificationAgainstLogitsLoss optional  
bootstrapped_sigmoid BootstrappedSigmoidClassificationLoss optional  
weighted_sigmoid_focal SigmoidFocalClassificationLoss optional  

HardExampleMiner

Configuration for hard example miner.

Field Type Label Description
num_hard_examples int32 optional Maximum number of hard examples to be selected per image (prior to enforcing max negative to positive ratio constraint). If set to 0, all examples obtained after NMS are considered. Default: 64
iou_threshold float optional Minimum intersection over union for an example to be discarded during NMS. Default: 0.7
loss_type HardExampleMiner.LossType optional Default: BOTH
max_negatives_per_positive int32 optional Maximum number of negatives to retain for each positive anchor. If max_negatives_per_positive is 0, no pre-specified negative:positive ratio is enforced. Default: 0
min_negatives_per_image int32 optional Minimum number of negative anchors to sample for a given image. Setting this to a positive number allows sampling negatives in an image without any positive anchors, and thus does not bias the model towards having at least one detection per image. Default: 0

LocalizationLoss

Configuration for bounding box localization loss function.

Field Type Label Description
weighted_l2 WeightedL2LocalizationLoss optional  
weighted_smooth_l1 WeightedSmoothL1LocalizationLoss optional  
weighted_iou WeightedIOULocalizationLoss optional  

Loss

Message for configuring the localization loss, classification loss and hard example miner used for training object detection models. See core/losses.py for details

Field Type Label Description
localization_loss LocalizationLoss optional Localization loss to use.
classification_loss ClassificationLoss optional Classification loss to use.
hard_example_miner HardExampleMiner optional If not left to default, applies hard example mining.
classification_weight float optional Classification loss weight. Default: 1
localization_weight float optional Localization loss weight. Default: 1
random_example_sampler RandomExampleSampler optional If not left to default, applies random example sampling.
equalization_loss Loss.EqualizationLoss optional  
expected_loss_weights Loss.ExpectedLossWeights optional Method to compute expected loss weights with respect to balanced positive/negative sampling scheme. If NONE, use explicit sampling. TODO(birdbrain): Move under ExpectedLossWeights. Default: NONE
min_num_negative_samples float optional Minimum number of effective negative samples. Only applies if expected_loss_weights is not NONE. TODO(birdbrain): Move under ExpectedLossWeights. Default: 0
desired_negative_sampling_ratio float optional Desired number of effective negative samples per positive sample. Only applies if expected_loss_weights is not NONE. TODO(birdbrain): Move under ExpectedLossWeights. Default: 3
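As a hedged sketch, a detection Loss combining a smooth L1 localization loss, a sigmoid focal classification loss and hard example mining might read:

    localization_loss {
      weighted_smooth_l1 { delta: 1.0 }
    }
    classification_loss {
      weighted_sigmoid_focal { gamma: 2.0 alpha: 0.25 }
    }
    hard_example_miner {
      num_hard_examples: 64
      iou_threshold: 0.7
      loss_type: BOTH
    }
    classification_weight: 1.0
    localization_weight: 1.0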

Loss.EqualizationLoss

Equalization loss.

Field Type Label Description
weight float optional Weight equalization loss strength. Default: 0
exclude_prefixes string repeated When computing equalization loss, ops that start with equalization_exclude_prefixes will be ignored. Only used when equalization_weight > 0.

RandomExampleSampler

Configuration for random example sampler.

Field Type Label Description
positive_sample_fraction float optional The desired fraction of positive samples in batch when applying random example sampling. Default: 0.01

SigmoidFocalClassificationLoss

Sigmoid Focal cross entropy loss as described in https://arxiv.org/abs/1708.02002

Field Type Label Description
anchorwise_output bool optional DEPRECATED, do not use. Default: false
gamma float optional modulating factor for the loss. Default: 2
alpha float optional alpha weighting factor for the loss.

WeightedIOULocalizationLoss

Intersection over union location loss: 1 - IOU

WeightedL2LocalizationLoss

L2 location loss: 0.5 * ||weight * (a - b)|| ^ 2

Field Type Label Description
anchorwise_output bool optional DEPRECATED, do not use. Output loss per anchor. Default: false

WeightedSigmoidClassificationLoss

Classification loss using a sigmoid function over class predictions.

Field Type Label Description
anchorwise_output bool optional DEPRECATED, do not use. Output loss per anchor. Default: false

WeightedSmoothL1LocalizationLoss

SmoothL1 (Huber) location loss. The smooth L1 loss is defined elementwise as 0.5 * x^2 if |x| <= delta, and delta * (|x| - 0.5 * delta) otherwise, where x is the difference between predictions and targets.

Field Type Label Description
anchorwise_output bool optional DEPRECATED, do not use. Output loss per anchor. Default: false
delta float optional Delta value for huber loss. Default: 1
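Written out, the elementwise smooth L1 (Huber) loss described above is, with x the difference between prediction and target:

    \ell_\delta(x) =
      \begin{cases}
        0.5\,x^2 & \text{if } |x| \le \delta \\
        \delta\,(|x| - 0.5\,\delta) & \text{otherwise}
      \end{cases}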

WeightedSoftmaxClassificationAgainstLogitsLoss

Classification loss using a softmax function over class predictions and a softmax function over the groundtruth labels (assumed to be logits).

Field Type Label Description
anchorwise_output bool optional DEPRECATED, do not use. Default: false
logit_scale float optional Scale and softmax groundtruth logits before calculating softmax classification loss. Typically used for softmax distillation with teacher annotations stored as logits. Default: 1

WeightedSoftmaxClassificationLoss

Classification loss using a softmax function over class predictions.

Field Type Label Description
anchorwise_output bool optional DEPRECATED, do not use. Output loss per anchor. Default: false
logit_scale float optional Scale logit (input) value before calculating softmax classification loss. Typically used for softmax distillation. Default: 1

HardExampleMiner.LossType

Whether to use classification losses ('cls', default), localization losses ('loc') or both losses ('both'). In the case of 'both', cls_loss_weight and loc_loss_weight are used to compute weighted sum of the two losses.

Name Number Description
BOTH 0  
CLASSIFICATION 1  
LOCALIZATION 2  

Loss.ExpectedLossWeights

Name Number Description
NONE 0  
EXPECTED_SAMPLING 1 Use expected_classification_loss_by_expected_sampling from third_party/tensorflow_models/object_detection/utils/ops.py
REWEIGHTING_UNMATCHED_ANCHORS 2 Use expected_classification_loss_by_reweighting_unmatched_anchors from third_party/tensorflow_models/object_detection/utils/ops.py

Top

deepomatic/oef/protos/optimizer.proto

AdamOptimizer

Configuration message for the AdamOptimizer See: https://www.tensorflow.org/api_docs/python/tf/train/AdamOptimizer

Field Type Label Description
beta_1 float optional Default: 0.9
beta_2 float optional Default: 0.999
epsilon float optional Default: 1e-08

ConstantLearningRate

Configuration message for a constant learning rate.

CosineDecayLearningRate

Configuration message for a cosine decaying learning rate as defined in object_detection/utils/learning_schedules.py

Field Type Label Description
total_steps_pct float optional Default: 1.07
warmup_learning_rate float optional Default: 0.0002
warmup_steps_pct float optional Default: 0.0025
hold_base_rate_steps_pct float optional Default: 0

ExponentialDecayLearningRate

Configuration message for an exponentially decaying learning rate. See https://www.tensorflow.org/versions/master/api_docs/python/train/decaying_the_learning_rate#exponential_decay

Field Type Label Description
decay_steps_pct float optional Default: 0.006
decay_factor float optional Default: 0.95
staircase bool optional Default: true
burnin_learning_rate float optional Default: 0
burnin_steps_pct float optional Default: 0
min_learning_rate float optional Default: 0

LearningRatePolicy

Configuration message for optimizer learning rate.

Field Type Label Description
constant_learning_rate ConstantLearningRate optional  
exponential_decay_learning_rate ExponentialDecayLearningRate optional  
manual_step_learning_rate ManualStepLearningRate optional  
cosine_decay_learning_rate CosineDecayLearningRate optional  
triangular_cyclical_learning_rate TriangularCyclicalLearningRatePatched optional  
one_cycle_learning_rate OnceCycleLearningRate optional  

ManualStepLearningRate

Configuration message for a manually defined learning rate schedule.

Field Type Label Description
schedule ManualStepLearningRate.LearningRateSchedule repeated  
warmup bool optional Whether to linearly interpolate learning rates for steps in [0, schedule[0].step]. Default: false

ManualStepLearningRate.LearningRateSchedule

Field Type Label Description
step_pct float optional  
learning_rate_factor float optional Default: 0.1
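As an illustration, a LearningRatePolicy using a manual-step schedule that scales the learning rate by 0.1 at 60% of training and by 0.01 at 85% could be sketched as:

    manual_step_learning_rate {
      warmup: true
      schedule { step_pct: 0.6  learning_rate_factor: 0.1 }
      schedule { step_pct: 0.85 learning_rate_factor: 0.01 }
    }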

MomentumOptimizer

Configuration message for the MomentumOptimizer See: https://www.tensorflow.org/api_docs/python/tf/train/MomentumOptimizer

Field Type Label Description
momentum_optimizer_value float optional Default: 0.9

NadamOptimizer

Configuration message for the NadamOptimizer See: https://www.tensorflow.org/versions/r2.3/api_docs/python/tf/keras/optimizers/Nadam

Field Type Label Description
beta_1 float optional Default: 0.9
beta_2 float optional Default: 0.999
epsilon float optional Default: 1e-07

OnceCycleLearningRate

Field Type Label Description
cycle_steps float required Default: 1
min_max_lr_ratio float optional Default: 10

Optimizer

Top level optimizer message.

Field Type Label Description
rms_prop_optimizer RMSPropOptimizer optional  
momentum_optimizer MomentumOptimizer optional  
adam_optimizer AdamOptimizer optional  
nadam_optimizer NadamOptimizer optional  
rectified_adam_optimizer RectifiedAdamOptimizer optional  
yogi_optimizer YogiOptimizer optional  
use_moving_average bool optional Default: false
moving_average_decay float optional Default: 0.9999
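For example, an Optimizer using momentum together with weight moving averages could be sketched as:

    momentum_optimizer {
      momentum_optimizer_value: 0.9
    }
    use_moving_average: true
    moving_average_decay: 0.9999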

RMSPropOptimizer

Configuration message for the RMSPropOptimizer See: https://www.tensorflow.org/api_docs/python/tf/train/RMSPropOptimizer

Field Type Label Description
momentum_optimizer_value float optional Default: 0.9
decay float optional Default: 0.9
epsilon float optional Default: 1e-07

RectifiedAdamOptimizer

Configuration message for the RAdamOptimizer https://www.tensorflow.org/addons/api_docs/python/tfa/optimizers/RectifiedAdam

Field Type Label Description
beta_1 float optional Default: 0.9
beta_2 float optional Default: 0.999
epsilon float optional Default: 1e-07

TriangularCyclicalLearningRatePatched

Field Type Label Description
cycle_steps float required Default: 1
min_max_lr_ratio float optional Default: 10

YogiOptimizer

Configuration message for Yogi optimizer https://www.tensorflow.org/addons/api_docs/python/tfa/optimizers/Yogi

Field Type Label Description
beta_1 float optional Default: 0.9
beta_2 float optional Default: 0.999
epsilon float optional Default: 0.001

Top

deepomatic/oef/protos/trainer.proto

Experiment related messages.

Quantization

Allows quantizing the model weights into int8.

Field Type Label Description
delay int32 optional Number of steps to delay before quantization takes effect during training. Default: 500000
weight_bits int32 optional Number of bits to use for quantizing weights. Only 8 bit is supported for now. Default: 8
activation_bits int32 optional Number of bits to use for quantizing activations. Only 8 bit is supported for now. Default: 8

Trainer

This is the main message to define the training configuration.

Field Type Label Description
inputs deepomatic.oef.models.image.preprocessing.Input repeated Data augmentation options for each input
image_classification deepomatic.oef.models.image.classification.Classification optional A classification model.
image_detection deepomatic.oef.models.image.detection.Detection optional A detection model.
image_ocr deepomatic.oef.models.image.ocr.OCR optional An OCR model
image_segmentation deepomatic.oef.models.image.segmentation.Segmentation optional A segmentation model.
batch_size int32 required Batch size. If not set, a default is chosen according to the selected model.
eval_batch_size int32 optional Batch size for evaluation: set to batch_size if set to non-positive value. Default: 0
num_train_epochs float optional Duration of training, in epochs. Used only if num_train_steps is zero. Default: 6
num_train_steps int32 optional Number of batches processed during training. If zero, the trainer will use num_train_epochs instead. Default: 0
num_eval_steps int32 optional Number of batches processed during evaluation (use zero to run on the whole validation set). Default: 0
add_regularization_loss bool optional Whether to add the loss generated by the regularization function. Default: true
freeze_variables string repeated Variables that should not be updated during training. If update_trainable_variables is not empty, only eliminates the included variables according to freeze_variables patterns.
update_trainable_variables string repeated Variables that should be updated during training. Note that variables which also match the patterns in freeze_variables will be excluded.
pretrained_parameters string optional URL to pretrained parameters
keep_checkpoint_every_n_hours float optional Time interval between two parameter checkpoints, in hours. Default: 1
resume_training bool optional Whether to load all variables, or only those within the feature extractor scopes. If true, the global step will be reset to zero. Default: false
do_not_restore_variables string repeated Variables that should not be restored from a checkpoint when fine-tuning. Typically useful for some convolutions close to the labels, where it may be better to initialize them with random weights. This is not used when resuming training. You can indicate prefixes of parameter names: all parameters starting with any of the prefixes will be skipped.
restore_backbone_weights_only bool optional Set it to true to restore only the weights of the backbone when working with complex meta-architectures like detection. When set to true, it will restore the backbone weights but not the meta-architecture weights. Default: false
initial_learning_rate float optional Initial learning rate
learning_rate_policy deepomatic.oef.optimizer.LearningRatePolicy optional Learning rate policy
optimizer deepomatic.oef.optimizer.Optimizer optional Optimizer type
gradient_clipping_by_norm float optional Max value for a gradient; prevents exploding gradients when backpropagating. Use 0 to deactivate. Default: 10
use_float16 bool optional Use float16 instead of float32 Default: false
quantization Quantization optional Parameters for quantization.
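Tying these fields together, a hedged sketch of a Trainer (model and input sub-messages omitted; they are documented in the model protos further below) might be:

    batch_size: 32
    num_train_epochs: 6
    initial_learning_rate: 0.01
    learning_rate_policy {
      cosine_decay_learning_rate { warmup_steps_pct: 0.0025 }
    }
    optimizer {
      momentum_optimizer { momentum_optimizer_value: 0.9 }
    }
    pretrained_parameters: "gs://my-bucket/pretrained/weights.ckpt"   # hypothetical URL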

Top

deepomatic/oef/protos/models/image/backbones.proto

Backbone

The list of allowed backbones

Field Type Label Description
input deepomatic.oef.models.image.preprocessing.Input required Data augmentation options. Deprecated, prefer using trainer.inputs
width_multiplier float optional A multiplier for the number of channels Default: 1
min_width int32 optional Minimum number of channels Default: 8
hyperparameters deepomatic.oef.models.image.utils.hyperparameters.Hyperparams optional Hyperparameters that may override default backbones values
vgg VGGBackbone optional CustomBackbone custom = 16;
inception InceptionBackbone optional  
inception_resnet InceptionResNetBackbone optional  
resnet ResNetBackbone optional  
mobilenet MobileNetBackbone optional  
nasnet NasNetBackbone optional  
darknet DarknetBackbone optional  
efficientnet EfficientNetBackbone optional  
yolo_v8 YoloV8Backbone optional  
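For example, selecting a ResNet-50 V1 backbone could be sketched as follows; the deprecated but required input field is filled with a minimal resizer (see preprocessing.proto below), and all values are illustrative.

    input {
      image_resizer { fixed_shape_resizer { height: 224 width: 224 } }
    }
    resnet {
      version: V1
      depth: DEPTH_50
    }
    width_multiplier: 1.0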

DarknetBackbone

Field Type Label Description
depth DarknetBackbone.Depth required The backbone variant: Darknet-19 or Darknet-53

EfficientNetBackbone

Field Type Label Description
version EfficientNetBackbone.Version required The backbone variant: EfficientNet-B[0-8] or EfficientNet-L2
survival_prob float optional Default: 0.8

InceptionBackbone

Field Type Label Description
version InceptionBackbone.Version required The implementation version

InceptionResNetBackbone

Field Type Label Description
version InceptionResNetBackbone.Version optional The implementation version Default: V2

MobileNetBackbone

Field Type Label Description
version MobileNetBackbone.Version optional The implementation version Default: V2

NasNetBackbone

Field Type Label Description
version NasNetBackbone.Version optional The implementation version Default: NasNet
depth NasNetBackbone.Depth required The backbone variant: NasNet-Large or NasNet-Mobile

ResNetBackbone

Field Type Label Description
version ResNetBackbone.Version optional The implementation version Default: V2
depth ResNetBackbone.Depth required The backbone variant: ResNet-50, ResNet-101, etc…

VGGBackbone

Field Type Label Description
depth VGGBackbone.Depth required The backbone variant: VGG-16, VGG-19, etc…

YoloV8Backbone

Field Type Label Description
version YoloV8Backbone.Version required The implementation version

DarknetBackbone.Depth

Name Number Description
DEPTH_19 19  
DEPTH_53 53  

EfficientNetBackbone.Version

Name Number Description
B0 0 https://arxiv.org/abs/1905.11946
B1 1 https://arxiv.org/abs/1905.11946
B2 2 https://arxiv.org/abs/1905.11946
B3 3 https://arxiv.org/abs/1905.11946
B4 4 https://arxiv.org/abs/1905.11946
B5 5 https://arxiv.org/abs/1905.11946
B6 6 https://arxiv.org/abs/1905.11946
B7 7 https://arxiv.org/abs/1905.11946
B8 8 https://arxiv.org/abs/1911.09665
L2 10 https://arxiv.org/abs/1911.04252

InceptionBackbone.Version

Name Number Description
V1 1  
V2 2  
V3 3  
V4 4  

InceptionResNetBackbone.Version

Name Number Description
V2 2  

MobileNetBackbone.Version

Name Number Description
V1 1  
V2 2  

NasNetBackbone.Depth

Name Number Description
LARGE 0  
MOBILE 1  

NasNetBackbone.Version

Name Number Description
NasNet 1 https://arxiv.org/abs/1707.07012 The typology of these flags is used to generate human-readable strings in experiment_to_display_name.py: keep the mixed case.
PNasNet 2 https://arxiv.org/abs/1712.00559

ResNetBackbone.Depth

Name Number Description
DEPTH_50 50  
DEPTH_101 101  
DEPTH_152 152 DEPTH_200 = 200;

ResNetBackbone.Version

Name Number Description
V1 1  
V2 2  

VGGBackbone.Depth

Name Number Description
DEPTH_11 11  
DEPTH_16 16  
DEPTH_19 19  

YoloV8Backbone.Version

Name Number Description
Nano 1  
Small 2  
Medium 3  
Large 4  
Extra 5  

Top

deepomatic/oef/protos/models/image/classification.proto

Image related models.

Classification

Classification model

Field Type Label Description
backbone deepomatic.oef.models.image.backbones.Backbone required Select from Inception, Resnet, etc…
label_smoothing float optional If greater than 0 then smooth the labels towards 1/num_classes Default: 0
loss deepomatic.oef.losses.ClassificationLoss required Classification loss function
dropout_keep_prob float required Drop-out keep probability
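A hedged sketch of a Classification model using an EfficientNet-B0 backbone and a weighted softmax loss might be:

    backbone {
      input { image_resizer { fixed_shape_resizer { height: 224 width: 224 } } }
      efficientnet { version: B0 }
    }
    loss {
      weighted_softmax { }
    }
    dropout_keep_prob: 0.8
    label_smoothing: 0.1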

Top

deepomatic/oef/protos/models/image/detection.proto

Image related models.

Detection

Detection model

Field Type Label Description
backbone deepomatic.oef.models.image.backbones.Backbone optional Select from Inception, Resnet, etc…
label_smoothing float optional If greater than 0 then smooth the labels towards 1/num_classes Default: 0
faster_rcnn FasterRCNNMetaArchitecture optional  
rfcn RFCNMetaArchitecture optional  
ssd SSDMetaArchitecture optional  
yolo_v2 YoloV2MetaArchitecture optional  
yolo_v3 YoloV3MetaArchitecture optional  
yolo_v3_keras YoloV3MetaArchitecture optional  
yolo_v3_spp YoloV3MetaArchitecture optional  
yolo_v8 YoloV8MetaArchitecture optional  
efficientdet EfficientDetMetaArchitecture optional next is 25

EfficientDetMetaArchitecture

Training parameters for EfficientDet

Field Type Label Description
activation_function string optional activation function to be used ('swish', 'swish_native', 'relu', 'relu6') Default: swish
min_level int32 optional integer number of minimum level of the output feature pyramid Default: 3
max_level int32 optional integer number of maximum level of the output feature pyramid Default: 7
num_scales int32 optional integer number of intermediate anchor scales added on each level Default: 3
aspect_ratios EfficientDetMetaArchitecture.AspectRatio repeated list of aspect ratio anchors added on each level
anchor_scale float optional float number scale of size of the base anchor to the feature stride 2^level Default: 4
alpha float optional classification loss: focal loss float number weighting factor alpha Default: 0.25
gamma float optional classification loss: focal loss float number focusing parameter gamma Default: 1.5
delta float optional localization loss: huber loss float number transition parameter delta from quadratic to linear function Default: 0.1
box_loss_weight float optional localisation loss weight: huber loss Default: 50
iou_loss_type EfficientDetMetaArchitecture.IOULossType optional Default: NONE
iou_loss_weight float optional localisation loss weight: IoU loss Default: 1
weight_decay float optional float number regularization weight decay Default: 4e-05
box_class_repeats int32 optional integer number of layers in classification / box net Default: 3
fpn_cell_repeats int32 optional integer number of layers in BiFPN Default: 3
fpn_num_filters int32 optional integer number of intermediate layers in BiFPN Default: 88
fpn_name string optional configuration name of BiFPN ('bifpn_sum', 'bifpn_fa', 'bifpn_dyn') Default: bifpn_fa

EfficientDetMetaArchitecture.AspectRatio

Field Type Label Description
height_ratio float required  
width_ratio float required  

FasterRCNNMetaArchitecture

Training parameters for Faster-RCNN

Field Type Label Description
parameters RCNNParameters required  
initial_crop_size int32 required Output size (width and height are set to be the same) of the initial bilinear interpolation based cropping during ROI pooling.
maxpool_kernel_size int32 required Kernel size of the max pool op on the cropped feature map during ROI pooling.
maxpool_stride int32 required Stride of the max pool op on the cropped feature map during ROI pooling.

FeaturePyramidNetworks

Configuration for Feature Pyramid Networks.

We recommend using multi_resolution_feature_map_generator with FPN, and the levels there must match the levels defined below for better performance. Correspondence from FPN levels to Resnet / Mobilenet-V1 feature maps:

FPN Level Resnet Feature Map Mobilenet-V1 Feature Map
2 Block 1 Conv2d_3_pointwise
3 Block 2 Conv2d_5_pointwise
4 Block 3 Conv2d_11_pointwise
5 Block 4 Conv2d_13_pointwise
6 Bottomup_5 bottom_up_Conv2d_14
7 Bottomup_6 bottom_up_Conv2d_15
8 Bottomup_7 bottom_up_Conv2d_16
9 Bottomup_8 bottom_up_Conv2d_17

Field Type Label Description
min_level int32 optional minimum level in feature pyramid Default: 3
max_level int32 optional maximum level in feature pyramid Default: 7
additional_layer_depth int32 optional channel depth for additional coarse feature layers. Default: 256

RCNNParameters

Faster-RCNN and RFCN as described in https://arxiv.org/abs/1506.01497

Field Type Label Description
number_of_stages int32 optional Whether to construct only the Region Proposal Network (RPN). Default: 2
first_stage_features_stride int32 optional Output stride of extracted RPN feature map. Default: 16
batch_norm_trainable bool optional Whether to update batch norm parameters during training or not. When training with a relatively large batch size (e.g. 8), it could be desirable to enable batch norm updates. Default: false
first_stage_anchor_generator AnchorGenerator required Anchor generator to compute RPN anchors.
first_stage_atrous_rate int32 optional Atrous rate for the convolution op applied to the first_stage_features_to_crop tensor to obtain box predictions. Default: 1
first_stage_box_predictor_conv_hyperparams deepomatic.oef.models.image.utils.hyperparameters.Hyperparams required Hyperparameters for the convolutional RPN box predictor.
first_stage_box_predictor_kernel_size int32 optional Kernel size to use for the convolution op just prior to RPN box predictions. Default: 3
first_stage_box_predictor_depth int32 optional Output depth for the convolution op just prior to RPN box predictions. Default: 512
first_stage_minibatch_size int32 optional The batch size to use for computing the first stage objectness and location losses. Default: 256
first_stage_positive_balance_fraction float optional Fraction of positive examples per image for the RPN. Default: 0.5
first_stage_nms_score_threshold float optional Non max suppression score threshold applied to first stage RPN proposals. Default: 0
first_stage_nms_iou_threshold float optional Non max suppression IOU threshold applied to first stage RPN proposals. Default: 0.7
first_stage_max_proposals int32 optional Maximum number of RPN proposals retained after first stage postprocessing. Default: 300
first_stage_localization_loss_weight float optional First stage RPN localization loss weight. Default: 2
first_stage_objectness_loss_weight float optional First stage RPN objectness loss weight. Default: 1
second_stage_box_predictor BoxPredictor required Hyperparameters for the second stage box predictor. If box predictor type is set to rfcn_box_predictor, a R-FCN model is constructed, otherwise a Faster R-CNN model is constructed.
second_stage_batch_size int32 optional The batch size per image used for computing the classification and refined location loss of the box classifier. Note that this field is ignored if hard_example_miner is configured. Default: 64
second_stage_balance_fraction float optional Fraction of positive examples to use per image for the box classifier. Default: 0.25
second_stage_post_processing PostProcessing required Post processing to apply on the second stage box classifier predictions. Note: the score_converter provided to the FasterRCNNMetaArch constructor is taken from this second_stage_post_processing proto.
second_stage_localization_loss_weight float optional Second stage refined localization loss weight. Default: 2
second_stage_classification_loss_weight float optional Second stage classification loss weight Default: 1
second_stage_mask_prediction_loss_weight float optional Second stage instance mask loss weight. Note that this is only applicable when MaskRCNNBoxPredictor is selected for second stage and configured to predict instance masks. Default: 1
hard_example_miner deepomatic.oef.losses.HardExampleMiner optional If not left to default, applies hard example mining only to classification and localization loss.
second_stage_classification_loss deepomatic.oef.losses.ClassificationLoss required Loss for second stage box classifiers; supports Softmax and Sigmoid. Note that the score converter must be consistent with the loss type. When there are multiple labels assigned to the same boxes, we recommend using sigmoid loss and enabling merge_multiple_label_boxes. If not specified, Softmax loss is used by default.
inplace_batchnorm_update bool optional Whether to update batch_norm inplace during training. This is required for batch norm to work correctly on TPUs. When this is false, user must add a control dependency on tf.GraphKeys.UPDATE_OPS for train/loss op in order to update the batch norm moving average parameters. Default: false
use_matmul_crop_and_resize bool optional Force the use of matrix multiplication based crop and resize instead of standard tf.image.crop_and_resize while computing second stage input feature maps. Default: false
clip_anchors_to_image bool optional Normally, anchors generated for a given image size are pruned during training if they lie outside the image window. Setting this option to true, clips the anchors to be within the image instead of pruning. Default: false
use_matmul_gather_in_matcher bool optional After performing matching between anchors and targets, a gather operation is performed in order to pull out the targets for training the Faster R-CNN meta-architecture. This option specifies whether to use an alternate implementation of tf.gather that is faster on TPUs. Default: false
use_static_balanced_label_sampler bool optional Whether to use the balanced positive negative sampler implementation with static shape guarantees. Default: false
use_static_shapes bool optional If True, uses implementation of ops with static shape guarantees. Default: false
use_static_shapes_for_eval bool optional If True, uses implementation of ops with static shape guarantees when running evaluation (specifically not is_training if False). Default: false
use_partitioned_nms_in_first_stage bool optional If true, uses implementation of partitioned_non_max_suppression in first stage. Default: true
return_raw_detections_during_predict bool optional Whether to return raw detections (pre NMS). Default: false
use_combined_nms_in_first_stage bool optional Whether to use tf.image.combined_non_max_suppression. Default: false

RFCNMetaArchitecture

Training parameters for RFCN

Field Type Label Description
parameters RCNNParameters required  

SSDFeatureExtractor

Field Type Label Description
conv_hyperparams deepomatic.oef.models.image.utils.hyperparameters.Hyperparams optional Hyperparameters that affect the layers of feature extractor added on top of the base feature extractor.
pad_to_multiple int32 optional The nearest multiple to zero-pad the input height and width dimensions to. For example, if pad_to_multiple = 2, input dimensions are zero-padded until the resulting dimensions are even. Default: 1
use_explicit_padding bool optional Whether to use explicit padding when extracting SSD multiresolution features. This will also apply to the base feature extractor if a MobileNet architecture is used. @vdel: this seems to have been added to make backbones compatible with some runtimes and seems deprecated now. Default: false
use_depthwise bool optional Whether to use depthwise separable convolutions to extract additional feature maps added by SSD. Default: false
fpn FeaturePyramidNetworks optional Feature Pyramid Networks config.
num_layers int32 optional The number of SSD layers. Default: 6

SSDMetaArchitecture

SSD as described in https://arxiv.org/abs/1512.02325. SSD-Lite as described in https://arxiv.org/pdf/1801.04381.pdf Next id: 27

Field Type Label Description
feature_extractor SSDFeatureExtractor required Feature extractor config.
box_coder BoxCoder required Box coder to encode the boxes.
matcher Matcher required Matcher to match groundtruth with anchors.
similarity_calculator RegionSimilarityCalculator required Region similarity calculator to compute similarity of boxes.
encode_background_as_zeros bool optional Whether background targets are to be encoded as an all zeros vector or a one-hot vector (where background is the 0th class). Default: false
negative_class_weight float optional Classification weight to be associated with negative anchors (default: 1.0). The weight must be in [0., 1.]. Default: 1
box_predictor BoxPredictor optional Box predictor to attach to the features.
anchor_generator AnchorGenerator required Anchor generator to compute anchors.
post_processing PostProcessing required Post processing to apply on the predictions.
normalize_loss_by_num_matches bool optional Whether to normalize the loss by number of groundtruth boxes that match to the anchors. Default: true
normalize_loc_loss_by_codesize bool optional Whether to normalize the localization loss by the code size of the box encodings. This is applied along with other normalization factors. Default: false
losses deepomatic.oef.losses.Loss required Loss configuration for training.
freeze_batchnorm bool optional Whether to update batch norm parameters during training or not. When training with a relatively small batch size (e.g. 1), it is desirable to disable the batch norm update and use pretrained batch norm params. Note: some feature extractors are used with canned arg_scopes (e.g. resnet arg scopes); in these cases the training behavior of batch norm variables may depend on both batch_norm_trainable and is_training. When canned arg_scopes are used with feature extractors, conv_hyperparams will apply only to the additional layers that are added and are outside the canned arg_scope. Default: false
inplace_batchnorm_update bool optional Whether to update batch_norm inplace during training. This is required for batch norm to work correctly on TPUs. When this is false, the user must add a control dependency on tf.GraphKeys.UPDATE_OPS for the train/loss op in order to update the batch norm moving average parameters. Default: false
add_background_class bool optional Whether to add an implicit background class to one-hot encodings of groundtruth labels. Set to false if training a single class model or using an explicit background class. Default: true
explicit_background_class bool optional Whether to use an explicit background class. Set to true if using groundtruth labels with an explicit background class, as in multiclass scores. Default: false
use_confidences_as_targets bool optional Default: false
implicit_example_weight float optional Default: 1
return_raw_detections_during_predict bool optional Default: false
mask_head_config SSDMetaArchitecture.MaskHead optional Configs for mask head.

SSDMetaArchitecture.MaskHead

Configuration proto for MaskHead. Next id: 11

Field Type Label Description
mask_height int32 optional The height and the width of the predicted mask. Only used when predict_instance_masks is true. Default: 15
mask_width int32 optional Default: 15
masks_are_class_agnostic bool optional Whether to predict class agnostic masks. Only used when predict_instance_masks is true. Default: true
mask_prediction_conv_depth int32 optional The depth for the first conv2d_transpose op applied to the image_features in the mask prediction branch. If set to 0, the value will be set automatically based on the number of channels in the image features and the number of classes. Default: 256
mask_prediction_num_conv_layers int32 optional The number of convolutions applied to image_features in the mask prediction branch. Default: 2
convolve_then_upsample_masks bool optional Whether to apply convolutions on mask features before upsampling using nearest neighbor resizing. By default, mask features are resized to [mask_height, mask_width] before applying convolutions and predicting masks. Default: false
mask_loss_weight float optional Mask loss weight. Default: 5
mask_loss_sample_size int32 optional Number of boxes to be generated at training time for computing mask loss. Default: 16
conv_hyperparams deepomatic.oef.models.image.utils.hyperparameters.Hyperparams optional Hyperparameters for convolution ops used in the box predictor.
initial_crop_size int32 optional Output size (width and height are set to be the same) of the initial bilinear interpolation based cropping during ROI pooling. Only used when we have second stage prediction head enabled (e.g. mask head). Default: 15

YoloParameters

Yolo V2 & V3 as described https://arxiv.org/abs/1612.08242

Field Type Label Description
subdivisions int32 required The number of mini-batch splits, so that the batch fits in GPU memory. Using 1 will compute the whole mini-batch in one pass and may use a lot of RAM.
classification_loss deepomatic.oef.losses.ClassificationLoss required Loss for classifiers, supports Softmax and Sigmoid.
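For illustration, the parameters sub-message shared by the Yolo meta-architectures (e.g. yolo_v2 or yolo_v3 in the Detection message above) could be sketched as:

    parameters {
      subdivisions: 4                          # split each mini-batch into 4 passes to fit in GPU memory
      classification_loss { weighted_sigmoid { } }
    }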

YoloV2MetaArchitecture

Training parameters for Yolo v2

Field Type Label Description
parameters YoloParameters required  

YoloV3MetaArchitecture

Training parameters for Yolo v3

Field Type Label Description
parameters YoloParameters required  

YoloV8MetaArchitecture

Training parameters for Yolo v8

EfficientDetMetaArchitecture.IOULossType

localization loss: IoU loss type. We use lowercase names because the name of the enum is directly used in third-party code.

Name Number Description
NONE 0  
iou 1  
ciou 2  
diou 3  
giou 4  

Top

deepomatic/oef/protos/models/image/ocr.proto

Image related models.

Attention

Attention-based OCR as described in https://arxiv.org/abs/1704.03549

Field Type Label Description
use_autoregression bool optional Whether or not we should base the prediction of the next character also on previous characters. This is typically used to recognize words. This is typically NOT used for license plate prediction. Default: false
num_lstm_units int32 optional The size of the hidden state vector Default: 256
use_coordinate_feature bool optional Whether to add one-hot vectors representing the location, to be used as a prior. Default: false
feature_map_ratio int32 optional Feature map size ratio to input size Default: 8
weight_decay float optional float number regularization weight decay Default: 4e-05
lstm_state_clip_value float optional float number clip cell state by this value prior to the cell output activation Default: 10
label_smoothing float optional float number smooth factor towards 1/num_classes for labels Default: 0.1
use_attention bool optional Whether the OCR uses an attention mask to focus on each letter (next: 9). Default: true

OCR

OCR model

Field Type Label Description
backbone deepomatic.oef.models.image.backbones.Backbone required Select from Inception, Resnet, etc…
attention Attention optional  

Top

deepomatic/oef/protos/models/image/preprocessing.proto

Image related models.

AutoAugmentImage

Apply an Autoaugment policy to the image and bounding boxes.

Field Type Label Description
policy_name string required What AutoAugment policy to apply to the Image. The available options are v0, v1, v2, v3 for a detection task, and v4 for a classification/tagging task. v0 is the policy used for all of the results in the "detection" paper [1] and was found to achieve the best results on the COCO dataset. v1, v2 and v3 are additional good policies found on the COCO dataset that have slight variation in what operations were used during the search procedure along with how many operations are applied in parallel to a single image (2 vs 3). v4 corresponds to the best policy found in the original AutoAugment paper for classification [2] 'AutoAugment: Learning Augmentation Strategies from Data' on reduced ImageNet dataset (see arxiv link, table 9 in the appendix). [1] Object detection: https://arxiv.org/pdf/1906.11172.pdf [2] Classification: https://arxiv.org/pdf/1805.09501.pdf

ConvertClassLogitsToSoftmax

Converts class logits to softmax optionally scaling the values by temperature first.

Field Type Label Description
temperature float optional Scale to use on logits before applying softmax. Default: 1

DropLabelProbabilistically

Randomly drops ground truth boxes for a label with some probability.

Field Type Label Description
label int32 optional The label that should be dropped. This corresponds to one of the entries in the label map.
drop_probability float optional Probability of dropping the label. Default: 1

FixedShapeResizer

Configuration proto for image resizer that resizes to a fixed shape.

Field Type Label Description
height int32 optional Desired height of image in pixels. Default: 300
width int32 optional Desired width of image in pixels. Default: 300
resize_method ResizeType optional Desired method when resizing image. Default: BILINEAR
convert_to_grayscale bool optional Whether to also resize the image channels from 3 to 1 (RGB to grayscale). Default: false

ImageResizer

Configuration proto for image resizing operations. See builders/image_resizer_builder.py for details.

Field Type Label Description
keep_aspect_ratio_resizer KeepAspectRatioResizer optional  
fixed_shape_resizer FixedShapeResizer optional  

Input

Field Type Label Description
image_resizer ImageResizer required The input image resizer
data_augmentation_options PreprocessingStep repeated Data augmentation options.

KeepAspectRatioResizer

Configuration proto for image resizer that keeps aspect ratio.

Field Type Label Description
min_dimension int32 optional Desired size of the smaller image dimension in pixels. Default: 0
max_dimension int32 required Desired size of the larger image dimension in pixels.
resize_method ResizeType optional Desired method when resizing image. Default: BILINEAR
pad_to_max_dimension bool optional Whether to pad the image with zeros so the output spatial size is [max_dimension, max_dimension]. Note that the zeros are padded to the bottom and the right of the resized image. Default: true
convert_to_grayscale bool optional Whether to also resize the image channels from 3 to 1 (RGB to grayscale). Default: false
per_channel_pad_value float repeated Per-channel pad value. This is only used when pad_to_max_dimension is True. If unspecified, a default pad value of 0 is applied to all channels.

NormalizeImage

Normalizes pixel values in an image. For every channel in the image, moves the pixel values from the range [original_minval, original_maxval] to [target_minval, target_maxval].

Field Type Label Description
original_minval float optional  
original_maxval float optional  
target_minval float optional Default: 0
target_maxval float optional Default: 1

PreprocessingStep

Message for defining a preprocessing operation on input data. See: //third_party/tensorflow_models/object_detection/core/preprocessor.py Next ID: 39

Field Type Label Description
normalize_image NormalizeImage optional  
random_horizontal_flip RandomHorizontalFlip optional  
random_pixel_value_scale RandomPixelValueScale optional  
random_image_scale RandomImageScale optional  
random_rgb_to_gray RandomRGBtoGray optional  
random_adjust_brightness RandomAdjustBrightness optional  
random_adjust_contrast RandomAdjustContrast optional  
random_adjust_hue RandomAdjustHue optional  
random_adjust_saturation RandomAdjustSaturation optional  
random_distort_color RandomDistortColor optional  
random_jitter_boxes RandomJitterBoxes optional  
random_crop_image RandomCropImage optional  
random_pad_image RandomPadImage optional  
random_crop_pad_image RandomCropPadImage optional  
random_crop_to_aspect_ratio RandomCropToAspectRatio optional  
random_black_patches RandomBlackPatches optional  
random_resize_method RandomResizeMethod optional  
scale_boxes_to_pixel_coordinates ScaleBoxesToPixelCoordinates optional  
resize_image ResizeImage optional  
subtract_channel_mean SubtractChannelMean optional  
ssd_random_crop SSDRandomCrop optional  
ssd_random_crop_pad SSDRandomCropPad optional  
ssd_random_crop_fixed_aspect_ratio SSDRandomCropFixedAspectRatio optional  
ssd_random_crop_pad_fixed_aspect_ratio SSDRandomCropPadFixedAspectRatio optional  
random_vertical_flip RandomVerticalFlip optional  
random_rotation90 RandomRotation90 optional  
rgb_to_gray RGBtoGray optional  
convert_class_logits_to_softmax ConvertClassLogitsToSoftmax optional  
random_absolute_pad_image RandomAbsolutePadImage optional  
random_self_concat_image RandomSelfConcatImage optional  
autoaugment_image AutoAugmentImage optional  
drop_label_probabilistically DropLabelProbabilistically optional  
remap_labels RemapLabels optional  
random_jpeg_quality RandomJpegQuality optional  
random_downscale_to_target_pixels RandomDownscaleToTargetPixels optional  
random_patch_gaussian RandomPatchGaussian optional  
random_square_crop_by_scale RandomSquareCropByScale optional  
random_scale_crop_and_pad_to_square RandomScaleCropAndPadToSquare optional  
probability float optional Default: 1
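Putting the preprocessing messages together, a hedged sketch of an Input with an aspect-ratio-preserving resizer and two augmentation steps might be:

    image_resizer {
      keep_aspect_ratio_resizer { min_dimension: 600 max_dimension: 1024 }
    }
    data_augmentation_options {
      random_horizontal_flip { }
      probability: 0.5                         # apply the flip to half of the images
    }
    data_augmentation_options {
      random_adjust_brightness { max_delta: 0.2 }
    }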

RGBtoGray

Converts the RGB image to a grayscale image. This also converts the image depth from 3 to 1, unlike RandomRGBtoGray which does not change the image depth.

RandomAbsolutePadImage

Randomly adds a padding of size [0, max_height_padding), [0, max_width_padding).

Field Type Label Description
max_height_padding int32 optional Height will be padded uniformly at random from [0, max_height_padding).
max_width_padding int32 optional Width will be padded uniformly at random from [0, max_width_padding).
pad_color float repeated Color of the padding. If unset, will pad using average color of the input image.

RandomAdjustBrightness

Randomly changes image brightness by up to max_delta. Image outputs will be saturated between 0 and 1.

Field Type Label Description
max_delta float optional Default: 0.2

RandomAdjustContrast

Randomly scales contrast by a value between [min_delta, max_delta].

Field Type Label Description
min_delta float optional Default: 0.8
max_delta float optional Default: 1.25

RandomAdjustHue

Randomly alters hue by a value of up to max_delta.

Field Type Label Description
max_delta float optional Default: 0.02

RandomAdjustSaturation

Randomly changes saturation by a value between [min_delta, max_delta].

Field Type Label Description
min_delta float optional Default: 0.8
max_delta float optional Default: 1.25

RandomBlackPatches

Randomly adds black square patches to an image.

Field Type Label Description
max_black_patches int32 optional The maximum number of black patches to add. Default: 10
probability float optional The probability of a black patch being added to an image. Default: 0.5
size_to_image_ratio float optional Ratio of the black patch dimension to the minimum dimension of the image (patch_width = patch_height = size_to_image_ratio * min(image_height, image_width)). Default: 0.1

RandomCropImage

Randomly crops the image and bounding boxes.

Field Type Label Description
min_object_covered float optional Cropped image must cover at least one box by this fraction. Default: 1
min_aspect_ratio float optional Aspect ratio bounds of cropped image. Default: 0.75
max_aspect_ratio float optional Default: 1.33
min_area float optional Allowed area ratio of cropped image to original image. Default: 0.1
max_area float optional Default: 1
overlap_thresh float optional Minimum overlap threshold of cropped boxes to keep in new image. If the ratio between a cropped bounding box and the original is less than this value, it is removed from the new image. Default: 0.3
clip_boxes bool optional Whether to clip the boxes to the cropped image. Default: true
random_coef float optional Probability of keeping the original image. Default: 0

RandomCropPadImage

Randomly crops an image followed by a random pad.

Field Type Label Description
min_object_covered float optional Cropping operation must cover at least one box by this fraction. Default: 1
min_aspect_ratio float optional Aspect ratio bounds of image after cropping operation. Default: 0.75
max_aspect_ratio float optional Default: 1.33
min_area float optional Allowed area ratio of image after cropping operation. Default: 0.1
max_area float optional Default: 1
overlap_thresh float optional Minimum overlap threshold of cropped boxes to keep in new image. If the ratio between a cropped bounding box and the original is less than this value, it is removed from the new image. Default: 0.3
clip_boxes bool optional Whether to clip the boxes to the cropped image. Default: true
random_coef float optional Probability of keeping the original image during the crop operation. Default: 0
min_padded_size_ratio float repeated Min ratio of padded image height and width to the input image's height and width. Both this field and the next one should be of length 2.
max_padded_size_ratio float repeated Max ratio of padded image height and width to the input image's height and width. If unset, will use double the original image dimensions as an upper bound. Should be of length 2.
pad_color float repeated Color of the padding. If unset, will pad using average color of the input image. This field should be of length 3.

RandomCropToAspectRatio

Randomly crops an image to a given aspect ratio.

Field Type Label Description
aspect_ratio float optional Aspect ratio. Default: 1
overlap_thresh float optional Minimum overlap threshold of cropped boxes to keep in new image. If the ratio between a cropped bounding box and the original is less than this value, it is removed from the new image. Default: 0.3
clip_boxes bool optional Whether to clip the boxes to the cropped image. Default: true

RandomDistortColor

Performs a random color distortion. color_ordering should be either 0 or 1.

Field Type Label Description
color_ordering int32 optional  

RandomDownscaleToTargetPixels

Randomly shrinks the image (keeping aspect ratio) to a target number of pixels. If the image contains fewer than the chosen target number of pixels, it is left unchanged.

Field Type Label Description
random_coef float optional Probability of keeping the original image. Default: 0
min_target_pixels int32 optional The target number of pixels will be chosen to be in the range [min_target_pixels, max_target_pixels] Default: 300000
max_target_pixels int32 optional Default: 500000

RandomHorizontalFlip

Randomly horizontally flips the image and detections with the specified probability, defaulting to 50% of the time.

Field Type Label Description
keypoint_flip_permutation int32 repeated Specifies a mapping from the original keypoint indices to horizontally flipped indices. This is used in the event that keypoints are specified, in which case when the image is horizontally flipped the keypoints will need to be permuted. E.g. for keypoints representing left_eye, right_eye, nose_tip, mouth, left_ear, right_ear (in that order), one might specify the keypoint_flip_permutation below (also written out as a snippet after this table): keypoint_flip_permutation: 1 keypoint_flip_permutation: 0 keypoint_flip_permutation: 2 keypoint_flip_permutation: 3 keypoint_flip_permutation: 5 keypoint_flip_permutation: 4 If nothing is specified, the order of keypoints will be maintained.
probability float optional The probability of running this augmentation for each image. Default: 0.5
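
The permutation example from the description above, written out as an illustrative text-format snippet (values come straight from that example):

  random_horizontal_flip {
    keypoint_flip_permutation: 1
    keypoint_flip_permutation: 0
    keypoint_flip_permutation: 2
    keypoint_flip_permutation: 3
    keypoint_flip_permutation: 5
    keypoint_flip_permutation: 4
    probability: 0.5
  }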

RandomImageScale

Randomly enlarges or shrinks image (keeping aspect ratio).

Field Type Label Description
min_scale_ratio float optional Default: 0.5
max_scale_ratio float optional Default: 2

RandomJitterBoxes

Randomly jitters corners of boxes in the image determined by ratio, i.e. if a box is [100, 200] and ratio is 0.02, the corners can move by [1, 4].

Field Type Label Description
ratio float optional Default: 0.05

RandomJpegQuality

Applies a jpeg encoding with a random quality factor.

Field Type Label Description
random_coef float optional Probability of keeping the original image. Default: 0
min_jpeg_quality int32 optional Minimum jpeg quality to use. Default: 0
max_jpeg_quality int32 optional Maximum jpeg quality to use. Default: 100

RandomPadImage

Randomly adds padding to the image.

Field Type Label Description
min_image_height int32 optional Minimum dimensions for padded image. If unset, will use original image dimension as a lower bound.
min_image_width int32 optional  
max_image_height int32 optional Maximum dimensions for padded image. If unset, will use double the original image dimension as an upper bound.
max_image_width int32 optional  
pad_color float repeated Color of the padding. If unset, will pad using average color of the input image.

RandomPatchGaussian

Field Type Label Description
random_coef float optional Probability of keeping the original image. Default: 0
min_patch_size int32 optional The patch size will be chosen to be in the range [min_patch_size, max_patch_size). Default: 1
max_patch_size int32 optional Default: 250
min_gaussian_stddev float optional The standard deviation of the gaussian noise applied within the patch will be chosen to be in the range [min_gaussian_stddev, max_gaussian_stddev). Default: 0
max_gaussian_stddev float optional Default: 1

RandomPixelValueScale

Randomly scales the values of all pixels in the image by some constant value between [minval, maxval], then clips the values to the range [0, 1.0].

Field Type Label Description
minval float optional Default: 0.9
maxval float optional Default: 1.1

RandomRGBtoGray

Randomly converts the entire image to grayscale.

Field Type Label Description
probability float optional Default: 0.1

RandomResizeMethod

Resizes the image to [target_height, target_width] with a randomly chosen resize method.

Field Type Label Description
target_height int32 optional  
target_width int32 optional  

RandomRotation90

Randomly rotates the image and detections by 90 degrees counter-clockwise with the specified probability, defaulting to 50% of the time.

Field Type Label Description
keypoint_rot_permutation int32 repeated Specifies a mapping from the original keypoint indices to their indices after a 90-degree counter-clockwise rotation. This is used in the event that keypoints are specified, in which case when the image is rotated the keypoints might need to be permuted.
probability float optional The probability of running this augmentation for each image. Default: 0.5

RandomScaleCropAndPadToSquare

Randomly scales, crops, and then pads an image to the desired square output dimensions. Specifically, this method first samples a random_scale factor from a uniform distribution between scale_min and scale_max, and then resizes the image such that its maximum dimension is (output_size * random_scale). Second, a square crop of size output_size is extracted from the resized image, and finally the cropped region is padded to the desired square output_size. The augmentation is borrowed from [1]. [1]: https://arxiv.org/abs/1911.09070

Field Type Label Description
output_size int32 optional The (square) output image size Default: 512
scale_min float optional The minimum and maximum values from which to sample the random scale. Default: 0.1
scale_max float optional Default: 2
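
For example (an illustrative sketch reproducing the defaults above in protobuf text format), with output_size = 512 a sampled random_scale of 1.5 would first resize the image so that its maximum dimension is 512 * 1.5 = 768 before the square 512 crop and padding:

  random_scale_crop_and_pad_to_square {
    output_size: 512
    scale_min: 0.1
    scale_max: 2.0
  }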

RandomSelfConcatImage

Randomly concatenates the image with itself horizontally and/or vertically.

Field Type Label Description
concat_vertical_probability float optional Probability of concatenating the image vertically. Default: 0.1
concat_horizontal_probability float optional Probability of concatenating the image horizontally. Default: 0.1

RandomSquareCropByScale

Extracts a square-sized crop from an image whose side length is sampled by randomly scaling the maximum spatial dimension of the image. If part of the crop falls outside the image, it is filled with zeros. The augmentation is borrowed from [1]. [1]: https://arxiv.org/abs/1904.07850

Field Type Label Description
max_border int32 optional The maximum size of the border. The border defines the distance in pixels to the image boundaries that will not be considered as the center of a crop. To make sure that the border does not go over the center of the image, the border value is chosen by computing the minimum k such that (max_border / 2**k) < image_dimension / 2 (see the worked example after this table). Default: 128
scale_min float optional The minimum and maximum values of scale. Default: 0.6
scale_max float optional Default: 1.3
num_scales int32 optional The number of discrete scale values to randomly sample between [scale_min, scale_max]. Default: 8
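
Worked example of the border rule above: with max_border = 128 and an image dimension of 400, k = 0 already satisfies 128 < 400 / 2 = 200, so the full border of 128 pixels is used; with an image dimension of 200, the smallest k satisfying 128 / 2**k < 100 is k = 1, so the effective border becomes 64 pixels.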

RandomVerticalFlip

Randomly vertically flips the image and detections with the specified probability, defaulting to 50% of the time.

Field Type Label Description
keypoint_flip_permutation int32 repeated Specifies a mapping from the original keypoint indices to vertically flipped indices. This is used in the event that keypoints are specified, in which case when the image is vertically flipped the keypoints will need to be permuted. E.g. for keypoints representing left_eye, right_eye, nose_tip, mouth, left_ear, right_ear (in that order), one might specify the keypoint_flip_permutation below: keypoint_flip_permutation: 1 keypoint_flip_permutation: 0 keypoint_flip_permutation: 2 keypoint_flip_permutation: 3 keypoint_flip_permutation: 5 keypoint_flip_permutation: 4
probability float optional The probability of running this augmentation for each image. Default: 0.5

RemapLabels

Remap a set of labels to a new label.

Field Type Label Description
original_labels int32 repeated Labels to be remapped.
new_label int32 optional Label to map to.

ResizeImage

Resizes images to [new_height, new_width].

Field Type Label Description
new_height int32 optional  
new_width int32 optional  
method ResizeImage.Method optional Default: BILINEAR
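
An illustrative text-format snippet resizing every image to 300 x 300 with the default bilinear method (the target dimensions are arbitrary placeholders, not recommendations from this document):

  resize_image {
    new_height: 300
    new_width: 300
    method: BILINEAR
  }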

SSDRandomCrop

Randomly crops an image according to: Liu et al., SSD: Single shot multibox detector. This preprocessing step defines multiple SSDRandomCropOperations. Only one operation (chosen at random) is actually performed on an image.

Field Type Label Description
operations SSDRandomCropOperation repeated  

SSDRandomCropFixedAspectRatio

Randomly crops an image to a fixed aspect ratio according to: Liu et al., SSD: Single shot multibox detector. Multiple SSDRandomCropFixedAspectRatioOperations are defined by this preprocessing step. Only one operation (chosen at random) is actually performed on an image.

Field Type Label Description
operations SSDRandomCropFixedAspectRatioOperation repeated  
aspect_ratio float optional Aspect ratio to crop to. This value is used for all crop operations. Default: 1

SSDRandomCropFixedAspectRatioOperation

Field Type Label Description
min_object_covered float optional Cropped image must cover at least this fraction of one original bounding box.
min_area float optional The area of the cropped image must be within the range of [min_area, max_area].
max_area float optional  
overlap_thresh float optional Cropped box area ratio must be above this threshold to be kept.
clip_boxes bool optional Whether to clip the boxes to the cropped image. Default: true
random_coef float optional Probability a crop operation is skipped.

SSDRandomCropOperation

Field Type Label Description
min_object_covered float optional Cropped image must cover at least this fraction of one original bounding box.
min_aspect_ratio float optional The aspect ratio of the cropped image must be within the range of [min_aspect_ratio, max_aspect_ratio].
max_aspect_ratio float optional  
min_area float optional The area of the cropped image must be within the range of [min_area, max_area].
max_area float optional  
overlap_thresh float optional Cropped box area ratio must be above this threshold to be kept.
clip_boxes bool optional Whether to clip the boxes to the cropped image. Default: true
random_coef float optional Probability a crop operation is skipped.

SSDRandomCropPad

Randomly crops and pads an image according to: Liu et al., SSD: Single shot multibox detector. This preprocessing step defines multiple SSDRandomCropPadOperations. Only one operation (chosen at random) is actually performed on an image.

Field Type Label Description
operations SSDRandomCropPadOperation repeated  

SSDRandomCropPadFixedAspectRatio

Randomly crops and pads an image to a fixed aspect ratio according to: Liu et al., SSD: Single shot multibox detector. Multiple SSDRandomCropPadFixedAspectRatioOperations are defined by this preprocessing step. Only one operation (chosen at random) is actually performed on an image.

Field Type Label Description
operations SSDRandomCropPadFixedAspectRatioOperation repeated  
aspect_ratio float optional Aspect ratio to pad to. This value is used for all crop and pad operations. Default: 1
min_padded_size_ratio float repeated Min ratio of padded image height and width to the input image's height and width. Two entries per operation.
max_padded_size_ratio float repeated Max ratio of padded image height and width to the input image's height and width. Two entries per operation.

SSDRandomCropPadFixedAspectRatioOperation

Field Type Label Description
min_object_covered float optional Cropped image must cover at least this fraction of one original bounding box.
min_aspect_ratio float optional The aspect ratio of the cropped image must be within the range of [min_aspect_ratio, max_aspect_ratio].
max_aspect_ratio float optional  
min_area float optional The area of the cropped image must be within the range of [min_area, max_area].
max_area float optional  
overlap_thresh float optional Cropped box area ratio must be above this threshold to be kept.
clip_boxes bool optional Whether to clip the boxes to the cropped image. Default: true
random_coef float optional Probability a crop operation is skipped.

SSDRandomCropPadOperation

Field Type Label Description
min_object_covered float optional Cropped image must cover at least this fraction of one original bounding box.
min_aspect_ratio float optional The aspect ratio of the cropped image must be within the range of [min_aspect_ratio, max_aspect_ratio].
max_aspect_ratio float optional  
min_area float optional The area of the cropped image must be within the range of [min_area, max_area].
max_area float optional  
overlap_thresh float optional Cropped box area ratio must be above this threshold to be kept.
clip_boxes bool optional Whether to clip the boxes to the cropped image. Default: true
random_coef float optional Probability a crop operation is skipped.
min_padded_size_ratio float repeated Min ratio of padded image height and width to the input image's height and width. Two entries per operation.
max_padded_size_ratio float repeated Max ratio of padded image height and width to the input image's height and width. Two entries per operation.
pad_color_r float optional Padding color.
pad_color_g float optional  
pad_color_b float optional  

ScaleBoxesToPixelCoordinates

Scales boxes from normalized coordinates to pixel coordinates.

SubtractChannelMean

Normalizes an image by subtracting a mean from each channel.

Field Type Label Description
means float repeated The mean to subtract from each channel. Should be of same dimension of channels in the input image.

ResizeImage.Method

Name Number Description
AREA 1  
BICUBIC 2  
BILINEAR 3  
NEAREST_NEIGHBOR 4  

ResizeType

Enumeration type for image resizing methods provided in TensorFlow.

Name Number Description
BILINEAR 0 Corresponds to tf.image.ResizeMethod.BILINEAR
NEAREST_NEIGHBOR 1 Corresponds to tf.image.ResizeMethod.NEAREST_NEIGHBOR
BICUBIC 2 Corresponds to tf.image.ResizeMethod.BICUBIC
AREA 3 Corresponds to tf.image.ResizeMethod.AREA

Top

deepomatic/oef/protos/models/image/segmentation.proto

Segmentation models

MaskRCNNMetaArchitecture

Training parameters for Mask-RCNN. Note: it is similar to FasterRCNNMetaArchitecture, but for readability and potential future changes it is better to keep it as a separate message definition.

Field Type Label Description
parameters deepomatic.oef.models.image.detection.RCNNParameters required  
initial_crop_size int32 required Output size (width and height are set to be the same) of the initial bilinear interpolation based cropping during ROI pooling.
maxpool_kernel_size int32 required Kernel size of the max pool op on the cropped feature map during ROI pooling.
maxpool_stride int32 required Stride of the max pool op on the cropped feature map during ROI pooling.

Segmentation

Field Type Label Description
backbone deepomatic.oef.models.image.backbones.Backbone required Select from Inception, Resnet, etc…
label_smoothing float optional If greater than 0, then smooth the labels towards 1/num_classes. Default: 0
mask_rcnn MaskRCNNMetaArchitecture optional  

Top

deepomatic/oef/protos/models/image/detection/anchor_generator.proto

AnchorGenerator

Configuration proto for the anchor generator to use in the object detection pipeline. See core/anchor_generator.py for details.

Field Type Label Description
grid_anchor_generator GridAnchorGenerator optional  
ssd_anchor_generator SsdAnchorGenerator optional  
multiscale_anchor_generator MultiscaleAnchorGenerator optional  
flexible_grid_anchor_generator FlexibleGridAnchorGenerator optional  

Top

deepomatic/oef/protos/models/image/detection/argmax_matcher.proto

ArgMaxMatcher

Configuration proto for ArgMaxMatcher. See matchers/argmax_matcher.py for details.

Field Type Label Description
matched_threshold float optional Threshold for positive matches. Default: 0.5
unmatched_threshold float optional Threshold for negative matches. Default: 0.5
ignore_thresholds bool optional Whether to construct ArgMaxMatcher without thresholds. Default: false
negatives_lower_than_unmatched bool optional If True, then negative matches are the ones below the unmatched_threshold, whereas ignored matches are in between the matched and unmatched thresholds. If False, then negative matches are in between the matched and unmatched thresholds, and everything lower than unmatched is ignored. Default: true
force_match_for_each_row bool optional Whether to ensure each row is matched to at least one column. Default: false
use_matmul_gather bool optional Force constructed match objects to use matrix multiplication based gather instead of standard tf.gather Default: false
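
An illustrative text-format snippet reproducing the documented defaults, with force_match_for_each_row enabled (a common choice shown here only as an example, not a recommendation from this document):

  argmax_matcher {
    matched_threshold: 0.5
    unmatched_threshold: 0.5
    force_match_for_each_row: true
  }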

Top

deepomatic/oef/protos/models/image/detection/bipartite_matcher.proto

BipartiteMatcher

Configuration proto for bipartite matcher. See matchers/bipartite_matcher.py for details.

Field Type Label Description
use_matmul_gather bool optional Force constructed match objects to use matrix multiplication based gather instead of standard tf.gather Default: false

Top

deepomatic/oef/protos/models/image/detection/box_coder.proto

BoxCoder

Configuration proto for the box coder to be used in the object detection pipeline. See core/box_coder.py for details.

Field Type Label Description
faster_rcnn_box_coder FasterRcnnBoxCoder optional  
mean_stddev_box_coder MeanStddevBoxCoder optional  
square_box_coder SquareBoxCoder optional  
keypoint_box_coder KeypointBoxCoder optional  

Top

deepomatic/oef/protos/models/image/detection/box_predictor.proto

BoxPredictor

Configuration proto for box predictor. See core/box_predictor.py for details.

Field Type Label Description
convolutional_box_predictor ConvolutionalBoxPredictor optional  
mask_rcnn_box_predictor MaskRCNNBoxPredictor optional  
rfcn_box_predictor RfcnBoxPredictor optional  
weight_shared_convolutional_box_predictor WeightSharedConvolutionalBoxPredictor optional  

ConvolutionalBoxPredictor

Configuration proto for Convolutional box predictor. Next id: 13

Field Type Label Description
conv_hyperparams deepomatic.oef.models.image.utils.hyperparameters.Hyperparams optional Hyperparameters for convolution ops used in the box predictor.
min_depth int32 optional Minimum feature depth prior to predicting box encodings and class predictions. Default: 0
max_depth int32 optional Maximum feature depth prior to predicting box encodings and class predictions. If max_depth is set to 0, no additional feature map will be inserted before location and class predictions. Default: 0
num_layers_before_predictor int32 optional Number of the additional conv layers before the predictor. Default: 0
use_dropout bool optional Whether to use dropout for class prediction. Default: true
dropout_keep_probability float optional Keep probability for dropout Default: 0.8
kernel_size int32 optional Size of final convolution kernel. If the spatial resolution of the feature map is smaller than the kernel size, then the kernel size is set to min(feature_width, feature_height). Default: 1
box_code_size int32 optional Size of the encoding for boxes. Default: 4
apply_sigmoid_to_scores bool optional Whether to apply sigmoid to the output of class predictions. TODO(jonathanhuang): Do we need this since we have a post-processing module? Default: false
class_prediction_bias_init float optional Default: 0
use_depthwise bool optional Whether to use depthwise separable convolution for box predictor layers. Default: false
box_encodings_clip_range ConvolutionalBoxPredictor.BoxEncodingsClipRange optional  

ConvolutionalBoxPredictor.BoxEncodingsClipRange

If specified, apply clipping to box encodings.

Field Type Label Description
min float optional  
max float optional  

MaskRCNNBoxPredictor

TODO(alirezafathi): Refactor the proto file to be able to configure mask rcnn head easily. Next id: 15

Field Type Label Description
fc_hyperparams deepomatic.oef.models.image.utils.hyperparameters.Hyperparams optional Hyperparameters for fully connected ops used in the box predictor.
use_dropout bool optional Whether to use dropout op prior to the both box and class predictions. Default: false
dropout_keep_probability float optional Keep probability for dropout. This is only used if use_dropout is true. Default: 0.5
box_code_size int32 optional Size of the encoding for the boxes. Default: 4
conv_hyperparams deepomatic.oef.models.image.utils.hyperparameters.Hyperparams optional Hyperparameters for convolution ops used in the box predictor.
predict_instance_masks bool optional Whether to predict instance masks inside detection boxes. Default: false
mask_prediction_conv_depth int32 optional The depth for the first conv2d_transpose op applied to the image_features in the mask prediction branch. If set to 0, the value will be set automatically based on the number of channels in the image features and the number of classes. Default: 256
predict_keypoints bool optional Whether to predict keypoints inside detection boxes. Default: false
mask_height int32 optional The height and the width of the predicted mask. Default: 15
mask_width int32 optional Default: 15
mask_prediction_num_conv_layers int32 optional The number of convolutions applied to image_features in the mask prediction branch. Default: 2
masks_are_class_agnostic bool optional Default: false
share_box_across_classes bool optional Whether to use one box for all classes rather than a different box for each class. Default: false
convolve_then_upsample_masks bool optional Whether to apply convolutions on mask features before upsampling using nearest neighbor resizing. By default, mask features are resized to [mask_height, mask_width] before applying convolutions and predicting masks. Default: false

RfcnBoxPredictor

Field Type Label Description
conv_hyperparams deepomatic.oef.models.image.utils.hyperparameters.Hyperparams optional Hyperparameters for convolution ops used in the box predictor.
num_spatial_bins_height int32 optional Bin sizes for RFCN crops. Default: 3
num_spatial_bins_width int32 optional Default: 3
depth int32 optional Target depth to reduce the input image features to. Default: 1024
box_code_size int32 optional Size of the encoding for the boxes. Default: 4
crop_height int32 optional Size to resize the rfcn crops to. Default: 12
crop_width int32 optional Default: 12

WeightSharedConvolutionalBoxPredictor

Configuration proto for weight shared convolutional box predictor. Next id: 19

Field Type Label Description
conv_hyperparams deepomatic.oef.models.image.utils.hyperparameters.Hyperparams optional Hyperparameters for convolution ops used in the box predictor.
num_layers_before_predictor int32 optional Number of the additional conv layers before the predictor. Default: 0
depth int32 optional Output depth for the convolution ops prior to predicting box encodings and class predictions. Default: 0
kernel_size int32 optional Size of final convolution kernel. If the spatial resolution of the feature map is smaller than the kernel size, then the kernel size is set to min(feature_width, feature_height). Default: 3
box_code_size int32 optional Size of the encoding for boxes. Default: 4
class_prediction_bias_init float optional Bias initialization for class prediction. It has been shown to stabilize training where there is a large number of negative boxes. See https://arxiv.org/abs/1708.02002 for details. Default: 0
use_dropout bool optional Whether to use dropout for class prediction. Default: false
dropout_keep_probability float optional Keep probability for dropout. Default: 0.8
share_prediction_tower bool optional Whether to share the multi-layer tower between box prediction and class prediction heads. Default: false
use_depthwise bool optional Whether to use depthwise separable convolution for box predictor layers. Default: false
score_converter WeightSharedConvolutionalBoxPredictor.ScoreConverter optional Callable elementwise score converter at inference time. Default: IDENTITY
box_encodings_clip_range WeightSharedConvolutionalBoxPredictor.BoxEncodingsClipRange optional  

WeightSharedConvolutionalBoxPredictor.BoxEncodingsClipRange

If specified, apply clipping to box encodings.

Field Type Label Description
min float optional  
max float optional  

WeightSharedConvolutionalBoxPredictor.ScoreConverter

Enum to specify how to convert the detection scores at inference time.

Name Number Description
IDENTITY 0 Input scores equals output scores.
SIGMOID 1 Applies a sigmoid on input scores.

Top

deepomatic/oef/protos/models/image/detection/calibration.proto

CalibrationConfig

Message wrapper for various calibration configurations.

Field Type Label Description
function_approximation FunctionApproximation optional Class-agnostic calibration via linear interpolation (usually output from isotonic regression).
class_id_function_approximations ClassIdFunctionApproximations optional Per-class calibration via linear interpolation.
sigmoid_calibration SigmoidCalibration optional Class-agnostic sigmoid calibration.
class_id_sigmoid_calibrations ClassIdSigmoidCalibrations optional Per-class sigmoid calibration.
temperature_scaling_calibration TemperatureScalingCalibration optional Temperature scaling calibration.

ClassIdFunctionApproximations

Message for class-specific domain/range mapping for function approximations.

Field Type Label Description
class_id_xy_pairs_map ClassIdFunctionApproximations.ClassIdXyPairsMapEntry repeated Message mapping class ids to indices.

ClassIdFunctionApproximations.ClassIdXyPairsMapEntry

Field Type Label Description
key int32 optional  
value XYPairs optional  

ClassIdSigmoidCalibrations

Message for class-specific Sigmoid Calibration.

Field Type Label Description
class_id_sigmoid_parameters_map ClassIdSigmoidCalibrations.ClassIdSigmoidParametersMapEntry repeated Message mapping class index to Sigmoid Parameters.

ClassIdSigmoidCalibrations.ClassIdSigmoidParametersMapEntry

Field Type Label Description
key int32 optional  
value SigmoidParameters optional  

FunctionApproximation

Message for class-agnostic domain/range mapping for function approximations.

Field Type Label Description
x_y_pairs XYPairs optional Message mapping class labels to indices

SigmoidCalibration

Message for class-agnostic Sigmoid Calibration.

Field Type Label Description
sigmoid_parameters SigmoidParameters optional Message mapping class index to Sigmoid Parameters

SigmoidParameters

Message defining parameters for sigmoid calibration.

Field Type Label Description
a float optional Default: -1
b float optional Default: 0

TemperatureScalingCalibration

Message for Temperature Scaling Calibration.

Field Type Label Description
scaler float optional  

XYPairs

Message to store a domain/range pair for function to be approximated.

Field Type Label Description
x_y_pair XYPairs.XYPair repeated Sequence of x/y pairs for function approximation.
training_data_type TrainingDataType optional Description of data used to fit the calibration model.

XYPairs.XYPair

Field Type Label Description
x float optional  
y float optional  
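
Putting the messages above together, a class-agnostic function approximation could be expressed as the following illustrative text-format snippet (the x/y values are arbitrary placeholders):

  calibration_config {
    function_approximation {
      x_y_pairs {
        x_y_pair { x: 0.0 y: 0.0 }
        x_y_pair { x: 0.5 y: 0.3 }
        x_y_pair { x: 1.0 y: 1.0 }
      }
    }
  }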

TrainingDataType

Description of data used to fit the calibration model. CLASS_SPECIFIC indicates that the calibration parameters are derived from detections pertaining to a single class. ALL_CLASSES indicates that parameters were obtained by fitting a model on detections from all classes (including the background class).

Name Number Description
DATA_TYPE_UNKNOWN 0  
ALL_CLASSES 1  
CLASS_SPECIFIC 2  

Top

deepomatic/oef/protos/models/image/detection/faster_rcnn_box_coder.proto

FasterRcnnBoxCoder

Configuration proto for FasterRCNNBoxCoder. See box_coders/faster_rcnn_box_coder.py for details.

Field Type Label Description
y_scale float optional Scale factor for anchor encoded box center. Default: 10
x_scale float optional Default: 10
height_scale float optional Scale factor for anchor encoded box height. Default: 5
width_scale float optional Scale factor for anchor encoded box width. Default: 5

Top

deepomatic/oef/protos/models/image/detection/flexible_grid_anchor_generator.proto

AnchorGrid

Field Type Label Description
base_sizes float repeated The base sizes in pixels for each anchor in this anchor layer.
aspect_ratios float repeated The aspect ratios for each anchor in this anchor layer.
height_stride uint32 optional The anchor height stride in pixels.
width_stride uint32 optional The anchor width stride in pixels.
height_offset uint32 optional The anchor height offset in pixels. Default: 0
width_offset uint32 optional The anchor width offset in pixels. Default: 0

FlexibleGridAnchorGenerator

Field Type Label Description
anchor_grid AnchorGrid repeated  
normalize_coordinates bool optional Whether to produce anchors in normalized coordinates. Default: true

Top

deepomatic/oef/protos/models/image/detection/grid_anchor_generator.proto

GridAnchorGenerator

Configuration proto for GridAnchorGenerator. See anchor_generators/grid_anchor_generator.py for details.

Field Type Label Description
height int32 optional Anchor height in pixels. Default: 256
width int32 optional Anchor width in pixels. Default: 256
height_stride int32 optional Anchor stride in height dimension in pixels. Default: 16
width_stride int32 optional Anchor stride in width dimension in pixels. Default: 16
height_offset int32 optional Anchor height offset in pixels. Default: 0
width_offset int32 optional Anchor width offset in pixels. Default: 0
scales float repeated List of scales for the anchors.
aspect_ratios float repeated List of aspect ratios for the anchors.

Top

deepomatic/oef/protos/models/image/detection/keypoint_box_coder.proto

KeypointBoxCoder

Configuration proto for KeypointBoxCoder. See box_coders/keypoint_box_coder.py for details.

Field Type Label Description
num_keypoints int32 optional  
y_scale float optional Scale factor for anchor encoded box center and keypoints. Default: 10
x_scale float optional Default: 10
height_scale float optional Scale factor for anchor encoded box height. Default: 5
width_scale float optional Scale factor for anchor encoded box width. Default: 5

Top

deepomatic/oef/protos/models/image/detection/matcher.proto

Matcher

Configuration proto for the matcher to be used in the object detection pipeline. See core/matcher.py for details.

Field Type Label Description
argmax_matcher ArgMaxMatcher optional  
bipartite_matcher BipartiteMatcher optional  

Top

deepomatic/oef/protos/models/image/detection/mean_stddev_box_coder.proto

MeanStddevBoxCoder

Configuration proto for MeanStddevBoxCoder. See box_coders/mean_stddev_box_coder.py for details.

Field Type Label Description
stddev float optional The standard deviation used to encode and decode boxes. Default: 0.01

Top

deepomatic/oef/protos/models/image/detection/multiscale_anchor_generator.proto

MultiscaleAnchorGenerator

Configuration proto for RetinaNet anchor generator described in https://arxiv.org/abs/1708.02002. See anchor_generators/multiscale_grid_anchor_generator.py for details.

Field Type Label Description
min_level int32 optional Minimum level in the feature pyramid. Default: 3
max_level int32 optional Maximum level in the feature pyramid. Default: 7
anchor_scale float optional Scale of anchor relative to the feature stride. Default: 4
aspect_ratios float repeated Aspect ratios for anchors at each grid point.
scales_per_octave int32 optional Number of intermediate scales per scale octave. Default: 2
normalize_coordinates bool optional Whether to produce anchors in normalized coordinates. Default: true

Top

deepomatic/oef/protos/models/image/detection/post_processing.proto

BatchNonMaxSuppression

Configuration proto for non-max-suppression operation on a batch of detections.

Field Type Label Description
score_threshold float optional Scalar threshold for score (low scoring boxes are removed). Default: 0
iou_threshold float optional Scalar threshold for IOU (boxes that have high IOU overlap with previously selected boxes are removed). Default: 0.6
max_detections_per_class int32 optional Maximum number of detections to retain per class. Default: 100
max_total_detections int32 optional Maximum number of detections to retain across all classes. Default: 100
use_static_shapes bool optional Whether to use the implementation of NMS that guarantees static shapes. Default: false
use_class_agnostic_nms bool optional Whether to use class-agnostic NMS. Class-agnostic NMS implements a version of Non-Maximum Suppression where, if max_classes_per_detection=k, 1) the top-k scores are kept for each detection, 2) during NMS each detection only uses its highest class score for sorting, and 3) compared to regular NMS, the worst-case runtime is O(N^2) instead of O(KN^2), where N is the number of detections and K the number of classes. Default: false
use_combined_nms bool optional Whether to use tf.image.combined_non_max_suppression. Default: false
change_coordinate_frame bool optional Whether to change coordinate frame of the boxlist to be relative to window's frame. Default: true
use_hard_nms bool optional Use hard NMS. Note that even if this field is set to false, the behavior of NMS will be equivalent to hard NMS; when set to true, this field forces the tf.image.non_max_suppression function to be called instead of tf.image.non_max_suppression_with_scores, and can be used to export models for older versions of TF. Default: false
use_cpu_nms bool optional Use CPU NMS. NMSV3/NMSV4 by default run on GPU, which may cause OOM issues if the model is large and/or the batch size is large during training. Setting this flag to true moves the NMS op to CPU when OOM happens. The flag is not needed if use_hard_nms = false, as soft NMS currently runs on CPU by default. Default: false

PostProcessing

Configuration proto for post-processing predicted boxes and scores.

Field Type Label Description
batch_non_max_suppression BatchNonMaxSuppression optional Non max suppression parameters.
score_converter PostProcessing.ScoreConverter optional Score converter to use. Default: IDENTITY
logit_scale float optional Scale logit (input) value before conversion in post-processing step. Typically used for softmax distillation, though can be used to scale for other reasons. Default: 1
calibration_config CalibrationConfig optional Calibrate score outputs. Calibration is applied after score converter and before non max suppression.
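
An illustrative text-format snippet combining the two messages above; the values shown are the documented defaults, except for score_converter, which is set to SIGMOID purely as an example, and the enclosing field name post_processing, which is an assumption about how the parent message embeds this configuration:

  post_processing {
    batch_non_max_suppression {
      score_threshold: 0.0
      iou_threshold: 0.6
      max_detections_per_class: 100
      max_total_detections: 100
    }
    score_converter: SIGMOID
  }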

PostProcessing.ScoreConverter

Enum to specify how to convert the detection scores.

Name Number Description
IDENTITY 0 Input scores equals output scores.
SIGMOID 1 Applies a sigmoid on input scores.
SOFTMAX 2 Applies a softmax on input scores

Top

deepomatic/oef/protos/models/image/detection/region_similarity_calculator.proto

IoaSimilarity

Configuration for intersection-over-area (IOA) similarity calculator.

IouSimilarity

Configuration for intersection-over-union (IOU) similarity calculator.

NegSqDistSimilarity

Configuration for negative squared distance similarity calculator.

RegionSimilarityCalculator

Configuration proto for region similarity calculators. See core/region_similarity_calculator.py for details.

Field Type Label Description
neg_sq_dist_similarity NegSqDistSimilarity optional  
iou_similarity IouSimilarity optional  
ioa_similarity IoaSimilarity optional  
thresholded_iou_similarity ThresholdedIouSimilarity optional  

ThresholdedIouSimilarity

Configuration for thresholded-intersection-over-union similarity calculator.

Field Type Label Description
iou_threshold float optional IOU threshold used for filtering scores. Default: 0.5

Top

deepomatic/oef/protos/models/image/detection/square_box_coder.proto

SquareBoxCoder

Configuration proto for SquareBoxCoder. See box_coders/square_box_coder.py for details.

Field Type Label Description
y_scale float optional Scale factor for anchor encoded box center. Default: 10
x_scale float optional Default: 10
length_scale float optional Scale factor for anchor encoded box length. Default: 5

Top

deepomatic/oef/protos/models/image/detection/ssd_anchor_generator.proto

SsdAnchorGenerator

Configuration proto for SSD anchor generator described in https://arxiv.org/abs/1512.02325. See anchor_generators/multiple_grid_anchor_generator.py for details.

Field Type Label Description
num_layers int32 optional Number of grid layers to create anchors for. Default: 6
min_scale float optional Scale of anchors corresponding to finest resolution. Default: 0.2
max_scale float optional Scale of anchors corresponding to coarsest resolution. Default: 0.95
scales float repeated Can be used to override min_scale->max_scale, with an explicitly defined set of scales. If empty, then min_scale->max_scale is used.
aspect_ratios float repeated Aspect ratios for anchors at each grid point.
interpolated_scale_aspect_ratio float optional When this aspect ratio is greater than 0, an additional anchor with an interpolated scale is added with this aspect ratio. Default: 1
reduce_boxes_in_lowest_layer bool optional Whether to use the following aspect ratio and scale combination for the layer with the finest resolution: (scale=0.1, aspect_ratio=1.0), (scale=min_scale, aspect_ratio=2.0), (scale=min_scale, aspect_ratio=0.5). Default: true
base_anchor_height float optional The base anchor size in height dimension. Default: 1
base_anchor_width float optional The base anchor size in width dimension. Default: 1
height_stride int32 repeated Anchor stride in height dimension in pixels for each layer. The length of this field is expected to be equal to the value of num_layers.
width_stride int32 repeated Anchor stride in width dimension in pixels for each layer. The length of this field is expected to be equal to the value of num_layers.
height_offset int32 repeated Anchor height offset in pixels for each layer. The length of this field is expected to be equal to the value of num_layers.
width_offset int32 repeated Anchor width offset in pixels for each layer. The length of this field is expected to be equal to the value of num_layers.
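
An illustrative text-format snippet for this generator; the scale values are the documented defaults and the aspect ratios are arbitrary placeholders, not recommendations from this document:

  ssd_anchor_generator {
    num_layers: 6
    min_scale: 0.2
    max_scale: 0.95
    aspect_ratios: 1.0
    aspect_ratios: 2.0
    aspect_ratios: 0.5
  }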

Top

deepomatic/oef/protos/models/image/utils/hyperparameters.proto

BatchNorm

Configuration proto for batch norm to apply after convolution op. See https://www.tensorflow.org/api_docs/python/tf/contrib/layers/batch_norm

Field Type Label Description
decay float optional Default: 0.999
center bool optional Default: true
scale bool optional Default: false
epsilon float optional Default: 0.001
train bool optional Whether to train the batch norm variables. If this is set to false during training, the current values of the batch_norm variables are used for the forward pass but are never updated. Default: true

GroupNorm

Configuration proto for group normalization to apply after convolution op. https://arxiv.org/abs/1803.08494

Hyperparams

Configuration proto for the convolution op hyperparameters

Field Type Label Description
op Hyperparams.Op optional Default: CONV
regularizer Regularizer optional Regularizer for the weights of the convolution op.
initializer Initializer optional Initializer for the weights of the convolution op.
activation Hyperparams.Activation optional Default: RELU
batch_norm BatchNorm optional BatchNorm hyperparameters. Note that if neither batch_norm nor group_norm is selected, then no normalization is applied.
group_norm GroupNorm optional GroupNorm hyperparameters. This is only supported on a subset of models. Note that the current implementation of group norm instantiated in tf.contrib.group.layers.group_norm() only supports fixed_size_resizer for image preprocessing.
regularize_depthwise bool optional Whether depthwise convolutions should be regularized. If this parameter is NOT set then the conv hyperparams will default to the parent scope. Default: false

Initializer

Proto with one-of field for initializers.

Field Type Label Description
truncated_normal_initializer TruncatedNormalInitializer optional  
variance_scaling_initializer VarianceScalingInitializer optional  
random_normal_initializer RandomNormalInitializer optional  

L1Regularizer

Configuration proto for L1 Regularizer. See https://www.tensorflow.org/api_docs/python/tf/contrib/layers/l1_regularizer

Field Type Label Description
weight float optional Default: 1

L2Regularizer

Configuration proto for L2 Regularizer. See https://www.tensorflow.org/api_docs/python/tf/contrib/layers/l2_regularizer

Field Type Label Description
weight float optional Default: 1

RandomNormalInitializer

Configuration proto for random normal initializer. See https://www.tensorflow.org/api_docs/python/tf/random_normal_initializer

Field Type Label Description
mean float optional Default: 0
stddev float optional Default: 1

Regularizer

Proto with one-of field for regularizers.

Field Type Label Description
l1_regularizer L1Regularizer optional  
l2_regularizer L2Regularizer optional  

TruncatedNormalInitializer

Configuration proto for truncated normal initializer. See https://www.tensorflow.org/api_docs/python/tf/truncated_normal_initializer

Field Type Label Description
mean float optional Default: 0
stddev float optional Default: 1

VarianceScalingInitializer

Configuration proto for variance scaling initializer. See https://www.tensorflow.org/api_docs/python/tf/contrib/layers/variance_scaling_initializer

Field Type Label Description
factor float optional Default: 2
uniform bool optional Default: false
mode VarianceScalingInitializer.Mode optional Default: FAN_IN

Hyperparams.Activation

Type of activation to apply after convolution.

Name Number Description
NONE 0 Use None (no activation)
RELU 1 Use tf.nn.relu
RELU_6 2 Use tf.nn.relu6

Hyperparams.Op

Operations affected by hyperparameters.

Name Number Description
CONV 1 Convolution, Separable Convolution, Convolution transpose.
FC 2 Fully connected

VarianceScalingInitializer.Mode

Name Number Description
FAN_IN 0  
FAN_OUT 1  
FAN_AVG 2  

Scalar Value Types

.proto Type Notes C++ Java Python Go C# PHP Ruby
double   double double float float64 double float Float
float   float float float float32 float float Float
int32 Uses variable-length encoding. Inefficient for encoding negative numbers – if your field is likely to have negative values, use sint32 instead. int32 int int int32 int integer Bignum or Fixnum (as required)
int64 Uses variable-length encoding. Inefficient for encoding negative numbers – if your field is likely to have negative values, use sint64 instead. int64 long int/long int64 long integer/string Bignum
uint32 Uses variable-length encoding. uint32 int int/long uint32 uint integer Bignum or Fixnum (as required)
uint64 Uses variable-length encoding. uint64 long int/long uint64 ulong integer/string Bignum or Fixnum (as required)
sint32 Uses variable-length encoding. Signed int value. These more efficiently encode negative numbers than regular int32s. int32 int int int32 int integer Bignum or Fixnum (as required)
sint64 Uses variable-length encoding. Signed int value. These more efficiently encode negative numbers than regular int64s. int64 long int/long int64 long integer/string Bignum
fixed32 Always four bytes. More efficient than uint32 if values are often greater than 2^28. uint32 int int uint32 uint integer Bignum or Fixnum (as required)
fixed64 Always eight bytes. More efficient than uint64 if values are often greater than 2^56. uint64 long int/long uint64 ulong integer/string Bignum
sfixed32 Always four bytes. int32 int int int32 int integer Bignum or Fixnum (as required)
sfixed64 Always eight bytes. int64 long int/long int64 long integer/string Bignum
bool   bool boolean boolean bool bool boolean TrueClass/FalseClass
string A string must always contain UTF-8 encoded or 7-bit ASCII text. string String str/unicode string string string String (UTF-8)
bytes May contain any arbitrary sequence of bytes. string ByteString str []byte ByteString string String (ASCII-8BIT)