The Shape-from-Template Homepage



The Benchmark

The goal of the benchmark is to evaluate current methods, detect solved cases, track progress on currently studied unresolved cases and stimulate research into new unresolved cases. This goal is handled by the evaluation tracks. Each of these defines a specific SfT scenario and uses the corresponding template-dataset pairs from the database. For the data used in the evaluation tracks, the registration and/or 3D shape groundtruth is reserved for evaluation purposes only. The SfT benchmark proposes a complete evaluation methodology, designed to accommodate the variety of SfT algorithms in a unified and consistent manner.

Error Classes

All errors are computed using evaluation landmarks, which are predefined points evenly distributed over the template 3D shape. These evaluation landmarks do not necessarily cover the whole template 3D shape. We use three classes of errors. A-errors are registration errors computed in the image and expressed in px. B-errors and C-errors are 3D shape errors computed in camera coordinates and expressed in mm. The difference between B-errors and C-errors is that the former are unregistered and the latter are registered. Both are based on predicted evaluation landmarks in camera coordinates. B-errors use the distance between these predictions and the groundtruth 3D shape, while C-errors use the distance to the corresponding groundtruth evaluation landmarks in camera coordinates. Consequently, the B-errors always underestimate the C-errors. The C-errors are more meaningful and are preferred over the B-errors whenever possible. We append the visibility class whenever required to specify where the errors are or can be computed. For instance, the A1-errors are registration errors computed only on the object's visible part in the image.
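As an illustration, the three error classes could be computed along the following lines. This is a minimal sketch, not the benchmark's evaluation code; the function names are our own, and the groundtruth 3D shape is assumed to be given as a dense point sampling so that the B-error reduces to a nearest-point search.

```python
import numpy as np

def a_error(pred_px, gt_px):
    """A-errors (px): registration errors in the image, i.e. distances
    between predicted and groundtruth landmark positions."""
    return np.linalg.norm(pred_px - gt_px, axis=-1)

def b_error(pred_mm, gt_surface_mm):
    """B-errors (mm): unregistered 3D shape errors, i.e. distances from
    each predicted landmark to the closest point of the groundtruth 3D
    shape (here approximated by a dense point sampling of the surface)."""
    d = np.linalg.norm(pred_mm[:, None, :] - gt_surface_mm[None, :, :], axis=-1)
    return d.min(axis=1)

def c_error(pred_mm, gt_landmarks_mm):
    """C-errors (mm): registered 3D shape errors, i.e. distances between
    predicted and corresponding groundtruth landmarks in camera coordinates."""
    return np.linalg.norm(pred_mm - gt_landmarks_mm, axis=-1)
```

Because each groundtruth landmark lies on the groundtruth 3D shape, the nearest-point distance in `b_error` can never exceed the corresponding-landmark distance in `c_error`, which is the underestimation property stated above.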

Handling Various Inputs and Outputs via Algorithm Classes and Optional Interpolation

Existing SfT algorithms may have different inputs and outputs. They fall into three main classes: A-algorithms compute registration, B-algorithms require registration and compute 3D shape, and C-algorithms compute both registration and 3D shape. Consequently, not all algorithms can be run on all datasets, and the class of computable errors depends on the algorithm's and dataset's classes:

Algorithms   | Inputs                        | Outputs                | A-datasets              | B-datasets              | C-datasets
A-algorithms | Template, image               | Registration           | A-errors                | All errors uncomputable | A-errors
B-algorithms | Template, image, registration | 3D shape               | All errors uncomputable | Do not run              | C-errors
C-algorithms | Template, image               | Registration, 3D shape | A-errors                | B-errors                | A-errors and C-errors
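The compatibility table above can be encoded as a simple lookup. This is a hypothetical sketch (the names are illustrative, not the benchmark's API); it only restates the table, distinguishing the case where no error is computable from the case where the algorithm should not be run at all.

```python
# Computable error classes per (algorithm class, dataset class) pair.
# None marks a combination that must not be run; an empty set marks a
# combination that runs but yields no computable errors.
COMPUTABLE_ERRORS = {
    ("A", "A"): {"A"},
    ("A", "B"): set(),   # all errors uncomputable
    ("A", "C"): {"A"},
    ("B", "A"): set(),   # all errors uncomputable
    ("B", "B"): None,    # do not run: no registration groundtruth as input
    ("B", "C"): {"C"},
    ("C", "A"): {"A"},
    ("C", "B"): {"B"},
    ("C", "C"): {"A", "C"},
}

def computable_errors(algorithm_class, dataset_class):
    """Return the set of computable error classes, or raise if the
    combination should not be run."""
    errors = COMPUTABLE_ERRORS[(algorithm_class, dataset_class)]
    if errors is None:
        raise ValueError("this algorithm class cannot be run on this "
                         "dataset class")
    return errors
```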
In addition, the outputs of SfT algorithms have different possible representations, which may be sparse or dense. The benchmark uses the following rules to conduct evaluation and to accommodate as many types of outputs as possible:

Providing Various Cases via Evaluation Tracks

We defined evaluation tracks to accommodate the different types of SfT scenarios. These evaluation tracks will be dynamically updated to keep the benchmark up-to-date with the state of research. As of today, the benchmark has the following 20 tracks:

Id  | Name                | Datasets | 𝒯   | Description
001 | ET001-SimpleDefault | 001, 002 | 0.5 | Usual simple case: still images; thin-shell, flattenable, well-textured objects; smooth, isometric deformations
** The other evaluation tracks will be made available depending on the outcome of our paper submission **

For each evaluation track, the benchmark provides in-class and inter-class algorithm rankings. Inter-class rankings are possible because some classes partially share their outputs, which may thus be directly compared. For instance, both A-algorithms and C-algorithms compute registration and may be evaluated and compared in this respect. This provides valuable insights into how SfT should be solved. Indeed, in the above example, it provides empirical evidence to answer the question of whether constraining registration by the 3D deformable shape, as in C-algorithms, brings an improvement over using image-level constraints only, as in A-algorithms.

Error Statistics

Each evaluation landmark provides an error, whether for registration or 3D shape. Our statistics are established over the set of landmarks in an image, over the joint set of landmarks of all images in a dataset, or over the joint set of landmarks of all datasets in an evaluation track. The error statistics are labelled with shortnames defined as follows:
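The shortnames themselves are not reproduced in this excerpt. As an illustration only, statistics over a set of landmark errors might be computed along these lines; the particular statistics shown (mean, median, standard deviation, maximum) are typical choices and an assumption on our part, not the benchmark's definitive list.

```python
import numpy as np

def error_statistics(errors):
    """Summary statistics over a set of landmark errors, whether pooled
    per image, per dataset, or per evaluation track. The keys below are
    illustrative, not the benchmark's actual shortnames."""
    e = np.asarray(errors, dtype=float)
    return {
        "mean": e.mean(),
        "med": np.median(e),
        "std": e.std(),
        "max": e.max(),
    }
```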

Website created and maintained by Adrien Bartoli.