
Geometry-Informed Material Recognition

Joseph DeGol, Mani Golparvar-Fard, Derek Hoiem

2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR '16)

Spotlight Paper (9.7%)







Our goal is to recognize material categories using images and geometry information. In many applications, such as construction management, coarse geometry information is available. We investigate how 3D geometry (surface normals, camera intrinsic and extrinsic parameters) can be used with 2D features (texture and color) to improve material classification. We introduce a new dataset, GeoMat, which is the first to provide both image and geometry data in the form of: (i) training and testing patches that were extracted at different scales and perspectives from real world examples of each material category, and (ii) a large scale construction site scene that includes 160 images and over 800,000 hand labeled 3D points. Our results show that using 2D and 3D features both jointly and independently to model materials improves classification accuracy across multiple scales and viewing directions for both material patches and images of a large scale construction site scene.


Our source code is built on top of the code provided by Cimpoi et al. for their paper Describing Textures in the Wild. We edited this code to include our image and geometry data for both focus and scene scale and to use some new feature representations. The code is written in MATLAB and uses VLFeat (which is included in our zip for convenience and to make sure future releases don't break the code). MATLAB version R2015a was used to run this code.

To set up the code and data and run an experiment:

1. Download Code

2. Unzip

3. Read README.txt

4. Run Setup_GeoInfMatRec.m

5. Run texture_experiments.m

Click (1) the download icon above to download the code. Next, unzip (2) the downloaded file and navigate inside. There should be a README.txt file, a directory named World, and a MATLAB file named Setup_GeoInfMatRec.m. Read (3) the README.txt file for more details about the code. Run (4) Setup_GeoInfMatRec.m in MATLAB, which will download all the data to the appropriate location. Inside the World directory, there is a texture_experiments.m file that runs the experiments. To reproduce our best result (FV-N+CNN+N3D), run (5) texture_experiments.m in MATLAB.

The code should run out of the box to evaluate the FV-N + CNN + N3D features (the best performing feature set from the paper). The file that runs the code is texture_experiments.m and this file can be edited to run other combinations of features. See the README.txt file for more information on data and code setup and the comments in texture_experiments.m for more information about running specific experiments.


This section provides access to the original material patches. If you want the data set up together with the code, download it with the source code above.

The GeoMat dataset (Geometry and Materials) is designed for experimenting with both sparse geometry (depth, normal vectors, camera calibrations) and images for material recognition. The dataset consists of two parts: focus scale and scene scale.

Focus Scale:
The focus scale part consists of 400 training and 200 testing patches per category. There are 19 material categories in total (Asphalt, Brick, Cement - Granular, Cement - Smooth, Concrete - Cast_in_place, Concrete - Precast, Foliage, Grass, Gravel, Marble, Metal - Grills, Paving, Soil - Compact, Soil - Dirt and Veg, Soil - Loose, Soil - Mulch, Stone - Granular, Stone - Limestone, and Wood). Note that for reproducing experiments, the data is provided as part of the source section and this data is here for completeness and future applications. Each sample is a MATLAB *.mat file consisting of the following fields:

Class - class label as a number between 1 and 19 where 1 is asphalt and proceeds in alphabetical order to wood at 19
Sample - label of the instance of the material that the patch was extracted from
Surface - name of the image that the patch was extracted from
Position - X and Y location of the center of the rectangle in the image where the patch was extracted
Corners - X and Y locations of the corners of the rectangle in the image where the patch was extracted
Image - original image patch that was extracted
GrayIm - original image converted to gray scale
GrayImNorm - gray scale image that has been contrast normalized
HSVIm - original image converted to HSV
HSVImNorm - HSV image that has been contrast normalized
Scale - values indicating the original size of the patch; e.g. (100x100, 200x200, 400x400, or 800x800)
Normals - 8 values: X and Y location of each normal; the nx, ny, and nz values of the normal; theta and phi; 1
Depth - depth patch made from interpolating the sparse depth data
Intrinsics - camera intrinsics matrix estimated using structure-from-motion
Extrinsics - rotation and translation matrix estimated using structure-from-motion
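As a small illustration of the Class field, the sketch below maps the 1-based label to a category name. It assumes the numbering follows the order of the category list above (the page only states that 1 is asphalt, 19 is wood, and the order is alphabetical); GEOMAT_CLASSES and class_name are hypothetical names, not part of the released code.

```python
# Category names in the order listed on this page.
# ASSUMPTION: the Class value indexes this list (1 = Asphalt ... 19 = Wood).
GEOMAT_CLASSES = [
    "Asphalt", "Brick", "Cement - Granular", "Cement - Smooth",
    "Concrete - Cast_in_place", "Concrete - Precast", "Foliage", "Grass",
    "Gravel", "Marble", "Metal - Grills", "Paving", "Soil - Compact",
    "Soil - Dirt and Veg", "Soil - Loose", "Soil - Mulch",
    "Stone - Granular", "Stone - Limestone", "Wood",
]


def class_name(class_id):
    """Map a 1-based Class value (1..19) to its category name."""
    if not 1 <= class_id <= len(GEOMAT_CLASSES):
        raise ValueError(
            f"Class must be in 1..{len(GEOMAT_CLASSES)}, got {class_id}"
        )
    return GEOMAT_CLASSES[class_id - 1]
```

For example, class_name(1) returns "Asphalt" and class_name(19) returns "Wood". A value read from a *.mat file may arrive as a float or a 1x1 array depending on the reader, so it may need converting to a plain integer first.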

Note that in practice, the sparse depth data was interpolated to get a dense depth patch, and this patch was used to calculate normal vectors (rather than using the Normals field described above). Note also that each *.mat file has fields for warped patches that were frontally rectified using the camera calibration information. These fields are denoted with a "W" (and a Homography matrix field is provided as well); however, we found the homography warpings to be somewhat unreliable and to produce worse results overall. Above, when I use the word "instance", I am referring to a specific wall or ground surface. For example, many different brick walls (i.e. different instances) were used for extracting the GeoMat patches.
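To make the idea of computing normals from a dense depth patch concrete, here is a minimal sketch. It is not the method used in the paper's code: it assumes the patch can be treated as a height field z = depth[y][x] in pixel units, ignores the camera intrinsics, and uses simple finite differences. The function name normals_from_depth is hypothetical.

```python
import math


def normals_from_depth(depth):
    """Estimate a unit normal per pixel from a dense depth patch.

    depth is a list of equal-length rows. ASSUMPTION: the patch is
    treated as a height field, so the normal is proportional to
    (-dz/dx, -dz/dy, 1). Central differences are used in the interior
    and one-sided differences at the borders.
    """
    h, w = len(depth), len(depth[0])
    normals = []
    for y in range(h):
        row = []
        for x in range(w):
            x0, x1 = max(x - 1, 0), min(x + 1, w - 1)
            y0, y1 = max(y - 1, 0), min(y + 1, h - 1)
            dzdx = (depth[y][x1] - depth[y][x0]) / (x1 - x0) if x1 > x0 else 0.0
            dzdy = (depth[y1][x] - depth[y0][x]) / (y1 - y0) if y1 > y0 else 0.0
            nx, ny, nz = -dzdx, -dzdy, 1.0
            length = math.sqrt(nx * nx + ny * ny + nz * nz)
            row.append((nx / length, ny / length, nz / length))
        normals.append(row)
    return normals
```

A flat patch yields (0, 0, 1) everywhere under this convention. A metrically correct version would first back-project each pixel to 3D with the Intrinsics field before differencing.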

Click the material images to download the samples for that category (each one is about 1GB in size). Alternatively, you can download all the categories using the provided script below.



Cement - Granular

Cement - Smooth

Concrete - Cast In Place

Concrete - Precast





Metal - Grills


Soil - Compact

Soil - Dirt and Vegetation

Soil - Loose

Soil - Mulch

Stone - Granular

Stone - Limestone


Scene Scale:
The scene scale part consists of 160 images of a construction site scene where 11 of the 19 material categories are represented. The images were used to generate a 3D point cloud of the scene and the points were hand labeled. There are over 900,000 total points, and it was difficult to accurately label every single point, so it is possible that a small number of labels are incorrect. The images, geometry (point cloud and extrinsics), and labels are available below. Note that for reproducing experiments, the data is provided as part of the source section and this data is here for completeness and future applications.




For experiments, we used the images and point cloud data to generate test patches. These patches were only used for testing after training with the focus scale part of the GeoMat dataset. To create the patches, each image (001 through 160) was split into super-pixels and the corresponding geometry and labels were projected onto the super-pixel region. Each one of these super-pixels paired with geometry and labels is a *.mat file with the following fields:

DD - Depth patch for the super-pixel
Distortion - distortion coefficients
Extrinsics - rotation and translation matrix estimated using structure-from-motion
GravityR - gravity vector of the camera assuming the photos were taken in landscape
ImgPath - path to the image that the super-pixel was extracted from
Intrinsics - camera intrinsics matrix estimated using structure-from-motion
Label - pixel-wise material labels: 0 is not labeled and 20 is unknown.
NX - Normal vector patch for x-direction
NY - Normal vector patch for y-direction
NZ - Normal vector patch for z-direction
Patch - RGB Image Patch for super-pixel
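Because the Label field is pixel-wise and uses 0 for "not labeled" and 20 for "unknown", a patch-level label has to be derived from it. The sketch below takes a majority vote over the remaining pixels. The vote is an illustrative choice, not necessarily how the paper assigns patch labels, and dominant_label is a hypothetical name.

```python
from collections import Counter

NOT_LABELED = 0  # per the Label field description above
UNKNOWN = 20     # per the Label field description above


def dominant_label(label_patch):
    """Return (label, fraction) for the most common valid label.

    label_patch is a list of rows of integer labels. Pixels marked
    NOT_LABELED or UNKNOWN are ignored. Returns (None, 0.0) when no
    valid pixel remains.
    """
    valid = [
        v for row in label_patch for v in row
        if v not in (NOT_LABELED, UNKNOWN)
    ]
    if not valid:
        return None, 0.0
    label, count = Counter(valid).most_common(1)[0]
    return label, count / len(valid)
```

The returned fraction can be used to discard super-pixels that straddle several materials, for instance by keeping only those where it exceeds a chosen threshold.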

The samples that were used for experiments in this paper are downloadable below. Again, the easiest way to set up this data is to use the method provided in the source section above which will download all the data automatically for you.

Samples 001 to 020

Samples 021 to 040

Samples 041 to 060

Samples 061 to 080

Samples 081 to 100

Samples 101 to 120

Samples 121 to 140

Samples 141 to 160
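The scene-scale samples carry Intrinsics and Extrinsics fields, and the patches were built by projecting the labeled point cloud onto each image. The sketch below shows the standard pinhole projection for a single 3D point. It rests on assumptions this page does not confirm: that Extrinsics is a 3x4 matrix [R | t] mapping world to camera coordinates, that the camera looks along +z, and that lens distortion (the Distortion field) can be ignored. project_point is a hypothetical name.

```python
def project_point(point_world, intrinsics, extrinsics):
    """Project a 3D world point to pixel coordinates (u, v).

    intrinsics: 3x3 camera matrix K, as nested lists.
    extrinsics: 3x4 matrix [R | t]. ASSUMPTION: world-to-camera.
    Lens distortion is NOT applied.
    Returns None if the point is not in front of the camera.
    """
    x, y, z = point_world
    # Camera-frame coordinates: R * X + t
    cam = [r[0] * x + r[1] * y + r[2] * z + r[3] for r in extrinsics]
    if cam[2] <= 0:
        return None
    # Homogeneous pixel coordinates: K * cam
    pix = [sum(k * c for k, c in zip(row, cam)) for row in intrinsics]
    return pix[0] / pix[2], pix[1] / pix[2]
```

With an identity rotation and zero translation, a point on the optical axis projects to the principal point stored in the third column of K. If the stored extrinsics turn out to be camera-to-world instead, they would need to be inverted before use.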

See the paper and supplementary material (poster and talk too!) for more details on the way the data was created for both focus scale and scene scale parts of the GeoMat dataset.

  author    = {Joseph DeGol and Mani Golparvar-Fard and Derek Hoiem},
  title     = {Geometry-Informed Material Recognition},
  booktitle = {CVPR},
  year      = {2016}

Last Updated: 4/8/17