[WIP; page under construction]
Photogrammetry involves photographing a static subject from a variety of angles and feeding those photos into software that outputs a virtual 3D reconstruction.
Subjects with a simple shape are easier to capture than those with complex geometry. Surfaces which are opaque, rough, and uniquely patterned are easier to capture than those which are translucent, shiny, and self-identical. Surface illumination is best performed with light which is diffuse and evenly spread, not bright and focused.
The aim of this document is to introduce the reader to photogrammetry. To that end, both a formal conceptual definition and a practical description of the production process are provided. This text finishes with a discussion of additional practical topics that will hopefully guide or at least assist the reader when they embark upon their own photogrammetry project.
The author of this document, Daniel Muirhead, has (at the time of writing) captured more than 250,000 photographs for use in photogrammetry projects, in the process logging more than 6,000 hours in a variety of photogrammetry software packages.
In this context, “photogrammetry” means the photo capture of a real-world subject and the subsequent use of those photos as inputs to a computer program (“photogrammetry software”) that automatically reconstructs the subject as a virtual 3D model.
Photogrammetry involves 4 pieces of equipment:
- capture device, for capturing photos of the subject;
- conveyance of capture device, for moving the device around the subject;
- photogrammetry software, for automatically reconstructing the subject from photos as a virtual 3D model;
- computer, for running the photogrammetry software.
[graphic 1, equipment]
Capture device, examples:
- any (digital) camera which is capable of capturing (digital) photos;
- any (digital) camcorder which is capable of capturing (digital) video (this video can be decomposed into a sequence of stills which then serve as ‘photos’);
- in theory, any device which is consistent in the way it captures/projects the 3D world onto a 2D image.
[graphic 2, capture devices]
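As a rough illustration of decomposing video into stills, the sketch below picks which frame indices to keep so that the stills are evenly spaced in time. The function name and its parameters are hypothetical, and the actual extraction of those frames from a video file (which needs a video library) is deliberately not shown.

```python
def still_frame_indices(total_frames: int, fps: float,
                        seconds_between_stills: float) -> list[int]:
    """Return the indices of the video frames to keep as 'photos'.

    Assumption: the video has a constant frame rate, so a fixed
    step in frames corresponds to a fixed step in time.
    """
    if total_frames < 0 or fps <= 0 or seconds_between_stills <= 0:
        raise ValueError("frame count must be >= 0; fps and spacing must be > 0")
    # Never step by less than one frame, even for very short spacings.
    step = max(1, round(fps * seconds_between_stills))
    return list(range(0, total_frames, step))


if __name__ == "__main__":
    # A 10-second clip at 30 frames per second, one still every half second.
    indices = still_frame_indices(total_frames=300, fps=30.0,
                                  seconds_between_stills=0.5)
    print(len(indices), indices[:4])
```

A shorter spacing gives more overlap between successive stills at the cost of more inputs to process; the right value depends on how fast the capture device was moving.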
Conveyance of capture device, examples:
- the capture device is attached to a tripod with adjustable elevation while the subject is mounted on a rotating turntable;
- the capture device is held by (or otherwise attached to) the photographer as they traverse around the subject;
- the capture device is mounted on a vehicle (e.g. atop a ground-based vehicle or on the underside of an aerial vehicle) which, being manually/remotely/autonomously controlled, traverses around the subject.
[graphic 3, conveyance]
Photogrammetry software, examples:
- a variety of commercial software packages are available from different companies, such as 3Dflow, Agisoft, CapturingReality, and Pix4D. However, this document is intended as an unbiased overview of the photogrammetry production process, so specific recommendations are not provided here. Instead, the reader is advised to perform their own research and experimentation as to which software would be best for their project;
- a number of Free and Open Source Software (FOSS) photogrammetry applications are available, including COLMAP and Meshroom.
Computer, examples:
- if possible, it is highly advisable to run the photogrammetry software on a computer with a CUDA-enabled graphics card (GPU), because almost all the photogrammetry software that this author has experience with performs significantly faster on such a computer than on the same computer using only its central processor (CPU);
- other factors that can affect the photogrammetry software processing speed are the specification of the CPU, the amount of RAM available, whether the project files (and the software itself and its working/temp file cache) are located on a solid state drive (SSD) instead of a hard disk drive (HDD), and the speed of the connection between the storage (SSD/HDD) and the motherboard;
- the computing platform might be a desktop, laptop, or tablet (with the photogrammetry processing power determined by the factors referenced directly above);
- again, no specific computer specifications are provided here. Instead, the reader is recommended to perform their own research; this could involve directly contacting the software developers for advice on the recommended computer specification for a given budget.
This document assumes that the photogrammetry process is performed via a piece of software, offline on a standalone computer. Other options for the performance of the photogrammetry stage are available – e.g. online/cloud processing – but these are not discussed here.
The process is linear and can be summarised in terms of three successive stages:
- Capture;
- Photogrammetry;
- Digital Asset Generation.
Each stage generates its own product; respectively, these are:
- Photos;
- Virtual 3D (photogrammetry) Model;
- Further/Derived Digital Assets.
[graphic 4, workflow]
1) Capture/Photos
The real world subject, to be captured, is a three-dimensional volume that is covered by a surface. If, for the sake of description, we visualise the subject’s surface divided into many smaller equal-area patches, then the idea during capture is to photograph each of these patches from a diversity of angles.
[graphic 5, subject surface]
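The idea of photographing each patch from a diversity of angles can be checked with a simple count. The sketch below is only an illustration under stated assumptions: each patch is reduced to a center point and an outward unit normal, a camera is treated as "seeing" a patch when it lies within a chosen angle of that normal, and occlusion by other parts of the subject is ignored. The names `Patch`, `views_per_patch`, and `max_angle_deg` are hypothetical.

```python
import math
from dataclasses import dataclass

Vec3 = tuple[float, float, float]


@dataclass(frozen=True)
class Patch:
    center: Vec3
    normal: Vec3  # assumed to be an outward-facing unit vector


def _sub(a: Vec3, b: Vec3) -> Vec3:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a: Vec3, b: Vec3) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def views_per_patch(patches: list[Patch], camera_positions: list[Vec3],
                    max_angle_deg: float = 60.0) -> list[int]:
    """Count, for each patch, the cameras within max_angle_deg of its normal."""
    cos_limit = math.cos(math.radians(max_angle_deg))
    counts = []
    for patch in patches:
        count = 0
        for cam in camera_positions:
            to_cam = _sub(cam, patch.center)
            dist = math.sqrt(_dot(to_cam, to_cam))
            if dist == 0.0:
                continue  # camera sits exactly on the patch; skip it
            # Cosine of the angle between the normal and the camera direction.
            if _dot(patch.normal, to_cam) / dist >= cos_limit:
                count += 1
        counts.append(count)
    return counts
```

Patches with a low count are candidates for extra photos before leaving the capture site.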
Diverse photos of a subject are distinguishable by virtue of variation in their location (position in 3D space quantified as x/y/z coordinates) and/or their orientation (yaw/pitch/roll of capture device). The path that the capture device follows, through successive capture locations and/or orientations, can be described as the capture trajectory.
[graphic 6, capture trajectory]
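A capture trajectory can be written down as an ordered list of poses, each combining a location and an orientation. The sketch below generates a simple orbit of rings around a subject placed at the origin, with the capture device always pointed at the subject. The conventions are assumptions made for this example only: z is up, yaw is measured about the z axis from the +x direction, pitch is measured above the horizontal, and roll is fixed at zero. `Pose` and `orbit_trajectory` are hypothetical names.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Pose:
    x: float
    y: float
    z: float
    yaw_deg: float
    pitch_deg: float
    roll_deg: float


def orbit_trajectory(radius: float, elevations_deg: list[float],
                     shots_per_ring: int) -> list[Pose]:
    """Return poses on rings around the origin, each looking at the origin."""
    if radius <= 0 or shots_per_ring <= 0:
        raise ValueError("radius and shots_per_ring must be positive")
    poses = []
    for el_deg in elevations_deg:
        el = math.radians(el_deg)
        for i in range(shots_per_ring):
            az_deg = 360.0 * i / shots_per_ring
            az = math.radians(az_deg)
            x = radius * math.cos(el) * math.cos(az)
            y = radius * math.cos(el) * math.sin(az)
            z = radius * math.sin(el)
            # Looking back at the origin: turn 180 degrees from the azimuth
            # and tilt down by the ring's elevation angle.
            yaw_deg = (az_deg + 180.0) % 360.0
            poses.append(Pose(x, y, z, yaw_deg, -el_deg, 0.0))
    return poses
```

For example, `orbit_trajectory(2.0, [0.0, 30.0], 8)` gives two rings of eight poses each; a subject with more complex geometry would typically need more rings and additional close-up poses that this simple orbit does not provide.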
The complexity of the capture trajectory co-varies with the complexity of the subject’s surface geometry. That is, subjects which are simple in shape will (unsurprisingly) be easier to capture than subjects with a complex shape. Subjects with more nooks and crannies, extrusions, and partially occluded areas will require more effort and time to capture comprehensively. Before engaging in capture, if the photographer knows the approximate geometry of the subject, then it should be possible in theory to calculate an optimal capture trajectory in advance.
[graphic 7, cap traj complexity]
By comparison, the overall scale of the subject (e.g. whether it is very large) has no direct relevance to the capture trajectory, provided the photographer has access to suitable means for conveying the capture device. For example, the capture trajectory for a miniature scale model of a town, and the capture trajectory for its full-size real world equivalent, would in theory be identical (once the scale/size of the two subjects is accounted for). The only difference would be in the mode of conveyance – handheld (or tripod/turntable) in the case of the miniature scale model, and vehicle-mounted in the case of the full-size town.
[graphic 8, cap traj and subject scale]
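The point about scale can be made concrete with a short sketch: scaling a trajectory multiplies every capture location by the same factor and leaves every orientation untouched. The representation of a pose as an `(x, y, z, yaw, pitch, roll)` tuple and the function name `scale_trajectory` are assumptions made for this illustration.

```python
PoseTuple = tuple[float, float, float, float, float, float]


def scale_trajectory(poses: list[PoseTuple], factor: float) -> list[PoseTuple]:
    """Scale capture locations about the origin; orientations are unchanged."""
    if factor <= 0:
        raise ValueError("factor must be positive")
    return [
        (x * factor, y * factor, z * factor, yaw, pitch, roll)
        for (x, y, z, yaw, pitch, roll) in poses
    ]


if __name__ == "__main__":
    # One pose captured around a hypothetical 1:100 miniature, in metres.
    miniature = [(0.5, 0.0, 0.25, 180.0, -20.0, 0.0)]
    # The equivalent pose for the full-size subject.
    print(scale_trajectory(miniature, 100.0))
```

The scaled locations are what dictate the mode of conveyance: half a metre from the miniature is within arm's reach, whereas fifty metres from the full-size subject is not.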
2) Photogrammetry/Virtual 3D (photogrammetry) Model
The photogrammetry process is managed by the photogrammetry software and is mostly automated, involving user input only when: adding photos as inputs; tweaking parameters/settings depending on desired output; and (optionally) performing quality assurance checks mid-production. However, the actual activity of reconstructing the virtual 3D model from the photos is performed entirely by the software, in the sense that it involves no manual effort or decision-making on the part of the user.
The automated process performed by the photogrammetry software has several successive, discrete stages. (The actual mechanics of the software are not discussed here; instead, what follows is a description from the perspective of a practitioner.) Broadly, these are:
- import of inputs – all the photos are loaded into the software;
- SPARSE – the software calculates the relative position and orientation of all the inputs, and illustrates the result of this calculation within a 3D viewport, representing both the subject as a sparse point cloud and the respective inputs’ positions and orientations relative to the subject;
- DENSE – a more detailed model is generated, a dense point cloud;
- MESH – the software generates a surface that stretches like a skin across the point cloud; the resulting asset is called a mesh. This mesh is made up of points (vertices), lines connecting the points (edges), and faces stretched between the lines (polygons). The faces of the mesh are what make it visible as a solid model in virtual 3D space (the vertices and edges are present but do not comprise the visible part of the mesh). If the (untextured) mesh has color, this is determined by the colors of the mesh’s points (vertices), i.e. it will be a vert-colored mesh, with the color of the visible faces being an interpolation of the vert colors;
- TEXTURE – the software maps the faces of the mesh onto a 2D image (this is called “unwrapping the UVs”), and then paints color onto the faces in a process called texturing. This results in a mesh with UV coordinates (a 3D model, with each of its faces a specified area) and a texture (a 2D image which wraps around that 3D model, with specified areas within that 2D space that correspond to the relevant faces on the 3D model). The textured mesh has colors that are more accurate than the vert-colored mesh.
[graphic 9, photg summary]
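To make the mesh description above more concrete, the following sketch stores a triangle mesh as vertices, faces, and per-vertex colors, derives the edges from the faces, and interpolates a color inside a face from the colors at its three corners. Barycentric weighting is a common way to perform this kind of interpolation, but whether a given photogrammetry package uses exactly this scheme is an assumption. All class and function names here are hypothetical.

```python
from dataclasses import dataclass

Vec3 = tuple[float, float, float]
Color = tuple[float, float, float]  # red, green, blue in the range 0..1


@dataclass
class Mesh:
    vertices: list[Vec3]
    faces: list[tuple[int, int, int]]  # each face indexes three vertices
    vertex_colors: list[Color]         # one color per vertex

    def edges(self) -> list[tuple[int, int]]:
        """Return the unique edges implied by the triangular faces."""
        unique = set()
        for a, b, c in self.faces:
            for i, j in ((a, b), (b, c), (c, a)):
                unique.add((min(i, j), max(i, j)))
        return sorted(unique)


def interpolate_face_color(corner_colors: list[Color],
                           weights: Vec3) -> Color:
    """Blend three corner colors using barycentric weights that sum to 1."""
    if len(corner_colors) != 3:
        raise ValueError("a triangular face has exactly three corners")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("barycentric weights must sum to 1")
    r, g, b = (
        sum(w * color[channel] for w, color in zip(weights, corner_colors))
        for channel in range(3)
    )
    return (r, g, b)
```

With corners colored pure red, green, and blue, the weights `(1, 0, 0)` return red, while equal weights return an even grey-ish blend at the middle of the face. A textured mesh replaces this per-vertex blend with a lookup into the 2D texture image via the UV coordinates.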
Throughout these stages, there is also the opportunity to perform additional processing. This might include:
- manual cropping of the Sparse, Dense, or Mesh (e.g. to remove irrelevant background);
[graphic 10, manual cropping]
- decimation, an automated process for point clouds and meshes that reduces density by thinning out (deleting) points but in a way that retains the model’s overall geometry;
[graphic 11, decimation]
- retopology, an automated process for meshes that equalizes the distance between points (vertices) and, relatedly, the relative size of the mesh’s faces;
[graphic 12, retopology]
- baking, in which a high resolution mesh is duplicated (copied) and that copy is converted into a low res mesh through decimation+retopo, while the geometry of the high res is projected (baked) onto a special set of textures (namely, ‘normal maps’ or ‘bump maps’) that, when viewed in a lit environment, embellish the appearance of the low res mesh. This means that the low res mesh (which now has the appearance of the high res) can be rendered with greater efficiency in real time applications (e.g. computer games) that are otherwise constrained by how many polygons they can display on screen simultaneously (i.e. low res meshes with baked normals render faster than their full high res counterparts).
[graphic 13, baking]
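As an illustration of the thinning idea behind decimation, the sketch below reduces a point cloud with voxel-grid downsampling: space is divided into cubes of a chosen size and all points inside the same cube are replaced by their average. This is one simple technique chosen for the example, not necessarily what any particular photogrammetry software does, and mesh decimation in particular generally relies on different methods. The function name and the `voxel_size` parameter are hypothetical.

```python
import math

Vec3 = tuple[float, float, float]


def voxel_downsample(points: list[Vec3], voxel_size: float) -> list[Vec3]:
    """Replace all points sharing a voxel with their average position."""
    if voxel_size <= 0:
        raise ValueError("voxel_size must be positive")
    buckets: dict[tuple[int, int, int], list[Vec3]] = {}
    for point in points:
        # Integer grid cell that contains this point.
        key = (
            math.floor(point[0] / voxel_size),
            math.floor(point[1] / voxel_size),
            math.floor(point[2] / voxel_size),
        )
        buckets.setdefault(key, []).append(point)
    thinned = []
    for group in buckets.values():
        n = len(group)
        thinned.append((
            sum(p[0] for p in group) / n,
            sum(p[1] for p in group) / n,
            sum(p[2] for p in group) / n,
        ))
    return thinned
```

A larger `voxel_size` deletes more points; because every occupied cube still contributes one point, the overall shape of the cloud is retained while its density drops.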
3) Further/derived digital assets
The end product of photogrammetry is a virtual 3D model. For this model to be of practical application within a given context it usually requires further processing.
As a simple example, consider the scale of the model, i.e. its size. For any model generated with
- Factors affecting capture (surface, illumination)
- Recommended capture trajectories
Factors affecting capture (surface, illumination)
A photograph captures the interaction between a source of illumination and a surface, or more usually between diverse sources of illumination and diverse surfaces.