Introduction {#intro}
============

OpenCV (Open Source Computer Vision Library: <http://opencv.org>) is an open-source BSD-licensed
library that includes several hundred computer vision algorithms. This document describes the
so-called OpenCV 2.x API, which is essentially a C++ API, as opposed to the C-based OpenCV 1.x API.
The latter is described in opencv1x.pdf.

OpenCV has a modular structure, which means that the package includes several shared or static
libraries. The following modules are available:

- @ref core - a compact module defining basic data structures, including the dense
  multi-dimensional array Mat and basic functions used by all other modules.
- @ref imgproc - an image processing module that includes linear and non-linear image filtering,
  geometrical image transformations (resize, affine and perspective warping, generic table-based
  remapping), color space conversion, histograms, and so on.
- **video** - a video analysis module that includes motion estimation, background subtraction,
  and object tracking algorithms.
- **calib3d** - basic multiple-view geometry algorithms, single and stereo camera calibration,
  object pose estimation, stereo correspondence algorithms, and elements of 3D reconstruction.
- **features2d** - salient feature detectors, descriptors, and descriptor matchers.
- **objdetect** - detection of objects and instances of predefined classes (for example,
  faces, eyes, mugs, people, cars, and so on).
- **highgui** - an easy-to-use interface to simple UI capabilities.
- **videoio** - an easy-to-use interface to video capturing and video codecs.
- **gpu** - GPU-accelerated algorithms from different OpenCV modules.
- ... some other helper modules, such as FLANN and Google test wrappers, Python bindings, and
  others.

The further chapters of the document describe the functionality of each module.
But first, make sure to
get familiar with the common API concepts used throughout the library.

API Concepts
------------

### cv Namespace

All the OpenCV classes and functions are placed into the cv namespace. Therefore, to access this
functionality from your code, use the cv:: specifier or the using namespace cv; directive:
@code
    #include "opencv2/core.hpp"
    ...
    cv::Mat H = cv::findHomography(points1, points2, CV_RANSAC, 5);
    ...
@endcode
or:
@code
    #include "opencv2/core.hpp"
    using namespace cv;
    ...
    Mat H = findHomography(points1, points2, CV_RANSAC, 5);
    ...
@endcode
Some of the current or future OpenCV external names may conflict with STL or other libraries. In
this case, use explicit namespace specifiers to resolve the name conflicts:
@code
    Mat a(100, 100, CV_32F);
    randu(a, Scalar::all(1), Scalar::all(std::rand()));
    cv::log(a, a);
    a /= std::log(2.);
@endcode

### Automatic Memory Management

OpenCV handles all the memory automatically.

First of all, std::vector, Mat, and other data structures used by the functions and methods have
destructors that deallocate the underlying memory buffers when needed. However, the destructors
do not always deallocate the buffers, as in the case of Mat. They take into account possible
data sharing. A destructor decrements the reference counter associated with the matrix data buffer.
The buffer is deallocated if and only if the reference counter reaches zero, that is, when no other
structures refer to the same buffer. Similarly, when a Mat instance is copied, no actual data is
really copied. Instead, the reference counter is incremented to memorize that there is another owner
of the same data. There is also the Mat::clone method that creates a full copy of the matrix data.
See the example below:
@code
    // create a big 8MB matrix
    Mat A(1000, 1000, CV_64F);

    // create another header for the same matrix;
    // this is an instant operation, regardless of the matrix size.
    Mat B = A;
    // create another header for the 3rd row of A; no data is copied either
    Mat C = B.row(3);
    // now create a separate copy of the matrix
    Mat D = B.clone();
    // copy the 5th row of B to C, that is, copy the 5th row of A
    // to the 3rd row of A.
    B.row(5).copyTo(C);
    // now let A and D share the data; after that the modified version
    // of A is still referenced by B and C.
    A = D;
    // now make B an empty matrix (which references no memory buffers),
    // but the modified version of A will still be referenced by C,
    // despite that C is just a single row of the original A
    B.release();

    // finally, make a full copy of C. As a result, the big modified
    // matrix will be deallocated, since it is not referenced by anyone
    C = C.clone();
@endcode
You see that the use of Mat and other basic structures is simple. But what about high-level classes
or even user data types created without taking automatic memory management into account? For them,
OpenCV offers the Ptr template class that is similar to std::shared\_ptr from C++11. So, instead of
using plain pointers:
@code
    T* ptr = new T(...);
@endcode
you can use:
@code
    Ptr<T> ptr(new T(...));
@endcode
or:
@code
    Ptr<T> ptr = makePtr<T>(...);
@endcode
Ptr\<T\> encapsulates a pointer to a T instance and a reference counter associated with the pointer.
See the Ptr description for details.

### Automatic Allocation of the Output Data

OpenCV deallocates the memory automatically, as well as automatically allocates the memory for
output function parameters most of the time.
So, if a function has one or more input arrays (cv::Mat
instances) and some output arrays, the output arrays are automatically allocated or reallocated. The
size and type of the output arrays are determined from the size and type of the input arrays. If
needed, the functions take extra parameters that help figure out the output array properties.

Example:
@code
    #include "opencv2/imgproc.hpp"
    #include "opencv2/highgui.hpp"

    using namespace cv;

    int main(int, char**)
    {
        VideoCapture cap(0);
        if(!cap.isOpened()) return -1;

        Mat frame, edges;
        namedWindow("edges", 1);
        for(;;)
        {
            cap >> frame;
            cvtColor(frame, edges, COLOR_BGR2GRAY);
            GaussianBlur(edges, edges, Size(7,7), 1.5, 1.5);
            Canny(edges, edges, 0, 30, 3);
            imshow("edges", edges);
            if(waitKey(30) >= 0) break;
        }
        return 0;
    }
@endcode
The array frame is automatically allocated by the \>\> operator since the video frame resolution and
the bit-depth are known to the video capturing module. The array edges is automatically allocated by
the cvtColor function. It has the same size and bit-depth as the input array. The number of
channels is 1 because the color conversion code COLOR\_BGR2GRAY is passed, which means a color to
grayscale conversion. Note that frame and edges are allocated only once during the first execution
of the loop body since all the next video frames have the same resolution. If you somehow change the
video resolution, the arrays are automatically reallocated.

The key component of this technology is the Mat::create method. It takes the desired array size and
type. If the array already has the specified size and type, the method does nothing. Otherwise, it
releases the previously allocated data, if any (this part involves decrementing the reference
counter and comparing it with zero), and then allocates a new buffer of the required size.
Most
functions call the Mat::create method for each output array, and so the automatic output data
allocation is implemented.

Some notable exceptions from this scheme are cv::mixChannels, cv::RNG::fill, and a few other
functions and methods. They are not able to allocate the output array, so you have to do this in
advance.

### Saturation Arithmetic

As a computer vision library, OpenCV deals a lot with image pixels that are often encoded in a
compact, 8- or 16-bit per channel, form and thus have a limited value range. Furthermore, certain
operations on images, like color space conversions, brightness/contrast adjustments, sharpening, and
complex interpolation (bi-cubic, Lanczos) can produce values out of the available range. If you just
store the lowest 8 (16) bits of the result, this results in visual artifacts and may affect further
image analysis. To solve this problem, so-called *saturation* arithmetic is used. For
example, to store r, the result of an operation, to an 8-bit image, you find the nearest value
within the 0..255 range:

\f[I(x,y)= \min ( \max (\textrm{round}(r), 0), 255)\f]

Similar rules are applied to 8-bit signed, 16-bit signed and unsigned types. These semantics are used
everywhere in the library. In C++ code, it is done using the saturate\_cast\<\> functions that
resemble standard C++ cast operations. See below the implementation of the formula provided above:
@code
    I.at<uchar>(y, x) = saturate_cast<uchar>(r);
@endcode
where cv::uchar is an OpenCV 8-bit unsigned integer type. In the optimized SIMD code, such SSE2
instructions as paddusb, packuswb, and so on are used. They help achieve exactly the same behavior
as in C++ code.

@note Saturation is not applied when the result is a 32-bit integer.

### Fixed Pixel Types. Limited Use of Templates

Templates are a great feature of C++ that enable the implementation of very powerful, efficient and
yet safe data structures and algorithms. However, the extensive use of templates may dramatically
increase compilation time and code size. Besides, it is difficult to separate an interface from an
implementation when templates are used exclusively. This could be fine for basic algorithms but not
good for computer vision libraries where a single algorithm may span thousands of lines of code.
Because of this, and also to simplify development of bindings for other languages, like Python, Java,
and Matlab, that do not have templates at all or have limited template capabilities, the current OpenCV
implementation is based on polymorphism and runtime dispatching over templates. In those places
where runtime dispatching would be too slow (like pixel access operators), impossible (generic
Ptr\<\> implementation), or just very inconvenient (saturate\_cast\<\>()), the current implementation
introduces small template classes, methods, and functions. Anywhere else, the use of templates is
limited in the current OpenCV version.

Consequently, there is a limited fixed set of primitive data types the library can operate on. That
is, array elements should have one of the following types:

- 8-bit unsigned integer (uchar)
- 8-bit signed integer (schar)
- 16-bit unsigned integer (ushort)
- 16-bit signed integer (short)
- 32-bit signed integer (int)
- 32-bit floating-point number (float)
- 64-bit floating-point number (double)
- a tuple of several elements where all elements have the same type (one of the above). An array
  whose elements are such tuples is called a multi-channel array, as opposed to a
  single-channel array, whose elements are scalar values. The maximum possible number of
  channels is defined by the CV\_CN\_MAX constant, which is currently set to 512.
For these basic types, the following enumeration is applied:
@code
    enum { CV_8U=0, CV_8S=1, CV_16U=2, CV_16S=3, CV_32S=4, CV_32F=5, CV_64F=6 };
@endcode
Multi-channel (n-channel) types can be specified using the following options:

- CV_8UC1 ... CV_64FC4 constants (for a number of channels from 1 to 4)
- CV_8UC(n) ... CV_64FC(n) or CV_MAKETYPE(CV_8U, n) ... CV_MAKETYPE(CV_64F, n) macros when
  the number of channels is more than 4 or unknown at compilation time.

@note `CV_32FC1 == CV_32F`, `CV_32FC2 == CV_32FC(2) == CV_MAKETYPE(CV_32F, 2)`, and
`CV_MAKETYPE(depth, n) == ((depth&7) + ((n-1)<<3))`. This means that the constant type is formed
from the depth, taking the lowest 3 bits, and the number of channels minus 1, taking the next
`log2(CV_CN_MAX)` bits.

Examples:
@code
    Mat mtx(3, 3, CV_32F); // make a 3x3 floating-point matrix
    Mat cmtx(10, 1, CV_64FC2); // make a 10x1 2-channel floating-point
                               // matrix (10-element complex vector)
    Mat img(Size(1920, 1080), CV_8UC3); // make a 3-channel (color) image
                                        // of 1920 columns and 1080 rows.
    Mat grayscale(img.size(), CV_MAKETYPE(img.depth(), 1)); // make a 1-channel image of
                                                            // the same size and same
                                                            // channel type as img
@endcode
Arrays with more complex elements cannot be constructed or processed using OpenCV. Furthermore, each
function or method can handle only a subset of all possible array types. Usually, the more complex
the algorithm is, the smaller the supported subset of formats is. See below typical examples of such
limitations:

- The face detection algorithm only works with 8-bit grayscale or color images.
- Linear algebra functions and most of the machine learning algorithms work with floating-point
  arrays only.
- Basic functions, such as cv::add, support all types.
- Color space conversion functions support 8-bit unsigned, 16-bit unsigned, and 32-bit
  floating-point types.
The subset of supported types for each function has been defined from practical needs and could be
extended in the future based on user requests.

### InputArray and OutputArray

Many OpenCV functions process dense 2-dimensional or multi-dimensional numerical arrays. Usually,
such functions take cv::Mat as parameters, but in some cases it's more convenient to use
std::vector\<\> (for a point set, for example) or Matx\<\> (for a 3x3 homography matrix and such). To
avoid many duplicates in the API, special "proxy" classes have been introduced. The base "proxy"
class is InputArray. It is used for passing read-only arrays as function inputs. The class
OutputArray, derived from InputArray, is used to specify an output array for a function. Normally,
you should not care about those intermediate types (and you should not declare variables of those
types explicitly) - it will all just work automatically. You can assume that instead of
InputArray/OutputArray you can always use Mat, std::vector\<\>, Matx\<\>, Vec\<\> or Scalar. When a
function has an optional input or output array, and you do not have or do not want one, pass
cv::noArray().

### Error Handling

OpenCV uses exceptions to signal critical errors. When the input data has the correct format and
belongs to the specified value range, but the algorithm cannot succeed for some reason (for example,
the optimization algorithm did not converge), it returns a special error code (typically, just a
boolean value).

The exceptions can be instances of the cv::Exception class or its derivatives. In its turn,
cv::Exception is a derivative of std::exception. So it can be gracefully handled in the code using
other standard C++ library components.
The exception is typically thrown either using the CV\_Error(errcode, description) macro, or its
printf-like CV\_Error\_(errcode, printf-spec, (printf-args)) variant, or using the
CV\_Assert(condition) macro that checks the condition and throws an exception when it is not
satisfied. For performance-critical code, there is CV\_DbgAssert(condition) that is only retained in
the Debug configuration. Due to the automatic memory management, all the intermediate buffers are
automatically deallocated in case of a sudden error. You only need to add a try statement to catch
exceptions, if needed:
@code
    try
    {
        ... // call OpenCV
    }
    catch( cv::Exception& e )
    {
        const char* err_msg = e.what();
        std::cout << "exception caught: " << err_msg << std::endl;
    }
@endcode

### Multi-threading and Re-entrancy

The current OpenCV implementation is fully re-entrant. That is, the same function, the same
*constant* method of a class instance, or the same *non-constant* method of different class
instances can be called from different threads. Also, the same cv::Mat can be used in different
threads because the reference-counting operations use the architecture-specific atomic instructions.