[1] C. Wheatstone, “Contributions to the physiology of vision - part the first. on some remarkable, and hitherto unobserved phenomena of binocular vision,” *Philosophical Transactions*, vol. 128, pp. 371–394, 1838.

[2] I. Sexton and P. Surman, “Stereoscopic and autostereoscopic display systems,” *IEEE Signal Processing Magazine*, vol. 16, no. 3, pp. 85–99, May 1999.

[3] A. Redert, R.-P. Berretty, C. Varekamp, O. Willemsen, J. Swillens, and H. Driessen, “Philips 3D solutions: From content creation to visualization,” in *Proceedings of the Third International Symposium on 3D Data Processing, Visualization, and Transmission*, 2006, pp. 429–431.

[4] J. Ilgner, J. J.-H. Park, D. Labbé, and M. Westhofen, “Using a high-definition stereoscopic video system to teach microscopic surgery,” in *Proceedings of the SPIE, Stereoscopic Displays and Virtual Reality Systems XIV*, 2007, vol. 6490, p. 649008.

[5] A. M. Gorski, “User evaluation of a stereoscopic display for space-training applications,” in *Proceedings of the SPIE, Stereoscopic Displays and Applications III*, 1992, vol. 1669, pp. 236–243.

[6] D. Drascic, “Skill acquisition and task performance in teleoperation using monoscopic and stereoscopic video remote viewing,” in *Proceedings of the Human Factors Society 35th Annual Meeting*, 1991, pp. 1367–1371.

[7] N. Inamoto and H. Saito, “Free viewpoint video synthesis and presentation from multiple sporting videos,” in *IEEE International Conference on Multimedia and Expo*, 2005, p. 4.

[8] C. L. Zitnick, S. B. Kang, M. Uyttendaele, S. Winder, and R. Szeliski, “High-quality video view interpolation using a layered representation,” *ACM Transactions on Graphics*, vol. 23, no. 3, pp. 600–608, 2004.

[9] W. Matusik and H. Pfister, “3D TV: A scalable system for real-time acquisition, transmission, and autostereoscopic display of dynamic scenes,” *ACM Transactions on Graphics*, vol. 23, no. 3, pp. 814–824, 2004.

[10] A. Vetro, P. Pandit, H. Kimata, A. Smolic, and Y.-K. Wang, “Joint draft 8.0 on multiview video coding.” Joint Video Team (JVT) of ISO/IEC JTC1/SC29/WG11 (MPEG) and ITU-T SG16 Q.6 (VCEG), Hannover, Germany, July 2008.

[11] U. Fecker and A. Kaup, “H.264/AVC compatible coding of dynamic light fields using transposed picture ordering,” in *Proceedings of the European Signal Processing Conference (EUSIPCO)*, 2005, vol. 1.

[12] P. Merkle, K. Mueller, A. Smolic, and T. Wiegand, “Efficient compression of multi-view video exploiting inter-view dependencies based on H.264/MPEG4-AVC,” in *IEEE International Conference on Multimedia and Expo*, 2006, pp. 1717–1720.

[13] J. H. Kim *et al.*, “New coding tools for illumination and focus mismatch compensation in multiview video coding,” *IEEE Transactions on Circuits and Systems for Video Technology*, vol. 17, no. 11, pp. 1519–1535, 2007.

[14] C. Fehn, “Depth-image-based rendering (DIBR), compression, and transmission for a new approach on 3D-TV,” in *Proceedings of the SPIE, Stereoscopic Displays and Virtual Reality Systems XI*, San Jose, USA, 2004, vol. 5291, pp. 93–104.

[15] A. Bourge and C. Fehn, “White paper on ISO/IEC 23002-3 auxiliary video data representations.” ISO/IEC JTC1/SC29/WG11/N8039, Montreux, Switzerland, April 2006.

[16] H. Schirmacher, “Efficient acquisition, representation, and rendering of light fields,” PhD thesis, Universität des Saarlandes, 2003.

[17] Y. Morvan, D. Farin, and P. H. N. de With, “Design considerations for a 3D-TV video coding architecture,” in *IEEE International Conference on Consumer Electronics*, 2008.

[18] Y. Morvan, D. Farin, and P. H. N. de With, “System architecture for Free-Viewpoint Video and 3D-TV,” *IEEE Transactions on Consumer Electronics*, vol. 54, no. 2, pp. 925–932, 2008.

[19] Y. Morvan, D. Farin, and P. H. N. de With, “Design considerations for view interpolation in a 3D video coding framework,” in *27th Symposium on Information Theory in the Benelux*, 2006, vol. 1, pp. 93–100.

[20] Y. Morvan, P. H. N. de With, and D. Farin, “Platelet-based coding of depth maps for the transmission of multiview images,” in *Proceedings of the SPIE, Stereoscopic Displays and Virtual Reality Systems XIII*, 2006, vol. 6055, p. 60550K.

[21] Y. Morvan, D. Farin, and P. H. N. de With, “Incorporating depth-image based view-prediction into H.264 for multiview-image coding,” in *IEEE International Conference on Image Processing*, 2007, vol. I, pp. I–205–I–208.

[22] Y. Morvan, D. Farin, and P. H. N. de With, “Predictive coding of depth images across multiple views,” in *Proceedings of the SPIE, Stereoscopic Displays and Virtual Reality Systems XIV*, 2007, vol. 6490, p. 64900P.

[23] Y. Morvan, D. Farin, and P. H. N. de With, “Multiview depth-image compression using an extended H.264 encoder,” in *Lecture Notes in Computer Science: Advanced Concepts for Intelligent Vision Systems*, 2007, vol. 4678, pp. 675–686.

[24] Y. Morvan, D. Farin, and P. H. N. de With, “Coding of depth-maps using piecewise linear functions,” in *26th Symposium on Information Theory in the Benelux*, 2005, pp. 121–128.

[25] Y. Morvan, D. Farin, and P. H. N. de With, “Novel coding technique for depth images using quadtree decomposition and plane approximation,” in *Proceedings of the SPIE, Visual Communications and Image Processing*, 2005, vol. 5960, pp. 1187–1194.

[26] Y. Morvan, D. Farin, and P. H. N. de With, “Coding depth images with piecewise linear functions for multi-view synthesis,” in *Proceedings of the European Signal Processing Conference (EUSIPCO)*, 2005.

[27] Y. Morvan, D. Farin, and P. H. N. de With, “Depth-image compression based on an R-D optimized quadtree decomposition for the transmission of multiview images,” in *IEEE International Conference on Image Processing*, 2007, vol. 5, pp. V–105–V–108.

[28] P. Merkle, Y. Morvan, A. Smolic, K. Mueller, P. H. N. de With, and T. Wiegand, “The effect of depth compression on multi-view rendering quality,” in *IEEE 3DTV Conference: The True Vision - Capture, Transmission and Display of 3D Video*, 2008, pp. 245–248.

[29] P. Merkle, Y. Morvan, A. Smolic, K. Mueller, P. H. N. de With, and T. Wiegand, “The effects of multiview depth video compression on multiview rendering,” *Signal Processing: Image Communication*, vol. 24, nos. 1-2, pp. 73–88, January 2009.

[30] Y. Morvan, D. Farin, and P. H. N. de With, “Joint depth/texture bit-allocation for multi-view video compression,” in *Picture Coding Symposium*, 2007.

[31] T. Thormählen and H. Broszio, “Automatic line-based estimation of radial lens distortion,” *Integrated Computer-Aided Engineering*, vol. 12, no. 2, pp. 177–190, 2005.

[32] “Updated call for proposals on multi-view video coding.” Joint Video Team, ISO/IEC JTC1/SC29/WG11 MPEG2005/N7567, Nice, France, October 2005.

[33] Z. Zhang, “A flexible new technique for camera calibration,” *IEEE Transactions on Pattern Analysis and Machine Intelligence*, vol. 22, no. 11, pp. 1330–1334, 2000.

[34] C. Shu, A. Brunton, and M. Fiala, “Automatic grid finding in calibration patterns using Delaunay triangulation,” National Research Council, Institute for Information Technology, Montreal, Canada, NRC-46497/ERB-1104, Aug. 2003.

[35] R. Hartley and A. Zisserman, *Multiple View Geometry in Computer Vision*. Cambridge University Press, 2004.

[36] A. Fusiello, E. Trucco, and A. Verri, “A compact algorithm for rectification of stereo pairs,” *Machine Vision and Applications*, vol. 12, no. 1, pp. 16–22, 2000.

[37] D. Scharstein, R. Szeliski, and R. Zabih, “A taxonomy and evaluation of dense two-frame stereo correspondence algorithms,” in *IEEE Workshop on Stereo and Multi-Baseline Vision*, 2001, pp. 131–140.

[38] F. Devernay, “Vision stéréoscopique et propriétés différentielles des surfaces” (Stereoscopic vision and differential properties of surfaces), PhD thesis, Ecole Polytechnique, Palaiseau, France, 1997.

[39] S. Birchfield and C. Tomasi, “A pixel dissimilarity measure that is insensitive to image sampling,” *IEEE Transactions on Pattern Analysis and Machine Intelligence*, vol. 20, no. 4, pp. 401–406, 1998.

[40] T. Kanade and M. Okutomi, “A stereo matching algorithm with an adaptive window: Theory and experiment,” *IEEE Transactions on Pattern Analysis and Machine Intelligence*, vol. 16, no. 9, pp. 920–932, 1994.

[41] A. Fusiello, V. Roberto, and E. Trucco, “Efficient stereo with multiple windowing,” in *IEEE Conference on Computer Vision and Pattern Recognition*, 1997, pp. 858–863.

[42] S. B. Kang, R. Szeliski, and J. Chai, “Handling occlusions in dense multi-view stereo,” in *IEEE Conference on Computer Vision and Pattern Recognition*, 2001, vol. 1, pp. I–103–I–110.

[43] H. Tao, H. S. Sawhney, and R. Kumar, “A global matching framework for stereo computation,” in *IEEE International Conference on Computer Vision*, 2001, vol. 1, pp. 532–539.

[44] M. Bleyer and M. Gelautz, “A layered stereo matching algorithm using image segmentation and global visibility constraints,” *ISPRS Journal of Photogrammetry and Remote Sensing*, vol. 59, no. 3, pp. 128–150, 2005.

[45] H. Tao and H. S. Sawhney, “Global matching criterion and color segmentation based stereo,” in *IEEE Workshop on Applications of Computer Vision*, 2000, pp. 246–253.

[46] S. Birchfield and C. Tomasi, “Depth discontinuities by pixel-to-pixel stereo,” *International Journal of Computer Vision*, vol. 35, no. 3, pp. 269–293, 1999.

[47] Y. Boykov, O. Veksler, and R. Zabih, “Fast approximate energy minimization via graph cuts,” *IEEE Transactions on Pattern Analysis and Machine Intelligence*, vol. 23, no. 11, pp. 1222–1239, 2001.

[48] V. Kolmogorov and R. Zabih, “Computing visual correspondence with occlusions via graph cuts,” in *IEEE International Conference on Computer Vision*, 2001, vol. 2, pp. 508–515.

[49] J. Sun, N.-N. Zheng, and H.-Y. Shum, “Stereo matching using belief propagation,” *IEEE Transactions on Pattern Analysis and Machine Intelligence*, vol. 25, no. 7, pp. 787–800, 2003.

[50] C. L. Zitnick and S. B. Kang, “Stereo for image-based rendering using image over-segmentation,” *International Journal of Computer Vision*, vol. 75, no. 1, pp. 49–65, October 2007.

[51] C. Cigla, X. Zabulis, and A. A. Alatan, “Region-based dense depth extraction from multi-view video,” in *IEEE International Conference on Image Processing*, 2007, vol. 5, pp. V–213–V–216.

[52] A. F. Bobick and S. S. Intille, “Large occlusion stereo,” *International Journal of Computer Vision*, vol. 33, no. 3, pp. 181–200, 1999.

[53] I. J. Cox, S. L. Hingorani, S. B. Rao, and B. M. Maggs, “A maximum likelihood stereo algorithm,” *Computer Vision and Image Understanding*, vol. 63, no. 3, pp. 542–567, 1996.

[54] P. Kauff *et al.*, “Depth map creation and image-based rendering for advanced 3DTV services providing interoperability and scalability,” *Signal Processing: Image Communication*, vol. 22, no. 2, pp. 217–234, 2007.

[55] S. Yea and A. Vetro, “CE3: Study on depth issues.” ISO/IEC JTC1/SC29/WG11 and ITU-T SG16 Q.6, JVT-X073, Geneva, Switzerland, 2007.

[56] J.-X. Chai, S.-C. Chan, H.-Y. Shum, and X. Tong, “Plenoptic sampling,” in *International Conference on Computer Graphics and Interactive Techniques (ACM SIGGRAPH)*, 2000, pp. 307–318.

[57] J. B. Roerdink and A. Meijster, “The watershed transform: Definitions, algorithms and parallelization strategies,” *Fundamenta Informaticae*, vol. 41, 2000.

[58] M. A. Fischler and R. C. Bolles, “Random sample consensus: A paradigm for model fitting with applications to image analysis and automated cartography,” *Communications of the ACM*, vol. 24, no. 6, pp. 381–395, 1981.

[59] M. Levoy and P. Hanrahan, “Light field rendering,” in *International Conference on Computer Graphics and Interactive Techniques (ACM SIGGRAPH)*, 1996, pp. 31–42.

[60] H.-Y. Shum and S. B. Kang, “A review of image-based rendering techniques,” in *Proceedings of the SPIE, Visual Communications and Image Processing*, 2000, vol. 4067, pp. 2–13.

[61] S. J. Gortler, R. Grzeszczuk, R. Szeliski, and M. F. Cohen, “The lumigraph,” in *International Conference on Computer Graphics and Interactive Techniques (ACM SIGGRAPH)*, 1996, pp. 43–54.

[62] P. E. Debevec, G. Borshukov, and Y. Yu, “Efficient view-dependent image-based rendering with projective texture-mapping,” in *Proceedings of the 9th Eurographics Workshop on Rendering*, 1998.

[63] J. Shade, S. Gortler, L.-w. He, and R. Szeliski, “Layered depth images,” in *International Conference on Computer Graphics and Interactive Techniques (ACM SIGGRAPH)*, 1998, pp. 231–242.

[64] S. M. Seitz and C. R. Dyer, “View morphing,” in *International Conference on Computer Graphics and Interactive Techniques (ACM SIGGRAPH)*, 1996, pp. 21–30.

[65] S. Würmlin, E. Lamboray, and M. Gross, “3D video fragments: Dynamic point samples for real-time free-viewpoint video,” in *Computers and Graphics, Special Issue on Coding, Compression and Streaming Techniques for 3D and Multimedia Data*, 2004, pp. 3–14.

[66] L. McMillan, “An image-based approach to three-dimensional computer graphics,” PhD thesis, University of North Carolina, Chapel Hill, USA, 1997.

[67] B. Heigl, R. Koch, M. Pollefeys, J. Denzler, and L. J. V. Gool, “Plenoptic modeling and rendering from image sequences taken by hand-held camera,” in *DAGM-Symposium (Deutsche Arbeitsgemeinschaft für Mustererkennung)*, 1999, pp. 94–101.

[68] K. Pulli, M. Cohen, T. Duchamp, H. Hoppe, L. Shapiro, and W. Stuetzle, “View-based rendering: Visualizing real objects from scanned range and color data,” in *Proceedings of the Eighth Eurographics Workshop on Rendering*, 1997, pp. 23–34.

[69] D. Farin, Y. Morvan, and P. H. N. de With, “View interpolation along a chain of weakly calibrated cameras,” in *IEEE Workshop on Content Generation and Coding for 3D-Television*, 2006.

[70] P. Merkle, A. Smolic, K. Mueller, and T. Wiegand, “Multi-view video plus depth representation and coding,” in *IEEE International Conference on Image Processing*, 2007, vol. 1, pp. 201–204.

[71] M. M. Oliveira, “Relief texture mapping,” PhD thesis, University of North Carolina, Chapel Hill, USA, 2000.

[72] G. Wolberg, *Digital image warping*. IEEE Computer Society Press, 1990.

[73] S. Laveau and O. Faugeras, “3-D scene representation as a collection of images,” in *International Conference on Pattern Recognition*, 1994, vol. 1, pp. 689–691.

[74] J. Stolfi, *Oriented Projective Geometry*. Academic Press, Elsevier, 1991.

[75] W. R. Mark, L. McMillan, and G. Bishop, “Post-rendering 3D warping,” in *Symposium on Interactive 3D Graphics*, 1997, pp. 7–16.

[76] “Information technology - MPEG video technologies - Part 3: Representation of auxiliary data and supplemental information.” International Standard ISO/IEC 23002-3:2007, January 2007.

[77] A. Vetro and F. Bruls, “Summary of BoG discussions on FTV.” ISO/IEC JTC1/SC29/WG11 and ITU-T SG16 Q.6, JVT-Y087, Shenzhen, China, October 2007.

[78] M. Magnor, P. Ramanathan, and B. Girod, “Multi-view coding for image based rendering using 3-D scene geometry,” *IEEE Transactions on Circuits and Systems for Video Technology*, vol. 13, no. 11, pp. 1092–1106, November 2003.

[79] E. Martinian, A. Behrens, J. Xin, and A. Vetro, “View synthesis for multiview video compression,” in *Picture Coding Symposium*, 2006.

[80] Y. Chen, Y.-K. Wang, K. Ugur, M. M. Hannuksela, J. Lainema, and M. Gabbouj, “The emerging MVC standard for 3D video services,” *EURASIP Journal on Advances in Signal Processing*, no. 1, January 2009.

[81] A. Kaup and U. Fecker, “Analysis of multi-reference block matching for multi-view video coding,” in *Proceedings of the 7th Workshop on Digital Broadcasting*, 2006, pp. 33–39.

[82] J.-R. Ohm, “Stereo/multiview video encoding using the MPEG family of standards,” in *Proceedings of the SPIE, Stereoscopic Displays and Virtual Reality Systems VI*, 1999, vol. 3639, pp. 242–253.

[83] P. Merkle, A. Smolic, K. Mueller, and T. Wiegand, “Comparative study of MVC structures.” ISO/IEC JTC1/SC29/WG11 and ITU-T SG16 Q.6, JVT-V132, Marrakech, Morocco, January 2007.

[84] P. Merkle, A. Smolic, K. Mueller, and T. Wiegand, “Efficient prediction structures for multiview video coding,” *IEEE Transactions on Circuits and Systems for Video Technology*, vol. 17, no. 11, pp. 1461–1473, Nov. 2007.

[85] Y. Chen, P. Pandit, and S. Yea, “Study Text of ISO/IEC 14496-5:2001/PDAM 15 Reference Software for Multiview Video Coding.” ISO/IEC JTC1/SC29/WG11 and ITU-T SG16 Q.6, Busan, South Korea, October 2008.

[86] L. Aimar *et al.*, “x264 - a free H.264/AVC encoder.” http://www.videolan.org/developers/x264.html, last visited: January 2009.

[87] P. Merkle, A. Smolic, K. Mueller, and T. Wiegand, “MVC: Experiments on Coding of Multi-view Video plus Depth.” ISO/IEC JTC1/SC29/WG11 and ITU-T SG16 Q.6, JVT-X064, Geneva, Switzerland, June 2007.

[88] B. Girod, “The efficiency of motion-compensating prediction for hybrid coding of video sequences,” *IEEE Journal on Selected Areas in Communications*, vol. 5, no. 7, pp. 1140–1154, 1987.

[89] M. Flierl, A. Mavlankar, and B. Girod, “Motion and disparity compensated coding for multi-view video,” *IEEE Transactions on Circuits and Systems for Video Technology*, vol. 17, no. 7, pp. 1474–1484, 2007.

[90] Y. Su, A. Vetro, and A. Smolic, “Common test conditions for multiview video coding.” ISO/IEC JTC1/SC29/WG11 and ITU-T SG16 Q.6, JVT-U211, Hangzhou, China, October 2006.

[91] C. Fehn *et al.*, “An advanced 3DTV concept providing interoperability and scalability for a wide range of multi-baseline geometries,” in *IEEE International Conference on Image Processing*, 2006, pp. 2961–2964.

[92] R. Krishnamurthy, B.-B. Chai, H. Tao, and S. Sethuraman, “Compression and transmission of depth maps for image-based rendering,” in *IEEE International Conference on Image Processing*, 2001, vol. 3, pp. 828–831.

[93] M. Maitre and M. N. Do, “Joint encoding of the depth image based representation using shape-adaptive wavelets,” in *IEEE International Conference on Image Processing*, 2008, vol. 1, pp. 1768–1771.

[94] C. Fehn, K. Schuur, P. Kauff, and A. Smolic, “Coding results for EE4 in MPEG 3DAV.” ISO/IEC JTC 1/SC 29/WG 11, MPEG03/M9561, March 2003.

[95] D. Tzovaras, N. Grammalidis, and M. G. Strintzis, “Disparity field and depth map coding for multiview image sequence,” in *IEEE International Conference on Image Processing*, 1996, vol. 2, pp. 887–890.

[96] B.-B. Chai, S. Sethuraman, and H. S. Sawhney, “A depth map representation for real-time transmission and view-based rendering of a dynamic 3D scene,” in *First International Symposium on 3D Data Processing, Visualization and Transmission*, 2002, pp. 107–114.

[97] D. Donoho, “Wedgelets: Nearly minimax estimation of edges,” *Annals of Statistics*, vol. 27, no. 3, pp. 859–897, March 1999.

[98] R. M. Willett and R. D. Nowak, “Platelets: A multiscale approach for recovering edges and surfaces in photon-limited medical imaging,” *IEEE Transactions on Medical Imaging*, vol. 22, no. 3, pp. 332–350, 2003.

[99] R. Shukla, P. L. Dragotti, M. N. Do, and M. Vetterli, “Rate-distortion optimized tree-structured compression algorithms for piecewise polynomial images,” *IEEE Transactions on Image Processing*, vol. 14, no. 3, pp. 343–359, 2005.

[100] P. Prandoni, “Optimal segmentation techniques for piecewise stationary signals,” PhD thesis, Ecole Polytechnique Fédérale de Lausanne, Lausanne, Switzerland, 1999.

[101] P. A. Chou, T. D. Lookabaugh, and R. M. Gray, “Optimal pruning with applications to tree-structured source coding and modeling,” *IEEE Transactions on Information Theory*, vol. 35, no. 2, pp. 299–315, March 1989.

[102] A. Ortega and K. Ramchandran, “Rate-distortion methods for image and video compression,” *IEEE Signal Processing Magazine*, vol. 15, no. 6, pp. 23–50, 1998.

[103] E. L. Pennec and S. Mallat, “Sparse geometric image representations with bandelets,” *IEEE Transactions on Image Processing*, vol. 14, no. 4, pp. 423–438, 2005.

[104] G. Peyré and S. Mallat, “Discrete bandelets with geometric orthogonal filters,” in *IEEE International Conference on Image Processing*, 2005, vol. 1, pp. I–65–8.

[105] M. N. Do and M. Vetterli, “The contourlet transform: An efficient directional multiresolution image representation,” *IEEE Transactions on Image Processing*, vol. 14, no. 12, pp. 2091–2106, 2005.

[106] M. N. Do and M. Vetterli, “The finite ridgelet transform for image representation,” *IEEE Transactions on Image Processing*, vol. 12, no. 1, pp. 16–28, 2003.

[107] M. Adams and F. Kossentini, “JasPer: A software-based JPEG-2000 codec implementation,” in *IEEE International Conference on Image Processing*, 2000, vol. 2, pp. 53–56.

[108] T. Koga, K. Iinuma, A. Hirano, Y. Iijima, and T. Ishiguro, “Motion Compensated Interframe Coding for Video Conferencing,” in *Proceedings of the National Telecommunications Conference*, 1981, vol. 4, pp. G5.3.1–G5.3.5.

[109] M. Op de Beeck, E. Fert, C. Fehn, and P. Kauff, “Broadcast Requirements on 3D Video Coding.” ISO/IEC JTC1/SC29/WG11 MPEG02/M8040, Cheju, Korea, March 2002.

[110] A. Smolic, “3D Video and Free Viewpoint Video - Technologies, Applications, and MPEG Standards.” IEEE Workshop on Content Generation and Coding for 3D-Television, Eindhoven, The Netherlands, June 2006.

[111] C. L. Zitnick, S. B. Kang, M. Uyttendaele, S. Winder, and R. Szeliski, “Microsoft Research 3D Video Download.” http://research.microsoft.com/en-us/um/people/sbkang/3dvideodownload, last visited: January 2009.

Video coding standards only define the decoding procedure and the corresponding bit stream, but do not specify the encoding algorithms.

The resolution and frame rate of the “Breakdancers” sequence are \(1024 \times 768\) pixels and \(15\) frames per second, respectively.

For clarity, the image coordinate axes are labeled in lower case and the world coordinate axes are labeled in upper case.

For simplicity, we denote \((\boldsymbol{h}^m)^T=\boldsymbol{h}^{mT}\), according to the notation used in [35].

Note that the first multiplication represents an inner product and the second equation leads to a scalar.

Assuming a translational motion.

Sharing the result of the cost function can be done to enforce spatially consistent disparity values.

In this case, the symbol \(\phi\) does not represent an empty set but an undefined element.

This selected reference view is left out from the data set for rendering.

The discussed alternative method [51] presents the depth image of a different viewpoint and time. However, the properties of the sequence do not vary over time and across the views, so that a subjective comparison is still possible.

This remark is similar to a statement of Jean le Rond d’Alembert, a French mathematician: “Algebra is generous; she often gives more than is asked of her.”

The resolution of the “Breakdancers” sequence is \(1024 \times 768\) pixels and the frame rate is 15 frames per second. The compression is performed using an H.264/MPEG-4 AVC encoder with the main profile.

In [89], the original MPEG sequences were down-sampled, so that a direct comparison with the results presented within MVC is not possible.

The period of inserted intra-coded frames corresponds to the Group Of Pictures (GOP) size.

At the time the presented experiments were performed and published [21]–[23], a simulcast compression constituted the anchor for the coding performance comparisons [32]. In 2007, the reference software JMVM became the anchor for comparisons [85], [90].

To derive this equation, the following properties of series are needed: \(\sum_{i=1}^n i = \frac{n(n+1)}{2}\) and \(\sum_{i=1}^n i^2 = \frac{n(n+1)(2n+1)}{6}\), where the index \(i\) runs over the parameter \(x\) or \(y\).
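These two closed-form identities are easy to verify numerically; the following minimal sketch (an illustration added here, not part of the original derivation) checks them against direct summation for a range of \(n\):

```python
# Sanity check of the series identities sum(i) = n(n+1)/2 and
# sum(i^2) = n(n+1)(2n+1)/6 used in the derivation above.

def sum_i(n):
    """Closed form of 1 + 2 + ... + n."""
    return n * (n + 1) // 2

def sum_i2(n):
    """Closed form of 1^2 + 2^2 + ... + n^2."""
    return n * (n + 1) * (2 * n + 1) // 6

# Compare the closed forms against brute-force summation.
for n in range(1, 200):
    assert sum_i(n) == sum(range(1, n + 1))
    assert sum_i2(n) == sum(i * i for i in range(1, n + 1))
```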

“Breakdancers” and “Ballet” depth image number 0 of camera 0. Note that the complexity of the depth images does not vary significantly over time and across the views, so that including more depth images would not change the results.