
Heuristic Ray Shooting Algorithms

by

Vlastimil Havran

Submitted to the Faculty of Electrical Engineering, Czech Technical University, Prague, in partial fulfillment of the requirements for the degree of Doctor.

November 2000 Prague

Copyright © 2000, 2001 by Vlastimil Havran. All rights reserved. No part of this publication may be reproduced in any form by print, photoprint, microfilm or any other means without written permission from the author.

Revision 1.1, 8 May 2001. This minor revision follows the defense of this Ph.D. thesis on 20 April 2001 in Prague and reflects the remarks and comments of the opponents and other people involved. I am indebted for comments to the opponents Assoc. Prof. Eduard Gröller from Vienna University of Technology, Assoc. Prof. László Szirmay-Kalos from the Technical University in Budapest, and Prof. Jaroslav Pokorný from Charles University in Prague. Other comments and remarks were kindly provided by Mateu Sbert and Piero Foscari. The Spanish abstract was kindly provided by Mateu Sbert.

Revision 1.0, 30 November 2000. Submitted to the Faculty of Electrical Engineering, Czech Technical University, Prague, in partial fulfillment of the requirements for the degree of Doctor.

Abstract

Global illumination research aimed at photo-realistic image synthesis pushes forward research in computer graphics as a whole. The computation of visually plausible images is time-consuming and at present far from real time. A significant part of the computation in global illumination algorithms consists of repeatedly evaluating visibility queries. In this thesis we describe our results in ray shooting, a well-known problem in the field of visibility. The problem is difficult in spite of its simple definition: for a given oriented half-line and a set of objects, find the first object intersected by the half-line, if such an object exists. A naïve algorithm has time complexity O(N), where N is the number of objects. The naïve algorithm is practically inapplicable in global illumination applications for scenes with a large number of objects, due to its huge time requirements. In this thesis we deal with heuristic ray shooting algorithms that use additional spatial data structures. We put stress on average-case complexity, and we particularly investigate ray shooting algorithms based on spatial hierarchies.

The thesis deals with two major topics. In the first part, we introduce a ray shooting computation model and a performance model. Based on these two models we develop a methodology for comparing various ray shooting algorithms for a set of experiments performed on a set of scenes. Subsequently, we compare common heuristic ray shooting algorithms based on BSP trees, kd-trees, octrees, bounding volume hierarchies, uniform grids, and three types of hierarchical grids using a set of 30 scenes from the Standard Procedural Database. We show that for this set of scenes the ray shooting algorithm based on the kd-tree is the winning candidate among all tested ray shooting algorithms.

The second and major part of the thesis presents several techniques for decreasing the time and space complexity of ray shooting algorithms based on the kd-tree. We deal with both kd-tree construction and ray traversal algorithms. In the context of kd-tree construction, we present new methods for adaptive construction of the kd-tree using empty spatial regions in the scene, termination criteria, a general cost model for the kd-tree, and modified surface area heuristics for a restricted set of rays. Further, we describe a new version of the recursive ray traversal algorithm. In the context of the recursive ray traversal algorithm based on the kd-tree, we develop the concept of the longest common traversal sequence. This reduces the number of hierarchical traversal steps in the kd-tree for certain ray sets. We also describe one technique closely related to computer architecture, namely mapping kd-tree nodes to memory to increase the cache hit ratio for processors with a large cache line. Most of the techniques proposed in the thesis can be used in combination. In practice, the average time complexity of the ray shooting algorithms based on the kd-tree, as presented in this thesis, is about O(log N), where the hidden multiplicative factor depends on the input data. However, at present this bound is not known to have been proved theoretically for scenes with a general distribution of objects. For this reason our findings are supported by a set of experiments on the above-mentioned set of 30 scenes.


Spanish Abstract

Research in global illumination aimed at the synthesis of realistic images advances research in computer graphics as a whole. Computing visually plausible images is costly and is, for the moment, far from real time. A significant part of the computation in global illumination algorithms involves the repeated computation of visibility queries. In the thesis we describe our results in ray shooting, a well-known problem in the field of visibility. The problem is difficult despite its simple definition: for an oriented half-line and a set of objects, find the first object intersected by the half-line, assuming such an object exists. A naïve algorithm has time complexity O(N), where N is the number of objects. The naïve algorithm is practically inapplicable in global illumination applications for a scene with a large number of objects, owing to its enormous time requirements. In this thesis we deal with heuristic ray shooting algorithms that use spatial data structures. We emphasize average-case complexity and in particular investigate ray shooting algorithms based on spatial hierarchies.

The thesis deals mainly with two topics. In the first part, we introduce a computation model for ray shooting and a performance model. Based on these two models we develop a methodology for comparing different ray shooting algorithms for a set of experiments carried out on a set of scenes. We then compare common ray shooting algorithms based on BSP trees, kd-trees, octrees, bounding volume hierarchies, uniform grids, and three types of hierarchical grids using a set of 30 scenes from the Standard Procedural Database. We show that for this set of scenes the kd-tree is the winning candidate among all the ray shooting algorithms tested.

The second and more extensive part of the thesis presents several techniques for reducing the space and time complexity of ray shooting algorithms based on kd-trees. We deal with both kd-tree construction and ray traversal algorithms. In the context of kd-tree construction, we present new methods for its adaptive construction using empty spatial regions of the scene, termination criteria, a general cost model for the kd-tree, and surface area heuristics modified for a restricted set of rays. In addition, we describe a new version of the recursive ray traversal algorithm. In the context of the recursive ray traversal algorithm based on the kd-tree, we develop the concept of the longest common traversal sequence. This reduces the number of hierarchical traversal steps in the kd-tree for certain sets of rays. We also describe a technique closely related to computer architecture, namely the mapping of kd-tree nodes to memory to increase the cache hit ratio for processors with a large cache line. Most of the techniques proposed in the thesis can be used in combination. In practice, the average time complexity of the ray shooting algorithms based on the kd-tree, as presented in this thesis, is approximately O(log N), where the multiplicative constant depends on the input data. However, this is not currently known to have been proved for scenes with a general distribution of objects. For this reason our findings are supported by a set of experiments on the set of 30 scenes mentioned above.


Czech Abstract

Research on algorithms for photorealistic image synthesis sets the direction of research in computer graphics as a whole. Computing images that can hardly be distinguished from reality is very time-consuming and at present not feasible in real time without special and expensive hardware. A significant part of the computation in image synthesis algorithms consists of evaluating visibility queries. In this dissertation we present our results concerning ray shooting algorithms, a very frequently solved visibility problem. The ray shooting problem is stated as follows: for a given half-line and a set of objects, find the first object that the half-line intersects, if such an object exists. Despite the simplicity of this formulation, an algorithm that solves the problem efficiently is nontrivial. The so-called trivial algorithm has linear time complexity for N objects in the scene. For scenes with a large number of objects this trivial algorithm is unsuitable because of its unacceptable time requirements. In the dissertation we deal with heuristic ray shooting algorithms that use auxiliary spatial data structures, with a focus on average time complexity. We deal in detail with algorithms that use hierarchical data structures.

The dissertation addresses two main topics. In the first part we describe a new computation model and performance model for ray shooting algorithms. Using these two models we introduce a methodology for comparing various ray shooting algorithms for a set of experiments carried out on a set of test scenes. We then compare twelve different ray shooting algorithms that use the data structures of the binary tree, kd-tree, octree, bounding volume hierarchy, uniform grid, and three types of hierarchical grids, on a set of thirty scenes from the Standard Procedural Database. We show that on average the fastest ray shooting algorithm among all those tested is the one that uses the kd-tree.

The second and more extensive part of the dissertation deals with techniques for reducing the time and memory requirements of ray shooting algorithms that use the kd-tree. We deal both with the construction of the kd-tree itself and with algorithms for traversing it for a given input ray. Among the algorithms for kd-tree construction we describe new methods that exploit empty space in the scene, criteria for terminating kd-tree construction, a general cost model for kd-tree construction, and a modified kd-tree construction algorithm that is suitable in terms of time complexity for specific sets of rays. We further describe a new recursive algorithm for kd-tree traversal. Other algorithms concerning the kd-tree include the concept and use of the longest common sequence of leaves or interior nodes of the kd-tree in the traversal algorithm for special sets of rays, which allows us to further reduce the number of hierarchical traversal steps. We also describe a new technique for mapping kd-tree nodes into the main memory of a computer with a long cache line, which increases data coherence during kd-tree traversal. Most of the algorithmic techniques described in the dissertation can be suitably combined. From a practical point of view, the average time complexity of ray shooting algorithms using the kd-tree reaches O(log N), where the multiplicative factor of the asymptotic complexity depends on the input data. However, no theoretical proof of this complexity is currently known for scenes with an arbitrary distribution of objects, and therefore the results for all described algorithms are verified experimentally on the above-mentioned set of thirty test scenes.


Preface

This doctoral thesis presents the research conducted by the author at the Department of Computer Science and Engineering, Faculty of Electrical Engineering, Czech Technical University in Prague during the period 1996–1999 and at the IGP company during the period 1999–2000. My interest in speeding up visibility computations started while working on my Master's thesis, which dealt with the simulation of optical phenomena. After graduating in February 1996 from the Czech Technical University in Prague, I became a Ph.D. student with Pavel Slavík as my supervisor. At the beginning of my Ph.D. studies I devoted my time to parallel solutions for ray tracing, which developed into an interest in a more general problem: ray shooting. I presented my postgraduate study report, Spatial Data Structures for Visibility Computation, in June 1997, and I continued researching in this direction, specializing in ray shooting algorithms over spatial subdivisions.

This doctoral thesis covers several new methods and improvements for ray shooting algorithms based on spatial subdivisions. A description of basic and previously developed methods is followed by a description of newly developed methods that decrease the time and space complexity of a ray shooting algorithm based on kd-trees.

For most of the duration of my Ph.D. studies I was responsible for designing, implementing, and maintaining the GOLEM rendering system [75], developed as independent software, in which the presented ray shooting algorithms were implemented and tested. Several other people also contributed to the development of the GOLEM rendering system: Ph.D. students Jiří Bittner and Tomáš Kopal, and Jan Přikryl from the Vienna University of Technology. Several undergraduate students wrote their Master's theses using or extending the features of the GOLEM rendering system under my supervision: Filip Sixta, Michal Máša, Libor Dachs, Vladimír Nádvorník, Petr Mládek, and Jaroslav Křivánek. Many undergraduate students used GOLEM in their undergraduate term projects, particularly in the course on Visualization given by Assoc. Prof. Jiří Žára.

My research during the period 1996–1999 was partially supported by the following grants:

CTU Grant No. 30/98101/336: Adaptive Data Structures for Visibility Computing, 1998.

FRVŠ Grant No. 1252/1998: Solving Visibility in Large Scenes, which was supervised by Assoc. Prof. Jiří Žára with assistance from Jiří Bittner, 1998.

AKTION Grant No. 1999/17, joint Czech-Austrian scientific collaboration funding between the Vienna University of Technology and the Czech Technical University in Prague, 1999–2000.


Acknowledgements

First of all, I would like to express my gratitude to my thesis supervisor, Assoc. Prof. Pavel Slavík. He has been a constant source of encouragement during my research. Together with Assoc. Prof. Jiří Žára, he provided me with numerous opportunities for professional advancement. They both stimulated my research interests in computer graphics, particularly in the early stages of my Ph.D. study. Without their knowledge and help I really could not have progressed. Assoc. Prof. Pavel Tvrdík made important suggestions and remarks during the course of my work that shaped my life as a researcher.

Many other people active in computer graphics influenced my work. I wish to thank Prof. Václav Skala for the impulse he gave to my research during EGWPR'96, held in September 1996 in Bristol. I am grateful to the computer graphics group at Vienna University of Technology for their kindness and for providing me with access to research papers that were not available in libraries in the Czech Republic, and especially to Prof. Werner Purgathofer, Assoc. Prof. Eduard Gröller, and Jan Přikryl. I would like to express my appreciation to Assoc. Prof. Jiří Matoušek from Charles University in Prague, who gave me invaluable insight into the field of computational geometry.

I am much indebted to many people from the computer graphics lab at our department for their comments on my early ideas and research activities, and for keeping life in our lab running. Namely, my thanks go to Bedřich Beneš, Roman Berka, Pavel Diblík, Petr Felkel, Petr Hejda, Aleš Holeček, Jan Buriánek, and to Jan Vorlíček and Martin Brachtl for maintaining the UNIX systems at the graphics lab. I would like to express my special appreciation to my colleague Jiří Bittner for his encouragement, comments, common interest, and collaboration within the visibility research field.

The staff of our department has provided me with a pleasant and flexible environment for my research. In particular, I would like to thank Prof. Bořivoj Melichar and Assoc. Prof. Josef Kolář, both of them heads of our department, for enabling me to be a Ph.D. student and for providing financial support for my research. For financing the research in the last stage of my Ph.D. study, I owe thanks to the IGP company. Concerning my style of writing, I am much indebted to Robin Healey for English proofreading of the final version of my thesis.

Last, but not least, I would also like to thank all my personal friends who kept my social and cultural life enjoyable during my studies. Finally, my greatest thanks go to my family, whose support was of real importance during all my studies. I would never have finished this thesis without them.


Dedication To all people who have positively influenced my life.

Contents 1 Introduction 1.1 Motivation . . . . . . . . . . . . . . . . . . . . . . 1.2 Problem Statement . . . . . . . . . . . . . . . . . 1.3 Basic Terminology . . . . . . . . . . . . . . . . . 1.4 Related Work/Previous Results . . . . . . . . . . . 1.4.1 Computational Geometry . . . . . . . . . . 1.4.2 Computer Graphics . . . . . . . . . . . . . 1.5 Complexity of RSA . . . . . . . . . . . . . . . . . 1.6 Basic Techniques used in RSAs . . . . . . . . . . . 1.6.1 Bounding Volumes . . . . . . . . . . . . . 1.6.2 Bounding Volume Hierarchies . . . . . . . 1.6.3 Spatial Subdivisions . . . . . . . . . . . . 1.6.3.1 BSP Trees and Kd -Trees . . . . 1.6.3.2 Octrees . . . . . . . . . . . . . . 1.6.3.3 Uniform and Non-Uniform Grids 1.6.3.4 Hierarchical Grids . . . . . . . . 1.6.4 Ray-Space Subdivisions . . . . . . . . . . 1.7 Contribution of the Thesis . . . . . . . . . . . . . 1.8 Organization of the Thesis . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . .

1 1 1 2 5 5 6 8 9 10 10 10 11 14 15 16 17 18 19

2 Comparison Methodology 2.1 Motivation . . . . . . . . . . . . . . . . . . . . . 2.2 RSA Computation Model . . . . . . . . . . . . . 2.3 RSA Performance Model . . . . . . . . . . . . . 2.4 Ideal RSA . . . . . . . . . . . . . . . . . . . . . 2.5 Minimum Testing Output . . . . . . . . . . . . . 2.6 Measuring the Minimum Testing Output . . . . . 2.6.1 Software Tool Profiling . . . . . . . . . . 2.6.2 Multiple Run Profiling . . . . . . . . . . 2.6.2.1 Properties . . . . . . . . . . . 2.6.2.2 Corrected Measuring Subset Θ 2.7 Comparison Methodology . . . . . . . . . . . . 2.8 Discussion . . . . . . . . . . . . . . . . . . . . . 2.9 Conclusion . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . .

. . . . . . . . . . . . .

. . . . . . . . . . . . .

. . . . . . . . . . . . .

. . . . . . . . . . . . .

. . . . . . . . . . . . .

. . . . . . . . . . . . .

. . . . . . . . . . . . .

. . . . . . . . . . . . .

. . . . . . . . . . . . .

. . . . . . . . . . . . .

. . . . . . . . . . . . .

. . . . . . . . . . . . .

. . . . . . . . . . . . .

. . . . . . . . . . . . .

. . . . . . . . . . . . .

. . . . . . . . . . . . .

. . . . . . . . . . . . .

. . . . . . . . . . . . .

21 21 22 24 24 27 28 28 29 30 30 31 33 33

ix

. . . . . . . . . . . . .

3 Best Efficiency Ray Shooting Algorithm 3.1 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . 3.2 Project Goals . . . . . . . . . . . . . . . . . . . . . . . 3.3 Scene Complexity . . . . . . . . . . . . . . . . . . . . . 3.3.1 Count Approach . . . . . . . . . . . . . . . . . 3.3.2 Voxelisation Approach . . . . . . . . . . . . . . 3.3.3 Integral Geometry Approach . . . . . . . . . . . 3.3.3.1 Average Number of Intersection Points 3.3.3.2 Probability of Zero Intersections . . . 3.3.3.3 Free Path Statistics . . . . . . . . . . 3.3.4 Information Theory Approach . . . . . . . . . . 3.4 Testing Procedures . . . . . . . . . . . . . . . . . . . . 3.4.1 Definition of Testing Procedures . . . . . . . . . 3.4.2 Invariants . . . . . . . . . . . . . . . . . . . . . 3.5 Results and Discussion . . . . . . . . . . . . . . . . . . 3.5.1 Test Scenes . . . . . . . . . . . . . . . . . . . . 3.5.2 Results . . . . . . . . . . . . . . . . . . . . . . 3.5.3 Discussion . . . . . . . . . . . . . . . . . . . . 3.5.4 Preliminary RSA Selection Algorithm . . . . . . 3.6 Conclusion and Future Work . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . .

35 35 35 36 36 37 37 38 38 38 38 39 39 40 41 41 41 44 47 48

4 Construction of Kd -Trees 4.1 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4.2 Previous Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4.2.1 Orientation of the Splitting Plane in the Kd -Tree . . . . . . 4.2.2 Positioning of the Splitting Plane . . . . . . . . . . . . . . 4.2.3 Cost Model for Kd -Tree Construction . . . . . . . . . . . . 4.2.3.1 Geometric Probability . . . . . . . . . . . . . . . 4.2.3.2 Basic Cost Model Development . . . . . . . . . . 4.2.3.3 Position of the Splitting Plane . . . . . . . . . . . 4.2.3.4 Position of a Splitting Plane with Minimum Cost . 4.2.4 Termination Criteria . . . . . . . . . . . . . . . . . . . . . 4.2.4.1 Ad Hoc Termination Criteria . . . . . . . . . . . 4.2.4.2 Automatic Termination Criteria . . . . . . . . . . 4.3 Analysis of the Cost Model . . . . . . . . . . . . . . . . . . . . . . 4.3.1 Splitting Geometry of an Axis-Aligned Bounding Box . . . 4.3.2 The Kd -Tree with Minimum Total Cost . . . . . . . . . . . 4.4 Construction of Kd -Trees with Utilization of Empty Spatial Regions 4.4.1 Theoretical Remarks . . . . . . . . . . . . . . . . . . . . . 4.4.2 Early Cutting Off Empty Space . . . . . . . . . . . . . . . 4.4.3 Late Cutting Off Empty Space . . . . . . . . . . . . . . . . 4.4.4 Two-Plane Cutting Off Empty Space . . . . . . . . . . . . . 4.5 Automatic Termination Criteria . . . . . . . . . . . . . . . . . . . . 4.6 Further Results and Problems . . . . . . . . . . . . . . . . . . . . . 4.6.1 Cost Estimate . . . . . . . . . . . . . . . . . . . . . . . . . 4.6.2 Reducing Objects’ Axis-Aligned Bounding Boxes . . . . . 4.7 General Cost Model . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . .

49 49 51 51 52 53 53 54 55 57 58 58 60 60 60 61 62 63 64 64 66 67 69 69 71 72

x

. . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . .

4.7.1 Estimating Blocking Factor . . . . . . . . . . . . 4.7.2 Cost Estimate for Hit and Miss Rays . . . . . . . . 4.8 Preferred Ray Sets . . . . . . . . . . . . . . . . . . . . . 4.8.1 Parallel Projection . . . . . . . . . . . . . . . . . 4.8.2 Perspective Projection . . . . . . . . . . . . . . . 4.8.3 Spherical Projection . . . . . . . . . . . . . . . . 4.8.4 Discussion . . . . . . . . . . . . . . . . . . . . . 4.9 Time Complexity Analysis . . . . . . . . . . . . . . . . . 4.10 Summary of Results and Discussion . . . . . . . . . . . . 4.10.1 Positioning of the Splitting Plane . . . . . . . . . 4.10.2 Termination Criteria . . . . . . . . . . . . . . . . 4.10.3 Cutting Off Empty Space . . . . . . . . . . . . . . 4.10.4 Reducing Objects’ Axis-Aligned Bounding Boxes 4.10.5 General Cost Model . . . . . . . . . . . . . . . . 4.10.6 Preferred Ray Sets . . . . . . . . . . . . . . . . . 4.11 Conclusion and Future Work . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . .

73 74 76 76 77 78 79 80 80 81 82 83 85 86 87 89

5 Ray Traversal Algorithms for Kd -Trees 5.1 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5.2 Basic Terminology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5.3 Previous Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5.3.1 Sequential Ray Traversal Algorithm TAseq . . . . . . . . . . . . . . . . 5.3.2 Recursive Ray Traversal Algorithm TAArec . . . . . . . . . . . . . . . . 5.3.3 Traversal Algorithms with Neighbor-Links . . . . . . . . . . . . . . . 5.3.3.1 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . 5.3.3.2 Ray Traversal Algorithm with Single Neighbor-Links TASNL 5.3.3.3 Ray Traversal Algorithm with Neighbor-Links Trees TANLT . 5.3.3.4 Ray Traversal Algorithm for Neighbor-Links . . . . . . . . . 5.3.3.5 Algorithm Analysis . . . . . . . . . . . . . . . . . . . . . . 5.4 New Recursive Ray Traversal Algorithm . . . . . . . . . . . . . . . . . . . . . 5.4.1 Traversal Classification . . . . . . . . . . . . . . . . . . . . . . . . . . 5.4.2 Analysis of TAArec . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5.4.3 Design of a Recursive Ray Traversal Algorithm TABrec . . . . . . . . . 5.4.3.1 Theoretical Considerations . . . . . . . . . . . . . . . . . . 5.4.3.2 Experimental Statistics . . . . . . . . . . . . . . . . . . . . 5.4.3.3 New Recursive Ray Traversal Algorithm TABrec . . . . . . . . 5.4.3.4 Handling Singular Traversal Cases . . . . . . . . . . . . . . 5.4.4 Comparison between TAArec and TABrec . . . . . . . . . . . . . . . . . . 5.5 Summary of Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5.6 Conclusion and Future Work . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . .

93 93 93 94 94 95 95 95 96 97 98 99 99 99 100 101 101 102 102 103 104 105 108

6 Longest Common Traversal Sequences for Kd -Trees 6.1 Motivation . . . . . . . . . . . . . . . . . . . . . 6.2 Previous Work . . . . . . . . . . . . . . . . . . . 6.3 LCTS Construction . . . . . . . . . . . . . . . . 6.3.1 SLCTS . . . . . . . . . . . . . . . . . . 6.3.2 HLCTS . . . . . . . . . . . . . . . . . .

. . . . .

. . . . .

. . . . .

. . . . .

111 111 112 112 113 113

xi

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

6.4

6.5

6.6

6.3.2.1 Traversal Trees . . . . . . . . . . 6.3.2.2 Constructing Initial HLCTS . . . 6.3.2.3 Constructing General HLCTS . . 6.3.3 Further Improvements . . . . . . . . . . . 6.3.3.1 Unification of Empty Leaves . . 6.3.3.2 Termination Object . . . . . . . 6.3.3.3 Initial Leaf Sequence for HLCTS Application of LCTS . . . . . . . . . . . . . . . . 6.4.1 Patch-to-patch Visibility . . . . . . . . . . 6.4.2 Hidden Surface Removal . . . . . . . . . . Results . . . . . . . . . . . . . . . . . . . . . . . . 6.5.1 Patch-to-patch Visibility . . . . . . . . . . 6.5.2 Hidden Surface Removal . . . . . . . . . . 6.5.3 Discussion . . . . . . . . . . . . . . . . . Conclusion and Future Work . . . . . . . . . . . .

7 Memory Mapping of Kd-Trees
  7.1 Motivation
  7.2 Preliminaries
    7.2.1 Memory Allocation
    7.2.2 Memory Hierarchy
  7.3 Representations of the kd-tree
    7.3.1 Random Representation
    7.3.2 Depth-First-Search (DFS) Representation
    7.3.3 Subtree Representation
  7.4 Time Complexity and Cache Hit Ratio Analysis
    7.4.1 Random Representation
    7.4.2 DFS Representation
    7.4.3 Ordinary Subtree Representation
    7.4.4 Compact Subtree Representation
  7.5 Simulation Results
  7.6 Results of Experiments
  7.7 Discussion
  7.8 Conclusion and Future Work

8 Conclusion and Future Work
  8.1 Summary of Results
  8.2 Suggestions for Further Research

Bibliography

Notation

Appendix
  A – C-pseudocode of TAseq
  B – C-pseudocode of TAArec
  C – C-pseudocode of TABrec
  D – C-pseudocodes of TASNL and TANLT
  E – Results of Experiments

Chapter 1

Introduction

This thesis deals with heuristic ray shooting algorithms, namely algorithms based on spatial subdivisions, and particularly ray shooting algorithms based on the kd-tree. In this introductory chapter we describe the problem in detail, together with the algorithmic solutions developed in the past.

1.1 Motivation

The principal goal of computer graphics is image synthesis of a scene simulating a real environment. The algorithms for image synthesis are based on various principles that influence the quality of their outputs. A significant research effort in image synthesis is devoted to the generation of photo-realistic images. By a photo-realistic image we mean an image indistinguishable from a photograph of the real world. A scene simulating reality is modeled by geometric object primitives in three-dimensional space; it is not exceptional for the number of objects in a scene to reach hundreds of thousands or more. Historically, two main classes of algorithms for photo-realistic rendering have been developed: ray-tracing and radiosity [157, 47]. Both classes are in use and are sometimes combined. Recently, the importance of classical radiosity algorithms based on computing form factors between patches has diminished with the advent of Monte-Carlo methods in global illumination [20, 149]. The common property of all these rendering algorithms is their potentially high time and space complexity. They spend much of their time repeatedly performing visibility computations, either ray shooting or visibility for a pair of points. Visibility computations performed within image synthesis correspond to the discrete sampling of an $n$-dimensional space.

1.2 Problem Statement

The visibility problem for a pair of points is more formally defined as follows: two points $U(x, y, z)$ and $V(x, y, z)$ are mutually visible if the line segment $UV$ with $U$ and $V$ as endpoints does not intersect any object located in the scene. The result of an algorithm solving the problem is of Boolean type: yes (visible) or no (invisible). Computing the visibility for a pair of points is indispensable for correctly determining the illumination and shading of objects by light sources.

The ray shooting problem is a more general visibility problem along a fixed line: for an oriented half-line given by its origin and direction vector, we want to find the closest object intersected by the half-line, if such an object exists. The ray shooting problem is depicted in Fig. 1.1.


Figure 1.1: Ray shooting in $\mathbb{E}^2$ space. The answer for ray $R$ is object $B$.

1.3 Basic Terminology

In this section we describe the basic terminology used within the thesis. The geometry handled in the thesis uses $n$-dimensional Euclidean space, further abbreviated as $\mathbb{E}^n$. The basic geometric primitive used here is a ray. The ray $R$ in $\mathbb{E}^n$ is an oriented half-line determined by its point of origin $O_R$ and direction vector $\vec{D}_R$. Any point $U$ lying on the ray can be computed using the parametric representation of the ray:

$$U = O_R + t \cdot \vec{D}_R, \qquad O_R, \vec{D}_R \in \mathbb{E}^n, \tag{1.1}$$

where the parameter $t$ is called the signed distance ($t \in \mathbb{R}$, $t \ge 0$). It is usually assumed that the direction vector $\vec{D}_R$ is normalized ($\|\vec{D}_R\| = 1$).

An object in $\mathbb{E}^n$ is a finite region of $\mathbb{E}^n$ with continuous $(n-1)$-dimensional boundaries. There are various shapes of objects: spheres, triangles, general polygons, polyhedra, etc. A surface of the object $O$ is its $(n-1)$-dimensional boundary $\partial O$, which does not cross itself. A scene $\mathcal{N}$ is a set of $N$ objects. We require that both the number of objects and the objects themselves are finite. Then we can bound the scene with a finite convex spatial region that contains all the objects.

Given a scene $\mathcal{N}$, we can formulate the pair of points visibility problem more precisely. For two points $U$ and $V$ it is stated as follows: points $U$ and $V$ are mutually visible if the line segment $UV$ with $U$ and $V$ as endpoints does not intersect any object from $\mathcal{N}$. Otherwise, the two points are mutually invisible, i.e., at least one object from $\mathcal{N}$ intersects the line segment $UV$. Similarly, we can describe the ray shooting problem more formally. Given a ray $R$ and the scene $\mathcal{N}$, we want to find the closest object intersected by $R$, if such an object exists. The answer to a ray shooting query is the object $O_i$; most applications also require the intersection point found on $O_i$. In order to distinguish the ray shooting problem from a particular task given by a specific ray $R$ and scene $\mathcal{N}$, we refer to the latter as a ray shooting query.

A ray shooting algorithm (abbreviated to RSA in the thesis) is an algorithm that computes the answer to any ray shooting query for a given scene. Usually, in the preprocessing phase an RSA builds up an auxiliary data structure for the set of scene objects. In the execution phase of the RSA this data structure is accessed while computing the answer to a particular ray shooting query. There is one exceptional RSA that does not perform any preprocessing.
It is called a naïve RSA and it uses only a list of scene objects. When answering a ray shooting query, the naïve RSA tests the ray with all the objects and selects the one with the closest intersection found, if such an object exists. The time complexity of the naïve RSA is thus $O(N)$. In computer graphics applications where many rays are shot, the naïve RSA is applicable only to scenes with a few objects. For a scene with a high number of objects it makes the application run unacceptably slowly. This property of the naïve RSA has motivated the research into RSAs.

Heuristic RSAs are based on data structures that capture the distances between the objects in the scene. Since the data structures store the properties and geometric relationships in a space (usually in $\mathbb{E}^3$), they are usually called spatial data structures. These spatial data structures for various RSAs describe spatial relationships using a set of spatial regions that are called cells. Further we use $\mathcal{C}$ as a symbol for an arbitrary cell. The spatial region(s) covered by the cell is given by the cell boundary $\partial\mathcal{C}$. If a cell separates the space into two disjoint parts, we call it a separating cell. Practically, this means that if we denote the parts induced by the cell $\rho$ and $\tau$, then for any two points $U \in \rho$ and $V \in \tau$ the line segment $UV$ must cross the cell boundary $\partial\mathcal{C}$. An example of a separating cell is an infinite plane that induces two halfspaces. A more common case is a cell whose extent and boundary are finite, in which case we call it a closed cell. Then the interior part $\mathrm{int}(\mathcal{C})$ of the cell is finite and is completely separated from the exterior of the cell, denoted $\mathrm{ext}(\mathcal{C})$. Closed cells are commonly used in the spatial data structures that underlie a particular RSA. For the sake of convenience, the word cell has two meanings: the spatial region covered by the interior part of the cell, and the element of a data structure that represents that spatial region. A common example of a closed cell is an axis-aligned bounding box, which is illustrated in Fig. 1.2.
This is a parallelepiped with six faces (in $\mathbb{E}^3$), where each pair of opposite faces is perpendicular to one of the coordinate axes (the axes of the standard basis of the space). The size of the axis-aligned bounding box is given by two extreme points $A$ and $B$. A point $U$ belongs to the axis-aligned bounding box if $U_i \ge A_i$ and $U_i \le B_i$, assuming $A_i \le B_i$, for $i \in \{x, y, z\}$. We further denote by the symbol $\mathcal{AB}(X)$ the axis-aligned bounding box that tightly encloses entity $X$, where $X$ stands for an object or a cell. For short, $\mathcal{AB}$ is also the abbreviation for an axis-aligned bounding box.

Figure 1.2: Axis-aligned bounding box in $\mathbb{E}^3$.
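The parametric ray of Eq. (1.1) and the point-membership test for an axis-aligned bounding box translate almost directly into code. The following is a minimal sketch in C, restricted to three dimensions; the type and function names (`Vec3`, `Ray`, `Aabb`, `ray_point`, `aabb_contains`) are hypothetical and chosen only for illustration, they are not taken from the thesis.

```c
#include <stdbool.h>

/* Hypothetical types for illustration only. */
typedef struct { double x, y, z; } Vec3;

typedef struct {
    Vec3 origin;   /* point of origin O_R */
    Vec3 dir;      /* direction vector D_R, assumed to be normalized */
} Ray;

typedef struct {
    Vec3 min;      /* extreme point A */
    Vec3 max;      /* extreme point B, with A_i <= B_i on every axis */
} Aabb;

/* Point on the ray at signed distance t >= 0: U = O_R + t * D_R. */
Vec3 ray_point(const Ray *r, double t)
{
    Vec3 u = { r->origin.x + t * r->dir.x,
               r->origin.y + t * r->dir.y,
               r->origin.z + t * r->dir.z };
    return u;
}

/* U belongs to the box if A_i <= U_i <= B_i holds for i in {x, y, z}. */
bool aabb_contains(const Aabb *b, Vec3 u)
{
    return u.x >= b->min.x && u.x <= b->max.x &&
           u.y >= b->min.y && u.y <= b->max.y &&
           u.z >= b->min.z && u.z <= b->max.z;
}
```

For example, for a ray starting at the coordinate origin and pointing along the x axis, `ray_point(&r, 2.0)` yields the point (2, 0, 0), which `aabb_contains` reports as lying inside a box spanning from (1, -1, -1) to (3, 1, 1).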

The cells are organized in spatial data structures. Based on the use of cells in spatial data structures we can distinguish between different types of cells. Further, there can be various relationships between any two cells. Below, we describe some terminology to cover these properties.

We call a closed cell elementary if its interior part does not intersect the interior part of any other elementary cell. An elementary cell is intended to contain the list of pointers to the objects that intersect the spatial region covered by the elementary cell. It can also contain other data, for example, a description of its relationship to other cells located in its neighborhood, its size, position, etc. If the elementary cell contains at least one pointer to an object intersecting the interior of the cell, we call it a full elementary cell. Otherwise, we call it an empty elementary cell. An empty elementary cell is useful in the sense that its spatial region contains no object. This emptiness is a piece of information that a particular RSA can exploit, since no ray-object intersection can occur within an empty elementary cell. If a cell is not elementary, we call it a generic cell. A generic cell can contain references to other generic cells, to elementary cells, to objects, or even to other data required by a particular RSA. The cells and objects referenced in a generic cell usually intersect the spatial region covered by the cell. As we show later, generic cells are used to form spatial data structures that include a hierarchy.

Having described the types of cells, we can ask about the geometric relationship between any two cells, and their purpose. We call two cells neighbors if the boundaries of the two cells have a non-finite intersection while their interiors are disjoint. Another spatial relationship between two cells is containment: if a cell $\mathcal{C}_1$ is completely contained in a cell $\mathcal{C}_2$, we denote it $\mathcal{C}_1 \subset \mathcal{C}_2$. A different geometric relationship between two cells arises when $\mathcal{C}_1$ partially intersects $\mathcal{C}_2$, i.e., $\mathcal{C}_1 \cap \mathcal{C}_2 \neq \emptyset$, $\mathcal{C}_1 \not\subset \mathcal{C}_2$, and $\mathcal{C}_2 \not\subset \mathcal{C}_1$. The last possible geometric relationship is when the cells are disjoint, i.e., $\mathcal{C}_1 \cap \mathcal{C}_2 = \emptyset$.

Given a scene $\mathcal{N}$ with a finite set of objects that are also of finite size, we can construct $\mathcal{AB}(\mathcal{N})$ as the smallest $\mathcal{AB}$ that contains all the objects from $\mathcal{N}$. For the cell $\mathcal{AB}(\mathcal{N})$ we can form various sets of cells inside $\mathcal{AB}(\mathcal{N})$ and then construct corresponding spatial data structures. These spatial data structures can be distinguished by the types of the cells and the properties between the cells.

A spatial subdivision (SSD) of a cell $S$ is a finite ordered set $\mathbf{S}$ of cells, such that for each point $A \in S$ there exists at least one cell $\mathcal{C} \in \mathbf{S}$ with $A \in \mathcal{C}$.
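The geometric relationships between two cells described above (containment, partial intersection, and disjointness) can be illustrated for the special case of closed cells that are axis-aligned boxes. The sketch below is written in C under that assumption; the names `Box`, `CellRelation`, and `classify_cells` are hypothetical. It also assumes that a cell is identified with its interior, so two boxes that merely touch along a face are reported as disjoint.

```c
#include <stdbool.h>

/* Hypothetical closed cell: an axis-aligned box given by two extreme points. */
typedef struct { double min[3], max[3]; } Box;

typedef enum {
    CELLS_DISJOINT,   /* C1 and C2 have no common interior point          */
    CELL1_IN_CELL2,   /* C1 is completely contained in C2                 */
    CELL2_IN_CELL1,   /* C2 is completely contained in C1                 */
    CELLS_PARTIAL     /* interiors overlap, neither contains the other    */
} CellRelation;

/* True if box a lies completely inside box b. */
static bool box_inside(const Box *a, const Box *b)
{
    for (int i = 0; i < 3; i++)
        if (a->min[i] < b->min[i] || a->max[i] > b->max[i])
            return false;
    return true;
}

CellRelation classify_cells(const Box *c1, const Box *c2)
{
    /* The interiors are disjoint if the boxes are separated on some axis;
       touching boundaries (the neighbor case) count as disjoint here. */
    for (int i = 0; i < 3; i++)
        if (c1->max[i] <= c2->min[i] || c2->max[i] <= c1->min[i])
            return CELLS_DISJOINT;

    /* Identical boxes are reported as CELL1_IN_CELL2. */
    if (box_inside(c1, c2)) return CELL1_IN_CELL2;
    if (box_inside(c2, c1)) return CELL2_IN_CELL1;
    return CELLS_PARTIAL;
}
```

A check for the neighbor relationship would additionally have to verify that the touching boundaries share a region rather than a single point or edge; this is omitted to keep the sketch short.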

An elementary spatial subdivision (ESSD) of a cell $S$ is an SSD that is composed of a finite ordered set $\mathbf{S}_K$ of closed, disjoint, separating, and elementary cells. Moreover, for each point $A \in S$ either there exists exactly one cell $\mathcal{C} \in \mathbf{S}_K$ with $A \in \mathrm{int}(\mathcal{C})$, or $A$ lies on at least one cell boundary, $A \in \partial\mathcal{C}$.

A hierarchical spatial subdivision (HSSD) of a cell $S$ is a pair of two finite sets $\mathbf{S}_E$ and $\mathbf{S}_H$, where $\mathbf{S}_E$ is a set of elementary cells and $\mathbf{S}_H$ is a nonempty set of generic cells. The cells in $\mathbf{S}_E$ and $\mathbf{S}_H$ correspond to the nodes of a graph. Moreover, $\mathbf{S}_E$ is an ESSD and each cell from $\mathbf{S}_E$ is pointed to by at least one cell from $\mathbf{S}_H$.

The concept of SSD is the least restrictive yet still useful set of cells that can be utilized by an RSA. The only condition is that a point, possibly a point on the ray, can be located in a cell. The concept of ESSD is more restrictive in the sense that it cannot contain any type of hierarchical relationship. On the other hand, an HSSD is required to contain a hierarchical relationship. All these spatial data structures can be the basis of various RSAs, and we have described them here just to point out their differences and commonalities.

Although all the concepts above can easily be extended to lower or higher dimensional space, in this thesis we deal with RSAs in $\mathbb{E}^3$ only. This is the most important case for computer graphics. We do not discuss the pair of points visibility problem separately, since it can always be converted to the ray shooting problem: if $U$ and $V$ are the points to be tested for mutual visibility, we construct the ray with its origin in $U$ and direction vector $\vec{D} = V - U$. If no intersection between the ray and the scene objects is found between these two points using any RSA, then $U$ and $V$ are mutually visible. Otherwise, they are mutually invisible. The pair of points visibility problem is less demanding than ray shooting, since it does not require us to compute the exact point of intersection with the object. The result is of Boolean type and must always be correct; no approximate result is allowed. If the points are not mutually visible, the computation can be accelerated by caching the objects that are likely to be intersected on the line segment between these two points. In general, the pair of points visibility problem can thus be less computationally expensive than ray shooting when special techniques, including caching of objects [164, 43], are used. These special techniques are not the subject of this thesis. The speedup achieved by using the special techniques instead of simply converting the pair of points visibility problem to ray shooting depends on the object configuration in the scene and the algorithm used. However, each RSA can be slightly modified for the pair of points visibility problem: when traversing a data structure and checking objects for intersection with a given ray, we can stop as soon as any intersected object lying within the maximum distance is found. This modified RSA is more efficient for answering pair of points visibility queries, and in the worst case it has the same efficiency.

1.4 Related Work/Previous Results

Ray shooting has been studied by two research communities: computer graphics and computational geometry. In both communities, ray shooting and other visibility problems have attracted great interest over the last two decades.

1.4.1 Computational Geometry

Computational geometry traditionally addresses the ray shooting problem with the aim of improving worst-case complexity expressed in $O$-notation, but some attempts have also been made to handle average-case complexity. The research has been particularly active since 1989. More specifically, in computational geometry the ray shooting problem is understood as a special instance of the range-searching problem [6]. A typical range-searching problem is defined as follows: let $S$ be a set of $N$ entities (points, objects) in $\mathbb{E}^d$ and $\Upsilon$ be a family of subsets of $\mathbb{E}^d$, where the elements of $\Upsilon$ are called ranges. The goal of an algorithm solving the range-searching problem is to preprocess $S$ into a data structure so that for a query range $\upsilon \in \Upsilon$, the entities in $S \cap \upsilon$ can be reported or counted efficiently. Ray shooting is only one type of geometric range-searching problem.

The ray shooting problem has been solved in $\mathbb{E}^2$ and $\mathbb{E}^3$ using various approaches that reach similar query time, space, and preprocessing time complexity. Computational geometry techniques mostly restrict the shape of objects to have certain properties according to the dimension of the space: in $\mathbb{E}^2$ to line segments, simple polygons with $N$ edges, and $N$ disjoint simple/convex polygons; in $\mathbb{E}^3$ to convex polytopes with $N$ faces, $N$ convex polytopes, terrains described by a continuous surface function (height fields), triangles, spheres, and planar polygons. The corresponding RSA then utilizes the special properties of the given object class. Mathematical tools used inside RSAs include the parameterization of oriented/unoriented lines (two-plane, sphere, and especially Plücker parameterization [166]). The important concepts used by computational geometry cover partition data structures including $1/r$-cuttings [7], arrangements [8], geodesic triangulation [65], Steiner triangulation [14], etc. Megiddo's parametric search technique [8] is often applied; the ray shooting problem is then transformed to a segment emptiness problem.
Let us assume $N$ objects (preferably polyhedral and convex) in the scene. A line segment in $\mathbb{E}^3$ ($\mathbb{E}^2$) is checked for intersection with objects using some segment-intersection data structure that takes $O(\log N)$ time. The detection is applied within a binary search to find the first object intersected, thus resulting in $O(\log^2 N)$ query time for ray shooting. The parametric search technique has been successfully applied to triangles and spheres in $\mathbb{E}^3$. Usually, one searching data structure is plugged into another searching data structure, and a parametric search is performed. The space and query complexity of the data structures determine the resulting properties of the RSA.

Several of the best known results allow us to trace the tradeoff between query time and space complexity. Here we discuss $\mathbb{E}^3$ only, which is of major interest for computer graphics. Mark de Berg introduced an RSA [22] with $O(\log N)$ time complexity using $O(N^{4+\varepsilon})$ storage and preprocessing time for any $\varepsilon > 0$. His Ph.D. thesis includes special cases such as ray shooting with rays from a fixed point or in a fixed direction in a space of axis-parallel polyhedra, c-oriented polyhedra, arbitrary curtains, and general polyhedra. His main idea is to use the Plücker representation of ray space for both the input ray and the edges of the objects, and then to apply a point location search in this $\mathbb{E}^5$ space. Agarwal and Sharir [9] presented a method that, for $M$ possibly intersecting polyhedra with a total of $N$ faces, reaches $O(\log^2 N)$ query time with $O((MN)^{2+\varepsilon})$ storage and preprocessing time. This is a significant improvement over the previous approach given by de Berg if $M \ll N$. Alternatively, they discussed an approach with $O(N^{1+\varepsilon})$ storage and preprocessing time that reaches $O(M^{1/4} N^{1/2+\varepsilon})$ query time. Mohaban and Sharir [111] presented an algorithm with $O(N^{3+\varepsilon})$ storage and preprocessing time and $O(N^{\varepsilon})$ query time for a set of spheres (or other objects). More recently, Agarwal et al. [5] published an algorithm which, for a set of spheres or more general objects, has $O(N^{3+\varepsilon})$ storage and preprocessing time and reaches $O(\log^4 N)$ query time.

There have also been recent attempts to attack the ray shooting problem from the viewpoint of average-case complexity. This includes the concept of C-complexity for query-sensitive RSAs [110]. In this case the complexity of a ray shooting query is considered in relation to the scene complexity along the specific ray path. Under simplifying assumptions for object description using ball covering, a PM kd-tree [125], and bounding boxes, it is possible to achieve $O(N \log N)$ preprocessing time with $O(N)$ storage, when the ray shooting query time corresponds to the C-complexity of the specific ray. Other approaches aimed at average-case complexity use triangulation, especially minimum weight triangulation and Steiner triangulation [14]. The idea behind these techniques is straightforward: assuming that arbitrary rays do not hit any object in space, the rays cross a minimum number of boundaries of the decomposition on average.
Unfortunately, the complexity of the algorithms that construct these minimum-size decompositions, even in $\mathbb{E}^2$, is either NP-hard or unknown, and thus the approaches have to rely on heuristics whose results can be far from optimal. There are several algorithmic problems closely related to ray shooting: the existence of stabbing for a given ray and a set of objects, the existence of a stabbing line for a given set of objects, a moving line segment among obstacles, etc. The most recent survey papers on ray shooting in the computational geometry field were published by Agarwal and Erickson [6] and Pellegrini [117].

The main problem of ray shooting techniques developed in the field of computational geometry, particularly those aiming at worst-case complexity, is their inapplicability in practice. This inapplicability is caused by the rather difficult implementation of these techniques, the restriction to the object classes they require, the worst-case space complexity reaching at least $O(N^3)$, and the unavailability of any practical implementations. In particular, the space complexity of the underlying data structures limits the use of these techniques to approximately hundreds of objects, which is unacceptable for practical use. This restriction holds even for techniques applied to $\mathbb{E}^2$. Similarly, to the best of our knowledge, those average-case techniques developed in the field of computational geometry that are potentially promising for practical use, and that are simpler from the implementation point of view, have not been implemented.

1.4.2 Computer Graphics

The computer graphics community has developed its own RSAs, starting after the introduction of ray tracing [160]. The first applications in computer graphics that strongly required an RSA were ray-casting and ray-tracing; at present most new global illumination algorithms [149, 20] aiming at photo-realistic image synthesis also use some RSA. The algorithms for ray shooting developed within the field of computer graphics are of a heuristic nature and are driven not by worst-case complexity but rather by average-case complexity and practical feasibility. At present there are several known RSAs, a substantial part of which is listed below. In the published papers these RSAs are called acceleration techniques, acceleration methods, acceleration schemes, etc. These RSAs usually assume that a ray-object intersection test is available for each shape of object, which allows us to use general shapes of objects, unlike the RSAs developed in the field of computational geometry. It is also commonly required that an axis-aligned bounding box (or a bounding volume in general, see Subsection 1.6.1) tightly enclosing an object can be computed for an arbitrary shape of the object.

The most cited survey on RSAs was given by Arvo and Kirk in Glassner's book [18]. RSAs were also surveyed by Foley [47] and by Watt and Watt [157]. A more recent and fuller survey on RSAs was presented in Simiakakis' Ph.D. thesis [132]. The survey by Arvo and Kirk, although very systematic, is becoming obsolete since it does not cover developments in the last decade. It also does not contain any quantitative comparison of RSAs. The same holds for the other surveys. Similarly, the survey by Simiakakis does not cover developments since 1995. Below we present a list of algorithmic techniques, data structures, and issues that have been dealt with in the context of RSAs, together with the most important citations¹:

• basic spatial data structures
  – bounding volumes [123]
  – bounding volume hierarchy [158, 96, 63, 142]
  – spatial subdivisions
    ◦ binary space partitioning (BSP) tree [94, 148, 82, 80]
    ◦ kd-tree [105, 30, 15, 144, 145, 82, 80]
    ◦ octrees (including Octree-R) [59, 139, 126, 118, 45, 159, 54, 147], and many others (for a survey, see [74])
    ◦ uniform grid [53, 33, 90, 45, 163]
    ◦ non-uniform grid [57]
    ◦ hierarchy of grids [93, 100, 90, 25, 26]
    ◦ Voronoi diagram [106]
  – ray classification scheme [17, 133, 132, 102]
  – ray coherence theorem [116, 89]

• augmenting spatial data structures
  – macro regions [41]
  – pyramid clipping [156]
  – proximity clouds [38]
  – directed safe zones [124]
  – largest common traversal sequence [78]

• additional improvements for spatial data structures
  – ray cache (mailbox) technique [23, 12, 97]
  – ray boxing technique [136, 163]

• special techniques
  – handling CSG primitives [23, 32, 165, 56, 167, 113]
  – plane traversal [49]
  – hierarchy of 1D sorted lists [51]
  – object/ray coherence [64]
  – generalized rays
    ◦ beam tracing [87]
    ◦ cone tracing [11]
    ◦ pencil tracing [131]
  – techniques for restricted sets of rays
    ◦ fixing the origin of rays (including hidden surface removal, i.e., ray casting) [55, 155, 77], and many others
    ◦ fixing the direction of rays [101, 77]

• specific issues
  – termination criteria and heuristics for constructing spatial data structures [146]
  – memory storage/access issues [120, 73]
  – performance prediction of RSAs [122]
  – acceleration techniques to generate the sequence of images [50, 19, 60, 108, 121, 92, 28, 67, 29, 114]
  – coherence [156, 95]
  – hybrid spatial data structures (meta hierarchies) [16]
  – complexity analysis and comparison of RSAs [128, 49, 45, 33, 69, 109, 152, 153, 150, 151, 107, 48, 74, 85, 86]

¹ The full list of BibTeX entries can be found at: http://www.cgg.cvut.cz/~havran

Some of the techniques listed above can be combined together to get a more efficient RSA. For example, the ray cache technique can be plugged into nearly all RSAs. Moreover, a more general concept known as meta-hierarchies, which combines the basic spatial data structures into one, was proposed by Arvo [16], but no construction algorithm for such a meta-hierarchy has been proposed yet. A detailed review of all these techniques would be rather space demanding and would require a separate review publication of major extent. For this reason we recall in this chapter only the most important facts concerning RSAs, which are necessary for an understanding of the rest of the thesis.

1.5 Complexity of RSA

The theoretical complexity bounds in $O$-notation for the ray shooting problem were addressed in a series of papers by Szirmay-Kalos and Márton (chronologically cited: [107, 48, 150, 151, 152, 153]). Since their results are important for understanding the contribution of this thesis, we briefly survey their work here. The interested reader is advised to study the paper [153] for more details. Most of this section follows the main results of this paper.

Szirmay-Kalos and Márton state that worst-case optimal RSAs are both difficult to implement and practically infeasible due to their prohibitive memory and preprocessing time complexity. They present the concept of a “good” RSA, which should run in sublinear time after sub-quadratic preprocessing with linear space complexity. They explain why heuristic RSAs are used in global illumination and discuss their worst-case and average-case time complexity. They propose a complementary plane algorithm that solves the ray shooting problem in $O(\log N)$ time with $O(N^8)$ space complexity for a general class of convex objects in $\mathbb{E}^3$. The main idea is to define ray space using the complementary plane perpendicular to an input ray. The required complementary plane from a set of such planes for a given ray can be found using a point location search. The number of topologically different projections of objects onto the plane is finite, and thus the whole search space of complementary planes is also finite. The number of objects projected onto any complementary plane is again finite, which also bounds the search space. The number of intersections with objects for a single ray is also finite, which discretizes this search space. As a result, all three search spaces (the complementary plane, the position of the ray projected onto the complementary plane, and the position of the object along the ray path) can be searched using point location.
This is a classical problem in computational geometry, solvable using balanced binary trees in $O(\log N)$ time. As a result, the proposed RSA reaches $O(\log N)$ time with $O(N^8)$ space complexity, which is prohibitive for any practical use. Their next result is the lower bound on the space complexity of any RSA working in $O(\log N)$ worst-case time. They show that this lower bound is $\Omega(N^4)$ in the worst case, which is still prohibitive for any practical use. They further show that the lower bound on the worst-case complexity of ray shooting as a problem itself is $\Omega(\log N)$.

They also provide an average-case analysis of heuristic RSAs. They propose the concept of a provocative RSA, assuming the ray origin is fixed in space. The basic idea of the provocative RSA is to sort objects according to their distance from the origin point of a ray in $O(N \log N)$ time. A ray shooting query is solved by checking the intersection with the closest object first and the farthest object last, resulting in $O(N)$ time complexity in the worst case. They analyze the average-case complexity of this provocative RSA for a set of spheres of the same radius uniformly distributed in the scene. They show that the average-case time complexity of the provocative RSA for such a scene is formally $O(1)$, and that this also holds for a scene with randomly distributed spheres of varying radii [107]. They state that almost every heuristic RSA based on spatial data structures shares a common idea with the provocative RSA. The underlying spatial data structures try to represent the distances between the objects, not just from a single point. Storing the distances from all possible positions would require infinite storage space. The spatial data structures of heuristic RSAs therefore use an approximation that subdivides space into finite spatial regions from which the objects can be roughly sorted according to their distance. They support their theory with simulation results for randomly distributed spheres for uniform grids [53], ray coherence [116], ray classification [17, 133], and the Voronoi diagram [106]. They conclude with the statement that the fundamental assumption of uniformly distributed objects in the scene can be violated in most cases in practice.

We may remark that many scenes that represent the real world require a lot of free space in them. The habitable structures built by humans must contain much free space just to allow the movement of humans and animals and the transport of things (e.g., streets, corridors, rooms). The unknown multiplicative factor hidden behind the $O$-notation and the assumption of uniformly distributed objects in the scene prevent a fair comparison of heuristic RSAs using the common formal complexity tool, the $O$-notation.
A significant quantitative difference in the performance of heuristic RSAs has been observed in many papers [144, 109, 49, 44], although these RSAs have a theoretical $O(1)$ average-case complexity for scenes with randomly distributed objects [153]. The average-case analysis for practical scenes, whose distribution of objects cannot be captured by the simple mathematical tools typically used for this purpose, remains an unsolved problem. If the distribution of objects cannot be described by several parameters, the question arises whether such an analysis is solvable from a theoretical point of view at all. For this reason, the performance of RSAs is in practice compared on a set of test scenes, using for example the scenes from the Standard Procedural Database (abbreviated to SPD throughout the thesis) introduced by Haines [69]. Papers addressing heuristic RSAs in the past used only a subset of the SPD scenes or a private set of scenes, typically containing from 3 to 6 scenes. Research supported by experiments on such a low number of scenes is not statistically relevant. Further, the different implementations of the reference RSAs, and the use of hardware-dependent timing statistics rather than hardware-independent characteristics, have made a fair comparison of work published in various papers rather infeasible. Additionally, it is not known whether the SPD scenes, although scalable, offer a good representative set of scenes suitable for testing RSAs. As a result of all the factors mentioned above, the papers about heuristic RSAs published in the past often contain mutually contradictory statements.

1.6

Basic Techniques used in RSAs

In this section we present a short survey of heuristic RSAs, since these are further elaborated in the thesis. For a more detailed study the reader is referred to the surveys by Arvo and Kirk [18] and/or Simiakakis [132], or to the original papers cited here. RSAs based on basic spatial data structures have in practice a lower time complexity than a naïve RSA. They use either spatial subdivisions [18, 157], hierarchical clustering of objects [25], or a combination of these principles [100]. Below we describe all commonly used spatial data structures.

CHAPTER 1. INTRODUCTION


1.6.1

Bounding Volumes

A naïve RSA tests every object for intersection with a given ray. The ray-object intersection test itself can be an expensive operation, particularly for some shapes of objects (NURBS and other spline surfaces, polygons with many edges, etc.). Therefore it is advantageous to enclose the object tightly in a bounding volume with a simple ray-object intersection test. Practically, the bounding volume of an object O is a cell that completely contains O. If a ray intersects the object's bounding volume, an intersection test between the ray and the object is performed. If a ray does not intersect the bounding volume, it cannot intersect the object, and thus a substantial part of the computation can be avoided on average. A simple bounding volume is a sphere, which has a particularly simple ray-object intersection test [61]. The second possibility is to use a tight axis-aligned bounding box of the object. Another alternative is to use arbitrarily oriented rectangular parallelepipeds [18], also called slabs. The requirements posed on the properties of a bounding volume are as follows:

Tightness: If a ray intersects the bounding volume, then the probability that the ray also intersects the object is high.

Efficiency: The time complexity of the intersection test between a ray and the bounding volume is small.

Some tradeoff must be sought between these two requirements. An example of a bounding volume is the above-mentioned axis-aligned bounding box. The use of bounding volumes for the data structures of RSAs as described above is more general: bounding volumes can be used both inside bounding volume hierarchies and in the spatial subdivisions described below.
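As a minimal sketch of the two simplest bounding volume tests mentioned above, the following Python code tests a ray against a bounding sphere and against an axis-aligned bounding box. The function names, the tuple-based vectors, and the assumption that the ray direction is normalized for the sphere test are ours and serve only as an illustration; they are not taken from [61] or [18].

```python
import math


def ray_hits_sphere(origin, direction, center, radius):
    """True if the half-line origin + t*direction (t >= 0) meets the sphere.

    Assumes that direction has unit length.
    """
    to_center = [c - o for o, c in zip(origin, center)]
    # Parameter of the point on the ray that is closest to the sphere center.
    t_closest = sum(a * b for a, b in zip(to_center, direction))
    dist2 = sum(a * a for a in to_center) - t_closest * t_closest
    if dist2 > radius * radius:
        return False  # the supporting line misses the sphere
    half_chord = math.sqrt(radius * radius - dist2)
    # The farther intersection must not lie behind the ray origin.
    return t_closest + half_chord >= 0.0


def ray_hits_box(origin, direction, box_min, box_max):
    """True if the half-line meets the axis-aligned box (interval clipping)."""
    t_near, t_far = 0.0, math.inf
    for o, d, lo, hi in zip(origin, direction, box_min, box_max):
        if d == 0.0:
            # Ray parallel to this pair of planes: origin must lie between them.
            if o < lo or o > hi:
                return False
            continue
        t0, t1 = (lo - o) / d, (hi - o) / d
        if t0 > t1:
            t0, t1 = t1, t0
        t_near, t_far = max(t_near, t0), min(t_far, t1)
        if t_near > t_far:
            return False
    return True
```

Both tests only answer whether the bounding volume is hit; the more expensive ray-object test is performed only when they return true.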

1.6.2

Bounding Volume Hierarchies

A natural extension to bounding volumes is a bounding volume hierarchy (abbreviated to BVH further in the thesis), which takes advantage of hierarchical coherence. Given the bounding volumes of objects, an n-ary rooted tree of the bounding volumes is created with the bounding volumes of the objects at the leaves. Each interior node v of the BVH corresponds to the bounding volume that completely encloses the bounding volumes of the subtree rooted at v. The hierarchy naturally provides a method for testing a ray against the objects in the scene. If the ray does not intersect the enclosing bounding volume at the root node, it cannot intersect any object. Otherwise, the hierarchy is descended recursively, but only for those nodes of the BVH whose bounding volumes are intersected by the ray. An interesting property of the BVH is that although the bounding volume of a node always completely includes its child bounding volumes, these child bounding volumes can mutually intersect. A method for the automatic construction of a BVH was first described by Goldsmith and Salmon [63]. The construction of a BVH proceeds bottom-up and is object-oriented. It does not have the property of ESSD or HSSD, since the bounding volumes overlap.
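The recursive descent described above can be sketched as follows. This is an illustrative Python sketch under our own assumptions: spheres serve as the objects, axis-aligned boxes as the bounding volumes, and the names Sphere, BVHNode, leaf, union, and closest_hit are hypothetical. The hierarchy is assembled by hand and does not reproduce the construction method of [63].

```python
import math
from dataclasses import dataclass, field


@dataclass
class Sphere:
    center: tuple
    radius: float

    def bounds(self):
        lo = tuple(c - self.radius for c in self.center)
        hi = tuple(c + self.radius for c in self.center)
        return lo, hi

    def hit(self, origin, direction):
        """Smallest t >= 0 along a unit-length direction, or None."""
        oc = [o - c for o, c in zip(origin, self.center)]
        b = sum(a * d for a, d in zip(oc, direction))
        disc = b * b - (sum(a * a for a in oc) - self.radius ** 2)
        if disc < 0.0:
            return None
        root = math.sqrt(disc)
        for t in (-b - root, -b + root):
            if t >= 0.0:
                return t
        return None


@dataclass
class BVHNode:
    lo: tuple
    hi: tuple
    children: list = field(default_factory=list)
    obj: object = None  # set for leaves only


def leaf(obj):
    lo, hi = obj.bounds()
    return BVHNode(lo, hi, obj=obj)


def union(nodes):
    """Interior node whose box encloses the boxes of all given nodes."""
    lo = tuple(min(n.lo[i] for n in nodes) for i in range(3))
    hi = tuple(max(n.hi[i] for n in nodes) for i in range(3))
    return BVHNode(lo, hi, children=list(nodes))


def box_hit(lo, hi, origin, direction):
    t_near, t_far = 0.0, math.inf
    for o, d, a, b in zip(origin, direction, lo, hi):
        if d == 0.0:
            if o < a or o > b:
                return False
            continue
        t0, t1 = sorted(((a - o) / d, (b - o) / d))
        t_near, t_far = max(t_near, t0), min(t_far, t1)
        if t_near > t_far:
            return False
    return True


def closest_hit(node, origin, direction):
    """Return (t, object) of the nearest intersection below node, or None."""
    if not box_hit(node.lo, node.hi, origin, direction):
        return None  # the whole subtree is skipped
    if node.obj is not None:
        t = node.obj.hit(origin, direction)
        return None if t is None else (t, node.obj)
    best = None
    for child in node.children:  # child boxes may overlap, so test them all
        found = closest_hit(child, origin, direction)
        if found is not None and (best is None or found[0] < best[0]):
            best = found
    return best
```

Because the child bounding volumes can overlap, the sketch visits every intersected child and keeps the closest hit instead of stopping at the first one.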

1.6.3

Spatial Subdivisions

A number of basic heuristic RSAs developed in the past are based on spatial data structures that subdivide the spatial region of a scene into cells, mostly using a method known as divide and conquer. The result is an approximate ordering of objects in space according to the distances among the objects. The distance is somehow encoded into the properties of the spatial data structures, which are usually called spatial subdivisions. The quality of the distance encoding and the ease of accessing the distance in spatial subdivisions directly affect the performance of the corresponding RSA, which traverses spatial data structures and finally produces a result for a given ray shooting query. The basic spatial subdivisions were developed by several authors [95, 53, 59] and are described in more detail below. The common


principle of all spatial subdivision structures is to subdivide the cell corresponding to the axis-aligned bounding box of the scene into a set of cells separated by a set of boundaries. In terms of the terminology given in Section 1.2, the principle is to create an ESSD or HSSD for the initial cell, where each elementary cell contains a list of pointers to the objects fully or partially contained in the cell. The geometry of the ESSD/HSSD, for known spatial subdivisions excluding Voronoi diagrams [106], is induced by splitting planes perpendicular to one of the coordinate axes.

Using spatial subdivisions, the corresponding RSA locates the elementary cells along the ray. If an intersection exists between the given ray and an object belonging to the elementary cell, and if the corresponding intersection point lies inside the elementary cell, then the answer to the ray shooting query is found. If several objects are found to be intersected inside the cell, the closest one is selected. If no intersected object has been found or if the intersection point lies outside the currently processed elementary cell, the computation proceeds to the next elementary cell along the ray. The algorithm identifying the cells along the ray path is called the ray traversal algorithm. An RSA is thus composed of an underlying spatial data structure and a corresponding ray traversal algorithm.

Some spatial subdivisions require elementary cells of the same shape and size. The common property of spatial subdivisions, unlike BVH, is that the elementary cells are always disjoint (non-overlapping). The constructed elementary cells are addressed either directly or from the hierarchical cells. The construction of spatial subdivisions usually proceeds in a top-down way. The concept of spatial subdivisions is space-oriented. Since we deal with RSAs based on spatial subdivisions in detail in the rest of the thesis, we continue below with a description of the most commonly used spatial subdivisions.

1.6.3.1

BSP Trees and Kd-Trees

A Binary Space Partitioning (BSP) tree is a spatial subdivision that can be used to solve a variety of geometrical problems. It was initially developed as a means of solving the hidden surface problem in computer graphics [52]. It is a higher-dimensional analogy to the binary search tree. The BSP tree has two major variants in computer graphics, which we call axis-aligned and polygon-aligned. The polygon-aligned form [52, 66] chooses a plane underlying a polygon as the splitting entity that subdivides the spatial region into two parts. The scene is typically required to contain only polygons, which is too restrictive for ray shooting applications. We do not deal with the polygon-aligned form of the BSP tree here. For a survey and application of the techniques based on the polygon-aligned form of the BSP tree see for example [43, 112].

In the axis-aligned form of the BSP tree the splitting entity is a plane that is always perpendicular to one of the coordinate axes. The concept was first used for an RSA by Kaplan [94]. Since the splitting planes are perpendicular to the coordinate axes, the spatial subdivision is also called a rectilinear BSP tree or an orthogonal BSP tree. Since in most of the thesis (Chapters 4–7) we deal with the axis-aligned form of the BSP tree, we describe it in greater detail than the other basic spatial data structures.

A BSP tree for a set S of objects is defined as follows: Each node $\nu$ in the BSP tree is associated with its axis-aligned bounding box $\mathcal{AB}(\nu)$, which is a cell. The cell associated with the root of the BSP tree is the axis-aligned bounding box $\mathcal{AB}$ of all objects from S. Each interior node $\nu$ of the BSP tree is assigned a splitting plane $H_\nu$ that subdivides $\mathcal{AB}(\nu)$ into two cells. Let $H_\nu^+$ be the positive halfspace and $H_\nu^-$ the negative halfspace bounded by $H_\nu$. The cells associated with the left and the right child of $\nu$ are $\mathcal{AB}(\nu) \cap H_\nu^+$ and $\mathcal{AB}(\nu) \cap H_\nu^-$, respectively.
The left subtree of $\nu$ is a BSP tree for the set of objects from $S_\nu$ that have a non-empty intersection with the cell of the left child, and the right subtree is defined similarly. Each leaf node $\nu_E$ may contain a list of objects $S_{\nu_E}$ that intersect the axis-aligned bounding box $\mathcal{AB}(\nu_E)$ associated with $\nu_E$. When a leaf contains at least one object, we call it a full leaf. Otherwise, we call it an empty leaf. We call the cell $\mathcal{AB}(\nu_E)$ associated with the leaf $\nu_E$ a leaf-cell. For the sake of convenience, let lchild($\nu$) denote the left child of node $\nu$ and similarly let rchild($\nu$) denote its right child. An example of a BSP tree in $\mathbb{E}^2$ is depicted in Fig. 1.3. In terms of terminology given in Section 1.2


the leaf of a BSP tree corresponds to the elementary cell of HSSD and the interior node to the generic cell of HSSD.

Figure 1.3: An example of the BSP tree in $\mathbb{E}^2$.

The orthogonality of splitting planes in the BSP tree significantly simplifies the intersection test between a ray and the splitting planes. The cost of computing the signed distance to the intersection point between a ray and an axis-aligned splitting plane is roughly three times lower than for an arbitrarily positioned plane (for a detailed explanation, see Section 4.2). A BSP tree is usually constructed hierarchically in top-down fashion, as outlined in the pseudocode of Algorithm 1. At a current leaf $\nu$ a splitting plane is selected that subdivides the axis-aligned bounding box of $\nu$ into two cells. The leaf then becomes an interior node with two new leaves. The objects associated with $\nu$ are distributed into its two new descendants. The process is repeated recursively until certain termination criteria are reached. Commonly used termination criteria are the maximum leaf depth and the number of objects associated with the leaf.

Algorithm 1 BSP tree construction, recursive version of the algorithm.

procedure Subdivide(CurrentNode, CurrentTreeDepth, CurrentSubdividingAxis)
  if ((CurrentNode contains too many objects) and (CurrentTreeDepth is not too high)) then
    Children of CurrentNode ← CurrentNode's Bounding Volume
    // Note that child[0].max[DividingAxis] and child[1].min[DividingAxis] are always equal.
    if CurrentSubdividingAxis = X-Axis then
      child[1].min.x ← mid-point of CurrentNode's X-Bound
      child[0].max.x ← mid-point of CurrentNode's X-Bound
      NextSubdividingAxis ← Y-Axis
    else if CurrentSubdividingAxis = Y-Axis then
      child[1].min.y ← mid-point of CurrentNode's Y-Bound
      child[0].max.y ← mid-point of CurrentNode's Y-Bound
      NextSubdividingAxis ← Z-Axis
    else if CurrentSubdividingAxis = Z-Axis then
      child[1].min.z ← mid-point of CurrentNode's Z-Bound
      child[0].max.z ← mid-point of CurrentNode's Z-Bound
      NextSubdividingAxis ← X-Axis
    end if
    for all objects referenced in CurrentNode do
      if the object is within a child's bounding volume then
        add the object to that child's object list
      end if
    end for
    Subdivide(child[0], CurrentTreeDepth + 1, NextSubdividingAxis)
    Subdivide(child[1], CurrentTreeDepth + 1, NextSubdividingAxis)
  end if
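The construction outlined in Algorithm 1 can be sketched in Python as follows. The sketch rests on several assumptions of ours: each object is represented only by its axis-aligned bounding box given as a pair (lo, hi) of 3-tuples, the termination thresholds max_objects and max_depth are hypothetical parameters, and the names Node, overlaps, subdivide, and leaves do not come from the thesis.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    lo: tuple                      # minimum corner of the node's cell
    hi: tuple                      # maximum corner of the node's cell
    objects: list = field(default_factory=list)
    axis: int = -1                 # splitting axis of an interior node
    split: float = 0.0             # position of the splitting plane
    children: tuple = ()           # empty for leaves


def overlaps(obj_lo, obj_hi, lo, hi):
    """True if the object's box and the cell share at least one point."""
    return all(obj_lo[i] <= hi[i] and obj_hi[i] >= lo[i] for i in range(3))


def subdivide(node, depth=0, axis=0, max_objects=2, max_depth=8):
    """Mid-point subdivision with the axes cycled x, y, z, x, ..."""
    if len(node.objects) <= max_objects or depth >= max_depth:
        return  # termination criteria reached: node stays a leaf
    mid = 0.5 * (node.lo[axis] + node.hi[axis])
    left_hi = list(node.hi)
    left_hi[axis] = mid            # child 0: max bound moved to the mid-point
    right_lo = list(node.lo)
    right_lo[axis] = mid           # child 1: min bound moved to the mid-point
    left = Node(node.lo, tuple(left_hi))
    right = Node(tuple(right_lo), node.hi)
    for obj in node.objects:
        for child in (left, right):
            if overlaps(obj[0], obj[1], child.lo, child.hi):
                child.objects.append(obj)
    node.axis, node.split, node.children = axis, mid, (left, right)
    node.objects = []              # objects are referenced from leaves only
    next_axis = (axis + 1) % 3
    subdivide(left, depth + 1, next_axis, max_objects, max_depth)
    subdivide(right, depth + 1, next_axis, max_objects, max_depth)


def leaves(node):
    """Collect all leaves below node (full and empty)."""
    if not node.children:
        return [node]
    return [l for child in node.children for l in leaves(child)]
```

An object whose box straddles the splitting plane is referenced from both children, which is why the number of object references in the leaves can exceed the number of objects.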


An important feature of the axis-aligned form of the BSP tree is its adaptability to the scene geometry, which is induced by the possibility of positioning the splitting plane arbitrarily. Traditionally, the splitting plane is positioned at the mid-point of the chosen axis, and the order of axes is regularly changed on successive levels of the hierarchy [94]. Another method uses adaptive positioning of the splitting planes, where the position of the splitting plane is chosen along the whole range using a surface area heuristic [105] (described in Chapter 4). The positioning of the splitting plane is sometimes used to distinguish between the BSP tree and the kd-tree, which was introduced by Bentley [21] in 1975. Conceptually, the BSP tree and the kd-tree are equivalent. The important feature of the kd-tree is that, unlike the BSP tree, it always has axis-aligned splitting planes. In some literature, the axis-aligned form of the BSP tree is only a type of kd-tree where the splitting plane always lies at the mid-point of the current box, resulting in two children with cells of equal size. In some other literature, the axis-aligned form of the BSP tree can have arbitrary positioning of the splitting planes and is thus equal to the kd-tree. In this thesis, a BSP tree refers to a binary space partitioning in the axis-aligned form that always uses mid-point positioning of the splitting planes. The kd-tree can have arbitrary positioning of the splitting planes. Thus any BSP tree is a kd-tree, but not vice versa. The positioning of the splitting planes can significantly influence the performance of RSAs based on the kd-tree, particularly for sparsely occupied scenes (see footnote 2). This is described in detail in Chapter 4.

Algorithm 2 The BSP tree and the kd-tree ray traversal algorithm, recursive version.

function RayTreeIntersect(ray R, node, min, max): object
  if node is empty then
    RayTreeIntersect ← "no intersection"
  else if node is a leaf then
    Intersect ray R with each object referenced in the leaf, discarding those farther away than max.
    RayTreeIntersect ← the "object with the closest intersection point"
  else
    t ← the signed distance along the ray to the splitting plane of the node
    near ← the child of node for the half-space containing the origin of R
    far ← the "other" child of node, i.e. not equal to near
    if t > max or t < 0 then
      // Whole interval is on the near cell, recursion.
      RayTreeIntersect ← RayTreeIntersect(R, near, min, max)
    else if t < min then
      // Whole interval is on the far cell, recursion.
      RayTreeIntersect ← RayTreeIntersect(R, far, min, max)
    else
      // The ray intersects the plane, recursion.
      hitData ← RayTreeIntersect(R, near, min, t)    // Test near cell.
      if (hitData indicates that there was a hit) then
        RayTreeIntersect ← [hitData]
      else
        // There was no hit in the near cell, test far cell.
        RayTreeIntersect ← RayTreeIntersect(R, far, t, max)
      end if
    end if
  end if
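A runnable counterpart of Algorithm 2 might look as follows. It is a sketch under our own assumptions: the tree is assembled by hand, spheres stand in for general objects, the ray direction has unit length, and the names KdNode, Sphere, ray_tree_intersect, and the tolerance EPS are hypothetical. A ray that lies exactly in a splitting plane is not treated specially.

```python
import math
from dataclasses import dataclass, field

EPS = 1e-9  # tolerance for hits lying on a cell boundary (our assumption)


@dataclass
class Sphere:
    center: tuple
    radius: float

    def hit(self, origin, direction):
        """Smallest t >= 0 along a unit-length direction, or None."""
        oc = [o - c for o, c in zip(origin, self.center)]
        b = sum(a * d for a, d in zip(oc, direction))
        disc = b * b - (sum(a * a for a in oc) - self.radius ** 2)
        if disc < 0.0:
            return None
        root = math.sqrt(disc)
        for t in (-b - root, -b + root):
            if t >= 0.0:
                return t
        return None


@dataclass
class KdNode:
    axis: int = -1                 # -1 marks a leaf
    split: float = 0.0
    left: "KdNode" = None          # cell on the lower side of the plane
    right: "KdNode" = None         # cell on the upper side of the plane
    objects: list = field(default_factory=list)


def ray_tree_intersect(node, origin, direction, t_min, t_max):
    """Return (t, object) for the closest hit within [t_min, t_max], or None."""
    if node is None:
        return None
    if node.axis < 0:  # leaf: test all referenced objects
        best = None
        for obj in node.objects:
            t = obj.hit(origin, direction)
            if t is not None and t <= t_max + EPS:
                if best is None or t < best[0]:
                    best = (t, obj)
        return best
    o, d = origin[node.axis], direction[node.axis]
    # The near child is the one whose half-space contains the ray origin.
    if o < node.split or (o == node.split and d > 0.0):
        near, far = node.left, node.right
    else:
        near, far = node.right, node.left
    if d == 0.0:  # ray parallel to the splitting plane: near cell only
        return ray_tree_intersect(near, origin, direction, t_min, t_max)
    t = (node.split - o) / d  # signed distance to the splitting plane
    if t > t_max or t < 0.0:
        return ray_tree_intersect(near, origin, direction, t_min, t_max)
    if t < t_min:
        return ray_tree_intersect(far, origin, direction, t_min, t_max)
    found = ray_tree_intersect(near, origin, direction, t_min, t)
    if found is not None:
        return found
    return ray_tree_intersect(far, origin, direction, t, t_max)
```

Discarding hits beyond the current interval matters for objects referenced from several leaves: an intersection found outside the current cell may be hidden by an object stored in a cell visited later.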


2 A sparsely occupied scene (also called an unevenly occupied scene) is a scene in which the distribution of objects is non-uniform. Most of the space in such a scene is then empty.

Given a ray and a BSP tree, the ray traversal algorithm identifies elementary cells along the ray.


There are a few variants of the ray traversal algorithm for the BSP tree; they are further elaborated in Chapter 5. Here, we briefly describe only the recursive ray traversal algorithm for the BSP tree, which is outlined in the pseudocode of Algorithm 2. The basic idea of this algorithm is that for each node we identify one of four possible cases: to traverse only the left child, only the right child, the left child first and then the right child, or the right child first and then the left child. When we descend to the left or right child, we recurse with the same algorithm.

1.6.3.2

Octrees

An octree (octal tree) in $\mathbb{E}^3$ is a spatial subdivision that is similar to the quadtree [127] in $\mathbb{E}^2$ space. It is also built recursively in top-down fashion, but unlike the BSP tree the initial cell is not split into two cells but into eight cubic cells (in $\mathbb{E}^n$ space into $2^n$ cells). The cubic cells of the octree, often called octants, lie at various depths from the root node and thus they vary in size. The octree itself can serve as the representation of a three-dimensional object [127]; in this case the leaves of the octree are marked either empty or full. The use of the octree for an RSA was introduced by Glassner [59]. Cells with high object occupancy can be recursively subdivided into smaller and smaller cells, generating new cells in the octree. The addressing of child nodes when accessing an interior octree node is provided by direct pointing or hashing. In the former case the interior node has to contain eight pointers to its descendants. In the latter case the address of the child nodes used for hashing is usually formed by postfixing or prefixing the parent address with digits from 1 to 8 corresponding to the geometrical position of the child node, as depicted in Fig. 1.4. Numbering the nodes this way (instead of from 0 to 7) loses the octal purity of the original scheme, but improves the hashing itself [59].

Figure 1.4: Octree space partitioning.

The ray traversal algorithm for the octree is more complicated than for the BSP tree or the uniform grid (see the subsection below). Each ray intersecting an octree node can visit at most four of its eight descendants, and the computation of the order in which they are visited along the ray path is thus more involved than for the BSP tree. The time consumed by a ray traversal algorithm for the octree is given by the efficiency and robustness of the algorithm determining which child nodes are to be visited and in which order. Recently, a survey [74] dealing with ray traversal algorithms for the octree has been published. The octree naturally exploits spatial coherence because the objects that are close to each other in space are referenced in leaves that are close to each other in the octree. The common termination criteria for octree construction are the same as for the BSP tree: the maximum allowed depth and the minimum number of references to objects in the leaves. The properties of the octree that are related to the RSA can be compared to the properties of the kd-tree (or the BSP tree). However, there are cases where the performance of the RSA based on the octree and the RSA based on the kd-tree differs significantly. We should note that the geometry induced by the octree can be simulated by the BSP tree, but the octree has a more regular structure. Depending on the ray traversal algorithm, the order of all the octants to be visited is usually determined even if the ray can terminate already in the first octant. The small occupancy of the leaves for sparsely occupied scenes is another disadvantage of the octree. The empty neighbor leaves in the octree


have to be determined and traversed, whereas in the case of the BSP tree it is likely that they would be represented by a single leaf. The small occupancy of leaves implies relatively large memory requirements for the octree representation. There is a variant of the octree called Octree-R [159] that allows arbitrary positioning of the splitting planes inside the interior nodes. The difference between the octree and Octree-R is similar to that between the BSP tree and the kd-tree. A heuristic algorithm for positioning the splitting planes inside an octree interior node is applied independently for all three axes. The speedup of Octree-R over the octree then ranges from 4% up to 47%, depending on the distribution of the objects in the scene [159].

1.6.3.3

Uniform and Non-Uniform Grids

A uniform grid is another common spatial subdivision used in RSAs. It involves the subdivision of the initial cell into equally sized elementary cells formed by splitting planes that are axis-aligned, regardless of the distribution of objects in the scene. In terms of the introductory terminology the uniform grid is an ESSD. The n-dimensional grid resembles the subdivision of a two-dimensional screen into pixels. A list of objects that are partially or fully contained in the cell is assigned to each parallelepiped cell (also called a voxel in $\mathbb{E}^3$ space). The uniform grid in the context of RSAs was first introduced by Fujimoto [53]. The ray traversal algorithm for a uniform grid has been improved by several researchers, including Hsiung [90] and Endl [45].

Figure 1.5: An example of the uniform grid in $\mathbb{E}^2$.
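Because all voxels of a uniform grid have the same size, the voxels pierced by a ray can be enumerated incrementally with one comparison and one addition per step. The following Python sketch shows one common incremental formulation; it is not necessarily the variant published in [53], and the function name, its parameters, and the assumption that the ray origin lies inside the grid are ours.

```python
import math


def grid_cells_along_ray(origin, direction, grid_min, cell_size, resolution):
    """Yield the (ix, iy, iz) indices of the voxels pierced by the ray, in order.

    Assumes that origin lies inside the grid; direction need not be normalized.
    """
    idx, step, t_next, t_delta = [], [], [], []
    for o, d, g, s, n in zip(origin, direction, grid_min, cell_size, resolution):
        i = min(int((o - g) / s), n - 1)   # index of the starting voxel
        idx.append(i)
        if d > 0.0:
            step.append(1)
            t_next.append((g + (i + 1) * s - o) / d)  # first plane crossing
            t_delta.append(s / d)                     # spacing of crossings
        elif d < 0.0:
            step.append(-1)
            t_next.append((g + i * s - o) / d)
            t_delta.append(-s / d)
        else:
            step.append(0)
            t_next.append(math.inf)
            t_delta.append(math.inf)
    while all(0 <= i < n for i, n in zip(idx, resolution)):
        yield tuple(idx)
        axis = t_next.index(min(t_next))   # axis whose plane is crossed next
        if t_next[axis] == math.inf:
            return                         # zero direction vector
        idx[axis] += step[axis]
        t_next[axis] += t_delta[axis]
```

In an RSA, the objects referenced in each yielded voxel would be tested against the ray, and the traversal would stop at the first voxel that contains an intersection point.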

Since the uniform grid is created regardless of the occupancy of objects in the voxels, it typically forms many more voxels than the octree or the BSP tree, and therefore it requires a large amount of storage space. Nevertheless, the ray traversal algorithm for the uniform grid can be performed very efficiently, since the voxels are of the same size. The ray traversal algorithm, often referred to as 3D-DDA, is analogous to the Bresenham algorithm for drawing a straight line in $\mathbb{E}^2$ raster space and thus requires only simple operations for each traversal step (addition, subtraction, and comparison). The disadvantage of the uniform grid is that the occupancy of most voxels can be very small, particularly for sparsely occupied scenes. In this case a ray typically has to traverse many empty voxels before hitting a full voxel. Since in sparsely occupied scenes most objects are located in a relatively small number of the voxels, the full voxels contain many references to objects. These voxels are costly to check for intersection with the ray. The uniform grid does not adapt to the distribution of the objects in the scene, and in the worst case the improvement of an RSA based on the uniform grid over a naïve RSA need not be significant. The uniform grid was analyzed by Cleary and Wyvill [33]; they showed that the optimal subdivision for one axis is $k = \sqrt[3]{d_{voxel} N}$ voxels and the minimum total time per ray then corresponds to $t_{min} = const \cdot \sqrt[3]{N}$ for N objects of the same shape and size that are uniformly distributed in space. The parameter $d_{voxel}$ denotes voxel density, the required ratio between the number of voxels and the number of objects. The required number of voxels is then $n_r = N \cdot d_{voxel}$. We call this method for setting the resolution of the uniform grid a homogeneous method since it disregards the shape of the axis-aligned bounding box of the scene. The shape of the voxel is a small copy of the axis-aligned bounding box of the scene. In order to get a more efficient RSA based on a uniform grid, other methods for setting up the uniform grid resolution have been developed (most of them in the context of hierarchical grids as described below). In order to achieve a more cubic shape of voxels, Woo [163] uses a method of setting up the uniform grid resolution that we call here Woo's method. Let x, y, and z be the sizes of the axis-aligned bounding box of the scene along the three axes. Then to get $n_r$ voxels in the uniform grid the following formulas are used:

E

c MAX x y z D nr N dvoxel u 3 dvoxel nr x y z Ny MAX 1 Nz MAX 1 Nx MAX 1 D D D u c u c u c where Nx , Ny , and Nz is the number of voxels along the x, y, and z-axis.

(1.2)

Similarly, Klimaszewski [100] developed a method to set the resolution of the uniform grid that we call here a heterogeneous method. In order to get nr voxels, the heterogeneous method uses the following formulas: Nz

GF

3

nr z3 x y H

Ny

IFKJ

nr y Nz x H

Nx

GF N n rN H y z

(1.3)

The non-uniform grid was proposed by Gigante [57]. In the non-uniform grid the splitting planes are also axis-aligned, but along the axis they can be positioned arbitrarily. Non-uniform grids can have a better fit of voxels to the scene geometry, since for sparsely occupied scenes they put more splitting planes into the spatial regions with higher object occupancy. The positioning of the planes is performed according to the histogram of objects along the coordinate axes. The disadvantage of the non-uniform grid is the lower efficiency of its ray traversal algorithm, since the 3D-DDA algorithm cannot be used. Gigante showed that for several tested scenes the performance of the RSA based on the non-uniform grid is lower than that of the RSA based on the uniform grid.

In order to improve further the efficiency of ray traversal algorithms, a few techniques for coding the empty space around each voxel have been developed. For an empty voxel, we can determine the smallest distance to the first full voxel over all directions, measured in the number of empty voxels. When a voxel with such an encoded distance is visited, the ray traversal algorithm can skip all the empty voxels in the direction of the ray. This technique was first introduced by Cohen and Sheffer [38]; for each voxel it determines the smallest distance to a full voxel. A similar but directional technique was presented by Semwal and Kvarnstrom [124]. It considers the six possible directions to encode the smallest distance to the first full voxel. This directionality allows the encoded distance to the first full voxel to be increased. Obviously, both techniques suffer from large storage space and preprocessing requirements that strongly depend on the distribution of objects in the scene.

1.6.3.4

Hierarchical Grids

A hierarchical grid is another attempt to avoid the regularity of uniform grids, since this regularity is particularly inconvenient for sparsely occupied scenes. The common principle used in hierarchical grids is to insert uniform grids recursively into other uniform grids. The known hierarchical grid methods differ strongly in the construction phase. Hierarchical grids also require a more complicated ray traversal algorithm than uniform grids. Three RSAs based on hierarchical grids have been published. For a survey, a more detailed description, and a performance comparison, see [135], or for a less detailed survey [85]. We describe here only the basic properties of RSAs based on hierarchical grids.


Recursive Grids

Recursive grids (abbreviated to RG further in the thesis) simply bring the concept of recursiveness into uniform grids. The principle is as follows: construct a uniform grid over the set of objects and assign the objects to all voxels where they belong. Then recursively descend; that is, for each voxel that contains more objects than a given threshold, construct a grid again. The construction of grids is terminated when the number of objects referenced in the voxel is smaller than a threshold or some maximum depth for grids is hit. This maximum depth is usually set to two or three. The principle is thus similar to the BSP tree or the octree, but at one step a cell is subdivided into more than two or eight child cells.

Hierarchy of Uniform Grids

Cazals et al. [25, 26] published an RSA based on a hierarchy of uniform grids (abbreviated to HUG further in the thesis) that is similar to recursive grids. There are two main differences. First, the inserted grids need not align with the parent grid subdivision. Each grid is understood as an autonomous object possibly contained in another grid. Second, the construction of HUG is quite different. In the first step, the objects are filtered according to their size into groups, and the number of groups is usually either two or three. Let us suppose three groups are constructed by filtering. The first group contains large-size objects, the second group middle-size objects, and the last group small-size objects. Within the middle-size and small-size groups we perform clustering according to the distance between the objects in the group. Thus each group contains several clusters. For objects in the large-size group we construct an initial global grid. For all clusters from the middle-size and small-size groups having a large enough number of objects we also construct a grid. These grids are inserted into the global grid recursively, so the smaller grids are contained in bigger ones. The authors' statement in the introduction to the papers that the method is fully automatic is true as far as it goes; however, it does require at least two parameters to be set initially (the number of groups and delta-connectivity).

Adaptive Grids

Klimaszewski and Sederberg described an RSA based on adaptive grids [100, 98] (further abbreviated to AG). The spatial data structure differs principally from RG and HUG in the construction. The first step of the construction algorithm is the clustering of objects according to some criteria based on the distance between two candidates. A candidate can be an already existing cluster of objects or a single object. The criteria take into consideration the surface areas of the candidates and of the resulting cluster. Moreover, the resulting cluster must be small enough in comparison with the axis-aligned bounding box of the whole scene. In the second phase of the algorithm, a bounding volume hierarchy (BVH) is built up over the clusters. Grids are then constructed for the clusters (leaves of the BVH). The number of children in the interior nodes of the BVH is small, so it is inconvenient to create grids for these nodes. In the last phase, so-called sub-voxel grids are constructed for those voxels where the number of objects is greater than a given threshold. The ray traversal algorithm is thus similar to that for BVH.

1.6.4

Ray-Space Subdivisions

Several RSAs are based on the subdivision of a ray space. The ray space is described by 5-dimensional coordinates. The first three coordinates of the ray space are Euclidean ($\mathbb{E}^3$) and correspond to the origin of the ray. The next two coordinates are spherical ($\sigma^2$) and determine the direction of the ray. The ray space subdivision thus uses $\mathbb{E}^3 \times \sigma^2$ space. Let us suppose we have N objects in the cell, which are indexed from 0 to $N-1$. Then we can define an assignment function for the ray space:

Definition 1 The assignment function $f_A(x, y, z, \varphi, \rho)$ for a set of N objects is the discrete function $f_A : \mathbb{E}^3 \times \sigma^2 \to \{-1, 0, \ldots, N-1\}$ that solves the ray shooting problem directly and is defined as follows:

$$ f_A(x, y, z, \varphi, \rho) = i, \qquad i = \begin{cases} k, \; k \in \langle 0, N-1 \rangle & \text{if } k \text{ corresponds to the index of the closest object intersected} \\ -1 & \text{if the ray does not intersect any object in the cell} \end{cases} $$

The ray space, over which the function $f_A$ is defined, has to be discretised for any practical use. The space $\mathbb{E}^3$ for the origin of the rays is discretised into cells. Similarly, $\sigma^2$ space is also discretised so that a set of solid angles completely covers the unit sphere. Because of the discretisation, one node of the data structure describing $f_A$ does not correspond to a single half-line but to a spatial region called a hyper-cubic region. The assignment function $f_A$ should then return a candidate list of indexes of objects that can be visible from the origin point of a ray, instead of one index of an object. For a particular ray shooting query the correct candidate list is searched and all the objects in the candidate list are tested for intersection with the ray. The object with the closest intersection point is then chosen, if such an object exists. The ray classification scheme was suggested by Arvo and Kirk [17] and further elaborated by Simiakakis [133, 132]. Ray space can be subdivided using some variant of binary space partitioning. This is performed for primary rays and also for higher order rays in ray tracing. The geometric primitive corresponding to the approximation of ray space is a polyhedral volume defined in $\mathbb{E}^3$, also called a beam. Whatever the improvements of the algorithm based on the ray space subdivision over a naïve RSA, the method suffers from an algorithmic paradox: the construction of the candidate lists is much more computationally demanding than with spatial subdivisions. Arvo and Kirk report the high time complexity of detecting polyhedral intersections and suggest an approximation where hyper-cubic regions are bounded by cones [18]. Kwon et al. [102] presented an approach aimed at decreasing the space complexity of ray classification by removing one dimension of ray space. Nevertheless, the results presented by Simiakakis [132] show that the performance of ray classification strongly depends on the distribution of the objects in the scene.
The results of Kwon et al. [102] support this conclusion for six scenes with uniform grids and octrees. Several other similar RSAs based on the directionality of rays have been published. The best known technique, by Ohta and Maekawa [116], is based on the ray coherence theorem, further elaborated by Horvath et al. [89]. This theorem bounds the angle between the rays starting on the first object and hitting the second object. If we consider two spheres of radii $r_1$ and $r_2$ and if the distance between their centers is $d_{12}$, then the maximum size of the angle $\varphi$ is given by:

$$ \cos \varphi = \sqrt{1 - \left( \frac{r_1 + r_2}{d_{12}} \right)^2} $$

RSAs based on the ray-coherence theorem compute, for each object and several primary directions (usually six, corresponding to the three coordinate axes in positive and negative directions), the potential set of objects that can be intersected. The objects in these sets are sorted according to the distance from the base object. When a ray shooting query is to be answered, the ray is tested for intersection with all the objects within the set, starting from the closest object. This technique has at most $O(N^2)$ space complexity. Unfortunately, all RSAs based on ray-space subdivision are actually an approximation of worst-case RSAs, hence they exhibit high storage space and preprocessing time complexity.
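The ray coherence bound can be illustrated with a minimal Python sketch. It assumes that φ is measured between the ray direction and the line joining the two sphere centers, and that the bound has the form $\cos\varphi \ge \sqrt{1 - ((r_1 + r_2)/d_{12})^2}$; the function names `min_cos_phi` and `may_hit` are hypothetical and do not come from the cited papers.

```python
import math

def min_cos_phi(r1: float, r2: float, d12: float) -> float:
    """Lower bound on cos(phi) for rays leaving sphere 1 and hitting sphere 2."""
    s = (r1 + r2) / d12
    if s >= 1.0:
        # Touching or overlapping spheres: the bound restricts nothing.
        raise ValueError("spheres touch or overlap; no angular bound exists")
    return math.sqrt(1.0 - s * s)

def may_hit(direction, c1, c2, r1, r2) -> bool:
    """Conservative test for a ray with unit `direction` starting on sphere 1.

    Returns False only when the ray certainly cannot hit sphere 2.
    """
    axis = [b - a for a, b in zip(c1, c2)]          # centre of 1 -> centre of 2
    d12 = math.sqrt(sum(x * x for x in axis))
    cos_phi = sum(d * x for d, x in zip(direction, axis)) / d12
    return cos_phi >= min_cos_phi(r1, r2, d12)
```

A directional RSA of this kind would use such a test only to build the candidate sets in preprocessing; the exact ray-object intersection test is still needed at query time.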

1.7 Contribution of the Thesis

This thesis covers two major topics. In the first part we develop an RSA computation model. Any RSA can be mapped to this model by using the commonalities of the underlying data structures and ray traversal algorithms of all RSAs. Based on the properties of the computation model we develop an RSA performance model. Based on these two models, we present the methodology for comparing various RSAs based on the results of experiments for a set of scenes. Using this methodology, we compare 12 different RSAs using a set of 30 scenes. The second part of the thesis describes several new methods for improving RSAs based on the kd-tree. It covers both construction and ray traversal algorithms. Emphasis is put on the practical applicability of the achieved results in computer graphics applications. More specifically, the contribution of the thesis is:

General issues concerning RSAs

– design of a new computation model and performance model for RSAs. These models have enabled the development of a new methodology for comparing various RSAs by experiments on a set of scenes [83].
– practical comparison of 12 RSAs for a set of 30 scenes, and the design of testing procedures to shoot rays in a scene in this comparison [84, 79].

Construction algorithms for kd-trees

– a new construction method that uses empty spatial regions in the scene, resulting in improved performance of the kd-tree for RSA.
– a new termination criteria algorithm for kd-tree construction.
– a new principle useful in kd-tree construction that splits the bounding box of the object intersected by the splitting plane into two new bounding boxes.
– a new general cost model for kd-tree construction.
– a new construction algorithm for the kd-tree based on the general cost model for a preferred set of rays fixing either the origin or direction of a ray [77].

Ray traversal algorithms for kd-trees

– a new fast robust recursive ray traversal algorithm for a kd-tree with the minimum number of conditions to be performed [82].
– a new concept of a longest common traversal sequence for a set of rays based on kd-trees, particularly suitable for hidden surface removal [78].

Memory mapping of the nodes of the kd-tree to increase the cache hit ratio, and thus performance, for all traversal algorithms for computer architectures with a large cache line [72, 73].

1.8 Organization of the Thesis

The thesis includes several chapters describing the author’s development together with the description of some previous work. These chapters have been arranged into an order that will be more logical for the reader than the chronological order in which the author’s papers were originally published. For best understanding, the thesis should be read in linear order, starting with the introductory chapter. The chapters include the motivation for the problem, previous work, possible definitions and terminology used, the detailed elaboration of problems, the description of new algorithms, and a summary of results.

The main content of the thesis is presented in Chapters 2–7. In Chapter 2 we deal with the RSA computation model and performance model. These models allow us to develop a methodology for comparing various RSAs based on experiments with a set of scenes. Chapter 3 presents a comparison of 12 RSAs for a set of 30 scenes with different numbers of objects. The performed comparison shows that for the set of scenes the statistically best RSA is that based on the kd-tree. Chapter 4 is devoted to kd-tree construction algorithms in detail. It describes the adaptive positioning of the splitting plane using a cost model, utilization of empty spatial regions in the scene, termination criteria, and the construction for preferred ray sets fixing either the origin or direction of a ray. In Chapter 5 we describe several ray traversal algorithms for the kd-tree in detail. It includes a sequential ray traversal algorithm, a recursive ray traversal algorithm, and two variants of a neighbor-links based ray traversal algorithm. Chapter 6 deals with improving the ray traversal algorithm for the kd-tree to decrease the number of traversal steps for restricted ray sets. Chapter 7 deals with a more hardware-specific issue: how to map the nodes of the kd-tree to the physical memory to improve the data locality during execution of a traversal algorithm working over the kd-tree. Finally, Chapter 8 concludes the thesis with a short summary of all the results and several possible topics for further research.

Chapter 2

Comparison Methodology

In this chapter we develop an RSA computation model that allows us to map any particular RSA to the computation model. Further, we develop an RSA performance model that establishes the correspondence between the computation model and the running time of the RSA for a sequence of ray shooting queries. Based on the RSA computation and performance models, we propose a set of parameters describing the use of RSA in applications that allow us to make a fair comparison of various RSAs for the same set of input data. This chapter follows the paper [83].

2.1 Motivation

One of the main problems in research on RSAs is how to compare them qualitatively and quantitatively. This should be done on a technically sound basis: it should define time and memory complexity, suitability for various types of scenes, and particular features of an RSA. In spite of two decades of research on RSAs in the computer graphics community, it is not yet clear if some particular RSA is more convenient and/or more efficient than any other RSA. Some contradictory statements about RSAs have appeared with the introduction of new types of RSA in the published papers. Only a few papers devoted to the comparison of various RSAs [109, 45, 49] have been published until now. Moreover, each paper that introduces a new RSA must, or at least should, compare the proposed algorithm with some reference algorithm. There is no common choice for the reference algorithm, but in most cases a uniform grid (see Subsubsection 1.6.3.3) was used. The quantitative comparison between a reference RSA and a newly proposed RSA always depends on the software implementation and on a particular hardware platform. For this reason any cross comparison of the results presented in the different papers has been rather problematic or even impossible.

In this chapter we will try to decrease the gap in understanding of the functionality of various RSAs by finding out their commonalities. The commonalities found in all RSAs allow us to describe the RSA computation model in a general way that allows us to map a particular RSA to this computation model. Further, we will define the performance model that establishes the connection between the running time of the application and the different algorithmic operations that are the subject of the computation model. Then we will develop an “ideal RSA” that allows us to compute the answers to ray shooting queries in constant time. The “ideal RSA” results in the smallest possible time that can ever be achieved using a certain hardware and a set of ray-object intersection routines. We use the time consumed by the “ideal RSA” as a reference time value for all the parts of the computation in a specific RSA. These parts cover the time needed to traverse the data structure on which the RSA is based, ray-object intersection tests, and the remaining time consumed by the application.

The design of the computation model and performance model allows us to define the set of thirteen parameters referred to as the minimum testing output that should be reported for one experiment, given a scene, a particular RSA, and a sequence of ray shooting queries. Further, we describe how to get the minimum testing output. The definition of the two models and the minimum testing output is the basis for a methodology for making a fair comparison of various RSAs.

The concepts proposed below form a comparison methodology that allows us to compare various RSAs independently of the hardware and the implementation used. The initial concept of the comparison methodology based on experiments was described in Havran and Žára [86] for the kd-tree, and the technique presented here is a generalised and extended version for any RSA.

2.2 RSA Computation Model

In this section we introduce the RSA computation model. We will show that any RSA currently known or developed in the future can be mapped to this computation model. The computation model is based on the definition of algorithmic operations in an RSA. These algorithmic operations must always be performed due to the nature of the ray shooting problem. We then use the computation model to describe the set of parameters to be reported when an RSA is tested experimentally.

The ray shooting problem can be understood as an instance of geometric range-searching [6], which implies that some data structure is built to answer the specific query. The definition of ray shooting implies that every RSA contains somewhere pointers to the objects that are to be tested for intersection against a given ray. This means that each RSA is separated into two parts (like all algorithms, see [10]): the data structure (further abbreviated to DS) containing at least pointers to the scene objects, and the ray traversal algorithm working over the DS. The lifetime of an RSA is composed of two phases. The first one is called the preprocessing phase and involves the construction of the initial DS. The second one is called the execution phase; within it the RSA answers the given ray shooting queries. More theoretically, an RSA can be described as a special case of a general RAM model [10, 154], where any memory cell can be accessed in constant time or through a series of pointers.

A DS is composed of data entries, here referred to as nodes. A node usually holds pointers to the scene objects, pointers to other nodes of the DS, the description of cells, etc. Nodes of a DS can be divided into two groups: elementary nodes are intended to contain only pointers to objects (and, if the RSA requires it, some other data), whereas generic nodes are all other nodes, which point to other generic and/or elementary nodes.
A special case of elementary nodes are empty elementary nodes that do not contain any pointers to objects and act as “free space containers” within the DS. When answering a ray shooting query in a particular RSA, the computation proceeds as follows. Given a ray R, a ray traversal algorithm begins at a special starting node of a DS and performs a sequence of the following operations:

TRAVERSAL STEP: visit a new node of the DS,

NEW NODE: create a new node of the DS,

DELETE NODE: delete a node from the DS and unlink all pointers to the node from the remaining nodes of the DS,

TEST OBJECTS: when accessing an elementary node of the DS, test objects pointed to in this node for the intersection with the ray R,

finally finding the closest intersected object if such an object exists.

There are two cases: a DS is or is not changed by a ray traversal algorithm. If a DS underlying the RSA is not changed by the ray traversal algorithm, then the operations “NEW NODE” and “DELETE NODE” are not performed during the execution phase. Such an RSA is referred to as an RSA based on a static data structure. There are several RSAs that modify the underlying DS on the fly within the execution phase, for example, ray space subdivision techniques [17]. The operations “NEW NODE” and “DELETE NODE” can be used within the preprocessing phase to build up some initial DS; however, this DS is modified during the execution phase. Such an RSA is referred to as an RSA based on a dynamic data structure.

Since every RSA can be mapped to this general RSA computation model, we can define a common set of parameters to be reported when any RSA is performed on an input scene containing N objects over an input sequence of ray shooting queries. The sequence of ray shooting queries induced by the application for an input scene is associated with a testing procedure. The symbol for the testing procedure is TP. The testing procedure is an algorithm in the application that generates a sequence of ray shooting queries to be answered by a particular RSA. A particular testing procedure TP can be the result of a global illumination algorithm such as ray tracing, etc., or just an artificial algorithm shooting rays to obtain some required distribution of rays in space [84]. We propose to organize the set of parameters resulting from the use of a particular RSA on the input scene and a given TP into three subsets, the first two of them hardware/implementation/compiler independent:

RSA parameters related to static properties of data structure DS:

– If an RSA is based on a static data structure, the parameters depend on the scene only, and they are evaluated at the end of the preprocessing phase.
– If an RSA is based on a dynamic data structure, the parameters depend on the scene and the testing procedure TP, and they are evaluated during the execution phase as maximum values reached.

$N_G$ [−] – maximum number of generic nodes in DS,
$N_E$ [−] – maximum number of elementary nodes in DS,
$N_{EE}$ [−] – maximum number of empty elementary nodes in DS ($N_{EE} \le N_E$),
$N_{ER}$ [−] – maximum number of the pointers to objects in all the elementary nodes of DS ($N_{ER} \ge N$).

RSA parameters related to dynamic properties of data structure DS, i.e., the use of DS within an execution phase of an RSA. The parameters depend on the scene and the testing procedure TP, and they are evaluated at the end of the execution phase:

$r_{ITM}$ [−] – ratio of ray-object intersection tests performed to the minimum number of intersection tests ($r_{ITM} \ge 1.0$, assuming at least one object is intersected for the given scene and TP),
$\tilde{N}_{TS}$ [−] – average number of all DS nodes accessed per ray ($\tilde{N}_{TS} \ge 1.0$),
$\tilde{N}_{ETS}$ [−] – average number of elementary DS nodes accessed per ray ($\tilde{N}_{ETS} \le \tilde{N}_{TS}$),
$\tilde{N}_{EETS}$ [−] – average number of empty elementary DS nodes accessed per ray ($\tilde{N}_{EETS} \le \tilde{N}_{ETS}$).

RSA hardware/implementation/compiler dependent parameters. Obviously, these parameters also depend on the scene and testing procedure TP:

$T_B$ [s] – time to build the initial DS for the RSA in the preprocessing phase,
$T_R$ [s] – running time of the application that uses the RSA. It involves the execution phase of the RSA and possibly other computations. (Obviously, $T_B$ is not included in $T_R$.)

We consider the parameters in the first two subsets as the minimum hardware/implementation/compiler independent parameters to be reported. It is certainly possible to extend the set of parameters by others (for example, the variance of the number of objects in leaves), but we want to keep this set of the smallest possible size that still characterizes an RSA via the computation model.

The parameters $T_B$ and $T_R$ depend not only on the hardware used, but also on the quality of implementation (and programming language), the compiler used and its version, the optimization switches used for compilation, etc. For this reason, all these experimental conditions should be described in detail. The treatment of these parameters related to the implementation makes the problem of comparing various RSAs rather difficult; we describe our solution to the problem in more detail below.
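The dynamic parameters of the computation model can be gathered with a handful of counters incremented inside the ray traversal code. The following Python sketch is only an illustration: the class and field names are hypothetical, and it assumes that the minimum number of intersection tests equals the number of rays that hit some object (one test per hitting ray).

```python
from dataclasses import dataclass

@dataclass
class TraversalCounters:
    rays: int = 0                      # all ray shooting queries
    rays_hit: int = 0                  # queries whose answer is an object
    intersection_tests: int = 0        # TEST OBJECTS: ray-object tests performed
    nodes_visited: int = 0             # TRAVERSAL STEP: all DS nodes accessed
    elementary_visited: int = 0        # elementary nodes accessed
    empty_elementary_visited: int = 0  # empty elementary nodes accessed

    def dynamic_parameters(self) -> dict:
        """Turn raw counts into the per-ray averages reported after a run."""
        if self.rays == 0 or self.rays_hit == 0:
            raise ValueError("need at least one ray that hits an object")
        return {
            "r_SI": self.rays_hit / self.rays,
            # Assumption: minimum number of tests = number of hitting rays.
            "r_ITM": self.intersection_tests / self.rays_hit,
            "N_TS": self.nodes_visited / self.rays,
            "N_ETS": self.elementary_visited / self.rays,
            "N_EETS": self.empty_elementary_visited / self.rays,
        }
```

With these definitions the product of `r_ITM` and `r_SI` equals the average number of intersection tests per ray, which is a convenient sanity check on the counters.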

2.3 RSA Performance Model

The RSA computation model enables us to count the number of basic algorithmic operations performed on average in an RSA. The RSA computation model does not define any cost¹ of these operations in terms of running time; it only covers the description of $T_B$ and $T_R$. In order to establish the relationship between hardware/implementation dependent and independent parameters, we further develop an RSA performance model, which separates the cost of a ray traversal algorithm and the cost of ray-object intersection tests. The concept of the performance model for an RSA was first introduced by Cleary and Wyvill [33] in the context of uniform grid analysis. We present here a more general performance model for any RSA that is derived from the RSA computation model described above. The RSA performance model is based on the decomposition of the running time $T_R$ of the application that uses an RSA into three parts:

– computing ray-object intersection tests,
– traversing the DS of the RSA, and
– the remaining computation effort required by the application.

We bind the time-dependent and independent characteristics by means of the cost consumed by specific algorithmic operations. Then we can express $T_R$ as:

$$T_R = \left( r_{ITM} \cdot r_{SI} \cdot \tilde{C}_{IT} + \tilde{N}_{TS} \cdot \tilde{C}_{TS} \right) \cdot N_{rays} + T_{app} \tag{2.1}$$

where $r_{SI}$ is the ratio of the number of rays intersecting objects to the number of all rays ($r_{SI} \le 1.0$), thus the average number of ray-object intersection tests per ray is $\tilde{N}_{IT} = r_{ITM} \cdot r_{SI}$. Further, $\tilde{C}_{IT}$ [s] is the average cost of a ray-object intersection test, $\tilde{C}_{TS}$ [s] is the average cost of the traversal step of a ray traversal algorithm among the nodes of DS, $N_{rays}$ is the total number of rays induced by a testing procedure TP, and $T_{app}$ [s] is the remaining time of the application. The time $T_{app}$ covers the other computation effort performed in the application; for example, in a rendering application $T_{app}$ might cover the time consumed to compute the ray reflection, lighting, texturing, and other material calculations. Thus $T_{app}$ is always constant for a particular scene and testing procedure TP, provided the same implementation and hardware is used.

We can refine the performance model if we consider the ratio of successful ray-object intersection tests to all intersection tests:

$$T_R = \left[ \left( \tilde{N}_{IT}^{succ} \cdot \tilde{C}_{IT}^{succ} + \tilde{N}_{IT}^{fail} \cdot \tilde{C}_{IT}^{fail} \right) \cdot r_{SI} + \tilde{N}_{TS} \cdot \tilde{C}_{TS} \right] \cdot N_{rays} + T_{app} \tag{2.2}$$

where $\tilde{N}_{IT}^{succ}$ is the average number of successful ray-object intersection tests per ray, $\tilde{C}_{IT}^{succ}$ [s] is the average cost of successful ray-object intersection tests, $\tilde{N}_{IT}^{fail}$ is the average number of failed ray-object intersection tests per ray, and $\tilde{C}_{IT}^{fail}$ [s] is the average cost of failed ray-object intersection tests.
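As a worked illustration of the basic performance model of Eq. 2.1, the following Python sketch evaluates the running time from the model parameters. The function name and the sample values in the usage note are made up for illustration and are not measurements from the thesis.

```python
def running_time(r_itm: float, r_si: float, c_it: float,
                 n_ts: float, c_ts: float,
                 n_rays: int, t_app: float) -> float:
    """Evaluate Eq. 2.1: T_R = (r_ITM * r_SI * C_IT + N_TS * C_TS) * N_rays + T_app."""
    intersection_cost_per_ray = r_itm * r_si * c_it   # N_IT * C_IT
    traversal_cost_per_ray = n_ts * c_ts              # N_TS * C_TS
    return (intersection_cost_per_ray + traversal_cost_per_ray) * n_rays + t_app
```

For example, with `r_itm=2.0`, `r_si=0.5`, `c_it=1e-6` s, `n_ts=10.0`, `c_ts=1e-7` s, one million rays and `t_app=0.5` s, each ray costs about 1 µs for intersection tests and 1 µs for traversal, so the model predicts a running time of about 2.5 s.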

2.4 Ideal RSA

Having described the refined performance model, we can now introduce the “ideal RSA” as an RSA that has the best possible performance. The concept of the “ideal RSA” serves us as the ultimate but in practice unachievable goal. However, it is important since the running time of the “ideal RSA” is used as the reference time value for comparing various RSAs.

¹ For the sake of convenience, we further use the term cost [s] as the running time to perform some particular algorithmic operation.

Definition 2 An “ideal RSA” is an RSA that for a given ray shooting query computes the answer in $O(1)$ time independently of whether an intersected object exists or not. The multiplicative factor hidden behind the $O$-notation is very small.

Since Szirmay-Kalos and Márton [153] proved that any RSA works at least at time $\Omega(\log N)$ in the worst-case, we can ask whether the definition of an “ideal RSA” makes sense. Inspired by the idea of Parametrized Ray Tracing [129], we can construct the “ideal RSA” provided the same testing procedure TP is repeatedly performed for the same scene. Further, it is required that the application code is deterministic in the sense that the testing procedure TP in the application always generates the same sequence of ray shooting queries for a given scene. This can require the setting of initial seeds in pseudo-random generators to the same value in the application, etc. Further, we describe the two procedures that form the “ideal RSA”.

The first assumption that enables us to execute the “ideal RSA” is that the application is run at least twice using the same TP and scene. Each object is assigned the identification tag ID (integer) in the range $\langle 0, N-1 \rangle$. Then we construct the array $A_T$ where objects are addressed directly using IDs of objects in $O(1)$ time. In the first application run we use some conventional RSA to compute the answers to given ray shooting queries. The results obtained by the conventional RSA for the sequence of input ray shooting queries generated by TP are saved linearly to a temporary array $A_S$ using the objects’ IDs. When no object is intersected, the corresponding array entry is set to a special ID value ($ID_{spec} = -1$). Since the number of ray shooting queries can be high, it may be necessary to save the results of the conventional RSA to external memory.
The procedure that must be used in the first application run and the interface between the application and the conventional RSA is outlined in the pseudocode, Algorithm 3.

Algorithm 3 The first run of an “ideal RSA” that saves the results of ray shooting queries.

    {Preprocessing phase}
    Assign each object a unique ID in the range ⟨0, N−1⟩.
    Allocate the array AS to store IDs of objects; the number of entries in AS must be
      greater than or equal to the number of all ray shooting queries generated by TP.
    {The index into the array – order of ray shooting query}
    i ← 0
    {Execution phase}
    function ShootRay(ray R): object
      {Compute the result of the i-th ray shooting query by some other specific RSA.}
      {Compute the result for R using some conventional RSA.}
      Object O ← the result of a conventional RSA for R
      if object O was found then
        AS[i] ← ID of object O
      else
        AS[i] ← IDspec
      end if
      i ← i + 1
      ShootRay ← object O
    {Postprocessing phase}
    Possibly save AS to external memory.

In the second (repetitive) application run, instead of calling a specific RSA, we read the correct answer to the ray shooting query from the array $A_S$, provided that the repetitive run(s) of the application result in the same testing procedure TP and use the same scene. If we get the object’s valid ID, we get the address of the object through the array $A_T$ and compute the ray-object intersection point exactly by at most one ray-object intersection test. This computation is required to get the correct signed distance for the current ray shooting query. If the object’s ID has the value $ID_{spec}$, then the answer to the ray shooting query is “no object”, and no ray-object intersection test is computed. Since the ray-object intersection test is computed at most once for each ray shooting query, the “ideal RSA” runs in $O(1)$ time. The “ideal RSA” performed in a repetitive run of the application is outlined in the pseudocode, Algorithm 4.

Algorithm 4 The second run of an “ideal RSA” reading the results of ray shooting queries.

    {Preprocessing phase}
    Assign each object its unique ID in the range ⟨0, N−1⟩.
      {These IDs correspond to the first run of “ideal RSA”.}
    Allocate the array AS to store IDs of objects.
    Possibly read AS from the external memory.
    Allocate the array AT to store the pointers to objects; the size of AT is the number of objects.
    for each object O specified by its ID do
      AT[ID] ← address of the object O
    end for
    {The index into the array – order of ray shooting query}
    i ← 0
    {Execution phase}
    function ShootRay(ray R): object
      ID of object ← AS[i]
      i ← i + 1
      if ID ≠ IDspec then
        Object O ← AT[ID]
        Compute the signed distance for the ray R and the object O.
      else
        Object O ← “no object”
      end if
      ShootRay ← object O

For the repetitive run(s) of the “ideal RSA” the time $T_R$ becomes the minimum application running time $T_R^{MIN}$:

$$T_R^{MIN}\;[\mathrm{s}] = T_{RSA}^{MIN} + T_{app} \tag{2.3}$$

where $T_{RSA}^{MIN}$ is the minimum time devoted to ray shooting only, further called the ideal ray shooting time:

$$T_{RSA}^{MIN}\;[\mathrm{s}] = \tilde{C}_{IT}^{succ} \cdot N_{rays} \cdot r_{SI} \tag{2.4}$$

If external memory is used to save the array $A_S$, we should exclude the time consumed to transfer the data from this external memory to internal memory, in order to minimize the repetitive running time of the application $T_R$. In practice², array $A_S$ is read from a file by blocks to internal memory, and the time for reading the blocks is not included in $T_R^{MIN}$.

² From the implementation point of view, the “ideal RSA” is fairly easy to implement in the application.
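The two runs of the “ideal RSA” can be sketched in Python as a record-and-replay wrapper. This is a minimal illustration only: the one-dimensional `intersect_1d` test and the naïve `naive_shoot` stand in for real intersection routines and for "some conventional RSA", all names are hypothetical, and the array $A_S$ is kept in memory rather than in a file.

```python
ID_SPEC = -1  # special ID meaning "no object"

def intersect_1d(ray, x):
    """Hypothetical ray-object test on a line: ray = (origin, direction)."""
    origin, direction = ray
    t = (x - origin) / direction
    return t if t > 0.0 else None  # signed distance, or None for a miss

def naive_shoot(objects, ray, intersect):
    """Stand-in for a conventional RSA: test every object, keep the closest."""
    best_id, best_t = ID_SPEC, float("inf")
    for obj_id, obj in enumerate(objects):
        t = intersect(ray, obj)
        if t is not None and t < best_t:
            best_id, best_t = obj_id, t
    return best_id

def first_run(objects, rays, intersect):
    """First run: record the answer of every query into the array A_S."""
    return [naive_shoot(objects, ray, intersect) for ray in rays]

class IdealRSA:
    """Second run: replay A_S with at most one intersection test per query."""

    def __init__(self, objects, answers, intersect):
        self.table = list(objects)  # A_T: object ID -> object
        self.answers = answers      # A_S: recorded object IDs
        self.intersect = intersect
        self.i = 0                  # order of the ray shooting query
        self.tests = 0              # intersection tests actually performed

    def shoot_ray(self, ray):
        obj_id = self.answers[self.i]
        self.i += 1
        if obj_id == ID_SPEC:
            return None             # "no object": no test is computed
        self.tests += 1
        return obj_id, self.intersect(ray, self.table[obj_id])
```

The replay is only valid if the second run issues exactly the same query sequence as the first one, which is why the testing procedure must be deterministic.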

2.5 Minimum Testing Output

The results of experiments published in the papers introducing new RSAs were often restricted only to times $T_B$ and $T_R$ and some other parameters. Based on these hardware dependent parameters, we could not fairly compare newly introduced RSAs with those published in the past. It follows from the description of the computation and performance model that experiments allowing us to fairly compare various RSAs must be performed for the same scene and testing procedure TP. For this purpose Haines introduced a Standard Procedural Database [69]. This database enables us to procedurally generate a set of scenes with various numbers of objects. It also defines some standard sizes of the scenes that should preferably be used for testing RSAs. However, the recommended use of SPD scenes for testing RSAs has often been ignored, and research papers often show results for testing performed on private scenes, or on only a small subset of SPD scenes. Such behavior is a direct violation of research etiquette, since the nature of science is that every research paper should describe new techniques and experiments that will be reproducible and verifiable by all following researchers [37]. Therefore, whenever possible, qualitative properties of algorithms should always be tested on non-private input data.

Let us discuss why the comparison of various RSAs based only on the time $T_R$ consumed by the whole application is rather incorrect. The first reason is that $T_R$ also includes $T_{app}$, which is constant. If we want to compare the ratio of performances of various RSAs on the same hardware and with the same implementation, instead of comparing $T_{R1}$ for RSA1 and $T_{R2}$ for RSA2 it is more correct to compare $T_{R1} - T_{app}$ with $T_{R2} - T_{app}$, since then we consider only the time consumed by the RSA. Obtaining the value of $T_{app}$ can be difficult, as it usually requires profiling of the application by some software tool. We propose a way of avoiding the use of a profiler in the section below.
The value of $T_R$ can be used correctly only for ranking RSAs, but it cannot be used to express how much faster one RSA is than another.

The SPD package [69] also recommends that some time-independent characteristics should be reported: $N_{rays}$, $\tilde{N}_{IT}^{succ} \cdot N_{rays}$, $\tilde{N}_{TS} \cdot N_{rays}$. We follow this approach by extending this set of hardware/implementation independent characteristics. In order to avoid mutually contradictory statements in further papers concerning RSAs, we define a set of parameters to be reported from the experiments. We call the set of parameters the minimum testing output. This consists of three subsets as already presented: RSA parameters that relate to static properties of DS, RSA parameters that relate to dynamic use of DS, and RSA parameters dependent on hardware/implementation. The hardware/implementation dependent characteristics $T_B$ and $T_R$ are supplemented by three other parameters. We normalize the time portions devoted to the particular phases to the ideal ray shooting time $T_{RSA}^{MIN}$ to allow us to make a fair comparison among different implementations and different hardware used for testing. Our main goal is that the parameters in the minimum testing output should allow us to compare the performance of various RSAs independently of hardware and implementation issues. We define the minimum testing output for an RSA as:

subset Σ of parameters describing the static properties of a DS within the RSA:

$$\Sigma = \{ N_G, N_E, N_{EE}, N_{ER} \},$$

subset ∆ of parameters describing the dynamic use of the data structure DS within the RSA, which also depend on the input scene and testing procedure TP:

$$\Delta = \{ r_{ITM}, \tilde{N}_{TS}, \tilde{N}_{ETS}, \tilde{N}_{EETS} \},$$

and subset Θ of hardware/implementation/compiler dependent parameters of the RSA that concern running time, which also depend on the input scene and testing procedure TP:

$$\Theta = \{ T_B, T_R, \Theta_{APP}, \Theta_{rat}, \Theta_{RUN} \} = \left\{ T_B,\; T_R,\; \frac{T_{app}}{T_{RSA}^{MIN}},\; \frac{N_{rays} \left( \tilde{N}_{IT}^{succ} \cdot \tilde{C}_{IT}^{succ} + \tilde{N}_{IT}^{fail} \cdot \tilde{C}_{IT}^{fail} \right)}{T_R - T_{app}},\; \frac{T_R - T_{app}}{T_{RSA}^{MIN}} \right\} \tag{2.5}$$

The parameter $\Theta_{APP}$ expresses the ratio of the remaining application time to the ideal ray shooting time $T_{RSA}^{MIN}$, which is not necessary for the comparison of RSAs but is suitable for other reasons, as we will show later. The parameter $\Theta_{rat}$ ($\Theta_{rat} \in \langle 0, 1 \rangle$) is the ratio of the time required for computing the ray-object intersection tests to the time consumed only by an RSA. The parameter $\Theta_{RUN}$ gives the ratio of the time consumed by an RSA to $T_{RSA}^{MIN}$.

The time portions related to the ideal ray shooting time $T_{RSA}^{MIN}$ can be difficult to measure. In the next section we deal further with this problem. The value of $T_{RSA}^{MIN}$ enables us to compare different hardware/implementation/compiler dependent characteristics. Subset Θ contains the value of the ideal ray shooting time $T_{RSA}^{MIN}$ only indirectly, since it can be computed as:

$$T_{RSA}^{MIN} = \frac{T_R}{\Theta_{APP} + \Theta_{RUN}} = \frac{T_R}{\dfrac{T_{app}}{T_{RSA}^{MIN}} + \dfrac{N_{rays} \cdot \tilde{N}_{TS} \cdot \tilde{C}_{TS}}{T_{RSA}^{MIN}} + \dfrac{N_{rays} \left( \tilde{N}_{IT}^{succ} \cdot \tilde{C}_{IT}^{succ} + \tilde{N}_{IT}^{fail} \cdot \tilde{C}_{IT}^{fail} \right)}{T_{RSA}^{MIN}}} \tag{2.6}$$

2.6 Measuring the Minimum Testing Output

The minimum testing output allows us to make a fair comparison of various RSAs. We pay for it by the additional effort needed to get this set of thirteen parameters for one experiment. The counters to get the subsets Σ and ∆ must be coded inside the RSA in its preprocessing and execution phase, which is fairly easy to implement. It is advantageous to check these counters for verification purposes as well, since they can reveal an implementation error of a particular RSA. In order to have a correct implementation of a particular RSA given some testing procedure TP and scene, the parameters $N_{rays}$ and $r_{SI}$ must have correct values when the application run is over. Although $N_{rays}$ can be considered as an independent input quantity, it is often the case that the number of rays generated is connected with the use of an RSA and thus $N_{rays}$ is dependent on the correctness of the RSA. For example, this is the case for higher order rays in various global illumination algorithms. Reference values of $N_{rays}$ and $r_{SI}$ can be obtained by running another RSA that is known to be correct. The simplest way is to implement the naïve RSA, although the naïve RSA is inefficient.

To obtain subset Θ we need the running time $T_R$ to be decomposed into three portions: the time for the ray traversal algorithm performed within the RSA, the time of the ray-object intersection tests performed within the RSA, and the remaining application time $T_{app}$. There are two ways to obtain subset Θ; the two profiling methods are described below.
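Once the time portions are known, subset Θ follows directly from the parameter descriptions. The Python sketch below assumes the definitions $\Theta_{APP} = T_{app}/T_{RSA}^{MIN}$, $\Theta_{rat} = T_{IT}/(T_R - T_{app})$ and $\Theta_{RUN} = (T_R - T_{app})/T_{RSA}^{MIN}$, where $T_{IT}$ denotes the total time of ray-object intersection tests; the function and key names are hypothetical.

```python
def theta_subset(t_b: float, t_r: float, t_it: float,
                 t_app: float, t_rsa_min: float) -> dict:
    """Compute the hardware-dependent subset Theta from measured time portions.

    t_it      -- total time spent in ray-object intersection tests
    t_app     -- remaining application time
    t_rsa_min -- ideal ray shooting time measured with the "ideal RSA"
    """
    t_rsa = t_r - t_app  # time consumed by the RSA itself
    return {
        "T_B": t_b,
        "T_R": t_r,
        "Theta_APP": t_app / t_rsa_min,
        "Theta_rat": t_it / t_rsa,
        "Theta_RUN": t_rsa / t_rsa_min,
    }
```

As a consistency check, `T_R / (Theta_APP + Theta_RUN)` recovers the ideal ray shooting time, so a report containing Θ carries $T_{RSA}^{MIN}$ implicitly.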

2.6.1 Software Tool Profiling

One way to get subset Θ is to use a software profiler tool. This is a common method for solving performance issues in software applications. It enables us to distinguish the times consumed within particular software functional units, such as functions, procedures, or even the lines of a source code. Then we can sum the time devoted to ray-object intersection test routines, the time consumed by traversing the nodes of a DS, and the remaining application time. We also need to profile the run of the application with the “ideal RSA”. In this case we need to sum only the time consumed by ray-object intersection tests, which gives us $T_{RSA}^{MIN}$.

Software tool profiling should be preferred for getting Θ, since it provides or at least should provide precise values. However, under certain conditions this is not possible, for one of the following reasons:

– the profiler is not available,
– the profiler does not work correctly,
– the profiler cannot determine the time portions of the required RSA parts within a given implementation,
– the profiler needs some compiler switches to be used, which influences $T_R$ (a debugging switch is usually required, and this can increase the application time considerably) and the different time portions of $T_R$.

Therefore we propose below an alternative way to obtain subset Θ without using a software profiler tool.

2.6.2 Multiple Run Profiling

This profiling method involves running the application several times and computing the unknown variables in Eq. 2.2 from linear equations. Eq. 2.2 contains four unknown variables that express the costs of distinct algorithmic operations in some RSA application: $\tilde{C}_{IT}^{succ}$, $\tilde{C}_{IT}^{fail}$, $\tilde{C}_{TS}$, and $T_{app}$. In order to obtain the four unknown variables we need four different application runs. These have to use the same sequence of ray shooting queries, but they have to result in different total running times. For this purpose we utilize the concept of the “ideal RSA”, and modify the ray-object intersection tests to be performed K times. The first equation comes from the common application run, described by Eq. 2.2. The second equation is for the application run in which the ray-object intersection test is performed K times, resulting in the running time:

$$T_R^{K}(\mathcal{S}) = \left[ K \left( \tilde{N}_{IT}^{succ}\, \tilde{C}_{IT}^{succ} + \tilde{N}_{IT}^{fail}\, \tilde{C}_{IT}^{fail} \right) + \tilde{N}_{TS}\, \tilde{C}_{TS} \right] N_{rays} + T_{app} \tag{2.7}$$

The third equation is the time of the “ideal RSA”, Eq. 2.3. The fourth equation is for the case in which the ray-object intersection test in the “ideal RSA” is performed K times, resulting in the running time:

$$T_R^{MIN,K}(\mathcal{S}) = K\, \tilde{C}_{IT}^{succ}\, N_{rays}\, r_{SI} + T_{app} \tag{2.8}$$

The portions of time consumed by these four runs of an RSA application are visualized in Fig. 2.1.

Figure 2.1: Visualisation of the running times for four runs of an RSA application. (a) Normal application run (Eq. 2.2). (b) Normal application run, ray-object intersection test K times (Eq. 2.7). (c) “Ideal RSA” run (Eq. 2.3). (d) “Ideal RSA” run, ray-object intersection test K times (Eq. 2.8). Notation: $T_{IT}$ is the time devoted to ray-object intersection tests, $T_{TS}$ the time devoted to traversing, $T_{app}$ the remaining application time, and $T_{RVF}$ the time required to read the visibility data from a file.

Assuming that $\tilde{C}_{IT}^{succ}$, $\tilde{C}_{IT}^{fail}$, $\tilde{C}_{TS}$, and $T_{app}$ have the same values in these four application runs, we can compute these unknown variables by solving a system of linear equations. From Eqs. 2.3 and 2.8, we get $\tilde{C}_{IT}^{succ}$ and $T_{app}$ as follows:

$$\tilde{C}_{IT}^{succ} = \frac{T_R^{MIN,K} - T_R^{MIN}}{N_{rays}\,(K-1)\, r_{SI}} \tag{2.9}$$

$$T_{app} = T_R^{MIN} - N_{rays}\, r_{SI}\, \tilde{C}_{IT}^{succ} \tag{2.10}$$

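From Eqs. 2.2 and 2.7 we derive $\tilde{C}_{IT}^{fail}$ and $\tilde{C}_{TS}$:

$$\tilde{C}_{IT}^{fail} = \left[ \frac{T_R^{K} - T_R}{N_{rays}\,(K-1)} - \tilde{C}_{IT}^{succ}\, \tilde{N}_{IT}^{succ} \right] \frac{1}{\tilde{N}_{IT}^{fail}} \tag{2.11}$$

$$\tilde{C}_{TS} = \left[ \frac{T_R - T_{app}}{N_{rays}} - \tilde{C}_{IT}^{succ}\, \tilde{N}_{IT}^{succ} - \tilde{C}_{IT}^{fail}\, \tilde{N}_{IT}^{fail} \right] \frac{1}{\tilde{N}_{TS}} \tag{2.12}$$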
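The four-run scheme can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function and parameter names are hypothetical, and the last step assumes that Eq. 2.2 has the form $T_R = (\tilde{N}_{IT}^{succ} \tilde{C}_{IT}^{succ} + \tilde{N}_{IT}^{fail} \tilde{C}_{IT}^{fail} + \tilde{N}_{TS} \tilde{C}_{TS})\, N_{rays} + T_{app}$, i.e. Eq. 2.7 with K = 1.

```python
def multiple_run_profile(t_r, t_r_k, t_min, t_min_k, k, n_rays, r_si,
                         n_it_succ, n_it_fail, n_ts):
    """Recover the four unknown costs from four measured running times.

    t_r      normal run (Eq. 2.2)
    t_r_k    normal run, intersection tests repeated k times (Eq. 2.7)
    t_min    "ideal RSA" run (Eq. 2.3)
    t_min_k  "ideal RSA" run, intersection tests repeated k times (Eq. 2.8)
    n_it_succ, n_it_fail, n_ts are per-ray counts taken from the RSA counters.
    All names are hypothetical; they are not taken from the thesis code.
    """
    if k <= 1:
        raise ValueError("k must be greater than 1")
    # Eq. 2.9: cost of one successful intersection test.
    c_succ = (t_min_k - t_min) / (n_rays * (k - 1) * r_si)
    # Eq. 2.10: remaining application time.
    t_app = t_min - n_rays * r_si * c_succ
    # Eq. 2.11: cost of one failed intersection test.
    c_fail = ((t_r_k - t_r) / (n_rays * (k - 1)) - c_succ * n_it_succ) / n_it_fail
    # Eq. 2.12 (assumed form): cost of one traversal step.
    c_ts = ((t_r - t_app) / n_rays - c_succ * n_it_succ - c_fail * n_it_fail) / n_ts
    return c_succ, c_fail, c_ts, t_app
```

Feeding the function running times synthesised from known costs is a quick way to check an implementation before trusting it on real measurements.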

We call the profiling method based on the four equations multiple run profiling. It is oriented only to software applications that use an RSA.

2.6.2.1 Properties

Multiple run profiling has one big advantage: it does not require any software profiler tool. However, it suffers from several disadvantages. First, it requires the multiple ray-object intersection test to be implemented in the routines for all object types in the application. Second, the time of the multiple ray-object intersection test is affected by the cache behavior of the processor used during the experiments. Even if the ray-object intersection test is performed K times, instead of being K times slower it is only $K'$ times slower, where $0 < K' < K$. Further, we have found during the testing of an “ideal RSA” that caching and branch prediction within the processor also influence the time of ray-object intersection tests within the “ideal RSA”. Since in this case the ray-object intersection test is always positive (Eqs. 2.3 and 2.8), the branches are always well predicted, resulting in a lower cost $\tilde{C}_{IT}^{succ}$. Our experiments matched software tool profiling best when using $K = 2$. Third, the application must be run at least five times over the same input scene. In addition to the four application runs described above, one run is required to save the results of the RSA into the visibility array AS for the “ideal RSA”, so that saving these results does not influence $T_R$ in the application run corresponding to Eq. 2.2.

The second way is to obtain directly some other estimates of $\tilde{C}_{TS}$, $\tilde{C}_{IT}^{succ}$, $\tilde{C}_{IT}^{fail}$, and $T_{app}$. Some of these may be known or well estimated for the given hardware, implementation, and compiler, independently of TP and from some previous runs of the application. Provided $T_{app} \gg T_{RSA}^{MIN}$, we can also assume $T_{app} \approx T_R^{MIN}$. Third, we can obtain the estimates for $\tilde{C}_{IT}^{succ}$ and $\tilde{C}_{IT}^{fail}$ when we use the same RSA with a different setting for the construction of the data structure underlying the RSA. For example, if a kd-tree is used, we can set the maximum allowed depth to various constants; we then get a set of equations of the type of Eq. 2.2, which allows us to compute $\tilde{C}_{TS}$.
Although multiple run profiling has several disadvantages, it remains the only known way when software tool profiling is not possible for some reason. Below, we improve this method using some correction parameters.

2.6.2.2 Corrected Measuring Subset Θ

To obtain more precise profiling results we can modify the equations of the application runs to model the behavior of caching and branch prediction to some extent. We propose to use three correction parameters in this modified multiple run profiling: $r_{corr}^{rep}$, $r_{corr}^{hit}$, and $r_{corr}^{fail}$. They express the ratio between the time of an operation that is expected to be cached and the time of the uncached operation, all of them in the range from 0.0 to 1.0. First, we correct $\tilde{C}_{IT}$ provided that the ray-object intersection test always succeeds:

$$\tilde{C}_{IT}' = \tilde{C}_{IT}\, r_{corr}^{rep} \tag{2.13}$$

Second, we correct $\tilde{C}_{IT}$ of the repetitive successful ray-object intersection test:

$$\tilde{C}_{IT}'^{\,succ}(K) = \tilde{C}_{IT}^{succ} \left( 1 + (K-1)\, r_{corr}^{hit} \right) \tag{2.14}$$

Third, we correct $\tilde{C}_{IT}$ of the repetitive failed ray-object intersection test:

$$\tilde{C}_{IT}'^{\,fail}(K) = \tilde{C}_{IT}^{fail} \left( 1 + (K-1)\, r_{corr}^{fail} \right) \tag{2.15}$$

Then we can express the three corrected equations as follows:

$$T_R^{MIN}(\mathcal{S}) = N_{rays}\, r_{SI}\, r_{corr}^{rep}\, \tilde{C}_{IT}^{succ} + T_{app} \tag{2.16}$$

$$T_R^{K}(\mathcal{S}) = \left[ \tilde{N}_{IT}^{succ}\, \tilde{C}_{IT}^{succ} \left( 1 + (K-1)\, r_{corr}^{hit} \right) + \tilde{N}_{IT}^{fail}\, \tilde{C}_{IT}^{fail} \left( 1 + (K-1)\, r_{corr}^{fail} \right) + \tilde{N}_{TS}\, \tilde{C}_{TS} \right] N_{rays} + T_{app} \tag{2.17}$$

and

$$T_R^{MIN,K}(\mathcal{S}) = \tilde{C}_{IT}^{succ}\, r_{corr}^{rep} \left( 1 + (K-1)\, r_{corr}^{hit} \right) N_{rays}\, r_{SI} + T_{app} \tag{2.18}$$

Similarly to Eqs. 2.9–2.12 we can derive the formulas to obtain $\tilde{C}_{TS}$, $\tilde{C}_{IT}^{succ}$, $\tilde{C}_{IT}^{fail}$, and $T_{app}$. Corrected measuring is more precise, but it requires us to set the correction parameters. These can be estimated by using a software profiler when compiling without optimization switches, or from runs with a different setting of K. Another way is to use various ray traversal algorithms: with a correct setting of the correction parameters, $Θ_{IT}$ remains the same, because the number of ray-object intersection tests does not differ, whereas $Θ_{TS}$ depends on the ray traversal algorithm used.
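A hedged sketch of that derivation follows. It assumes that Eq. 2.2 is left uncorrected, that Eq. 2.16 reads $T_R^{MIN} = N_{rays}\, r_{SI}\, r_{corr}^{rep}\, \tilde{C}_{IT}^{succ} + T_{app}$, and that Eqs. 2.17 and 2.18 scale each repeated test by $1 + (K-1)\, r_{corr}$. Subtracting Eq. 2.16 from Eq. 2.18 and Eq. 2.2 from Eq. 2.17 then gives:

```latex
\begin{align*}
\tilde{C}_{IT}^{succ} &= \frac{T_R^{MIN,K} - T_R^{MIN}}
    {N_{rays}\, r_{SI}\, r_{corr}^{rep}\, (K-1)\, r_{corr}^{hit}} \\
T_{app} &= T_R^{MIN} - N_{rays}\, r_{SI}\, r_{corr}^{rep}\, \tilde{C}_{IT}^{succ} \\
\tilde{C}_{IT}^{fail} &= \left[ \frac{T_R^{K} - T_R}{N_{rays}\,(K-1)}
    - \tilde{N}_{IT}^{succ}\, \tilde{C}_{IT}^{succ}\, r_{corr}^{hit} \right]
    \frac{1}{\tilde{N}_{IT}^{fail}\, r_{corr}^{fail}} \\
\tilde{C}_{TS} &= \left[ \frac{T_R - T_{app}}{N_{rays}}
    - \tilde{N}_{IT}^{succ}\, \tilde{C}_{IT}^{succ}
    - \tilde{N}_{IT}^{fail}\, \tilde{C}_{IT}^{fail} \right] \frac{1}{\tilde{N}_{TS}}
\end{align*}
```

Setting all three correction parameters to 1.0 reduces these expressions to the uncorrected Eqs. 2.9–2.12, which is a useful consistency check.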

2.7 Comparison Methodology

The establishment of the subsets Σ, ∆, and Θ of the minimum testing output enables us to compare different features of RSAs. When an RSA is used in an application and we are concerned with the space and time complexity of the RSA, there are several different features for us to distinguish:

$\mathcal{S}$: the complexity of the input scene is important; for example, some RSA can be efficient for scenes with a small number of objects, although slow for scenes with a higher number of objects. (The notion of scene complexity is itself a specific issue, dealt with in Section 3.3.) The scene influences all parameters in Σ, ∆, and Θ.

RSA: the idea behind the RSA has a major impact on performance. The RSA influences Σ, ∆, and Θ.

TP: the testing procedure is specific to the application used, and the use of the RSA can vary greatly. It influences only ∆ and Θ for an RSA based on a static data structure; otherwise, it also influences Σ.

HW: the type of hardware used; this influences all parameters in subset Θ, particularly $T_B$ and $T_R$.

COMP: the compiler, its version, and the switches used can influence $T_B$ and $T_R$ significantly, and thus all parameters in subset Θ. (For example, setting the optimization switch “-O2” of the C++ compiler in the UNIX operating system can decrease the running time by half compared with “-O0”.)

IMPL: the implementation; the actual coding of the algorithm also has a great impact on performance, depending on the programmer’s experience, etc. Various implementations of the same ideas can exhibit significant differences in performance. It influences only subset Θ. When the RSA is (re)implemented correctly, the parameters in subsets Σ and ∆ are not influenced.

We note that HW, COMP, and IMPL can be intertwined to some extent, since a certain implementation can better fit a certain hardware, etc. It is obvious that so many degrees of freedom make the comparison of various RSAs rather difficult in general, especially for subset Θ. For example, if we want to compare two different RSAs, we have to fix as many of the other dimensions as possible, in this case $\mathcal{S}$, TP, HW, COMP, and IMPL. As the minimum requirement, the same scene and the same testing procedure must be used within the application. The existence of the dimensions HW, COMP, and IMPL prevents the direct use of $T_B$ and $T_R$ for comparing various RSAs. Some parameters in subsets Σ, ∆, and Θ allow us to compare even such cases, due to the generality of the underlying RSA computation and performance model. In general, we can perform the following comparisons for one experiment using the same TP and scene for two ray shooting algorithms RSA$_1$ and RSA$_2$, where values for RSA$_1$ are denoted by superscript 1 and values for RSA$_2$ by superscript 2:

memory complexity: we compare $N_G^1 + N_E^1$ with $N_G^2 + N_E^2$. Up to a constant factor given by the implementation of a particular RSA, this expresses the different memory requirements.

use of hierarchy: we compare $N_G^1 / N_E^1$ with $N_G^2 / N_E^2$.

use of empty space: we compare $N_{EE}^1 / N_E^1$ with $N_{EE}^2 / N_E^2$.

time complexity: we have several choices depending on the conditions for comparison:

– $T_R^1 - T_{app}^1$ with $T_R^2 - T_{app}^2$ for the performance ratio, $T_R^1$ with $T_R^2$ for ranking only. Time can be used directly for comparison when the HW, COMP, and IMPL attributes of the RSAs to be compared are the same. For all experiments the HW/COMP/IMPL attributes must always be stated explicitly.

– $Θ_{RUN}^1$ with $Θ_{RUN}^2$: concerns the time required for ray shooting in the application relative to the ideal ray shooting time. Assuming that the implementation of ray-object intersection tests is practically the same, this enables a fair comparison independent of the HW, COMP, and IMPL attributes. The parameter $Θ_{RUN}$ defines how far the tested RSA is from the “ideal RSA”, and thus the maximum portion of the time that could possibly be saved by some RSA with higher performance. It can be considered the main index of performance. Unlike comparing $T_R^1 - T_{app}^1$ with $T_R^2 - T_{app}^2$, it enables us to compare different RSAs virtually independently of HW, IMPL, and COMP.

– $Θ_{rat}^1$ with $Θ_{rat}^2$: we can compare how much of the time for an RSA is devoted to computing ray-object intersection tests.

– $Θ_{RUN}^1\, Θ_{rat}^1$ with $Θ_{RUN}^2\, Θ_{rat}^2$: concerns the portion of time for ray-object intersection tests. It can be used virtually independently of HW, COMP, and IMPL.

– $Θ_{RUN}^1 (1 - Θ_{rat}^1)$ with $Θ_{RUN}^2 (1 - Θ_{rat}^2)$: concerns the portion of time for traversing and manipulating data structures. It can be used virtually independently of HW, COMP, and IMPL.

– $r_{ITM}^1$ with $r_{ITM}^2$: an efficient RSA should have a ratio of ray-object intersection tests performed to the minimum number of intersection tests that is as close to 1.0 as possible.

– $\tilde{N}_{TS}^1$ with $\tilde{N}_{TS}^2$: an efficient RSA has a number of traversal steps per ray that is as small as possible.

– $\tilde{N}_{EETS}^1 / \tilde{N}_{ETS}^1$ with $\tilde{N}_{EETS}^2 / \tilde{N}_{ETS}^2$: this shows us the utilization of empty space within the execution phase. It can have a great impact on RSA performance.

Based on these developments, we can formulate a comparison methodology for two or more RSAs. First, we map each tested RSA to the RSA computation model described in Section 2.2. When performing a set of experiments for a set of scenes we have to measure the minimum testing output for all the experiments. (The scenes should be publicly available; SPD scenes are suitable.) The testing procedure used within the application has to be the same for one scene and any RSA, and it must be well described. (Four testing procedures are described in the next chapter.) This guarantees the same sequence of ray shooting queries and thus the correctness and reproducibility of experiments. Then, we can compare various features of the tested RSAs as described above, for each scene used and also for the whole set of scenes using basic statistical tools (e.g., minimum, maximum, average, variance). It is necessary to report the minimum testing output in full for each scene and each experiment in the research work, for example, when introducing a new RSA.

2.8 Discussion

The proposed minimum testing output organized into three subsets has a total of thirteen parameters, which can be considered a high number. Nonetheless, we consider this the minimum set of parameters that shows the different features of an RSA, since it is based on the general computation model that fits any RSA. The minimum testing output contains both hardware/implementation independent and hardware/implementation dependent characteristics that allow us to make mutual comparisons of various RSAs under certain conditions. The disadvantage of this comparison methodology is the underlying assumption that the ray-object intersection tests are equally efficient for various object shapes in different implementations. Fortunately, the ray-object intersection tests for the object shapes in the SPD package are more or less standardized [3, 18, 88]. There is a set of standard scenes and a well-defined testing procedure, namely ray tracing in the SPD package. Nevertheless, as we have shown, at least the same scene and the same testing procedure TP must be used for the comparison to be valid.

Let us now discuss whether it is possible to manipulate the minimum testing output by changing the quality of the implementation. It is not possible to influence the parameters in subsets Σ and ∆, assuming that the implementation of the statistics counters and of the RSA itself is correct. If less efficient or more efficient ray-object intersection tests are applied, then practically all the parameters in Θ are influenced, excluding $T_B$. However, it is virtually impossible to influence the value of $Θ_{RUN}\, Θ_{rat}$.

2.9 Conclusion

In this chapter we have shown the concept that is common for all RSAs, i.e., an RSA computation model and performance model. We have described the “ideal RSA” that provides us with the reference value for comparing two RSAs. Further, we have presented a methodology for comparing various RSAs. However, after the analysis we performed it is clear that an experimental comparison of various RSAs still remains a difficult problem in general. The comparison methodology presented here enables us to compare various RSAs, assuming that each application uses the same sequence of ray shooting queries for the same set of objects. The construction of an “ideal RSA”, which gives us the minimum time devoted to ray shooting only, also shows us the time in the best possible and ideal case given a hardware and implementation. The minimum application running time expresses the minimum time of a particular application that uses an “ideal RSA” for a given scene and testing procedure. A by-product of this development is that we can measure how far we are from the minimum application running time ever achievable in dependence on the RSA used, given the application implementation that computes the set of ray shooting queries on the tested hardware. For example, it can then be shown whether or not it is possible to compute a particular global illumination task such as ray tracing in real time, given a certain hardware and a certain software implementation.

Chapter 3

Best Efficiency Ray Shooting Algorithm

In this chapter we present an experimental efficiency study of several RSAs. For this purpose we propose four testing procedures that induce various sets of ray shooting queries simulating the use of an RSA in global illumination algorithms. We use the testing procedures to produce statistics for comparison based on the developments put forward in the previous chapter. We compare various RSAs that have been reimplemented following the published literature. For reasons of space we report only the main results of 1440 experiments for 30 SPD scenes and 12 RSAs.

3.1 Motivation

Looking at the literature addressing RSAs we find that many recent results are either limited to a subset of SPD scenes, or have been measured on private scene sets. Additionally, it is not known whether SPD scenes, although they are scalable, are a good representative set of scenes suitable for testing various RSAs. The time complexity of most developed heuristic RSAs is unknown, or it is formally $\mathcal{O}(1)$, which is of no use for a quantitative or qualitative comparison, since the timings measured in experiments vary greatly. Further, the same RSA on different hardware can have better or worse implementations. As a result of the factors mentioned above, many papers published about RSAs contain mutually contradictory statements.

Researchers involved in RSAs have long tried to find the best efficiency RSA, i.e., an RSA that outperforms all other known RSAs for any set of ray shooting queries and any input scene. As might be supposed, no such RSA has been found to be generally the best until now. We propose an alternative in the search for the globally best-performing RSA: based on statistics provided by a number of tests of different RSAs we will try to find the statistically best RSA. The work presented in this chapter is part of an ongoing long-term undertaking called the BES (Best Efficiency Scheme) project, announced on the globillum mailing list [58] in October 1999 [79]. Owing to the long-term nature of the project we report in this thesis only the first results of experiments based on 30 SPD scenes of different complexity.

This chapter is further structured as follows: Section 3.2 gives more details on the goals of the BES project. Section 3.3 recalls the known scene complexity measures. In Section 3.4 we describe the design of the testing procedures used. Section 3.5 presents the results from experiments on 30 SPD scenes of different complexities and a possible interpretation of the obtained results with regard to scene complexity measures. Finally, Section 3.6 draws some conclusions and proposes an outline for future work on BES.

3.2 Project Goals

The basic idea of the BES project is to collect a reasonable set of test scenes of varying complexity, and to use these scenes to measure the hardware/implementation/compiler independent characteristics of various RSAs. The collected scenes will not exhibit the self-similar behavior exposed by the SPD scenes that are generated algorithmically. The scenes and measured results will be made available to the computer graphics community in a suitable form. The results of the project for the collected scenes will enable us to evaluate the properties of SPD scenes for testing. The BES project will help in clarifying the following points:

BES Existence. The question whether a best efficiency RSA does or does not exist will be answered using a statistically relevant set of scenes. We suppose that the answer will be negative for currently known RSAs, but this still has to be verified.

Alternative BES Formulation. A proposal for an alternative definition of a best efficiency scheme, based on hardware/implementation independent statistics for a relevant number of scenes, will be given.

BES Testing Procedures. Simple testing procedures with moderate time requirements will be defined. These procedures can be used by other researchers to present the properties of a new RSA. These procedures will not be directly global illumination algorithms requiring other computation than ray shooting; they will just induce different sets of ray shooting queries.

BES Comparison. A concise summary and comparison of currently used RSAs will be provided. In order to minimise discrepancies, all tested RSAs are implemented within a uniform framework (GOLEM rendering system [75]).

BES Repository. A collection of freely available test scenes of varying complexity will be made available for the scientific community. This can make future research in global illumination and the visibility field easier and more verifiable, as the lack of commonly used scenes makes it impossible to verify the results when reimplementing previously published methods for reference purposes.

BES Prediction. We will check how the proposed definitions of scene complexity can help us to predict which RSA should be optimally selected in advance for a given scene. If such a prediction does not exist, the way to define such predictors will be opened.

Hybrid Methods. We will try to find out whether it pays off to construct hybrid spatial data structures for some spatial regions in a scene. The concept of meta-hierarchies [16] was defined ten years ago, but we are not aware of present-day implementations.

3.3 Scene Complexity

As we would like to be able to predict which RSA is the most appropriate for a given scene, we need a means of characterising the scene with respect to the running times of different RSAs. This can be accomplished by measuring different scene complexity characteristics and examining the correlation between these characteristics and the running times of different RSAs for a set of scenes. Then we can try to formulate an RSA selection algorithm that suggests the use of a particular RSA for a given scene using a certain complexity measure. The RSA selection algorithm is expected to select the RSA that is likely to perform best of all RSAs. There are several ways to estimate scene complexity. We can take into account, for example, the number of objects, scene sparseness, sparseness variance and standard deviation, non-uniformity, and so on. We have made use of several methods of evaluating scene characteristics that were introduced for this purpose [98, 27, 46].

3.3.1 Count Approach

The prevalent way to characterize scene complexity is to take the number of objects N in the scene, often referred to as scene size. Although this is a very simplistic definition of scene complexity (N objects), it raises the question of whether the use of this complexity measure is exaggerated. To the best of our knowledge no answer to this problem has been given.

3.3.2 Voxelisation Approach

A method of scene characterization based on the presence of objects in voxels of a uniform grid has been proposed by Klimaszewski [99, chapter 4]. The resolution of the grid is selected according to the $\sqrt[3]{N}$ rule, using the homogeneous method:

$$resolution_x = resolution_y = resolution_z = \left\lfloor \sqrt[3]{d_{voxel} \cdot N} + 0.5 \right\rfloor \tag{3.1}$$

where N is the number of objects and $d_{voxel}$ is the voxel density. (It is usually assumed that $d_{voxel} = 1.0$.) Then the number of voxels in the uniform grid is $N_V = resolution_x \cdot resolution_y \cdot resolution_z$. The ratio of the number of empty voxels $N_E$ to the number of all voxels gives the coefficient known as sparseness Ψ:

$$Ψ = \frac{N_E}{N_V} \tag{3.2}$$

The mean $\tilde{n}$ gives the average number of objects in a voxel:

$$\tilde{n} = \frac{1}{N_V} \sum_{i=1}^{N_V} n_i \tag{3.3}$$

where $n_i$ is the number of objects in the i-th voxel of a grid comprising $N_V$ voxels. Variance v, the variability of the data around the mean, and standard deviation σ are given as:

$$v = \frac{1}{N_V - 1} \sum_{i=1}^{N_V} (n_i - \tilde{n})^2 \tag{3.4}$$

$$σ = \sqrt{v} \tag{3.5}$$

To measure the unevenness of a distribution, the nonuniformity coefficient λ is used. The larger it is, the larger the disparities of voxel occupancy. λ is defined as:

$$λ = σ / \tilde{n} \tag{3.6}$$

Additionally, the higher order moments reported are known as skewness:

$$s = \frac{1}{N_V} \sum_{i=1}^{N_V} \left( \frac{n_i - \tilde{n}}{σ} \right)^3 \tag{3.7}$$

and kurtosis:

$$k = \left( \frac{1}{N_V} \sum_{i=1}^{N_V} \left( \frac{n_i - \tilde{n}}{σ} \right)^4 \right) - 3 \tag{3.8}$$

The objects should be assigned to the voxels using the intersection of the object surface with the voxel, not the intersection of the object’s axis-aligned bounding box with the voxel.
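The voxel statistics above can be sketched as follows. This is a minimal illustration rather than the implementation used for the experiments: the function names are hypothetical, the per-voxel object counts are assumed to be available already as a flat list, the grid resolution is assumed to be rounded to the nearest integer, and the variance is assumed to use the denominator $N_V - 1$.

```python
import math

def grid_resolution(n_objects, d_voxel=1.0):
    """Grid resolution per axis by the cube-root rule (assumed rounding to nearest)."""
    return int(math.floor((d_voxel * n_objects) ** (1.0 / 3.0) + 0.5))

def voxel_statistics(counts):
    """Occupancy statistics for a list of per-voxel object counts n_i.

    Assumes at least two voxels and at least one non-empty voxel, so that
    the mean and the standard deviation are both non-zero.
    """
    n_v = len(counts)
    sparseness = sum(1 for c in counts if c == 0) / n_v         # Eq. 3.2
    mean = sum(counts) / n_v                                    # Eq. 3.3
    variance = sum((c - mean) ** 2 for c in counts) / (n_v - 1) # Eq. 3.4 (assumed N_V - 1)
    sigma = math.sqrt(variance)                                 # Eq. 3.5
    nonuniformity = sigma / mean                                # Eq. 3.6
    skewness = sum(((c - mean) / sigma) ** 3 for c in counts) / n_v        # Eq. 3.7
    kurtosis = sum(((c - mean) / sigma) ** 4 for c in counts) / n_v - 3.0  # Eq. 3.8
    return {
        "sparseness": sparseness,
        "mean": mean,
        "variance": variance,
        "sigma": sigma,
        "nonuniformity": nonuniformity,
        "skewness": skewness,
        "kurtosis": kurtosis,
    }
```

For a scene with 1000 objects and $d_{voxel} = 1.0$, `grid_resolution(1000)` gives a 10 × 10 × 10 grid, whose 1000 per-voxel counts would then be passed to `voxel_statistics`.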

3.3.3 Integral Geometry Approach

Cazals and Sbert [27] investigated several integral geometry tools that characterize average case scene properties. Their strategy consisted in probing the scene with random entities (lines and planes), paying special attention to those statistics that may reveal the spatial distribution of scene objects. A global line in this context is a ray with its origin outside the scene. When shooting a global line, the goal is to find not only the closest intersected object, but all objects intersected along the ray path. We selected all the random line-based tests for our experiments, which allowed us to determine the following characteristics: the average number of intersection points for a global line crossing the scene $n_{int}^G$, the probability of not intersecting any object in the scene $p_0$, and the relative average length of a line span that lies in free space $s_{len}$.

3.3.3.1 Average Number of Intersection Points

When casting a global line through the whole scene, the average number of intersections with scene objects $n_{int}^G$ may be determined a priori as:

$$n_{int}^G = \frac{2\, SA_{total}}{SA(AABB(\mathcal{S}))}, \qquad SA_{total} = \sum_{i=1}^{N} SA(O_i) \tag{3.9}$$

where $SA(AABB(\mathcal{S}))$ is the surface area of a tight axis-aligned bounding box of the scene and $SA(O_i)$ is the surface area of the i-th object in the scene. A posteriori, we can compute this characteristic by casting n global lines into the scene:

$$\tilde{n}_{int}^G = \frac{1}{n} \sum_{i=1}^{n} int_i \tag{3.10}$$

where $int_i$ is the number of intersections with objects for the i-th global line. The standard deviation $σ_{\tilde{n}_{int}^G}$ of this characteristic is also reported in [27] for casting the set of global lines.

3.3.3.2 Probability of Zero Intersections

The probability of a ray not intersecting any object in the scene, denoted $p_0$, is quite hard to determine analytically. As we have to cast global lines anyway in order to compute other complexity characteristics, we compute the probability as:

$$p_0 = \frac{n_0}{n_{total}} \tag{3.11}$$

where $n_{total}$ is the total number of global lines cast and $n_0$ is the number of global lines that did not intersect any object in the scene.

3.3.3.3 Free Path Statistics

While tracing a global line through a scene, every intersection adds an additional “span” to the traced line. The average length of the spans may give us an insight into the spatial density of the scene under investigation. For every global line cast we sum up the span lengths $l_i$ and also identify the maximum span length $l_{max}$. For the total of $n_{spans}$ spans the free path statistic is then given as:

$$s_{len} = \frac{1}{n_{spans} \cdot l_{max}} \sum_{i=1}^{n_{spans}} l_i \tag{3.12}$$

Similarly to $\tilde{n}_{int}^G$, the standard deviation $σ_{s_{len}}$ of the free path statistic obtained from the experiments is also reported in [27].
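The three line-based characteristics can be sketched in Python as follows. The code assumes that the global lines have already been cast by some ray shooting routine and that only their results are passed in; the function names and the input layout (one intersection count per line, one flat list of free-space span lengths) are hypothetical.

```python
def expected_intersections(object_areas, bbox_area):
    """A priori estimate of Eq. 3.9: 2 * SA_total / SA(AABB(scene))."""
    return 2.0 * sum(object_areas) / bbox_area

def global_line_statistics(intersections_per_line, free_spans):
    """A posteriori estimates from the results of already cast global lines.

    intersections_per_line: number of object intersections for each line
    free_spans: lengths of all free-space spans collected over all lines
    """
    n_total = len(intersections_per_line)
    # Eq. 3.10: average number of intersections per global line.
    n_int = sum(intersections_per_line) / n_total
    # Eq. 3.11: fraction of lines that hit nothing.
    p0 = sum(1 for c in intersections_per_line if c == 0) / n_total
    # Eq. 3.12: average span length relative to the longest span.
    l_max = max(free_spans)
    s_len = sum(free_spans) / (len(free_spans) * l_max)
    return n_int, p0, s_len
```

Comparing the a posteriori value of the average number of intersections with the a priori estimate gives a simple sanity check of the global line generator.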

3.3.4 Information Theory Approach

Feixas et al. [46] describe scene complexity as a task of determining the mutual information transfer. In their paper they present a number of complexity measures from information theory quantifying how difficult it is to compute visibility in the scene accurately. Although the paper works with a scene discretised into patches, it also contains a definition of scene continuous mutual information, which is mutual information independent of any discretisation of the scene. We have used the continuous mutual information $I_S^C$ to characterize our scenes. The scene continuous mutual visibility information can easily be determined using Monte Carlo integration with global lines. The whole, quite involved formula boils down to:

$$I_S^C \approx \frac{1}{n_{PP}} \sum_{i=1}^{n_{PP}} \log \left( \frac{SA_{total}\, \cos θ_U\, \cos θ_V}{π\, d(U,V)^2} \right) \tag{3.13}$$

where $n_{PP}$ stands for the total number of point pairs observed, $SA_{total}$ is the total scene surface area, $(U, V)$ is the point pair under investigation, $d(U, V)$ is the distance between those two points, and $θ_U$, $θ_V$ are the angles between the direction vector and the corresponding normal vectors to the objects’ surfaces at U and V. As the objects in the SPD scenes used for tests here do not overlap, for the purposes of testing in this chapter we have $SA_{total} = \sum_{i=1}^{N} SA(O_i)$, where $SA(O_i)$ is the surface area of the i-th object in the scene. In the case of more complex geometry (CSG objects), the stochastic area estimation method proposed by Wilkie et al. [161] can be used.
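The Monte Carlo estimate can be sketched as follows. The code assumes that the point pairs have already been collected by casting global lines, with each pair given as the two cosines and the distance; the function name, the input layout, and the choice of base-2 logarithm (bits) are assumptions, since the formula above does not fix the base.

```python
import math

def continuous_mutual_information(point_pairs, sa_total, log_base=2.0):
    """Monte Carlo estimate of scene continuous mutual information (Eq. 3.13).

    point_pairs: iterable of (cos_theta_u, cos_theta_v, distance) tuples, one
                 per mutually visible point pair found on a global line
    sa_total:    total surface area of the scene
    log_base:    base of the logarithm (assumed 2, i.e. the result is in bits)
    """
    total = 0.0
    count = 0
    for cos_u, cos_v, dist in point_pairs:
        # Argument of the logarithm: SA_total * cos(theta_U) * cos(theta_V) / (pi * d^2).
        arg = sa_total * cos_u * cos_v / (math.pi * dist * dist)
        total += math.log(arg, log_base)
        count += 1
    return total / count
```

Pairs with a zero cosine (grazing directions) would make the logarithm undefined, so a practical implementation has to decide how to treat them; this sketch leaves that decision to the caller.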

3.4 Testing Procedures

The design goal of the RSA testing procedures is to emulate the use of ray shooting in rendering algorithms. For the design we prefer to take only surface geometry into account and do not perform any lighting or material calculations. This way most of the computation time in the tests is really devoted to ray shooting, which is our main concern. We propose four different testing procedures. The first three are general methods simulating the use of ray shooting in global illumination rendering algorithms. The last testing procedure is ray tracing an image as defined in SPD, as an example of a simple and real rendering task. We do not use ray tracing just to get a visually appealing image; it also allows us to find errors when implementing a new RSA, and it allows a subjective evaluation of scene properties.

When a test scene has quite a varying object distribution, performing ray shooting tests for a single ray origin located somewhere in the scene may reveal only local properties of the tested RSA. In order to test RSA behavior over the whole scene, the use of uniformly distributed global rays is more appropriate. In order to obtain equal ray distributions for all tested RSAs, the same initial seeds for random number generators have to be used in each testing procedure. The usual way of generating global lines is to generate two uniformly distributed points on the scene bounding sphere and to shoot the ray between these two points. This method would however be unfair to those RSAs that use a ray space subdivision method and assume that successive ray queries are somehow similar [17, 132], see Section 1.6.4. Algorithm 5 shows an alternative uniform global ray generation scheme that also preserves the coherence of subsequent rays. The basic idea of the algorithm is to split a sphere into B bands of the same width (and thus the same surface area) and to assign the same number of points to each band. This ensures that the same surface area is also assigned to each point. Casting the global lines among all pairs of points then has uniform line density.

3.4.1

Definition of Testing Procedures

The first of four testing procedures, denoted TPA , test the properties of RSA for rays distributed in the whole scene. The second testing procedure TPB test the properties of RSA in the space, where the most objects are present. The third testing procedure TPC simulates a random walk in the space were most of the scene objects are present. The last testing procedure TPD is the already mentioned ray-tracing according to SPD. The testing procedures are designed as follows: Testing procedure TPA : Shoots only primary rays that are generated using Algorithm 5 on a sphere circumscribed to the #%$ of the scene. The #%$ of the scene is computed as a union of the #%$ s of all objects in the scene. Testing procedure TPB : Assigns a tight #,$- Oi to each object Oi in the scene. For each #,$- Oi computes its center point Qi and computes the center of the sphere as Q 1 / N r ∑Niu 1 Qi . Using a binary search finds the minimum radius of a sphere with center Q containing 90% of all Qi . Shoots only primary rays generated on this sphere using Algorithm 5. Testing procedure TPC : The same as TPB , but the rays are randomly reflected using uniform distribution over the hemisphere given by the surface normal at the hit point. The bounces continue until

CHAPTER 3. BEST EFFICIENCY RAY SHOOTING ALGORITHM

40

Algorithm 5: Systematically casting $n(n-1)$ rays uniformly distributed on a sphere.

    /* Compute n points uniformly distributed on the unit sphere. */
    B ← number of bands
    /* Subdivide by (B − 1) planes z = const: z_i = 1 − 2(i + 1)/B, i ∈ ⟨0, B − 2⟩. */
    /* Create B sphere bands having the same surface area on the sphere. */
    b ← 0                  /* current band index */
    SA_one ← 4π/n          /* required surface area of one region */
    α ← 0                  /* the end angle of the current region */
    for all i ∈ n points do
        SA_curr ← 0        /* Current region has zero area. */
        /* Always generate a point at the end of a new region. */
        Integrate bands by increasing α or/and b until SA_curr = SA_one.
        P_i.z = 0.5 · (1 − (1 − 2i)/n)
        R = sqrt(1.0 − (P_i.z)^2), P_i.x = R cos(α_i), P_i.y = R sin(α_i)
    end for
    Transform n points to world space given the sphere center C and radius R.
    for all i ∈ n points do
        for all j ∈ n points do
            if i ≠ j then
                /* For TPA, TPB, and TPC. */
                Shoot (primary) ray between points i and j on the sphere in the world space.
                /* Only for TPC: spawn higher order rays. */
            end if
        end for
    end for

the maximum depth of recursion $d_{rec} = 4$ is reached (primary rays have $d_{rec} = 0$) or until the ray leaves the scene.

Testing procedure TPD: Recursive ray tracing exactly as defined in SPD. This task requires a camera to be set and surface materials to be defined. The depth of recursion is the same as for TPC, and the number of primary rays cast is 513 × 513. All other details can be found in the Readme.txt file in the SPD distribution [69].

Obviously, it is always possible to construct an artificial scene for which our testing procedures will not fulfill the proposed goal. However, scenes that are used for practical purposes will not pose difficulties to our tests.
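The sphere used by TPB and TPC can be sketched as follows. This is a minimal illustration under stated assumptions, not the GOLEM implementation: boxes are given as pairs of corner tuples, the function name `tpb_sphere` and the tolerance `tol` are hypothetical, and the 90% fraction is taken from the description above.

```python
import math

def tpb_sphere(boxes, fraction=0.9, tol=1e-9):
    """boxes: list of ((xmin, ymin, zmin), (xmax, ymax, zmax)).
    Returns (center, radius) of a sphere around the mean box center that
    contains at least `fraction` of all box centers."""
    centers = [tuple((lo[a] + hi[a]) / 2.0 for a in range(3)) for lo, hi in boxes]
    n = len(centers)
    q = tuple(sum(c[a] for c in centers) / n for a in range(3))
    dists = [math.dist(q, c) for c in centers]
    needed = math.ceil(fraction * n - 1e-12)  # guard against float round-up

    # Binary search on the radius: hi_r always satisfies the requirement.
    lo_r, hi_r = 0.0, max(dists)
    while hi_r - lo_r > tol:
        mid = 0.5 * (lo_r + hi_r)
        if sum(d <= mid for d in dists) >= needed:
            hi_r = mid
        else:
            lo_r = mid
    return q, hi_r
```

Sorting the distances and picking the appropriate element would give the same radius directly; the binary search is kept here only because the text describes it that way.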

3.4.2 Invariants

Given a testing procedure TP and a scene, we can determine a set of parameters that remain constant regardless of the RSA used. These invariants can be used to verify the results of an RSA implementation. However, we should be aware that this verification does not imply a globally correct RSA implementation; it merely shows that the results are correct for the particular scene. The invariants are:

$N_{prim}$: number of primary rays hitting the scene axis-aligned bounding box (applies for TPA–TPD),
$N_{prim}^{hit}$: number of primary rays hitting any object (applies for TPA–TPD),
$N_{sec}$: number of secondary rays (reflected rays for TPC, reflected and refracted rays for TPD),
$N_{sec}^{hit}$: number of secondary rays hitting any object (reflected rays for TPC, reflected and refracted rays for TPD),
$N_{shad}$: number of shadow rays (applies for TPD only), and
$N_{shad}^{hit}$: number of shadow rays hitting opaque objects (TPD only).

In practice we observed that the invariants are equal or that they differ only very slightly due to numerical precision problems. The relative error for shooting $10^6$ rays using single-precision floating point


arithmetic was always below $\varepsilon = 10^{-4}$ in our experiments, which is acceptable under the assumption that no differences between the images obtained as a result of TPD for different RSAs are visible. Given the finite precision arithmetic, we should consider that tuning an RSA implementation to get exactly the same results may even be impossible, since the RSA need not be robust to numerical imprecision.
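Such a verification can be sketched as a comparison of the ray counters of two RSA runs on the same scene and testing procedure. The function name, the dictionary keys, and the sample counts below are hypothetical; only the tolerance of $10^{-4}$ is taken from the text.

```python
def invariant_mismatches(reference, candidate, rel_tol=1e-4):
    """Return the invariants whose relative difference exceeds rel_tol.
    Both arguments map invariant names (e.g. 'N_prim_hit') to ray counts."""
    mismatches = {}
    for name, ref_value in reference.items():
        value = candidate.get(name)
        if value is None:
            mismatches[name] = (ref_value, None)  # counter missing entirely
            continue
        denom = max(abs(ref_value), 1)            # avoid division by zero
        if abs(value - ref_value) / denom > rel_tol:
            mismatches[name] = (ref_value, value)
    return mismatches

# Hypothetical counts for two RSAs on the same scene and testing procedure.
run_a = {"N_prim": 1000000, "N_prim_hit": 640000}
run_b = {"N_prim": 1000000, "N_prim_hit": 640010}
```

An empty result only says that the two implementations agree on this scene; as noted above, it does not establish that either of them is correct in general.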

3.5 Results and Discussion

In this section we describe the scenes used for the experiments, their scene complexity measures, and the results for all testing procedures.

3.5.1 Test Scenes

The process of collecting and preparing test scenes started in October 1999 and is still in progress in November 2000. We decided to group the collected scenes according to the number of objects, creating 7 groups: $G_X$, $X = 0, \ldots, 5$, each $G_X$ containing 15 scenes with $10^X + 1$ to $10^{X+1}$ objects, and $G_6$, 10 scenes with more than $10^6$ objects, which is 100 scenes in total. There are many WWW sites offering 3D models usable for our purposes, but we have encountered two problems. First, models are usually available in proprietary formats, and conversion into open formats (VRML'97 [2] in our case) does not usually work very well. This results in scenes having corrupted faces, invalid normals, missing textures, and so on. Second, one can never estimate the scene size before actually downloading the model. We observed that most of the 215 scenes downloaded until now typically contain $5 \cdot 10^3$ to $5 \cdot 10^4$ objects. We did not find any suitable scenes having fewer than 100 or more than $5 \cdot 10^5$ objects on the WWW. While small scenes can be modeled, composing meaningful large scenes is quite a demanding task. As a result, the groups supposed to contain scenes with higher numbers of objects are still incomplete. Therefore, the project cannot progress at present.

In the experiments presented here we have used three groups of SPD scenes with different object counts. Since SPD scenes are scalable, we decided to generate individual scenes with object counts as close as possible to the maximum counts required for each scene size group: G3SPD ($10^3$ objects), G4SPD ($10^4$ objects)$^1$, and G5SPD ($10^5$ objects). The size of a generated SPD scene depends on an optional size factor SF, where the ratio of scene sizes for SF and SF + 1 varies from 1.2 for “teapot” to 8.0 for “jacks”. Table 3.1 lists the scene sizes of the SPD scenes for size factors SF = 1, ..., 6. The bold typeset numbers of objects denote scenes selected into the groups G3SPD, G4SPD, and G5SPD. An SPD scene is referred to further in the thesis as “sceneX”, where X is the value of the size factor. We decided not to use the scene entitled “shells” for our experiments, as this is the only SPD scene containing densely overlapped objects, which should not be the case for a correctly modeled scene. Naturally, such a scene causes problems for any RSA that we tested.

As soon as the RSA characteristics measured on the downloaded scenes are available, they can be compared to the data measured on these SPD groups. This way we can verify the suitability of SPD scenes for testing RSAs. Table 3.2 shows selected scene complexity measures for the 30 SPD scenes. Comparing the computed complexities with the testing results presented below, it unfortunately seems that there is no direct correlation between the existing complexity measures and RSA performance.

3.5.2 Results

We implemented the following RSAs, which are all based on static data structures:

$^1$ When SPD scenes are used for testing an RSA, G4SPD corresponds to the standard set of scenes that is used for testing.


SF   balls   gears   jacks  lattice  mount  rings  sombrero  teapot  tetra  tree
1       11     147       9       20     12     61      1922      57      4     7
2       92    1169      81       81     36    301      7938     244     16    15
3      821    3943     657      208    132    841     32258     561     64    31
4     7382    9345    5265      425    516   1801    130050    1008    256    63
5    66341   18251   42129      756   2052   3301    522242    1585   1024   127
6   597876   31537  337041     1255   8196   5461         –    2292   4096   255

Table 3.1: Number of objects of SPD scenes according to the size factor SF.

SceneX     Group       N   SAtotal     Ψ     ñ       σ      λ      s        k  nG_int  enG_int  σenG_int     p0   slen  σslen     ISC
balls3     G3SPD     821    588.57  0.49  1.72   12.23   7.12  10.01    99.91    0.92     0.92      0.43  0.995  0.040   1.16  10.194
balls4     G4SPD    7382    591.71  0.97  1.28   19.20  15.00  17.68   344.85    0.92     0.92      0.47  0.995  0.036   1.12  11.167
balls5     G5SPD   66431    594.85  0.99  1.01   23.20  22.91  26.68   780.77    0.93     0.93      0.51  0.994  0.033   1.09  11.518
gears2     G3SPD    1169     32.61  0.81  1.58    5.01   3.18   3.89    15.47    1.36     1.36      1.65  0.787  0.110   0.40   4.833
gears4     G4SPD    9345     59.61  0.78  2.27    6.08   2.68   2.88     7.49    2.48     2.48      3.65  0.723  0.060   0.23   9.003
gears9     G5SPD  106435    126.16  0.76  2.82    6.70   2.37   2.40     4.62    5.25     5.25      8.46  0.691  0.025   0.14  12.728
jacks3     G3SPD     657     26.72  0.47  2.84    3.37   1.19   0.80    -0.76    1.32     0.95      1.72  0.749  0.092   0.31   4.512
jacks4     G4SPD    5265     57.26  0.58  2.63    3.61   1.37   0.96    -0.61    2.50     1.78      2.89  0.661  0.078   0.27   6.843
jacks5     G5SPD   42129    118.33  0.63  2.58    3.76   1.45   1.01    -0.58    4.87     3.34      5.05  0.604  0.064   0.19   9.437
lattice6   G3SPD    1225     14.72  0.00  2.33    1.35   0.58   0.96    -0.20    4.17     3.13      2.63  0.299  0.086   0.18   5.471
lattice12  G4SPD    8281     24.47  0.00  3.69    1.68   0.45   0.03    -1.12    7.49     5.48      4.07  0.172  0.069   0.13   7.581
lattice29  G5SPD  105300     52.73  0.02  3.59    1.68   0.47   0.10    -1.01   16.92    11.42      7.65  0.087  0.046   0.07  10.819
mount4     G3SPD     516      9.25  0.65  3.06    5.29   1.73   1.68     1.73    0.68     0.68      1.09  0.865  0.144   0.38   5.636
mount6     G4SPD    8196     10.16  0.84  2.36    6.89   2.91   3.24    10.34    0.74     0.74      1.23  0.849  0.119   0.36   6.193
mount8     G5SPD  131076     10.52  0.93  1.88    9.38   4.99   5.76    35.20    0.79     0.79      1.51  0.847  0.095   0.32   7.649
rings3     G3SPD     841    362.58  0.69  2.40    5.41   2.25   2.44     5.23    1.16     0.82      1.38  0.867  0.119   1.70   3.763
rings7     G4SPD    8401   2817.90  0.72  2.69    5.46   2.03   1.94     2.45    2.46     1.53      2.62  0.748  0.071   2.12   6.718
rings17    G5SPD  107101  32717.00  0.74  2.43    4.93   2.03   2.00     2.99    5.95     3.47      5.57  0.642  0.043   2.56  10.412
sombrero1  G3SPD    1922     72.86  0.70  2.78    4.60   1.65   1.31     0.35    0.81     0.81      0.78  0.966  0.104   0.43   5.245
sombrero2  G4SPD    7938     73.51  0.80  2.44    5.32   2.19   1.93     2.07    0.82     0.82      0.80  0.963  0.114   0.42   4.858
sombrero4  G5SPD  130050     73.70  0.92  1.91    6.96   3.64   3.60    11.40    0.82     0.82      0.79  0.962  0.117   0.42   4.622
teapot4    G3SPD    1008    115.56  0.64  3.05    7.85   2.58   4.85    36.36    1.01     1.01      1.11  0.795  0.232   1.19   7.101
teapot12   G4SPD    9264    116.73  0.85  2.20    9.74   4.43   9.21   125.22    1.02     1.02      1.13  0.791  0.255   1.20   7.408
teapot40   G5SPD  103680    116.87  0.92  1.78   13.19   7.42  19.40   625.33    1.02     1.02      1.13  0.791  0.255   1.20   7.382
tetra5     G3SPD    1024     13.86  0.63  3.27    5.11   1.56   1.44     1.03    1.15     1.15      1.75  0.610  0.059   0.25   5.006
tetra6     G3SPD    4096     13.86  0.76  2.74    5.70   2.08   2.15     3.47    1.15     1.15      1.89  0.638  0.043   0.24   6.205
tetra8     G5SPD   65536     13.86  0.90  1.99    7.02   3.52   4.12    18.44    1.15     1.15      2.14  0.681  0.030   0.22   8.747
tree8      G3SPD    1023  10006.00  0.50  1.53   23.04  15.02  22.54   509.36    0.94     0.94      0.23  1.000  0.087   7.05   7.220
tree11     G4SPD    8191  10007.00  0.67  1.37   42.97  31.41  45.86  2188.41    0.94     0.94      0.24  1.000  0.091   8.04   7.548
tree15     G5SPD  131071  10008.00  0.80  1.60  100.10  62.59  87.38  8379.39    0.94     0.94      0.24  1.000  0.086   7.74   7.587

Table 3.2: Scene complexity according to Section 3.3 for G3SPD, G4SPD, and G5SPD scenes.

BSP: The BSP tree [94] using an efficient recursive ray traversal algorithm [82]. The splitting plane always creates equally-sized children. The maximum allowed depth was 16, and at most 2 objects were allowed in a node (see Subsubsection 1.6.3.1),

KD: The kd-tree, similar to the BSP tree, but the splitting plane is positioned according to the surface area heuristic [105]. The construction of the kd-tree is discussed in the following chapter. The same termination criteria as for the BSP tree were used,

UG: The uniform grid [53], with resolution according to [85] (Woo’s method) with voxel density $d_{voxel} = 3.0$ (see Subsubsection 1.6.3.3),

BVH: The bounding volume hierarchy built with cost function [63] (see Subsection 1.6.2),

AG: The adaptive grid, BVH over uniform grids [100] (see Subsubsection 1.6.3.4),

RG: The recursive grid, a grid recursively put in the parent grid voxels again [93] (see Subsubsection 1.6.3.4),

HUG: The hierarchy of uniform grids [26] (see Subsubsection 1.6.3.4),

O84: The octree with a sequential ray traversal algorithm built using midpoint subdivision [59] (see Subsubsection 1.6.3.2),

O84A: The octree-R using surface area heuristic [159] with the sequential ray traversal algorithm [59] (see Subsubsection 1.6.3.2),

O89: The octree using neighbor finding for the ray traversal algorithm [126] using midpoint subdivision (see Subsubsection 1.6.3.2),

O93: The octree using the recursive ray traversal algorithm [54] (see Subsubsection 1.6.3.2),

O93A: The octree-R using surface area heuristic [159] with the recursive ray traversal algorithm [54] (see Subsubsection 1.6.3.2).

A detailed description of the parameter settings used to build the underlying data structures of all RSAs is beyond the scope of the thesis, since it presupposes detailed knowledge of all RSAs. We have consistently used the best settings that we found during previous experiments [85, 74], with one exception: if we ran out of memory, we allowed three iterations of modifying the RSA construction parameter settings to require less memory and then tested again. This occurred for RG, AG, and HUG. Failures of this iterative parameter setting are reported in Table 3.4. There are two cases when results are not reported: either the testing procedure did not finish in 10 hours, or the computer memory was still exhausted even after three iterations of setting the parameters for construction. It should be clear that manual tuning of these parameters to construct failure-proof data structures for all 12 given RSAs and 30 scenes is quite impractical.

All the tests presented in this chapter were conducted on PCs running Linux, kernel version 2.2.1220, processor Intel Pentium II, 350 MHz, 128 MB RAM. The test program in the GOLEM rendering system was compiled using egcs-1.1.2 with the “-O2” optimization switch. The total number of experiments was 1440 (12 RSAs by 30 scenes by 4 testing procedures). With 10 reportable parameters for every experiment (see Section 2.5) we measured 11520 hardware/implementation independent and 2880 hardware/implementation dependent RSA parameters. We did not determine the values $\Theta_A$, $\Theta_{IT}$, and $\Theta_{TS}$ because of the time required to obtain them, since we are satisfied with the ranking of the tested RSAs. Due to space limitations it is not possible to report all the results from the experiments here.
We have therefore selected the main characteristics from all the experiments and present a short summary here; the results for TPD are given in Appendix E, lines 48–58. All measured statistics are available on the WWW site of the BES project [79]. Table 3.3 reports for each tested scene the time $T_B$ needed to build the underlying data structure for the fastest RSA and the minimum time $T_R$ over all RSAs needed to run all testing procedures. Table 3.4 reports the average running time $\tilde{T}_R$ for a given RSA and testing procedure over all the scenes, together with summary times for columns and rows. The parameter m is the number of tasks where the experiment failed due to memory limits, and f denotes the number of cases when tests were not finished within the time limit. The RSAs are sorted by $\tilde{T}_R$ for the total sum including all testing procedures. We can see that the winner in the tests on SPD scenes is the RSA based on the kd-tree, while the RSA based on the BVH has the worst average running time, being in some tests even more than two orders of magnitude slower than the former. The total running time of the whole experiment was about 400 hours on a single processor. Tests TPA–TPC used $10^6$ primary rays (exactly $1009 \times 1008 = 1017072$ rays), and the number of bands in Algorithm 5 was 8161. The graphs in Fig. 3.1 and Fig. 3.2 show a summary of the hardware-independent parameters. Parameters for each RSA are summed over all tested scenes and testing procedures, and are normalized to the highest value for all RSAs. The graph in Fig. 3.3 shows for each RSA its total running


            TPA                  TPB                  TPC                  TPD
SceneX      RSA  T_B[s] T_R[s]   RSA  T_B[s] T_R[s]   RSA  T_B[s] T_R[s]   RSA  T_B[s] T_R[s]
balls3      KD     0.30   3.72   KD     0.30  13.63   KD     0.30  50.25   KD     0.30   21.4
balls4      KD     1.75   3.51   KD     1.75  17.64   KD     1.75  62.87   KD     1.75  27.06
balls5      KD     16.2   3.82   KD     16.2  31.41   KD     16.2  102.2   KD     16.2   42.0
gears2      KD      0.6   5.01   KD      0.6   10.1   KD      0.6  34.24   KD      0.6  38.96
gears4      KD     2.45   5.71   KD     2.45  12.94   KD     2.45  57.49   KD     2.45  36.24
gears9      KD    22.79   7.91   UG     7.24  18.36   KD    21.89  91.01   KD    22.97  40.59
jacks3      UG     0.04  12.59   UG     0.04  30.51   UG     0.04  74.05   UG     0.04  10.46
jacks4      UG      0.3  16.38   UG      0.3  38.87   UG      0.3  118.8   UG      0.3  19.82
jacks5      KD    12.68  21.76   UG      2.9  46.45   UG      2.9  167.4   UG      2.9  30.69
lattice6    UG      0.1   9.47   UG      0.1  13.36   UG      0.1  48.56   UG      0.1  33.24
lattice12   UG      0.6  12.02   UG      0.6  16.88   UG      0.6  72.96   UG      0.6  39.95
lattice29   UG      8.6  14.53   UG      8.6  19.35   UG      8.6  93.45   UG      8.6  43.65
mount4      KD     0.09   6.83   KD     0.09  11.18   KD     0.09  26.65   KD     0.09  18.88
mount6      KD     1.49   9.11   KD     1.49  15.71   KD     1.49   39.3   KD     1.49  21.06
mount8      KD     25.9  18.45   KD     25.9  37.07   KD     26.9  114.7   KD    25.82  25.14
rings3      KD     0.47   8.14   KD     0.47  30.01   KD     0.47  80.93   KD     0.47  40.39
rings7      KD     2.36  12.61   KD     2.36  38.09   KD     2.36  139.5   KD     2.36  64.76
rings17     KD    23.55  22.23   KD    23.55  61.13   KD    23.55  260.4   KD    23.55  106.6
sombrero1   KD     0.32   6.18   KD      0.3   8.00   KD     0.31  18.21   KD     0.33   3.82
sombrero2   KD     1.37   6.82   KD     1.37   9.16   KD     1.37  20.99   KD     1.37    4.0
sombrero4   KD    26.39  11.83   KD    26.39  16.88   KD    26.39  40.61   KD    26.39    6.9
teapot4     KD     0.38   5.65   KD     0.38  11.56   KD     0.38  35.25   KD     0.38  13.94
teapot12    KD     2.18   6.65   KD     2.18  13.89   KD     2.18  43.12   KD     2.18  15.66
teapot40    KD     22.4   9.54   KD     22.4  23.46   KD     22.4  74.58   KD    22.39  23.85
tetra5      KD      0.1    6.6   KD      0.1   9.33   KD      0.1   19.6   KD      0.1   2.48
tetra6      KD     0.49   7.74   KD     0.49   10.9   KD     0.47  21.96   KD     0.47   2.66
tetra8      KD     10.9  13.69   KD     10.9  19.47   KD     10.9  36.65   KD     10.9   3.57
tree8       RG     0.13   3.72   KD     0.34  20.66   KD     0.34  34.23   KD     0.34  18.39
tree11      AG     1.75   3.82   AG      1.8  29.96   KD     2.04  47.28   AG      1.8  20.61
tree15      RG    358.5   3.72   HUG    10.4  76.37   AG    380.0  68.71   AG    380.0  43.38

Table 3.3: The RSAs with minimum $T_R$ [s] for TPA, TPB, TPC, and TPD, and the hardware/implementation dependent characteristics $T_B$ [s] and $T_R$ [s].

time TR and the build time TB and the sum of both. These characteristics are summed over all testing procedures and scenes. (120 experiments if all the tests were completed successfully.)
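The normalization to the worst RSA used for the summary graphs can be illustrated with the total average running times from the "Total" column of Table 3.4. The graphs themselves normalize the hardware-independent parameters, so this sketch only shows the arithmetic; the names `total_time` and `normalize_to_worst` are hypothetical.

```python
# Total average running times [s] per RSA, copied from Table 3.4.
total_time = {
    "KD": 142.0, "O93A": 201.3, "O84A": 205.2, "RG": 281.8,
    "HUG": 484.7, "AG": 587.3, "UG": 1055.0, "O93": 1781.0,
    "BSP": 2206.0, "O89": 2944.0, "O84": 2987.0, "BVH": 14960.0,
}

def normalize_to_worst(values):
    """Scale every value by the largest one, so the worst RSA maps to 1.0."""
    worst = max(values.values())
    return {name: value / worst for name, value in values.items()}

normalized = normalize_to_worst(total_time)
ranking = sorted(normalized, key=normalized.get)  # best (smallest) first
```

With these numbers the kd-tree total is below one hundredth of the BVH total, consistent with the remark that the BVH is in some tests more than two orders of magnitude slower.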

3.5.3 Discussion

Below, we comment on the characteristics of each RSA tested. We sorted all RSAs according to their time $\tilde{T}_R$ summed over all tests; we start our discussion with the slowest RSA and finish with the fastest one. We use $\tilde{T}_R$ just for ranking according to Section 2.7. Comparing the results presented in Tables 3.3 and 3.4, and Figures 3.1, 3.2, and 3.3 together with all the extra data [79], we can comment on the tested RSAs:

BVH: Has rather poor results for all testing procedures compared to other RSAs. We see the main problem in the nature of the construction of BVH. It does not keep track of spatial coherency – when inserting a new object into the existing hierarchy there is no global spatial information

          TPA                  TPB                  TPC                  TPD                  Total (TPA–TPD)
RSA       T̃R     m/f    W     T̃R     m/f    W     T̃R     m/f    W     T̃R     m/f    W     T̃R     m/f    W
KD         9.98   0/0   22     26.29   0/0   21     76.27   0/0   23     29.42   0/0   22     142.0   0/0   88
O93A      15.01   0/0    0     40.91   0/0    0     106.8   0/0    0     38.59   0/0    0     201.3   0/0    0
O84A      15.41   0/0    0     41.72   0/0    0     109.0   0/0    0     39.08   0/0    0     205.2   0/0    0
RG        35.09   3/0    2     64.44   3/3    0     134.4   0/2    0     47.84   0/0    0     281.8   6/5    2
HUG       39.63   0/0    0     99.52   0/0    1     268.2   0/0    0     77.38   0/0    0     484.7   0/0    1
AG        63.31   3/0    1     108.6   3/0    1     278.9   3/1    1     136.5   0/0    2     587.3   9/1    5
UG        11.74   0/0    5     372.7   0/0    7     525.5   0/0    6     145.4   0/0    6      1055   0/0   24
O93       16.86   0/0    0      1114   0/0    0     257.5   0/0    0     392.8   0/0    0      1781   0/0    0
BSP       28.63   0/0    0      1291   0/0    0     325.9   0/0    0     560.3   0/1    0      2206   0/1    0
O89       14.49   0/0    0      1127   0/0    0      1421   0/0    0     381.7   0/0    0      2944   0/0    0
O84       17.44   0/0    0      1132   0/0    0      1437   0/0    0     400.2   0/0    0      2987   0/0    0
BVH        1903   0/0    0      4569   0/2    0      5111   0/6    0      3376   0/0    0     14960   0/8    0
∑all RSA   2170   6/0   30      9986   6/5   30     10050   3/9   30      5625   0/1   30     27832 15/15  120

Table 3.4: Average running time $\tilde{T}_R$, memory (m) and time limit (f) failures, and number of “wins” W for all tested acceleration algorithms and testing procedures.

Figure 3.1: Parameters NG , NE , NEE , and NER summed for all scenes and each RSA, normalized to the worst RSA. A description of the parameters is in Section 2.2.

about other still uninserted objects.

O84: The octree with a sequential ray traversal algorithm requires many traversal steps from the root node. Subdividing at midpoints does not work particularly well for sparse scenes (“treeX”).

O89: Has a slightly better traversal algorithm that outperforms O84, especially for scenes with higher numbers of objects.


Figure 3.2: Parameters NIT , NT S , NET S , and NEET S summed for all scenes for each RSA and normalized to the worst RSA. A description of the parameters is in Section 2.2.

Figure 3.3: Total times $T_B$ [s], $T_R$ [s], and $T_B + T_R$ [s] for each RSA. The number gives the total build+running time ($T_B + T_R$) for all tests for a particular RSA.

BSP: Although conceptually the same structure as the kd-tree, subdividing at midpoints again results in poor performance for sparse scenes. For densely occupied scenes the BSP tree performs comparably to the kd-tree.

O93: Due to the most efficient ray traversal algorithm, this outperforms O84 and O89 even if constructed using the midpoint subdivision. We can see that for all midpoint subdivision octrees, shooting rays inside the octree (TPB, TPC, and TPD) is very time demanding. This is the price we pay for traversal down to the leaf when the intersected object is inside or very close to


this node.

UG: The classical RSA. The smart algorithm employed for the heterogeneous grid resolution setting results in the best performance of UG for several scenes. These scenes are densely occupied, with a mostly regular structure (“jacksX”, “latticeX”, and “mountX”). In this kind of scene the down traversal phase for hierarchical spatial data structures is expensive, since the ray intersects an object very close to the origin of the ray. For sparsely occupied scenes the UG has rather poor performance, as it lacks a sense of hierarchy.

AG: As a combination of the BVH with UG this has average performance, but predicting the memory needed to construct the underlying data structure is difficult for G5SPD scenes. For one scene the computation exceeded the time limit due to swapping. Tweaking of the construction parameters was necessary to get some G5SPD scenes to fit into the available memory.

HUG: Consists of UGs arbitrarily positioned in other UGs. It has not only a slightly better performance than AG, but also smaller and more predictable memory usage.

RG: A UG recursively inserted in the voxels of a UG shows a non-negligible performance improvement over UG, especially for TPC. Its performance varies, and five tests failed on the time limit. Tuning the construction parameters to keep the test within the memory limit was difficult, as memory consumption was rather unpredictable.

O84A: We can see that although a simple ray traversal algorithm was used, a more appropriate subdivision process improved the total performance by one order of magnitude compared with O84. The improvement is particularly apparent on sparse scenes in G5SPD. Unfortunately, the build time $T_B$ rises rapidly for G5SPD as well.

O93A: The same improvement as between O84 and O93 can be observed due to the more efficient ray traversal algorithm. Again, the build time $T_B$ rises rapidly with the number of objects in the scene.

KD: Although the kd-tree is in principle the BSP tree, the positioning of the splitting plane using the surface area heuristic and the fast ray traversal algorithm make this hierarchical spatial data structure the winner, even if the improvement over O93A is not significant. The kd-tree is beaten in several cases: for regular artificial scenes (“jacksX”, “latticeX”, and “mountX”) by UG, and for the sparse G5SPD scene (“treeX”) by AG and HUG. However, the differences in performance in all these cases are small. The only disadvantage is the build time $T_B$ for G5SPD scenes, which is comparable to the build times of O93A and O84A and can be rather high in comparison with the build time of UG. Therefore, if the number of ray shooting queries is low, using the kd-tree probably does not pay off.

In general, we observe that using the surface area heuristic [105] pays off for both the octree and the BSP tree, yielding the Octree-R and the kd-tree. Also, RSAs based on hierarchical data structures win over the RSAs based on non-hierarchical ones in most cases, especially for sparse scenes. Whether these results and the ranking of algorithms according to $\sum_{TP_{A,B,C,D}} T_R$ will be maintained for the planned collection of 100 downloaded scenes is not currently known.

3.5.4 Preliminary RSA Selection Algorithm

Examining the results of Tables 3.2 and 3.3, we would like to present a preliminary proposal of an algorithm for selecting an RSA for a given scene. First, construct a uniform grid over the bounding boxes of the objects using the heterogeneous resolution setting with voxel density $d_{voxel} = 1.0$. Then compute the sparseness parameters δ, s, and k. When δ, s, and k are low (see Table 3.2), then use an RSA based on


UG. If these three parameters are in the middle range, use an RSA based on the kd-tree. If these parameters are very high, then consider either an RSA based on the kd-tree or RG/AG for even better performance, but be aware of the possibly high or even unrealizable memory requirements of RG/AG. If none of the conditions above is applicable, then use the RSA based on the kd-tree. Here, we should stress that this preliminary RSA selection algorithm is derived from the results of experiments for 30 SPD scenes with a fractal nature. Its validity has to be verified on a larger set of scenes. Knowledge of whether most rays will be shot inside the scene will also be helpful for selecting the RSA.
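The selection rule above can be written down as a small decision function. The text gives no numeric thresholds for "low" and "very high", so the `low` and `high` dictionaries below are purely hypothetical inputs that a user would have to calibrate against Table 3.2; the function name and the `memory_is_tight` flag are assumptions as well.

```python
def select_rsa(delta, s, k, low, high, memory_is_tight=True):
    """Pick an RSA label from the sparseness parameters delta, s, and k.
    low/high: dicts with hypothetical thresholds for 'delta', 's', 'k'."""
    params = {"delta": delta, "s": s, "k": k}
    if all(params[p] <= low[p] for p in params):
        return "UG"      # all parameters low: densely occupied scene
    if all(params[p] >= high[p] for p in params):
        # very sparse scene: RG/AG may be faster but may not fit in memory
        return "KD" if memory_is_tight else "RG/AG"
    return "KD"          # middle range, or no condition applies

# Hypothetical thresholds, for illustration only.
low = {"delta": 1.0, "s": 1.0, "k": 1.0}
high = {"delta": 10.0, "s": 10.0, "k": 10.0}
```

Falling back to the kd-tree whenever the three parameters disagree mirrors the last rule of the proposal, which names the kd-tree as the default choice.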

3.6 Conclusion and Future Work

In this chapter we outlined the goals of the BES project and its status in late November 2000. We have proposed four testing procedures for testing RSAs, the RSA invariants for particular testing procedures, and an algorithm for systematically shooting uniformly distributed rays. Based on the measured data we have also outlined a heuristic to select a suitable RSA given statistics based on scene sparseness. Even if the heuristic predicts reasonably well for the tested SPD scenes, we do not know whether the algorithm in its present form will be applicable to general scenes, nor what its success ratio would be. It is only clear that for a small number of rays to be shot no construction of an underlying data structure for a particular RSA pays off at all. After testing 12 RSAs over 30 SPD scenes of different complexities and 4 testing procedures, we can conclude that using RSAs based on hierarchical spatial data structures, particularly on the kd-tree, definitely pays off, except for densely occupied scenes. This observation supports our opinion that it is very unlikely that there will be a single optimal RSA for general use.

The BES project has not yet come to an end. In order to provide a sound basis that will help us to avoid further speculations about the design and use of different RSAs, several tasks have to be completed. First, we have to complete our collection of 100 practical scenes and run all the tests again over this set. This will provide us with a vast amount of statistical data that will have to be analyzed together with the results of the experiments already presented here. When the results are ready, we will be able to conclude whether the results for the SPD scenes presented in this chapter correlate with the results obtained for the practical scenes. This will also reveal how well the distribution of objects in the scenes from SPD actually simulates the distribution that occurs in the practical scenes.

Chapter 4

Construction of Kd-Trees

In this chapter we describe several new methods for constructing the kd-tree for RSAs that are based on this spatial data structure. The construction algorithms use a cost model that estimates the average cost of traversing an arbitrary ray through the kd-tree, combining the cost of the traversal step and the cost of a ray-object intersection test. The estimated cost is then used to govern the position and orientation of the splitting plane during kd-tree construction, which proceeds in top-down fashion. The structure of this chapter is as follows. We describe our motivation for selecting an RSA based on the kd-tree for detailed research, previous work, and several contributions of ours in separate sections.
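To make the idea of such a cost estimate concrete, the following sketch evaluates one candidate splitting plane with a commonly used form of the surface area heuristic. It is an assumption-laden illustration, not the cost model developed in this chapter: the constants `c_trav` and `c_isect`, the function names, and the simple formula (hit probability of a child proportional to its surface area) are all choices made here.

```python
def surface_area(lo, hi):
    """Surface area of the axis-aligned box with corners lo and hi."""
    dx, dy, dz = (hi[a] - lo[a] for a in range(3))
    return 2.0 * (dx * dy + dy * dz + dz * dx)

def split_cost(lo, hi, axis, pos, n_left, n_right, c_trav=1.0, c_isect=2.0):
    """Estimated cost of splitting the cell [lo, hi] at `pos` along `axis`.
    c_trav / c_isect: assumed costs of one traversal step and one
    ray-object intersection test (hypothetical values)."""
    left_hi = list(hi)
    left_hi[axis] = pos
    right_lo = list(lo)
    right_lo[axis] = pos
    area = surface_area(lo, hi)
    p_left = surface_area(lo, left_hi) / area    # chance a ray enters left child
    p_right = surface_area(right_lo, hi) / area  # chance a ray enters right child
    return c_trav + c_isect * (p_left * n_left + p_right * n_right)

def leaf_cost(n_objects, c_isect=2.0):
    """Cost of making the cell a leaf: every object is tested."""
    return c_isect * n_objects
```

A top-down builder in this style would evaluate `split_cost` for several candidate positions and axes and compare the best one with `leaf_cost` to decide whether to subdivide.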

4.1 Motivation

Starting with this chapter, the rest of the thesis is devoted primarily to the kd-tree. The first phase of the BES project presented in the previous chapter showed us that the kd-tree is statistically the best among the common heuristic RSAs, at least for the 30 tested SPD scenes. There are also other reasons for selecting the kd-tree as the winning candidate of the first phase of the BES project for detailed research:

The kd-tree does not suffer from excessive memory requirements. We can observe from the first phase of the BES project that the number of elementary and generic cells is roughly linear in the number of objects. Moreover, the limit on maximum memory usage given by the available hardware can be encoded in the termination criteria for kd-tree construction.

The kd-tree has also been studied within the field of computational geometry, with promising results for $\mathbb{E}^2$ and higher dimensions. The research in computational geometry is connected with the term partition, which is strongly related to the kd-tree. A partition is a binary space partitioning tree with a linear splitting entity (a line in $\mathbb{E}^2$, a plane in $\mathbb{E}^3$) in general position such that each leaf of the constructed partition contains at most one object. d’Amore and Franciosa [36] show that it is possible to construct a partition for a set of disjoint isothetic rectangles in $\mathbb{E}^2$. Mark de Berg et al. [39] show that for $\mathbb{E}^2$ it is possible to construct a partition of linear size for a set of objects under certain conditions on the input data. Agarwal et al. [4] discuss practical techniques for constructing the kd-tree for orthogonal rectangles in $\mathbb{E}^3$. These approaches with $O(N \log N)$ preprocessing and linear storage achieve $O(\log N)$ time complexity for point-location queries. However, for general shapes of objects, no algorithm for constructing a partition with linear storage and $O(N \log N)$ preprocessing has been found, even for $\mathbb{E}^2$. These results promote the use of the kd-tree for RSAs, disregarding the worst-case complexity measures.

The kd-tree allows flexible positioning of the splitting planes, which results in various sizes of the elementary cells. The cells adapt well to the geometry of the scenes. According to the results of the first phase of the BES project, this feature is particularly important for sparsely occupied scenes, and we can quantify the significant difference in performance between an RSA based on


the BSP tree and an RSA based on the kd -tree. Although the BSP tree is in principle the same spatial data structure as the kd -tree, fixing the position of the splitting planes in the center of the cell caused the performance of an RSA based on the BSP tree sometimes to be worse by order(s) of magnitude than that of an RSA based on the kd -tree. We observed this for experiments performed within the first phase of the BES project, when the performance of an RSA based on the kd -tree was always better than or equal to an RSA based on the BSP tree.

The kd-tree is in principle scalable to $\mathbb{E}^n$ space for arbitrary n. This holds for the kd-tree construction and ray traversal algorithms described in this thesis.

The kd-tree can be used to model topologically many other spatial subdivisions. This means that the kd-tree can be constructed in such a way that it corresponds to the spatial topology of elementary cells for the uniform grid, the non-uniform grid, both elementary and generic nodes of an octree, the Octree-R, and the BSP tree. On the other hand, the spatial topology of the kd-tree can be mapped to a more general spatial data structure: the bounding volume hierarchy (BVH). Unfortunately, the ray traversal algorithm for the BVH is much less efficient than for the kd-tree, due to the general nature of the BVH, since the cells of the child nodes referenced in one BVH node can overlap. Further, this overlapping of cells within a hierarchical data structure is undesirable for efficiency reasons, see the point below and the results presented in the previous chapter.

The kd-tree contains no overlapped elementary cells, and thus no two descendants referenced in one interior node of the kd-tree overlap. Such overlapping of cells occurs in the hierarchy of uniform grids (HUG) by Cazals [26], adaptive grids (AG) [100], and the BVH [63]. The overlapping of cells, which is the overlapping of spatial regions, always implies that an elementary/generic cell has to be referenced in more than one other generic cell. First, such multiple referencing leads to repetitive testing of an elementary cell $C$ for intersection with a ray. It is usually necessary to solve the ray query for the whole cell as if it were the scene itself. Repetitive computation for $C$ can be reduced by a ray-cache (also called a mailbox [23, 12]), where the result of the intersection between the ray and $C$ is cached. Nonetheless, the time needed to access the ray-cache cannot be completely eliminated. Second, since an elementary cell $C$ can also overlap with another elementary cell $C'$ (or several such cells), both of these must be checked for intersection within the overlapping spatial region, since the object with the closest ray-object intersection must be chosen. Therefore it may occur that a part of the spatial region covered by $C$ and checked for intersection with the ray is not used at all, that is, the computation within part of cell $C$ was useless. Whether the computation is useless for a particular cell cannot be determined until the second cell is tested. It would be theoretically possible to traverse more than one cell in parallel within a ray traversal algorithm, but this would be rather complicated. We are not aware that any such ray traversal algorithm has been published. The results of the first phase of the BES project support our view on cell overlapping: it should be avoided, in order not to decrease the quality of encoding the distance among the objects in spatial data structures.

The kd-tree is a hierarchical spatial data structure, i.e., it enables us to deal with objects of various sizes using the level of detail concept. This is particularly required for scenes with unevenly distributed objects. The use of a hierarchy in data structures for RSAs was considered by some researchers in the past as a disadvantage [53] in comparison with RSAs based on non-hierarchical data structures such as the uniform grid. Unfortunately, RSAs based on non-hierarchical spatial data structures are very inefficient for scenes with unevenly distributed objects. As we have seen in the previous chapter, they can slightly outperform RSAs based on hierarchical data structures for densely occupied scenes with a regular structure, but in general they suffer from performance problems.

There are several efficient ray traversal algorithms for the kd-tree. The efficiency of the ray traversal algorithm for the kd-tree is given by the simple representation of information in the

4.2. PREVIOUS WORK

51

kd -tree node – since the interior node contains the splitting plane and thus two child nodes, we have to decide between four cases: traverse only left child, only right child, the left child first and then the right one, or the right child first and then the left one. Efficient ray traversal algorithms for the kd -tree have been developed in the context of the BSP tree, and are further dealt with in Chapter 5. The rest of this chapter is structured as follows. First, we discuss why an RSA based on the kd -tree is more efficient than an RSA based on the BSP tree with arbitrary oriented splitting planes. The problem of top-down construction of kd -trees is in fact simply formulated in two issues. First, we have to decide if to declare the current node containing the references to the objects as a leaf. When the answer to this question is negative, we have to put the splitting plane, and the second issue is: where to position the splitting plane. We discuss methods for positioning the splitting plane, and we describe in detail the concept of a cost model, and its use. Further, we show how the empty spatial regions in the scene can be utilized in kd -tree construction to increase the performance of an RSA based on the kd -tree. Then, we deal with the termination criteria – whether or not the current node of the kd -tree should be already declared as a leaf. Further, we generalize the cost model and show several possible kd -tree efficiency improvements, and also their limitations. We conclude the chapter by a summary of results from experiments that we have performed.

4.2 Previous Work

In this section we describe previous work on kd-tree construction. First, we recall several basic facts. The recursive top-down construction of kd-trees is similar to that for the BSP tree described in Subsubsection 1.6.3.1, namely Algorithm 1. We are not aware of any published method that uses a bottom-up approach, even though this might be possible using some clustering method and redistributing the objects straddling the splitting plane back down to the children. The top-down construction is clearly more straightforward. The recursion of the top-down approach works so that in each construction step the current set of objects is treated as if it were the whole scene, and the position and orientation of the splitting plane are determined for it.

4.2.1 Orientation of the Splitting Plane in the Kd-Tree

Here, we show why we use for an RSA the kd-tree, which has axis-aligned splitting planes, instead of the BSP tree with arbitrarily oriented splitting planes. The main reason is imposed by the ray traversal algorithms for these two spatial data structures, which require that we compute the signed distance from the origin of a ray to a splitting plane. For example, a plane $\Pi_a$ perpendicular to the $a$-axis ($a \in \{x, y, z\}$) is described by the formula $\Pi_a: a = a_p = \mathrm{const}$. A ray $R$ is described by its origin $O_R$ and the direction vector $\vec{D}_R$. The signed distance $t$ for the intersection of an arbitrary ray with the plane is computed as:

$$ t = \frac{a_p - O_R^a}{D_R^a} \tag{4.1} $$

assuming $D_R^a \neq 0$. Note that the inverse of $D_R^a$ can be precomputed, which is practically important for the speed of computation. For an arbitrarily oriented plane given by the equation $\Pi: a\,x + b\,y + c\,z + d = 0$ the computation of the signed distance along the ray is more computationally demanding:

$$ t = -\frac{a\,O_R^x + b\,O_R^y + c\,O_R^z + d}{a\,D_R^x + b\,D_R^y + c\,D_R^z} \tag{4.2} $$

For the arbitrarily oriented plane, the number of elementary arithmetic operations is about 3 times higher than for the axis-aligned plane. Note that unlike Eq. 4.1 the denominator in Eq. 4.2 cannot be precomputed. The computation of the signed distance forms only a portion of the cost of a traversal step in the ray traversal algorithm, but it is significant from the viewpoint of the total running time of the ray traversal algorithm. To justify the use of a BSP tree with arbitrarily oriented splitting planes for an RSA, the total running time would have to decrease significantly (about 2-3 times) through a reduced number of traversal steps or ray-object intersection tests. To the best of our knowledge no such method has been published so far. Deciding whether the objects belong to one halfspace induced by a generally oriented splitting plane is also much more computationally demanding. Moreover, the number of possible positions of the splitting plane would increase from $O(N)$ to $O(N^3)$. As a result, the build time of the tree would increase considerably, which is rather unacceptable. For these reasons we do not deal with generally oriented splitting planes in the rest of this thesis, and thus we stay with the kd-tree.

There are several ways to select the orientation of the splitting plane provided that it is perpendicular to one of the coordinate axes. In the axis-aligned form of the BSP tree (Section 1.6.3.1) the orientation of the splitting plane is changed in cyclic order ($x, y, z, x, \ldots$) with the increasing depth of the node in the kd-tree. The starting axis for the splitting is not specified, but it is usually the x-axis [94, 148] regardless of the shape of the scene AABB. There are also several other ways to orient the splitting plane, but since the method used is intertwined with the positioning of the splitting plane, we describe them below.
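The difference between Eq. 4.1 and Eq. 4.2 can be illustrated by a minimal Python sketch. The class `Ray` and the function names `t_axis_aligned` and `t_general` are hypothetical names chosen for this illustration only; the sketch assumes a ray with no zero direction component, so that the reciprocal direction can be precomputed once per ray.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Vec3 = Tuple[float, float, float]

@dataclass
class Ray:
    origin: Vec3
    direction: Vec3  # need not be normalized

def t_axis_aligned(ray: Ray, axis: int, plane_pos: float, inv_dir: Vec3) -> float:
    """Eq. 4.1 with the reciprocal direction precomputed: one subtraction, one multiplication."""
    return (plane_pos - ray.origin[axis]) * inv_dir[axis]

def t_general(ray: Ray, a: float, b: float, c: float, d: float) -> Optional[float]:
    """Eq. 4.2 for the plane a*x + b*y + c*z + d = 0; returns None for a parallel ray."""
    ox, oy, oz = ray.origin
    dx, dy, dz = ray.direction
    # The denominator depends on the plane normal, so it cannot be precomputed per ray.
    denom = a * dx + b * dy + c * dz
    if denom == 0.0:
        return None
    return -(a * ox + b * oy + c * oz + d) / denom

ray = Ray(origin=(1.0, 2.0, 3.0), direction=(2.0, 1.0, 0.5))
# Assumption of this sketch: no zero direction component (a real implementation must guard this).
inv_dir = tuple(1.0 / c for c in ray.direction)

t_axis = t_axis_aligned(ray, 0, 5.0, inv_dir)   # plane x = 5
t_gen = t_general(ray, 1.0, 0.0, 0.0, -5.0)     # the same plane in general form
```

Both calls describe the same plane, so they yield the same signed distance; the general form simply needs more arithmetic to get there.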

4.2.2 Positioning of the Splitting Plane

During top-down construction of a kd-tree we have to solve algorithmically the problem of splitting the AABB of the current node into two new child nodes. This also covers the assignment of objects to the child nodes. One of our assumptions is that each object $O_i$ has a finite size and thus also a finite $\mathrm{AABB}(O_i)$. Since the splitting planes in the kd-tree are axis-aligned and we have to decide whether the objects lie on the left side or the right side of the splitting plane, it is advantageous to use the AABBs of the objects for this decision. There are several known methods for positioning and orienting the splitting plane in the kd-tree:

Spatial Median: In BSP tree construction the splitting plane always splits the AABB associated with the current node into two halves. Since the result of the splitting is two AABBs of the same size, these AABBs become smaller with the increasing depth of the node in the kd-tree. This approach balances the space on the two sides of the splitting plane.

Object Median: Another way is to position the splitting plane so that the number of objects lying on its left side and right side is equal. Some objects can be assigned to both children, since their AABBs straddle the splitting plane. This method balances the number of objects on the two sides of the splitting plane. This may seem natural, since it resembles the construction of a balanced binary search tree over a set of numbers in $\mathbb{E}^1$. One can then suppose that the construction of such a balanced tree could be advantageous for an RSA. Unfortunately, this intuition is rather incorrect, and the method has severe performance deficiencies compared with other methods, as we will show by the results of our experiments. We remark that the kd-tree constructed for an RSA is not the binary search tree used for a common range search within a one-dimensional interval. The main difference is that in search structures in $\mathbb{E}^1$ such as range trees [40], the search is finished in $O(\log N)$ time. Within a ray traversal algorithm for the kd-tree, descending to the first leaf does not mean that the search is finished; when a ray does not intersect any object referenced in the first leaf, the ray traversal algorithm continues by finding further leaves along the ray path.

Cost Model: This method further refines, for the kd-tree, the computation model introduced in Section 2.2. It uses a cost model that estimates, during construction, the average cost of traversing an arbitrary ray through the kd-tree. The method outperforms both the spatial median and the object median methods for all scenes that we have tested so far, as implemented within the GOLEM rendering system. Historically speaking, the method was first introduced by MacDonald and Booth at the Graphics Interface conference [104] in June 1989, and was further revised and published in [105]. This cost model was also researched by Subramanian and Fussell [144, 145], who also published an experimental comparison of the theoretical cost and the measured cost of the hierarchy based on its termination criteria [146].

The construction of the kd-tree for the case of the spatial and object median needs no further discussion. This is not the case for the cost model, which we recall and discuss in detail below, following MacDonald and Booth [105].
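The two simpler methods above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the implementation used in the thesis: the object median is taken here as the median of the AABB centers along the chosen axis (one of several possible definitions), and the helper names `spatial_median`, `object_median`, and `classify` are hypothetical.

```python
import statistics

def spatial_median(box_min, box_max, axis):
    """Plane position that halves the node's AABB along the given axis."""
    return 0.5 * (box_min[axis] + box_max[axis])

def object_median(object_boxes, axis):
    """Assumed definition: median of the objects' AABB centers along the axis."""
    centers = [0.5 * (lo[axis] + hi[axis]) for lo, hi in object_boxes]
    return statistics.median(centers)

def classify(object_boxes, axis, pos):
    """Assign object indices to the children; straddling objects go to both."""
    left, right = [], []
    for idx, (lo, hi) in enumerate(object_boxes):
        if lo[axis] < pos:
            left.append(idx)
        if hi[axis] > pos:
            right.append(idx)
        if lo[axis] == pos == hi[axis]:
            left.append(idx)  # flat object lying in the plane: put it left
    return left, right

# Four object AABBs in a node spanning x in [0, 10]; objects are clustered near x = 0.
node_min, node_max = (0.0, 0.0, 0.0), (10.0, 1.0, 1.0)
boxes = [((0.0, 0, 0), (1.0, 1, 1)), ((1.5, 0, 0), (2.0, 1, 1)),
         ((2.5, 0, 0), (3.0, 1, 1)), ((9.0, 0, 0), (10.0, 1, 1))]

sm = spatial_median(node_min, node_max, 0)
om = object_median(boxes, 0)
split_sm = classify(boxes, 0, sm)
split_om = classify(boxes, 0, om)
```

For this clustered input the spatial median leaves three objects on the left and one on the right, while the object median balances the counts two to two.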

4.2.3 Cost Model for Kd-Tree Construction

The cost model is a theoretical model that estimates the cost of a ray passing through a kd-tree under several assumptions. Such an estimate includes both the cost of visiting the interior nodes and leaves of the kd-tree and the cost of computing ray-object intersection tests. In the following text, let $\hat{X}$ denote an estimate of a quantity $X$, that is, we cannot determine the value of $X$ exactly in advance, but we can only estimate it. The development of the cost model is enabled by several simplifying assumptions. One of these assumptions uses geometric probability.

4.2.3.1 Geometric Probability

The development of the cost model is connected with the following observation, known from geometric probability theory. For more mathematical details, see [138] or the survey by Cazals and Sbert [27], which is more oriented to applications in global illumination algorithms. Geometric probability tools for the construction of underlying data structures for RSAs were first used for the BVH by Goldsmith and Salmon [63]. Their approach is also outlined in the survey by Arvo and Kirk [18]. Let $X$ and $Y$ be spatial regions of convex shape such that $X$ contains $Y$, i.e., $X \cap Y = Y$. We want to express the conditional probability $p_{Y|X}$ that an arbitrary ray intersects the spatial region $Y$ assuming it intersects the spatial region $X$. An arbitrary ray is a ray with a uniform distribution of origin and direction, so the line density induced by rays in the spatial region $X$, and hence in $Y$, is constant. Further, it is supposed that the arbitrary ray originates outside the spatial region $X$. The situation is depicted in Fig. 4.1.

Figure 4.1: Computing the conditional probability that an arbitrary ray hits spatial region Y once it passes through spatial region X.

Stone in [140] (cited according to [63]) showed that this probability $p_{Y|X}$ equals the surface area of the convex spatial region $Y$ divided by the surface area of the convex spatial region $X$:

$$ p_{Y|X} = \frac{SA(Y)}{SA(X)} \tag{4.3} $$
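For axis-aligned boxes the ratio in Eq. 4.3 is straightforward to evaluate. The following Python sketch assumes boxes given as pairs of minimum and maximum corners; the function names `surface_area` and `hit_probability` are hypothetical and serve only as an illustration.

```python
def surface_area(box_min, box_max):
    """Surface area of an axis-aligned box given by its two corners."""
    w, h, d = (hi - lo for lo, hi in zip(box_min, box_max))
    return 2.0 * (w * h + w * d + h * d)

def hit_probability(inner, outer):
    """Eq. 4.3: probability that a ray hitting `outer` also hits `inner`.

    Assumes `inner` lies inside `outer` and rays are uniformly distributed.
    """
    return surface_area(*inner) / surface_area(*outer)

outer = ((0.0, 0.0, 0.0), (3.0, 1.0, 2.0))   # surface area 22
inner = ((0.0, 0.0, 0.0), (1.0, 1.0, 2.0))   # surface area 10
p = hit_probability(inner, outer)
```

Note that the probability depends on surface areas, not on volumes: the inner box above has one third of the volume of the outer box, yet it is hit by nearly half of the rays.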


If we restrict $X$ and $Y$ to be AABBs, then we can express $p_{Y|X}$ as follows:

$$ p_{Y|X} = \frac{Y_w Y_h + Y_w Y_d + Y_h Y_d}{X_w X_h + X_w X_d + X_h X_d}, \tag{4.4} $$

where the subscripts $w$, $d$, and $h$ denote the width, depth, and height of the AABB.

4.2.3.2 Basic Cost Model Development

The development of the cost model here follows the paper by MacDonald and Booth [105]. In order to apply formula 4.4 to the estimated quantities, the cost model relies on several rather unrealistic assumptions about ray shooting:

- all rays intersect the AABB associated with the root node of the kd-tree,
- the distribution of rays is uniform,
- no ray intersects any object.

The last assumption is very unrealistic, since it contradicts the purpose of any RSA. Nevertheless, under these assumptions we can estimate the following quantities of a kd-tree.

Number of interior nodes of the kd-tree traversed per ray:

$$ \hat{N}_{TI} = \sum_{i=1}^{N_i} \frac{SA(i)}{SA(root)}, \tag{4.5} $$

number of leaves of the kd-tree traversed per ray:

$$ \hat{N}_{TL} = \sum_{l=1}^{N_l} \frac{SA(l)}{SA(root)}, \tag{4.6} $$

number of ray-object intersection tests per ray:

$$ \hat{N}_{IT} = \sum_{l=1}^{N_l} \frac{SA(l) \cdot N(l)}{SA(root)}, \tag{4.7} $$

where the other quantities are:

$N_i$ - number of interior nodes of the kd-tree,
$N_l$ - number of leaves of the kd-tree ($N_l = N_i + 1$),
$N(l)$ - number of objects stored in leaf $l$ of the kd-tree,
$SA(i)$ - surface area of interior node $i$,
$SA(l)$ - surface area of leaf node $l$,
$SA(root)$ - surface area of the AABB of the whole scene.

The estimated numbers of operations performed during the ray traversal algorithm can be used to estimate the average total cost of ray shooting under the assumptions above, if we know the costs of the specific operations of the ray traversal algorithm. For further development we assume a recursive ray traversal algorithm for the kd-tree as outlined in Subsubsection 1.6.3.1. The cost of these operations is connected with a given implementation and can be obtained experimentally. Then from the general definition of the performance model (Eq. 2.2) we can estimate the total cost $\hat{C}_T$ for shooting an arbitrary ray:

$$ \hat{C}_T = \hat{C}_{TI} \cdot \hat{N}_{TI} + \hat{C}_{TL} \cdot \hat{N}_{TL} + \hat{C}_{IT} \cdot \hat{N}_{IT} \tag{4.8} $$

$$ \hat{C}_T = \frac{1}{SA(root)} \left[ \hat{C}_{TI} \sum_{i=1}^{N_i} SA(i) + \hat{C}_{TL} \sum_{l=1}^{N_l} SA(l) + \hat{C}_{IT} \sum_{l=1}^{N_l} SA(l) \cdot N(l) \right], \tag{4.9} $$

where

$\hat{C}_{TI}$ - the estimated cost of traversing an interior node of the kd-tree,
$\hat{C}_{TL}$ - the estimated cost of traversing a leaf node of the kd-tree,
$\hat{C}_{IT}$ - the estimated cost of a ray-object intersection test.
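The estimates of Eqs. 4.5 to 4.9 can be accumulated in a single pass over an existing tree. The following Python sketch is only an illustration: the `Node` class, the function `estimate_cost`, and the cost constants in the example are hypothetical, and each node is assumed to store the surface area of its AABB.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Node:
    area: float                      # surface area of the node's AABB
    n_objects: int = 0               # number of objects, meaningful for leaves only
    left: Optional["Node"] = None
    right: Optional["Node"] = None

    @property
    def is_leaf(self) -> bool:
        return self.left is None and self.right is None

def estimate_cost(root: Node, c_ti: float, c_tl: float, c_it: float):
    """Return (N_TI, N_TL, N_IT, C_T) following Eqs. 4.5-4.9."""
    sum_interior = sum_leaves = sum_tests = 0.0
    stack = [root]
    while stack:
        node = stack.pop()
        if node.is_leaf:
            sum_leaves += node.area
            sum_tests += node.area * node.n_objects
        else:
            sum_interior += node.area
            stack.extend([node.left, node.right])
    n_ti = sum_interior / root.area
    n_tl = sum_leaves / root.area
    n_it = sum_tests / root.area
    return n_ti, n_tl, n_it, c_ti * n_ti + c_tl * n_tl + c_it * n_it

# A 3 x 1 x 2 box (area 22) split into a 1 x 1 x 2 leaf (area 10, 3 objects)
# and a 2 x 1 x 2 leaf (area 16, 1 object); the cost constants are made up.
tree = Node(area=22.0, left=Node(area=10.0, n_objects=3),
            right=Node(area=16.0, n_objects=1))
n_ti, n_tl, n_it, c_t = estimate_cost(tree, c_ti=1.0, c_tl=0.5, c_it=4.0)
```

In practice the three cost constants would be measured for a concrete implementation, as the text notes; the values used here are placeholders.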

The estimate of the total cost is rather an upper bound, since it assumes that no ray intersects any object, but at the same time it assumes that ray-object intersection tests are computed. MacDonald and Booth also discuss the variant with a ray-cache that avoids computing the ray-object intersection for a ray more than once for the same object. The ray traversal algorithm tests a ray only once against a particular object, and the result of the ray-object intersection test is stored in the ray-cache. For the next ray-object intersection test with the same object the result is retrieved from the cache. Several research papers report that the use of the ray-cache does not always increase the performance of RSAs in a ray tracing algorithm [147]. The use of the ray-cache also increases the ray-object intersection test cost $\hat{C}_{IT}$ of all objects. Even if the ray-cache helps for several scenes, the impact on performance can be negligible and amounts to at most a few percent of the computation time consumed by an RSA. Since our observation on the use of the ray-cache in RSAs in our previous experiment [80] confirmed that the impact on performance is rather questionable, we will not discuss this extension further here.

MacDonald and Booth [105] support the validity of the presented estimates for $\hat{N}_{TI}$, $\hat{N}_{TL}$, and $\hat{N}_{IT}$ by a simulation performed on arbitrary scenes with arbitrarily built kd-trees for arbitrary rays. Assuming that the estimate of the total cost is accurate enough, we can use it to govern the construction of the kd-tree. This means that we choose the positions and orientations of the splitting planes in the kd-tree so as to minimize its total estimated cost (Eq. 4.9). MacDonald and Booth call any algorithm that tries to minimize the estimated cost of a kd-tree for an RSA a surface area heuristic.
We should remark here that the surface area heuristic does not necessarily find the global minimum of the estimated cost; it rather tries to decrease the estimated cost.

4.2.3.3 Position of the Splitting Plane

In top-down construction of the kd-tree we always consider some set of objects referenced in an AABB. The kd-tree construction algorithm is then reformulated to find the position and orientation of the splitting plane for the AABB associated with the currently processed interior node. In order to minimize Eq. 4.9, MacDonald and Booth [105] proceed as follows; we quote:

We assume that only major planes¹ are used as splitting planes and we ignore the possibility of an object straddling a splitting plane (a case of practical importance, but one we ignore nevertheless). We have to choose a parameter $b$ to position the splitting plane, where $b = 0$ corresponds to the lower limit of the splitting plane and $b = 1$ is the upper limit. Choosing $b = 0.5$ is equivalent to selecting the spatial median. Let us look at the cost as a function of this parameter $b$. We observe that the internal node and leaf node components of this cost savings function are constant with respect to $b$. For the purposes of minimizing cost², we can minimize the function:

$$ f(b) = LSA(b) \cdot N_L(b) + RSA(b) \cdot (N - N_L(b)) - SA \cdot N, \tag{4.10} $$

where $N$ is the number of objects in the node, $N_L(b)$ is the number of objects to the left of the plane at $b$, and $N - N_L(b)$ is the number to the right of the plane because of our assumption that no objects straddle the plane. The surface area of the left and right subnodes are $LSA(b)$ and $RSA(b)$, respectively, and the surface area of the node itself is $SA$. The first term represents the probability that a ray intersects the left subnode multiplied by the number of intersection tests performed in the left subnode. The second term is a similar quantity for the right subnode. The $SA \cdot N$ term is the amount of work required if the node were not subdivided and thus is an amount of work saved by changing the original node from a leaf to an internal node, hence the minus sign. This last quantity is a constant with respect to $b$, so it may be removed from the function, resulting in the following function to be minimized:

$$ f(b) = LSA(b) \cdot N_L(b) + RSA(b) \cdot (N - N_L(b)). \tag{4.11} $$

End of quotation.

¹ By major plane they mean the axis-aligned plane.
² This cost corresponds to the estimated cost in Eq. 4.9.

We consider this description of the surface area heuristic rather simplified and unclear, so we elaborate it in more detail below. From now on we present another view of the positioning of the splitting plane. Let us assume the situation before splitting the AABB associated with the interior node by a splitting plane, e.g., at the root node of the kd-tree. The AABB associated with the node intersects $N$ objects. If the node is not subdivided, it is actually a leaf of the kd-tree; then all its objects have to be referenced in this leaf and they have to be tested for intersection with a ray. Let $C_{IT}(i)$ be the cost of the ray-object intersection test for the $i$-th object. The cost of such an unsubdivided node $\nu_E$ for a ray shooting query is then:

$$ \hat{C}_{\nu_E} = \sum_{i=1}^{N} C_{IT}(i) \tag{4.12} $$

Figure 4.2: One subdivision step in a kd-tree.

If the AABB is subdivided as depicted in Fig. 4.2, then the leaf is replaced by a new tree structure: the interior node $\nu_G$ with two leaves $\nu_{EL}$ and $\nu_{ER}$. The estimated cost of the new tree structure, $\hat{C}_{\nu_G}$, is given as the sum of three terms: $\hat{C}_{TS}$, $\hat{C}_L$, and $\hat{C}_R$. The term $\hat{C}_{TS}$ is the estimated cost of a traversal step in a node of the kd-tree, whether a leaf or an interior node. It does not include any ray-object intersection tests, only the decision whether to visit the left child, the right child, or both children. Moreover, $\hat{C}_{TS}$ includes following the references to the child nodes within the ray traversal algorithm. The costs of visiting the left and right children should contain a factor with the conditional probability that a ray hits the AABBs of the leaves $\nu_{EL}$ or $\nu_{ER}$ once it visits the parent node $\nu_G$. The estimated cost $\hat{C}_{\nu_G}$ of the interior node $\nu_G$ is then expressed as follows:

$$ \hat{C}_{\nu_G} = \hat{C}_{TS} + p_L \cdot \hat{C}_L + p_R \cdot \hat{C}_R, \tag{4.13} $$

where

$\hat{C}_{TS}$ - estimated traversal cost of interior node $\nu_G$,
$p_L$, $p_R$ - probability of a ray intersecting the left or right child node, respectively,
$\hat{C}_L$, $\hat{C}_R$ - estimated cost of the left and right subtree, respectively.
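The comparison between an unsubdivided node (Eq. 4.12) and a subdivided one (Eq. 4.13) can be sketched as follows. The sketch makes two assumptions that are not fixed by the formulas themselves: the probabilities are taken as surface area ratios in the sense of Eq. 4.3, and the cost of each child subtree is estimated by the cost of a leaf with the same objects, all of which have the same intersection cost. The function names and the unit cost constants are hypothetical.

```python
def leaf_cost(n_objects: int, c_it: float = 1.0) -> float:
    """Eq. 4.12 under the assumption that every object has the same test cost c_it."""
    return n_objects * c_it

def split_cost(sa_node: float, sa_left: float, sa_right: float,
               n_left: int, n_right: int,
               c_ts: float = 1.0, c_it: float = 1.0) -> float:
    """Eq. 4.13 with surface-area probabilities and leaf-like child estimates."""
    p_left = sa_left / sa_node
    p_right = sa_right / sa_node
    return c_ts + p_left * leaf_cost(n_left, c_it) + p_right * leaf_cost(n_right, c_it)

# A node of surface area 22 holding 4 objects, split into children of
# surface area 10 (3 objects) and 16 (1 object).
unsplit = leaf_cost(4)
split = split_cost(22.0, 10.0, 16.0, n_left=3, n_right=1)
worth_splitting = split < unsplit
```

With these made-up constants the split is estimated to be cheaper than keeping the node as a leaf; a larger traversal cost `c_ts` relative to `c_it` would shift the balance toward not splitting.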

Under the assumption of uniformly distributed rays we can compute the probability of a ray hitting the AABB of the left and right child node using Eq. 4.3, i.e., $p_L = SA(\mathrm{AABB}(lchild(\nu_G))) / SA(\mathrm{AABB}(\nu_G))$ and $p_R = SA(\mathrm{AABB}(rchild(\nu_G))) / SA(\mathrm{AABB}(\nu_G))$. Further, we can estimate the cost of the left and right subtrees, supposing they will be constructed. The simplest input for the estimate is the number of objects contained in the spatial regions corresponding to the subtrees to be constructed: $\hat{C}_L = f_C(N_L)$ and $\hat{C}_R = f_C(N_R)$, where $N_L$ and $N_R$ are the number of objects in the left and the right child, respectively. If we further choose the estimate to be a linear function, e.g., $f_C(k) = k$, and suppose the objects do not straddle the splitting plane, we can rewrite the estimated cost in Eq. 4.13 as Eq. 4.11, up to the constant term $\hat{C}_{TS}$ and the constant factor $1/SA$, neither of which affects the position of the minimum. For the sake of convenience, we give the name cost function to any formula that estimates the cost of a kd-tree, e.g., Eq. 4.13.

4.2.3.4 Position of a Splitting Plane with Minimum Cost

As described above, the surface area heuristic corresponds to the computation of the cost function for all possible positions of the splitting plane for all three of its possible orientations, and to the selection of the position that has the lowest estimated cost. MacDonald and Booth [105] claim that the position of the splitting plane with the lowest estimated cost has to lie between the spatial and object median, assuming that no two objects are intersected by one splitting plane at the same time, i.e., that the objects do not overlap in the projection onto the coordinate axis. In this case they prove that the cost values are equal, $f(b_{OM}) = f(b_{SM})$, where $b_{OM}$ and $b_{SM}$ are the positions of the object and spatial median. For further discussion, let the median interval be the interval between the object and spatial median, and let the minimum cost splitting plane be a splitting plane that minimizes the value of some cost function.

The proof of $f(b_{OM}) = f(b_{SM})$ is simple. The value at the spatial median is $f(b_{SM}) = f(0.5) = N \cdot LSA(0.5)$ because $LSA(0.5) = RSA(0.5)$. Since the value of $LSA(b) + RSA(b) = 2 \cdot LSA(0.5)$ is a constant independent of $b$, the value at the object median is $f(b_{OM}) = (LSA(b) + RSA(b)) \cdot N / 2 = N \cdot LSA(0.5)$. It can also be proved that the value of the cost function inside the median interval is lower than its value outside the median interval, so the minimum is inside the median interval.

The assumption that objects do not overlap in the projection onto the coordinate axis is unrealistic for a general scene. If we want to minimize the cost function (Eq. 4.11) correctly for objects that possibly overlap in projection, the full range of the AABB along the axis should be searched. Although the range is continuous, certain discrete points can be used to simplify the search for the minimum cost. Without loss of generality, let us consider only one possible orientation of the splitting plane, for instance perpendicular to the x-axis. The AABBs of the objects can also be used instead of the real objects. Fig. 4.3 shows an example of a scene with four objects and the corresponding graph of the estimated cost.

Figure 4.3: The value of the cost function in $\mathbb{E}^2$ for four objects along the axis. The boundary positions of the objects play the key role in selecting the position of the splitting plane with minimum cost.

Assuming the number of objects straddling the splitting plane is $N_{SP}$ and a linear cost estimate is used, we can formulate another variant of the cost function for the surface area heuristic as:

$$ C_{\nu_G} = \frac{1}{SA(\mathrm{AABB}(\nu_G))} \left[ SA(\mathrm{AABB}(lchild(\nu_G))) \cdot (N_L + N_{SP}) + SA(\mathrm{AABB}(rchild(\nu_G))) \cdot (N_R + N_{SP}) \right]. \tag{4.14} $$

The cost function that uses a linear estimate of the cost for the child subtrees is a piecewise linear function of the splitting plane position $b$. Its discontinuity points are given by the objects' boundaries along the axis. Between two adjacent object boundary positions the numbers of objects on both sides remain constant, so the cost function depends only on the surface areas of the child AABBs, which are linear in $b$. This implies that the minimum value of the cost function is attained at a position corresponding to an object boundary, i.e., it can be found using a finite number of splitting plane positions. Since the cost function has discontinuities of the first order, at each discontinuity point it has to be evaluated with the one-sided value that assigns the minimum number of objects to the AABBs associated with the left and the right child nodes. We call the algorithm for positioning a splitting plane that minimizes the cost function given by Eq. 4.14 an ordinary surface area heuristic (abbreviated to OSAH). The algorithm computes the value of the cost function in Eq. 4.14 for every object boundary position within the AABB to be split, in all three axes, and selects the position of the splitting plane with the minimum value of the cost function.
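A minimal Python sketch of this search, restricted to a single axis, is given below. It is a deliberately simple illustration that recounts the objects at every candidate position, not an efficient implementation, and it assumes that every object has a nonzero extent along the axis. The function names and the input format (a list of `(lo, hi)` extents along the split axis plus the two remaining box sizes `h` and `d`) are hypothetical.

```python
def box_area(w: float, h: float, d: float) -> float:
    """Surface area of a box with side lengths w, h, d."""
    return 2.0 * (w * h + h * d + d * w)

def osah_best_split(node_min: float, node_max: float, intervals, h: float, d: float):
    """Evaluate Eq. 4.14 at every object boundary inside the node along one axis.

    Returns (cost, position) of the cheapest candidate, or (inf, None) if
    no boundary lies strictly inside the node.
    """
    sa_node = box_area(node_max - node_min, h, d)
    candidates = sorted({x for iv in intervals for x in iv
                         if node_min < x < node_max})
    best = (float("inf"), None)
    for pos in candidates:
        # An object touching the plane with its boundary counts for one side only,
        # which is the one-sided value with the minimum number of objects.
        n_left = sum(1 for lo, hi in intervals if hi <= pos)
        n_right = sum(1 for lo, hi in intervals if lo >= pos)
        n_straddle = len(intervals) - n_left - n_right
        cost = (box_area(pos - node_min, h, d) * (n_left + n_straddle)
                + box_area(node_max - pos, h, d) * (n_right + n_straddle)) / sa_node
        if cost < best[0]:
            best = (cost, pos)
    return best

# Node spanning [0, 10] along the split axis with a 1 x 1 cross-section;
# three objects are clustered on the left, one sits at the far right.
extents = [(0.0, 1.0), (1.5, 2.0), (2.5, 3.0), (9.0, 10.0)]
best_cost, best_pos = osah_best_split(0.0, 10.0, extents, h=1.0, d=1.0)
```

For this input the cheapest plane sits at the right boundary of the cluster, which cuts off the large empty region between the cluster and the last object. A full OSAH would repeat the search for all three axes and keep the overall minimum.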

4.2.4 Termination Criteria

The cost model gives us a recipe for positioning the splitting plane in an interior node, but it does not tell us whether we should subdivide the node further or declare it a leaf. Any node $\nu$ in the kd-tree has some basic characteristics: its depth $d(\nu)$ from the root node, the $\mathrm{AABB}(\nu)$ associated with $\nu$, and the number of objects $N$ intersecting $\mathrm{AABB}(\nu)$. The construction of the kd-tree implies that the number of objects intersecting the node's AABB decreases with increasing depth of the node. Positioning the splitting plane on the object median implies that the number of objects is halved in each step, but this construction method, as we have remarked, is not suitable for an RSA. The number of objects also need not decrease significantly in each subdivision step, since some objects can straddle the splitting plane. We now face the question: to subdivide or not to subdivide? This issue in the context of hierarchical spatial subdivisions is usually called termination criteria. We can view the termination criteria in terms of the kd-tree cost model (Eq. 4.9). We want to obtain an average height of the kd-tree that results in some "pseudo-minimum" total cost of the kd-tree. When the average height of the kd-tree is increased, the average number of traversal steps increases and the number of ray-object intersection tests is expected to decrease; it may be possible to reach some pseudo-minimum cost point depending on the maximum allowed depth of the kd-tree. Below we discuss some common termination criteria.

4.2.4.1 Ad Hoc Termination Criteria

Ad hoc termination criteria were developed with the introduction of RSAs based on the BSP tree [94] and the octree [59]. They are easily formulated, as already mentioned: the current node $\nu$ becomes a leaf when the number of objects intersecting $\mathrm{AABB}(\nu)$ is lower than or equal to a fixed constant $N_{max}$, or when its depth $d(\nu)$ in the kd-tree reaches another fixed constant $d_{max}$. These two constants are specified by the user, and their values are left to the user's experience and practice in rendering systems based on ray shooting (Mental Ray [137]). $N_{max}$ is usually one [94]; we are not aware of any recommendations for the maximum leaf depth $d_{max}$. Software packages provide some default values: Mental Ray [137] has the default values $d_{max} = 24$ and $N_{max} = 4$ regardless of the number of objects in the scene and other scene complexity characteristics (see Section 3.3).
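The ad hoc criteria amount to a two-condition test, sketched below in Python. The function name `is_leaf` is hypothetical; the default arguments are simply the values quoted above for Mental Ray and carry no further recommendation.

```python
def is_leaf(n_objects: int, depth: int, n_max: int = 4, d_max: int = 24) -> bool:
    """Ad hoc termination: stop when the node is small enough or deep enough."""
    return n_objects <= n_max or depth >= d_max

few_objects = is_leaf(n_objects=3, depth=5)      # stops: at most n_max objects
keep_going = is_leaf(n_objects=10, depth=5)      # continues subdividing
depth_limit = is_leaf(n_objects=10, depth=24)    # stops: depth limit reached
```

Neither condition looks at the geometry of the node or at the estimated cost, which is why the constants have to be tuned by the user.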


Figure 4.4: The cost in dependence on dmax , measured values for TPD and G4SPD . Left top shows ratio of ray-object intersection tests performed to minimum number of intersection tests. Left bottom shows the number of traversal steps per ray. Right top shows the average running time per ray, and right bottom shows the cost normalized to the ideal ray shooting time. (See Chapter 2 for details.)

Setting the maximum leaf depth $d_{max}$ limits the memory used by the kd-tree, since it restricts the number of kd-tree nodes to a constant, more precisely to $2^{d_{max}}$. When $d_{max}$ is too low, the number of objects in the leaves remains high even if further subdivision steps could bring a performance improvement. It has been shown by Subramanian and Fussell [146] that the estimated cost of a kd-tree for a particular scene has some critical point with regard to $d_{max}$. The dependence of the measured cost on $d_{max}$ is shown in Fig. 4.4 for SPD scenes for G4SPD. We can see from the graphs that increasing the maximum leaf depth $d_{max}$ beyond the critical point ($d_{max} = 16 \pm 2$), which is specific for each scene, does not bring any significant improvement in the total cost. It can even happen that the total cost increases, as it clearly does for the scene "rings". Subramanian and Fussell [146, 143] do not discuss any method for detecting the critical point or any algorithm for termination criteria utilizing this property of the kd-tree. We deal with this issue below.

4.2.4.2 Automatic Termination Criteria

Subramanian and Fussell [146, 143] coined the term automatic termination criteria, meaning criteria that do not require any user-specified constants, but they did not propose any particular algorithm. Motivated by this idea, we have designed an automatic termination criteria algorithm based on the cost model, and we describe it in Section 4.5.

4.3 Analysis of the Cost Model

In this section we perform the initial analysis of the cost model that is required for a better understanding of the following sections. The analysis concerns the purely geometric view of the splitting and the minimization of the cost of the whole kd-tree.

4.3.1 Splitting Geometry of an Axis-Aligned Bounding Box

The cost function of the surface area heuristic can be investigated from the purely geometric viewpoint of splitting the $\mathrm{AABB}(\nu)$ associated with the interior node $\nu$ into two parts. Until now we assumed that rays are uniformly distributed in space, which enables us to use Eq. 4.4 for the geometry of the split $\mathrm{AABB}(\nu)$, as depicted in Fig. 4.5. Let the size of $\mathrm{AABB}(\nu)$ be described by width $w$, height $h$, and depth $d$. The splitting plane is positioned perpendicular to the coordinate axis that corresponds to the width, so the AABB associated with the left child has its right boundary at a distance $w_L$ from the left side of $\mathrm{AABB}(\nu)$.

Figure 4.5: Geometry of the split node's AABB.

Let us denote the surface areas of all AABBs that are induced by the geometry of the split $\mathrm{AABB}(\nu)$:

- $SA(\mathrm{AABB}(\nu)) = 2\,(w h + h d + d w)$ is the surface area of the node's $\mathrm{AABB}(\nu)$,
- $SA(\mathrm{AABB}(lchild(\nu))) = 2\,(d h + h w_L + w_L d)$ is the surface area of the $\mathrm{AABB}(lchild(\nu))$ associated with the left child of $\nu$,
- $SA(\mathrm{AABB}(rchild(\nu))) = 2\,(d h + h (w - w_L) + d (w - w_L))$ is the surface area of the $\mathrm{AABB}(rchild(\nu))$ associated with the right child of $\nu$,
- $SA(SP) = 2\,d h$ is the surface area of the splitting plane restricted to $\mathrm{AABB}(\nu)$.

For the notation above it obviously holds that $b = w_L / w$. We can compute the probabilities of all four possible traversal cases that occur in the recursive ray traversal algorithm:

$$ p_{LO} = \frac{SA(\mathrm{AABB}(lchild(\nu))) - \frac{1}{2} SA(SP)}{SA(\mathrm{AABB}(\nu))} \tag{4.15} $$

$$ p_{RO} = \frac{SA(\mathrm{AABB}(rchild(\nu))) - \frac{1}{2} SA(SP)}{SA(\mathrm{AABB}(\nu))} \tag{4.16} $$

$$ p_{LR} = p_{RL} = \frac{\frac{1}{2} SA(SP)}{SA(\mathrm{AABB}(\nu))}, \tag{4.17} $$

where

$p_{LO}$ - probability of a ray hitting the left child node only,
$p_{LR}$ - probability of a ray hitting the left child first and the right child afterwards,
$p_{RO}$ - probability of a ray hitting the right child node only,
$p_{RL}$ - probability of a ray hitting the right child first and the left child afterwards.


Obviously, $p_{LO} + p_{RO} + p_{LR} + p_{RL} = 1$. The size of AB(ν) can be arbitrary, provided $h > 0$, $d > 0$, and $w > 0$. Assuming that the objects associated with AB(ν) do not overlap in the projection to the coordinate axes and that the objects are uniformly distributed in space, we can investigate the value of the cost function with regard to the choice of orientation of the splitting plane. Without loss of generality we further assume $h \le d \le w$. If AB(ν) is split on the largest side of size $w$, then the average number of child nodes hit by a ray, $N^{w}_{TV}$, is as follows:

$$N^{w}_{TV} = \frac{SA(AB(lchild(\nu)))^{w} + SA(AB(rchild(\nu)))^{w}}{SA(AB(\nu))} = \frac{2\,h\,d}{SA(AB(\nu))} + 1 \tag{4.18}$$

Similarly, if the splitting plane is oriented so that it splits the side of size $d$ or $h$, we get the average number of child nodes hit by a ray as:

$$N^{d}_{TV} = \frac{SA(AB(lchild(\nu)))^{d} + SA(AB(rchild(\nu)))^{d}}{SA(AB(\nu))} = \frac{2\,w\,h}{SA(AB(\nu))} + 1 \tag{4.19}$$

$$N^{h}_{TV} = \frac{SA(AB(lchild(\nu)))^{h} + SA(AB(rchild(\nu)))^{h}}{SA(AB(\nu))} = \frac{2\,w\,d}{SA(AB(\nu))} + 1 \tag{4.20}$$

Since we assumed $h \le d \le w$, we can easily prove that $N^{w}_{TV} \le N^{d}_{TV} \le N^{h}_{TV}$. For example, for an AB of size $d = 2h$, $w = 3h$ we get $N^{w}_{TV} = 4/22 + 1 = 1.18$, $N^{d}_{TV} = 6/22 + 1 = 1.27$, and $N^{h}_{TV} = 12/22 + 1 = 1.55$. We can then suppose that it should be advantageous to split the AB along the axis corresponding to its largest side. Since the distribution of objects in the projection to the chosen axis is not generally uniform, and objects may overlap, we cannot rely on this assumption. However, we see that the estimated cost is biased by the geometry given by AB(ν). The orientation of the splitting plane that results in the minimum estimated cost cannot be chosen in advance; the cost function must be evaluated in all three axes for the boundaries of all the objects, thus possibly at $3 \cdot 2N = 6N$ positions for $N$ objects. At least we have shown that the selection of the axis of the splitting plane is crucial for minimizing the estimated total cost of the kd-tree. Changing the orientation of the splitting plane in cyclic order [94, 148] in BSP tree construction, starting with the x-axis, is thus only one possible way, which may be rather inconvenient, especially for a scene AB of oblong shape.
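The numerical example above can be reproduced with a few lines of code. This is only an illustrative sketch of Eqs. 4.18 to 4.20; the function name `avg_children_hit` is hypothetical.

```python
def avg_children_hit(w, h, d):
    """Average number of child nodes hit by a ray for a split on each side.

    Keys name the side that is split: 'w' means the splitting plane is
    perpendicular to the width axis, so the plane has area h*d.
    """
    sa_node = 2.0 * (w * h + h * d + d * w)
    return {
        "w": 2.0 * h * d / sa_node + 1.0,
        "d": 2.0 * w * h / sa_node + 1.0,
        "h": 2.0 * w * d / sa_node + 1.0,
    }


n_tv = avg_children_hit(w=3.0, h=1.0, d=2.0)  # the case d = 2h, w = 3h with h = 1
```

Rounded to two decimals, `n_tv` gives 1.18, 1.27, and 1.55 for the splits on $w$, $d$, and $h$, respectively.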

4.3.2

The Kd -Tree with Minimum Total Cost

Even if the cost model is based on some unrealistic assumptions, we can ask whether it is possible to construct a kd-tree that has the minimum estimated cost according to Eq. 4.9 among all kd-trees. Obviously, this task is not solved by selecting the minimum-cost splitting plane using Eq. 4.14 in the currently processed interior node, since Eq. 4.14 uses a linear estimate for the cost of the subtrees that have not yet been constructed. The linear estimate is valid only when the subtree is not further subdivided, i.e., for a leaf. Therefore, the estimated cost of the node ν can be computed correctly according to Eq. 4.9 only after the kd-tree construction is finished for the whole subtree rooted at ν. In order to get the global minimum estimated cost for the whole kd-tree according to Eq. 4.9, it is necessary to take into account each possible position of the splitting plane for all nodes within the construction (at most $6N$ positions for $N$ objects pointed to in the current node). For all these positions we have to construct left and right subtrees, both of them again with the minimum estimated cost. Then we can combine the obtained costs using Eq. 4.13 to compute the estimated cost of the current node. As the final step, we must select a splitting plane that corresponds to the minimum total estimated cost. This


algorithm is of a recursive nature, which leads us to believe that the problem of obtaining the kd-tree with minimum total cost according to Eq. 4.9 is NP-hard. Proving this statement formally would be rather difficult and lengthy; it requires a polynomial-time reduction from a problem whose NP-hardness has already been proven to the problem of constructing the minimum-cost kd-tree.
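The exhaustive recursive search described above can be sketched for tiny scenes as follows. This is an illustration only, not the construction algorithm used in this thesis: it assumes that objects are axis-aligned boxes with positive extent, that the cost of a leaf is $\hat{C}_{IT}$ times the number of objects, and that an interior node costs $\hat{C}_{TS}$ plus the child costs weighted by the surface area ratios. The names `min_cost`, `surface_area`, `C_TS`, and `C_IT`, and the cost values, are hypothetical.

```python
C_TS = 0.3  # assumed cost of one traversal step
C_IT = 1.0  # assumed cost of one ray-object intersection test


def surface_area(lo, hi):
    """Surface area of the axis-aligned box with corners lo and hi."""
    w, h, d = (hi[k] - lo[k] for k in range(3))
    return 2.0 * (w * h + h * d + d * w)


def min_cost(lo, hi, objects, c_ts=C_TS, c_it=C_IT):
    """Minimum estimated cost over all kd-trees for `objects` in box (lo, hi).

    Each object is a pair (obj_lo, obj_hi) of 3-tuples. The search tries
    every object boundary inside the node on all three axes and recurses,
    so its running time grows explosively with the number of objects.
    """
    best = c_it * len(objects)  # cost when the node is declared a leaf
    if not objects:
        return best
    sa_node = surface_area(lo, hi)
    for axis in range(3):
        # Candidate planes: object boundaries strictly inside the node.
        candidates = sorted({corner[axis] for obj in objects for corner in obj
                             if lo[axis] < corner[axis] < hi[axis]})
        for pos in candidates:
            left_hi = hi[:axis] + (pos,) + hi[axis + 1:]
            right_lo = lo[:axis] + (pos,) + lo[axis + 1:]
            left = [o for o in objects if o[0][axis] < pos]
            right = [o for o in objects if o[1][axis] > pos]
            cost = (c_ts
                    + surface_area(lo, left_hi) / sa_node
                    * min_cost(lo, left_hi, left, c_ts, c_it)
                    + surface_area(right_lo, hi) / sa_node
                    * min_cost(right_lo, hi, right, c_ts, c_it))
            best = min(best, cost)
    return best


# Two unit cubes separated by empty space inside a 4 x 1 x 1 scene box.
scene = [((0.0, 0.0, 0.0), (1.0, 1.0, 1.0)), ((3.0, 0.0, 0.0), (4.0, 1.0, 1.0))]
best = min_cost((0.0, 0.0, 0.0), (4.0, 1.0, 1.0), scene)
```

Even this toy version evaluates every candidate plane at every level of the recursion, which illustrates why the approach is not usable for scenes of practical size.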

4.4

Construction of Kd -Trees with Utilization of Empty Spatial Regions

The assumption that splitting planes do not intersect objects limits the position of the splitting plane with minimum cost to the median interval. It can be supposed that both sides of the splitting plane contain some object(s), which obviously holds for the root node of the kd-tree, because we use the tight ABs of all objects. This is not the case for non-root interior nodes of the kd-tree; the splitting plane need not have any objects on one of its sides. If a splitting plane is positioned so that one child node contains no objects, the child is not further subdivided and it is declared a leaf. We call such a positioning of the splitting plane cutting off empty space. The complementary view of cutting off empty space is that the splitting plane is the face of a bounding volume enclosing the objects in the node.

Subramanian and Fussell [144, 145] first noticed the utilization of empty space in the kd-tree, but their view of the use of empty space within the hierarchy is inconsistent. First, they present the following opinion, quoted from their paper [145]:

To sum things up, the creation of empty voxels is itself a source of inefficiency in the first place since no ray-object intersection can be found in such regions, but the fact that they are not subdivided any further is a desirable factor since it helps the structure adapt itself to the locations and densities of the primitives in the input scene. .... The void space represents a measure of three dimensional space that contains no useful information.

End of quotation. They also provide a “void area measure” in the kd-tree that we consider rather incorrect. They propose to construct tight ABs for all the objects pointed to in the left and right child, and to compute the “void area measure” as the volume of the AB of the original node minus the volumes of the tight ABs of the child nodes. Since the child nodes are further subdivided, some empty space within the hierarchy can be included in the total sum for the whole kd-tree several times (at most six times, since an AB has six faces in $\mathbb{E}^3$). They also claim that their “void area measure” is a useful measure of performance of the kd-tree when used for ray shooting. On the other hand, they apply tight ABs included in the interior nodes; we again quote the paper [145]:

..... By surrounding collections of objects completely with bounding volumes, large sections of object space can be pruned away, drastically reducing the ray search space. This is because if a ray misses this bounding volume, its contents need not be examined. Bounding volumes are more effective than space partitioning structures in this respect, since, in general, they provide tighter enclosures around object collections. .... Space partitioning structures are adaptive, concentrating the partitioning in the vicinity of objects. Since all the partitioning planes are axis-aligned, they have the potential to create large void spaces which are a source of inefficiency. Bounding volume hierarchies, on the other hand, optimize ray tracing by culling away large sections of object space and provide a compact representation for the objects at every node of the hierarchy. ... Thus bounding volumes should be used only where they result in culling a large amount of void space.

End of quotation. Their two views on the use of empty space are contradictory. First, they want to use empty space by means of ABs included in the interior nodes, which can be relatively costly to check for intersection with a ray. Second, they want to avoid the creation of empty leaves, even though this can be understood as another variant of the ABs in the interior nodes.

4.4.1

Theoretical Remarks

Before we describe improved versions of kd-tree construction using cutting off empty space, we present our motivation for this concept. The recursive ray traversal algorithm on a kd-tree constructed with “good” utilization of empty space should behave for any ray as follows: after descending not very deep from the root node to the first leaf containing the origin of the ray, only empty leaves of possibly large size should be traversed until the first full leaf with one object is hit and the ray intersects this object. In this case the ray need not be tested for intersection with any other objects. With increasing size of the empty leaves, the ray must traverse only a few leaves to get to the first full leaf, so the cost of the kd-tree for ray shooting is decreased. The empty leaves contain useful information exactly by virtue of their emptiness; since they are empty, the spatial region that they cover can be skipped within a ray traversal algorithm. No objects can be intersected in empty leaves, and this is the source of potential efficiency improvement. The basic underlying idea in this context of visibility is as follows: when an object is visible from a viewpoint (along a half-line), there must be empty space in front of this object.

Let us describe how kd-tree construction with utilization of empty spatial regions could proceed. After localizing empty space in the kd-tree we can organize the empty space to stay in the leaves in the upper levels of the kd-tree hierarchy and then split the leaves where objects are located and where the probability that a ray will intersect an object is high. One can then propose a general algorithm for constructing the kd-tree with “good” utilization of empty spatial regions:

1. For a given scene $S$ of $N$ objects, identify all “empty space” within the scene AB. Represent this empty space by a set $S_{ER}$ of non-overlapping ABs. For such a set $S_{ER}$ it holds that $\sum_{i \in S_{ER}} Vol(AB_i) = const$, where $Vol(AB_i)$ is the volume of $AB_i$. The optimization criterion for finding the optimum set $S_{ER}^{opt}$ can be derived from Eq. 4.4, since for an arbitrary ray we want the minimum number of empty spatial regions to be intersected. The set $S_{ER}^{opt}$ is the set for which

$$\sum_{i \in S_{ER}^{opt}} SA(AB_i) \le \sum_{j \in S_{ER}} SA(AB_j) \quad \text{for every such set } S_{ER}.$$

2. Select the subset ${}^{sel}S_{ER}^{opt}$ of empty ABs from $S_{ER}^{opt}$ that represents most of the empty space in the hierarchy.

3. Construct the kd-tree over the scene objects and the empty ABs from ${}^{sel}S_{ER}^{opt}$, using some variant of the surface area heuristic. Try not to split either the objects or the empty ABs. One way could be to consider the ABs of ${}^{sel}S_{ER}^{opt}$ as objects.

If the objects were non-overlapping ABs, an RSA using the corresponding kd-tree could be efficient. After locating the first full leaf the ray traversal algorithm terminates, since the ray-object intersection must occur. If ABs only approximate objects, so that an intersection occurs only with a certain probability, it is difficult to predict the advantage of a surface area heuristic that uses empty spatial regions over OSAH. This is also the case when the ABs associated with the objects overlap.

The outline of the kd-tree construction method that uses the empty space gives us several other subproblems. One of them is how many empty spatial regions exist for $S_{ER}^{opt}$, given a scene $S$ containing $N$ objects. Let us suppose that the scene consists of non-overlapping objects having the shape of an AB. Since the number of faces of the objects' ABs is finite, the boundary between the empty spatial regions and the object spatial regions contains a finite number of corners and edges, whose number is thus $O(N)$. The number of possible empty spatial regions is then also $O(N)$.

A difficult problem is the algorithm for constructing the set $S_{ER}^{opt}$. The problem was studied in $\mathbb{E}^2$ by Lingas et al. [103] (cited according to [42]), and is called the Minimum Edge Length Rectangular Partition Problem. They formulate it as follows: given a rectilinear figure, partition it into rectangles with the minimum total length of new boundaries. Lingas et al. [103] show that this problem is NP-hard. Although we do not prove it here formally, we expect the problem to be NP-hard in $\mathbb{E}^3$ as well. Moreover, we have to describe the empty space boundary as the complementary space of the ABs associated with the objects in the scene. We can use some heuristic algorithm to get an approximate solution for $S_{ER}^{opt}$, but for $\mathbb{E}^3$ we are not aware that any such algorithm has been published. For $\mathbb{E}^2$ some heuristics are presented by Du and Zhang [42].

Although the use of empty space as described above has its potential, the time complexity of kd-tree construction would become unacceptable for practical applications. Instead of using a general concept for cutting off empty space, we describe below three simpler methods based on the cost model.

4.4.2

Early Cutting Off Empty Space

Cutting off empty space can occur when the estimated cost computed according to Eq. 4.13 is the minimum found. For this reason the cost function should be evaluated over the whole range of the AB for all three axes to find the minimum estimated cost. We call this case early cutting off empty space. It occurs when the geometry term of the space to be cut off outweighs the other terms in the cost function, resulting in a reduction in the total cost. Assuming the left child node is empty, Eq. 4.13 simplifies to:

$$\hat{C}_{\nu_G} = \hat{C}_{TS} + p_R \cdot \hat{C}_R \tag{4.21}$$

Early cutting off empty space creates new empty leaves, possibly in the upper levels of the kd-tree hierarchy, so that large empty spatial regions can be skipped during the ray traversal algorithm as described above.

4.4.3

Late Cutting Off Empty Space

Mostly, the termination criteria for kd-tree construction, as already mentioned, are a fixed number of objects pointed to in the leaf and the maximum depth of the leaf in the hierarchy. It seems uninteresting to split a leaf node containing only one object. Below we show that this case should also be investigated. Let node ν of the kd-tree be associated with AB(ν) representing the whole scene, which consists of one object only. The cost function (Eq. 4.12) is then expressed as:

$$\hat{C}_{\nu_1} = \sum_{i=1}^{1} \hat{C}_{IT}(i) \tag{4.22}$$

Let us suppose that AB(ν) is split by a plane in such a way that the object remains in the right child node only, whereas the left child node is empty. Then using the cost function Eq. 4.21 we get the following new cost function:

$$\hat{C}_{\nu_2} = \hat{C}_{TS} + p_R \cdot \left( \sum_{k=1}^{1} \hat{C}_{IT}(k) \right) = \hat{C}_{TS} + p_R \cdot \hat{C}_{IT} \tag{4.23}$$

The position of the splitting plane influences only the last term in the cost function, Eq. 4.23. Since the traversal cost $\hat{C}_{TS}$ is included, the cost $\hat{C}_{\nu_2}$ could also be higher than the original cost $\hat{C}_{\nu_1}$. The selection of the minimum cost using either Eq. 4.23 or Eq. 4.22 is thus sensitive to the ratio between $\hat{C}_{TS}$ and $\hat{C}_{IT}$, unlike early cutting off empty space. Obviously, the cutting off empty space method discussed here should be applied only if $\hat{C}_{\nu_2} < \hat{C}_{\nu_1}$. Since this occurs in the leaves of the kd-tree, in the late phase of the kd-tree construction, we call the method late cutting off empty space.

The real values of $C_{TS}$ and $C_{IT}$ depend on the implementation of the ray traversal algorithm and of the ray-object intersection tests. We deal with efficient ray traversal algorithms in Chapter 5. The time of the ray-object intersection test $C_{IT}$ depends on the particular algorithm and implementation, and is related to the object's shape. Simple geometrical objects like spheres and triangles can be tested for intersection at a cost comparable to $C_{TS}$; however, $C_{IT}$ for these simple objects is still higher than $C_{TS}$ for an efficient recursive ray traversal algorithm. Complex objects like NURBS have $C_{IT}$ higher by order(s) of magnitude than simple objects, therefore late cutting off empty space typically pays off for them. Since SPD scenes consist of simple geometrical objects only, cutting off empty space need not decrease the estimated cost considerably.

In Fig. 4.6 we can see the difference between the late and early cutting off empty space methods. Whenever possible, early cutting off empty space should be preferred, since it creates a kd-tree with fewer nodes, and thus the ray traversal algorithm is on average more efficient under the assumption of uniformly distributed rays.
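The decision between Eq. 4.22 and Eq. 4.23 for a leaf with a single object can be sketched as follows. The function name and the numeric cost values are hypothetical; $p_R$ is taken as the ratio of the surface areas of the child and the node, as in the rest of the cost model.

```python
def late_cut_pays_off(sa_node, sa_child, c_ts, c_it):
    """Return True if cutting off empty space lowers the estimated cost.

    sa_node  -- surface area of the AB of the leaf with one object
    sa_child -- surface area of the AB of the child that keeps the object
    """
    p_r = sa_child / sa_node
    cost_leaf = c_it               # Eq. 4.22 for a single object
    cost_cut = c_ts + p_r * c_it   # Eq. 4.23
    return cost_cut < cost_leaf


cheap_traversal = late_cut_pays_off(sa_node=22.0, sa_child=11.0, c_ts=0.3, c_it=1.0)
costly_traversal = late_cut_pays_off(sa_node=22.0, sa_child=11.0, c_ts=0.6, c_it=1.0)
```

With the same geometry, the cut is accepted for the lower traversal cost and rejected for the higher one, which shows the sensitivity to the ratio of the two costs.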

Figure 4.6: A spatial region with empty space and a corresponding kd -tree. Late cutting off empty space (left) is less effective than early cutting off empty space (right).

We can compare late cutting off empty space for a single object with the use of bounding volumes in the interior nodes [145] of the kd-tree. Usually, the AB of a single object is taken as its bounding volume. This bounding volume is checked for intersection with a ray before the ray-object intersection test is performed. For the kd-tree, the use of ABs is particularly suitable since the shape of an AB fits the geometry of the kd-tree. Even if efficient algorithms for computing an intersection between a ray and an AB are known (for a survey, see [112]), the cost of the intersection test is relatively high compared with the cost of a traversal step $C_{TS}$. It requires from two to six intersection tests with the planes of the AB that are perpendicular to the coordinate axes, and other computational effort, even if the AB is not intersected. Unlike the AB used as a bounding volume and put in the interior node of the kd-tree, late cutting off empty space puts splitting planes only on those sides of the object (at most six planes in $\mathbb{E}^3$) where this is advantageous from the viewpoint of the cost model. In the worst case, all six bounding planes can be put around the object, which is unlikely to occur, since in this case the use of the AB as a bounding volume is less costly.

We can also compare the use of bounding volumes for a set of objects either in the leaf or in the interior node. Further, we again assume that we use as the bounding volume an AB pointed to in the node ν. When the recursive ray traversal algorithm reaches the node ν, the entry and exit signed distances from the origin of the ray to the AB must be computed. This is necessary since the AB used as the bounding volume is then taken as the root of the subtree of the kd-tree that is further built. As for the single object, the use of the AB as the bounding volume is insensitive to the size of the empty space saved on each side of this AB with respect to the size of the AB(ν) associated with the node ν.

4.4.4

Two-Plane Cutting Off Empty Space

In addition to early and late cutting off empty space methods, we can ask about the existence of an algorithm as some cheap alternative to the algorithm proposed in Section 4.4.1. This algorithm should find large empty spatial regions and cut them off from the rest of the kd -tree hierarchy. The problem with the kd -tree is that whenever a node is subdivided, empty spatial regions on both sides of the splitting plane cannot ever be connected again to one leaf node, see Fig. 4.6 (left). If possible, we want to detect these situations, and try to promote the splitting that results in the creation of empty leaves, but we want to use the cost model for our decision. Therefore, we further propose and analyze a simple method that searches for an interval of empty space on one axis only. The technique is an extension of early cutting off empty space, and further promotes the creation of empty leaves in the upper levels of the kd -tree hierarchy. The underlying geometry is depicted in Fig. 4.7.

Figure 4.7: Two-plane cutting off empty space. (a) Spatial region with empty space inside the interval along the axis. (b) RF cutting. (c) LF cutting.

When a kd-tree is constructed as depicted in Fig. 4.7 (b) or (c), we can refine the cost model accordingly. First, we assume that two subdivision steps are performed as depicted in Fig. 4.7 (b). In this case, the cost for the parent node $\nu_P$ corresponds to Eq. 4.14. Then the empty space is cut off in the next step, so the cost of the node $\nu_{RF}$ is:

$$\hat{C}^{RF} = \hat{C}_{TS} + \frac{SA(AB(\nu_R))}{SA(AB(\nu_{RF}))} \cdot \hat{C}_R$$

Then the total estimated cost for the parent node becomes:

$$\hat{C}^{RF}_{\nu_P} = \frac{SA(AB(\nu_L))}{SA(AB(\nu_P))} \cdot \hat{C}_L + \frac{SA(AB(\nu_R))}{SA(AB(\nu_P))} \cdot \hat{C}_R + \left( \frac{SA(AB(\nu_{RF}))}{SA(AB(\nu_P))} + 1 \right) \cdot \hat{C}_{TS} \tag{4.24}$$

Similarly, when the kd-tree is constructed as shown in Fig. 4.7 (c), we can estimate the cost for this way of splitting as:

$$\hat{C}^{LF}_{\nu_P} = \frac{SA(AB(\nu_L))}{SA(AB(\nu_P))} \cdot \hat{C}_L + \frac{SA(AB(\nu_R))}{SA(AB(\nu_P))} \cdot \hat{C}_R + \left( \frac{SA(AB(\nu_{LF}))}{SA(AB(\nu_P))} + 1 \right) \cdot \hat{C}_{TS} \tag{4.25}$$

We compute the cost function using Eq. 4.24 and Eq. 4.25 only when the empty space interval is identified within the search for the minimum cost evaluating Eq. 4.14. We select the case LF or RF that


uses two predetermined splitting plane positions only if it has a cost lower than the cost according to Eq. 4.14 for the boundaries of all objects along the axis. We can see that this method is also sensitive to the ratio between the values of $C_{TS}$ and $C_{IT}$, like late cutting off empty space. Obviously, two-plane cutting off empty space can also arise from two successive subdivision steps in OSAH. In that case, however, both splitting planes must have a minimum estimated cost independently. The combination of two subdivision steps into one equation further promotes the creation of empty leaves; it reveals the geometry with the empty space one step in advance.
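The combined cost of the RF case can be evaluated directly, and it can be cross-checked against two nested applications of the single-split cost. The sketch below is illustrative only; the function names, argument names, and numeric values are hypothetical.

```python
def two_plane_cost_rf(sa_p, sa_l, sa_rf, sa_r, c_l, c_r, c_ts):
    """Combined cost of the parent for the RF case (two planes at once).

    sa_p  -- surface area of the parent AB
    sa_l  -- surface area of the left child AB
    sa_rf -- surface area of the right child AB before cutting off empty space
    sa_r  -- surface area of the right part that keeps the objects
    c_l, c_r -- estimated costs of the left and right object parts
    """
    return (sa_l / sa_p * c_l
            + sa_r / sa_p * c_r
            + (sa_rf / sa_p + 1.0) * c_ts)


def nested_cost_rf(sa_p, sa_l, sa_rf, sa_r, c_l, c_r, c_ts):
    """Same cost obtained by applying the single-split cost twice."""
    c_rf = c_ts + sa_r / sa_rf * c_r            # empty space cut off from nu_RF
    return c_ts + sa_l / sa_p * c_l + sa_rf / sa_p * c_rf


args = dict(sa_p=40.0, sa_l=16.0, sa_rf=28.0, sa_r=14.0, c_l=3.0, c_r=5.0, c_ts=0.3)
combined = two_plane_cost_rf(**args)
nested = nested_cost_rf(**args)
```

Both functions return the same value up to rounding; the LF case follows by exchanging the roles of the left and right children.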

4.5

Automatic Termination Criteria

As described in Subsection 4.2.4.2, automatic termination criteria are termination criteria that do not require any user setting. In this section we deal with the design of automatic termination criteria. Looking at the graphs in Fig. 4.4, we could require that they result in a kd-tree that at least reaches the critical performance point depicted on the graphs. A simple automatic termination criteria algorithm can follow the curves on these graphs. We can construct a kd-tree for some maximum leaf depth $d_{max}$ and compute the estimated cost of the kd-tree $C(d_{max})$ using Eq. 4.9. Then we subdivide all the leaves of the kd-tree one step further (if possible) to the maximum leaf depth $d_{max} + 1$, and we compute the new estimated cost $C(d_{max} + 1)$ of this deepened kd-tree. When $C(d_{max} + 1)$ is sufficiently lower than $C(d_{max})$, the subdivision step just performed improves the estimated cost and the kd-tree can be further deepened to depth $d_{max} + 2$. Otherwise, we should not continue subdividing.

However, incrementally increasing the maximum leaf depth $d_{max}$ for the whole kd-tree construction is not the only way to get the pseudo-minimum of the estimated cost. Actually, each node ν to be potentially subdivided has its own history of splitting, which is associated with all the nodes on the path between the root node and the node ν. Setting only one maximum leaf depth $d_{max}$ for the whole kd-tree is not very appropriate to the problem. Automatic termination criteria should keep track of the history of splitting the nodes of the kd-tree.

We should remark that one subdivision step need not necessarily bring an improvement in cost immediately. It is possible that the estimated cost increases due to the current subdivision step, even if the splitting plane with minimum cost is selected. This can have two possible reasons. First, the splitting plane may not separate enough objects. In this case, most objects are intersected by the splitting plane, so a recursive ray traversal algorithm can require one more traversal step. Second, the linear cost estimate of the unsubdivided child node is just an estimate. If a subdivision step is unsuccessful, it does not mean that further subdividing of the corresponding child nodes cannot decrease the total estimated cost of the kd-tree considerably. If many unsuccessful subdivision steps occur on the path between the root node and the current node, this would indicate that objects in these spatial regions are difficult to separate. Avoiding further subdivision steps in this case could decrease the cost of the kd-tree.

We observed during experiments with the termination criteria algorithm that using some maximum leaf depth $d_{max}$ is a useful feature of the algorithm. First, the use of $d_{max}$ bounds the maximum memory requirements for the kd-tree representation by limiting the number of kd-tree nodes to a constant. Second, since ABs associated with objects can overlap, they need not be separable by a splitting plane at all. In this case the number of objects in a leaf cannot become lower than or equal to a constant $N_{max}$. If the objects with the shape of an AB do not overlap in the projection to all coordinate axes and a kd-tree with one object in every leaf is required, the maximum depth of a leaf node for the balanced kd-tree is $d_{max} = \log N$. When we assume arbitrarily distributed objects in the scene with possible overlapping of objects, the maximum leaf depth $d_{max}$ needed to achieve the critical cost point could be higher. To bound this quantity we propose to express the maximum leaf depth as:

$$d_{max} = k_1 \log N + k_2 \tag{4.26}$$
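The simple scheme that deepens the whole kd-tree one level at a time can be sketched as follows. This is a hedged illustration: `cost_at_depth` is a hypothetical callback standing in for building the kd-tree with a given maximum leaf depth and evaluating its estimated cost, and `min_gain` is an assumed threshold for what counts as a sufficiently lower cost.

```python
def choose_max_depth(cost_at_depth, d_start, d_limit, min_gain=0.01):
    """Deepen the whole tree while the estimated cost keeps dropping.

    cost_at_depth -- callable returning the estimated cost C(d) of the
                     kd-tree built with maximum leaf depth d (assumed)
    min_gain      -- required relative improvement per step (assumed)
    """
    depth = d_start
    cost = cost_at_depth(depth)
    while depth < d_limit:
        next_cost = cost_at_depth(depth + 1)
        if next_cost >= cost * (1.0 - min_gain):
            break  # the last step did not improve the cost sufficiently
        depth, cost = depth + 1, next_cost
    return depth


# Made-up cost curve with a transient plateau between depths 3 and 4.
costs = {1: 10.0, 2: 7.0, 3: 5.5, 4: 5.49, 5: 4.0}
chosen = choose_max_depth(costs.__getitem__, d_start=1, d_limit=5)
```

On this made-up curve the loop stops at depth 3 although depth 5 would be cheaper, which illustrates why a single unsuccessful step should not terminate the subdivision on its own.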


The values of the constants $k_1$ and $k_2$ can be chosen according to experiments on some set of scenes to achieve the critical performance point. We found by averaging over experiments on 30 SPD scenes (groups $G^{3}_{SPD}$, $G^{4}_{SPD}$, and $G^{5}_{SPD}$) that one possible setting of these constants in the GOLEM rendering system is $k_1 = 1.2$, $k_2 = 2.0$.

Further, we discuss the setting of the constant $N_{max}$. Our observation is based on the case when we access a leaf with $N_{max}$ objects during a ray traversal algorithm. Assuming that the cost of a traversal step $C_{TS}$ is sufficiently lower than the cost of a ray-object intersection test $C_{IT}$, it should be advantageous to have $N_{max} = 1$. Then in the ideal case the leaves of the constructed kd-tree contain either one object or no object. First, the empty leaves are traversed quickly. Second, full leaves with one object are also advantageous for efficiency reasons. If a leaf with one object is checked for ray-object intersection and the intersection exists, no further ray-object intersection tests need to be performed. Placing one object in each leaf is not always possible, since objects and/or the ABs associated with the objects can overlap. Therefore, during the construction we have to detect those branches of the kd-tree where further deepening does not bring any improvement of the whole cost of the kd-tree. In order to detect these cases we also use the cost model.

Let us describe the use of the cost model in the automatic termination criteria algorithm. When the node ν with $N$ objects is subdivided, its resulting estimated cost $\hat{C}_{\nu_G}$ is computed according to Eq. 4.13. When the node ν is declared a leaf, its cost $\hat{C}_{\nu_E}$ is expressed by Eq. 4.12. Then we can express the ratio of these two costs, which shows the quality of the subdivision step:

$$r_q = \frac{\hat{C}_{\nu_G}}{\hat{C}_{\nu_E}} = \frac{\hat{C}_{TS} + \hat{C}_{IT} \cdot \dfrac{SA(AB(lchild(\nu_G))) \cdot N_L + SA(AB(rchild(\nu_G))) \cdot N_R}{SA(AB(\nu_G))}}{\hat{C}_{IT} \cdot N} \tag{4.27}$$

The higher the quality of the subdivision step, the lower $r_q$: for a successful subdivision step we assume $r_q < 1.0$. For example, if a node with an AB of cubic shape that contains uniformly distributed objects is subdivided at its spatial median, then for $C_{TS} = 0$ we get the quality of the subdivision $r_q = 2/3$. If $r_q$ is greater than a constant $r_q^{min}$ (we can set $r_q^{min}$ to one or to some constant slightly lower than one), we consider the subdivision step unsuccessful. However, the case $r_q > r_q^{min}$ can occur only transiently, and the quality of the subdivision step $r_q$ can improve again in the subsequent subdivision steps. Therefore, we should keep track of the number of unsuccessful subdivision steps on the path from the root node to the current node. If the number of these unsuccessful subdivision steps is higher than the allowed fixed constant $F_{max}$, we declare the node a leaf, since further subdivision steps are unlikely to decrease the total estimated cost of the kd-tree. We propose to compute $F_{max}$ from the maximum leaf depth $d_{max}$, since for scenes with a higher number of objects the number of allowed unsuccessful steps could be higher, and $d_{max}$ is derived from the number of objects. We propose to compute $F_{max}$ as follows:

$$F_{max} = K^{1}_{fail} + K^{2}_{fail} \cdot d_{max} \tag{4.28}$$

We found one possible setting for the constants empirically in the GOLEM rendering system: $K^{1}_{fail} = 1.0$, $K^{2}_{fail} = 0.2$, and $r_q^{min} = 0.75$.

The automatic termination criteria algorithm described here is an empirical algorithm based on experiments using the test scenes $G^{3}_{SPD}$, $G^{4}_{SPD}$, and $G^{5}_{SPD}$. We do not claim its optimality or that the setting of constants given above is the best one. Such a setting of constants is implementation dependent, and in particular it depends on the ratio $C_{TS} / C_{IT}$. The advantage of the proposed algorithm for automatic termination criteria is that it does not require any user intervention, it limits the maximum memory usage, and it results in a kd-tree whose cost is lower than or comparable with the cost of a kd-tree built with ad hoc termination criteria. We verified experimentally the performance of kd-trees built with automatic termination criteria (Section 4.10.2).
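The quantities used by the automatic termination criteria can be put together in a short sketch. The constants are those reported above; the base-2 logarithm, the function names, and the way the individual tests are combined in `declare_leaf` are assumptions made only for this illustration and need not match the GOLEM implementation.

```python
import math

K1, K2 = 1.2, 2.0            # constants reported for Eq. 4.26
K1_FAIL, K2_FAIL = 1.0, 0.2  # constants reported for Eq. 4.28
RQ_MIN = 0.75                # threshold for an unsuccessful subdivision step


def max_leaf_depth(n_objects):
    """Eq. 4.26, assuming the logarithm is taken to base 2."""
    return K1 * math.log2(n_objects) + K2


def max_failures(d_max):
    """Eq. 4.28: allowed number of unsuccessful steps on a root-to-node path."""
    return K1_FAIL + K2_FAIL * d_max


def subdivision_quality(sa_node, n_node, sa_left, n_left, sa_right, n_right,
                        c_ts, c_it):
    """Eq. 4.27: cost of the subdivided node divided by its cost as a leaf."""
    cost_split = c_ts + c_it * (sa_left * n_left + sa_right * n_right) / sa_node
    cost_leaf = c_it * n_node
    return cost_split / cost_leaf


def declare_leaf(depth, failures, n_node, d_max, f_max, n_max=1):
    """Assumed combination of the termination tests for one node."""
    return n_node <= n_max or depth >= d_max or failures > f_max


# Unit cube with 100 uniformly distributed objects split at the spatial median.
r_q = subdivision_quality(sa_node=6.0, n_node=100, sa_left=4.0, n_left=50,
                          sa_right=4.0, n_right=50, c_ts=0.0, c_it=1.0)
step_failed = r_q > RQ_MIN
```

For the cube example the sketch reproduces $r_q = 2/3$, so the step counts as successful and the failure counter on the path would not be incremented.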

4.6

Further Results and Problems

The development of the cost model and the surface area heuristic has been conditioned by several rather unrealistic simplifications. The first is the estimate of the cost of the child subtree to be subdivided; the estimate is linear in the number of objects, which corresponds to the worst-case complexity measure. We have shown the validity of this linear cost estimate only for the case when the node is not further subdivided; however, the goal is that the time complexity of an RSA based on the kd-tree should be much less than linear. The second simplification is that we approximate objects by their ABs, regardless of how tightly the ABs enclose the objects. Then the objects need not intersect the ABs of the leaves where they are referenced, and this can also influence the construction of the kd-tree. The third simplification is the assumption that an arbitrary ray does not intersect any object, which again corresponds to the worst-case complexity measure. This is obviously not true in general, since the existence of an intersection between a ray and an object is a common result of an RSA. The fourth simplification is the assumption of the uniformity of ray distribution, which can be strongly violated in general. Ray distribution depends greatly on the application. Below, we present our study in which we try to avoid these simplifying assumptions. The third and fourth simplifications mentioned above are elaborated in separate sections below, since the scope of the text is rather large.

4.6.1

Cost Estimate

Until now we have supposed that the worst case in the linear cost estimate is valid, that is, that a node containing objects is not further subdivided, which corresponds to declaring the node a leaf. For nodes of the kd-tree with many objects, such an estimate seems unrealistic: since these nodes are further subdivided, their cost decreases. Estimating the cost of a node ν that is to be subdivided is, however, difficult for general scenes. We can approach the problem in several ways.

The first solution is precise but costly: we can actually construct a subtree for node ν and then evaluate its estimated cost using Eq. 4.9. This leads to a combinatorial explosion of the construction algorithm, as we stated in Section 4.3.2, and is not usable in a practical algorithm.

The second solution is to base the estimate on the results measured on the SPD scenes. Since these scenes can be generated with various numbers of objects by specifying the size factor SF, we can obtain the function $C = f(N)$, where C is the cost of the kd-tree constructed using Eq. 4.14 and N is the number of objects. These functions for all the SPD scenes are plotted in Fig. 4.8. For constructing kd-trees in the experiments we applied automatic termination criteria. Unfortunately, all the measured scenes have completely different functions $f(N)$, which cannot be expressed by a single function of one variable N. We also failed with the following approach: we computed the estimated cost $\hat{f}(N)$ for a particular scene S as a function of one variable N and applied this estimate during the construction of the kd-tree for the scene S; however, the resulting cost of the kd-tree was higher than for the linear estimate. This is probably due to the fractal nature of the generated SPD scenes: increasing the number of objects need not imply a similar distribution of objects in the scene.

In our third attempt to solve the problem, we estimate the cost from Eq. 4.9 under the following assumptions: a BSP tree with a spatial median method is built over a set of uniformly distributed objects. The input for the estimate of node ν is the number of objects N and the surface area SA(AABB(ν)) of the AABB associated with ν. We can estimate the average depth of the constructed BSP tree as $\tilde{d} = \log N + k_1$, $k_1 \ge 0$, since some objects can reside in more than one leaf at the same time. For simplicity we assume that AABB(ν) is of cubic shape with edge size w. Then the surface area SA(AABB(ν))


Figure 4.8: The cost as a function of the number of objects for the SPD scenes, measured with TPD for different size factors. The graphs show $r_{ITM} = f_1(N)$, $\tilde{N}_{TS} = f_2(N)$, $T_R / N_{rays} = f_3(N)$, and $\Theta_{RUN} = f_4(N)$.

is $6w^2$. The estimated cost $\hat{C}(\nu)$ of node ν is the sum of three estimated costs: the cost of traversing interior nodes $\hat{C}_{TI}(\nu)$, the cost of traversing leaves $\hat{C}_{TL}(\nu)$, and the cost of ray-object intersection tests $\hat{C}_{IT}(\nu)$. The number of leaves in the BSP tree at the average depth $\tilde{d}$ is $N_l = 2^{\tilde{d}}$, and the number of interior nodes is $N_i = 2^{\tilde{d}} - 1$. We can compute the estimated cost $\hat{C}_{TI}(\nu)$ by considering the cost of the nodes separately at different depths of the kd-tree. All nodes at one particular depth d have the same shape, and thus the same surface area, regardless of their position in the kd-tree, because we assume a spatial median method. The AABB associated with a node at a depth d for which $d = 3i$, $i \in \{0, 1, 2, \ldots\}$, is cubic again. Table 4.1 shows the surface area of the AABB associated with one node, and with all the nodes, as a function of the depth in the kd-tree. Then, using the sum of a geometric sequence, we can derive $\hat{C}_{TI}(\nu)$:

$$\hat{C}_{TI}(\nu) = \frac{1}{SA}\,\hat{C}_{TS} \sum_{i=1}^{N_i} SA_i \;\in\; \hat{C}_{TS} \cdot O(N + k_1) \tag{4.29}$$

Similarly, we can derive that $\hat{C}_{TL}(\nu) = \frac{1}{SA}\,\hat{C}_{TS} \sum_{i=1}^{N_l} SA_l \in \hat{C}_{TS} \cdot O(N + k_1)$. Since the average number of objects in a leaf is constant, provided that the distribution of objects in the scene space is uniform and no object is intersected by a ray, it also holds that $\hat{C}_{IT}(\nu) \in \hat{C}_{IT} \cdot O(N + k_1)$. To conclude, the linear estimate of the cost, $\hat{C}(\nu) \in O(N + k_1)$, is thus valid under the conditions stated above, where $k_1$ is unknown. However, the multiplicative factor hidden behind the O-notation is unknown.

d    N_i    SA(AABB(ν))    N_i · SA(AABB(ν))    Σ_{j=0..d} N_i · SA(AABB(ν))
0    1      6 w^2          6 w^2                6 w^2
1    2      4 w^2          8 w^2                14 w^2
2    4      2.5 w^2        10 w^2               24 w^2
3    8      1.5 w^2        12 w^2               36 w^2
4    16     1 w^2          16 w^2               52 w^2

Table 4.1: The surface area of a node according to the depth d, assuming an input of cubic shape (edge size w) and the spatial median method.
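The entries of Table 4.1 can be reproduced with a few lines of code. The following is a minimal sketch, assuming that the spatial median method halves the cube cyclically along the x, y, and z axes; the function names are hypothetical and introduced only for this illustration.

```python
from fractions import Fraction


def node_surface_area(depth, w=Fraction(1)):
    """Surface area of one node at `depth` when a cube of edge w is halved
    cyclically along x, y, z (assumed spatial median subdivision)."""
    dims = [w, w, w]
    for level in range(depth):
        dims[level % 3] = dims[level % 3] / 2
    a, b, c = dims
    return 2 * (a * b + b * c + a * c)


def table_rows(max_depth, w=Fraction(1)):
    """Rows (d, number of nodes, SA of one node, SA of all nodes at depth d,
    cumulative SA up to depth d), with areas in units of w^2 for w = 1."""
    rows, running = [], Fraction(0)
    for d in range(max_depth + 1):
        n_nodes = 2 ** d
        sa = node_surface_area(d, w)
        running += n_nodes * sa
        rows.append((d, n_nodes, sa, n_nodes * sa, running))
    return rows
```

Calling `table_rows(4)` yields the five rows of the table; exact fractions are used so that values such as 2.5 and 1.5 carry no rounding error.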

4.6.2

Reducing Objects’ Axis-Aligned Bounding Boxes

The notion of AABBs for scene objects significantly simplifies the kd-tree construction. Since objects can straddle a splitting plane, it can happen after several subdivision steps that the surface of some object does not intersect the AABB of the current node of the kd-tree hierarchy, although the AABB of that object does intersect the AABB associated with the current node. This is depicted in Fig. 4.9 (a).

Figure 4.9: (a) Splitting during the kd-tree construction can result in references to objects that have no intersection with the leaves. (b) When the AABB associated with an object straddles the splitting plane, it does not guarantee that the object also straddles the splitting plane. (c) Split clipping: reducing the AABBs of the object, one on the left and one on the right of the splitting plane, by clipping.

Although AABBs of objects fit well for kd-tree construction, several disadvantages arise from the simple use of AABBs instead of the objects' surfaces. First, objects that cannot have an intersection with a ray are also tested in some leaves. Second, objects only virtually present in the interior node ν influence the estimated cost of ν and thus the whole process of kd-tree construction. We have found three possible ways to deal with this problem, all based on more complex intersection routines between the surface of an object and the AABB associated with the currently processed node of the kd-tree.

First, we can postprocess all leaves of the kd-tree and check each reference using an intersection test between the AABB associated with the leaf and the object's surface. For each leaf of the kd-tree we can thus remove the redundant references to objects. We call this method leaf pruning. In this case we do not avoid the problem that the kd-tree construction is influenced by objects that only virtually intersect the AABBs of the interior nodes. In the postprocessed kd-tree we can also get subtrees with empty leaves only. These branches of the kd-tree would be consolidated into empty leaves by another postprocessing step.

Second, we can apply the intersection test between the AABB of the kd-tree node and the surface of an object not only to leaves but also to interior nodes. In each interior node ν, excluding the root node, we check before evaluating the cost function whether all the objects belong to AABB(ν). We perform this intersection test also for leaves, which corresponds to leaf pruning. In this way we can guarantee that a minimum set of objects is considered for the currently processed node when evaluating the cost function. However, this method as presented suffers from another problem. An object is considered to straddle the splitting plane when the AABB associated with the object straddles the splitting plane. This need not hold for the object itself, as shown in Fig. 4.9 (b). The method can be further improved by testing, for each tested position of the splitting plane, the assignment of the objects straddling the splitting plane to the left and right subtrees. This approach is evidently costly and redundant, since the intersection tests lack any computational coherence for adjacent positions of the tested splitting plane.

Third, we can solve the problem one step in advance. If the position of a splitting plane is known, we also know the objects straddling the splitting plane. For these objects we can reduce the AABBs on both sides of the splitting plane whenever possible, as depicted in Fig. 4.9 (c). This requires a special intersection method for all shapes of objects, since, for a given object O, its current AABB, and the position and orientation of a splitting plane, we want to construct two tight AABBs associated with the object's spatial region taken on the left and on the right of the splitting plane. These reduced AABBs for the left and right side of the splitting plane must be passed with the references to the objects to the left and right child nodes. We call this method, which clips the AABBs of objects with regard to a splitting plane, split clipping. In principle, split clipping works best of all the methods described here.

The description of the algorithms for leaf pruning and split clipping, which handle the intersection tests between an object and a plane for particular shapes of objects, is beyond the scope of this thesis. However, we have designed and implemented these algorithms [80]; the results of the experiments are surveyed in Subsection 4.10.4.
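To make the idea of split clipping concrete, the following minimal sketch computes the two reduced AABBs for a single triangle. It is an illustration under our own assumptions, not the implementation referenced in [80]: the object is a triangle given as three 3D tuples, the splitting plane is axis-aligned, and the triangle is clipped against the plane with one Sutherland-Hodgman step. All function names are hypothetical.

```python
def clip_polygon_by_plane(poly, axis, pos, keep_left):
    """Clip a convex polygon (list of 3D tuples) against the plane
    x[axis] = pos; keep x[axis] <= pos if keep_left, else x[axis] >= pos."""
    def inside(p):
        return p[axis] <= pos if keep_left else p[axis] >= pos

    out = []
    for i, cur in enumerate(poly):
        prev = poly[i - 1]
        if inside(cur) != inside(prev):
            # The edge crosses the plane: add the intersection point.
            t = (pos - prev[axis]) / (cur[axis] - prev[axis])
            point = [a + t * (b - a) for a, b in zip(prev, cur)]
            point[axis] = pos  # avoid rounding drift on the plane itself
            out.append(tuple(point))
        if inside(cur):
            out.append(cur)
    return out


def aabb(points):
    """Tight axis-aligned bounding box (lo, hi) of a list of 3D points."""
    lo = tuple(min(p[k] for p in points) for k in range(3))
    hi = tuple(max(p[k] for p in points) for k in range(3))
    return lo, hi


def split_clip_triangle(tri, axis, pos):
    """Reduced AABBs of the triangle parts left and right of the plane;
    None is returned for a side that the triangle does not reach."""
    left = clip_polygon_by_plane(tri, axis, pos, True)
    right = clip_polygon_by_plane(tri, axis, pos, False)
    return (aabb(left) if left else None, aabb(right) if right else None)
```

For the triangle (0,0,0), (4,0,0), (0,4,0) and the plane x = 2, the right part receives a box that extends only up to y = 2, whereas simply cutting the original AABB at the plane would keep the full extent up to y = 4.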

4.7

General Cost Model

In the development of the surface area heuristic we assumed that a ray does not intersect any objects. Although such an assumption is rather unrealistic, the developed OSAH significantly improves the performance over the spatial median method for sparsely occupied scenes. In this section we extend the cost model to include the possibility that the ray traversal algorithm terminates in a child node because the ray intersects an object there. Thus, instead of using the upper bound of the estimate in the worst case, we use an average-case estimate based on more realistic assumptions. We call the proposed extended cost model the general cost model (abbreviated to GCM in the following text). Instead of formulating the total cost of the whole kd-tree, we deal directly with the positioning of a splitting plane inside an interior node, i.e., with the cost of a subdivided node (Eq. 4.14).

During the recursive ray traversal algorithm (Chapter 5) there are exactly four ways in which an arbitrary ray can traverse an interior node of a kd-tree: the left child only (subscript LO), the right child only (subscript RO), the left child first and the right child afterwards (subscript LR), and the right child first and the left child afterwards (subscript RL). We can then express the new estimated cost as the sum of the four estimated costs for the distinct traversal cases: $\hat{C}_{LO}$, $\hat{C}_{LR}$, $\hat{C}_{RL}$, and $\hat{C}_{RO}$. Further, we assume that a ray can intersect an object in one of the child nodes with a probability $p^T$. For convenience, a ray that intersects an object in AABB(ν), associated with a node ν, will be called a hit ray with respect to ν. Similarly, we call a ray that does not intersect any object in AABB(ν) a miss ray with respect to ν. Let us denote the case for hit rays by superscript T and the case for miss rays by superscript N. If we have a set of rays intersecting AABB(ν), then $p^T$ for the node ν can be estimated as the number of hit rays divided by the number of all rays shot inside AABB(ν). Then we can estimate the total cost $\hat{C}^{GCM}_{new}$ of an interior node ν to be subdivided as follows:

$$
\begin{aligned}
\hat{C}^{GCM}_{new}(\nu) &= \hat{C}_{LO} + \hat{C}_{LR} + \hat{C}_{RO} + \hat{C}_{RL} && (4.30)\\
\hat{C}_{LO} &= p_{LO} \cdot \left[ p^T_L\,\hat{C}^T_L + (1 - p^T_L)\,\hat{C}^N_L \right] && (4.31)\\
\hat{C}_{LR} &= p_{LR} \cdot \left[ p^T_L\,\hat{C}^T_L + (1 - p^T_L) \cdot \left( \hat{C}^N_L + p^T_R\,\hat{C}^T_R + (1 - p^T_R)\,\hat{C}^N_R \right) \right] && (4.32)\\
\hat{C}_{RO} &= p_{RO} \cdot \left[ p^T_R\,\hat{C}^T_R + (1 - p^T_R)\,\hat{C}^N_R \right] && (4.33)\\
\hat{C}_{RL} &= p_{RL} \cdot \left[ p^T_R\,\hat{C}^T_R + (1 - p^T_R) \cdot \left( \hat{C}^N_R + p^T_L\,\hat{C}^T_L + (1 - p^T_L)\,\hat{C}^N_L \right) \right], && (4.34)
\end{aligned}
$$


where

$p_{LO}$ (and likewise $p_{LR}$, $p_{RO}$, and $p_{RL}$) is described in Section 4.3.1,
$p^T_L$ – probability of a ray hitting an object in the left child node,
$p^T_R$ – probability of a ray hitting an object in the right child node,
$\hat{C}^T_L$ – estimated cost of the subtree in the left child only for hit rays,
$\hat{C}^N_L$ – estimated cost of the subtree in the left child only for miss rays,
$\hat{C}^T_R$ – estimated cost of the subtree in the right child only for hit rays,
$\hat{C}^N_R$ – estimated cost of the subtree in the right child only for miss rays.

The general cost model can be made even more detailed if we distinguish the estimated costs of hit and miss rays for all the traversal cases LO, RO, LR, and RL. Conversely, if we do not distinguish between the estimated cost of hit and miss rays ($\hat{C}^N_R = \hat{C}^T_R$, $\hat{C}^N_L = \hat{C}^T_L$), the cost model simplifies to the formulas:

$$
\begin{aligned}
\hat{C}_{LO} &= p_{LO}\,\hat{C}_L && (4.35)\\
\hat{C}_{LR} &= p_{LR} \left[ p^T_L\,\hat{C}_L + (1 - p^T_L) \cdot (\hat{C}_L + \hat{C}_R) \right] = p_{LR} \left[ \hat{C}_L + (1 - p^T_L)\,\hat{C}_R \right] && (4.36)\\
\hat{C}_{RO} &= p_{RO}\,\hat{C}_R && (4.37)\\
\hat{C}_{RL} &= p_{RL} \left[ p^T_R\,\hat{C}_R + (1 - p^T_R) \cdot (\hat{C}_L + \hat{C}_R) \right] = p_{RL} \left[ \hat{C}_R + (1 - p^T_R)\,\hat{C}_L \right], && (4.38)
\end{aligned}
$$

where $\hat{C}_L$ is the estimated cost of the left node and $\hat{C}_R$ is the estimated cost of the right node. Since the general cost model describes the use of a kd-tree in an RSA more realistically (for the average case and assuming the recursive ray traversal algorithm), it could give us a more efficient RSA based on the kd-tree. Obviously, we pay for this by an increased computational cost during the kd-tree construction. The formulation of the general cost model brings at least two subproblems: the estimation of the cost of a node ν separately for hit and miss rays, and the estimation of the probability $p^T$ that a ray intersects an object inside the node. We discuss these issues below.
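The general cost model of Eqs. 4.30 to 4.34 and its simplified form of Eqs. 4.35 to 4.38 translate directly into code. The sketch below is illustrative only; the function and parameter names are hypothetical, and the traversal probabilities and child costs are assumed to be supplied by the caller.

```python
def gcm_cost(p_lo, p_lr, p_ro, p_rl, pt_l, pt_r, ct_l, cn_l, ct_r, cn_r):
    """Estimated cost of a subdivided node following Eqs. 4.30-4.34.

    p_*  : probabilities of the four traversal cases (LO, LR, RO, RL)
    pt_* : blocking factors of the left / right child
    ct_* : child costs for hit rays, cn_* : child costs for miss rays
    """
    left = pt_l * ct_l + (1.0 - pt_l) * cn_l    # expected cost, left child
    right = pt_r * ct_r + (1.0 - pt_r) * cn_r   # expected cost, right child
    c_lo = p_lo * left
    c_ro = p_ro * right
    # A ray that misses in the first child continues into the second one.
    c_lr = p_lr * (pt_l * ct_l + (1.0 - pt_l) * (cn_l + right))
    c_rl = p_rl * (pt_r * ct_r + (1.0 - pt_r) * (cn_r + left))
    return c_lo + c_lr + c_ro + c_rl


def gcm_cost_simplified(p_lo, p_lr, p_ro, p_rl, pt_l, pt_r, c_l, c_r):
    """Simplified form (Eqs. 4.35-4.38): hit and miss costs are equal."""
    return (p_lo * c_l
            + p_lr * (c_l + (1.0 - pt_l) * c_r)
            + p_ro * c_r
            + p_rl * (c_r + (1.0 - pt_r) * c_l))
```

When the hit and miss costs of each child are set to the same value, both functions return the same result, which is a convenient check of the simplification. With both blocking factors equal to zero, the LR and RL cases pay for both children, as in the model without hits.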

4.7.1

Estimating Blocking Factor

The first subproblem arising with the general cost model is to estimate the probability that a ray intersects an object in the node ν, i.e., inside AABB(ν). This probability has to be evaluated for the left and right child of ν in Eqs. 4.31–4.34 for all tested positions of the splitting plane when evaluating the cost function. The problem was discussed by Reinhard et al. [122] for the leaves of spatial hierarchies; in that paper the probability $p^T$ that a ray hits an object is referred to as the blocking factor. The input data for a node ν are AABB(ν) and the set of N objects referenced in ν. The objects need not lie in AABB(ν) completely, but only partially. The opposite case is also possible: the tight AABB of all objects can be smaller than AABB(ν), since the objects were separated by splitting planes in previous subdivision step(s). In the latter case the probability that a ray intersects an object is lower.

First, we recall the method used by Reinhard et al. [122] for the leaves of the spatial hierarchy. They take the AABBs of objects and compute the blocking factor from the projected surface areas of the object AABBs inside the leaf AABB for all three axes. The geometry of the problem is depicted in Fig. 4.10. The blocking factor $p^T$ is then computed as:

$$p^T = \sum_{j=1}^{N} \frac{SA_x(AABB(O_j)) + SA_y(AABB(O_j)) + SA_z(AABB(O_j))}{SA_x(AABB(\nu)) + SA_y(AABB(\nu)) + SA_z(AABB(\nu))}, \tag{4.39}$$

where $SA_a(AABB(O_j))$ is the projection area of the AABB of the j-th object onto the plane perpendicular to the a-axis, $a \in \{x, y, z\}$. Similarly, $SA_a(AABB(\nu))$ is the surface area of the leaf cell in projection onto the plane perpendicular to the a-axis. In order to account for the overlapping of objects in the projection, they compute the projected surface area of the intersection for each pair of objects and add it to or subtract it from the total surface area covered by the objects. The method is thus limited to a small number of objects in the leaf, since the computation of the overlap of N objects suffers from combinatorial explosion. We can remark that the AABB associated with an object serves only as an approximation of the object, and the tightness of the AABB around the object should also be involved in the computation of the blocking factor.

Figure 4.10: A cell in projection to the z-axis. (a) One object in a cell. (b) Two objects in a cell that overlap in projection to the plane perpendicular to the y-axis.

Estimating $p^T$ in general for many objects in the node remains a problem. One simple way is to sample the space of the node with $N_{rays}$ rays and to record the terminating positions of the rays and their entry and exit points for the given AABB. The number of sample rays $N_{rays}$ must be high enough, assuming the rays are uniformly distributed. If we sort the successful ray intersection points along the axis to be subdivided, we can estimate the blocking factor $p^T$ for the left and right child for each position of the splitting plane. This algorithm for the blocking factor can be costly, since many ray-object intersection tests have to be computed. Efficient sampling assumes that another kd-tree is constructed in advance and used for the sampling rays. In order to achieve some precision of the blocking factor estimate, the number of sampling rays should be sufficiently high; we assume at least on the order of $N_{rays} = 10^2$. If the constructed kd-tree has $N_{IN}$ interior nodes, the total cost of the estimation algorithm could be unacceptably high. We implemented this algorithm, and the results of applying it to the general cost model are presented in Subsection 4.10.5.
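The projection-based estimate of Eq. 4.39 can be sketched as follows. This is a simplified illustration rather than the method of [122]: the pairwise overlap correction is omitted, the object AABBs are clipped to the cell before projection, and the result is clamped to 1. The clipping and the clamp are assumptions made here, and the function names are hypothetical.

```python
def projected_areas(lo, hi):
    """Areas of a box projected onto the planes perpendicular to x, y, z."""
    dx, dy, dz = (h - l for l, h in zip(lo, hi))
    return (dy * dz, dx * dz, dx * dy)


def blocking_factor(object_boxes, cell_lo, cell_hi):
    """Estimate p^T for a cell from the AABBs of the referenced objects.

    object_boxes is a list of (lo, hi) corner pairs. Overlaps between
    objects are ignored, so the estimate is an upper bound, clamped to 1.
    """
    cell_sum = sum(projected_areas(cell_lo, cell_hi))
    covered = 0.0
    for lo, hi in object_boxes:
        c_lo = tuple(max(a, b) for a, b in zip(lo, cell_lo))
        c_hi = tuple(min(a, b) for a, b in zip(hi, cell_hi))
        if any(h < l for l, h in zip(c_lo, c_hi)):
            continue  # the object AABB does not reach into the cell
        covered += sum(projected_areas(c_lo, c_hi))
    return min(1.0, covered / cell_sum)
```

For a unit cell and a single object box filling its left half, the projected areas are 1 + 0.5 + 0.5 against 3 for the cell, which gives a blocking factor of 2/3.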

4.7.2

Cost Estimate for Hit and Miss Rays

The second subproblem induced by the general cost model is the cost estimate for hit and miss rays. Let us present arguments concerning the ratio between the costs of hit rays and miss rays. First, the cost of hit rays could be higher than that of miss rays, since it must include at least one ray-object intersection test. Second, the cost of hit rays could be lower than that of miss rays, since for miss rays a ray traversal algorithm has to perform many traversal steps and possibly several unsuccessful ray-object intersection tests. The question is whether the ratio between the cost of hit and miss rays can be predicted in a simple way. We performed a set of experiments on the SPD scenes (G3SPD, G4SPD, and G5SPD). The results are surveyed in Table 4.2, where we report the subset ∆ of the minimum testing output. The superscript miss denotes the case for miss rays, the superscript hit the case for hit rays. The results in Table 4.2 show that for the tested scenes the average number of ray-object intersection tests per ray $\tilde{N}_{IT}$ is on average lower for miss rays than for hit rays. The same holds for the average number of traversal steps per ray $\tilde{N}_{TS}$. Unfortunately, the ratio between the cost for miss rays and hit rays is not constant and thus cannot be determined in advance. For example, for the scene "lattice" the cost for hit rays is lower than for miss rays, since the miss rays must traverse more nodes to get out of the scene AABB. In general, the ratio between the cost of hit rays and miss rays is difficult or almost impossible to predict.

Scene      Group  N       Ñ_IT^hit  Ñ_IT^miss  Ñ_TS^hit  Ñ_TS^miss  Ñ_ETS^hit  Ñ_ETS^miss  Ñ_EETS^hit  Ñ_EETS^miss
balls3     G3SPD  821     6.52      -26%       21.4      -24%       4.11       -10%        0.956       +0%
balls4     G4SPD  7382    7.37      -30%       30.9      -25%       5.61       -9%         1.54        +8%
balls5     G5SPD  66341   8.25      -35%       40.7      -27%       7.10       -14%        2.12        +2%
gears2     G3SPD  1169    3.71      -9%        19.8      -18%       3.58       -26%        0.733       -51%
gears4     G4SPD  9345    4.11      -34%       26.6      -30%       4.28       -40%        1.09        -68%
gears9     G5SPD  106435  4.44      -38%       32.5      -33%       4.67       -45%        1.09        -70%
jacks3     G3SPD  657     9.40      -55%       33.4      -28%       5.87       -13%        1.48        +85%
jacks4     G4SPD  5265    12.4      -55%       50.8      -25%       8.73       -25%        2.18        +89%
jacks5     G5SPD  42129   14.7      -60%       67.2      -27%       11.2       -18%        2.80        +92%
lattice6   G3SPD  1255    3.59      +13%       33.8      +19%       5.61       +34%        2.17        +65%
lattice12  G4SPD  8281    4.12      +4%        41.9      +5%        6.60       +14%        2.78        +26%
lattice29  G5SPD  105307  4.33      -10%       56.0      -13%       7.74       -8%         3.55        -6%
mount4     G3SPD  516     4.08      -48%       15.4      +2%        2.81       +12%        0.786       +87%
mount6     G4SPD  8196    4.36      -39%       21.6      -5%        3.66       +5%         1.36        +52%
mount8     G5SPD  131076  4.03      -43%       24.9      -30%       4.23       -20%        1.60        +4%
rings3     G3SPD  841     9.63      -29%       26.0      -16%       4.82       -4%         1.29        +57%
rings7     G4SPD  8401    11.2      -31%       43.9      -18%       7.55       -9%         2.76        +30%
rings17    G5SPD  107101  11.2      -33%       63.9      -15%       10.6       -8%         4.39        +26%
sombrero1  G3SPD  1922    3.04      -56%       24.7      -50%       4.80       -53%        2.92        -51%
sombrero2  G4SPD  7938    3.05      -56%       29.9      -52%       5.53       -53%        3.68        -50%
sombrero4  G5SPD  130050  3.43      -56%       41.4      -44%       7.18       -54%        5.00        -52%
teapot4    G3SPD  1008    5.20      -31%       23.3      -13%       4.71       -7%         2.00        +26%
teapot12   G4SPD  9264    4.57      -36%       33.8      -19%       6.19       -15%        3.23        +5%
teapot40   G5SPD  103680  5.03      -37%       45.6      -23%       7.82       -19%        4.61        -50%
tetra5     G3SPD  1024    5.53      -79%       23.4      -61%       4.50       -60%        3.08        -50%
tetra6     G4SPD  4096    5.68      -81%       31.3      -64%       5.75       -63%        4.28        -54%
tetra8     G5SPD  65536   5.91      -96%       51.1      -68%       8.95       -67%        7.39        -63%
tree8      G3SPD  1023    4.70      -31%       17.0      -35%       4.28       -30%        0.465       +11%
tree11     G4SPD  8191    4.86      -34%       20.7      -33%       4.82       -27%        0.842       +10%
tree15     G5SPD  131071  5.75      -60%       27.9      -31%       6.27       -29%        1.41        +6%
Average    –      –       –         -40.3%     –         -28.7%     –          -22%        –           4%

Table 4.2: The ∆ subset of the minimum testing output measured for miss and hit rays for testing procedure TPD for the G3SPD, G4SPD, and G5SPD scenes. Values for miss rays are given in % relative to the values for hit rays.

Since the cost estimate for hit rays and miss rays is not easily predictable, we are forced to use one of the following ways:

- We can disregard the concept of distinguishing between the costs for hit and miss rays and thus use the simpler form of GCM, i.e., Eqs. 4.35–4.38. In this case we use, for example, a linear cost estimate, but the blocking factor is estimated via sampling.

- Similarly to the estimation of $p^T$, we can evaluate the costs experimentally, assuming that a sampling kd-tree is built in advance. The sampling kd-tree is not the kd-tree we are currently building, but it can be taken as a rough approximation of it with similar characteristics, since it was constructed with a similar algorithm for the same input data. Then the number of traversal steps and the number of ray-object intersection tests can be estimated well enough. The error of such an estimate remains unknown.

Finding a simple way of estimating the cost for hit and miss rays, given a node with its objects, remains a difficult problem. We thus implemented the sampling strategy for estimating the costs. The results of applying these estimates to the general cost model are presented in Subsection 4.10.5.


4.8

Preferred Ray Sets

The basic cost model (Eq. 4.9), and thus the surface area heuristic algorithms developed so far, are based on the assumption that rays are distributed uniformly in ray space. This enables us to compute the probability that an arbitrary ray hits the AABB(ν) associated with the node ν using Eq. 4.4. If an RSA is applied, for example, in a global illumination rendering algorithm, the algorithm can imply some particular ray distribution for a given scene and viewpoint. The question is whether this knowledge about the expected ray distribution can be used to build a kd-tree such that an RSA based on this kd-tree has improved performance for this ray distribution. In the following, we drop the assumption of a uniform ray distribution and change the equations presented above to consider only a certain ray set with a particular ray distribution. We focus on three types of ray sets that correspond to parallel, perspective, and spherical projections, and show how Eq. 4.4 changes for each of them. The idea of kd-tree construction for preferred ray sets was introduced by Havran and Bittner [76, 77].

4.8.1

Parallel Projection

Probably the most intuitive set of rays is formed by fixing their direction. Rays with the same direction are involved in the parallel projection of the scene [157]. In this case the rays are perpendicular to the projection plane $\Pi_P$. Let us suppose that the ray distribution on $\Pi_P$ is uniform. Additionally, we can restrict the preferred set of rays to a certain viewport. The probability that a ray hits the AABB(ν) associated with a node ν of the kd-tree can then be expressed using the surface area of the projection of AABB(ν) to $\Pi_P$, clipped to the viewport. The corresponding geometry is depicted in Fig. 4.11.

Figure 4.11: Parallel projection of an AABB to the plane $\Pi_P$.

Let the viewport $W_R$ be a rectangular window on the projection plane $\Pi_P$. Let $S_R^{PAR}(\Pi_P, W_R)$ be the set of rays perpendicular to $\Pi_P$ and intersecting $W_R$. The silhouette edges of AABB(ν) projected onto $\Pi_P$ form a convex polygon PC(AABB(ν)). Equivalently, we can define PC(AABB(ν)) as the convex hull of all the vertices of AABB(ν) projected to $\Pi_P$. In order to determine the polygon $PC^{clip}(AABB(\nu)) = W_R \cap PC(AABB(\nu))$ lying on $\Pi_P$, a clipping algorithm must be applied [157]. Let $SA_{PAR}(X)$ be the surface area of the entity X on the plane $\Pi_P$ and let AABB(S) be the axis-aligned bounding box containing the whole scene S. Similarly to Eq. 4.4, we can express the probability that a ray from $S_R^{PAR}(\Pi_P, W_R)$ hits AABB(ν) once it passes through AABB(S) as follows:

$$p(AABB(\nu) \mid AABB(S)) = \frac{SA_{PAR}(PC^{clip}(AABB(\nu)))}{SA_{PAR}(PC(AABB(S)))} \tag{4.40}$$

Eq. 4.40 can be used to replace directly the probabilities in Eq. 4.9 for both left and right child nodes. We call the surface area heuristic modified for parallel projection a parallel surface area heuristic (abbreviated to PARSAH).
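The probability of Eq. 4.40 can be evaluated with elementary 2D geometry. The following sketch uses the convex hull variant of PC(AABB(ν)): it projects the eight box vertices, builds their convex hull, clips it to the viewport with the Sutherland-Hodgman algorithm, and computes the area with the shoelace formula. It assumes that the caller supplies two orthonormal vectors u and v spanning the projection plane and a viewport rectangle expressed in these (u, v) coordinates; all names are hypothetical.

```python
def cross2(o, a, b):
    """z-component of the cross product (a - o) x (b - o)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Andrew's monotone chain; returns the hull in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) < 3:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross2(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross2(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def clip_to_rect(poly, umin, vmin, umax, vmax):
    """Sutherland-Hodgman clipping of a convex polygon to a rectangle."""
    for k, bound, keep_ge in ((0, umin, True), (0, umax, False),
                              (1, vmin, True), (1, vmax, False)):
        out = []
        for i, cur in enumerate(poly):
            prev = poly[i - 1]
            cur_in = cur[k] >= bound if keep_ge else cur[k] <= bound
            prev_in = prev[k] >= bound if keep_ge else prev[k] <= bound
            if cur_in != prev_in:
                t = (bound - prev[k]) / (cur[k] - prev[k])
                out.append((prev[0] + t * (cur[0] - prev[0]),
                            prev[1] + t * (cur[1] - prev[1])))
            if cur_in:
                out.append(cur)
        poly = out
    return poly


def polygon_area(poly):
    """Shoelace formula; returns 0 for fewer than three vertices."""
    return 0.5 * abs(sum(poly[i - 1][0] * poly[i][1] - poly[i][0] * poly[i - 1][1]
                         for i in range(len(poly))))


def projected_polygon(lo, hi, u, v):
    """Convex polygon PC of the box (lo, hi) in (u, v) plane coordinates."""
    corners = [(x, y, z) for x in (lo[0], hi[0])
               for y in (lo[1], hi[1]) for z in (lo[2], hi[2])]
    dot = lambda a, b: sum(p * q for p, q in zip(a, b))
    return convex_hull([(dot(c, u), dot(c, v)) for c in corners])


def parsah_probability(node, scene, u, v, viewport):
    """Eq. 4.40 for boxes node = (lo, hi), scene = (lo, hi) and a viewport
    (umin, vmin, umax, vmax) given in plane coordinates."""
    clipped = clip_to_rect(projected_polygon(node[0], node[1], u, v), *viewport)
    return polygon_area(clipped) / polygon_area(projected_polygon(scene[0], scene[1], u, v))
```

As a sanity check, a unit cube projected along its body diagonal yields a regular hexagon of area √3. Building the hull from all eight vertices is the simpler but more redundant variant; determining the silhouette directly avoids the hull construction.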

4.8.2

Perspective Projection

We form another preferred ray set by fixing the origin of the rays. As for parallel projection, we focus only on rays passing through a certain viewport. The origin of the rays (viewpoint) $O_R$ and the viewport define a viewing frustum, see Fig. 4.12. Assuming that the rays are uniformly distributed on the viewport, the corresponding ray set $S_R^{PER}(\Pi_P, W_R)$ contains the rays involved in the commonly used perspective projection that intersect the viewport $W_R$.

Figure 4.12: Perspective projection of an AABB to the plane $\Pi_P$.

Let PC(X) be the polygon obtained by projecting an axis-aligned bounding box X perspectively to the plane $\Pi_P$, and let $SA_{PER}(PC(X))$ be the surface area of the polygon PC(X) on the projection plane. In order to compute the projected surface area of AABB(ν), which is associated with the node ν, a clipping to the viewing frustum must be applied, $PC^{clip}(AABB(\nu)) = W_R \cap PC(AABB(\nu))$, similarly to the parallel projection. The conditional probability that a ray from $S_R^{PER}(\Pi_P, W_R)$ hits AABB(ν) once it passes through the axis-aligned bounding box AABB(S) of the whole scene can be expressed as:

$$p(AABB(\nu) \mid AABB(S)) = \frac{SA_{PER}(PC^{clip}(AABB(\nu)))}{SA_{PER}(PC(AABB(S)))} \tag{4.41}$$

We call the surface area heuristic modified for perspective projection a perspective surface area heuristic (abbreviated to PERSAH). The clipping of the AABB to the viewport must always be applied, hence the speed of the clipping is crucial for the build time $T_B$ of the kd-tree. In the following text, we outline the algorithms for clipping and for computing the surface area of the projected AABB.

Viewing Frustum Clipping

A general algorithm determining the surface area of the projection of an AABB to the projection plane $\Pi_P$ can be as follows: project all the vertices of the AABB to $\Pi_P$ and construct the convex hull of the projected vertices. This convex hull corresponds to a certain polygon. Then clip the polygon with respect to the viewport and compute its surface area. Obviously, the projection of all eight vertices and the construction of the convex hull of the AABB for a specific viewpoint is both costly and computationally redundant. We need to construct only the projection of the silhouette of the AABB, since then the maximum number of projected points is six and the minimum is four. According to the mutual position of the viewpoint and the AABB we can identify twenty-seven regions for which the projected AABB has the same sequence of silhouette edges. These regions are induced by the six planes that determine the AABB. Given a viewpoint $O_R$ and an AABB, the corresponding sequence of silhouette edges can be determined by a table lookup and easily projected to $\Pi_P$. The sequence of silhouette edges is clipped in object space by the Sutherland-Hodgman algorithm [157] using the four planes that bound the viewing frustum. The result of clipping is a sequence of connected edges with at most ten vertices. The vertices are projected onto the projection plane, forming a convex polygon. Finally, the surface area of the polygon is computed. A special case occurs when the viewpoint lies inside the AABB. Obviously, every ray originating at such a viewpoint must always intersect the AABB, and hence the conditional probability used in Eq. 4.41 is equal to one. When this case occurs, it could degenerate the result of the cost function used in PERSAH, and we must use another method to position the splitting plane, either the spatial median method or OSAH.
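The classification of the viewpoint into one of the twenty-seven regions can be sketched as follows. The sketch only computes the region code and the number of silhouette vertices; the lookup table with the actual vertex sequences is not reproduced here, and the function names are hypothetical.

```python
def region_code(viewpoint, lo, hi):
    """Classify the viewpoint against the six planes of the box (lo, hi).

    Returns one value per axis: -1 (below the slab), 0 (within the slab),
    or +1 (above the slab). There are 3^3 = 27 possible codes.
    """
    return tuple(-1 if p < l else (1 if p > h else 0)
                 for p, l, h in zip(viewpoint, lo, hi))


def silhouette_vertex_count(code):
    """Number of silhouette vertices of the box seen from the given region.

    One visible face gives a quadrilateral silhouette, two or three visible
    faces give a hexagon; code (0, 0, 0) means the viewpoint is inside.
    """
    visible_faces = sum(1 for c in code if c != 0)
    if visible_faces == 0:
        return 0  # viewpoint inside the box: probability 1, no silhouette
    return 4 if visible_faces == 1 else 6
```

In an implementation along the lines described above, the code would serve as the index into the lookup table of silhouette vertex sequences, and the code (0, 0, 0) would trigger the fallback to the spatial median method or OSAH.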

4.8.3

Spherical Projection

Another set of rays is induced by those involved in spherical projection. As with perspective projection, the origin of the rays is fixed. The rays induced by spherical projection are not uniformly distributed on a projection plane, but on a sphere centered at the origin of the rays. Let us denote the set of rays of spherical projection $S_R^{SPH}(\Pi_P)$; note that no viewport need be defined in this case, thus the clipping phase present in perspective projection can be omitted. A kd-tree built according to such a set of rays could be used to accelerate ray queries for point light sources [157].

Since the ray distribution in $S_R^{SPH}(\Pi_P)$ is uniform on the sphere, the task is to compute the solid angle Ω induced by the frustum enclosing a given AABB with respect to the center of projection. This task is similar to the computation of the point-to-polygon form factor [62]. Nevertheless, in our case it involves determining a region bounded by a Jordan curve [47] on a unit sphere. The solid angle induced by a given AABB can be computed as the sum of the solid angles of all faces visible from a given viewpoint. The solid angle Ω of a rectangle with height h and width w, positioned with one corner at the origin and perpendicular to the z-axis, can be computed in the following way (see Fig. 4.13), where c denotes the distance of the viewpoint from the plane of the rectangle:

$$l = \sqrt{x^2 + y^2 + c^2}, \qquad \cos\beta = \frac{c}{l}, \qquad dA = dx\,dy, \qquad d\Omega = \frac{dA \cos\beta}{l^2} = \frac{c\,dx\,dy}{\left(\sqrt{x^2 + y^2 + c^2}\right)^3} \tag{4.42}$$

The solid angle is then computed by integration:

$$\Omega\,[sr] = \int_{x=0}^{w} \int_{y=0}^{h} \frac{c\,dx\,dy}{\left(\sqrt{x^2 + y^2 + c^2}\right)^3} \tag{4.43}$$

Solving this integral with the substitution $u = x/c$ and $v = y/c$ results in:

$$\Omega\,[sr] = \arctan\left(\frac{w\,h}{c\,\sqrt{c^2 + h^2 + w^2}}\right) \tag{4.44}$$

The computation of the solid angle for a rectangle whose corner is not positioned at the origin is performed by a more general solution of the equation above. Then we can compute the solid angle taken by all three faces of the AABB that can be visible from a viewpoint. The conditional probability with respect to the AABB(S) of the whole scene S is then:

$$p(AABB(\nu) \mid AABB(S)) = \frac{\Omega(AABB(\nu))}{\Omega(AABB(S))} \tag{4.45}$$

We call the surface area heuristic modified for spherical projection (Eq. 4.45) a spherical surface area heuristic (abbreviated to SPHSAH).

Figure 4.13: Computation of spherical projection for a rectangle.

It is obvious that for a viewpoint located inside a given AABB the corresponding solid angle is $\Omega = 4\pi$, thus the probability $p(AABB(\nu) \mid AABB(S))$ is equal to one. In this case, similarly to PERSAH, it is necessary to use another way of positioning the splitting plane.
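Eq. 4.44 and its extension to rectangles in general position can be sketched as follows. The general case is obtained here by combining four signed corner rectangles, which is one possible form of the "more general solution" and is an assumption of this sketch; the function names are hypothetical. A viewpoint on the boundary of the box is treated as lying inside.

```python
import math


def corner_rect_solid_angle(w, h, c):
    """Eq. 4.44: solid angle of the rectangle [0, w] x [0, h] lying in a
    plane at distance c > 0, seen from the point above its corner."""
    return math.atan(w * h / (c * math.sqrt(c * c + w * w + h * h)))


def rect_solid_angle(x0, x1, y0, y1, c):
    """Solid angle of the rectangle [x0, x1] x [y0, y1] at distance c > 0,
    combined from signed corner rectangles (inclusion-exclusion)."""
    def signed(x, y):
        sign = math.copysign(1.0, x) * math.copysign(1.0, y)
        return sign * corner_rect_solid_angle(abs(x), abs(y), c)
    return signed(x1, y1) - signed(x0, y1) - signed(x1, y0) + signed(x0, y0)


def aabb_solid_angle(viewpoint, lo, hi):
    """Solid angle of the box (lo, hi): sum over the faces visible from the
    viewpoint; 4*pi if the viewpoint lies inside (or on) the box."""
    if all(l <= p <= h for p, l, h in zip(viewpoint, lo, hi)):
        return 4.0 * math.pi
    total = 0.0
    for axis in range(3):
        a, b = (axis + 1) % 3, (axis + 2) % 3
        if viewpoint[axis] < lo[axis]:
            c = lo[axis] - viewpoint[axis]      # the low face is visible
        elif viewpoint[axis] > hi[axis]:
            c = viewpoint[axis] - hi[axis]      # the high face is visible
        else:
            continue                            # neither face is visible
        total += rect_solid_angle(lo[a] - viewpoint[a], hi[a] - viewpoint[a],
                                  lo[b] - viewpoint[b], hi[b] - viewpoint[b], c)
    return total
```

The probability of Eq. 4.45 is then the ratio `aabb_solid_angle(viewpoint, node_lo, node_hi) / aabb_solid_angle(viewpoint, scene_lo, scene_hi)`. A useful check is the face of a cube seen from the cube center, which must subtend one sixth of the full sphere, i.e., 2π/3.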

4.8.4

Discussion

Below we discuss the properties of the surface area heuristic for preferred ray sets. First, we discuss the parallel surface area heuristic for the normal $\vec{N}_{\Pi_P} = (x, y, z)$ of the projection plane $\Pi_P$. We can easily prove that the kd-tree constructed with PARSAH for a normal with $|x| = |y| = |z|$ corresponds to the kd-tree constructed with OSAH (Eq. 4.14). There are exactly eight vectors on a unit sphere with this property. Hence, the cost estimate of the kd-tree constructed with PARSAH for vectors other than these eight can potentially be decreased, and thus an RSA based on such a kd-tree, assuming the required ray distribution, would be more efficient.

One problem can arise with the clipping of the projected AABB when the constructed kd-tree is used for rays outside the viewing frustum. Since the probability computed using Eq. 4.40 is zero there, the constructed kd-tree can degenerate, and this can negatively influence the cost for a ray traversing this part of the kd-tree. We have found two solutions to this problem. The first detects the situation and uses OSAH for any node of the kd-tree that is partially or fully outside the viewing frustum. Then even the nodes on the boundary of the viewing frustum are constructed using OSAH. The second solution is simpler: we can approximate the perspective projection by spherical projection in all cases where the solid angle taken by the viewing frustum is small. For so-called paraxial rays (maximum angle between any two rays up to 5°) the approximation is sufficiently precise. Then the ray distribution for spherical and perspective projection is similar and no clipping is performed.
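The correspondence between PARSAH and OSAH for $|x| = |y| = |z|$ can be checked with the standard formula for the area of the parallel projection of a box. The following short derivation is a sketch added for illustration; it ignores viewport clipping, and a, b, c are names introduced here for the edge lengths of AABB(ν) along the x, y, and z axes. For a unit normal $(x, y, z)$ the projected area is

$$SA_{PAR}(PC(AABB(\nu))) = |x|\,bc + |y|\,ac + |z|\,ab .$$

For $|x| = |y| = |z| = 1/\sqrt{3}$ this becomes

$$SA_{PAR}(PC(AABB(\nu))) = \frac{ab + bc + ac}{\sqrt{3}} = \frac{SA(AABB(\nu))}{2\sqrt{3}} ,$$

so the constant factor cancels in the ratio of Eq. 4.40, which then reduces to the ratio of the surface areas of the two boxes, $SA(AABB(\nu)) / SA(AABB(S))$. For any other normal the three faces are weighted differently, and the probabilities generally differ from those based on the surface area.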


4.9

Time Complexity Analysis

Until now we have not analyzed the time complexity of kd-tree construction. We show below that for N objects in the scene the time complexity of the construction algorithms above is O(N log N), but with various multiplicative factors hidden behind the O-notation.

For the analysis we assume that the constructed kd-tree has a number of leaves linear in N. This corresponds to the use of the automatic termination criteria algorithm developed in Section 4.5, where the computation of dmax is derived from N. Since each object is referenced in O(N_ER / N) leaves on average, the number of leaves is at most N_ER. For ad hoc termination criteria the number of leaves is, due to the constant value of dmax, formally O(1), which is not usable for any analysis at all.

For the kd-tree construction based on the spatial median method (see Subsection 4.2.2), given a node ν with N objects, we put the splitting plane at a known position. If we do not sort the objects according to their mutual position before the construction, we must test all the objects to find whether they belong to the right or the left node, which takes O(N) time. The total number of objects associated with the interior nodes at one depth of the kd-tree is also linear in N, and the depth of the kd-tree is O(log N), thus resulting in O(N log N) for the whole kd-tree. When we sort the objects before splitting the root node, sorting takes O(N log N) time, which again results in O(N log N) for kd-tree construction.

For the kd-tree construction based on the surface area heuristic algorithms we first sort the boundaries of the axis-aligned bounding boxes (AABBs) associated with the objects in the scene along all three axes. Sorting itself takes O(N log N) time. In E^3, for each subdivision step we have three lists of sorted boundaries and we evaluate the cost function for each possible position of the splitting plane. When we select the splitting plane Π to be used, we subdivide the lists into two halves, duplicating the AABBs of objects split by Π. Both testing and splitting of the lists take O(N) time. The depth of the constructed kd-tree is also O(log N), which again results in O(N log N) time complexity for kd-tree construction.

The running time consumed by particular operations of surface area heuristic algorithms within the kd-tree construction can vary greatly. It includes computing the estimated cost, estimating the blocking factor and the costs of subtrees, projecting and clipping within the general cost model, etc. The time needed to construct the kd-tree is expected to be saved within the execution phase of an RSA based on the kd-tree, which is justified by the number of ray shooting queries. We discuss these experimental results in the next section.
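The level-by-level argument above can be written as a short bound. This is only a sketch: the constants c and c' and the quantity N_d (the number of object references processed at depth d) are symbols introduced here for illustration and are not part of the original notation.

```latex
T_{build}(N) \;\le\; \sum_{d=0}^{d_{max}} c \, N_d
\;\le\; (d_{max} + 1) \, c \, c' N
\;=\; O(\log N) \cdot O(N) \;=\; O(N \log N),
\qquad \text{assuming } N_d \le c' N \text{ and } d_{max} = O(\log N).
```

The initial sorting of the boundaries adds another O(N log N) term, which does not change the overall bound.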

4.10 Summary of Results and Discussion

In this section we survey the experimental results concerning kd-tree construction. The experiments were performed on the above-mentioned G3SPD, G4SPD, and G5SPD test scenes from the SPD package; see Section 3.5 for more characteristics of the scenes. Preferably, we report the experimental results in the form of minimum testing outputs (Section 2.5), that is, a set of 13 parameters for each experiment. To decrease the space taken by tables we decided to use only the testing procedure TPD whenever applicable. The results of the first phase of the BES project (see Chapter 3) indicate that the results of using TPA, TPB, and TPC are sufficiently similar. The tables in Appendix E show the minimum testing output for each scene and for each tested RSA based on the kd-tree with a particular construction algorithm (line 0 in the tables in Appendix E shows the result for a “naïve RSA”). The results in this section only summarize the results for a particular construction algorithm.

Some of the presented methods for kd-tree construction can be combined, and this is sometimes also necessary. Obviously, late cutting off empty space and two-plane cutting off empty space must be combined with the ordinary surface area heuristic (OSAH). In kd-tree construction with the use of a general cost model it would be possible to use the ray distribution for some preferred ray set, but we show the results only for the testing procedure TPD (see Subsection 3.4.1). The termination criteria are just another part of the kd-tree construction algorithm, split clipping of objects’ AABBs has to be applied on the fly, and leaf pruning in postprocessing. We thus do not report all the possible combinations of the methods described in this chapter, but only the reference algorithm and a new method, which allows an easy comparison. If not stated otherwise, we compare ΘRUN of the presented RSA with that of the reference one, so that only the time devoted to ray shooting is compared. For all the tests presented in this chapter we used the recursive ray traversal algorithm TABrec (described in Section 5.4).

4.10.1 Positioning of the Splitting Plane

Let us summarize the results for positioning of the splitting plane, as stated in Section 4.2.2. The results are given in full in Appendix E; here we describe the settings for the experiments. Each setting is described for a line X, which corresponds to the notation in Appendix E:

Line 1: kd-tree constructed with a spatial median using ad hoc termination criteria (dmax = 16, Nmax = 2) and change of the splitting plane orientation in cyclic order (x, y, z, x, ...).

Line 2: kd-tree constructed with an object median using ad hoc termination criteria (dmax = 16, Nmax = 2) and change of the splitting plane orientation in cyclic order (x, y, z, x, ...).

Line 3: kd-tree constructed with an object median using ad hoc termination criteria (dmax = 16, Nmax = 2). The orientation of the splitting plane is taken so as to split the smallest possible number of objects.

Line 4: kd-tree constructed with OSAH using ad hoc termination criteria (dmax = 16, Nmax = 2).

Line 5: kd-tree constructed with OSAH using ad hoc termination criteria (dmax = 16, Nmax = 2). The search for the minimum cost splitting plane is restricted to the median interval.

Line 6: kd-tree constructed with OSAH using ad hoc termination criteria (dmax = 16, Nmax = 2) and change of the splitting plane orientation in cyclic order (x, y, z, x, ...).
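The two median rules with cyclic orientation used for lines 1 and 2 can be sketched as follows. This is a minimal illustration, not the thesis implementation: the function names are hypothetical, and representing each object by a single centroid for the object median is an assumption made here for brevity.

```python
from statistics import median

def spatial_median(box_min, box_max, depth):
    """Split the node's box in the middle of the axis chosen in cyclic order."""
    axis = depth % 3  # x, y, z, x, ...
    return axis, 0.5 * (box_min[axis] + box_max[axis])

def object_median(centroids, depth):
    """Split so that roughly half of the objects lie on each side.

    Assumption: every object is represented by one centroid point.
    """
    axis = depth % 3  # x, y, z, x, ...
    return axis, median(c[axis] for c in centroids)
```

The spatial median depends only on the box of the node, while the object median depends only on the object distribution; neither evaluates a cost function, unlike OSAH in lines 4 to 6.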

For the following discussion on kd-tree construction techniques, we select as a reference the construction algorithm given by line 4, which achieved the best results on average. The results are summarized in Table 4.3.

Line  Avg      CountBetter  Best   BestName   Worst     WorstName
4     reference algorithm
1     +1804%   0            +25%   jacks5     +36604%   tree15
2     +354%    2            -7%    lattice29  +1755%    teapot40
3     +577%    1            -3%    lattice29  +2764%    teapot40
5     -3%      24           -12%   tetra5     +5%       gears9
6     +7%      11           -8%    mount4     +88%      sombrero4

Table 4.3: Summary table for positioning of the splitting plane.

The spatial median (line 1) always achieved worse performance than the reference algorithm. On average for the 30 SPD scenes it was 1804% slower than the reference (line 4), in the best case 25% slower (“jacks5”), and in the worst case 36604% slower (“tree15”). The lack of performance is particularly remarkable for large scenes and for sparsely occupied scenes (“treeX”). For highly occupied scenes (“gearsX”) the decrease in performance is only small. The maximum leaf depth set to 16 does not allow the BSP tree to adapt well to highly occupied spatial regions.

The object median with change of the splitting plane orientation in cyclic order (line 2) had on average higher performance than the spatial median. However, it was on average 354% slower than the reference, in the best case 7% faster (“lattice29”), and in the worst case 1755% slower (“teapot40”). With an increasing number of objects the method reaches worse performance.

The object median with arbitrary change of the splitting plane orientation (line 3) shows similar characteristics to line 2, but has even worse performance. It was on average 577% slower than the reference, in the best case 3% faster (“lattice29”), and in the worst case 2764% slower (“teapot40”). Moreover, the time required for construction is also higher than for line 2, since the possible plane position at the object median has to be evaluated for all three axes.

OSAH restricted to the median interval (line 5) achieved practically the same results as the reference solution (line 4). The constructed BSP trees have practically the same number of leaves, and the timings differ only slightly. On average, it was 3% faster than the reference, in the best case 12% faster (“tetra5”), and in the worst case 5% slower (“gears9”).

OSAH with change of the splitting plane orientation in cyclic order (line 6) was on average 7% slower than the reference, in the best case 8% faster (“mount4” and “teapot4”), and in the worst case 88% slower (“sombrero4”). The interesting property of this construction algorithm is the reduction of the build time, which is particularly noticeable for G5SPD scenes with a high number of objects.
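One plausible reading of the columns of the summary tables in this section can be sketched in code. The exact definitions are an assumption made here: Avg is taken as the mean over the scenes of the relative difference of ΘRUN against the reference, CountBetter as the number of scenes with a negative difference, and Best/Worst as the extreme differences with their scene names. The function name and the dictionary layout are hypothetical.

```python
def summarize(ref_times, cand_times):
    """Summarize a candidate RSA against the reference one.

    ref_times, cand_times: dict mapping scene name -> running time (ΘRUN).
    Returns percentages; negative values mean the candidate is faster.
    """
    diffs = {
        scene: 100.0 * (cand_times[scene] - ref_times[scene]) / ref_times[scene]
        for scene in ref_times
    }
    best = min(diffs, key=diffs.get)    # most negative difference
    worst = max(diffs, key=diffs.get)   # most positive difference
    return {
        "avg": sum(diffs.values()) / len(diffs),
        "count_better": sum(1 for d in diffs.values() if d < 0),
        "best": (diffs[best], best),
        "worst": (diffs[worst], worst),
    }
```

Under this reading a single pathological scene, such as “tree15” for line 1, can dominate the average, which is why the Best and Worst columns are reported alongside it.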

4.10.2 Termination Criteria

Here we summarize the results for the termination criteria. We tested the termination criteria for the following settings:

Line 7: kd-tree constructed with OSAH using ad hoc termination criteria (dmax = 8, Nmax = 1).

Line 8: kd-tree constructed with OSAH using ad hoc termination criteria (dmax = 8, Nmax = 2).

Line 9: kd-tree constructed with OSAH using ad hoc termination criteria (dmax = 16, Nmax = 1).

Line 10: kd-tree constructed with OSAH using ad hoc termination criteria (dmax = 16, Nmax = 2).

Line 11: kd-tree constructed with OSAH using ad hoc termination criteria (dmax = 24, Nmax = 1).

Line 12: kd-tree constructed with OSAH using ad hoc termination criteria (dmax = 24, Nmax = 2).

Line 13: kd-tree constructed with OSAH using automatic termination criteria (k1 = 1.2, k2 = 2.0, K1fail = 1.0, K2fail = 0.2, and rqmin = 0.75).

As a reference algorithm for the following discussion we take the algorithm given by line 13, the automatic termination criteria, since in practice it achieved the best performance on average. The results are summarized in Table 4.4.

Line  Avg      CountBetter  Best   BestName  Worst     WorstName
13    reference algorithm
7     +2977%   0            +19%   mount4    +30321%   tree15
8     +2979%   0            +18%   mount4    +30259%   tree15
9     +19%     6            -12%   rings3    +225%     tree15
10    +22%     3            -12%   rings3    +228%     tree15
11    +7%      6            -12%   tree8     +44%      gears9
12    +5%      7            -10%   tree8     +20%      jacks4

Table 4.4: Summary table for the termination criteria.


The results for the kd-trees specified by lines 7 and 8 are practically the same; the setting Nmax = 1 produces a higher number of leaves for the kd-tree specified by line 7. Compared with the reference (automatic termination criteria), the setting specified by lines 7 and 8 was on average 2977% slower, in the best case 19% slower (“mount4”), and in the worst case 30321% slower (“tree15”). Setting dmax to 8 is thus insufficient even for a small number of objects in the scene (G3SPD), and it decreases the performance particularly for G4SPD and G5SPD scenes.

The results for the kd-trees specified by lines 9 and 10 are also quite comparable, and the setting Nmax = 1 again results in a higher number of leaves. Line 9 is on average 19% slower than the reference, in the best case 12% faster (“rings3” and “tree8”, faster than the reference for 6 scenes), and in the worst case 225% slower (“tree15”). The kd-tree specified by line 10 achieved slightly worse results: it was on average 22% slower, in the best case 10% faster (“rings3”, faster than the reference for 3 scenes), and in the worst case 228% slower (“tree15”). For large scenes, setting the maximum leaf depth to dmax = 16 was insufficient, particularly for sparsely occupied scenes.

The results for the kd-trees specified by lines 11 and 12 show that the setting dmax = 24 achieves the best results of all the ad hoc termination criteria tested. The kd-tree specified by line 11 is on average 7% slower than the reference, in the best case 12% faster (“tree8”, faster than the reference for 6 scenes), and in the worst case 44% slower (“gears9”). This setting of the ad hoc termination criteria continues the subdivision even if this does not improve the total performance, as we can see for the scene “gears9”. The kd-tree specified by line 12 is on average 5% slower than the reference, in the best case 10% faster (“tree8”, faster than the reference for 7 scenes), and in the worst case 20% slower (“jacks4”). Unlike for dmax = 8 and dmax = 16, for dmax = 24 setting Nmax to 2 (line 12) instead of 1 (line 11) improves the performance on average.

We can also compare automatic and ad hoc termination criteria in general. The ad hoc termination criteria result in kd-trees with higher performance than the automatic termination criteria for 22 out of 180 experiments. However, in the best case the improvement is only 12%, and in the worst case the ad hoc termination criteria are 30321% slower. If ad hoc termination criteria are required, we can on average recommend the setting that achieved the best results, Nmax = 2 and dmax = 24. However, we should be aware of the increased average build time TB and of the memory consumed by the kd-tree compared with the automatic termination criteria.
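The ad hoc termination criteria, and the reason why a small dmax hurts large scenes, can be illustrated with a short sketch. The function names are hypothetical, and the exact comparison operators (a node becomes a leaf at depth dmax or when it holds at most Nmax objects) are an assumption about how the two limits are applied.

```python
def ad_hoc_terminate(depth, n_objects, d_max=16, n_max=2):
    """Return True if the node should become a leaf.

    Assumption: subdivision stops when the maximum leaf depth is reached
    or when the node holds at most n_max objects.
    """
    return depth >= d_max or n_objects <= n_max

def objects_per_leaf_lower_bound(n_objects, d_max):
    """Lower bound on the average number of object references per leaf.

    A binary tree of depth d_max has at most 2**d_max leaves, and every
    object is referenced in at least one leaf.
    """
    return n_objects / 2 ** d_max
```

For example, with dmax = 8 there are at most 256 leaves, so a scene with 100000 objects must keep at least about 390 object references per leaf on average, regardless of Nmax. This is consistent with the poor results observed for lines 7 and 8.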

4.10.3 Cutting Off Empty Space

Here we summarize the results for cutting off empty space methods. We used the following setting for the experiments:

Line 14: kd-tree constructed with OSAH and automatic termination criteria, but slightly modified according to the paper by Subramanian and Fussell [145]. The probability of intersecting the left and right child nodes is taken from the two tight AABBs enclosing the objects on the left and right side of the splitting plane. We have implemented this other variant of the surface area heuristic only to verify the results presented by Subramanian and Fussell.

Line 15: kd-tree constructed with OSAH using automatic termination criteria. In addition, late cutting off empty space is applied, with the setting $\hat{C}_{IT} = 0.7$ and $\hat{C}_{TS} = 0.3$. The limit for the maximum number of planes to be put within late cutting off empty space was set to 3.

Line 16: kd-tree constructed with OSAH using automatic termination criteria. In addition, two-plane cutting off empty space is applied, with the setting $\hat{C}_{IT} = 0.7$ and $\hat{C}_{TS} = 0.3$.

Line 17: kd-tree constructed with OSAH using automatic termination criteria. In addition, both late and two-plane cutting off empty space are applied, with the setting $\hat{C}_{IT} = 0.7$ and $\hat{C}_{TS} = 0.3$. The limit for the maximum number of planes to be put within late cutting off empty space was set to 3.


As a reference algorithm for the following discussion we take the RSA specified by line 13, which is the standard algorithm for kd-tree construction without the additional improvements. The results are summarized in Table 4.5.

Line  Avg    CountBetter  Best   BestName   Worst   WorstName
13    reference algorithm
14    +9%    4            -4%    tree8      +26%    teapot4
15    +2%    5            -2%    tree11     +13%    rings3
16    +0%    9            -14%   gears4     +13%    rings3
17    -4%    22           -18%   sombrero1  +25%    rings3

Table 4.5: Summary table for cutting off empty space for automatic termination criteria.

The kd-tree described by line 14 is on average 9% slower than the reference, in the best case 4% faster (“tree8”, faster than the reference for 4 out of 30 scenes), and in the worst case 26% slower (“teapot4”). In addition, the build time is increased on average by 20%. The use of this surface area heuristic is questionable.

The kd-tree described by line 15 is, for the given setting of $\hat{C}_{IT}$ and $\hat{C}_{TS}$, on average 2% slower than the reference. For 5 out of 30 scenes it performs faster; in the best case it is 2% faster than the reference (“tree11”), and in the worst case it is 13% slower (“rings3”).

The kd-tree described by line 16 has on average the same performance as the reference solution. In the best case it is 14% faster (“gears4”), and in the worst case it is 10% slower (“rings3”). For the given setting of the constants it is faster for 9 out of 30 scenes tested.

The kd-tree described by line 17, a combination of late and two-plane cutting off empty space, is on average 4% faster than the reference. In the best case it is 18% faster (“sombrero1”), and in the worst case it is 19% slower (“rings3”). For the given setting of the constants it is faster for 22 out of 30 scenes tested.

The results for the kd-trees described by lines 15, 16, and 17 do not correspond with the results achieved in the previous experiments [134]. However, those previous experiments were performed with the ad hoc termination criteria setting dmax = 16 and Nmax = 1. Therefore we have performed the experiments for the following settings:

Line 18: kd-tree constructed with OSAH, and late cutting off empty space is applied, with the setting $\hat{C}_{IT} = 0.7$ and $\hat{C}_{TS} = 0.3$. The limit for the maximum number of planes to be put within late cutting off empty space was set to 3. Ad hoc termination criteria were used: dmax = 16 and Nmax = 1.

Line 19: kd-tree constructed with OSAH, and two-plane cutting off empty space is applied, with the setting $\hat{C}_{IT} = 0.7$ and $\hat{C}_{TS} = 0.3$. Ad hoc termination criteria were used: dmax = 16 and Nmax = 1.

Line 20: kd-tree constructed with OSAH, and both two-plane and late cutting off empty space are applied, with the setting $\hat{C}_{IT} = 0.7$ and $\hat{C}_{TS} = 0.3$. Ad hoc termination criteria were used: dmax = 16 and Nmax = 1.

To evaluate these experiments we used as the reference the algorithm specified by line 9, which uses the same termination criteria. The results are summarized in Table 4.6.

For the kd-tree specified by line 18 we got results that correspond to the previous finding by Sixta [134]. On average, the performance was improved by 9% compared with the reference algorithm. In the best case it was 20% faster (“sombrero2”), and in the worst case 2% faster (“tetra6” and “rings17”). So for all 30 tested scenes late cutting off empty space improved the performance.

Line  Avg   CountBetter  Best   BestName   Worst   WorstName
9     reference algorithm
18    -9%   30           -20%   sombrero2  -2%     tetra6
19    -9%   29           -18%   sombrero2  +0.5%   lattice29
20    -9%   30           -20%   sombrero2  -2%     tetra6

Table 4.6: Summary table for cutting off empty space for ad hoc termination criteria.

Similarly, for the kd-tree specified by line 19 we also got improved results. On average, the performance was improved by 9%, in the best case it was 18% faster (“sombrero2”), and in the worst case 0.5% slower (“lattice29”). For 29 out of 30 scenes the performance was improved.

For the combined solution (line 20) the performance improvement corresponds to that presented for line 18. Combining the techniques (line 20) did not bring any further improvement.

To conclude, both methods of cutting off empty space do improve the performance of the RSA when this is possible, i.e., for the kd-tree built with ad hoc termination criteria. On the other hand, the results obtained for automatic termination criteria show that these additional cutting off empty space methods are not likely to improve the performance of an RSA based on the kd-tree.

4.10.4 Reducing Objects’ Axis-Aligned Bounding Boxes

Here we summarize the results of applying the techniques introduced in Subsection 4.6.2. We used the following settings:

Line 21: kd-tree constructed with OSAH using automatic termination criteria (as for line 13). In addition, leaf pruning was used as a postprocessing step.

Line 22: kd-tree constructed with OSAH using automatic termination criteria (as for line 13). In addition, split clipping was applied.

As a reference algorithm for the following discussion we again take the algorithm specified by line 13, since all the improvements try to increase the performance of this algorithm. The results are summarized in Table 4.7.

Line  Avg   CountBetter  Best   BestName  Worst   WorstName
13    reference algorithm
21    -6%   18           -41%   teapot12  +19%    gears9
22    -8%   29           -35%   jacks5    +11%    gears9

Table 4.7: Summary table for reducing objects’ AABBs.

The kd-tree specified by line 21 is on average 6% faster than the reference. In the best case it is 41% faster (“teapot12”), and in the worst case 19% slower (“gears9”). For 18 out of 30 tested scenes it is faster than the reference algorithm, but the build time is also slightly increased.

The kd-tree specified by line 22 is on average 8% faster than the reference algorithm. In the best case it is 35% faster (“jacks5”), and in the worst case 11% slower (“gears9”). For 22 out of 30 tested scenes it is faster than the reference algorithm.

It might be expected that ΘRUN, and thus the whole running time TR, should always be decreased by leaf pruning or split clipping. We have shown above that this is not the case. It always holds that the number of intersection tests per ray is decreased when the ray traversal algorithm for the ray shooting problem is used for all ray shooting queries within the testing procedure TPD. To test the visibility for a pair of points, however, we have used a modified ray traversal algorithm that stops when any ray-object intersection is found (see Section 1.3) and the corresponding point of intersection lies at most at some signed distance from the origin of the ray. With this modification the number of traversal steps in the kd-tree can be increased, and the number of ray-object intersection tests can be increased as well. We see that modifying the ray traversal algorithm for the visibility of a pair of points influences the results for leaf pruning and split clipping.

4.10.5 General Cost Model

Here we describe the results for the general cost model (GCM). We implemented and tested the algorithms with the following settings:

Line 23: kd-tree constructed for a variant of GCM that estimates only the blocking factor using the sampling kd-tree; the resulting kd-tree is constructed using Eqs. 4.35–4.38. The number of sampling rays Nrays was determined as $N_{rays} = 0.3 \cdot N$ with the conditions $N_{rays} \ge 100$ and $N_{rays} \le 1000$, where N is the number of objects in the node. Automatic termination criteria were used. The estimated cost of a hit ray $\hat{C}_{hit}$ was computed as $\hat{C}_{hit} = 1.5 \cdot \hat{C}$, where $\hat{C}$ is the cost of miss rays computed as a linear estimate taking into account the number of objects.

Line 24: like line 23, but the cost was also estimated in the sampling kd-tree, disregarding the difference between the cost of hit and miss rays. The cost of the subtree for one ray was thus computed as $\hat{C} = \frac{1}{N_{rays}} (\hat{C}_{IT} N_{IT} + \hat{C}_{TS} N_{TS})$ for Nrays sampling rays in the sampling kd-tree, where $N_{IT}$ was the total number of ray-object intersection tests and $N_{TS}$ was the total number of traversal steps for all sampling rays.

Line 25: like line 24, but distinguishing between the costs for hit and miss rays.

As a reference for the following discussion we again take the kd-tree specified by line 13, since the general cost model is intended to increase the performance over the kd-tree constructed with a linear cost estimate. The results are summarized in Table 4.8.

Line  Avg     CountBetter  Best   BestName   Worst    WorstName
13    reference algorithm
23    -1%     17           -12%   tree11     +30%     gears9
24    +17%    3            -5%    tetra6     +123%    rings17
25    +131%   1            -1%    sombrero1  +2637%   rings17

Table 4.8: Summary table for general cost model.

The kd-tree specified by line 23 has approximately the same performance as the reference. It is on average 1% faster than the reference, in the best case 12% faster (“tree11”), and in the worst case 30% slower (“gears9”). For 17 out of 30 scenes the general cost model improved the performance, but the build time for the kd-tree includes the sampling and is thus considerably increased.

The kd-tree specified by line 24, which tries to estimate the cost via sampling, is on average 17% slower than the reference solution. In the best case it is 5% faster (“tetra6”), and in the worst case 123% slower (“rings17”). We can remark that the cost estimate algorithm based on sampling as described above does not perform well for scenes with objects whose AABBs overlap.


The kd-tree specified by line 25 has even lower performance than the kd-tree specified by line 24. On average it is 131% slower, in the best case it is 1% faster (“sombrero1”), and in the worst case 2637% slower (“rings17”).

The summary results of the implemented estimates on the sampling kd-tree show that for most SPD scenes the general cost model has approximately the same performance as the reference, but for several scenes the kd-tree constructed with GCM has significantly lower performance. With increasing model precision and with estimation of the cost via sampling the resulting performance even decreases.

To conclude, the experiments performed here on a general cost model validate the linear cost estimate. To the best of our knowledge, the linear cost estimate achieves the best results for an RSA based on the kd-tree. The algorithm proposed here for the general cost model, which estimates the cost of the kd-tree to be constructed for a given set of objects using sampling on another kd-tree, in several cases completely degrades the performance.
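The quantities used in the settings of lines 23 and 24 can be sketched as follows. This is an illustration under stated assumptions, not the thesis code: the function names are hypothetical, rounding the number of rays to an integer is an assumption, and the default constants 0.7 and 0.3 are simply borrowed from the cutting off empty space settings above.

```python
def num_sampling_rays(n_objects):
    """Number of sampling rays: 0.3 * N, kept between 100 and 1000.

    Assumption: the result is rounded to the nearest integer.
    """
    return int(round(min(1000.0, max(100.0, 0.3 * n_objects))))

def sampled_cost(n_it, n_ts, n_rays, c_it=0.7, c_ts=0.3):
    """Estimated cost of a subtree for one ray (setting of line 24).

    n_it: total ray-object intersection tests over all sampling rays
    n_ts: total traversal steps over all sampling rays
    c_it, c_ts: cost constants (default values are an assumption)
    """
    return (c_it * n_it + c_ts * n_ts) / n_rays

def hit_cost(miss_cost):
    """Estimated cost of a hit ray as 1.5 times the miss cost (line 23)."""
    return 1.5 * miss_cost
```

For instance, 100 sampling rays that together need 200 intersection tests and 400 traversal steps give an estimated cost of 2.6 per ray with the default constants.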

4.10.6 Preferred Ray Sets

To test the kd-tree built for the preferred ray sets we used the automatic termination criteria algorithm for the constructed kd-trees, since it guarantees approximately the same number of leaves and thus, from this viewpoint, the same performance of the RSA. Using the testing procedures TPA–TPD (see Section 3.4.1) makes no sense here, since we require a particular set of rays with a certain ray distribution. Therefore, the tests were performed as for the testing procedure TPD, but with the depth of recursion set to 1, thus shooting only the so-called primary rays into the scene (number of rays Nrays = 513 × 513), which corresponds to the preferred ray set only. This holds for perspective, parallel, and spherical projection. The spherical projection uses the viewport size of the perspective one, and for parallel projection we constructed a viewport that covers roughly the same portion of the scene in the projected image as for the perspective projection. For all three projections we compare the kd-tree constructed using OSAH, taken as the reference, with the kd-tree constructed for the special surface area heuristic intrinsic to the projection.

First, we compared PARSAH with OSAH for parallel projection. We used the following settings:

Line 26: kd-tree constructed for OSAH with automatic termination criteria; the rays induced by parallel projection were used.

Line 27: kd-tree constructed for PARSAH with automatic termination criteria; the rays induced by parallel projection were used.

Comparing line 27 with line 26, we can conclude that the kd-tree built using PARSAH was on average 6% faster than the one built using OSAH. In the best case it is 46% faster (“jacks3”), and in the worst case it is 71% slower (“rings17”). The kd-tree constructed with PARSAH results in lower performance particularly for scenes with objects whose AABBs overlap. The performance improvement also depends on the vector that specifies the projection.

Second, we compared the kd-trees constructed for PERSAH with OSAH for rays induced by perspective projection. We used the following settings:

Line 28: kd-tree constructed for OSAH with automatic termination criteria. Rays induced by perspective projection were used.

Line 29: kd-tree constructed for PERSAH with automatic termination criteria. Rays induced by perspective projection were used.

Line 30: kd-tree constructed for SPHSAH with automatic termination criteria. Rays induced by perspective projection were used.

The kd-tree constructed with PERSAH (line 29) was faster than the reference algorithm (line 28) for 11 scenes. For the scenes “lattice6”, “rings3”, “lattice12”, “rings7”, “lattice29”, and “rings17”, the use of PERSAH obviously results in some degenerated kd-trees. For the scenes “latticeX” this is understandable, since the viewpoint lies inside the scene AABB. If we consider all the scenes, then the kd-tree constructed with PERSAH is on average 621% slower than the reference. In the best case it is 27% faster (“tree15”), and in the worst case 15330% slower (“lattice29”). If we remove the degenerated kd-trees for the scenes “latticeX” and “ringsX”, it is on average 4% faster, in the best case 27% faster (“tree15”), and in the worst case 18% slower (“tetra5”). The advantages of PERSAH over OSAH are questionable for automatic termination criteria, as the time required for constructing the kd-tree is significantly increased.

The kd-tree constructed as specified by line 30, that is, with SPHSAH but for rays induced by perspective projection, was faster than the kd-tree constructed for OSAH for only three scenes (“mount4”, “balls4”, “mount8”). On average, it is 5293% slower. In the best case it is 11% faster (“mount4”), and in the worst case 153829% slower (“tree15”). Carefully examining the results we again observe that for several scenes the kd-trees constructed with SPHSAH degenerate: the scenes “sombrero1”, “lattice12”, “sombrero2”, “tree11”, “jacks5”, “rings17”, and particularly the scene “tree15”.

Third, we examined the results for the rays induced by spherical projection. We used the following settings:

Line 31: kd-tree constructed for OSAH with automatic termination criteria. Rays were induced by spherical projection.

Line 32: kd-tree constructed for SPHSAH with automatic termination criteria. Rays were induced by spherical projection.

In general, the results achieved for spherical projection are very similar to those obtained for perspective projection, since the distributions of the ray sets induced by these two projections are similar. The kd-tree constructed as specified by line 32 is on average 5217% slower than the kd-tree constructed for OSAH. In the best case it is 12% faster (“lattice29”), and in the worst case 153422% slower (“tree15”). It again includes some degenerated cases as for PERSAH, for the scenes “sombrero1”, “lattice12”, “sombrero2”, “tree11”, and particularly “tree15”.

The preprocessing times for OSAH and PARSAH without clipping are quite comparable, since in this case the parallel projection corresponds to multiplying the surface area of the AABB faces by elements of the projection vector defining the preferred set of rays. The preprocessing times to construct a kd-tree with OSAH and with PERSAH/SPHSAH differ significantly. Even though the clipping algorithm for PERSAH is performed in an incremental way, and thus utilizes the coherence of clipping for subsequent cutting planes along the tested axis, the build time differs by one or two orders of magnitude compared with OSAH. To decrease the build time required by kd-tree construction with SPHSAH we used the approximation [24] for arctan in Eq. 4.44.

Parallel Projection in Close-Up

The most promising results for the use of kd-trees for preferred ray sets were obtained for parallel projection (PARSAH). Therefore we conducted more experiments with PARSAH, namely as regards the sensitivity of the construction of the kd-tree to the direction of ray shooting queries.
We used the scene “tree11” with the following initial projection settings: viewpoint = (4.5, 0, 1.5), lookat = (0, 0, 1.5), upvector = (0, 0, 1). This corresponds to the normal of the projection plane $\vec{N} = (1, 0, 0)$ and azimuth 0. In the experiments the observer moved around the tree. That is, lookat and upvector remained constant and both viewpoint and $\vec{N}$ changed according to the azimuth, which ranged from 0 to π/2. Setting the azimuth to π/2 corresponds to viewpoint = (0, 4.5, 1.5) and $\vec{N} = (0, 1, 0)$.

First, we compared the kd-trees built using OSAH and PARSAH. The kd-tree constructed with PARSAH corresponds to the azimuth ($\vec{N}$). Fig. 4.14 (a) shows the running time TR, which also includes other computation (shading; the rendered images correspond to ray casting). Fig. 4.15 (a) depicts the average number of ray-object intersection tests. Fig. 4.16 (a) shows the average number of traversal steps per ray. The curves for PARSAH are referenced as PARSAH-A.

Second, we tested the sensitivity of the kd-tree constructed for a fixed azimuth π/360 when shooting the primary rays in directions other than the one for which the kd-tree was constructed. Fig. 4.14 (b) shows the running time for views specified by the azimuth, Fig. 4.15 (b) depicts the average number of ray-object intersection tests, and Fig. 4.16 (b) shows the average number of traversal steps per ray. The curve for PARSAH is referenced as PARSAH-B.

We can observe that the performance improvement of the RSA based on the kd-tree constructed with PARSAH is restricted to angular divergence of π 2 ª 3 from the direction used for construction. The performance improvements are rather significant. Both the number of intersection tests and the number of traversal steps are reduced by more than one half for the tested scene. The kd-tree constructed with PARSAH for $\vec{N} = (1, 0, 0)$ degenerates, because one principal axis is not taken into account at all. For this reason we selected as the reference for Fig. 4.15 (b) the kd-tree constructed for azimuth π/360. The difference between the kd-trees constructed using OSAH and PARSAH is visualised in Fig. 4.17.
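The camera sweep of this experiment and the parallel-projection weighting of AABB faces can be sketched together. This is a minimal illustration with hypothetical function names. The circular camera path is inferred from the two end positions given in the text, and the projected-area formula is the standard expression for the area of an AABB projected along a unit direction; treating it as the core of PARSAH is an assumption.

```python
import math

def view_for_azimuth(azimuth, radius=4.5, height=1.5):
    """Viewpoint and projection-plane normal for an observer circling the
    point lookat = (0, 0, height) at the given azimuth (radians)."""
    normal = (math.cos(azimuth), math.sin(azimuth), 0.0)
    viewpoint = (radius * normal[0], radius * normal[1], height)
    return viewpoint, normal

def projected_area(extent, normal):
    """Area of an AABB with edge lengths extent = (dx, dy, dz) under
    parallel projection along the unit vector normal.

    Each pair of faces is weighted by the matching component of normal.
    """
    dx, dy, dz = extent
    nx, ny, nz = (abs(c) for c in normal)
    return nx * dy * dz + ny * dx * dz + nz * dx * dy
```

For the normal (1, 0, 0) the result equals dy * dz and does not depend on dx at all, which is one way to see why the kd-tree built for azimuth 0 degenerates, and why an azimuth slightly off the axis, such as π/360, is a more useful reference.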

4.11 Conclusion and Future Work

In this chapter we have shown in detail the construction of the kd-tree based on a top-down approach. The algorithms presented for kd-tree construction are heuristics, solving a problem with a simple and intuitive recursive character: when and where to put the splitting plane for an AABB containing a set of objects. The solution to the problem is not trivial: we show how to decrease the estimated cost of the kd-tree to be built for the worst-case and average-case complexity, under certain simplifying assumptions. However, we cannot claim any optimality of our methods.

The presented techniques applied in kd-tree construction can be combined in some ways. The technique of preferred ray sets requires some knowledge about the ray distribution of the ray shooting queries in advance. The time consumed by kd-tree construction is a tradeoff with the performance of answering ray shooting queries: the longer the build time, the more efficient the kd-tree is expected to be. The upper bound of the expected average-case complexity of an RSA based on the kd-tree can be determined by using a cost model when the kd-tree is built.

Future research work concerning kd-tree construction for RSAs could include a study of more efficient algorithms for estimating the cost, the blocking factor, and the computation of the cost itself. A more intelligent method for searching and utilizing empty spatial regions is a real challenge that should also be studied, and the automatic termination criteria as described should also be further elaborated or changed using Russian roulette or other mathematical tools dealing with probability.

Another possible research topic could deal with building a kd-tree with some clustering method. Here we have always discussed kd-tree construction using a top-down approach. For scenes with objects of widely varying size it may be possible to find sets of objects of roughly the same size that occupy particular spatial regions, to build kd-trees for such object sets, and then to take these kd-trees as objects within the construction of a “global” kd-tree. Due to the formulation of the cost function in surface area heuristic algorithms, the sets of objects seem to be initially disjoint from the rest of the scene, and the kd-tree is then built more or less for clusters of these objects. Clustering could further improve the performance of an RSA based on the kd-tree in these sparsely occupied scenes, but this needs to be verified experimentally.

A study of the degenerated kd-trees for preferred ray sets is another interesting topic for research, which could combine ordinary and preferred ray set surface area heuristics. The important issue for further research in kd-tree construction is to what extent and at what cost it is possible to improve the performance of RSAs based on the kd-tree.

CHAPTER 4. CONSTRUCTION OF KD-TREES

Figure 4.14: The running time TR, including shading, as a function of the angle for OSAH and PARSAH-A/B. (a) PARSAH-A and OSAH. (b) PARSAH-B and OSAH.

Figure 4.15: Ratio rITM of the ray-object intersection tests performed to the minimum number of intersection tests, as a function of the angle for OSAH and PARSAH-A/B. (a) PARSAH-A and OSAH. (b) PARSAH-B and OSAH.

Figure 4.16: Number of traversal steps per ray NTS as a function of the angle for OSAH and PARSAH-A/B. (a) PARSAH-A and OSAH. (b) PARSAH-B and OSAH.

Figure 4.17: Visualization of the kd-tree built for a preferred ray set for scene “fluid”. (a) depicts a kd-tree built using OSAH. (b) shows a kd-tree constructed for PARSAH. For the sake of visual clarity the maximum leaf depth dmax was set to 10.

Chapter 5

Ray Traversal Algorithms for Kd-Trees

In this chapter we deal with ray traversal algorithms for RSAs based on the kd-tree. First, we describe three types of ray traversal algorithms: sequential, recursive, and those with neighbor-links. Then we analyze the recursive ray traversal algorithm and develop a new, more robust version of it. Finally, we present a summary of the results of our experimental comparison of the ray traversal algorithms for G3SPD, G4SPD, and G5SPD scenes.

5.1

Motivation

Given a kd-tree, which approximately represents the distance between objects in the scene, we need an algorithm that for a given ray R identifies the sequence of the kd-tree leaves intersected by R. We call this algorithm the ray traversal algorithm. It should be efficient, robust, and as simple as possible to implement. The first ray traversal algorithm for the kd-tree was developed by Kaplan [94], together with the first use of the BSP tree to accelerate ray tracing. The algorithm, based on repetitive computation of a point-location search along the ray path within the kd-tree, was later called the sequential ray traversal algorithm. Further, Jansen [91] introduced a recursive ray traversal algorithm that significantly differs from the sequential algorithm in the way it identifies the nodes of the kd-tree to be visited. This algorithm recursively descends the branches of the kd-tree along the ray path, starting at the root node, and visits each node at most once per ray. The efficiency and robustness of the recursive ray traversal algorithm were further improved by Havran et al. [82]. MacDonald and Booth [105] described a ray traversal algorithm that uses neighbor-links starting on the faces of leaves of the kd-tree, further elaborated by Havran and Bittner [80]. Here, we call it the ray traversal algorithm with neighbor-links.

5.2

Basic Terminology

In this section we describe some terminology used in this chapter. The input of the ray traversal algorithm is a ray R and a kd-tree that is constructed for a particular scene S. These two input entities give two basic input configurations according to the mutual position of the origin of the ray and the axis-aligned bounding box of the scene, AB(S). The first configuration is when the origin of the ray is located outside AB(S). We call this a ray with external origin. When the origin of the ray is located inside or on the boundary of AB(S), we call it a ray with internal origin. Given a ray R and an AB intersected by R, we can distinguish two important points along the ray path and thus two signed distances. The point where R enters the AB is called entry point A; this corresponds to the entry signed distance. The point where R leaves the AB is called exit point B; this corresponds to the exit signed distance. A ray traversal algorithm can use knowledge of whether it processes a ray with internal or external origin and possibly some additional data structures.
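To make the two signed distances concrete, the following Python sketch computes them with the widely used slab method. It is an illustration only and is not taken from the thesis: the function names and the tuple representation of points and boxes are assumptions of this example.

```python
import math

def ray_box_distances(origin, direction, box_min, box_max):
    """Return (a, b), the entry and exit signed distances of the ray's
    supporting line with the box, or None if the line misses the box."""
    a, b = -math.inf, math.inf
    for o, d, lo, hi in zip(origin, direction, box_min, box_max):
        if d == 0.0:
            # Parallel to this pair of planes: the origin must lie between them.
            if o < lo or o > hi:
                return None
            continue
        t0, t1 = (lo - o) / d, (hi - o) / d
        if t0 > t1:
            t0, t1 = t1, t0
        a, b = max(a, t0), min(b, t1)
        if a > b:
            return None
    return a, b

def has_internal_origin(origin, box_min, box_max):
    """True if the origin lies inside or on the boundary of the box."""
    return all(lo <= o <= hi for o, lo, hi in zip(origin, box_min, box_max))

# Hypothetical unit cube and two rays along the x axis.
unit_min, unit_max = (0.0, 0.0, 0.0), (1.0, 1.0, 1.0)
external = ray_box_distances((-1.0, 0.5, 0.5), (1.0, 0.0, 0.0), unit_min, unit_max)
internal = ray_box_distances((0.5, 0.5, 0.5), (1.0, 0.0, 0.0), unit_min, unit_max)
```

For a ray with internal origin the entry distance returned by this sketch is never positive. A caller would additionally have to reject boxes that lie entirely behind the ray origin (exit distance below zero), which the sketch leaves out.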

CHAPTER 5. RAY TRAVERSAL ALGORITHMS FOR KD-TREES

Each ray traversal algorithm has its own pros and cons, resulting in various average costs of one traversal step C̃_TS and average numbers of traversal steps per ray (parameters Ñ_TS, Ñ_ETS, and Ñ_EETS of subset ∆ of the minimum testing output, see Chapter 2). A correctly implemented and robust ray traversal algorithm should visit each leaf along the ray path exactly once for an arbitrary ray. In this case only Ñ_TS from subset ∆, and TR, Θrat, and ΘRUN from subset Θ of the minimum testing output can be influenced by the properties of the ray traversal algorithm. When a ray traversal algorithm requires some additional data structures used only for traversal purposes, subsets Σ and ∆ can also be influenced. The C-pseudocodes of all ray traversal algorithms described in this chapter are given in the Appendices.

5.3

Previous Work

In this section we describe in detail all three types of ray traversal algorithms for kd-trees developed in the past.

5.3.1

Sequential Ray Traversal Algorithm TAseq

The sequential ray traversal algorithm designed by Kaplan [94] is simply the repetitive application of a point-location search in the kd-tree along the ray path. We denote the sequential ray traversal algorithm by TAseq. Let us describe TAseq. We denote by U a point along the ray. The point U is used to locate the currently visited leaf of the kd-tree. First, TAseq determines an initial point U along the ray, which serves to search for the first leaf to be visited. If the ray has an external origin, the point U is equal to the entry point A of AB(S). If the ray has an internal origin, then U is equal to the origin of the ray. The point-location search is applied to get a leaf νE for which U ∈ AB(νE) holds. If the leaf νE is not empty, the objects pointed to in νE are tested against the ray for intersection. If any intersected objects are found, the object with the closest intersection point is selected, and it is checked whether the intersection point lies in AB(νE). If this is the case, TAseq is finished. If the leaf νE is empty, or no object is found to be intersected, or the intersection point lies outside AB(νE), the exit point B for AB(νE) along the ray is determined. Point B is moved slightly forward along the ray path to ensure that the next point-location search finds the next leaf. Then the ray traversal algorithm repeats: point location is applied again, and so on. TAseq continues until the closest object along the ray path is found, or until exit point B gets outside AB(S), in which case no object is intersected. The C-pseudocode of TAseq is given in Appendix A. TAseq requires us to know the ABs associated with all leaves visited along the ray path, even when the leaves are empty. This is necessary for two reasons: to determine the exit point for the AB along the ray path, and to test whether the point of intersection between the ray and an object lies in the AB associated with a leaf. These ABs can be provided in two ways.
The first way is to store the ABs directly in the leaves, which requires six floating-point variables in each leaf of the kd-tree. The second way is to compute the size of the AB associated with a leaf during the point-location search, progressively restricting the size of AB(S) of the scene S; however, this increases the average cost of a ray traversal step C̃_TS. The obvious disadvantage of TAseq is that it performs many traversal steps within a point-location search. Even though it is very likely that two successive point-location searches traverse a common subsequence of interior nodes starting at the root node, TAseq disregards this fact and always performs the point-location search again, starting at the root node. When a ray traverses N_l leaves before the intersected object is found, it visits the root node N_l times. Although the cost of one traversal step C̃_TS is small, many interior nodes are traversed redundantly for the ray, so the information about the distance between objects encoded in the kd-tree is not utilized well.
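The repeated point-location search can be sketched as follows. This Python fragment is a simplified illustration, not the Appendix A pseudocode: the `Node` class, the function names, and the fixed offset `eps` are assumptions of this example, ray-object intersection tests are left out so that only the sequence of visited leaves is produced, and the ray direction is assumed to be non-zero. The leaf box is computed during the descent, which corresponds to the second way described above.

```python
import math

class Node:
    """Interior node if axis is not None, otherwise a leaf identified by name."""
    def __init__(self, axis=None, split=None, left=None, right=None, name=None):
        self.axis, self.split = axis, split
        self.left, self.right, self.name = left, right, name

def locate_leaf(root, box_min, box_max, point):
    """Point location from the root; also returns the box of the leaf found."""
    lo, hi, node = list(box_min), list(box_max), root
    while node.axis is not None:
        if point[node.axis] < node.split:
            hi[node.axis] = node.split
            node = node.left
        else:
            lo[node.axis] = node.split
            node = node.right
    return node, lo, hi

def exit_distance(origin, direction, lo, hi):
    """Signed distance at which the ray leaves the box [lo, hi]."""
    b = math.inf
    for o, d, l, h in zip(origin, direction, lo, hi):
        if d > 0.0:
            b = min(b, (h - o) / d)
        elif d < 0.0:
            b = min(b, (l - o) / d)
    return b

def sequential_leaves(root, box_min, box_max, origin, direction,
                      t_start=0.0, eps=1e-6):
    """Names of the leaves pierced by the ray, in order along the ray."""
    visited, t = [], t_start
    while True:
        u = [o + t * d for o, d in zip(origin, direction)]
        if any(c < lo or c > hi for c, lo, hi in zip(u, box_min, box_max)):
            return visited  # the point left the scene box
        leaf, lo, hi = locate_leaf(root, box_min, box_max, u)
        visited.append(leaf.name)
        # Move slightly past the exit point so the next search finds the next leaf.
        t = exit_distance(origin, direction, lo, hi) + eps

# Hypothetical tree: the box [0,4] x [0,1] x [0,1] split at x = 2 and x = 3.
example_root = Node(axis=0, split=2.0, left=Node(name="L"),
                    right=Node(axis=0, split=3.0,
                               left=Node(name="M"), right=Node(name="R")))
scene_min, scene_max = (0.0, 0.0, 0.0), (4.0, 1.0, 1.0)
```

Every iteration restarts at the root, so in this example the root node is visited once per leaf, which is exactly the redundancy discussed above.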

5.3.2

Recursive Ray Traversal Algorithm TAArec

A recursive ray traversal algorithm tries to avoid the basic disadvantage of the sequential ray traversal algorithm: each interior node and leaf of the kd-tree along the ray path is visited exactly once. Obviously, the average cost of a traversal step C̃_TS for the recursive ray traversal algorithm must be higher than that for the sequential algorithm. The cost model for kd-tree construction described in the previous chapter was based on the assumption that the recursive ray traversal algorithm is used. Here, we describe a variant of the recursive traversal algorithm that was first introduced by Jansen [91], was discussed in more detail by Arvo [15], and was also republished by Sung [148]. We further denote this variant of the recursive ray traversal algorithm by TAArec. The pseudocode of the recursive ray traversal algorithm TAArec was outlined in Chapter 1 as Algorithm 2. When a ray enters an interior node of the kd-tree, which has two child nodes, the algorithm decides whether both of them are to be traversed and in which order. It classifies the child nodes of the current interior node as “near” and “far” child nodes according to the position of the origin of the ray with regard to the splitting plane. When the ray traverses only the “near” child node, the algorithm descends to this node and TAArec recurses. When the ray has to visit both child nodes, TAArec saves the information about the “far” child node, descends to the “near” child node, and then recurses. When no object is found to be intersected inside the “near” child node, the “far” child node is retrieved and TAArec recurses, starting at the “far” child node. The detailed C-pseudocode of the efficient implementation of TAArec is given in Appendix B. The efficient implementation uses a traversal stack to avoid recursion. The stack is used to save the “far” child node if both child nodes are to be visited. TAArec decides uniformly which child nodes are to be traversed and in which order.
It always computes the signed distance t to the splitting plane in the currently visited interior node. The entry and exit signed distances a and b for the current node are known from previous traversal steps, since they correspond to the signed distances to the splitting planes that were computed in previous traversal steps. For the root node, the entry signed distance a and the exit signed distance b (a ≤ b always holds) are computed by an explicit algorithm for the intersection between the ray and the AB [162]. Based on the relation of the signed distances t, a, and b along the ray path it is possible to determine whether to traverse only the “near” child (the case t ≤ a or t ≥ b), or the “near” child first and then the “far” child (the case a ≤ t ≤ b).
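A stack-based traversal driven by the signed distances t, a, and b might look like the following Python sketch. It is not the Appendix B pseudocode. As a simplifying assumption of this example, the children are ordered by the sign of the ray direction along the splitting axis rather than by the near/far classification relative to the ray origin; the `Node` class and all names are hypothetical, and object intersection tests are omitted.

```python
class Node:
    """Interior node if axis is not None, otherwise a leaf identified by name."""
    def __init__(self, axis=None, split=None, left=None, right=None, name=None):
        self.axis, self.split = axis, split
        self.left, self.right, self.name = left, right, name

def recursive_leaves(root, origin, direction, a, b):
    """Names of the leaves pierced between signed distances a and b (a <= b)."""
    visited, stack = [], [(root, a, b)]
    while stack:
        node, a, b = stack.pop()
        while node.axis is not None:
            o, d = origin[node.axis], direction[node.axis]
            if d == 0.0:
                # The ray never crosses the plane: follow the origin's side.
                node = node.left if o < node.split else node.right
                continue
            # Child entered first and child entered second along the ray.
            first, second = ((node.left, node.right) if d > 0.0
                             else (node.right, node.left))
            t = (node.split - o) / d  # signed distance to the splitting plane
            if t >= b:
                node = first          # plane lies beyond the exit point
            elif t <= a:
                node = second         # plane lies before the entry point
            else:
                stack.append((second, t, b))  # postpone the second child
                node, b = first, t
        visited.append(node.name)
    return visited

# Hypothetical tree: the box [0,4] x [0,1] x [0,1] split at x = 2 and x = 3.
example_root = Node(axis=0, split=2.0, left=Node(name="L"),
                    right=Node(axis=0, split=3.0,
                               left=Node(name="M"), right=Node(name="R")))
```

A complete ray shooting algorithm would test the objects of each leaf as it is reached and stop as soon as an intersection within the current interval is found; each node is still popped or descended into at most once per ray.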

5.3.3

Traversal Algorithms with Neighbor-Links

Here, we describe two variants of a ray traversal algorithm with neighbor-links. Historically, the idea of using neighbor-links for a ray traversal algorithm in the kd-tree was introduced in [105]. However, neither clear experimental results nor a theoretical analysis were presented. The ray traversal algorithm with neighbor-links is based on additional data structures, i.e., neighbor-links starting on the faces of the ABs associated with the kd-tree leaves. We describe the construction of these additional data structures and the corresponding ray traversal algorithm.

5.3.3.1

Motivation

It is supposed that a leaf νE of the kd-tree contains at least the list of objects that intersect the AB(νE) associated with νE. As we have seen for the sequential ray traversal algorithm, a ray traversal algorithm can require additional data stored in the kd-tree leaves, for example data describing AB(νE) explicitly. Otherwise, the cost of the traversal step is increased, since the ABs of the leaves visited would have to be computed on the fly during the point-location search. A ray traversal algorithm that uses neighbor-links must always store these additional data structures explicitly inside the nodes of the kd-tree.

The ray traversal algorithm with neighbor-links is another method for eliminating the repetitive visiting of interior nodes for the leaves along the ray path that occurs in the sequential ray traversal algorithm. Unlike the recursive ray traversal algorithm, it requires additional data structures in the leaves of the kd-tree. The ray traversal algorithm with neighbor-links can also avoid the down-traversal phase needed to locate the first leaf if the origin of the ray is located inside AB(S). This down-traversal phase occurs in both the sequential and the recursive ray traversal algorithm. In applications with an RSA (for example, in global illumination algorithms for higher order rays) it occurs quite frequently that the origin of the ray OR is located on the surface of an object Oi in a known leaf νE, since OR and Oi were the answer to a previous ray shooting query. It is often the case that this higher-order ray intersects an object close to its origin, and thus the down-traversal phase from the root node can form a large portion of the whole time consumed by the ray traversal algorithm. The disadvantage of ray traversal algorithms with neighbor-links is the increased cost of one traversal step C̃_TS compared with the sequential and the recursive ray traversal algorithm. If many leaves are likely to be visited along the ray path, particularly when no object is intersected, then C̃_TS can outweigh the number of traversal steps performed in the total cost of an RSA based on the kd-tree. For these ray shooting queries the ray traversal algorithm with neighbor-links is thus slower than a recursive ray traversal algorithm. Below, we describe two variants of neighbor-links.

5.3.3.2

Ray Traversal Algorithm with Single Neighbor-Links TASNL

Let us describe a simpler version of a ray traversal algorithm with neighbor-links, which we call the ray traversal algorithm with single neighbor-links (denoted by TASNL). In a kd-tree each leaf νE is associated with its axis-aligned bounding box, AB(νE). Each AB(νE) has six faces of rectangular shape that we call leaf-faces. Let F(νE) denote a leaf-face of the AB(νE) associated with the leaf νE. For the sake of convenience we here call an AB associated with a leaf the leaf-cell. If two parallel leaf-faces associated with the leaf-cells of two leaves have some intersection of rectangular shape, then the leaves are called neighbor leaves. Given a leaf-face F(νE) and its neighbor leaves, there are two possible mutual geometric relations. Firstly, there is only one neighbor leaf ν corresponding to the leaf-face F(νE): the leaf-face F(νE) is completely contained in the face F(ν) of the neighbor leaf-cell, i.e., F(νE) ∩ F(ν) = F(νE). Secondly, the leaf-face F(νE) can intersect the faces of several neighbor leaves. For a given leaf-face F(νE) we call a neighbor node the node ν of the kd-tree with the smallest AB(ν) for which one face of AB(ν) contains F(νE) completely. The neighbor node is either a leaf or an interior node of the kd-tree. A single neighbor-link is a link from a leaf-face to its neighbor node. Using these links TASNL can avoid many traversal steps of interior nodes that would have been performed in the sequential ray traversal algorithm. We have to construct these links and store them for all the faces of all leaves in the kd-tree. Fig. 5.1 (b) shows an example of single neighbor-links for a scene in E^2.

Construction Algorithm

The construction of single neighbor-links is straightforward. For each face of each leaf in the kd-tree a single neighbor-link is set up. This requires that we store six pointers in each leaf of the kd-tree, regardless of whether or not the leaf is empty.
For a given leaf-face F(νE) the kd-tree is searched, starting from the root node of the kd-tree. In each step the search continues in the subtree that corresponds to a cell intersecting the face F(νE). If the face F(νE) is split by the plane referred to in the currently reached node (i.e., it intersects both subtrees), the search is terminated. A single neighbor-link to the neighbor node obtained by the search is stored within the leaf-face to which the algorithm was applied. The C-pseudocode of the algorithm for constructing single neighbor-links is given in Appendix D.

Assume that a kd-tree has n leaves and that its average depth is log n. For each face the down traversal takes log n steps on average. It is applied to 6n faces. Hence the complexity of the algorithm is O(n log n).
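The search for the neighbor node of one leaf-face can be sketched in Python as follows. This is an illustration under stated assumptions and not the Appendix D pseudocode: the `Node` class, the function name, and the encoding of a face by its axis `k` and `side` (+1 for the face with the larger coordinate, -1 for the smaller one) are hypothetical, and leaf boxes are assumed to share exactly the same coordinate values as the splitting planes that created them.

```python
class Node:
    """Interior node if axis is not None, otherwise a leaf identified by name."""
    def __init__(self, axis=None, split=None, left=None, right=None, name=None):
        self.axis, self.split = axis, split
        self.left, self.right, self.name = left, right, name

def neighbor_node(root, scene_lo, scene_hi, k, side, lo, hi):
    """Neighbor node of the face (axis k, given side) of the leaf box [lo, hi].

    Returns None when the face lies on the boundary of the scene box."""
    c = hi[k] if side > 0 else lo[k]          # coordinate of the face plane
    if c == (scene_hi[k] if side > 0 else scene_lo[k]):
        return None
    node = root
    while node.axis is not None:
        m, s = node.axis, node.split
        if m == k:
            # Step to the side of the plane on which the neighbor cell lies.
            go_left = c < s if side > 0 else c <= s
        elif hi[m] <= s:
            go_left = True
        elif lo[m] >= s:
            go_left = False
        else:
            return node                       # the face is split by this plane
        node = node.left if go_left else node.right
    return node

# Hypothetical tree on the box [0,4] x [0,2] x [0,1]:
# leaf A = [0,2] x [0,2], leaf B = [2,4] x [0,1], leaf C = [2,4] x [1,2].
leaf_a, leaf_b, leaf_c = Node(name="A"), Node(name="B"), Node(name="C")
inner = Node(axis=1, split=1.0, left=leaf_b, right=leaf_c)
root = Node(axis=0, split=2.0, left=leaf_a, right=inner)
scene_lo, scene_hi = (0.0, 0.0, 0.0), (4.0, 2.0, 1.0)
box_a = ((0.0, 0.0, 0.0), (2.0, 2.0, 1.0))
box_b = ((2.0, 0.0, 0.0), (4.0, 1.0, 1.0))
```

In this example the right face of leaf A touches both B and C, so its link points to the interior node above them, while the left face of B is fully contained in a face of A and links to that leaf directly.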

Figure 5.1: (a) Scene in E^2. (b) A kd-tree with single neighbor-links. (c) A kd-tree with neighbor-links trees.

5.3.3.3

Ray Traversal Algorithm with Neighbor-Links Trees TANLT

The ray traversal algorithm TASNL uses neighbor-links to locate a neighbor leaf for a ray leaving the leaf-face of the current leaf. Single neighbor-links point either to leaves or to interior nodes of the kd-tree. We denote a single neighbor-link that points to an interior node of the kd-tree as an indirect neighbor-link. After an indirect neighbor-link is used, the search down to the next leaf on the ray path continues as in the sequential ray traversal algorithm. Each indirect neighbor-link in the kd-tree can be replaced by a neighbor-links tree, which solves the search problem more efficiently. The corresponding ray traversal algorithm is more complicated and involves the construction of additional data structures. We call it the ray traversal algorithm with neighbor-links trees (denoted by TANLT). Let us discuss the motivation for replacing indirect neighbor-links with neighbor-links trees. Although a leaf-face is two-dimensional, a search with an indirect neighbor-link is performed in three dimensions. This is a potential source of inefficiency in TASNL. An auxiliary spatial data structure called a neighbor-links tree enables us to replace the three-dimensional search by a two-dimensional search. Further, we discuss the construction of neighbor-links trees in detail. A neighbor-links tree is a two-dimensional kd-tree that is formed by pruning the splitting planes in the neighbor node to a given leaf-face. The pruned splitting planes are either parallel to the plane supporting the leaf-face or they do not intersect the leaf-face. The first way, pruning the splitting planes parallel to the plane supporting the leaf-face, eliminates one dimension for a point-location search. The second way is based on the projection of the neighbor leaf-faces to the plane supporting a given leaf-face, which decreases the number of nodes in a neighbor-links tree and thus the number of traversal steps.
Only splitting planes (projected as lines to the plane) intersecting the leaf-face are used in the neighbor-links tree. This pruning corresponds to clipping the two-dimensional kd-tree against a rectangle formed by the leaf-face. The algorithm constructing the neighbor-links tree replaces an indirect neighbor-link for a leaf-face F(νE) by the corresponding neighbor-links tree. Starting from the node that the indirect neighbor-link points to, a constrained depth-first-search (DFS) on the kd-tree is performed. Only subtrees corresponding to the cells that intersect the leaf-face F(νE) are visited during the DFS. Only the nodes whose faces intersect the leaf-face F(νE) are added to the neighbor-links tree. An example of a kd-tree with neighbor-links trees for a scene in E^2 is depicted in Fig. 5.1 (c). Two-dimensional clipping of a neighbor-links tree is depicted in Fig. 5.2. On the left side of this figure a leaf-face is shown (smaller rectangle with thicker edges) for which the neighbor-links tree is built. The large rectangle is a projection of the corresponding neighbor node subtree onto the plane supporting a given leaf-face. The numbers marking the splitting planes denote the depth of the node in the subtree (the root node of the subtree is at depth zero). Note that the first splitting plane (the root node of the subtree) always intersects the leaf-face for which the neighbor-links tree is built. On the right side of Fig. 5.2 a neighbor-links tree is constructed by clipping the subtree against the leaf-face, which results in the minimum number of splitting planes in the constructed neighbor-links tree. Three grey rectangles corresponding to neighbor leaves depict parts of the leaf-face where the traversal is accelerated owing to the clipping.

Figure 5.2: Building of neighbor-links tree: left - unclipped, right - clipped.

5.3.3.4

Ray Traversal Algorithm for Neighbor-Links

The ray traversal algorithm for the kd-tree using neighbor-links follows the sequential ray traversal algorithm. It replaces the redundant point-location search that always starts from the root of the kd-tree by one that starts in the node pointed to by a neighbor-link. It also requires the ABs of the leaves to be stored explicitly in the leaf node structure. The ray traversal algorithm consists of two components: the exit-face determination and the point location. Assume the ray traversal algorithm starts in a certain leaf νE. When the ray does not intersect any object pointed to in νE, we need to locate the next leaf νEnext such that AB(νEnext) is pierced by the ray. First, we determine the exit-face F(νE) that is intersected by the ray in its positive direction. Knowing the exit-face F(νE), the exit point B lying along the ray path and on the exit-face F(νE) is computed. Then we follow the neighbor-link corresponding to the exit-face F(νE), thus obtaining a node νnext. When νnext is a leaf, we continue in this leaf directly (νEnext = νnext). Otherwise, there are two cases according to the use of either TASNL or TANLT.

TASNL: The node νnext corresponds to an indirect neighbor-link and we perform a point-location search for exit point B in the kd-tree, but starting at νnext.

TANLT: The node νnext corresponds to the root node of a neighbor-links tree. We perform a point-location search for exit point B to determine the next leaf νEnext, but starting at the root node of the neighbor-links tree (νnext), comparing the coordinates of the exit point B to the splitting planes of the interior nodes of the neighbor-links tree.

If the ray traversal is terminated in a leaf νET by an intersected object at a point I, then the leaf νET can be used to start the ray traversal for all the rays spawned from the point I (higher order rays in global illumination algorithms, etc.). In this case the initial down-traversal phase to the first leaf is avoided.
Most of the traversal steps are eliminated when the ray traversal algorithm visits only a few leaves along the ray. If the origin of a ray lies outside AB(S), the entry point A of the ray with respect to AB(S) must be computed. The entry point A is used to locate the first leaf whose AB is pierced by the ray. The leaf is found using the point-location search in the kd-tree starting at the root node.
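The two components, exit-face determination and point location from the linked node, can be sketched in Python for the single neighbor-link case. The fragment is an illustration only: the `Node` class, the function names, and the dictionary used to hold the links are assumptions of this example, and the ray direction is assumed to be non-zero. Snapping the exit point onto the face plane is a design choice of the sketch that keeps the subsequent comparisons independent of rounding.

```python
import math

class Node:
    """Interior node if axis is not None, otherwise a leaf identified by name."""
    def __init__(self, axis=None, split=None, left=None, right=None, name=None):
        self.axis, self.split = axis, split
        self.left, self.right, self.name = left, right, name

def exit_face(origin, direction, lo, hi):
    """Return (axis, side, t, point) of the face where the ray leaves [lo, hi]."""
    k, side, t = None, 0, math.inf
    for axis in range(3):
        d = direction[axis]
        if d > 0.0:
            cand, s = (hi[axis] - origin[axis]) / d, +1
        elif d < 0.0:
            cand, s = (lo[axis] - origin[axis]) / d, -1
        else:
            continue
        if cand < t:
            k, side, t = axis, s, cand
    point = [o + t * d for o, d in zip(origin, direction)]
    point[k] = hi[k] if side > 0 else lo[k]   # snap exactly onto the face
    return k, side, t, point

def descend(node, point):
    """Point location for the exit point, starting at the linked node."""
    while node.axis is not None:
        node = node.left if point[node.axis] < node.split else node.right
    return node

# Hypothetical scene: leaf A = [0,2] x [0,2] x [0,1]; its right face links to an
# interior node that splits the neighboring region into leaves B and C at y = 1.
leaf_b, leaf_c = Node(name="B"), Node(name="C")
inner = Node(axis=1, split=1.0, left=leaf_b, right=leaf_c)
links_of_a = {(0, +1): inner}                 # indirect single neighbor-link
k, side, t, point = exit_face((1.0, 0.25, 0.5), (1.0, 1.0, 0.0),
                              (0.0, 0.0, 0.0), (2.0, 2.0, 1.0))
next_leaf = descend(links_of_a[(k, side)], point)
```

With neighbor-links trees the same `descend` step would run on the two-dimensional tree attached to the face instead of on the kd-tree subtree.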

5.3.3.5

Algorithm Analysis

The ray traversal algorithm with neighbor-links trees requires some memory to store the two-dimensional kd-trees, and thus additional preprocessing. On the other hand, it further decreases the average number of traversal steps per ray, which can be particularly useful when computing the result of ray shooting queries for higher order rays in global illumination algorithms. The starting leaf to initiate the traversal is also known when the viewpoint for primary rays in a global illumination algorithm lies inside the AB of the scene. MacDonald and Booth [105] have shown that the average number of neighbor leaf-cells per leaf-face is always lower than two for an octree. It seems difficult to analyze the case of a kd-tree with arbitrarily positioned splitting planes. Nevertheless, we have found experimentally [80] that this number is on average bounded by a small constant (smaller than four) independently of the depth of the kd-tree. This observation is important, since it bounds the average size of neighbor-links trees as well as the average number of point-location search traversal steps (traversal steps within the neighbor-links tree). Assuming that the number of leaves in the kd-tree is n (when using automatic termination criteria, n = O(N) for N objects), the construction of neighbor-links trees in TANLT also takes O(n) time and requires O(n) memory.

5.4

New Recursive Ray Traversal Algorithm

In this section we develop a new recursive ray traversal algorithm TABrec that is robust and more efficient than TAArec, i.e., it has a smaller expected cost of one traversal step. This algorithm was introduced by Havran et al. [82]. Before we start with a description of the new recursive ray traversal algorithm, we classify all the possible configurations between a ray and the kd-tree interior node geometry that can occur when a ray enters the interior node. This classification allows us to analyze the problems arising with TAArec and subsequently to design a new, more robust ray traversal algorithm.

5.4.1

Traversal Classification

When visiting an interior node ν of the kd-tree, we must decide which child node(s) of ν are to be visited and in which order. The recursive ray traversal algorithm decides among several traversal cases induced by the geometry of the problem. The input of the algorithm is a ray R, the axis-aligned bounding box AB(ν), and the position and orientation of a splitting plane Π that splits AB(ν) into two new ABs associated with the child nodes. From this input data we can compute the intersection point I between R and Π, and the entry point A and the exit point B between R and AB(ν). The relationships between I, A, and B then specify which child nodes are to be visited and in which order. Let us classify all possible traversal cases. The classification is depicted in Fig. 5.3 for one orientation of the splitting plane. In the left column the origins of rays are located below the splitting plane (negative, case N), in the middle column above the splitting plane (positive, case P), and in the right column the origins of rays are embedded in the splitting plane (zero, case Z). A local coordinate system referenced to the splitting plane is taken for classification. We call the child node below the splitting plane left and the child node above the splitting plane right. The classification depicts rays with external origin with a thick cross, and a possible internal origin is denoted by a thin cross. A ray can have an internal origin (a ≤ 0) in all the cases depicted in Fig. 5.3, excluding cases N5 and P5, because these would become equivalent to cases P1 and N1, respectively.

Figure 5.3: Classification of mutual positions between a ray and a kd-tree interior node.

5.4.2

Analysis of TAArec

Here we analyze TAArec and show our motivation for designing a new recursive ray traversal algorithm. The traversal step based on comparing the signed distances in TAArec suffers from a robustness problem if the origin of a ray is embedded in the splitting plane (traversal cases Z1 and Z3 in Fig. 5.3). In this case TAArec cannot correctly determine whether to traverse the left or the right child. This can lead to an incorrect selection of the child node to be traversed in the next step. The traversal cases N2 and P2 are always solved correctly, because the signed distance is computed as an overflow (positive or negative) and the “near” child is selected. For case Z2 either of the two child nodes can be selected to obtain a correct result. When an incorrect child to visit is selected for cases Z1 and Z3, the result of the RSA is incorrect. The impact on the visual quality of an image in a global illumination algorithm that uses TAArec can be significant. For example, in ray casting, when the viewer position and thus the origin of the primary rays is embedded in the splitting plane of the root node of the kd-tree, part of the image is incorrect (in the worst case it is completely missing). We have found two ways to deal with the robustness problem of TAArec. First, we can add two more conditions to TAArec to recognize the occurrence of cases Z1 and Z3 (if −ε ≤ t ≤ ε, where ε is a small positive constant) and then one more condition to distinguish between Z1 and Z3. This further increases the average cost of a traversal step C̃_TS. Second, we can look at the problem of a ray traversal algorithm from a different perspective, and design a new recursive traversal algorithm. We follow this second way below.

5.4.3

Design of a Recursive Ray Traversal Algorithm TABrec

The uniform way in which the algorithm decides between traversal cases, and its lack of robustness, can be seen as the main disadvantages of TAArec. Therefore, we analyze the problem of traversing a ray in the kd-tree more thoroughly, and design a new recursive ray traversal algorithm, further denoted by TABrec. It has a lower expected average cost for one traversal step C̃_TS and, in addition, it always solves all the traversal cases depicted in Fig. 5.3 correctly.

5.4.3.1

Theoretical Considerations

The idea behind improving the efficiency in TABrec is statistical optimization, and making use of the perpendicularity of the splitting planes to the coordinate axes. Traversal cases occurring less frequently can be performed with a higher cost, and traversal cases occurring more frequently should be performed with a lower cost. The most time-consuming traversal cases of TAArec are N4 and P4, in which the “far” child is pushed onto the stack. The probability of these traversal cases is important for the efficiency of the ray traversal algorithm. We should thus estimate the probability of the various traversal cases. We estimate them using the geometric probability tools already described in Subsubsection 4.2.3.1. We assume that the AB(ν) associated with an interior node ν is split in the spatial median (see Section 4.2.2), subdividing AB(ν) into halves; this corresponds to the BSP tree construction as depicted in Fig. 5.4. Let the size of AB(ν) in E^3 be w × d × h. For the sake of simplicity of the analysis, we further assume that AB(ν) is cubic in shape (w = d = h). In E^3 we need three subdivision steps to get leaves associated with ABs that are cubic in shape again.

Figure 5.4: Three-step subdivision of a box using a spatial median.

We can determine the probabilities that a ray pierces the left node only ($p_{LO}$), the right node only ($p_{RO}$), and both of them ($p_{LR}$, $p_{RL}$). We demonstrate the use of geometric probability to compute $p_{LR} + p_{RL}$, which corresponds to the traversal cases N4 and P4. Further, we denote $p_{LR+RL} = p_{LR} + p_{RL}$ (as shown in Subsection 4.3.1, $p_{LR} = p_{RL}$), since this corresponds to the case when a ray intersects the splitting plane inside the box. The probability $p^{I}_{LR+RL}$ of intersecting the splitting plane put in the first subdivision step ($w = d = h$, $a = w/2$) is computed as:

$$p^{I}_{LR+RL} = \frac{d\,h}{w\,h + w\,d + d\,h} = \frac{1}{3} \tag{5.1}$$

The probability $p^{II}_{LR+RL}$ of intersecting the splitting plane put in the second subdivision step ($h = d$, $a = b = h/2$) is:

$$p^{II}_{LR+RL} = \frac{a\,h}{a\,d + d\,h + a\,h} = \frac{1}{4} \tag{5.2}$$

Similarly, the probability $p^{III}_{LR+RL}$ of intersecting the splitting plane put in the third subdivision step ($a = b = c = h/2$) is:

$$p^{III}_{LR+RL} = \frac{a\,b}{b\,h + a\,h + a\,b} = \frac{1}{5} \tag{5.3}$$

Since the probabilities are rational linear functions, it can be shown that they have no local extremum for any of $w$, $d$, $h$, $a$, $b$, and $c$. Since the subdivision steps are mutually independent starting with the box of cubic shape, we can compute the average probability $p_{LR+RL}$ as follows:

$$p_{LR+RL} = \frac{1}{3}\left(p^{I}_{LR+RL} + p^{II}_{LR+RL} + p^{III}_{LR+RL}\right) = \frac{47}{180} \approx 0.261 \tag{5.4}$$
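The three probabilities and their average can be checked with exact rational arithmetic. The following is a minimal sketch that assumes only the surface-area ratio used in Eqs. (5.1) to (5.3); the function name `plane_hit_probability` and the cube side of 2 are illustrative choices, not taken from the thesis.

```python
from fractions import Fraction


def plane_hit_probability(split_len, side_1, side_2):
    """Probability that a uniformly distributed ray piercing a box also
    crosses the splitting plane perpendicular to the edge `split_len`.

    The plane is a side_1 x side_2 rectangle; the probability is the ratio
    of its area to half the surface area of the box.
    """
    plane_area = Fraction(side_1) * side_2
    half_box_area = split_len * side_1 + split_len * side_2 + plane_area
    return plane_area / half_box_area


# Cubic box w = d = h (the side length 2 is an arbitrary illustrative value).
w = d = h = Fraction(2)
a, b = w / 2, d / 2

p_1 = plane_hit_probability(w, d, h)   # step I:   box w x d x h, plane d x h
p_2 = plane_hit_probability(d, a, h)   # step II:  box a x d x h, plane a x h
p_3 = plane_hit_probability(h, a, b)   # step III: box a x b x h, plane a x b
p_avg = (p_1 + p_2 + p_3) / 3

print(p_1, p_2, p_3, p_avg, float(p_avg))
```

Using `Fraction` keeps the result exact, so the average can be compared directly with the value 47/180 of Eq. (5.4).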

When a ray traverses the whole kd-tree within the recursive ray traversal algorithm without intersecting an object, it can be shown that, for rays with a uniform probability distribution, $p_{LR+RL}$ depends on the sum of the surface areas of the box faces in the kd-tree formed by the splitting planes. Let us remark that the statistically worst case occurs for the spatial median subdivision of the cubic cell: $w = d = h$, $a = b = c = w/2$. In this case the faces corresponding to the splitting planes are also uniformly distributed in the kd-tree, so the probability of intersecting the splitting planes by uniformly distributed rays is the highest possible.

5.4.3.2 Experimental Statistics

We support our theoretical considerations by the results of experiments for testing procedure TPD, which were performed on several scenes from G4SPD (see Section 3.5 for more information about the scenes). Table 5.1 shows the statistics of the traversal cases; the underlying kd-trees were built with the ordinary surface area heuristic.

Probability      balls4    tetra6    mount6    gears4    Average
p_{N1+P1}        0.296     0.18      0.434     0.439     0.337
p_{N2+P2}        0.0004    0.0       0.0002    0.0004    0.0003
p_{N3+P3}        0.328     0.39      0.292     0.345     0.339
p_{N4+P4}        0.244     0.208     0.205     0.158     0.204
p_{N5+P5}        0.133     0.221     0.07      0.057     0.12
p_{Z1+Z2+Z3}     0.0       0.0       0.0       0.0       0.0

Table 5.1: Traversal case statistics for SPD scenes for TPD.

Pop operations from the stack are not included in the table, so that the probabilities correspond to the theoretical analysis. These pop operations are performed with a lower probability than $p_{N4+P4}$, since when a ray hits an object the remaining nodes stored in the stack are not used. We can see that $p_{N4+P4} < p_{LR+RL}$, which enables us to design a new ray traversal algorithm with improved performance if cases other than N4 and P4 are solved efficiently in the new algorithm.

5.4.3.3 New Recursive Ray Traversal Algorithm TABrec

The use of the signed distances for classifying the traversal cases in TAArec leads to the robustness problem. For this reason algorithm TABrec discards the concept of “near” and “far” child nodes and simplifies the decision phase in the traversal step to the following traversal cases:

• visit the left child only,
• visit the left child first and the right child afterwards,
• visit the right child first and the left child afterwards,
• visit the right child only.

Theoretically, these four traversal cases could be distinguished by performing at least $\lceil \log_2 4 \rceil = 2$ comparisons, which is fewer than the number of conditions used in TAArec. For the decision step, the algorithm TABrec uses a decomposition of the traversal cases based on the mutual position of the entry/exit point and the splitting plane, given the splitting plane orientation. Without loss of generality, we further assume that the splitting plane is perpendicular to the x-axis. We show the difference between TAArec and TABrec on the traversal case N1, see Fig. 5.5.
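The decision step described above can be sketched as follows. This is a minimal illustration in Python, not the C-pseudocode of Appendix C; the function name, the string labels, and the treatment of an exit point lying exactly on the splitting plane when the entry point is on the right are assumptions made for the example.

```python
def classify_traversal(x_entry, x_exit, x_split):
    """Decide which children to visit using two coordinate comparisons.

    x_entry, x_exit: projections of the entry and exit points onto the axis
    perpendicular to the splitting plane; x_split: position of the plane.
    """
    if x_entry <= x_split:
        if x_exit <= x_split:
            return "LEFT"          # visit the left child only
        return "LEFT_RIGHT"        # left child first, right child afterwards
    if x_exit >= x_split:          # assumption: ties resolved as "right only"
        return "RIGHT"             # visit the right child only
    return "RIGHT_LEFT"            # right child first, left child afterwards
```

No signed distance is computed in this step; it is only needed afterwards for the two cases in which both children are visited.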

Figure 5.5: Traversal case N1.

Algorithm TAArec: First, the signed distance $t$ to the splitting plane is computed. Then the “near” and “far” children are determined; in Fig. 5.5 the “near” child is below the splitting plane and the “far” child is above the splitting plane. If the signed distance is smaller than zero ($t < 0.0$), the “near” child is selected.

Algorithm TABrec: If the projection of entry point A onto the current axis (the x-axis, which corresponds to the orientation of the splitting plane) is less than or equal to the position of the splitting plane ($x_A \le x_I$) and the projection of exit point B onto the current axis is less than or equal to the position of the splitting plane ($x_B \le x_I$), the child below the splitting plane (left) is selected.

The algorithm TABrec does not require the signed distance to the splitting plane to distinguish between traversal cases. This is enabled by the orthogonality of the splitting planes in the axis-aligned form of the kd-tree. TABrec then uses a comparison between the coordinates of the entry/exit point and the position of the splitting plane, so the classification of traversal cases is simplified to a comparison between real numbers. The penalty paid for this low-cost traversal step is the need to compute all coordinates of the new exit point lying on the splitting plane for the traversal cases N4 and P4. This is the most time-consuming part of TABrec. As we have shown above, the cases N4 and P4 occur with sufficiently low probability ($p \approx 0.26$) so that the total efficiency of TABrec can be improved in comparison with TAArec. The C-pseudocode of the efficient version of TABrec is given in Appendix C.

5.4.3.4 Handling Singular Traversal Cases

The algorithm TABrec deals with the singular cases (Z1, Z2, and Z3) correctly, because the classification is based on a comparison of the splitting plane position and the entry/exit point coordinates. The issue of calculation imprecision is important for computing the signed distance to the splitting plane and thus the new exit point coordinates. Naturally, during the descending phase to the first leaf the signed distance of a new exit point is smaller than that of the current exit point. An incorrect result could occur due to the finite representation of real numbers, and therefore we should analyze this case. Let $u$ denote the difference between one of the coordinates of the origin of a ray and the splitting plane, and let $D_{Ra}$ denote the component of the ray direction for the a-axis. For a normalized ray direction vector, $D_{Ra} \le 1.0$ always holds. The signed distance to the splitting plane is computed as $t = u / D_{Ra}$. From knowledge about the representation of real numbers [1] it can be shown that imprecision can arise for $|D_{Ra}| \ll 1$ and $|u| \gg |D_{Ra}|$. We can show by contradiction that this problem with calculation imprecision does not arise in TABrec, since the range of values stated above cannot occur in the traversal cases N4 and P4.
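The computation of the new exit point for the cases N4 and P4 might look as follows. This is a hedged sketch rather than the code of Appendix C: the function name, the sign convention for u (plane position minus origin coordinate), and the final step that writes the plane position into the result exactly are assumptions made for the illustration.

```python
def point_on_splitting_plane(origin, direction, axis, split):
    """Return (t, point) where the ray origin + t * direction meets the
    axis-aligned splitting plane x[axis] = split.

    Intended only for the cases N4 and P4, where the ray is known to cross
    the plane, so direction[axis] is non-zero.
    """
    u = split - origin[axis]            # difference between plane and origin
    t = u / direction[axis]             # signed distance t = u / D_Ra
    point = [o + t * d for o, d in zip(origin, direction)]
    point[axis] = split                 # assumption: snap onto the plane
    return t, point
```

Writing the plane position into the coordinate along the split axis avoids a round-off error in exactly the coordinate that the next traversal step compares against.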

5.4.4 Comparison between TAArec and TABrec

The time complexity of traversal cases in TAArec and TABrec can be expressed in terms of the arithmetical operations required. Table 5.2 shows the traversal complexity for all traversal cases in the algorithm TAArec.

Traversal case    <     ±     ×     /     Inc   Dec
N1, P1            5     4     1     0     1     0
N2, P2            5     3     1     0     1     0
N3, P3            5     3     1     0     1     0
N4, P4            9     5     1     0     1     1
N5, P5            5     5     1     0     1     0
Z2                5     3     1     0     1     0
pop               3     1     0     0     0     1

Table 5.2: Traversal complexity for the algorithm TAArec in terms of arithmetic operations.

Table 5.3 gives the traversal complexity in terms of arithmetical operations for all traversal cases in the algorithm TABrec.

Traversal case    <     ±     ×     /     Inc   Dec
N1, P1            2     3     0     0     0     0
N2, P2            2     3     0     0     0     0
N3, P3            2     3     0     0     0     0
N4                11    5     3     2     1     1.3
P4                11    4     3     2     1     1.3
N5, P5            2     3     0     0     0     0
Z1                2     4     0     0     0     0
Z2, Z3            2     3     0     0     0     0
pop               3     1     0     0     0     0

Table 5.3: Traversal complexity for the algorithm TABrec in terms of arithmetic operations.

Below, we compare TAArec and TABrec based on the average number of arithmetical operations required for one traversal step. The probabilities for different traversal cases are obtained by averaging the results of experiments for several scenes. Since the recursive ray traversal algorithm also includes the pop operation, this must be included in the statistics, unlike in Table 5.1. That is why the probability values presented in Table 5.1 differ from those presented in Table 5.4. Since objects can be hit during the traversal, the probability of the pop operation $p_{pop}$ is smaller than the probability $p_{N4+P4}$. The probabilities in Table 5.4 are the most unfavorable for algorithm TABrec, since they include the highest $p_{N4+P4}$.


Probability      Without pop    With pop
p_{N1+P1}        0.308          0.262
p_{N2+P2}        0.0002         0.0002
p_{N3+P3}        0.294          0.25
p_{N4+P4}        0.261          0.2236
p_{N5+P5}        0.134          0.114
p_{Z1+Z2+Z3}     0.0            0.0
p_{pop}          –              0.15
Σp               1.0            1.0

Table 5.4: Probabilities used for comparison of TAArec and TABrec.

Table 5.5 shows a comparison of TAArec and TABrec based on the number of arithmetical operations to be computed. The number of these operations was computed as a weighted sum of the counts of the arithmetical operations shown in Table 5.2 and Table 5.3, using the weights given in Table 5.4. In the statistically worst case, the theoretical speedup of TABrec over TAArec reaches a value of about 1.2.

Arithmetical operation    <       ±       ×       /       Inc     Dec
TAArec [-]                5.60    3.07    0.85    0.0     0.85    0.37
TABrec [-]                4.01    3.03    0.67    0.45    0.22    0.29
TAArec / TABrec [-]       1.40    1.01    1.27    –       3.79    1.28

Table 5.5: Comparison between TAArec and TABrec based on number of arithmetical operations.

We also compared TAArec and TABrec experimentally. The results of the experiments for all G3SPD, G4SPD, and G5SPD SPD scenes are included in Appendix E and are summarized in the next section. Obviously, the parameters TR, Θrat, and ΘRUN of the minimum testing output (see Section 2.5) differ. In summary, the cost of one traversal step of TABrec was 22% lower on average than that of TAArec. Our earlier results [82], from experiments performed on a different computer architecture (MIPS R4400, 200 MHz, running Irix 6.2), showed that the cost of the traversal step of TABrec can be decreased even more: it was 47% lower on average than for TAArec, ranging from 35% to 53% depending on the input scene. The results of these earlier experiments are presented in Table 5.6. We do not want to speculate why the results of experiments performed on different computer architectures differ, since this is a problem inherent to the points HW and COMP, see Section 2.7. A possible reason is that the MIPS architecture can better execute the part of TABrec that computes the traversal cases N4 and P4, since most of the computation can be performed using more than one floating point unit. The ray traversal algorithm TABrec handles all the traversal cases correctly and always performs better than TAArec. An interesting property of TABrec is that the entry and exit points, which in TABrec, unlike TAArec, are always computed, can be used directly for ray-bounding box culling [136].
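The weighted sums behind Table 5.5 can be recomputed from the operation counts and the “With pop” probabilities. The sketch below does this for the multiplication, division, Inc, and Dec columns only; the dictionary layout and the names `PROB`, `TAA`, `TAB`, and `expected_ops` are illustrative assumptions, and the cases N4 and P4 share one entry because their counts are equal in these four columns.

```python
# Probabilities from Table 5.4, column "With pop".
CASES = ["N1+P1", "N2+P2", "N3+P3", "N4+P4", "N5+P5", "Z", "pop"]
PROB = dict(zip(CASES, [0.262, 0.0002, 0.25, 0.2236, 0.114, 0.0, 0.15]))

# Operation counts per traversal case, in the order of CASES.
# TAArec counts follow Table 5.2, TABrec counts follow Table 5.3.
TAA = {
    "mul": [1, 1, 1, 1, 1, 1, 0],
    "div": [0, 0, 0, 0, 0, 0, 0],
    "inc": [1, 1, 1, 1, 1, 1, 0],
    "dec": [0, 0, 0, 1, 0, 0, 1],
}
TAB = {
    "mul": [0, 0, 0, 3, 0, 0, 0],
    "div": [0, 0, 0, 2, 0, 0, 0],
    "inc": [0, 0, 0, 1, 0, 0, 0],
    "dec": [0, 0, 0, 1.3, 0, 0, 0],
}


def expected_ops(counts):
    """Average number of operations per traversal step (weighted sum)."""
    return sum(PROB[case] * n for case, n in zip(CASES, counts))


for op in ("mul", "div", "inc", "dec"):
    print(op, round(expected_ops(TAA[op]), 2), round(expected_ops(TAB[op]), 2))
```

The same function applies to the remaining columns once the counts of N4 and P4 are weighted separately, which requires an assumption about how $p_{N4+P4}$ is divided between the two cases.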

Scene                             balls4    tetra6    mount6    gears4
T_R (TAArec) [s]                  94.21     7.86      122.4     307.7
T_TS (TAArec) [s]                 45.58     4.34      52.97     58.52
T_TS (TABrec) [s]                 21.71     2.81      24.83     32.22
r_TS = TABrec / TAArec [-]        0.48      0.65      0.47      0.55

Table 5.6: Experimental comparison between TAArec and TABrec for MIPS R4400, 200 MHz, running the operating system Irix 6.2. Parameter $T_R$ refers to the total running time of ray tracing (TPD), $T_{TS}$ to the time devoted to traversing the kd-tree (computed from $1 - \Theta_{rat}$, $T_R$, $\Theta_{APP}$, and $\Theta_{RUN}$), and $r_{TS}$ shows the ratio between the times devoted to traversing the kd-tree for TAArec and TABrec.

5.5 Summary of Results

In this section we summarize the results of experiments on all ray traversal algorithms described above. Since the testing procedure TPD, and thus the input set of query rays, is the same, only the following parameters of the minimum testing output are influenced by the ray traversal algorithm:

• Subset Σ: $N_G$ – for TANLT the number of generic nodes is increased by the nodes of neighbor-links trees,
• Subset ∆: $\tilde{N}_{TS}$ – the average number of traversal steps per ray,
• Subset Θ: $T_R$, $\Theta_{rat}$, and $\Theta_{RUN}$ – the time portion of RSA devoted to traversing the kd-tree, since the average cost of traversal steps $\tilde{C}_{TS}$ and $\tilde{N}_{TS}$ is specific to ray traversal algorithms. The time devoted to traversing the kd-tree, $T_{TS}$, is computed from $1 - \Theta_{rat}$, $T_R$, $\Theta_{APP}$, and $\Theta_{RUN}$.

The results of the experiments are given in full in Appendix E; here we describe the settings used. Each setting is described for a line X, which corresponds to the line numbering in Appendix E. The following ray traversal algorithms were tested:

• Lines 33, 38, and 43: TAseq – sequential ray traversal algorithm (Section 5.3.1),
• Lines 34, 39, and 44: TAArec – recursive traversal algorithm (Section 5.3.2),
• Lines 35, 40, and 45: TABrec – recursive traversal algorithm (Section 5.4.3),
• Lines 36, 41, and 46: TASNL – traversal algorithm with single neighbor-links (Section 5.3.3.2),
• Lines 37, 42, and 47: TANLT – traversal algorithm with neighbor-links trees (Section 5.3.3.3).

We decided to verify the time required by the ray traversal algorithms on kd-trees built with different numbers of leaves. We used the following three settings to build the kd-trees on these lines in Appendix E:

• Lines 33, 34, 35, 36, and 37: setting CT1 – OSAH + ad hoc termination criteria: maximum leaf depth $d_{max} = 16$ and number of objects in leaves $N_{max} = 2$ (Subsubsection 4.2.4.1),
• Lines 38, 39, 40, 41, and 42: setting CT2 – OSAH + ad hoc termination criteria: maximum leaf depth $d_{max} = 18$ and number of objects in leaves $N_{max} = 2$,
• Lines 43, 44, 45, 46, and 47: setting CT3 – OSAH + automatic termination criteria: $k_1 = 1.2$, $k_2 = 2.0$, $r^{q}_{min} = 0.75$, $K^{1}_{fail} = 1.0$, and $K^{2}_{fail} = 0.2$ (Section 4.5).

The experiments were performed using the testing procedure TPD; in total, we performed 30 × 3 = 90 experiments for each ray traversal algorithm. The ray sets induced by TPD contain rays with both external and internal origin. The known origins of higher order rays were used to initiate the traversal when testing the ray traversal algorithms with neighbor-links (TASNL and TANLT). Below, we compare the ray traversal algorithms quantitatively. As a reference for comparison we take the recursive ray traversal algorithm TABrec, since in most cases it achieved the best results. We can summarize the results of the experiments as follows:

• TAseq – the slowest ray traversal algorithm for all experiments performed. It was in all cases slower than the reference algorithm TABrec. The time for traversing the kd-tree ($T_{TS}$) was 60% higher on average than the time for the reference algorithm, in the best case 12% (“gears9” with CT3 setting), and in the worst case 153% (“lattice29” with CT1 setting). The time $T_R$ was increased on average by 37%, at least by 8%, and at most by 100% in comparison with TABrec. The average number of traversal steps per ray was considerably higher than for the reference algorithm, on average by 135%, at least by 69% (“mount4” with CT1 setting) and at most by 215% (“jacks5” with CT1 setting).

• TAArec – the traversal algorithm with reasonable performance. It was also slower than the reference ray traversal algorithm TABrec for all 90 experiments. The time for traversing the kd-tree ($T_{TS}$) was 16% higher on average than for the reference algorithm, in the best case 5% (“tree15” with CT2 setting), and in the worst case 36% (“tetra8” with CT1 setting). The time $T_R$ was increased on average by 10%, at least by 4% and at most by 20%. The average number of traversal steps per ray was practically equal to that of the reference algorithm for all the experiments, so no visual artefacts due to the lack of robustness of the algorithm were found.

• TASNL – the ray traversal algorithm with single neighbor-links performed better than the reference algorithm for 20 out of 90 experiments. The improved cases cover the experiments for scenes with many secondary rays, where the algorithm TASNL eliminates the down-traversal phase required by the recursive ray traversal algorithm. However, the performance of TASNL for other scenes was lower than for the reference algorithm. The time for traversing the kd-tree $T_{TS}$ was 8% higher on average than for the reference algorithm. In the best case, $T_{TS}$ of TASNL was decreased by 18% compared with the reference algorithm (“mount8” with CT1 setting), and in the worst case it was increased by 75% (“lattice29” with CT1 setting). The time $T_R$ was increased on average by 6% compared with the reference algorithm; in the best case it was 10% lower, and in the worst case it was 62% higher. The average number of traversal steps per ray $\tilde{N}_{TS}$ was decreased on average by 16%, at least by 1% (“jacks3” with CT1 setting) and at most by 49% (“lattice29” with CT2 setting).

• TANLT – for 18 out of 90 experiments the ray traversal algorithm with neighbor-links trees achieved better performance than the reference algorithm. For one experiment there was not enough memory to construct the neighbor-links trees (the scene “lattice29” with CT3 setting). The time for traversing the kd-tree ($T_{TS}$) was 17% higher on average than for the reference algorithm. In the best case $T_{TS}$ was decreased by 16% (“mount4”, “mount6”, and “mount8” with CT3 setting and “mount4” with CT1 setting), and in the worst case it was increased by 307% (“teapot40” with CT1 setting). The time $T_R$ was increased on average by 12% compared with the reference algorithm; in the best case it was 9% lower, and in the worst case it was 198% higher. These results contradict the results of TASNL, since the running times for TANLT should be better than for TASNL. This was probably caused by swapping virtual memory to the disk and by the lack of robustness and precision in measuring the user time of the process in the UNIX environment, since the construction of neighbor-links trees required some additional memory. The number of traversal steps per ray $\tilde{N}_{TS}$ was decreased on average by 24% when compared with the reference algorithm, at least by 8% (“tree8” with CT1 setting) and at most by 50% (scene “lattice29” with CT2 setting).

It is also interesting to compare TANLT directly with TASNL. Concerning subset ∆ of the minimum testing output, the time $T_R$ of TANLT was increased on average, probably due to the above-mentioned swapping of memory that was allocated for neighbor-links trees. The average number of nodes allocated for neighbor-links trees was about six nodes per leaf of the kd-tree, i.e., one node for each leaf-face.

For the sake of clarity, all the average, minimum, and maximum values are summarized in Table 5.7, where TABREC is taken as the reference algorithm.

Algorithm    T̃_TS    m T_TS    M T_TS    T̃_R    m T_R    M T_R    Ñ_TS     m Ñ_TS    M Ñ_TS
TASEQ        60%     12%       153%      37%    8%       100%     135%     69%       215%
TAAREC       16%     5%        36%       10%    4%       20%      0%       0%        0%
TABREC       –       –         –         –      –        –        –        –         –
TASNL        8%      -18%      75%       6%     -10%     62%      -16%     -1%       -49%
TANLT        16%     -16%      307%      12%    -9%      198%     -24%     -8%       -50%

Table 5.7: A summary comparison of ray traversal algorithms, where TABREC is taken as the reference: average, minimum, and maximum values for $T_{TS}$, $T_R$, and $\tilde{N}_{TS}$ for all 90 experiments for one ray traversal algorithm. Testing procedure TPD was used. Symbol M denotes the maximum, m the minimum achieved.

The difference of performance for all ray traversal algorithms when testing on the kd -trees built for the different termination criteria (CT1, CT2, and CT3) was not significant. We only verified that the higher the number of nodes in the kd -tree, the lower the performance of TASEQ compared with TABREC .

5.6 Conclusion and Future Work

In this chapter we described three types of ray traversal algorithms for the kd-tree: one variant of the sequential ray traversal algorithm, two variants of the recursive ray traversal algorithm, and two variants of the ray traversal algorithm with neighbor-links. For a given ray and a scene, all these algorithms identify the same sequence of leaves, but they differ in the number of visited interior nodes and in the way in which the leaves are determined. For the tested scenes, the ray traversal algorithm with the best performance was the recursive ray traversal algorithm TABrec. For scenes with the largest number of higher order rays it was outperformed by ray traversal algorithms with neighbor-links. The recursive ray traversal algorithm TAArec, which suffers from a lack of robustness, and the sequential ray traversal algorithm TAseq were always slower than TABrec.

Future research work could include the development of other ray traversal algorithms with a lower average cost of one traversal step and the smallest possible number of traversal steps. For individual rays, the number of traversal steps of the recursive ray traversal algorithm and of the algorithm with neighbor-links trees cannot be decreased. The limitation is that we have to visit each leaf along the ray path, and we have already reached the minimum number of traversal steps. It possibly remains to decrease further the cost of one traversal step, which need not only be an implementation issue, as we have shown by the design of the recursive traversal algorithm TABrec.

A more promising topic for research is an algorithm that selects one of the implemented ray traversal algorithms for a particular ray shooting query to achieve the best possible performance. The selection algorithm takes as input the known or estimated properties of the input ray with respect to the scene configuration: Is it probable that the ray intersects an object? What is the probable distance of an intersection point? Does the ray have an internal or external origin? When we have a ray with internal origin, is it known which leaf of the kd-tree contains the origin of the ray? The research effort in this direction is intertwined with the applications for which the RSA based on the kd-tree is used.


Chapter 6

Longest Common Traversal Sequences for Kd-Trees

In this chapter we describe two methods utilizing spatial coherence for a particular set of ray shooting queries that decrease the average number of traversal steps per ray and thus the total running time consumed by the RSA. These methods use the concept of the longest common traversal sequence for the kd-tree introduced by Havran and Bittner [78].

6.1 Motivation

We described various ray traversal algorithms in great detail in the previous chapter. Until now we have assumed that solving each single ray shooting query is an individual task independent of solving other ray shooting queries. In global illumination algorithms it is often the case that sets of rays have similar directions and origins. This occurs for primary and higher order rays spawned with the same point of origin, rays between two patches that are used to compute a form factor, etc. This raises the question whether it is possible to use such knowledge about the similarity of rays to further improve the efficiency of an RSA. The basic idea of such an improvement for RSAs based on a spatial subdivision is illustrated in Fig. 6.1. Rays lying within a certain convex shaft must pierce the same sequence of generic and elementary cells of a spatial subdivision. We call this phenomenon traversal coherence.

Figure 6.1: The concept of traversal coherence in two dimensions. An arbitrary ray RX lying between rays RA and RB pierces the same sequence of elementary cells.

We describe two techniques utilizing the concept of traversal coherence. Our first technique determines, for a given convex region (shaft), a longest common traversal sequence for the kd-tree (further abbreviated to LCTS) consisting solely of leaves of the kd-tree. We call the resulting LCTS the simple LCTS (SLCTS). The SLCTS (if it exists) can be used for all rays contained within the shaft. As will be shown later, if no intersected object is found using the current SLCTS, the ray traversal algorithm uses some conventional ray traversal algorithm, such as the one with neighbor-links presented in Subsubsection 5.3.3.3.

The second technique uses a more elaborate treatment of the information gained while traversing the spatial hierarchy. It determines a hierarchical LCTS (HLCTS), which corresponds to a sequence of nodes of the spatial hierarchy. These nodes form a cut of the hierarchy at the level where the order of traversed nodes can no longer be predetermined for all rays located within the shaft. For any ray located in the shaft we can avoid the traversal steps from the root of the hierarchy to the nodes in this sequence.

As we show later, the LCTS concept can be further improved. One extension prunes adjacent empty elementary nodes of a spatial subdivision. Another extension determines a termination object (if it exists) that is hit by all rays located in the shaft.

The rest of the chapter is organized as follows: In Section 6.2 we describe the previous work related to the approach presented here. In Section 6.3 we present the LCTS construction algorithm in detail. Section 6.4 describes utilizing LCTS for ray shooting between two patches and hidden surface removal based on ray casting. Section 6.5 presents results based on a practical implementation. Finally, Section 6.6 concludes the chapter with several possible directions for future research.

6.2 Previous Work

Several papers related to the LCTS concept have been published. Concepts of generalized rays have been introduced: cone tracing [11], beam tracing [87], and pencil tracing [131]. Arvo and Kirk [17] presented a ray classification method which subdivides the five-dimensional ray space. For each cell of this subdivision, a sorted list of objects is constructed and a ray is tested for intersection only with objects corresponding to the elementary cell that is intersected by the ray. Simiakakis and Day presented a technique that improves the space complexity of ray classification by adaptively subdividing the ray space [133]. The memory complexity of this approach was improved by Kwon et al. [102] by reducing the ray space from five to four dimensions. The ray coherence theorem [116] is a generalization of the light buffer [70] approach. It uses directionality of rays and a binary search. Haines and Wallace [71] utilized the concept of a shaft; the ray-object intersection tests are restricted to objects that intersect a shaft connecting two patches. Teller and Alex [155] subdivide the viewing frustum by combining Warnock’s visibility algorithm and beam tracing. A similar use of coherence was presented by Gonzáles and Gisbert for an octree [64]. Pyramid clipping of a spatial subdivision aimed at a parallel implementation of a ray traversal algorithm was presented by van der Zwaan et al. [156]. The technique of directed safe zones utilizing free adjacent elementary cells within a uniform grid was published by Semwal [124]. Recently, Genetti et al. [55] presented an approach for adaptive supersampling in object space, using pyramidal rays. All the methods mentioned utilize some coherence concept; these concepts were surveyed by Gröller [68]. The algorithm presented in this chapter is a combination of directional techniques with a hierarchical spatial subdivision, i.e., with the kd-tree. The proposed combination takes advantage of both the recursive and the neighbor-links ray traversal algorithms of the kd-tree for a certain set of rays.

6.3 LCTS Construction

The LCTS is constructed for a convex shaft defined by a set of boundary rays. Typically, these rays form the edges of a frustum (if they share the origin) or the edges of a tunnel (if the rays are parallel). For each of the boundary rays a traversal history is stored. This information is used to construct the LCTS that is common to all rays belonging to the shaft. We distinguish between two types of LCTS. The first type, a simple LCTS (SLCTS), exploits traversal coherence using only leaf nodes of the hierarchy. The second one, a hierarchical LCTS (HLCTS), also uses traversal coherence in the hierarchical nodes, but requires more computational effort to construct the traversal history.

6.3.1 SLCTS

The concept of SLCTS is depicted in Fig. 6.1 for a uniform grid. Similarly for the kd-tree, we assume a convex shaft defined by several rays that traverse the same sequence of elementary cells of a kd-tree. Then an arbitrary ray lying within the shaft traverses this sequence as well. The origin of the ray has to be positioned in the shaft. There are some potential problems to be solved for SLCTS, as depicted in Fig. 6.2:

• Case 1: No common sequence of leaf nodes exists (Fig. 6.2, case 1).
• Case 2: The rays defining an LCTS share some initial sequence of elementary cells, so the last common cell for them is known. If a ray does not hit any object in this sequence, which cells have to be traversed next? (Fig. 6.2, case 2).

Figure 6.2: Two potential problems of SLCTS to be solved. The numbers mark the depth of the cutting planes in the kd-tree hierarchy.

We use a simple solution to the first problem: we apply any ray traversal algorithm for the kd-tree described in Chapter 5. The second problem can be solved for the kd-tree by using a ray traversal algorithm with neighbor-links (see Subsection 5.3.3). When no object is intersected using the common sequence, the ray traversal algorithm continues to the next leaf along the ray using either single neighbor-links or neighbor-links trees. The ray traversal algorithm with neighbor-links starts at the last cell of the sequence. It is worth mentioning that the concept of SLCTS is applicable not only to the kd-tree, but to any spatial subdivision. It is then advantageous if the ray traversal algorithm for the spatial subdivision can continue from some elementary cell (the last cell of the SLCTS) to the next cell along the ray without the down-traversal phase (e.g., uniform grids).
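The way an SLCTS is used for a single ray, including the two problem cases, can be sketched as follows. This is only an illustration of the control flow described above: the function name and the two callbacks `hit_in_leaf` and `continue_from` are hypothetical stand-ins for the ray-object intersection test in a leaf and for a conventional traversal algorithm from Chapter 5.

```python
def shoot_with_slcts(ray, slcts, hit_in_leaf, continue_from):
    """Test the precomputed leaf sequence first, then fall back.

    slcts:         list of leaves common to all rays of the shaft
    hit_in_leaf:   callback (ray, leaf) -> intersected object or None
    continue_from: callback (ray, last_leaf) -> result of an ordinary
                   traversal; last_leaf is None when the traversal has to
                   start from the root of the kd-tree
    """
    if not slcts:
        # Case 1: no common sequence exists, use any traversal algorithm.
        return continue_from(ray, None)
    for leaf in slcts:
        hit = hit_in_leaf(ray, leaf)
        if hit is not None:
            return hit
    # Case 2: nothing was hit in the common sequence, so continue from its
    # last leaf, e.g. with a neighbor-links traversal.
    return continue_from(ray, slcts[-1])
```

Passing the last leaf to `continue_from` reflects the point made above: a traversal algorithm with neighbor-links can resume there without a down-traversal phase.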

6.3.2 HLCTS

The second proposed method uses the HLCTS and also exploits the traversal coherence of interior nodes of the kd-tree hierarchy. We describe the details of HLCTS construction and the corresponding ray traversal algorithm below.

6.3.2.1 Traversal Trees

A traversal history for a given ray can be stored by means of a traversal tree. The traversal tree is a binary tree, where each node τ of the traversal tree corresponds to a node ν in the kd-tree that was visited in the scope of the traversal. Additionally, the node contains information about traversing the node, which reaches one of the following traversal states: LEFT, RIGHT, LEFT/RIGHT, RIGHT/LEFT, and TERMINATION. The traversal state TERMINATION corresponds either to pierced leaf nodes of the kd-tree or to interior nodes that were pushed on the traversal stack but, as the ray has been terminated, were not used for further traversal. The other traversal states express the order of the traversal of the kd-tree “below” node ν. Additionally, a node τ of the traversal tree contains a pointer to an exit plane, which is a plane bounding the box $\mathcal{AB}(\nu)$ associated with node ν along the ray path (see Fig. 6.3 (a)). The use of the pointer to the exit plane will be described further.
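A traversal tree node, together with the constrained depth-first search that Section 6.3.2.2 runs in parallel on the traversal trees of the boundary rays, might be represented as follows. This is a minimal sketch under stated assumptions: the class and field names, the short state codes, and the representation of a kd-tree node by a plain string are all hypothetical, and the exit-plane pointer is omitted.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class TraversalNode:
    kd_node: str                               # visited kd-tree node (placeholder)
    state: str                                 # "L", "R", "LR", "RL", or "T"
    first: Optional["TraversalNode"] = None    # child visited first
    second: Optional["TraversalNode"] = None   # child pushed on the stack


def build_hlcts(nodes: List[TraversalNode],
                cut: Optional[list] = None) -> List[Tuple[str, list]]:
    """Parallel DFS over one traversal-tree node per boundary ray.

    Descends while all boundary rays agree on a non-terminal traversal
    state; otherwise the kd-tree node is appended to the cut together with
    the pointers to the corresponding traversal-tree nodes.
    """
    if cut is None:
        cut = []
    states = {n.state for n in nodes}
    if len(states) > 1 or "T" in states:
        cut.append((nodes[0].kd_node, list(nodes)))
        return cut
    build_hlcts([n.first for n in nodes], cut)
    if nodes[0].second is not None:
        build_hlcts([n.second for n in nodes], cut)
    return cut
```

For two boundary rays that agree on LEFT/RIGHT at the root but disagree in the left subtree, the resulting cut contains the left child and the right child of the root, in traversal order.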

Figure 6.3: (a) Exit plane: the ray leaves the marked leaf node at the face in the exit plane that is formed by the cutting plane of the root node. (b) Traversal tree corresponding to the ray. Notation for nodes of the traversal tree: L – LEFT, LR – LEFT/RIGHT, R – RIGHT, RL – RIGHT/LEFT, T – TERMINATION.

If node τ is not a leaf, then its left child contains a pointer to the kd-tree node ν1 that was visited first during traversing. The right child of τ (if any) contains a pointer to the kd-tree node ν2 pushed on the traversal stack and thus visited later or unvisited (if the ray has been terminated before reaching this node). See Fig. 6.3 (b), which depicts an example of a traversal tree structure.

6.3.2.2 Constructing Initial HLCTS

The initial HLCTS is constructed using n traversal trees (n > 1) determined for n boundary rays of a given convex shaft. Using convexity, it can be proved by contradiction that the traversal states for some nodes of the kd-tree are the same for all rays within the shaft if the corresponding traversal states for all boundary rays are equal. Traversal of these nodes can be avoided by descending the hierarchy and constructing an ordered sequence of the nodes at which the traversal states for the boundary rays are no longer equal. The HLCTS can be seen as a cut through the kd-tree at the level where the traversal state can no longer be precomputed from the traversal histories of the boundary rays. Fig. 6.4 (a) depicts the boundary rays of a frustum.

The HLCTS construction algorithm performs a constrained depth-first search (DFS) in parallel on all n traversal trees. If the traversal states associated with all n currently reached interior nodes are equal, the algorithm is applied recursively, first on the left child and then on the right child (if any). If the reached nodes are leaves of the traversal trees (state = TERMINATION) or the traversal states are not equal, the HLCTS is extended by the kd-tree node associated with the reached nodes. Additionally, each HLCTS entry contains n pointers to the associated nodes of the traversal trees (see Fig. 6.4 (b)). Their use is explained later in the text.

Once the initial HLCTS has been constructed, it can be used by the ray traversal algorithm to initiate the traversal for all rays within the corresponding shaft. The traversal stack can be filled using all nodes of the HLCTS. Note that ray traversal algorithms usually assume that the entry and exit points are known for the current node. When an HLCTS is used, these points must be computed explicitly for each visited node of the HLCTS, since they have not been determined recursively as in the common ray traversal algorithm. The pointers to the exit plane in the nodes of the traversal trees are used to solve this problem.

Figure 6.4: HLCTS construction and use: (a) Underlying geometry: two boundary rays RA and RB, the ray RC between the boundary rays, and the regions defined by the rays. (b) HLCTS L1 generated from two traversal histories THA and THB corresponding to boundary rays RA and RB. (c) HLCTS L2 generated from traversal histories with different numbers of roots, using traversal histories THA and THC and HLCTS L1. THA is accessed using L1; THC contains four root nodes generated from L1. Notation: the cut marked X is the cut of the traversal history corresponding to HLCTS X; TH – traversal history; THA, THB for boundary rays, THC for a ray between boundary rays.
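The construction described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class and field names (`TraversalNode`, `HLCTSEntry`, `kd_node`, `history`) are hypothetical, kd-tree nodes are represented by plain string identifiers, and the exit-plane pointers are omitted.

```python
from dataclasses import dataclass
from typing import List, Optional

TERMINATION = "T"  # remaining states: "L", "R", "LR", "RL"


@dataclass
class TraversalNode:
    """One node of a traversal tree (field names are illustrative)."""
    kd_node: str                               # id of the visited kd-tree node
    state: str                                 # "L", "R", "LR", "RL" or "T"
    left: Optional["TraversalNode"] = None     # kd-tree child visited first
    right: Optional["TraversalNode"] = None    # child pushed on the stack, if any


@dataclass
class HLCTSEntry:
    kd_node: str                    # kd-tree node where traversal restarts
    history: List[TraversalNode]    # one pointer per boundary ray


def build_hlcts(roots: List[TraversalNode]) -> List[HLCTSEntry]:
    """Constrained DFS run in parallel on the n traversal trees."""
    hlcts: List[HLCTSEntry] = []

    def visit(nodes: List[TraversalNode]) -> None:
        first = nodes[0]
        same_state = all(t.state == first.state for t in nodes)
        if same_state and first.state != TERMINATION:
            # Equal states: descend, first-visited child before the stacked one.
            visit([t.left for t in nodes])
            if first.right is not None:
                visit([t.right for t in nodes])
        else:
            # Leaves of the traversal trees or differing states: cut here.
            hlcts.append(HLCTSEntry(first.kd_node, list(nodes)))

    visit(list(roots))
    return hlcts


# Two boundary rays that agree at the root but differ at kd-tree node "B".
ray_a = TraversalNode("root", "LR",
                      TraversalNode("A", TERMINATION),
                      TraversalNode("B", TERMINATION))
ray_b = TraversalNode("root", "LR",
                      TraversalNode("A", TERMINATION),
                      TraversalNode("B", "L", TraversalNode("B1", TERMINATION)))
print([entry.kd_node for entry in build_hlcts([ray_a, ray_b])])  # ['A', 'B']
```

In this toy input both rays traverse the root in LEFT/RIGHT order, so the root itself is skipped and the cut consists of its two children.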

6.3.2.3

Constructing General HLCTS

The HLCTS for a given shaft can be refined further by constructing HLCTSs for its sub-shafts. However, if the ray traversal algorithm does not start at the root node of the kd-tree, the traversal history no longer corresponds to a single tree. Instead, the traversal history is stored as a sequence of traversal trees whose roots correspond to the nodes of the HLCTS that were used as initial nodes for traversal. We consider constructing an HLCTS for n traversal histories that have one of the following properties:

Property 1: All traversal histories have been generated from the same HLCTS and thus they correspond to sequences of traversal trees of the same length.

Property 2: Only e traversal histories have been generated from the same HLCTS, say L. The other n − e histories were generated earlier and were used to establish L.

The first case can be solved simply by applying the previously mentioned algorithm to all n-tuples of the root nodes. If the algorithm is implemented using a stack, these n-tuples can be initially pushed on the stack in reverse order. In the second case the information stored within L is used. Although the n − e “old” traversal histories correspond to sequences of traversal trees of length l ≠ |L|, each entry of L contains pointers to the traversal tree nodes that correspond to this HLCTS entry. Using these pointers, the “old” traversal histories can be accessed directly at the “level” corresponding to L. Thus the algorithm is applied to n-tuples of traversal history nodes, where e entries correspond to roots of the traversal trees generated using L. The other n − e entries are determined using the appropriate pointers stored within the entries of L. The problem and its solution are illustrated in Fig. 6.4 (c).

6.3.3

Further Improvements

Below we present several improvements that can be used in the construction of both SLCTS and HLCTS.

6.3.3.1

Unification of Empty Leaves

Although the kd-tree is built adaptively with respect to the scene geometry, it can happen that the LCTS contains a subsequence of consecutive entries corresponding to empty leaves of the scene kd-tree. This is depicted in Fig. 6.5. The LCTS construction algorithm can be modified to detect this situation and to replace such a subsequence with a single entry. This approach is analogous to directed safe zones [124] that would be constructed on the fly.
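A minimal sketch of this modification follows, assuming that each LCTS entry records whether its kd-tree node is an empty leaf and which node supplies its exit plane. The names `LCTSEntry`, `is_empty_leaf`, `exit_plane_node` and `append_entry` are hypothetical and serve only as an illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class LCTSEntry:
    kd_node: str            # id of the kd-tree node (illustrative)
    is_empty_leaf: bool     # True if the node is a leaf without objects
    exit_plane_node: str    # id of the node whose plane bounds the exit


def append_entry(lcts: List[LCTSEntry], x: LCTSEntry) -> None:
    """Append x, merging it into the last entry if both are empty leaves."""
    if x.is_empty_leaf and lcts and lcts[-1].is_empty_leaf:
        # The LCTS is not enlarged; the last entry takes over the exit plane of x.
        lcts[-1].exit_plane_node = x.exit_plane_node
    else:
        lcts.append(x)


lcts: List[LCTSEntry] = []
for entry in [LCTSEntry("e1", True, "p1"), LCTSEntry("e2", True, "p2"),
              LCTSEntry("full", False, "p3"), LCTSEntry("e3", True, "p4")]:
    append_entry(lcts, entry)
print([(e.kd_node, e.exit_plane_node) for e in lcts])
```

The two leading empty leaves collapse into one entry that keeps the identifier of the first leaf and the exit plane of the second.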

Figure 6.5: Unification of empty leaves. For two rays with common origin three empty leaves can be found.

The modification of the LCTS construction algorithm is straightforward. If a new LCTS entry X that corresponds to an empty leaf node is to be added, it is first checked whether the last entry Y of the constructed LCTS corresponds to an empty leaf of the kd-tree. If this is the case, the LCTS is not extended by X. Instead, the “exit-plane-node” of X is used to replace the exit-plane-node of Y. In this way the spatial extent corresponding to Y is enlarged properly for all rays within the shaft. This is necessary for the correct behavior of the ray traversal algorithm.

6.3.3.2

Termination Object

A remarkable reduction of traversal steps for rays in a given convex shaft can be obtained by determining a single convex termination object that is hit by all rays in the shaft. This is possible only if the origin of the rays is somehow restricted. This holds, for example, for a pyramidal shaft (frustum) where all rays originate at the same point. If there is a termination object, traversing the kd-tree can be eliminated completely. Using the LCTS we can perform a simple test that optionally determines the termination object for the LCTS. The presented approach is conservative: it does not always find the termination object even if one exists, but it never reports a termination object if none exists for a given shaft. The termination object O exists if all of the following conditions hold:

Condition 1: All boundary rays of the shaft hit the same object O and are terminated in the same cell.

Condition 2: Object O is convex and it is the only object intersecting this cell.

Condition 3: The cells visited before reaching this cell are empty.

If the unification of empty leaves described above is applied, the last condition reduces to one of the following cases:

Condition 3a: The cell corresponds to the first entry of the LCTS.

Condition 3b: The cell corresponds to the second entry and the first entry corresponds to an empty leaf (unification of empty leaves).

6.3.3.3

Initial Leaf Sequence for HLCTS

This improvement is applicable to HLCTS only. If the HLCTS corresponds to a pyramidal shaft (frustum), it can be expected that the first entries of the HLCTS correspond to leaves of the kd-tree. In such a case this initial leaf sequence of the HLCTS forms an SLCTS. With each HLCTS we keep a single value n that denotes the number of leaves at the beginning of the HLCTS. If n = |HLCTS|, the whole sequence consists of leaf nodes of the kd-tree. In this case it forms an SLCTS and cannot be refined any further. The HLCTS construction algorithm can be modified to copy the first n leaf nodes from the “parental” sequence without performing any matching. The previously mentioned HLCTS construction algorithm is then applied starting at the (n + 1)-th node of the HLCTS. Similarly, the index n can be exploited in the ray traversal algorithm, where the first n nodes can be visited without using any traversal stack. If the traversal is terminated before reaching the (n + 1)-th node, stack initialization is avoided completely.
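The use of the index n during traversal can be illustrated as follows. This is a simplified sketch, not the implementation used for the experiments: the function names are hypothetical, HLCTS entries are plain identifiers, and the two callbacks (`intersect_leaf`, `traverse_subtree`) stand in for the ray-object intersection in a leaf and for the ordinary traversal below an interior node.

```python
from typing import Callable, List, Optional, Sequence, Tuple


def count_initial_leaves(is_leaf: Sequence[bool]) -> int:
    """Return n, the number of leaf entries at the beginning of the HLCTS."""
    n = 0
    for flag in is_leaf:
        if not flag:
            break
        n += 1
    return n


def traverse(hlcts: List[str], n: int,
             intersect_leaf: Callable[[str], Optional[str]],
             traverse_subtree: Callable[[str], Optional[str]],
             ) -> Tuple[Optional[str], bool]:
    """Return (hit object or None, whether the traversal stack was initialized)."""
    for node in hlcts[:n]:             # initial leaf sequence: no stack needed
        hit = intersect_leaf(node)
        if hit is not None:
            return hit, False
    stack = list(reversed(hlcts[n:]))  # first remaining entry ends up on top
    while stack:
        hit = traverse_subtree(stack.pop())
        if hit is not None:
            return hit, True
    return None, True


hlcts = ["leaf1", "leaf2", "inner3"]
n = count_initial_leaves([True, True, False])
result = traverse(hlcts, n,
                  lambda node: "teapot" if node == "leaf2" else None,
                  lambda node: None)
print(n, result)  # 2 ('teapot', False)
```

Here the ray terminates in the second leaf, so the stack is never filled.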

6.4

Application of LCTS

Ray shooting with the LCTS concept can be used in many global illumination techniques (for a survey, see [149, 157]) based on discrete sampling of space via rays, e.g., ray tracing, photon tracing, Monte Carlo methods, shadow determination, form factor computation. We discuss here two techniques that can be used in a more general way, namely patch-to-patch visibility and hidden surface removal.

6.4.1

Patch-to-patch Visibility

For the purposes of computing patch-to-patch visibility factors, the HLCTS is more suitable, since it can occur that there is no SLCTS for a given pair of patches. The task is to determine the mutual visibility of two given patches in the scene using ray shooting. We create convex hulls of both patches, then we determine a set of rays that form the boundary rays of a convex shaft between these convex hulls. We construct the traversal tree for each ray in the set and then the corresponding HLCTS, which is subsequently used for any ray between the patches. This application of HLCTS is similar to the concept of shaft culling [71], but there are major differences in the way the desired set of cells intersecting the shaft is obtained. In shaft culling, intersection tests between shafts and cells are performed, which can be costly. The HLCTS technique uses only ray shooting followed by non-geometric computations for HLCTS construction, which is less expensive. The results of applying these two techniques are not the same. Generally, HLCTS determines a superset of the cells determined by classical shaft culling, but in a much more efficient way.

6.4.2

Hidden Surface Removal

Hidden surface removal that uses ray shooting is usually called ray casting. For this purpose, both HLCTS and SLCTS are suitable; a significant reduction of traversal steps can be achieved by using a termination object as described in Section 6.3.3.2. The common origin of the rays (the viewpoint) implies that the initial sequence of common nodes in the LCTS is likely to be a sequence of leaves (SLCTS). The more paraxial the rays, the longer the initial SLCTS. For cells farther from the viewpoint, the rays are less likely to generate the same sequence of kd-tree leaves. Thus the scene geometry, the kd-tree properties, and the image resolution influence the level of utilization of traversal coherence. There are several ways to exploit the LCTS concept for hidden surface removal. We can treat an image as a one-dimensional array of one-dimensional arrays (scanline approach, using traversal coherence in one dimension) or as a two-dimensional array (using traversal coherence in two dimensions). Since both SLCTS and HLCTS techniques can be applied, we have exactly four cases:

SLCTS-1D: Scanline with SLCTS – this approach is basically undersampling on a scanline, creating an SLCTS for two adjacent samples and using this SLCTS to compute the samples between them. This scheme is depicted in Fig. 6.6 (a).

HLCTS-1D: Scanline with HLCTS – this approach can be implemented as above, but traversal coherence is better utilized by combining HLCTS with bisection. The initial HLCTS is incrementally refined. The scheme is depicted in Fig. 6.6 (b).

SLCTS-2D: Two dimensions with SLCTS – the sampling can be performed as undersampling with windows of nx × ny pixels. Having the sequences of traversed leaves for the four corners of a rectangle, the SLCTS is created and used for all the rays inside the rectangle. SLCTS-1D can be seen as a special case for ny = 1. The scheme is depicted in Fig. 6.6 (c).

HLCTS-2D: Two dimensions with HLCTS – the bisection is applied in two dimensions; the splitting axis is changed regularly. At the beginning, four rays corresponding to the pixels in the image corners are cast. In one bisection step, four rays are cast again and two new rectangles are created. See Fig. 6.6 (d).

We should point out that all these sampling schemes can be applied more successfully when the image resolution is high with respect to the space subdivision projected onto the image plane. In this case, many areas in the image have a common traversal sequence, which often forms an SLCTS. An example of such a projection is depicted in Fig. 4.17.
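As a rough illustration of the SLCTS-based schemes, the sketch below derives a common leaf sequence from the leaf sequences of the sample rays (two samples for SLCTS-1D, four corners for SLCTS-2D). It assumes, as a simplification, that the SLCTS can be approximated by the longest common prefix of these sequences; the function name and the string identifiers of leaves are hypothetical.

```python
from typing import List, Sequence


def common_leaf_prefix(leaf_sequences: Sequence[Sequence[str]]) -> List[str]:
    """Longest common initial sequence of leaves over all sample rays."""
    prefix: List[str] = []
    for leaves in zip(*leaf_sequences):   # stops at the shortest sequence
        if any(leaf != leaves[0] for leaf in leaves):
            break
        prefix.append(leaves[0])
    return prefix


# Leaf sequences of the four corner rays of one sampling window.
corners = [["a", "b", "c"], ["a", "b", "d"], ["a", "b", "c", "e"], ["a", "b"]]
print(common_leaf_prefix(corners))  # ['a', 'b']
```

An empty result corresponds to a window for which no common initial leaf exists, in which case the rays inside the window would be traversed in the ordinary way.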


Figure 6.6: Hidden surface removal sampling patterns for LCTS : (a) SLCTS -1D (b) HLCTS -1D (c) SLCTS -2D (d) HLCTS -2D. The numbers mark the order in which the rays are cast.

Note that SLCTS can also be used with bisection, but since the rays corresponding to corner pixels are far from being paraxial, they do not generate any SLCTS . We have verified experimentally that the undersampling method as described here is more efficient for the SLCTS concept.

6.5

Results

We implemented all the sampling techniques described above for hidden surface removal and the HLCTS approach for ray shooting between two patches. For testing, we again used the SPD scenes [69], but due to a lack of space we report here only a subset of the G4SPD scenes (8 out of 10 scenes). The kd-trees for all G4SPD scenes were built using the ordinary surface area heuristic with late cutting off empty space (see Chapter 4) for ad hoc termination criteria with this setting: dmax = 16 and Nmax = 2. The basic scene properties and the number of leaves of the constructed kd-trees are listed in Table 6.1. More data on the scenes, describing their scene complexities, were presented in Subsection 3.5.1.

Scene               balls4    gears4 lattice12    mount6    rings7  teapot12    tetra6    tree11
objects               7382      9345      8281      8196      8392      9264      4096      8191
spheres               7381         -      2197         4      4195         -         -      4095
polygons                 1      9345         -      8192         1      9264      4096         1
cones                    -         -         -         -         1         -         -      4095
cylinders                -         -      6084         -      4195         -         -         -
kd-tree leaves        5253     13830     25689      8554     12924      2387      2972      3426

Table 6.1: Properties of testing scenes and kd-trees built for experiments.

6.5.1

Patch-to-patch Visibility

We have observed that it is difficult to predict whether HLCTS construction pays off for patch-to-patch visibility. If only a small number of rays is cast between the patches, the time required for HLCTS construction is not recovered later. For tens or hundreds of rays between the patches, HLCTS construction can be worthwhile, particularly when the patches are incrementally refined, as occurs in hierarchical radiosity algorithms. The reduction in the number of traversal steps depends on the shape and the positioning of the constructed shaft in the scene and on the configuration of the objects in the scene. The more elongated the shaft and the deeper the kd-tree, the greater the improvement in efficiency that can be achieved. We found that the results achieved experimentally vary greatly, depending on the above-mentioned conditions, and we do not provide any quantitative results here for two reasons. First, the SPD scenes are not composed of planar primitives, so they are inconvenient for testing patch-to-patch visibility. Second, we are not aware of any standard and well-described algorithm that provides a set of pairs of patches for a given scene in a similar way as hidden surface removal provides a set of rays, as described below.

6.5.2

Hidden Surface Removal

We tested hidden surface removal using ray shooting with the recursive ray traversal algorithm TABrec, the ray traversal algorithm with neighbor-links trees TANLT, and the LCTS traversal algorithms SLCTS-1D, SLCTS-2D, HLCTS-1D, and HLCTS-2D. Since applying any LCTS technique influences only the number of traversal steps, the average cost of one traversal step, and thus the whole running time (i.e., ÑTS, ÑETS, ÑEETS of subset ∆ and TR, Θrat, and ΘRUN of subset Θ of the minimum testing output, see Chapter 2), we present here only a subset of the parameters, which nevertheless enables us to evaluate the results given by the LCTS techniques. The ratio rITM of the number of intersection tests to the minimum number of intersection tests and the average number of traversal steps ÑTS for primary rays at 1024 × 1024 image resolution are shown in Table 6.2 for all the traversal methods. The recursive ray traversal algorithm TABrec was used as a reference for comparison. The running times in the two tables include the construction of LCTSs, which are built on the fly, so no additional preprocessing is performed. Table 6.3 shows the sensitivity of the different ray traversal algorithms to the image resolution for the scene “teapot12”.

Note that for these tests on hidden surface removal the ray traversal algorithm with neighbor-links trees TANLT is not efficient compared to the recursive ray traversal algorithm. The first reason is that the cost of one traversal step for TANLT is slightly higher than for the recursive ray traversal algorithm. The second reason is that for hidden surface removal for the tested scenes, rays are cast from outside the scene, so many empty leaves have to be traversed before hitting an object. Fig. 6.7 visualizes the traversal coherence for the scene “mount6” and the method SLCTS-2D. It can be seen that most pixels in the projection have at least one common initial leaf node.
All the experiments were conducted on the SGI O2 , MIPS R10000, 180 MHz, 256 MBytes RAM, running the Irix 6.3 operating system. All the ray traversal algorithms tested were implemented within the GOLEM rendering system [75].

6.5.3

Discussion

Successful use of the LCTS concept for patch-to-patch visibility depends on the number of rays shot between the patches. Similarly, the benefit of LCTS for hidden surface removal via ray casting depends on the image resolution. Let us discuss in detail the properties of hidden surface removal using ray shooting with LCTS. The performance improvement is scene dependent, but this is the case for all heuristic RSAs. It follows from the results that hierarchical traversal steps to the first leaf are successfully avoided, and the total number of traversal steps is typically decreased by more than 60% (for the scene “lattice12” by as much as 78%). This corresponds to an improvement in performance of 20% on average, since most of the computation is then devoted to ray-object intersections. (The total running time TR does not include the remaining application time; the ratio of the time of the RSA to the time of the whole ray tracing is scene dependent and ranges from 40% to 75% for the test scenes used, see [86] and the results in Appendix E.)

Figure 6.7: Visualization of the traversal coherence of SLCTS-2D for the scene “mount6”. (a) Normal ray tracing. (b) Blended with the color for pixels: white – sampling pixels, red – pixels for which there exists a common traversal sequence with at least one leaf, green – pixels for which the terminating object was found, blue – pixels for which it is known that no object can be hit, black – pixels for which no LCTS was found. (c) A zoomed-in part of image (b), where the sampling pattern is better visible.

Note that the recursive ray traversal algorithm TABrec (see Section 5.4) for an arbitrary ray, used as a reference, is highly optimized for the SGI architecture. This means that improving the performance of the recursive ray traversal algorithm by another 20% on average using the LCTS concept is significant.

Special attention should be given to the setting of the undersampling resolution for the SLCTS approaches. We can observe a tradeoff between the undersampling resolution and the possible existence of an SLCTS and/or its properties. If we take fewer samples, the constructed SLCTS is less likely to reduce the number of traversal steps. The number of samples needed to construct one SLCTS was always constant, either two (SLCTS-1D) or four (SLCTS-2D). Let us take pixels on a scanline and construct SLCTS-1D for two pixels. Let nx be the distance between the two pixels. We can then use the constructed SLCTS for n = nx − 2 pixels if such an SLCTS exists. The undersampling resolution nx = 5 pixels is a reasonable compromise. The same holds for SLCTS-2D, when we set the undersampling resolution to 5 × 5 pixels. In general, SLCTS-2D enables us to use the SLCTS (if it exists) for n = nx ny − 4 − (ny − 2) − (nx − 2) = nx ny − nx − ny pixels. For high resolution images, nx (and ny for SLCTS-2D) can be set to an even higher value.
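As a worked example of the pixel counts stated above, take the undersampling resolution used in the experiments, nx = 5 for SLCTS-1D and nx = ny = 5 for SLCTS-2D. The labels n_1D and n_2D are introduced here only to distinguish the two cases.

```latex
\begin{align*}
n_{\mathrm{1D}} &= n_x - 2 = 5 - 2 = 3, \\
n_{\mathrm{2D}} &= n_x n_y - 4 - (n_y - 2) - (n_x - 2)
                 = n_x n_y - n_x - n_y
                 = 25 - 5 - 5 = 15 .
\end{align*}
```

The two-dimensional count can also be written as (nx − 1)(ny − 1) − 1, which gives the same value, 4 · 4 − 1 = 15.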

We can see that two-dimensional LCTS methods are better able to exploit coherence properties than one-dimensional LCTS methods. The HLCTS performs more efficiently than SLCTS with 5 × 5 sampling only for high resolution images, mainly because the cost of HLCTS construction is higher than that of SLCTS construction, and thus it is recovered only for larger regions of the image, where the HLCTS forms an SLCTS.

6.6

Conclusion and Future Work

In this chapter we have studied a new way of exploiting coherence in ray shooting algorithms based on the kd-tree for certain ray sets that induce some similarity between rays. We tried to avoid as many hierarchical traversal steps within the spatial hierarchy as possible, and at the same time to preserve all the advantages of using the hierarchy. We introduced two concepts of the longest common traversal sequence: SLCTS and HLCTS. The presented techniques decrease the number of traversal steps for hidden surface removal based on ray casting, typically by more than 60%. For high resolution images the reduction of traversal steps is even more remarkable. There are several possible topics for future research based on the LCTS concept. The application of LCTS could be studied for higher-order rays in the context of particular global illumination algorithms. An LCTS can be applied if the rays of the first order (primary rays) have the same termination object and create a shaft that is then completely reflected or refracted. In addition, image space sampling patterns suitable for LCTS application other than those studied here should be investigated. The automatic setting of the SLCTS undersampling resolution based on scene properties and the use of LCTS in rendering animation sequences are also possible topics for future research. A further research topic could be an algorithm determining whether it is efficient to construct an LCTS for two given patches in the scene for some application such as form factor computation.

Parameter                 balls4    gears4 lattice12    mount6    rings7  teapot12    tetra6    tree11
rITM [-]                    7.40      4.37      8.81      5.29      10.9      4.82      7.59      10.1

Ray traversal algorithm TABrec
ÑTS^1024 [-]                30.4      16.6      49.9      30.7      41.3      22.2      13.7      23.2
(TR − Tapp)^1024 [s]        15.4      48.1      25.6      12.9      28.6      11.5      7.46      16.1
α1024 [-]                   1.00      1.00      1.00      1.00      1.00      1.00      1.00      1.00
η1024 [-]                   1.00      1.00      1.00      1.00      1.00      1.00      1.00      1.00
η4096 [-]                   1.00      1.00      1.00      1.00      1.00      1.00      1.00      1.00

Ray traversal algorithm TANLT
ÑTS^1024 [-]                26.2      15.6      19.1      26.8      34.2      19.0      11.6      16.9
(TR − Tapp)^1024 [s]        16.0      48.2      23.5      13.4      31.9      12.2      8.65      16.7
α1024 [-]                  0.862     0.940     0.383     0.873     0.828     0.856     0.847     0.728
η1024 [-]                   1.04      1.00     0.918      1.03      1.12      1.06      1.16      1.03
η4096 [-]                   1.02      1.00     0.913      1.02      1.01      1.06      1.15      1.02

SLCTS-1D ray traversal algorithm, window 5 × 1 pixels
ÑTS^1024 [-]                13.2      7.13      11.6      11.9      15.7      8.46      5.70      9.83
(TR − Tapp)^1024 [s]        14.7      47.7      20.8      11.0      28.6      10.6      8.32      15.9
α1024 [-]                  0.434     0.430     0.234     0.388     0.380     0.381     0.416     0.424
η1024 [-]                  0.958     0.990     0.812     0.853     0.999     0.925      1.16     0.988
η4096 [-]                  0.846     0.977     0.785     0.787     0.947     0.871     0.987     0.947
γ1024 [-]                  0.964     0.871     0.999     0.938     0.981     0.833     0.560     0.999

HLCTS-1D ray traversal algorithm
ÑTS^1024 [-]                10.1      5.58      13.8      8.52      11.0      6.35      4.65      7.66
(TR − Tapp)^1024 [s]        14.7      48.1      22.3      11.8      27.5      10.6      7.39      15.6
α1024 [-]                  0.332     0.336     0.277     0.278     0.266     0.286     0.339      0.33
η1024 [-]                  0.958     0.998     0.872     0.913     0.960     0.929     0.990     0.970
η4096 [-]                  0.820     0.972     0.775     0.752     0.868     0.816     0.851     0.895

SLCTS-2D ray traversal algorithm, window 5 × 5 pixels
ÑTS^1024 [-]                12.4      6.46      10.8      10.7      13.8      7.73      5.45      9.13
(TR − Tapp)^1024 [s]        12.6      46.7      19.7      9.99      27.2      9.69      7.24      15.1
α1024 [-]                  0.408     0.389     0.216     0.349     0.334     0.348     0.398     0.394
η1024 [-]                  0.819     0.971     0.768     0.772     0.949     0.846      0.97     0.939
η4096 [-]                  0.749     0.956     0.730     0.691     0.890     0.796     0.892     0.896
γ1024 [-]                  0.919     0.780     0.999     0.916     0.971     0.798     0.525     0.990

HLCTS-2D ray traversal algorithm
ÑTS^1024 [-]                11.3      6.37      14.8      9.35      12.7      7.31      5.32      8.61
(TR − Tapp)^1024 [s]        13.6      47.5      21.7      10.4      26.4      9.60      6.31      14.6
α1024 [-]                  0.372     0.384     0.297     0.305     0.308     0.329     0.388     0.371
η1024 [-]                  0.887     0.987     0.846     0.805     0.921     0.838     0.846     0.903
η4096 [-]                  0.800     0.964     0.782     0.691     0.857     0.751     0.719     0.855

Table 6.2: Comparison of ray traversal algorithms for hidden surface removal based on ray casting. rITM – the ratio of the number of ray-object intersection tests to the minimum number of ray-object intersection tests. ÑTS^1024 – the average number of traversal steps per ray for a specific ray traversal algorithm and resolution 1024 × 1024. (TR − Tapp)^1024 – the running time for ray shooting only (= const · ΘRUN) for a specific traversal algorithm for resolution 1024 × 1024. α1024 – the ratio between the number of traversal steps for a specific ray traversal algorithm and the number of traversal steps of TABrec for resolution 1024 × 1024. η1024 – the ratio between the running time of the specific traversal algorithm and the running time of TABrec for resolution 1024 × 1024. η4096 – the ratio between the running time of a specific ray traversal algorithm and the running time of TABrec for resolution 4096 × 4096. γ1024 – the ratio of the number of SLCTS sequences that contain at least one leaf to the number of all possible sequences for resolution 1024 × 1024.

Resolution               256 × 256     512 × 512   1024 × 1024   2048 × 2048   4096 × 4096

Ray traversal algorithm TABrec
ÑTS [-]                       22.1          22.2          22.2          22.2          22.2
TR − Tapp [s]                0.734          2.89          11.5          45.6           182
α [-]                         1.00          1.00          1.00          1.00          1.00
η [-]                         1.00          1.00          1.00          1.00          1.00

Ray traversal algorithm TANLT
ÑTS [-]                       18.9          19.0          19.0          19.0          19.0
TR − Tapp [s]                0.792          3.08          12.2          48.4           193
α [-]                        0.855         0.856         0.856         0.856         0.856
η [-]                        1.080         1.066         1.064         1.060         1.057

SLCTS-1D traversal algorithm, window 5 × 1 pixels
ÑTS [-]                       11.4          9.87          8.46          7.31          6.41
TR − Tapp [s]                0.741          2.77          10.6          40.6           159
α [-]                        0.516         0.445         0.381         0.329         0.289
η [-]                        1.010         0.959         0.924         0.889         0.871

HLCTS-1D traversal algorithm
ÑTS [-]                       10.7          8.30          6.35          4.91          3.94
TR − Tapp [s]                0.782          2.87          10.6          39.6           149
α [-]                        0.484         0.374         0.286         0.221         0.177
η [-]                        1.065         0.990         0.929         0.868         0.815

SLCTS-2D traversal algorithm, window 5 × 5 pixels
ÑTS [-]                       11.6          9.65          7.73          6.09          4.75
TR − Tapp [s]                0.683          2.55          9.69          36.9           145
α [-]                        0.525         0.435         0.348         0.274         0.213
η [-]                        0.930         0.883         0.846         0.809         0.796

HLCTS-2D traversal algorithm
ÑTS [-]                       12.4          9.70          7.31          5.43          4.09
TR − Tapp [s]                0.711          2.60          9.60          35.8           137
α [-]                        0.561         0.437         0.329         0.244         0.184
η [-]                        0.969          0.90         0.838         0.784         0.751

Table 6.3: Comparison of ray traversal algorithms for primary rays for the scene “teapot12” at different resolutions. ÑTS – the number of traversal steps per ray for a particular ray traversal algorithm. TR − Tapp – the running time for ray shooting only (= const · ΘRUN) for a specific ray traversal algorithm. α – the ratio between the number of traversal steps for a specific traversal algorithm and the number of traversal steps of TABrec. η – the ratio between the running time of the specific ray traversal algorithm and the running time of TABrec.

Chapter 7

Memory Mapping of Kd-Trees

In this chapter we deal with the representation of the kd-tree in the main memory of a computer. We show that for computer architectures with a large cache line size, the way the nodes of the kd-tree are mapped into the main memory influences the time needed to transfer data from the main memory to the processor, and hence the cost of one traversal step of ray traversal algorithms. We describe and analyze a few ways of mapping kd-tree nodes to memory, and also provide the results obtained from experiments.

7.1

Motivation

The most important operation carried out on a kd-tree in any application is the exhaustive traversal of interior nodes and leaves; for example, the traversal in depth-first-search (DFS) order or the ray traversal algorithm described in detail in Chapter 5. Input/output efficient algorithms and data structures for memory hierarchies have attracted noticeable research interest [31, 119, 120] in the last decade. The design of these data structures is driven by the properties of the external/internal memory hierarchy. For example, Nyberg et al. [115] described a sorting algorithm that takes the memory hierarchy into account. Unlike the special algorithms for external memory data structures, in this chapter we deal with the internal memory hierarchy between the processor and the main memory, including either an on-chip cache or a second-level cache. The main difference between this hierarchy and that for external memory is its size and the access time to one data block. The values for the external memory hierarchy are much larger than those between the processor cache and the main memory. Moreover, techniques for external memory data structures were developed mostly for one-dimensional search problems. For example, the well-known B-tree [34] cannot be used to decrease the time complexity of the kd-tree traversal for n > 1, since the B-tree cannot represent the n-dimensional data that are the subject of the kd-tree. We analyze novel methods to increase the spatial locality of data in the cache and thus to decrease the running time of any algorithm that works over the kd-tree. For the sake of simplicity of the theoretical analysis we will assume that we traverse the kd-tree in DFS order from the root node to a leaf.

7.2

Preliminaries

In this section we recall a few technical facts necessary to understand the concept of mapping kd-tree nodes to memory. This includes memory allocation techniques and the structure of the memory hierarchy in a computer.

7.2.1

Memory Allocation

The basic topic of this chapter is the mapping of kd-tree nodes to addresses in the main memory. The main memory of a computer is principally a one-dimensional array of memory cells where data are stored. The allocation and deallocation of dynamic variables in the main memory is always provided by procedures of a software library that is usually called the memory allocator. Let us suppose that a contiguous block of unoccupied memory is assigned to the memory allocator at the beginning of a program that builds up a kd-tree. This memory block is used to assign addresses within the block to the allocated variables so that the variables do not overlap. We call this memory block a memory pool. Since the mapping of variables into memory by a memory allocator is crucial for an understanding of this chapter, we discuss it here in detail. A common solution for allocating variables is to use a general memory allocator. Each node of the kd-tree is then represented as a separately allocated variable. Let MI denote the size of the memory needed to store the information in a node, that is, the position and the orientation of the splitting plane. Let MP be the size of a pointer. Then the size required to represent one interior node of the kd-tree is MIN = MI + 2 MP. Using the general memory allocator requires that we store with each allocated variable two additional pointers that are required later to free the variable from the memory pool. We can also use another strategy to allocate memory for a node of the kd-tree. We use a fixed-size memory allocator, described for example in [141], to allocate variables of the same type and thus of the same fixed size MV. We can then dedicate a special memory pool to the interior nodes of the kd-tree, since these nodes are of the same fixed size. During kd-tree construction the nodes are allocated from the memory pool in linear order, as from an array.
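The fixed-size strategy can be pictured with the following sketch, in which interior nodes are handed out from preallocated arrays in linear order and child links are array indices rather than pointers. The class `InteriorNodePool`, its fields, and the byte sizes used in the example (MI = 8, MP = 4) are assumptions chosen for illustration; they are not taken from the text.

```python
from typing import List


class InteriorNodePool:
    """Fixed-size pool: interior nodes are handed out from arrays in linear order."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.used = 0
        self.axis: List[int] = [0] * capacity          # orientation of the splitting plane
        self.position: List[float] = [0.0] * capacity  # position of the splitting plane
        self.left: List[int] = [-1] * capacity         # index of the left child, -1 if unset
        self.right: List[int] = [-1] * capacity        # index of the right child, -1 if unset

    def allocate(self, axis: int, position: float) -> int:
        """Return the index of the next free node slot."""
        if self.used == self.capacity:
            raise MemoryError("node pool exhausted")
        index = self.used
        self.axis[index] = axis
        self.position[index] = position
        self.used += 1
        return index


def interior_node_size(m_i: int, m_p: int) -> int:
    """Size of one interior node, MIN = MI + 2 MP."""
    return m_i + 2 * m_p


pool = InteriorNodePool(capacity=3)
root = pool.allocate(axis=0, position=0.5)
child = pool.allocate(axis=1, position=0.25)
pool.left[root] = child
print(root, child, interior_node_size(m_i=8, m_p=4))  # 0 1 16
```

Because the slots are consecutive, nodes allocated one after another end up adjacent in memory, which is the property the mapping schemes in this chapter rely on.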

7.2.2

Memory Hierarchy

The time complexity of a ray traversal algorithm performed on the kd-tree is connected with the hardware used. The cost of the ray traversal step $C_{TS}$ includes the time needed to transfer the data from the main memory to the processor. Let us recall the organization and the properties of the memory hierarchy. For the analysis we suppose a Harvard architecture with separate caches for instructions and data. Let $T_{MM}$ denote the latency of the main memory (the time to read or write one data block). The larger the memory and the smaller the access time, the higher the cost of the memory. The instruction/data latency of processors is significantly smaller than $T_{MM}$. That is why a cache is placed between the memory and the processor. The cache is a memory of relatively small size with respect to the size of the main memory. The cache latency $T_C$ is smaller than $T_{MM}$. This solution is economically advantageous: it exploits the temporal and spatial locality of data exhibited by a typical program, so the average access time to the data in the main memory can be significantly reduced. Data between the cache and the main memory are transferred in blocks that correspond to the architecture of the cache memory, and only one block is transferred at a time. The size of the block is referred to as the cache line size $M_{CL}$. A typical memory hierarchy is depicted in Fig. 7.1. For the analysis we use only one cache placed between the processor and the main memory.

To quantify the running time we express the time consumed by operations in processor cycles. Let $T_W$ denote the average processing time on a node of the kd-tree needed to decide whether to follow its left or right descendant. Typical values for superscalar processors and for a typical application using kd-trees are $T_{MM} = 53$, $T_C = 4$, $T_W = 5$ cycles, and $M_{CL} = 128$ Bytes for MIPS R8000 [130]. These values are used again in the numerical examples for the formulas further in the text. Note that for a typical searching algorithm on the kd-tree $T_W \ll T_{MM}$ holds.


Figure 7.1: Typical memory hierarchy.

7.3

Representations of the kd -tree

As we have already stated, the kd-tree is represented by a binary tree. In general, a binary tree with an arbitrarily specified splitting plane inside the interior nodes does not represent a valid instance of the kd-tree, since each splitting plane has to intersect the axis-aligned bounding box associated with the corresponding node. This is one reason why the decomposition induced by the kd-tree cannot simply be replaced by a B-tree or by some hashing scheme commonly used for one-dimensional search problems. The information stored in an interior node of the kd-tree is the orientation and the position of the splitting plane. A leaf contains just the pointer to the list of objects required by all ray traversal algorithms, and additional data for some ray traversal algorithms (see Chapter 5 for details). Here we suppose that the axis-aligned bounding box is known explicitly for the root node only, and that the bounding boxes associated with the interior and leaf nodes are not stored explicitly in these nodes. Such a kd-tree can be used in the recursive ray traversal algorithm. Under the assumptions stated above, the analysis performed below disregards the n-dimensionality of the kd-tree, so we consider the kd-tree as a binary tree.

Let us recall some terminology concerning binary trees that is needed for the analysis. We call a binary tree complete if all its leaves are positioned at the same depth $d$ from the root node, so that the number of leaves is $2^d$. We call a binary tree incomplete when it is not complete. Let $h_C$ denote the complete height of a binary tree $X$, defined as the maximum depth for which the binary tree formed by the nodes of $X$ is complete.

Below we describe in detail four representations of the kd-tree in memory. These include a common method for representing kd-tree nodes using a general memory allocator, which we call random representation. A less known and less used method is DFS order representation. (The use of DFS order representation can be unintentional.) Finally, we describe two forms of a subtree representation that further decrease the average cost of one traversal step of the kd-tree.

7.3.1

Random Representation

A common way to store an arbitrary kd-tree in the main memory is to represent each node as a separate variable using a general memory allocator. The representation is depicted in Fig. 7.2 (a). This representation requires additional memory for the pointers used by the general memory allocator for each allocated variable, but it is the simplest technique to implement. The addresses of the nodes in the main memory have no connection with the location of the nodes in the kd-tree. Assume that two additional pointers are needed to allocate the variable, since the general memory allocator requires the pointers to deallocate the variable. Then the memory size $M_S$ consumed by random representation to store $N_{NO}$ nodes of the kd-tree is:

$$M_S^{random} = N_{NO} \cdot (4 M_P + M_I) \tag{7.1}$$

Figure 7.2: Kd-tree representations (cache line size $M_{CL} = 3 \cdot \mathrm{sizeof}(\text{kd-tree node})$): (a) Random (b) DFS (c) Subtree.

7.3.2

Depth-First-Search (DFS) Representation

A DFS representation uses a fixed-size memory allocator, which was described above. In this representation the nodes are placed consecutively in the memory pool in linear order as the kd-tree is built in DFS order, see Fig. 7.2 (b). The size of the memory consumed to represent $N_{NO}$ nodes of the kd-tree is:

$$M_S^{DFS} = N_{NO} \cdot (2 M_P + M_I) \tag{7.2}$$

Thus $2 N_{NO} M_P$ of memory is saved compared with random representation, since the two pointers that the general memory allocator would have required are no longer needed.
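A small sketch may help to see both the node ordering and the memory saving. The tuple-based tree format and the function names below are assumptions made for illustration only. The first function assigns pool slots in DFS (preorder) order, so a left child always lands in the slot directly after its parent; the other two evaluate Eq. 7.1 and Eq. 7.2.

```python
def dfs_slots(tree):
    """Assign consecutive pool slots in DFS (preorder) order.

    `tree` is a nested tuple (name, left, right); a missing child is None.
    Returns a dict mapping node name to slot index.
    """
    slots = {}
    stack = [tree]
    while stack:
        node = stack.pop()
        if node is None:
            continue
        name, left, right = node
        slots[name] = len(slots)   # next free slot in the pool
        stack.append(right)        # right subtree is laid out later
        stack.append(left)         # left child is laid out immediately
    return slots


def memory_random(n_nodes, m_p, m_i):
    """Eq. 7.1: two child pointers plus two allocator pointers per node."""
    return n_nodes * (4 * m_p + m_i)


def memory_dfs(n_nodes, m_p, m_i):
    """Eq. 7.2: only the two child pointers per node."""
    return n_nodes * (2 * m_p + m_i)


example = ("A", ("B", ("D", None, None), ("E", None, None)), ("C", None, None))
slots = dfs_slots(example)
saved = memory_random(1000, 4, 4) - memory_dfs(1000, 4, 4)
```

For the example tree, node B (the left child of A) is placed in the slot right after A, which is the property that the DFS representation exploits during traversal.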

7.3.3

Subtree Representation

Here we describe a new type of mapping of kd-tree nodes to memory, originally introduced by Havran [72] and further elaborated in [73]. This kd-tree representation reduces the cost of a traversal step in ray traversal algorithms performed on the kd-tree. Let us describe the representation in detail. For subtree representation we also use a fixed-size memory allocator, similarly to DFS representation, but the size of one allocated variable is equal to the cache line size $M_{CL}$. A variable of size $M_{CL}$ is filled by an explicit algorithm with the nodes of the kd-tree organized into a subtree. All nodes of the kd-tree are organized into such subtrees, see Fig. 7.2 (c). Once a subtree has been read into the cache from the main memory during traversal, the access time to the nodes of the subtree is equal to the cache latency $T_C$. The subtree need not be complete.

We distinguish between two subtree representations, see Fig. 7.3. An ordinary subtree has all nodes of the same size, with two pointers to its descendants, regardless of whether the descendant lies in the subtree. A compact subtree has no pointers among the nodes inside the subtree, because their addressing is provided explicitly by the traversal algorithm. The pointers are needed only to point between the subtrees. The leaves of an incomplete subtree are then marked in a special variable stored in each subtree (one bit for each node in the subtree). The size of the memory taken by the two subtree representations is given in the next section.
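One way to picture pointer-free addressing inside a compact subtree is the following sketch. The heap-style numbering of nodes (children of node i at 2i+1 and 2i+2), the constant and function names, and the layout of the exit pointers are assumptions chosen for illustration; the text above does not prescribe a specific numbering. The sizes use the 4-byte values and the 128-byte cache line from the numerical examples of this chapter.

```python
M_I, M_P, M_ST, M_CL = 4, 4, 4, 128   # bytes, as in the numerical examples

SUBTREE_HEIGHT = 3
NODES_PER_SUBTREE = 2 ** (SUBTREE_HEIGHT + 1) - 1    # 15 nodes
EXIT_POINTERS = 2 ** (SUBTREE_HEIGHT + 1)            # 16 pointers to other subtrees


def child_slot(node, go_left):
    """Locate a child without storing any pointer inside the subtree.

    Nodes are numbered in heap order (assumption). If the child index falls
    outside the subtree, the step leaves the subtree through an exit pointer.
    """
    child = 2 * node + (1 if go_left else 2)
    if child < NODES_PER_SUBTREE:
        return ("node", child)
    return ("exit", child - NODES_PER_SUBTREE)


def is_leaf(leaf_mask, node):
    """One bit per node marks leaves of an incomplete subtree."""
    return (leaf_mask >> node) & 1 == 1


subtree_bytes = NODES_PER_SUBTREE * M_I + EXIT_POINTERS * M_P + M_ST
```

With these values a complete subtree of height 3 fills the cache line exactly, which matches the numerical example given later for the compact subtree representation.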


Figure 7.3: Subtree representation: (a) Ordinary (b) Compact.

7.4

Time Complexity and Cache Hit Ratio Analysis

Here we analyze the time complexity of a DFS order traversal for all kd-tree representations. The theoretical analysis assumes that the kd-tree node data stored in the main memory are not loaded into the cache, i.e., the cache hit ratio is $r_{CH} = 0.0$. Further, we suppose that the kd-tree is complete and that its height is $h_l$. An incomplete kd-tree requires that we compute its average height $\bar{h}_l$ and substitute it for $h_l$. These simplifications enable us to express the average traversal time $T_A$ on the kd-tree in DFS order from its root to a leaf. We compute $T_A$ for an example kd-tree of height $h_l = 23$. Further, we suppose random visiting of the nodes and the probability $p_L = 0.5$ that we turn left in a node. This way of traversing nodes corresponds to the sequential ray traversal algorithm that performs the point-location search. If some data are already located in the cache ($r_{CH} > 0.0$), an analysis using known mathematical tools can be very difficult or even infeasible [13]. Therefore we investigated that case by means of simulation.

7.4.1

Random Representation

We assume a cache hit ratio $r_{CH} = 0.0$ during the whole traversal, i.e., the processing time of each kd-tree node is $T_{MM} + T_W$. Since the number of nodes along the traversal path from the root to the leaf is $h_l + 1$, we can express the average traversal time $T_A$ as follows:

$$T_A = (h_l + 1) \cdot (T_{MM} + T_W) \tag{7.3}$$

For the values given above ($T_{MM} = 53$, $T_W = 5$, $h_l = 23$) we obtain $T_A = 1392.0$ cycles.

7.4.2

DFS Representation

DFS representation increases the cache hit ratio because the descendant nodes needed for the next traversal step(s) are read into the cache as a side effect whenever the DFS traversal continues to the left descendant(s) of the current node. Assuming that the size of the kd-tree node is $M_{IN} = M_I + 2 \cdot M_P$, we derive the average traversal time $T_A$ as follows:

$$T_A = (h_l + 1) \cdot \left[ p_L \cdot \left( T_{MM} \cdot \frac{M_{IN}}{M_{CL}} + T_C \cdot \left( 1 - \frac{M_{IN}}{M_{CL}} \right) \right) + T_W + (1 - p_L) \cdot T_{MM} \right] \tag{7.4}$$

For $M_{IN} = 4 + 2 \cdot 4 = 12$, $M_{CL} = 128$, and $p_L = 0.5$ we obtain $T_A = 859.1$ cycles.

7.4.3

Ordinary Subtree Representation

Assume that $M_{CL}$ and $M_{IN}$ are given. Let $M_{ST}$ be the size of the memory needed in each subtree to represent the subtree type identification. We express the size of the memory taken by a complete ordinary subtree of height $h$:

$$M(h) = (2^{h+1} - 1) \cdot M_{IN} + M_{ST}, \qquad M(h) \le M_{CL} \tag{7.5}$$

From Eq. 7.5 we derive the complete height of the ordinary subtree $h_C$:

$$h_C = \left\lfloor -1 + \log_2 \left( \frac{M_{CL} - M_{ST}}{M_{IN}} + 1 \right) \right\rfloor \tag{7.6}$$

The number of nodes in the incomplete ordinary subtree at the depth $d = h_C + 1$ is then:

$$N_{ODK} = \left\lfloor \frac{M_{CL} - (2^{h_C+1} - 1) \cdot M_{IN} - M_{ST}}{M_{IN}} \right\rfloor \tag{7.7}$$

The average height of the subtree $h_A$ for $N_{ODK} > 0$ is computed as follows:

$$h_A = -1 + \log_2 \left( 2^{h_C+1} + N_{ODK} \right) \tag{7.8}$$

Finally, the average traversal time for the whole kd-tree of height $h_l$ is:

$$T_A = (h_l + 1) \cdot \left( T_W + \frac{T_{MM} + T_C \cdot h_A}{h_A + 1} \right) \tag{7.9}$$

The subtrees are placed in the main memory so that they are aligned with the cache lines when read into the cache. Each subtree is thus stored in one cache line. The size of the unused memory in the cache line is then:

$$M_{unused}^{OSR} = M_{CL} - (2^{h_C+1} - 1 + N_{ODK}) \cdot M_{IN} - M_{ST} \tag{7.10}$$

For $M_{IN} = 12$ and $M_{ST} = 4$, we get $h_C = 2$, $N_{ODK} = 3$, $h_A = 2.46$, $M_{unused}^{OSR} = 4$, and the average traversal time $T_A = 555.9$ cycles.

7.4.4

Compact Subtree Representation

Let $M_I$ be the size of the memory needed to represent the information in a kd-tree node and $M_P$ the memory taken by one pointer. The size of the memory consumed by a complete subtree of height $h$ is expressed as follows:

$$M(h) = (2^{h+1} - 1) \cdot M_I + 2^{h+1} \cdot M_P + M_{ST}, \qquad M(h) \le M_{CL} \tag{7.11}$$

The complete height $h_C$ of the subtree is derived from Eq. 7.11 similarly to Eq. 7.6 as follows:

$$h_C = \left\lfloor -1 + \log_2 \frac{M_{CL} + M_I - M_{ST}}{M_I + M_P} \right\rfloor \tag{7.12}$$

In the same way as for ordinary subtree representation, we derive the number of nodes $N_{ODK}$ located at the depth $d = h_C + 1$ in the subtree:

$$N_{ODK} = \left\lfloor \frac{M_{CL} - 2^{h_C+1} \cdot (M_I + M_P) + M_I - M_{ST}}{M_I + M_P} \right\rfloor \tag{7.13}$$

The unused memory for one subtree in the cache line can be derived similarly as for an ordinary subtree:

$$M_{unused}^{CSR} = M_{CL} - (2^{h_C+1} - 1 + N_{ODK}) \cdot M_I - 2 \cdot M_P \cdot \left( N_{ODK} + 2^{h_C} - \frac{N_{ODK}}{2} \right) - M_{ST} \tag{7.14}$$

The average height of the subtree $h_A$ and the average traversal time $T_A$ are computed using Eq. 7.8 and Eq. 7.9. For $M_P = 4$, $M_I = 4$, and $M_{ST} = 4$ we compute $h_C = 3$, $N_{ODK} = 0$, $h_A = 3.0$, $M_{unused}^{CSR} = 0$, and $T_A = 510.0$ cycles. The values of $h_C$, $N_{ODK}$, and $h_A$ as functions of the cache line size for ordinary and compact subtree representations, and $T_A$ for all kd-tree representations, are depicted in Fig. 7.4.
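The numerical examples of Sections 7.4.1 to 7.4.4 can be re-derived with a short script. This is a sketch under stated assumptions: the function names are hypothetical, the logarithm is taken to base 2, $T_{MM} = 53$ cycles is used (the value listed in Section 7.5), and the grouping of terms in the DFS formula is one reading of Eq. 7.4 that reproduces the quoted 859.1 cycles.

```python
import math


def time_random(h_l, t_mm, t_w):
    """Eq. 7.3: every node is fetched from the main memory."""
    return (h_l + 1) * (t_mm + t_w)


def time_dfs(h_l, t_mm, t_c, t_w, m_in, m_cl, p_l):
    """Eq. 7.4 (assumed grouping): a left step may hit the current cache line."""
    miss_share = m_in / m_cl
    left_cost = t_mm * miss_share + t_c * (1.0 - miss_share)
    return (h_l + 1) * (p_l * left_cost + t_w + (1.0 - p_l) * t_mm)


def ordinary_subtree(m_cl, m_in, m_st):
    """Eqs. 7.6, 7.7, 7.10: complete height, extra nodes, unused bytes."""
    h_c = math.floor(-1 + math.log2((m_cl - m_st) / m_in + 1))
    n_odk = (m_cl - (2 ** (h_c + 1) - 1) * m_in - m_st) // m_in
    unused = m_cl - (2 ** (h_c + 1) - 1 + n_odk) * m_in - m_st
    return h_c, n_odk, unused


def compact_subtree(m_cl, m_i, m_p, m_st):
    """Eqs. 7.12, 7.13, 7.14 for the pointer-free subtree."""
    h_c = math.floor(-1 + math.log2((m_cl + m_i - m_st) / (m_i + m_p)))
    n_odk = (m_cl - 2 ** (h_c + 1) * (m_i + m_p) + m_i - m_st) // (m_i + m_p)
    pointers = 2 ** (h_c + 1) + n_odk     # equals 2*(N_ODK + 2^h_C - N_ODK/2)
    unused = m_cl - (2 ** (h_c + 1) - 1 + n_odk) * m_i - pointers * m_p - m_st
    return h_c, n_odk, unused


def average_height(h_c, n_odk):
    """Eq. 7.8; for n_odk == 0 it simply returns h_c."""
    return -1 + math.log2(2 ** (h_c + 1) + n_odk)


def time_subtree(h_l, t_mm, t_c, t_w, h_a):
    """Eq. 7.9: one main-memory fetch per subtree, cache hits inside it."""
    return (h_l + 1) * (t_w + (t_mm + t_c * h_a) / (h_a + 1))


H_L, T_MM, T_C, T_W, M_CL = 23, 53, 4, 5, 128
osr = ordinary_subtree(M_CL, 12, 4)
csr = compact_subtree(M_CL, 4, 4, 4)
t_osr = time_subtree(H_L, T_MM, T_C, T_W, average_height(osr[0], osr[1]))
t_csr = time_subtree(H_L, T_MM, T_C, T_W, average_height(csr[0], csr[1]))
```

Evaluating `time_random(H_L, T_MM, T_W)` and `time_dfs(H_L, T_MM, T_C, T_W, 12, M_CL, 0.5)` gives the first two entries of the theoretical row; `t_osr` and `t_csr` give the two subtree entries up to rounding of $h_A$.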

7.5

Simulation Results

We implemented a special program to simulate the data transfer in a typical memory hierarchy of a computer that runs DFS traversal on a complete kd-tree. The simulation was carried out for the same memory hierarchy and kd-tree properties as in the previous section: $T_{MM} = 53$, $T_C = 4$, $T_W = 5$, $h_l = 23$, $M_P = 4$ Bytes, $M_I = 4$ Bytes, $M_{ST} = 4$ Bytes, and a four-way set associative cache with cache line size $M_{CL} = 2^7 = 128$ Bytes; the size of the cache was $2^{20}$ Bytes. The cache placement algorithm and its structure correspond to those found in superscalar processors; we simulated the MIPS R8000/R10000 [130].
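A toy version of such a simulation is sketched below; it is not the program used for the results in this chapter. The LRU replacement policy, the class and function names, and the simplification that a node is charged to the cache line containing its first byte are all assumptions. The sketch models a set-associative cache and random root-to-leaf walks over a complete tree stored in DFS order, where the left child follows its parent directly and the right child follows the whole left subtree.

```python
import random
from collections import OrderedDict


class SetAssociativeCache:
    """Set-associative cache with LRU replacement (policy is an assumption)."""

    def __init__(self, size_bytes, line_bytes, ways):
        self.line_bytes = line_bytes
        self.ways = ways
        self.num_sets = size_bytes // (line_bytes * ways)
        self.sets = [OrderedDict() for _ in range(self.num_sets)]

    def access(self, address):
        """Return True on a cache hit; load the line on a miss."""
        line = address // self.line_bytes
        entries = self.sets[line % self.num_sets]
        if line in entries:
            entries.move_to_end(line)        # mark as most recently used
            return True
        if len(entries) >= self.ways:
            entries.popitem(last=False)      # evict the least recently used line
        entries[line] = True
        return False


def dfs_child_index(index, depth, height, go_left):
    """Index of a child in a complete tree of given height stored in DFS order."""
    return index + 1 if go_left else index + 2 ** (height - depth)


def simulate_dfs_layout(height, node_bytes, cache, walks, p_left=0.5, seed=0):
    """Average cache hit ratio over random root-to-leaf walks."""
    rng = random.Random(seed)
    hits = accesses = 0
    for _ in range(walks):
        index = 0
        for depth in range(height + 1):
            hits += cache.access(index * node_bytes)
            accesses += 1
            if depth < height:
                go_left = rng.random() < p_left
                index = dfs_child_index(index, depth, height, go_left)
    return hits / accesses


cache = SetAssociativeCache(size_bytes=2 ** 20, line_bytes=128, ways=4)
ratio = simulate_dfs_layout(height=12, node_bytes=12, cache=cache, walks=1000)
```

The height of 12 in the usage line is chosen only to keep the run short; the hit ratios it produces are not comparable with the tables of this chapter.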

Figure 7.4: The analysis: (A) Average traversal time $T_A(M_{CL})$ for all kd-tree representations, (B) $h_A(M_{CL})$, (C) $N_{ODK}(M_{CL})$, (D) $h_C(M_{CL})$ for subtree representations; Representations: (a) Random (b) DFS (c) Ordinary subtree (d) Compact subtree.

| Parameter | random | DFS | ordinary subtree | compact subtree |
|---|---|---|---|---|
| $T_A$ [cycles] (theoretical) | 1392.0 | 859.1 | 555.9 | 510.0 |
| $T_A'$ [cycles] (simulated) | 987.1 | 629.4 | 445.6 | 379.3 |
| $r = T_A / T_A'$ [-] | 1.41 | 1.36 | 1.24 | 1.34 |
| $\tilde{r}_{CH}$ [%] | 35.8 | 69.8 | 83.5 | 90.3 |

Table 7.1: The average traversal time computed theoretically and obtained by the simulation, ratio of traversal times, and cache hit ratio from the simulation for DFS traversal.

The theoretical and simulated times, and their ratio, are summarized in Table 7.1. The parameter $\tilde{r}_{CH}$ is the average cache hit ratio for accessing a kd-tree node in the cache during DFS traversal. The average cache hit ratio for a node as a function of its depth in the kd-tree is shown in Table 7.2.

| Depth | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| $\tilde{r}_{CH}$ (random) | 100 | 100 | 100 | 100 | 97 | 91 | 62 | 52 | 39 | 25 | 21 | 18 |
| $\tilde{r}_{CH}$ (DFS) | 100 | 100 | 100 | 100 | 100 | 93 | 79 | 84 | 58 | 56 | 63 | 51 |
| $\tilde{r}_{CH}$ (ordinary subtree) | 100 | 100 | 100 | 100 | 100 | 100 | 97 | 73 | 90 | 85 | 53 | 79 |
| $\tilde{r}_{CH}$ (compact subtree) | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 69 | 100 | 100 | 100 |

| Depth | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 | 22 | 23 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| $\tilde{r}_{CH}$ (random) | 21 | 19 | 19 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| $\tilde{r}_{CH}$ (DFS) | 57 | 59 | 47 | 59 | 54 | 48 | 51 | 49 | 47 | 54 | 43 | 54 |
| $\tilde{r}_{CH}$ (ordinary subtree) | 80 | 64 | 66 | 79 | 66 | 70 | 72 | 74 | 61 | 75 | 74 | 62 |
| $\tilde{r}_{CH}$ (compact subtree) | 7 | 100 | 100 | 100 | 1 | 100 | 100 | 100 | 0 | 100 | 100 | 100 |

Table 7.2: The average cache hit ratio $\tilde{r}_{CH}$ [%] as a function of node depth in the kd-tree.

Note that for $M_{CL} = 128$ the compact subtree is complete, so the cache hit ratio for all the nodes at the same depth in the kd-tree is equal. This is why the values of $\tilde{r}_{CH}$ for depths 12, 16, and 20 are quite different from the values of $\tilde{r}_{CH}$ for neighboring depths: the kd-tree nodes at these depths are often read first from the main memory. The probability that these nodes are already loaded in the cache decreases with increasing depth in the kd-tree. The average traversal times obtained by simulation correlate well with those computed theoretically. The times obtained by the simulation are smaller than those derived theoretically, since the theoretical analysis supposes in each step an initial cache hit ratio of $r_{CH} = 0.0$.

7.6

Results of Experiments

We tested the influence of memory mapping experimentally for the recursive ray traversal algorithm TAArec for random, DFS, and ordinary subtree representation (in [72]). We decided not to implement the compact subtree representation, since accessing leaf nodes in subtrees without pointers is more complex and also requires a special traversal algorithm. The theoretical analysis above also shows that it would not bring a significant improvement in performance compared with ordinary subtree representation. The tests were performed on a subset of G4SPD scenes from SPD for the testing procedure TPD (ray tracing). The kd-tree representation influences only the average time of one traversal step, which has an impact on $T_R$, $\Theta_{rat}$, and $\Theta_{RUN}$ of the minimum testing output (see Section 2.5). The results of the experiments for TPD are summarized in Table 7.3. The experiments were conducted on an SGI O2 with a MIPS R8000 processor, 180 MHz, 128 MBytes RAM ($M_{CL} = 128$), running the Irix 6.1 operating system and using the GOLEM rendering system.

7.7

Discussion

The performance of the recursive ray traversal algorithm is not improved as significantly as could have been expected from the simulation described in Section 7.5. The first reason is that the simulation performed DFS with a 50% probability of turning left in each node of a complete binary tree. This no longer holds for the testing procedure TPD, in which subsequent primary rays are generated in scanline order. Such similar rays are likely to hit the same sequence of kd-tree nodes (traversal coherence, see Chapter 6). The second reason is the smaller number of nodes in the kd-trees built for the test scenes compared with the number of kd-tree nodes used for the simulation. The third possible reason is that the model describing the transfer of data between the processor and the main memory is still too simplified to model the real behavior of a memory hierarchy system based on the MIPS R8000 processor.

We have shown that the properties of the kd-tree representation in memory for traversal algorithms such as those in RSAs stay in the range given by the theoretical analysis and by the results of the simulation given in the previous section. The impact of the kd-tree representation on RSA performance is not as high as might have been expected from the theoretical analysis and simulation, since the order of the nodes visited in the recursive ray traversal algorithm when it is applied in the testing procedure TPD differs from that induced by DFS.

| Parameter | balls4 | gears4 | mount6 | rings7 | tetra6 | tree11 | average |
|---|---|---|---|---|---|---|---|
| $N$ | 7382 | 9345 | 8196 | 8401 | 4096 | 8191 | – |
| $N_G + N_E$ | 8469 | 20541 | 13369 | 23455 | 6011 | 5675 | – |
| $\tilde{N}_{TS}$ [$\times 10^6$] | 53.8 | 77.4 | 68.1 | 47.4 | 5.6 | 37.2 | – |
| $T_R$ (random) [s] | 87.0 | 314.6 | 53.0 | 102.4 | 7.41 | 92.6 | – |
| $T_{TS}$ (random) [s] | 42.1 | 59.8 | 22.9 | 21.5 | 4.09 | 18.6 | – |
| $T_{TS}$ (DFS) [s] | 35.6 | 51.8 | 17.4 | 14.6 | 3.76 | 14.6 | – |
| $T_{TS}$ (ordinary subtree) [s] | 32.5 | 46.9 | 15.1 | 13.1 | 3.75 | 12.3 | – |
| $\tilde{C}_{TS}$(random) / $\tilde{C}_{TS}$(DFS) [-] | 1.18 | 1.15 | 1.31 | 1.47 | 1.09 | 1.27 | 1.25 |
| $\tilde{C}_{TS}$(random) / $\tilde{C}_{TS}$(ordinary subtree) [-] | 1.30 | 1.28 | 1.52 | 1.64 | 1.09 | 1.51 | 1.39 |

Table 7.3: The results for the testing procedure TPD on a subset of G4SPD scenes. Parameter $T_{TS}$ refers to the time devoted to traversing only ($T_{TS} = \tilde{C}_{TS} \cdot N_{rays} \cdot r_{SI} \cdot r_{ITM} = (1 - \Theta_{rat}) \cdot T_R \cdot \frac{\Theta_{RUN}}{\Theta_{APP} + \Theta_{RUN}}$). The kd-trees were built with OSAH and ad hoc termination criteria: $d_{max} = 16$, $N_{max} = 2$.

7.8

Conclusion and Future Work

In this chapter we analyzed in detail the time complexity and cache hit ratio of different kd-tree representations for DFS order traversal on computer architectures with a large cache line. We have shown that the time complexity of traversing a kd-tree is reduced by organizing its inner representation so that it matches the memory hierarchy better. We showed experimentally that DFS and ordinary subtree representations can decrease the traversal time for RSAs based on the kd-tree, using the TPD testing procedure. Theoretically, the subtree representation decreases the traversal time for DFS order traversal by 62% and increases the cache hit ratio from 35% to 90% for a given example of a common memory hierarchy. In addition, the proposed subtree representation of the kd-tree decreases the size of the memory needed to store the nodes of a kd-tree by 57% compared with the random representation. Future research on the technique presented here could cover efficient memory mapping for other hierarchical data structures, namely other variants of multi-dimensional binary trees and hierarchical data structures in general. Dynamization of these data structures with regard to cache-sensitive representation is also an interesting topic for further study.
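The percentages quoted above can be related to the chapter's numbers with a few lines of arithmetic. This is a sketch of one possible reading, not a derivation given in the text: the memory figure is assumed to compare bytes per node of the random representation (Eq. 7.1) with a compact subtree that stores 15 nodes in one 128-byte cache line, and the traversal times are taken from Table 7.1. The helper name `relative_reduction` is hypothetical.

```python
def relative_reduction(before, after):
    """Fraction by which `after` is smaller than `before`."""
    return 1.0 - after / before


M_P, M_I, M_CL = 4, 4, 128                    # bytes, as in the numerical examples
random_bytes_per_node = 4 * M_P + M_I         # Eq. 7.1 per node: 20 bytes
compact_bytes_per_node = M_CL / 15            # 15 nodes share one cache line

memory_saving = relative_reduction(random_bytes_per_node, compact_bytes_per_node)
time_saving_theoretical = relative_reduction(1392.0, 510.0)   # Table 7.1, theoretical
time_saving_simulated = relative_reduction(987.1, 379.3)      # Table 7.1, simulated
```

Under these assumptions the memory saving comes out at about 57%, and the two time savings at about 63% (theoretical times) and about 62% (simulated times).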

Chapter 8

Conclusion and Future Work

In the thesis we dealt with ray shooting algorithms. Ray shooting is known to have no algorithm aimed at worst-case complexity that is feasible in a practical implementation, since it has already been proved that such an algorithm for $N$ objects runs at least in $O(\log N)$ time while requiring $\Omega(N^4)$ storage and preprocessing time in the worst case. For this reason we dealt with heuristic algorithms for ray shooting aimed at average-case complexity. Since these algorithms are at present particularly difficult or even impossible to analyze theoretically for a generally specified input of $N$ objects, the findings in the thesis are not expressed using the worst-case complexity measure and the commonly used $O$-notation. Instead, for the presented algorithms, we report the results of experiments carried out for a set of scenes. For this purpose we used thirty scenes from the Standard Procedural Database, which are publicly available; the experimental results concerning the algorithms described in the thesis are thus reproducible and verifiable by any subsequent researchers. Since average-case techniques have been used and the findings are supported by the results of experiments, we cannot claim time optimality of any ray shooting algorithm described in this thesis. All the algorithms that form the subject of the thesis were implemented and tested. Since each implementation can be subject to bugs and implementation errors, in order to minimize such risks we visualized the spatial data structures underlying the tested ray shooting algorithms [81, 35] and verified the invariants of all experiments performed. A considerable implementation effort was needed to implement all the algorithms; the implementation comprises more than one hundred thousand lines of source code in C++, excluding third-party sources.
The conclusions and possible future research topics are presented at the end of each chapter, and here we provide only a short summary of all results achieved, and some suggestions for possible future research. Since the summary given here is concise, an interested reader should follow Sections Conclusion and Future Work of Chapters 2–7 for more details.

8.1

Summary of Results

In the first part of the thesis (Chapters 2 and 3), we dealt with general issues of heuristic ray shooting algorithms. In Chapter 2 we developed a computation model and a performance model for ray shooting algorithms, so that any ray shooting algorithm can be mapped to these models. Based on these two models we developed the methodology for comparing ray shooting algorithms, which is based on reporting the minimum testing output (a set of thirteen parameters) for each experiment performed. Under certain conditions this allows us to compare various ray shooting algorithms experimentally almost independently of implementation issues. An interesting by-product of the development of the comparison methodology is the discovery that ray tracing as defined for scenes in the Standard Procedural Database (513 × 513 primary rays, depth of recursion 4) is impossible to run in real time (at least 25 frames per second) on present-day commonly used uniprocessor hardware because of the time demands of any ray shooting algorithm. (We suppose an Intel Pentium II, 466 MHz, and we do not assume the use of special graphics hardware; see the results in Appendix E for details.) In Chapter 3 we presented a comparison of twelve commonly used ray shooting algorithms for a set of thirty test scenes, the concept of the statistically best ray shooting algorithm, and a preliminary version of the algorithm for selecting an efficient ray shooting algorithm given scene characteristics. The comparison is a part of the ongoing long-term BES project devoted to ray shooting algorithms that is in progress at present. Our finding was that the ray shooting algorithm based on the kd-tree achieved statistically the best results in comparison with the other ray shooting algorithms tested.

The second part of the thesis (Chapters 4–7) is devoted to various issues of ray shooting algorithms based on the kd-tree. We selected it for detailed research in accordance with the results of the first phase of the BES project. In Chapter 4 we addressed the problem of kd-tree construction when the kd-tree is used as the underlying data structure for ray shooting algorithms. We dealt with the top-down method for kd-tree construction, i.e., with the positioning of a splitting plane in a node of the kd-tree and with termination criteria. This construction algorithm estimates the cost of a constructed kd-tree, and the estimate is used to govern the position of the splitting plane. The precision of the estimates influences the resulting efficiency of the kd-tree for the ray shooting algorithm. The basic cost model used for the estimate is based on several unrealistic assumptions, but allows us to improve the efficiency of the ray shooting algorithm based on the kd-tree by order(s) of magnitude for sparsely occupied scenes. We proposed the general cost model, which describes more accurately the time complexity of ray shooting algorithms based on the kd-tree. The general cost model validates the use of the basic cost model, since it shows that the difference in performance achievable using these two cost models is rather limited. Other developments concerning kd-tree construction dealt with the use of empty space inside the scene, automatic termination criteria, the properties of approximating objects by their axis-aligned bounding boxes, and utilizing knowledge of the distribution of rays to be queried to further decrease the running time of ray shooting algorithms based on the kd-tree.

Chapter 5 described five ray traversal algorithms, which are used for traversing a ray through a kd-tree. These include the sequential ray traversal algorithm, the basic and robust versions of the recursive ray traversal algorithm, and two versions of the ray traversal algorithm with neighbor-links that use additional data structures. An efficient ray traversal algorithm tries to decrease both the number of traversal steps per ray and the average cost of a traversal step. The design of a ray traversal algorithm always seeks a tradeoff between these two quantities in order to make the time devoted to traversing the kd-tree as small as possible. There are two extremes: either the ray traversal algorithm visits a minimum number of kd-tree nodes with a somewhat higher cost of a traversal step, or the cost of one traversal step is small but more traversal steps are required. Our contribution concerning ray traversal algorithms is that we developed a new robust recursive ray traversal algorithm for the kd-tree. Further, we compared all the known ray traversal algorithms experimentally.

Chapter 6 dealt with the problem of a ray traversal algorithm for a set of rays that have similar directions and origins. When the rays that are shot exhibit some similarity, this knowledge can be used to further decrease the time consumed by traversing the kd-tree. Such ray sets are often induced by the application, when rays are restricted to a convex shaft, so that they have the same or a similar point of origin and direction. Under these conditions, we described the construction of the longest common traversal sequence, which gives us the sequence of kd-tree nodes to be visited. The use of the longest common traversal sequence decreases the number of traversal steps and thus the time for traversing the kd-tree. Obviously, it does not change the number of ray-object intersection tests to be performed. We described two variants of the longest common traversal sequence, simple and hierarchical. We then showed how the concept can be utilized within an application and presented the results of experiments for hidden surface removal.


Chapter 7 described a more hardware-oriented topic: the mapping of kd-tree nodes to the memory of a computer. We showed that the mapping of kd-tree nodes on a computer architecture with a large cache line has an impact on the total running time of ray shooting algorithms through the cost of one traversal step. We described four mapping methods and analyzed them both theoretically and experimentally. Our results show that it is possible to decrease both the cost of one traversal step and the size of the memory needed to represent a kd-tree on this type of computer architecture.

8.2

Suggestions for Further Research

Some of the algorithmic techniques described in the thesis can be extended and further researched in various contexts. Concerning the first part of the thesis (Chapter 2), we consider the comparison methodology as presented to be more or less complete. The BES project (Chapter 3) should be completed, and it will provide the results of experiments for one hundred scenes. These results will allow us to confirm or disprove the results obtained on thirty scenes from the Standard Procedural Database, and then to show the relationship between these different sets of scenes. The algorithm to select an efficient ray shooting algorithm for a given scene based on the scene characteristics will be either validated or improved.

Although the second part of the thesis describing ray shooting algorithms based on the kd-tree would seem exhaustive, it still offers some possible topics for further research. The crucial issue of any further research in this direction is to what extent and at what cost it is possible to improve the performance of new algorithms for ray shooting based on the kd-tree compared with the algorithms presented in this thesis. Possible topics of further research include an improved algorithm that estimates the cost of a kd-tree to be built given a set of objects, an algorithm that estimates the blocking factor, an improved version of the automatic termination criteria algorithm, an efficient algorithm for computing the cost in the general cost model, and an algorithm for kd-tree construction with clustering of objects. For ray traversal algorithms an interesting topic of research is an algorithm that selects the ray traversal algorithm to be used to achieve the best possible performance given a ray shooting query. The concept of the longest common traversal sequence can be researched from the application point of view: how the longest common traversal sequence can be applied to minimize the running time in hidden surface removal by determining the resolution of the underlying sampling pattern in image space for a given kd-tree. Further, the application of the longest common traversal sequence in particular global illumination algorithms can be researched. The memory mapping concept for the kd-tree allows us to raise the question of whether a similar approach can also be successfully applied in the representation of other spatial data structures. Another interesting and promising research issue is how to apply kd-trees for ray shooting algorithms in the case of moving, deforming, and animated objects without completely rebuilding the kd-tree for each frame of an image sequence.


Bibliography [1] ANSI754. ANSI/IEEE std. 754-1985. An American National Standard. IEEE Standard for Binary Floating-Point Arithmetic. New York, IEEE 1985. [2] ISO/IEC 14772-1:1997. VRML’97: The virtual reality modelling language, 1997. [3] 3D Object Intersection Home Page. Maintained by T. M¨oller, E. Haines and P. Foscari, 2000. http://www.realtimerendering.com/int/. [4] P. Agarwal, T. Murali, and J. Vitter. Practical techniques for constructing binary space partitions for orthogonal rectangles. In Proc. 13th Annu. ACM Sympos. Comput. Geom., pages 382–384, 1997. [5] P. K. Agarwal, B. Aronov, and M. Sharir. Computing envelopes in four dimensions with applications. In Proc. 10th Annu. ACM Sympos. Comput. Geom., pages 348–358, 1994. [6] P. K. Agarwal and J. Erickson. Geometric range searching and its relatives. Tech. Report CS1997-11, Department of Computer Science, Duke University, 1997. [7] P. K. Agarwal and J. Matouˇsek. On range searching with semialgebraic sets. In Proc. 17th Internat. Sympos. Math. Found. Comput. Sci., volume 629 of Lecture Notes Comput. Sci., SpringerVerlag, pages 1–13, 1992. [8] P. K. Agarwal and J. Matouˇsek. Ray shooting and parametric search. In Proc. 24th Annu. ACM Sympos. Theory Comput., pages 517–526, 1992. [9] P. K. Agarwal and M. Sharir. Ray shooting amidst convex polytopes in three dimensions. In Proc. 4th ACM-SIAM Sympos. Discrete Algorithms, pages 260–270, 1993. [10] A. Aho, J. Hopcroft, and J. D. Ullman. The Design and Analysis of Computer Algorithms. Addison-Wesley, Reading, Mass., 1974. [11] J. Amanatides. Ray tracing with cones. In Computer Graphics (SIGGRAPH ’84 Proceedings), volume 18, pages 129–135, July 1984. [12] J. Amanatides and A. Woo. A fast voxel traversal algorithm for ray tracing. In G. Marechal, editor, Proc. Eurographics ’87, pages 3–10, Aug. 1987. [13] O. Arnold. Probability, statistics, and queuing theory with computer science applications. Academic Press, San Diego, 1990. [14] B. Aronov and S. Fortune. 
Average-case ray shooting and minimum weight triangulations. In Proc. 13th Annu. ACM Sympos. Comput. Geom., pages 203–212, 1997.
[15] J. Arvo. Linear-time voxel walking for octrees. Ray Tracing News, 1(5), 1988. Available from http://www.acm.org/tog/resources/RTNews/html/rtnews2d.html.


[16] J. Arvo. Ray tracing with meta-hierarchies. In SIGGRAPH ’90 Advanced Topics in Ray Tracing course notes. ACM Press, Aug. 1990. [17] J. Arvo and D. Kirk. Fast ray tracing by ray classification. In M. C. Stone, editor, (SIGGRAPH ’87 Proceedings), volume 21, pages 55–64, July 1987. [18] J. Arvo and D. Kirk. A survey of ray tracing acceleration techniques, In A. S. Glassner editor, An introduction to ray tracing, Academic Press, pages 201–262, 1989. [19] J. S. Badt. Two algorithms for taking advantage of temporal coherence in ray tracing. The Visual Computer, 4(3):123–132, Sept. 1988. [20] P. Bekaert. Hierarchical and Stochastic Algorithms for Radiosity. Ph.D. thesis, Department of Computer Science, Katholieke Universiteit Leuven, 1999. [21] J. Bentley. Multidimensional binary search trees used for associative searching. Communications of the ACM, 18:509–517, 1975. [22] M. D. Berg, D. Halperin, M. Overmars, J. Snoeyink, and M. V. Kreveld. Efficient ray shooting and hidden surface removal. Algorithmica: An International Journal in Computer Science, 12(1):30–53, 1994. [23] K. Bouatouch, M. O. Madani, T. Priol, and B. Arnaldi. A new algorithm of space tracing using a CSG model. In G. Marechal, editor, Proc. Eurographics ’87, pages 65–78, Aug. 1987. [24] R. Capelli. Fast approximation to the arctangent. In D. Kirk, editor, Graphics Gems II, Academic Press, San Diego, pages 389–391, 1992. [25] F. Cazals, G. Drettakis, and C. Puech. Filtering, clustering and hierarchy construction: A new solution for ray-tracing complex scenes. Computer Graphics Forum, 14(3):C371–C382, 1995. [26] F. Cazals and C. Puech. Bucket-like space partitioning data-structures with applications to raytracing. In 13th ACM Symposium on Computational Geometry, Nice, pages 11–20, 1997. [27] F. Cazals and M. Sbert. Some integral geometry tools to estimate the complexity of 3d scenes. Technical Report RR-3204, The French National Institue for Research in Computer Science and Control (INRIA), July 1997. 
[28] J. Chapman, T. W. Calvert, and J. Dill. Exploiting temporal coherence in ray tracing. In Proceedings of Graphics Interface ’90, pages 196–204, May 1990. [29] J. Chapman, T. W. Calvert, and J. Dill. Spatio-temporal coherence in ray tracing. In Proceedings of Graphics Interface ’91, pages 101–108, June 1991. [30] M. J. Charney and I. D. Scherson. Efficient traversal of well-behaved hierarchicial trees of extents for ray-tracing complex scenes. The Visual Computer, 6(3):167–178, June 1990. [31] Y.-J. Chiang. Dynamic and I/O-efficient algorithms for computational geometry and graph problems: Theoretical and experimental results. Technical Report CS-95-27, Department of Computer Science, Brown University, Aug. 1995. [32] J.-H. Chuang and W.-J. Hwang. A new space subdivision for ray tracing CSG solids. IEEE Computer Graphics and Applications, 15(6):56–62, Nov. 1995. [33] J. G. Cleary and G. Wyvill. Analysis of an algorithm for fast ray tracing using uniform space subdivision. The Visual Computer, 4(2):65–83, July 1988.


[34] T. H. Cormen, C. E. Leiserson, and R. L. Rivest. Introduction to Algorithms. MIT Press, 1990 (tenth printing 1993). [35] L. Dachs. Vizualizace datov´ych struktur pro ukl´ad´an´ı prostorov´ych dat. Master thesis, Czech Technical University, May 1999. In Czech. [36] F. d’Amore and P. G. Franciosa. On the optimal binary plane partition for sets of isothetic rectangles. In Proc. 4th Canad. Conf. Comput. Geom., pages 1–5, 1992. [37] R. Day. How to write & Publish a Scientific Paper. Academic Press, 1997. [38] D.Cohen and Z.Sheffer. Proximity clouds - an acceleration technique for 3D grid traversal. The Visual Computer, 11:27–38, 1994. [39] M. de Berg, M. de Groot, and M. Overmars. New results on binary space partitions in the plane. Comput. Geom. Theory Appl., 8:317–333, 1997. [40] M. de Berg, M. van Kreveld, M. Overmars, and O. Schwarzkopf. Computational Geometry: Algorithms and Applications. Springer-Verlag, Berlin, 1997. [41] O. Devillers. The macro-regions: an efficient space subdivision structure for ray tracing. In W. Hansmann, F. R. A. Hopgood, and W. Strasser, editors, Proc. Eurographics ’89, pages 27–38, Sept. 1989. [42] D.-Z. Du and Y. Zhang. On heuristics for minimum length rectilinear partitions. Algorithmica, 5:111–128, 1990. [43] F. Durand. 3D Visibility: Analytical Study and Applications. Ph.D. thesis, Universite Grenoble I, July 1999. [44] R. ENDL. An object-oriented ray tracing architecture for the analysis of ray-generators in spatial subdivisions. In Proceedings of Compugraphics ’95, pages 268–277, Dec. 1995. [45] R. Endl and M. Sommer. Classification of ray-generators in uniform subdivisions and octrees for ray tracing. Computer Graphics Forum, 13(1):C3–C19, Mar. 1994. [46] M. Feixas, E. del Acebo, P. Bekaert, and M. Sbert. An information theory framework for the analysis of scene complexity. In P. Brunet and R. Scopigno, editors, Proc. Eurographics ’97, pages 95–106, Sept. 1999. [47] J. D. Foley, A. van Dam, S. K. Feiner, and J. F. Hughes. 
Computer Graphics: Principles and Practice. Addison-Wesley Publishing Co., Reading, Mass., 2nd edition, 1990. [48] T. Foris, G. M´arton, and L. Szirmay-Kalos. Ray shooting in logarithmic time. In Proceedings of Winter School of Computer Graphics 96, pages 84–90, Feb. 1996. [49] A. Formella and C. Gill. Ray tracing: a quantitative analysis and a new practical algorithm. The Visual Computer, 11(9):465–476, 1995. [50] A. Formella, C. Gill, and V. Hofmeyer. Fast ray tracing of sequences by ray history evaluation. In Proceedings of Computer Animation ’94, IEEE Computer Society Press, pages 184–191, May 1994. [51] A. Fournier and P. Poulin. A ray tracing accelerator based on a hierarchy of 1D sorted lists. In Proceedings of Graphics Interface ’93, pages 53–61, May 1993.


[52] H. Fuchs, Z. M. Kedem, and B. F. Naylor. On visible surface generation by a priori tree structures. In Computer Graphics (SIGGRAPH ’80 Proceedings), volume 14, pages 124–133, July 1980. [53] A. Fujimoto, T. Tanaka, and K. Iwata. ARTS: Accelerated ray tracing system. IEEE Computer Graphics and Applications, 6(4):16–26, 1986. [54] I. Gargantini and H. H. Atkinson. Ray tracing an octree: numerical evaluation of the first intersection. Computer Graphics Forum, 12(4):C199–C210, Oct. 1993. [55] J. Genetti, D. Gordon, and G. Williams. Adaptive supersampling in object space using pyramidal rays. In Computer Graphics Forum, 17(1):C29–C54, Mar. 1998. [56] M. Gervautz. Consistent schemes for addressing surfaces when ray tracing transparent CSG objects. Computer Graphics Forum, 11(4):C203–C211, Oct. 1992. [57] M. Gigante. Accelerated ray tracing using non-uniform grids. In Proceedings of Ausgraph ’90, pages 157–163, 1988. [58] Global illumination mailing list. http://w3imagis.imag.fr/˜Francois.Sillion/ GlobillumList.html. [59] A. S. Glassner. Space subdivision for fast ray tracing. IEEE Computer Graphics and Applications, 4(10):15–22, Oct. 1984. [60] A. S. Glassner. Spacetime ray tracing for animation. IEEE Computer Graphics and Applications, 8(2):60–70, Mar. 1988. [61] A. S. Glassner. An Introduction to Ray Tracing. Academic Press, 1989. [62] A. S. Glassner. Principles of Digital Image Synthesis. Computer Graphics and Geometric Modeling. Morgan Kaufmann, San Francisco, CA, 1995. [63] J. Goldsmith and J. Salmon. Automatic creation of object hierarchies for ray tracing. IEEE Computer Graphics and Applications, 7(5):14–20, May 1987. [64] P. Gonzalez and F. Gisbert. Object and ray coherence in the optimization of the ray tracing algorithm. In Proceedings of Computer Graphics International ’98 (CGI’98), Hannover, Germany, pages 264–267, June 1998. [65] M. T. Goodrich and R. Tamassia. Dynamic ray shooting and shortest paths via balanced geodesic triangulations. In Proc. 
9th Annu. ACM Sympos. Comput. Geom., pages 318–327, 1993. [66] D. Gordon and S. Chen. Front-to-back display of BSP trees. IEEE Computer Graphics and Applications, 11(5):79–85, Sept. 1991. [67] E. Gr¨oller and W. Purgathofer. Using temporal and spatial coherence for accelerating the calculation of animation sequences. In W. Purgathofer, editor, Proc. Eurographics ’91, pages 103–113, Sept. 1991. [68] E. Gr¨oller and W. Purgathofer. Coherence in Computer Graphics. Technical Report TR-186-295-04, Institute of Computer Graphics, Vienna University of Technology, Favoritenstrasse 9/186, A-1040 Vienna, Austria, 1995. Human contact: [email protected] [69] E. A. Haines. A proposal for standard graphics environments. IEEE Computer Graphics and Applications, 7(11):3–5, Nov. 1987. Available from http://www.acm.org/pubs/tog/ resources/SPD/overview.html.


[70] E. A. Haines and D. P. Greenberg. The light buffer: A ray tracer shadow testing accelerator. IEEE Computer Graphics and Applications, 6(9):6–16, Sept. 1986.
[71] E. A. Haines and J. R. Wallace. Shaft culling for efficient ray-traced radiosity. In P. Brunet and F. W. Jansen, editors, Photorealistic Rendering in Computer Graphics (Proceedings of the Second Eurographics Workshop on Rendering), Springer-Verlag, New York, pages 122–138, 1994.
[72] V. Havran. Cache sensitive representation for the BSP tree. In Proceedings of Compugraphics’97, GRASP – Graphics Science Promotions & Publications, pages 369–376, Dec. 1997.
[73] V. Havran. Analysis of cache sensitive representation for binary space partitioning trees. Informatica, Slovene Society Informatika, ISSN 0350-5596, 23(3):203–210, May 1999.
[74] V. Havran. A summary of octree ray traversal algorithms. Ray Tracing News, 12(2):cca 10 pages, Dec. 1999. Available from http://www.acm.org/tog/resources/RTNews/html/rtnv12n2.html.
[75] V. Havran. GOLEM rendering system, 2000. Home page at http://www.cgg.cvut.cz/GOLEM.
[76] V. Havran and J. Bittner. Constructing rectilinear BSP trees for preferred ray sets. In Proceedings of Short Communication Papers of WSCG’99, poster section, pages 1–2, Feb. 1999.
[77] V. Havran and J. Bittner. Rectilinear BSP trees for preferred ray sets. In Proceedings of SCCG’99 (Spring Conference on Computer Graphics), Budmerice, Slovak Republic, pages 171–179, Apr./May 1999.
[78] V. Havran and J. Bittner. LCTS: Ray shooting using longest common traversal sequences. Computer Graphics Forum (Proc. Eurographics ’2000), 19(3):C59–C70, Aug. 2000.
[79] V. Havran, J. Bittner, and J. Přikryl. Best efficiency scheme project proposal. Home page at http://www.cgg.cvut.cz/GOLEM/bes.html, Oct. 1999.
[80] V. Havran, J. Bittner, and J. Žára. Ray tracing with rope trees. In Proceedings of SCCG’98 (Spring Conference on Computer Graphics), Budmerice, Slovak Republic, pages 130–139, Apr. 1998.
[81] V. Havran, L. Dachs, and J. Žára. VIS-RT: A visualization system for RT spatial data structures. In Proceedings of WSCG’2000, short communication papers, pages 28–35, Feb. 2000.
[82] V. Havran, T. Kopal, J. Bittner, and J. Žára. Fast robust BSP tree traversal algorithm for ray tracing. Journal of Graphics Tools, 2(4):15–23, Dec. 1997.
[83] V. Havran and W. Purgathofer. Comparison methodology for ray shooting algorithms. Technical Report TR-186-2-00-20, Institute of Computer Graphics, Vienna University of Technology, Favoritenstrasse 9/186, A-1040 Vienna, Austria, Nov. 2000. Human contact: [email protected]
[84] V. Havran, J. Přikryl, and W. Purgathofer. Statistical comparison of ray-shooting efficiency schemes. Technical Report TR-186-2-00-14, Institute of Computer Graphics, Vienna University of Technology, Favoritenstrasse 9/186, A-1040 Vienna, Austria, May 2000. Human contact: [email protected]
[85] V. Havran and F. Sixta. Comparison of hierarchical grids. Ray Tracing News, 12(1):cca 4 pages, June 1999. Available from http://www.acm.org/tog/resources/RTNews/html/rtnv12n1.html.


[86] V. Havran and J. Žára. Evaluation of BSP properties for ray–tracing. In Proceedings of SCCG’97 (Spring Conference on Computer Graphics), pages 155–162, Budmerice, June 1997.
[87] P. S. Heckbert and P. Hanrahan. Beam tracing polygonal objects. Computer Graphics (SIGGRAPH ’84 Proceedings), 18(3):119–127, July 1984.
[88] M. Held. ERIT – a collection of efficient and reliable intersection tests. Journal of Graphics Tools, 2(4):25–44, Dec. 1997.
[89] T. Horvath, G. Márton, P. Risztics, and L. Szirmay-Kalos. Ray coherence between a sphere and a convex polyhedron. Computer Graphics Forum, 11(2):C163–C172, June 1992.
[90] P.-K. Hsiung and R. H. Thibadeau. Accelerating ARTS. The Visual Computer, 8(3):181–190, Mar. 1992.
[91] F. W. Jansen. Data structures for ray tracing. In L. R. A. Kessener, F. J. Peters, and M. L. P. van Lierop, editors, Data Structures for Raster Graphics, Springer-Verlag, New York, pages 57–73, 1986.
[92] D. Jevans. Object space temporal coherence for ray tracing. In Proceedings of Graphics Interface ’92, pages 176–183, May 1992.
[93] D. Jevans and B. Wyvill. Adaptive voxel subdivision for ray tracing. In Proceedings of Graphics Interface ’89, pages 164–172, June 1989.
[94] M. Kaplan. Space-Tracing: A Constant Time Ray-Tracer, pages 149–158, July 1985.
[95] M. R. Kaplan. The use of spatial coherence in ray tracing. In D. E. Rogers and R. A. Earnshaw, editors, Techniques for Computer Graphics, Springer-Verlag, pages 173–193, 1987.
[96] T. L. Kay and J. T. Kajiya. Ray tracing complex scenes. In D. C. Evans and R. J. Athay, editors, Computer Graphics (SIGGRAPH ’86 Proceedings), volume 20, pages 269–278, Aug. 1986.
[97] D. Kirk and J. Arvo. Improved ray tagging for voxel-based ray tracing. In J. Arvo, editor, Graphics Gems II, Academic Press, San Diego, pages 264–266, 1991.
[98] K. S. Klimaszewski. Faster ray tracing using adaptive grids and area sampling. Ph.D. thesis, Brigham Young University, Dec. 1994.
[99] K. S. Klimaszewski.
Faster Ray Tracing Using Adaptive Grids and Area Sampling. Ph.D. thesis, Dept. of Civil and Environmental Engineering, Brigham Young University, Provo, Utah, 1994. [100] K. S. Klimaszewski and T. W. Sederberg. Faster ray tracing using adaptive grids. IEEE Computer Graphics and Applications, 17(1):42–51, Jan./Feb. 1997. [101] Y. P. Kuzmin. Ray traversal of spatial structures. Computer Graphics Forum, 13(4):C223–C227, Oct. 1994. [102] B. Kwon, D. S. Kim, K.-Y. Chwa, and S. Y. Shin. Memory-efficient ray classification for visibility operations. IEEE Transactions on Visualization and Computer Graphics, 4(3):193–201, July/Sept. 1998. [103] A. Lingas. Heuristics for minimum edge length rectangular partitions of rectilinear figures. In Proc. 6th GI Conf. Theoret. Comput. Sci., volume 145 of Lecture Notes Comput. Sci., SpringerVerlag, pages 199–210, 1983.


[104] J. D. MacDonald and K. S. Booth. Heuristics for ray tracing using space subdivision. In Proceedings of Graphics Interface ’89, pages 152–63, June 1989. [105] J. D. MacDonald and K. S. Booth. Heuristics for ray tracing using space subdivision. Visual Computer, 6(6):153–65, 1990. [106] G. M´arton. Acceleration of ray tracing via Voronoi diagrams. In A. W. Paeth, editor, Graphics Gems V, Academic Press, Boston Mass., pages 268–284, 1995. [107] G. M´arton and L. Szirmay-Kalos. On average-case complexity of ray tracing algorithms. In Proceedings of Winter School of Computer Graphics 95, pages 187–196, Feb. 1995. [108] H. Maurel, Y. Duthen, and R. Caubet. A 4D ray tracing. Computer Graphics Forum, 12(3):C285– C294, Aug 1993. [109] M. D. J. McNeill, B. C. Shah, M.-P. Hebert, P. F. Lister, and R. L. Grimsdale. Performance of space subdivision techniques in ray tracing. Computer Graphics Forum, 11(4):C213–C220, Oct. 1992. [110] J. S. B. Mitchell, D. M. Mount, and S. Suri. Query-sensitive ray shooting. In Proc. 10th Annu. ACM Sympos. Comput. Geom., pages 359–368, 1994. [111] S. Mohaban and M. Sharir. Ray shooting amidst spheres in three dimensions and related problems. SIAM J. Comput., 26(3):654–674, June 1997. [112] T. M¨oller and E. Haines. Real-Time Rendering. A K Peters, Ltd., 1999. [113] C. Montani and R. Scopigno. Ray tracing CSG trees using the sticks representation scheme. Computers and Graphics, 14(3/4):481–490, 1990. [114] K. Murakami and K. Hirota. Incremental ray tracing. In K. Bouatouch and C. Bouville, editors, Photorealism in Computer Graphics, Springer-Verlag, pages 17–32, 1992. [115] C. Nyberg, T. Barclay, Z. Cvetanovic, J. Gray, and D. Lomet. Alphasort: A cache-sensitive parallel external sort. VLDB Journal, (4):603–627, 1995. [116] M. Ohta and M. Maekawa. Ray coherence theorem and constant time ray tracing algorithm. In T. L. Kunii, editor, Computer Graphics 1987 (Proceedings of CG International ’87), SpringerVerlag, pages 303–314, 1987. [117] M. 
Pellegrini. Ray shooting and lines in space. In J. E. Goodman and J. O’Rourke, editors, Handbook of Discrete and Computational Geometry, CRC Press LLC, Boca Raton, FL, chapter 32, pages 599–614, 1997. [118] Q. Peng, Y. Zhu, and Y. Liang. A fast ray tracing algorithm using space indexing techniques. In G. Marechal, editor, Proc. Eurographics ’87, pages 11–23, Aug. 1987. [119] M. Pharr and P. Hanrahan. Geometry caching for ray-tracing displacement maps. In X. Pueyo and P. Schr¨oder, editors, Eurographics Rendering Workshop 1996, Eurographics, Springer-Verlag, Wien, pages 31–40, June 1996. [120] M. Pharr, C. Kolb, R. Gershbein, and P. Hanrahan. Rendering complex scenes with memorycoherent ray tracing. In T. Whitted, editor, SIGGRAPH 97 Conference Proceedings, Annual Conference Series, ACM SIGGRAPH, Addison Wesley, pages 101–108, Aug. 1997.


[121] M. Quail. Space time ray tracing using ray classification. Bachelor thesis, Department of Computing, Macquarie University, Nov. 1996. [122] E. Reinhard, A. J. F. Kok, and F. W. Jansen. Cost prediction in ray tracing. In Rendering Techniques ’96, Springer-Verlag, Wien, pages 41–50, 1996. [123] S. M. Rubin and T. Whitted. A 3-dimensional representation for fast rendering of complex scenes. In SIGGRAPH ’80 Proceedings, volume 14, pages 110–116, July 1980. [124] S. S and K. H. Directional safe zones & dual extent algorithms for efficient grid traversal. In Proceedings of Graphics Interface’97, pages 76–87, 1997. [125] H. Samet. Design and analysis of Spatial Data Structures: Quadtrees, Octrees, and other Hierarchical Methods. Addison–Wesley, Reading, Mass., 1989. [126] H. Samet. Implementing ray tracing with octrees and neighbor finding. Computers and Graphics, 13(4):445–60, 1989. [127] H. Samet. Applications of Spatial Data Structures. Addison-Wesley, Reading, Mass., 1990. [128] I. D. Scherson and E. Caspary. Data structures and the time complexity of ray tracing. The Visual Computer, 3(4):201–213, Dec. 1987. [129] C. H. Sequin and E. K. Smyrl. Parameterized ray tracing. In J. Lane, editor, SIGGRAPH ’89 Proceedings, volume 23, pages 307–314, July 1989. [130] SGI. Power Challenge Technical report, Silicon Graphics Computer Systems, 1996. [131] M. Shinya, T. Takahashi, and S. Naito. Principles and applications of pencil tracing. In M. C. Stone, editor, Computer Graphics (SIGGRAPH ’87 Proceedings), volume 21, pages 45–54, July 1987. [132] G. Simiakakis. Accelerating RayTracing with Directional Subdivision and Parallel Processing. Ph.D. thesis, University of East Anglia, Oct. 1995. [133] G. Simiakakis and A. M. Day. Five-dimensional adaptive subdivision for ray tracing. Computer Graphics Forum, 13(2):C133–C140, June 1994. [134] F. Sixta. Dˇelen´ı prostoru pro sledov´an´ı paprsku. Bachelor thesis, Czech Technical University in Prague, May 1997. In Czech. [135] F. 
Sixta. Datov´e struktury pro ukl´ad´an´ı prostorov´ych dat. Master thesis, Czech Technical University, Jan. 1999. In Czech. [136] J. M. Snyder and A. H. Barr. Ray tracing complex models containing surface tessellations. In M. C. Stone, editor, Computer Graphics (SIGGRAPH ’87 Proceedings), volume 21, pages 119– 128, July 1987. [137] SOFTIMAGE. Mental Ray, A Programmer’s Reference Guide. Gesellschaft f¨ur Computerfilm and Maschinintelligenz Gmbh & Co. KG, Berlin, 1995. [138] H. Solomon. Geometric Probability. J.W. Arrowsmith Ltd, 1978. [139] N. Stolte and R. Caubet. Discrete ray-tracing of huge voxel spaces. Computer Graphics Forum, 14(3):C383–C394, Sept. 1995. [140] L. Stone. Theory of Optimal Search. Academic Press, New York, 1975.


[141] B. Stroustrup. The C++ Programming Language, 3rd edition. Addison-Wesley, 1997. [142] W. Stuerzlinger. Bounding volume construction using point clouds. In Proceedings of SCCG’96 (Spring Conference on Computer Graphics), pages 239–246, June 1996. [143] K. R. Subramanian. Personal communication, 1998. [144] K. R. Subramanian and D. S. Fussel. Factors affecting performance of ray tracing hierarchies. Technical Report Tx 78712, The University of Texas at Austin, July 1990. [145] K. R. Subramanian and D. S. Fussel. A search structure based on k-d trees for efficient ray tracing. Technical Report (Ph.D. Dissertation), Tx 78712-1188, The University of Texas at Austin, Dec. 1990. [146] K. R. Subramanian and D. S. Fussell. Automatic termination criteria for ray tracing hierarchies. In Proceedings of Graphics Interface ’91, pages 93–100, June 1991. [147] K. Sung. A DDA octree traversal algorithm for ray tracing. In W. Purgathofer, editor, Proc. Eurographics ’91, pages 73–85, Sept. 1991. [148] K. Sung and P. Shirley. Ray tracing with the BSP tree. In D. Kirk, editor, Graphics Gems III, Academic Press, San Diego, pages 271–274, 1992. [149] L. Szirmay-Kalos. Monte-Carlo methods in global illumination, script written in Institute of Computer Graphics, Vienna University of Technology, Oct. 1999. [150] L. Szirmay-Kalos and G. M´arton. On the complexity of ray shooting. In Dagstuhl Seminar on Rendering, 1996, 1996. [151] L. Szirmay-Kalos and G. M´arton. On the limitations of worst–case optimal ray shooting algorithms. In Proceedings of Winter School of Computer Graphics 97, pages 562–571, Feb. 1997. [152] L. Szirmay-Kalos and G. M´arton. Analysis and construction of worst-case optimal ray shooting algorithms. Computers and Graphics, 22(2–3):167–174, Mar. 1998. [153] L. Szirmay-Kalos and G. M´arton. Worst-case versus average case complexity of ray-shooting. Computing, 61(2):103–131, 1998. [154] R. Tarjan. Data Structures and Network Algorithms. 
Society for Industrial and Applied Mathematics, Philadelphia, 1987. [155] S. Teller and J. Allex. Frustum casting for progressive, interactive rendering. Technical Report MIT LCS TR-740, MIT, Jan. 1998. [156] M. van der Zwaan, E. Reinhard, and F. W. Jansen. Pyramid clipping for efficient ray traversal. In Proceedings of Eurographics Rendering Workshop 1995, Dublin, Ireland, pages 1–10, 1995. [157] A. Watt and M. Watt. Advanced Animation and Rendering Techniques. ACM-PRESS, AddisonWesley, 1992. [158] H. Weghorst, G. Hooper, and D. P. Greenberg. Improved computational methods for ray tracing. ACM Transactions on Graphics, 3(1):52–69, Jan. 1984. [159] K. Y. Whang, J. W. Song, J. W. Chang, J. Y. Kim, W. S. Cho, C. M. Park, and I. Y. Song. OctreeR: an adaptive octree for efficient ray tracing. IEEE Transactions on Visualization and Computer Graphics, 1(4):343–349, Dec. 1995.


[160] T. Whitted. An improved illumination model for shaded display. Communications of the ACM, 23(6):343–349, Aug. 1979. [161] A. Wilkie, R. F. Tobler, and W. Purgathofer. Orientation lightmaps for photon radiosity in complex environments. In Proceedings of Computer Graphics International ’2000 (CGI’2000), pages 279–286, June 2000. [162] A. Woo. Fast ray-box intersection. In A. S. Glassner, editor, Graphics Gems, Academic Press, San Diego, pages 395–396, 1990. [163] A. Woo. Ray tracing polygons using spatial subdivision. In Proceedings of Graphics Interface ’92, pages 184–191, May 1992. [164] A. Woo, P. Poulin, and A. Fournier. A survey of shadow algorithms. IEEE Computer Graphics and Applications, 10(6):13–32, Nov. 1990. [165] G. Wyvill, T. L. Kunii, and Y. Shirai. Space division for ray tracing in CSG (Constructive Solid Geometry). IEEE Computer Graphics and Applications, 6(4):28–34, Apr. 1986. [166] F. Yamaguchi and M. Niizeki. Some basic geometric test conditions in terms of Pluecker coordinates and pluecker coefficients. Visual Computer, 13(1):29–41, 1997. [167] P. Zemˇc´ık and A. Chalmers. Optimised CSG tree evaluation for space subdivision. Computer Graphics Forum, 14(2):C139–C146, June 1995.

Notation

Upper Case Roman

AG    adaptive grid
$A(x, y, z)$    point in $\mathbb{E}^3$, entry point of ray traversal algorithm
$B(x, y, z)$    point in $\mathbb{E}^3$, exit point of ray traversal algorithm
BES    Best Efficiency Scheme research project
BSP    binary space partitioning (tree)
BVH    bounding volume hierarchy
$C$    cost, the running time to perform some particular algorithmic operation
$C_X$    cost of operation X
$C_{IT}$    cost of ray-object intersection test
$C_{IT}^{fail}$    cost of failed ray-object intersection test
$C_{IT}^{succ}$    cost of successful ray-object intersection test
$C_T$    total cost for shooting an arbitrary ray
$C_{TI}$    cost of traversing the interior node of the kd-tree
$C_{TL}$    cost of traversing the leaf node of the kd-tree
$C_{TS}$    cost of a single RSA traversal step
COMP    compiler
$D_R$    direction vector for ray R
DS    data structure underlying a particular RSA
$E$    entry
ESSD    elementary spatial subdivision
$F$, $F(\nu)$    face, a face associated with the axis-aligned bounding box of the node $\nu$
$G_X$    group of test scenes with number of objects in the range from $10^{X-1}$ to $10^{X+1}$
$G_X^{SPD}$    group of test scenes from SPD where number of objects is closest to $10^X$
GOLEM    implementation framework used for algorithms within the thesis
HLCTS    hierarchical longest common traversal sequence
$H^{+}$, $H^{-}$    positive or negative halfspace defined by plane in $\mathbb{E}^n$
HSSD    hierarchical spatial subdivision
HUG    hierarchy of uniform grids
HW    hardware
$I$    point of intersection
ISC    mutual visibility information
IMPL    implementation
LCTS    longest common traversal sequence
$K$    ray-object intersection test is performed K-times
$L$    instance of HLCTS
$LSA(b)$    the surface area of the left child for position b
L    the case left
LO    the case left only
LR    the case left, then right
$M$    size of memory
$N$    number of objects
N1, N2, N3, N4, N5    negative traversal case in a recursive traversal algorithm
$N_E$    number of elementary nodes in DS
$N_{EE}$    number of empty elementary nodes in DS
$N_{ER}$    total number of references to objects in elementary nodes of DS
$N_{ETS}$    number of elementary nodes accessed per ray
$N_{EETS}$    number of empty elementary nodes accessed per ray
$N_G$    number of generic nodes in DS
$N_i$    number of interior nodes in kd-tree
$N_{IT}$    number of ray-object intersection tests per ray
$N_{IT}^{fail}$    number of failed ray-object intersection tests per ray
$N_{IT}^{succ}$    number of successful ray-object intersection tests per ray
$N_l$    number of leaves in kd-tree
$N_L$    number of objects in the left child of the current node
$N_{max}$    number of objects in kd-tree node, when it is declared as a leaf
$N_{rays}$    number of rays
$N_R$    number of objects in the right child of the current node
$N_{SP}$    number of objects intersecting the splitting plane
$N_{TI}$    number of interior nodes traversed in kd-tree
$N_{TL}$    number of leaves traversed in kd-tree
$N_{TS}$    number of all nodes accessed per ray
$N_V$    number of voxels in uniform grid
$N(l)$    number of objects stored in the l-th leaf of kd-tree
$O$    object
$O_i$    i-th object
$O(g(n))$    upper bound of the worst-case complexity is $g(n)$
$O_R$    origin point of ray R
O84, O89, O93    various versions of octree built with spatial median subdivision
O84A, O93A    two variants of octree built with surface area heuristic
$P$    polygon
$P_C$    convex polygon
P1, P2, P3, P4, P5    positive traversal case in a recursive traversal algorithm
$Q$    center point
$Q_i$    center point of the i-th object
$R$    ray
R    the case right
RL    the case right, then left
RG    recursive grid
RO    the case right only
$RSA(b)$    the surface area of the right child for position b
RSA    ray shooting algorithm
$S$    set
$S_R$    set of rays
SA    surface area
$SA(X)$    surface area of the element X
SF    size factor to generate scenes of different complexity in SPD
SLCTS    simple longest common traversal sequence
SPD    Standard Procedural Database [69]
SSD    spatial subdivision
$T$    time
TA    traversal algorithm
$T_{MM}$    latency of main memory
$T_{app}$    remaining running time of application excluding the time consumed by a ray shooting algorithm
$T_C$    cache latency
$T_B$    preprocessing time required to build data structures for particular ray shooting algorithm
$T_{IT}$    time for ray-object intersection tests
$T_R$    running time required to perform given TP in the application
$T_{TS}$    time for traversing the nodes of DS
TH    traversal history
TP    RSA testing procedure generating a particular set of rays given a scene
UG    uniform grid
$U$, $V$    points in $\mathbb{E}^3$
VP    viewpoint in $\mathbb{E}^3$
$Vol(X)$    volume of X in $\mathbb{E}^3$
$X$, $Y$    entity, spatial region, cell, object, etc.
$W$    number of wins for particular RSA within BES project
WR    viewport
Z1, Z2, Z3    zero traversal case in a recursive traversal algorithm

Lower Case Roman

$a$    entry signed distance corresponding to entry point A
$b$    exit signed distance corresponding to exit point B
$b$    position of the splitting plane, $b \in \langle 0, 1 \rangle$
$b_{OM}$    position of the splitting plane for object median
$b_{SM}$    position of the splitting plane for spatial median
$c$    real number
$d$    depth
$d(\nu)$    depth of node $\nu$ in the kd-tree (depth of the kd-tree root node is zero)
$d_{max}$    maximum depth allowed for a leaf in the kd-tree
$d_{rec}$    depth of rays; for primary rays $d_{rec} = 0$ holds
$d_{voxel}$    voxel density in context of uniform grid
$e$    integer number
$\widetilde{n}^{G}_{int}$    average number of intersections with objects for global line, casting global lines
$f$    number of tasks where experiment failed due to time limit
$h$    height, height of a tree
$i$, $j$    indices
$k$    kurtosis
$l$    length
$\tilde{n}$    average number of objects in the voxel
$m$    number of tasks where experiment failed due to memory limits
$n$    integer number, number of dimensions of space
$n^{G}_{int}$    average number of intersections with objects for global line, surface area
$p$    probability
$p_X$    probability of case X
$p_{Y|X}$    conditional probability of case Y when case X occurs
$p_0$    probability of zero intersections
$p_T$    blocking factor, i.e., probability of a ray hitting an object
$r$    ratio
$r_{CH}$    cache hit ratio
$r_{ITM}$    ratio of ray-object intersection tests performed to minimum number of intersection tests
$r_{SI}$    ratio of number of rays hitting objects to number of all rays
$s$    skewness
$s_{len}$    average span length
$t$    signed distance
$u$    real number
$v$    variance
$w$    width
$x$, $y$, $z$    real numbers
$x$, $y$, $z$    axis denotation

Scripts

axis-aligned bounding box(es)
axis-aligned bounding box tightly enclosing entity X
convex shaft
cell of SSD
cell of SSD in $\mathbb{E}^n$ space
sequence, sequence of kd-tree nodes
scene (region of space)
scene containing N objects

Upper Case Greek

$\Delta$    subset of four parameters of minimum testing output describing dynamic use of DS
$\Theta$    subset of five hardware/implementation dependent parameters of minimum testing output
$\Sigma$    subset of four parameters of minimum testing output describing static properties of DS
$\Pi$    plane
$\Pi_P$    projection plane
$\Psi$    sparseness
$\Omega(f(n))$    lower bound of the worst-case complexity is $f(n)$
$\Omega$    solid angle in $\mathbb{E}^3$

Lower Case Greek

$\varepsilon$    small positive constant
$\theta$    angle between two vectors in $\mathbb{E}^3$
$\lambda$    nonuniformity coefficient
$\nu$    node of DS
$\nu_G$    generic node of DS
$\nu_E$    elementary node of DS
$\nu_{EE}$    empty elementary node of DS
$\nu_{EF}$    full elementary node of DS
$\sigma$    standard deviation

Miscellaneous

$|S|$    cardinality of set S
$|x|$    absolute value of real number x
$\mathbb{E}^n$    n-dimensional Euclidean space
$\overline{UV}$    line segment with U and V as endpoints
$\log N$    dyadic logarithm of N
$d_{UV}$    distance between points U and V
$\partial X$    boundary of X
$\hat{X}$    the hat mark denotes that $\hat{X}$ is an estimate of quantity X
$\tilde{Y}$    the tilde mark denotes that $\tilde{Y}$ is the average value of quantity Y
$\vec{Z}$    the arrow mark denotes that $\vec{Z}$ is a vector in $\mathbb{E}^n$ space
const    constant
$int(X)$    interior of X
$ext(X)$    exterior of X
$U_x$, $U_y$, $U_z$    coordinates of point U in $\mathbb{E}^3$
$lchild(\nu)$, $rchild(\nu)$    left (right) child of the node $\nu$ in the kd-tree

Appendix A

C-pseudocode of the sequential ray traversal algorithm $TA_{seq}$ for the kd-tree.

/* Possible orientation of the splitting plane in the interior node of the kd-tree, */
/* "No_axis" denotes a leaf. */
enum Axes { X_axis, Y_axis, Z_axis, No_axis };

/* Declaration of the kd-tree node. */
struct KDTNode {
  Point3D min, max;       /* extent of node ... six float values */
  GeomObjlist *objlist;   /* list of enclosed objects */
  struct KDTNode *left;   /* pointer to the left child */
  struct KDTNode *right;  /* pointer to the right child */
  Axes axis;              /* orientation of the splitting plane */
};

/* Locate the leaf containing the point, starting from the given node. */
KDTNode* LocateLeaf(KDTNode *node, Point3D point)
{
  KDTNode *currNode = node;
  if (point lies outside node bounding box)
    return ["no leaf exists"];
  while (currNode points to interior node) {
    /* the splitting plane coincides with the min face of the right child */
    if (point[currNode->axis] < currNode->right->min[currNode->axis])
      currNode = currNode->left;
    else
      currNode = currNode->right;
  } /* while */

  /* return the found leaf that contains point */
  return currNode;
} /* LocateLeaf */

/* Sequential ray traversal algorithm */
Object RayTravAlgSEQ(KDTNode *rootNode, Ray ray)
{
  float a, b;         /* entry/exit point signed distances */
  Point3D point;      /* the point along the ray path */
  KDTNode *currNode;  /* pointer to a kd-tree node */

  /* intersect the ray with sceneBox, find the entry and exit signed distance */
  RayBoxIntersect(ray, rootNode, &a, &b);
  if (ray does not intersect sceneBox)
    return ["No object"];
  /* start at the root node */
  currNode = rootNode;

  /* when ray has the origin inside sceneBox, */
  if (a < 0.0)
    /* use the point of origin for initial search, */
    point = ray.origin;
  else
    /* otherwise use ray entry point for sceneBox */
    point = ray.origin + ray.dir * (a + epsilon);
  /* starting from the root node, locate the first leaf */
  currNode = LocateLeaf(rootNode, point);

  /* traverse through whole kd-tree until the object */
  /* is intersected or the ray leaves the scene */
  while (currNode points to leaf) {
    /* find out signed distances to the leaf node bounding box */
    RayBoxIntersect(ray, currNode, &a, &b);
    if (currNode is not empty leaf) {
      "intersect ray with each object in the object list"
      "discarding those lying before (a) or farther than (b)"
      if (any intersection exists)
        return ["object with the closest intersection point"];
    } /* if */

    /* compute the point on the ray path in the next leaf */
    point = ray.origin + ray.dir * (b + epsilon);

    /* locate the next leaf along the ray path, if possible */
    currNode = LocateLeaf(rootNode, point);
  } /* while */

  /* all the leaves of the kd-tree along the ray path were tested */
  return ["No object"];
} /* RayTravAlgSEQ */

Appendix B

C-pseudocode of the recursive ray traversal algorithm $TA^A_{rec}$ for the kd-tree.

/* Possible orientation of the splitting plane in the interior node of the kd-tree, */
/* "No_axis" denotes a leaf. */
enum Axes { X_axis, Y_axis, Z_axis, No_axis };

/* Declaration of the node of kd-tree */
struct KDTreeNode {
  Point3D min, max;          /* extent of node ... six float values */
  GeomObjlist *objlist;      /* list of enclosed objects */
  struct KDTreeNode *left;   /* pointer to the left child */
  struct KDTreeNode *right;  /* pointer to the right child */
  Axes axis;                 /* orientation of the splitting plane */
};

/* Entry for stack operation. */
struct StackElem {
  KDTreeNode* node;
  float a;  /* entry signed distance (a) for the node */
  float b;  /* exit signed distance (b) for the node */
};

/* Recursive ray traversal algorithm, which suffers from a lack of robustness. */
Object RayTravAlgRECA(KDTreeNode *rootNode, Ray ray)
{
  float a, b;  /* entry/exit point signed distances */
  float t;     /* signed distance to the splitting plane */

  /* intersect ray with sceneBox, find the entry and exit signed distance */
  RayBoxIntersect(ray, rootNode, &a, &b);
  if (ray does not intersect sceneBox)
    return ["No object"];

  /* stack to avoid recursive calls, required for efficiency */
  StackElem stack[MAXDEPTH];  /* MAXDEPTH could be 50 */
  int stackPtr = 0;           /* pointer to the stack */
  /* pointers to the children node and current node */
  KDTreeNode *farChild, *nearChild, *currNode;

  /* push the initial values onto the stack */
  "store rootNode, a, b onto the stack and increment stackPtr"

  /* until we traversed through the whole kd-tree */
  while (stack is not empty) {  /* stackPtr > 0 */
    /* pop values from the stack */
    "decrement stackPtr and retrieve currNode, a, and b from the stack"

    /* loop until a leaf is found */
    while (currNode is not a leaf) {
      /* current node is an interior node */
      /* for X_axis, Y_axis, Z_axis compute difference between */
      /* position of splitting plane and ray origin */
      float diff = currNode->right->min[axis] - ray.origin[axis];
      /* the signed distance to splitting plane */
      t = diff / ray.dir[axis];

      /* NEGATIVE or POSITIVE cases? */
      /* the case ZERO is not recognized! */
      if (diff > 0.0) {  /* NEGATIVE */
        nearChild = currNode->left;
        farChild = currNode->right;
      } else {           /* POSITIVE */
        nearChild = currNode->right;
        farChild = currNode->left;
      } /* if */

      /* distinguish between cases 1, 3, 4, and 5, */
      /* but case 2 is not taken into account! */
      if ((t > b) || (t < 0.0))
        currNode = nearChild;  /* case 3 or 1 */
      else if (t < a)
        currNode = farChild;   /* case 5 */
      else {                   /* case 4: push */
        "store farChild, t, b onto the stack and increment stackPtr"
        /* select the near child for further traversal */
        currNode = nearChild;
        /* change the exit signed distance */
        b = t;
      } /* if */
    } /* while */

    /* current node is the leaf ... empty or full */
    "intersect ray with each object in the object list"
    "discarding those lying before (a) or farther than (b)"
    if (any intersection exists)
      return ["object with the closest intersection point"];
  } /* while (stack is not empty) */

  /* if stack is empty, no intersection has been found */
  return ["No object"];
} /* RayTravAlgRECA */
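The decision taken at each interior node in the pseudocode above depends only on the signed distances t, a, and b. The following sketch isolates that three-way test so it can be exercised on its own; the enum Visit and the function name ClassifySplit are hypothetical names introduced for this illustration.

```c
#include <assert.h>

/* Hypothetical result type: which children the ray has to visit. */
typedef enum { VISIT_NEAR, VISIT_FAR, VISIT_BOTH } Visit;

/* Mirrors the test in RayTravAlgRECA:
 *   t ... signed distance from the ray origin to the splitting plane
 *   a ... entry signed distance of the current node
 *   b ... exit signed distance of the current node */
Visit ClassifySplit(float t, float a, float b)
{
    assert(a <= b);
    if (t > b || t < 0.0f)
        return VISIT_NEAR;  /* plane is crossed beyond the exit or behind the origin */
    if (t < a)
        return VISIT_FAR;   /* plane is crossed before the ray enters the node */
    return VISIT_BOTH;      /* plane is crossed inside the node */
}
```

For example, with a = 1 and b = 3 a plane at t = 2 yields VISIT_BOTH, in which case the pseudocode pushes the far child with the interval (t, b) and continues in the near child with the interval (a, t).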

Appendix C

C-pseudocode of the recursive ray traversal algorithm $TA^B_{rec}$ for the kd-tree.

/* Possible orientation of the splitting plane in the interior node of the kd-tree, */
/* "No_axis" denotes a leaf. */
enum Axes { X_axis, Y_axis, Z_axis, No_axis };

/* Declaration of the kd-tree node */
struct KDTNode {
  GeomObjlist *objlist;   /* list of enclosed objects */
  struct KDTNode *left;   /* pointer to the left child */
  struct KDTNode *right;  /* pointer to the right child */
  Axes axis;              /* orientation of the splitting plane */
  float splitPlane;       /* position of the splitting plane */
};

/* Entry for stack operation. */
struct StackElem {
  KDTNode* node;  /* pointer to far child */
  float t;        /* the entry/exit signed distance */
  Point3D pb;     /* the coordinates of entry/exit point */
  int prev;       /* the pointer to the previous stack item */
};

/* Recursive ray traversal algorithm. */
Object RayTravAlgRECB(KDTNode *rootNode, Ray ray)
{
  float a, b;  /* entry/exit signed distance */
  float t;     /* signed distance to the splitting plane */

  /* intersect ray with sceneBox, find the entry and exit signed distance */
  RayBoxIntersect(ray, rootNode, &a, &b);
  if (ray does not intersect sceneBox)
    return ["No object"];

  /* stack required for traversal to store far children */
  StackElem stack[MAXDEPTH];  /* MAXDEPTH could be 50 */
  /* pointers to the far child node and current node */
  KDTNode *farChild, *currNode;
  currNode = rootNode;  /* start from the kd-tree root node */

  /* setup initial entry point ... enPt corresponds to pointer */
  int enPt = 0;
  stack[enPt].t = a;  /* set the signed distance */
  /* distinguish between internal and external origin */
  if (a >= 0.0)  /* a ray with external origin */
    stack[enPt].pb = ray.origin + ray.dir * a;
  else           /* a ray with internal origin */
    stack[enPt].pb = ray.origin;

  /* setup initial exit point in the stack */
  int exPt = 1;  /* pointer to the stack */
  stack[exPt].t = b;
  stack[exPt].pb = ray.origin + ray.dir * b;
  stack[exPt].node = "nowhere";  /* set termination flag */

  /* loop, traverse through the whole kd-tree, */
  /* until an object is intersected or ray leaves the scene */
  while (currNode does not point to "nowhere") {
    /* loop until a leaf is found */
    while (currNode is not a leaf) {
      /* retrieve position of splitting plane */
      float splitVal = currNode->splitPlane;
      /* similar code for all axes */
      /* nextAxis is x -> y, y -> z, z -> x */
      /* prevAxis is x -> z, y -> x, z -> y */
      if (stack[enPt].pb[axis] <= splitVal) {
        if (stack[exPt].pb[axis] <= splitVal) {
          /* case N1, N2, N3, P5, Z2, and Z3 */
          currNode = currNode->left;
          continue;
        }
        /* the exit point lies behind the plane here, so case Z1 */
        /* is recognized by the entry point lying on the plane */
        if (stack[enPt].pb[axis] == splitVal) {
          currNode = currNode->right;
          continue;  /* case Z1 */
        } /* if */
        /* case N4 */
        farChild = currNode->right;
        currNode = currNode->left;
      } else {  /* (stack[enPt].pb[axis] > splitVal) */
        if (splitVal < stack[exPt].pb[axis]) {
          /* case P1, P2, P3, and N5 */
          currNode = currNode->right;
          continue;
        } /* if */
        /* case P4 */
        farChild = currNode->left;
        currNode = currNode->right;
      } /* if */

      /* case P4 or N4 ... traverse both children */
      /* signed distance to the splitting plane */
      t = (splitVal - ray.origin[axis]) / ray.dir[axis];

      /* setup the new exit point */
      int tmp = exPt;
      Increment(exPt);
      /* possibly skip current entry point so not to overwrite the data */
      if (exPt == enPt)
        Increment(exPt);

      /* push values onto the stack */
      stack[exPt].prev = tmp;
      stack[exPt].t = t;
      stack[exPt].node = farChild;
      stack[exPt].pb[axis] = splitVal;
      stack[exPt].pb[nextAxis] = ray.origin[nextAxis] + t * ray.dir[nextAxis];
      stack[exPt].pb[prevAxis] = ray.origin[prevAxis] + t * ray.dir[prevAxis];
    } /* while */

    /* current node is the leaf ... empty or full */
    "intersect ray with each object in the object list, discarding "
    "those lying before stack[enPt].t or farther than stack[exPt].t"
    if (any intersection exists)
      return ["object with closest intersection point"];

    /* pop from the stack */
    enPt = exPt;  /* the signed distance intervals are adjacent */
    /* retrieve the pointer to the next node, */
    /* it is possible that ray traversal terminates */
    currNode = stack[exPt].node;
    exPt = stack[enPt].prev;
  } /* while */

  /* currNode = "nowhere", ray leaves the scene */
  return ["No object"];
} /* RayTravAlgRECB */
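The push step of the pseudocode above stores the point where the ray crosses the splitting plane, writing the plane coordinate directly and computing the other two coordinates from t. A minimal sketch of that step follows, assuming the axes are numbered 0, 1, 2 so that nextAxis and prevAxis can be obtained by modular arithmetic; Vec3, NextAxis, PrevAxis, and PointOnSplitPlane are hypothetical names for this illustration.

```c
#include <assert.h>

typedef struct { float v[3]; } Vec3;  /* hypothetical point/vector type */

/* Cyclic successor and predecessor of an axis index 0 (x), 1 (y), 2 (z). */
int NextAxis(int axis) { return (axis + 1) % 3; }
int PrevAxis(int axis) { return (axis + 2) % 3; }

/* Computes the point where the ray crosses the plane `axis = splitVal`
 * and returns the signed distance through *t. The coordinate along
 * `axis` is assigned directly rather than evaluated from the ray
 * equation, so it equals splitVal exactly. */
Vec3 PointOnSplitPlane(Vec3 origin, Vec3 dir, int axis, float splitVal, float *t)
{
    assert(axis >= 0 && axis < 3);
    assert(dir.v[axis] != 0.0f);  /* the ray must not be parallel to the plane */
    int next = NextAxis(axis), prev = PrevAxis(axis);
    Vec3 p;
    *t = (splitVal - origin.v[axis]) / dir.v[axis];
    p.v[axis] = splitVal;
    p.v[next] = origin.v[next] + *t * dir.v[next];
    p.v[prev] = origin.v[prev] + *t * dir.v[prev];
    return p;
}
```

Assigning the plane coordinate directly means that later comparisons of the stored point against the same splitVal are not affected by rounding in the ray equation.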


Appendix D

C-pseudocodes concerning ray traversal algorithms for the kd-tree with neighbor-links.

Algorithm determining a single neighbor-link for one face of a leaf of the kd-tree.

/* Possible orientation of the splitting plane in the interior node of the kd-tree, */
/* "No_axis" denotes a leaf. */
enum Axes { X_axis, Y_axis, Z_axis, No_axis };

/* Denote the faces of kd-tree node */
enum Faces { FLeft, FRight, FFront, FBack, FBottom, FTop };

/* Declaration of the kd-tree node */
struct KDTNode {
  Point3D min, max;       /* extent of node ... six float values */
  GeomObjlist *objlist;   /* list of enclosed objects */
  struct KDTNode *left;   /* pointer to the left child */
  struct KDTNode *right;  /* pointer to the right child */
  /* links from faces to neighbors, either single neighbor-links */
  /* or neighbor-links trees */
  struct KDTNode *flinks[6];
  Axes axis;              /* orientation of the splitting plane */
  float splitPlane;       /* position of the splitting plane */
};

/* Given a node and one of its faces, find out the single neighbor-link */
KDTNode* FindSingleNeighborLink(KDTNode *node, Faces face, KDTNode *rootNode)
{
  KDTNode *currNode;  /* currently accessed node */
  if (node->box[face] is coplanar with a face of sceneBox)
    /* ray leaves the scene from the face of given node */
    return ["No neighbor link exists"];

  /* stack required for traversal to store far child nodes */
  KDTNode *stack[MAXDEPTH];  /* MAXDEPTH could be 50 */
  /* search starts from the root node */
  stack.push(rootNode);

  /* loop until we find out the neighbor-link */
  while (stack is not empty) {
    /* get currNode node on the path */
    currNode = stack.pop();
    if (currNode is not leaf) {
      /* currNode node is interior one */
      if (currNode->axis is perpendicular to face) {
        /* splitting plane parallel with face of node */
        if (node->box.GetExtent(face) < currNode->splitPlane - epsilon)
          stack.push(currNode->left);
        else if (node->box.GetExtent(face) > currNode->splitPlane + epsilon)
          stack.push(currNode->right);
        else {
          /* the splitting plane underlies the face, select the opposite */
          /* part of kd-tree that does not include node */
          if (face is a min face of the node)
            /* it was a min limit: go to the left */
            stack.push(currNode->left);
          else
            /* it was a max limit: go to the right */
            stack.push(currNode->right);
        } /* if */
      } else {
        /* it is some other axis so test if it splits the face or not */
        if (node->min[currNode->axis] >= currNode->splitPlane)
          /* greater, it avoids splitting the face of node */
          stack.push(currNode->right);
        else if (node->max[currNode->axis] <= currNode->splitPlane)
          /* smaller, it avoids splitting the face of node */
          stack.push(currNode->left);
        /* otherwise the plane splits the face and the search stops here */
      } /* if axis ... */
    } /* if current node is not a leaf */
  } /* while stack is not empty */

  /* neighbor-link is located, it points either to leaf or to the interior node */
  return ["currNode"];
} /* FindSingleNeighborLink */

Algorithm replacing a single link by a neighbor-links tree decomposed into two procedures.

/* Given a node and face, possibly replaces indirect neighbor-link */
/* by corresponding neighbor-links tree */
void BuildNeighborLinksTree(KDTNode *node, Faces face)
{
  if (node->flinks[face] == NULL)
    return;
  if (node->flinks[face] is leaf)
    return;
  /* the neighbor-link points to an interior kd-tree node, */
  /* so it makes sense to replace it by a neighbor-links tree */
  node->flinks[face] = CreateNeighborLinksTree(node, node->flinks[face], face);
} /* BuildNeighborLinksTree */

/* Always creates neighbor-links tree. */
KDTNode* CreateNeighborLinksTree(KDTNode *node, KDTNode *subtree, Faces face)
{
  /* current node of the kd-tree */
  KDTNode *currNode = subtree;
  while (currNode is not leaf) {
    /* currNode is an interior node, we are descending to a leaf */
    if (currNode->axis is perpendicular to face) {
      if (node->box[face] is to the left of currNode->splitPlane)
        currNode = currNode->left;
      else if (node->box[face] is to the right of currNode->splitPlane)
        currNode = currNode->right;
      else {
        /* limits are equal, decide according to the opposite limit */
        if (face is a min face of currNode)
          currNode = currNode->left;   /* it was a min limit: go left */
        else
          currNode = currNode->right;  /* it was a max limit: go right */
      } /* if */
    } else {
      /* it is some other axis so test if it splits the face or not */
      if (node->box is to the right of currNode->splitPlane)
        currNode = currNode->right;  /* greater */
      else if (node->box is to the left of currNode->splitPlane)
        currNode = currNode->left;   /* smaller */
      else {
        /* this node intersects the face, so it must be added */
        /* to the created neighbor-links tree */
        KDTNode *result = new KDTNode;
        result->splitPlane = currNode->splitPlane;
        result->axis = currNode->axis;
        /* recursively call itself to get whole neighbor-links tree */
        result->left = CreateNeighborLinksTree(node, currNode->left, face);
        result->right = CreateNeighborLinksTree(node, currNode->right, face);
        return result;
      } /* if */
    } /* if perpendicular to face */
  } /* while curr node is not a leaf */

  /* return the leaf node of neighbor-links tree */
  return currNode;
} /* CreateNeighborLinksTree */


Ray traversal algorithm for the kd -tree with single neighbor-links and neighbor-links trees.

/* Ray traversal algorithm for neighbor-links, */
/* leaf node where ray origin is located may be specified, if known. */
Object RayTravAlgNL(Ray ray, KDTNode *leafOrigin, KDTNode *rootNode)
{
  KDTNode *currLeaf = leafOrigin;  /* currently accessed leaf node of kd-tree */
  Point3D exitPoint;               /* exit point of current leaf node */

  if (currLeaf is not valid leaf of kd-tree) {
    float a, b;  /* entry/exit point signed distances */

    /* intersect ray with sceneBox, find the entry and exit signed distance */
    RayBoxIntersect(ray, rootNode, &a, &b);
    if (ray does not intersect sceneBox)
      return ["No object"];
    /* entry intersection point of ray and sceneBox */
    exitPoint = ray.origin + ray.dir * (a + epsilon);
    /* locate the first leaf along the ray path */
    currLeaf = LocateLeaf(rootNode, exitPoint);
  } /* if */

  /* traverse through whole kd-tree until the object is intersected */
  /* or the ray leaves the scene */
  while (true) {  /* endless loop */
    if (currLeaf is not empty leaf) {
      "intersect ray with each primitive in the object list"
      "discarding those lying outside the leaf bounding box"
      if (any intersection exists)
        return ["object with the closest intersection point"];
    } /* if */

    /* exit-face determination for exit intersection point */
    nextExitFace = GetExitFace(currLeaf, ray, exitPoint);

    if (currLeaf->flinks[nextExitFace] does point outside the sceneBox)
      return ["No object"];  /* the ray exits the scene */
    if (currLeaf->flinks[nextExitFace] is a leaf)
      currLeaf = currLeaf->flinks[nextExitFace];
    else
      currLeaf = LocateLeaf(currLeaf->flinks[nextExitFace], exitPoint);
  } /* while */
} /* RayTravAlgNL */
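The traversal above relies on GetExitFace, which is not listed. One possible way to determine the exit face and exit point of a leaf is sketched below. It assumes that FLeft/FRight are the min/max faces along x, FFront/FBack along y, and FBottom/FTop along z (the thesis only names the faces), and the types Vec3, Ray, and Box as well as the name ExitFaceSketch are hypothetical.

```c
#include <assert.h>
#include <math.h>

typedef struct { float v[3]; } Vec3;       /* hypothetical helper types */
typedef struct { Vec3 origin, dir; } Ray;
typedef struct { Vec3 min, max; } Box;

/* Face order as in the enum of Appendix D; the mapping of the names to
 * axes (x: Left/Right, y: Front/Back, z: Bottom/Top) is an assumption. */
typedef enum { FLeft, FRight, FFront, FBack, FBottom, FTop } Faces;

/* For a ray whose current point lies inside `box`, finds the face through
 * which the ray leaves the box and stores the exit point in *exitPoint. */
Faces ExitFaceSketch(const Box *box, const Ray *ray, Vec3 *exitPoint)
{
    float best = INFINITY;
    Faces face = FLeft;
    for (int i = 0; i < 3; i++) {
        float d = ray->dir.v[i];
        if (d == 0.0f) continue;  /* never leaves through this pair of faces */
        /* the ray can only leave through the face it is heading towards */
        float plane = (d > 0.0f) ? box->max.v[i] : box->min.v[i];
        float t = (plane - ray->origin.v[i]) / d;
        if (t < best) {
            best = t;
            face = (Faces)(2 * i + (d > 0.0f ? 1 : 0));
        }
    }
    assert(best < INFINITY);      /* the direction must be a nonzero vector */
    for (int i = 0; i < 3; i++)
        exitPoint->v[i] = ray->origin.v[i] + best * ray->dir.v[i];
    return face;
}
```

The returned face would then index flinks to obtain the neighbor-link, and the exit point would be passed to LocateLeaf when the link points to an interior node.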

Appendix E

The experimental results for SPD scenes are presented here primarily for testing procedure TPD. The experiments for the settings described on lines 0–47 were conducted on a PC with an Intel Pentium II MMX at 466 MHz and 128 MB RAM, running the Linux operating system (kernel version 2.2.12-20), using the egcs-1.1.2 compiler release with the optimization switch “-O2”. In addition, for lines 33–47 the compiler setting “-DNDEBUG” was used (for this reason the results for lines 45 and 13 differ in subset Θ). The results are presented in tabular form using the minimum testing output (described in Chapter 2). The results for each of the 30 SPD scenes are presented together with several invariants of the measurement related to TPD:

$N_{hit}$: number of primary rays intersecting the scene axis-aligned bounding box,
$N^{hit}_{prim}$: number of primary rays intersecting an object,
$N_{sec}$: number of secondary rays (reflected rays and refracted rays),
$N^{hit}_{sec}$: number of secondary rays intersecting an object (reflected and refracted rays),
$N_{shad}$: number of shadow rays,
$N^{hit}_{shad}$: number of shadow rays hitting opaque objects,
$T_R^{MIN}$: minimum application running time,
$T_{app}$: remaining application time,
$T_{RSA}^{MIN}$: ideal ray shooting time. (See Chapter 2 for details.)

Several other scene characteristics are given in Table 3.2, Chapter 3. In the experiments for kd-trees built for special ray sets (parallel, perspective, spherical) these invariants do not hold, so we do not state them here. Other settings used in the tables:

Line 0: “naïve RSA” (for several experiments the integer counters overflowed; the results are not fully reported),
Line 48: BVH, bounding volume hierarchy (Subsection 3.5.2 and Subsection 1.6.2),
Line 49: O84, octree (Subsection 3.5.2 and Subsubsection 1.6.3.2),
Line 50: O89, octree (Subsection 3.5.2 and Subsubsection 1.6.3.2),
Line 51: BSP, binary space partitioning tree (Subsection 3.5.2 and Subsubsection 1.6.3.1),
Line 52: O93, octree (Subsection 3.5.2 and Subsubsection 1.6.3.2),
Line 53: UG, uniform grid (Subsection 3.5.2 and Subsubsection 1.6.3.3),
Line 54: AG, adaptive grid (Subsection 3.5.2 and Subsubsection 1.6.3.4),
Line 55: HUG, hierarchy of uniform grids (Subsection 3.5.2 and Subsubsection 1.6.3.4),
Line 56: RG, recursive grids (Subsection 3.5.2 and Subsubsection 1.6.3.4),
Line 57: O84A, Octree-R (Subsection 3.5.2 and Subsubsection 1.6.3.2),
Line 58: O93A, Octree-R (Subsection 3.5.2 and Subsubsection 1.6.3.2),
Line 59: KD, kd-tree (Subsection 3.5.2, Chapter 4, and Chapter 5).


Mnemonic Notation for Tables in Appendix E

The setting for the experiments is described in two sections. In Section 4.10, lines 1–32, the setting is described for kd-tree construction algorithms. In Section 5.5, lines 33–47, the setting is described for ray traversal algorithms. The second column in the tables gives mnemonic symbols for the setting used in the experiments, where the following symbols are used:

(atc): automatic termination criteria (Section 4.5),
($d_{max}$, $N_{max}$): ad hoc termination criteria with maximum leaf depth $d_{max}$ and maximum number of objects in a leaf $N_{max}$ (Subsubsection 4.2.4.1),
objmed: object median subdivision (Subsection 4.2.2),
spatmed: spatial median subdivision (i.e. BSP tree, Subsection 4.2.2),
xyz: cyclical change of splitting plane orientation starting with the x-axis (Subsection 4.2.1),
GCM, GCM2, GCM3: general cost model (Section 4.7),
LC: late empty space cutting off (Subsection 4.4.3),
OSAH: ordinary surface area heuristic (Subsection 4.2.2),
PAR: ray set induced by parallel projection (Section 4.8),
PARSAH: parallel surface area heuristic (Subsection 4.8.1),
PER: ray set induced by perspective projection (Section 4.8),
PERSAH: perspective surface area heuristic (Subsection 4.8.2),
RMI: search restricted to median interval (Subsubsection 4.2.3.4),
SPH: ray set induced by spherical projection (Section 4.8),
SPHSAH: spherical surface area heuristic (Subsection 4.8.3),
$TA_{seq}$: sequential ray traversal algorithm (Subsection 5.3.1),
$TA^A_{rec}$: slower recursive ray traversal algorithm (Subsection 5.3.2),
$TA^B_{rec}$: faster recursive ray traversal algorithm (Section 5.4),
$TA_{SNL}$: ray traversal algorithm with single neighbor-links (Section 5.4),
$TA_{NLT}$: ray traversal algorithm with neighbor-links trees (Section 5.4),
TPC: two-plane empty space cutting off (Subsection 4.4.4).

If not stated explicitly by PAR, PER, or SPH, the distribution of rays was induced by the testing procedure TPD . The same holds for the ray traversal algorithm. If not stated explicitly, the recursive ray traversal algorithm TABrec is used. Symbol “–” denotes unknown value.

Scene = “balls3”

Minimum testing output: Σ = ($N_G$, $N_E$, $N_{EE}$, $N_{ER}$), ∆ = ($r_{ITM}$, $\tilde{N}_{TS}$, $\tilde{N}_{ETS}$, $\tilde{N}_{EETS}$), Θ = ($T_B$, $T_R$, $\Theta_{APP}$, $\Theta_{rat}$, $\Theta_{RUN}$).

Line | Mnemonic Notation | $N_G$ | $N_E$ | $N_{EE}$ | $N_{ER}$ | $r_{ITM}$ | $\tilde{N}_{TS}$ | $\tilde{N}_{ETS}$ | $\tilde{N}_{EETS}$ | $T_B$ | $T_R$ | $\Theta_{APP}$ | $\Theta_{rat}$ | $\Theta_{RUN}$
-----|-------------------|-------|-------|----------|----------|-----------|------------------|-------------------|--------------------|-------|-------|----------------|----------------|---------------
0 | naïve RSA | 0 | 0 | 1 | 821 | – | 0 | 0 | 0 | 0.00 | 314.77 | 12.07 | 1.0 | 749.45
1 | spatmed-xyz(16,2) | 526 | 527 | 219 | 2373 | 78.07 | 57.46 | 11.02 | 5.83 | 0.02 | 27.02 | 12.07 | 0.42 | 52.26
2 | objmed-xyz(16,2) | 4323 | 4324 | 185 | 8025 | 56.10 | 66.82 | 14.68 | 0.96 | 0.13 | 28.28 | 12.07 | 0.31 | 55.26
3 | objmed(16,2) | 3829 | 3830 | 110 | 7192 | 61.00 | 63.32 | 14.92 | 0.43 | 0.16 | 29.44 | 12.07 | 0.34 | 58.02
4 | OSAH(16,2) | 1569 | 1570 | 295 | 2608 | 9.78 | 19.56 | 3.98 | 1.01 | 0.13 | 12.66 | 12.07 | 0.21 | 18.07
5 | OSAH-RMI(16,2) | 1569 | 1570 | 295 | 2608 | 9.78 | 19.56 | 3.98 | 1.01 | 0.13 | 12.26 | 12.07 | 0.21 | 17.12
6 | OSAH-xyz(16,2) | 1779 | 1780 | 306 | 2964 | 9.68 | 21.87 | 4.64 | 1.74 | 0.07 | 12.49 | 12.07 | 0.19 | 17.67
7 | OSAH(8,1) | 65 | 66 | 3 | 989 | 64.44 | 11.36 | 2.69 | 0.17 | 0.06 | 18.20 | 12.07 | 0.75 | 31.26
8 | OSAH(8,2) | 64 | 65 | 3 | 988 | 64.45 | 11.36 | 2.69 | 0.17 | 0.06 | 18.01 | 12.07 | 0.75 | 30.81
9 | OSAH(16,1) | 2411 | 2412 | 663 | 3076 | 9.01 | 21.47 | 4.34 | 1.24 | 0.13 | 12.64 | 12.07 | 0.18 | 18.02
10 | OSAH(16,2) | 1569 | 1570 | 295 | 2608 | 9.78 | 19.56 | 3.98 | 1.01 | 0.13 | 12.66 | 12.07 | 0.21 | 18.07
11 | OSAH(24,1) | 9063 | 9064 | 1120 | 10930 | 8.30 | 24.15 | 4.78 | 1.38 | 0.30 | 13.22 | 12.07 | 0.16 | 19.40
12 | OSAH(24,2) | 3667 | 3668 | 340 | 6321 | 9.49 | 20.51 | 4.13 | 1.03 | 0.18 | 12.54 | 12.07 | 0.20 | 17.79
13 | OSAH(atc) | 767 | 768 | 158 | 1674 | 12.38 | 18.57 | 3.87 | 0.96 | 0.09 | 12.61 | 12.07 | 0.26 | 17.95
14 | OSAH2(atc) | 1098 | 1099 | 145 | 2154 | 11.91 | 22.26 | 4.39 | 1.23 | 0.13 | 13.02 | 12.07 | 0.22 | 18.93
15 | OSAH+LC(atc) | 790 | 791 | 158 | 1732 | 12.38 | 18.57 | 3.87 | 0.96 | 0.09 | 12.61 | 12.07 | 0.26 | 17.95
16 | OSAH+TPC(atc) | 732 | 733 | 148 | 1646 | 12.39 | 18.51 | 3.86 | 0.96 | 0.11 | 12.56 | 12.07 | 0.26 | 17.83
17 | OSAH+TPC+LC(atc) | 813 | 814 | 148 | 1829 | 12.38 | 18.57 | 3.87 | 0.96 | 0.13 | 11.60 | 12.07 | 0.26 | 15.55
18 | OSAH+LC(16,1) | 2411 | 2412 | 663 | 3076 | 9.01 | 21.47 | 4.34 | 1.24 | 0.17 | 11.72 | 12.07 | 0.18 | 15.83
19 | OSAH+TPC(16,1) | 2409 | 2410 | 665 | 3074 | 9.01 | 21.48 | 4.34 | 1.24 | 0.17 | 11.82 | 12.07 | 0.18 | 16.07
20 | OSAH+TPC+LC(16,1) | 2409 | 2410 | 665 | 3074 | 9.01 | 21.48 | 4.34 | 1.24 | 0.17 | 11.75 | 12.07 | 0.18 | 15.90
21 | OSAH+PR(atc) | 767 | 768 | 158 | 1492 | 11.97 | 18.44 | 3.86 | 0.98 | 0.12 | 11.83 | 12.07 | 0.25 | 16.10
22 | OSAH+SC(atc) | 720 | 721 | 202 | 1468 | 11.22 | 18.29 | 3.85 | 1.19 | 0.15 | 12.03 | 12.07 | 0.25 | 16.57
23 | OSAH+GCM(atc) | 906 | 907 | 182 | 1859 | 11.95 | 19.54 | 4.05 | 1.05 | 2.52 | 12.39 | 12.07 | 0.25 | 17.43
24 | OSAH+GCM2(atc) | 1535 | 1536 | 87 | 3695 | 14.25 | 22.71 | 4.58 | 0.92 | 4.65 | 13.35 | 12.07 | 0.25 | 19.71
25 | OSAH+GCM3(atc) | 791 | 792 | 76 | 2040 | 19.44 | 21.63 | 4.13 | 0.58 | 2.38 | 13.81 | 12.07 | 0.32 | 20.81
26 | OSAH+PAR(atc) | 767 | 768 | 158 | 1674 | 6.08 | 21.25 | 4.84 | 1.39 | 0.10 | 4.16 | 13.00 | 0.36 | 7.80
27 | PARSAH+PAR(atc) | 812 | 813 | 174 | 1723 | 5.98 | 21.33 | 4.83 | 1.38 | 0.14 | 4.15 | 13.00 | 0.35 | 7.75
28 | OSAH+PER(atc) | 767 | 768 | 158 | 1674 | 5.93 | 21.06 | 4.84 | 1.56 | 0.11 | 4.23 | 12.43 | 0.39 | 7.71
29 | PERSAH+PER(atc) | 1241 | 1242 | 284 | 2142 | 5.17 | 23.85 | 5.56 | 1.90 | 24.21 | 4.33 | 12.43 | 0.35 | 8.19
30 | SPHSAH+PER(atc) | 702 | 703 | 129 | 1688 | 6.98 | 20.78 | 4.82 | 1.12 | 0.78 | 4.33 | 12.43 | 0.43 | 8.19
31 | OSAH+SPH(atc) | 767 | 768 | 158 | 1674 | 5.86 | 20.74 | 4.76 | 1.53 | 0.11 | 4.50 | 12.04 | 0.43 | 7.52
32 | SPHSAH+SPH(atc) | 702 | 703 | 129 | 1688 | 6.90 | 20.50 | 4.75 | 1.10 | 0.78 | 4.62 | 12.04 | 0.48 | 8.04
33 | OSAH+$TA_{seq}$(16,2) | 1569 | 1570 | 295 | 2608 | 10.30 | 46.11 | 4.10 | 1.01 | 0.10 | 16.31 | 12.07 | 0.12 | 26.76
34 | OSAH+$TA^A_{rec}$(16,2) | 1569 | 1570 | 295 | 2608 | 9.78 | 19.56 | 3.98 | 1.01 | 0.10 | 12.55 | 12.07 | 0.18 | 17.81
35 | OSAH+$TA^B_{rec}$(16,2) | 1569 | 1570 | 295 | 2608 | 9.78 | 19.56 | 3.98 | 1.01 | 0.10 | 11.47 | 12.07 | 0.21 | 15.24
36 | OSAH+$TA_{SNL}$(16,2) | 1569 | 1570 | 295 | 2608 | 10.30 | 18.24 | 4.11 | 1.01 | 0.11 | 12.06 | 12.07 | 0.19 | 16.64
37 | OSAH+$TA_{NLT}$(16,2) | 10510 | 1570 | 295 | 2608 | 10.30 | 16.54 | 4.10 | 1.01 | 0.14 | 12.33 | 12.07 | 0.19 | 17.29
38 | OSAH+$TA_{seq}$(18,2) | 2196 | 2197 | 337 | 3492 | 10.03 | 48.44 | 4.20 | 1.03 | 0.12 | 16.65 | 12.07 | 0.11 | 27.57
39 | OSAH+$TA^A_{rec}$(18,2) | 2196 | 2197 | 337 | 3492 | 9.51 | 20.11 | 4.07 | 1.03 | 0.12 | 12.72 | 12.07 | 0.17 | 18.21
40 | OSAH+$TA^B_{rec}$(18,2) | 2196 | 2197 | 337 | 3492 | 9.51 | 20.11 | 4.07 | 1.03 | 0.11 | 11.58 | 12.07 | 0.20 | 15.50
41 | OSAH+$TA_{SNL}$(18,2) | 2196 | 2197 | 337 | 3492 | 10.04 | 18.76 | 4.20 | 1.03 | 0.16 | 12.11 | 12.07 | 0.18 | 16.76
42 | OSAH+$TA_{NLT}$(18,2) | 14308 | 2197 | 337 | 3492 | 10.03 | 16.97 | 4.20 | 1.03 | 0.18 | 12.20 | 12.07 | 0.18 | 16.98
43 | OSAH+$TA_{seq}$(atc) | 767 | 768 | 158 | 1674 | 12.92 | 42.39 | 3.97 | 0.96 | 0.08 | 16.16 | 12.07 | 0.15 | 26.40
44 | OSAH+$TA^A_{rec}$(atc) | 767 | 768 | 158 | 1674 | 12.38 | 18.57 | 3.87 | 0.96 | 0.09 | 12.64 | 12.07 | 0.23 | 18.02
45 | OSAH+$TA^B_{rec}$(atc) | 767 | 768 | 158 | 1674 | 12.38 | 18.57 | 3.87 | 0.96 | 0.08 | 11.67 | 12.07 | 0.26 | 15.71
46 | OSAH+$TA_{SNL}$(atc) | 767 | 768 | 158 | 1674 | 12.92 | 17.25 | 3.97 | 0.96 | 0.09 | 12.15 | 12.07 | 0.24 | 16.86
47 | OSAH+$TA_{NLT}$(atc) | 5207 | 768 | 158 | 1674 | 12.92 | 15.72 | 3.97 | 0.96 | 0.09 | 12.23 | 12.07 | 0.24 | 17.05
48 | BVH | 106 | 520 | 0 | 821 | 37.19 | 25.89 | 17.40 | 0.00 | 0.05 | 197.49 | – | – | –
49 | O84 | 262 | 1835 | 853 | 4500 | 69.15 | 119.30 | 19.56 | 12.16 | 0.04 | 73.33 | – | – | –
50 | O89 | 262 | 1835 | 853 | 4500 | 69.34 | 76.73 | 19.55 | 12.16 | 0.03 | 50.37 | – | – | –
51 | BSP | 526 | 527 | 219 | 2373 | 78.07 | 57.46 | 11.02 | 5.83 | 0.85 | 82.40 | – | – | –
52 | O93 | 262 | 1835 | 853 | 4500 | 68.21 | 66.33 | 26.33 | 19.02 | 0.03 | 68.55 | – | – | –
53 | UG | 0 | 4107 | 2720 | 2502 | 141.67 | 5.21 | 5.21 | 2.58 | 0.03 | 45.45 | – | – | –
54 | AG | 22 | 2754 | 1257 | 3734 | 17.88 | 7.07 | 6.28 | 3.13 | 0.08 | 23.18 | – | – | –
55 | HUG | 14 | 852 | 366 | 1436 | 681.67 | 6.48 | 3.37 | 1.33 | 0.03 | 281.01 | – | – | –
56 | RG | 127 | 2922 | 577 | 6856 | 25.92 | 10.31 | 7.36 | 2.63 | 0.03 | 28.96 | – | – | –
57 | O84A | 1350 | 9451 | 1418 | 15373 | 13.09 | 41.51 | 8.61 | 4.60 | 0.19 | 32.29 | – | – | –
58 | KD | 1569 | 1570 | 295 | 2608 | 9.78 | 19.56 | 3.98 | 1.01 | 0.30 | 21.40 | – | – | –

Table 1: Experimental results for scene “balls3”.

$N = 821$. TPD: $N_{prim} = 263169$, $N_{hit} = 263169$, $N^{hit}_{prim} = 263169$, $N_{sec} = 151293$, $N^{hit}_{sec} = 104440$, $N_{shad} = 921077$, $N^{hit}_{shad} = 236800$, $T_R^{MIN}$ [s] = 5.49, $T_{app}$ [s] = 5.07, $T_{RSA}^{MIN}$ [s] = 0.42.

Scene = “gears2”

Minimum testing output: Σ = ($N_G$, $N_E$, $N_{EE}$, $N_{ER}$), ∆ = ($r_{ITM}$, $\tilde{N}_{TS}$, $\tilde{N}_{ETS}$, $\tilde{N}_{EETS}$), Θ = ($T_B$, $T_R$, $\Theta_{APP}$, $\Theta_{rat}$, $\Theta_{RUN}$).

Line | Mnemonic Notation | $N_G$ | $N_E$ | $N_{EE}$ | $N_{ER}$ | $r_{ITM}$ | $\tilde{N}_{TS}$ | $\tilde{N}_{ETS}$ | $\tilde{N}_{EETS}$ | $T_B$ | $T_R$ | $\Theta_{APP}$ | $\Theta_{rat}$ | $\Theta_{RUN}$
-----|-------------------|-------|-------|----------|----------|-----------|------------------|-------------------|--------------------|-------|-------|----------------|----------------|---------------
0 | naïve RSA | 0 | 0 | 1 | 1169 | – | 0 | 0 | 0 | 0.00 | 1166.83 | 6.82 | 1.0 | 783.11
1 | spatmed-xyz(16,2) | 2247 | 2248 | 156 | 9196 | 13.79 | 23.38 | 3.95 | 1.20 | 0.06 | 27.75 | 6.82 | 0.49 | 11.81
2 | objmed-xyz(16,2) | 6356 | 6357 | 490 | 13596 | 19.66 | 46.04 | 8.95 | 4.07 | 0.20 | 36.85 | 6.82 | 0.41 | 17.91
3 | objmed(16,2) | 8543 | 8544 | 619 | 15906 | 45.29 | 66.50 | 13.09 | 0.62 | 0.32 | 61.04 | 6.82 | 0.52 | 34.15
4 | OSAH(16,2) | 3295 | 3296 | 40 | 7546 | 7.55 | 17.00 | 2.98 | 0.53 | 0.19 | 23.71 | 6.82 | 0.42 | 9.09
5 | OSAH-RMI(16,2) | 3295 | 3296 | 40 | 7546 | 7.55 | 17.00 | 2.98 | 0.53 | 0.19 | 23.45 | 6.82 | 0.42 | 8.92
6 | OSAH-xyz(16,2) | 2157 | 2158 | 90 | 5325 | 8.07 | 20.78 | 4.04 | 1.70 | 0.10 | 24.74 | 6.82 | 0.38 | 9.79
7 | OSAH(8,1) | 90 | 91 | 11 | 1386 | 40.52 | 12.38 | 2.54 | 0.51 | 0.07 | 35.40 | 6.82 | 0.84 | 16.94
8 | OSAH(8,2) | 90 | 91 | 11 | 1386 | 40.52 | 12.38 | 2.54 | 0.51 | 0.07 | 35.36 | 6.82 | 0.84 | 16.91
9 | OSAH(16,1) | 5624 | 5625 | 165 | 9338 | 6.10 | 18.54 | 3.18 | 0.56 | 0.23 | 22.69 | 6.82 | 0.34 | 8.41
10 | OSAH(16,2) | 3295 | 3296 | 40 | 7546 | 7.55 | 17.00 | 2.98 | 0.53 | 0.19 | 23.71 | 6.82 | 0.42 | 9.09
11 | OSAH(24,1) | 14969 | 14970 | 1021 | 21134 | 5.70 | 19.30 | 3.34 | 0.60 | 0.49 | 22.90 | 6.82 | 0.32 | 8.55
12 | OSAH(24,2) | 6977 | 6978 | 190 | 14573 | 7.31 | 17.26 | 3.02 | 0.54 | 0.30 | 22.97 | 6.82 | 0.40 | 8.60
13 | OSAH(atc) | 3297 | 3298 | 78 | 6229 | 6.57 | 17.89 | 3.10 | 0.55 | 0.18 | 22.73 | 6.82 | 0.37 | 8.44
14 | OSAH2(atc) | 3265 | 3266 | 159 | 6267 | 6.17 | 21.09 | 3.71 | 1.34 | 0.24 | 23.98 | 6.82 | 0.32 | 9.28
15 | OSAH+LC(atc) | 3842 | 3843 | 78 | 7629 | 6.50 | 18.10 | 3.12 | 0.55 | 0.22 | 23.04 | 6.82 | 0.36 | 8.64
16 | OSAH+TPC(atc) | 2412 | 2413 | 81 | 5095 | 6.87 | 17.56 | 3.07 | 0.56 | 0.17 | 23.04 | 6.82 | 0.38 | 8.64
17 | OSAH+TPC+LC(atc) | 3912 | 3913 | 86 | 8016 | 6.59 | 18.06 | 3.12 | 0.56 | 0.28 | 21.80 | 6.82 | 0.37 | 7.81
18 | OSAH+LC(16,1) | 5626 | 5627 | 167 | 9338 | 6.10 | 18.54 | 3.18 | 0.56 | 0.30 | 21.49 | 6.82 | 0.34 | 7.60
19 | OSAH+TPC(16,1) | 5620 | 5621 | 175 | 9334 | 6.12 | 18.58 | 3.20 | 0.56 | 0.29 | 21.78 | 6.82 | 0.34 | 7.80
20 | OSAH+TPC+LC(16,1) | 5622 | 5623 | 177 | 9334 | 6.12 | 18.58 | 3.20 | 0.56 | 0.30 | 22.70 | 6.82 | 0.34 | 8.42
21 | OSAH+PR(atc) | 3297 | 3298 | 78 | 5691 | 6.40 | 17.90 | 3.11 | 0.55 | 0.55 | 22.59 | 6.82 | 0.36 | 8.34
22 | OSAH+SC(atc) | 3302 | 3303 | 81 | 6182 | 6.57 | 17.91 | 3.10 | 0.55 | 0.29 | 23.09 | 6.82 | 0.37 | 8.68
23 | OSAH+GCM(atc) | 4934 | 4935 | 109 | 8655 | 6.85 | 19.61 | 3.48 | 0.43 | 15.45 | 23.41 | 6.82 | 0.36 | 8.89
24 | OSAH+GCM2(atc) | 8020 | 8021 | 240 | 14595 | 7.81 | 22.92 | 3.79 | 0.50 | 40.60 | 24.54 | 6.82 | 0.35 | 9.65
25 | OSAH+GCM3(atc) | 3491 | 3492 | 228 | 6697 | 9.48 | 26.39 | 4.37 | 0.95 | 11.54 | 26.00 | 6.82 | 0.36 | 10.63
26 | OSAH+PAR(atc) | 3297 | 3298 | 78 | 6229 | 2.34 | 11.60 | 2.05 | 0.21 | 0.21 | 5.86 | 8.02 | 0.70 | 3.25
27 | PARSAH+PAR(atc) | 3889 | 3890 | 136 | 6925 | 2.24 | 11.86 | 2.09 | 0.27 | 0.26 | 5.70 | 8.02 | 0.66 | 2.94
28 | OSAH+PER(atc) | 3297 | 3298 | 78 | 6229 | 2.66 | 13.47 | 2.17 | 0.25 | 0.21 | 5.61 | 8.17 | 0.65 | 3.52
29 | PERSAH+PER(atc) | 4046 | 4047 | 170 | 7290 | 2.49 | 13.41 | 2.31 | 0.46 | 21.93 | 5.40 | 8.17 | 0.60 | 3.08
30 | SPHSAH+PER(atc) | 4264 | 4265 | 634 | 9174 | 4.51 | 18.20 | 3.02 | 0.79 | 1.04 | 5.87 | 8.17 | 0.59 | 4.06
31 | OSAH+SPH(atc) | 3297 | 3298 | 78 | 6229 | 2.66 | 13.30 | 2.14 | 0.24 | 0.21 | 5.79 | 8.02 | 0.64 | 3.12
32 | SPHSAH+SPH(atc) | 4264 | 4265 | 634 | 9174 | 4.50 | 18.13 | 3.02 | 0.81 | 1.04 | 6.21 | 8.02 | 0.61 | 3.92
33 | OSAH+$TA_{seq}$(16,2) | 3295 | 3296 | 40 | 7546 | 10.58 | 31.36 | 3.01 | 0.53 | 0.15 | 26.30 | 6.82 | 0.31 | 10.83
34 | OSAH+$TA^A_{rec}$(16,2) | 3295 | 3296 | 40 | 7546 | 7.55 | 17.00 | 2.98 | 0.53 | 0.15 | 23.41 | 6.82 | 0.37 | 8.89
35 | OSAH+$TA^B_{rec}$(16,2) | 3295 | 3296 | 40 | 7546 | 7.55 | 17.00 | 2.98 | 0.53 | 0.16 | 21.93 | 6.82 | 0.42 | 7.90
36 | OSAH+$TA_{SNL}$(16,2) | 3295 | 3296 | 40 | 7546 | 10.58 | 15.41 | 3.01 | 0.53 | 0.19 | 21.77 | 6.82 | 0.43 | 7.79
37 | OSAH+$TA_{NLT}$(16,2) | 14579 | 3296 | 40 | 7546 | 10.58 | 14.60 | 3.01 | 0.53 | 0.21 | 23.19 | 6.82 | 0.38 | 8.74
38 | OSAH+$TA_{seq}$(18,2) | 4799 | 4800 | 98 | 10190 | 10.26 | 31.99 | 3.03 | 0.53 | 0.20 | 26.54 | 6.82 | 0.29 | 10.99
39 | OSAH+$TA^A_{rec}$(18,2) | 4799 | 4800 | 98 | 10190 | 7.33 | 17.20 | 3.01 | 0.53 | 0.16 | 23.55 | 6.82 | 0.36 | 8.99
40 | OSAH+$TA^B_{rec}$(18,2) | 4799 | 4800 | 98 | 10190 | 7.33 | 17.20 | 3.01 | 0.53 | 0.19 | 21.87 | 6.82 | 0.41 | 7.86
41 | OSAH+$TA_{SNL}$(18,2) | 4799 | 4800 | 98 | 10190 | 10.26 | 15.58 | 3.04 | 0.53 | 0.25 | 21.71 | 6.82 | 0.42 | 7.75
42 | OSAH+$TA_{NLT}$(18,2) | 20364 | 4800 | 98 | 10190 | 10.26 | 14.75 | 3.04 | 0.53 | 0.30 | 22.15 | 6.82 | 0.40 | 8.05
43 | OSAH+$TA_{seq}$(atc) | 3297 | 3298 | 78 | 6229 | 9.24 | 33.53 | 3.13 | 0.55 | 0.14 | 26.44 | 6.82 | 0.26 | 10.93
44 | OSAH+$TA^A_{rec}$(atc) | 3297 | 3298 | 78 | 6229 | 6.57 | 17.89 | 3.10 | 0.55 | 0.16 | 23.36 | 6.82 | 0.32 | 8.86
45 | OSAH+$TA^B_{rec}$(atc) | 3297 | 3298 | 78 | 6229 | 6.57 | 17.89 | 3.10 | 0.55 | 0.15 | 21.54 | 6.82 | 0.37 | 7.64
46 | OSAH+$TA_{SNL}$(atc) | 3297 | 3298 | 78 | 6229 | 9.24 | 16.20 | 3.13 | 0.55 | 0.20 | 21.62 | 6.82 | 0.37 | 7.69
47 | OSAH+$TA_{NLT}$(atc) | 14410 | 3298 | 78 | 6229 | 9.24 | 15.09 | 3.13 | 0.55 | 0.22 | 21.80 | 6.82 | 0.36 | 7.81
48 | BVH | 143 | 762 | 0 | 1169 | 37.59 | 35.70 | 27.66 | 0.00 | 0.07 | 444.67 | – | – | –
49 | O84 | 1185 | 8296 | 2820 | 17960 | 11.36 | 27.93 | 5.65 | 2.81 | 0.09 | 48.97 | – | – | –
50 | O89 | 1185 | 8296 | 2820 | 17960 | 10.64 | 21.48 | 5.49 | 2.81 | 0.07 | 41.43 | – | – | –
51 | BSP | 2247 | 2248 | 156 | 9196 | 13.79 | 23.38 | 3.95 | 1.20 | 1.15 | 121.18 | – | – | –
52 | O93 | 1357 | 9500 | 3508 | 19636 | 9.74 | 20.46 | 7.95 | 5.55 | 0.09 | 49.88 | – | – | –
53 | UG | 0 | 5887 | 4371 | 4121 | 18.95 | 8.42 | 8.42 | 5.56 | 0.04 | 40.22 | – | – | –
54 | AG | 116 | 5122 | 0 | 10112 | 9.41 | 4.41 | 3.40 | 0.00 | 0.23 | 45.39 | – | – | –
55 | HUG | 15 | 3480 | 0 | 8100 | 36.82 | 4.95 | 1.69 | 0.00 | 0.08 | 73.80 | – | – | –
56 | RG | 347 | 6374 | 2464 | 18917 | 20.30 | 8.36 | 6.42 | 3.18 | 0.06 | 50.71 | – | – | –
57 | O84A | 4349 | 30444 | 7658 | 53542 | 8.60 | 30.98 | 6.23 | 3.33 | 0.57 | 49.40 | – | – | –
58 | KD | 3295 | 3296 | 40 | 7546 | 7.55 | 17.00 | 2.98 | 0.53 | 0.60 | 38.96 | – | – | –

Table 2: Experimental results for scene “gears2”.

$N = 1169$. TPD: $N_{prim} = 263169$, $N_{hit} = 263169$, $N^{hit}_{prim} = 243148$, $N_{sec} = 270124$, $N^{hit}_{sec} = 191717$, $N_{shad} = 1565640$, $N^{hit}_{shad} = 369157$, $T_R^{MIN}$ [s] = 11.65, $T_{app}$ [s] = 10.16, $T_{RSA}^{MIN}$ [s] = 1.49.

Appendix E

Scene = “jacks3”

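Minimum Testing Output: ∆, Σ, Θ

Columns: Line, Mnemonic Notation, $N_G$, $N_E$, $N_{EE}$, $N_{ER}$, $r_{ITM}$, $\tilde{N}_{TS}$, $\tilde{N}_{ETS}$, $\tilde{N}_{EETS}$, $T_B$, $T_R$, $\Theta_{APP}$, $\Theta_{rat}$, $\Theta_{RUN}$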

0 9091 5032 4879 2678 2678 2762 209 196 5499 2678 10584 4611 1687 2636 2658 1054 2347 5499 5506 5506 1687 1990 2591 5517 3810

0 9092 5033 4880 2679 2679 2763 210 197 5500 2679 10585 4612 1688 2637 2659 1055 2348 5500 5507 5507 1688 1991 2592 5518 3811

1 256 300 176 258 258 243 49 36 1108 258 1184 258 630 442 631 358 360 1108 1106 1106 630 1161 715 195 450

657 21468 9540 9517 4614 4614 4814 1270 1270 6546 4614 13622 8617 2330 3706 4607 1937 4913 6546 6561 6561 1656 1512 3393 10223 5965

– 34.87 37.03 39.74 22.40 22.40 22.34 56.62 57.65 16.53 22.40 16.39 22.78 22.07 20.74 22.63 26.42 26.59 16.53 16.54 16.54 17.82 9.78 17.95 24.69 23.66

0 32.72 35.27 34.32 26.50 26.50 25.46 14.92 14.55 32.57 26.50 35.87 28.08 24.37 28.02 27.12 21.64 25.89 32.57 32.57 32.57 24.45 25.85 28.13 33.83 32.18

0 6.31 6.95 6.81 5.08 5.08 4.92 3.12 3.04 6.09 5.08 6.59 5.33 4.71 5.33 5.21 4.21 4.98 6.09 6.08 6.08 4.73 5.00 5.43 6.24 5.88

0 0.84 0.76 0.23 1.27 1.27 1.08 1.01 0.76 2.55 1.27 2.57 1.27 2.24 1.82 2.24 1.87 1.87 2.55 2.54 2.54 2.24 3.26 2.43 1.38 1.63

0.00 0.08 0.13 0.20 0.14 0.14 0.09 0.05 0.05 0.20 0.14 0.33 0.20 0.11 0.18 0.15 0.10 0.18 0.24 0.24 0.26 0.18 0.18 7.04 15.02 10.89

155.50 9.86 9.64 10.18 7.32 7.27 7.03 8.96 9.03 7.11 7.32 7.42 7.52 6.78 7.02 7.25 6.86 7.31 6.81 6.84 6.93 6.09 5.27 6.75 8.17 7.68

7.65 7.65 7.65 7.65 7.65 7.65 7.65 7.65 7.65 7.65 7.65 7.65 7.65 7.65 7.65 7.65 7.65 7.65 7.65 7.65 7.65 7.65 7.65 7.65 7.65 7.65

1.0 0.49 0.49 0.51 0.44 0.44 0.44 0.78 0.78 0.32 0.44 0.29 0.43 0.45 0.40 0.43 0.53 0.48 0.32 0.32 0.32 0.40 0.26 0.37 0.40 0.40

676.09 35.22 34.26 36.61 24.17 23.96 22.91 31.30 31.61 23.26 24.17 24.61 25.04 21.83 22.87 23.87 22.17 24.13 21.96 22.09 22.48 18.83 15.26 21.70 27.87 25.74

OSAH+PAR(atc) PARSAH+PAR(atc) OSAH+PER(atc) PERSAH+PER(atc) SPHSAH+PER(atc) OSAH+SPH(atc) SPHSAH+SPH(atc)

1687 3806 1687 1765 3251 1687 3251

1688 3807 1688 1766 3252 1688 3252

630 274 630 580 802 630 802

2330 11374 2330 3129 8140 2330 8140

12.32 4.90 13.23 8.99 22.89 13.22 22.80

15.37 5.91 18.32 11.66 16.82 18.25 16.73

3.08 0.51 3.64 2.16 3.32 3.63 3.31

1.89 0.15 2.29 1.34 1.70 2.28 1.69

0.12 0.34 0.12 30.37 1.56 0.12 1.58

1.90 1.35 2.40 2.02 2.83 2.75 3.12

7.00 7.00 7.31 7.31 7.31 7.86 7.86

0.37 0.55 0.37 0.46 0.56 0.45 0.59

12.00 6.50 11.15 8.23 14.46 11.79 14.43

33 34 35 36 37 38 39 40 41 42 43 44 45 46 47

OSAH+TAseq (16,2) OSAH+TAA rec (16,2) OSAH+TAB rec (16,2) OSAH+TASNL (16,2) OSAH+TANLT (16,2) OSAH+TAseq (18,2) OSAH+TAA rec (18,2) OSAH+TAB rec (18,2) OSAH+TASNL (18,2) OSAH+TANLT (18,2) OSAH+TAseq (atc) OSAH+TAA rec (atc) OSAH+TAB rec (atc) OSAH+TASNL (atc) OSAH+TANLT (atc)

2678 2678 2678 2678 18243 3436 3436 3436 3436 22027 1687 1687 1687 1687 12617

2679 2679 2679 2679 2679 3437 3437 3437 3437 3437 1688 1688 1688 1688 1688

258 258 258 258 258 258 258 258 258 258 630 630 630 630 630

4614 4614 4614 4614 4614 6114 6114 6114 6114 6114 2330 2330 2330 2330 2330

22.88 22.40 22.40 22.89 22.89 23.19 22.66 22.66 23.19 23.19 22.59 22.07 22.07 22.59 22.59

64.46 26.50 26.50 22.08 18.53 68.50 27.44 27.44 22.86 19.13 55.46 24.37 24.37 20.56 17.43

5.15 5.08 5.08 5.15 5.15 5.32 5.24 5.24 5.32 5.32 4.76 4.71 4.71 4.76 4.76

1.27 1.27 1.27 1.27 1.27 1.27 1.27 1.27 1.27 1.27 2.24 2.24 2.24 2.24 2.24

0.12 0.10 0.12 0.16 0.16 0.14 0.13 0.14 0.19 0.21 0.10 0.09 0.10 0.12 0.14

9.42 7.55 6.78 7.74 7.75 9.72 7.84 6.91 7.86 7.80 8.59 7.11 6.41 7.13 7.24

7.65 7.65 7.65 7.65 7.65 7.65 7.65 7.65 7.65 7.65 7.65 7.65 7.65 7.65 7.65

0.29 0.38 0.44 0.37 0.37 0.28 0.36 0.43 0.36 0.37 0.31 0.39 0.45 0.39 0.38

33.30 25.17 21.83 26.00 26.04 34.61 26.43 22.39 26.52 26.26 29.70 23.26 20.22 23.35 23.83

48 49 50 51 52 53 54 55 56 57 58

BVH O84 O89 BSP O93 UG AG HUG RG O84A KD

73 4143 4143 9091 4143 0 36 75 147 3929 2678

383 29002 29002 9092 29002 3120 1929 1866 2130 27504 2679

0 2046 2046 256 2046 1724 610 988 336 1226 258

657 56486 56486 21468 56486 4698 3926 2652 6782 58938 4614

63.43 36.77 36.74 34.87 35.94 36.58 48.04 51.83 48.27 31.80 22.40

25.04 48.72 33.47 32.72 32.41 7.05 11.06 13.29 7.41 42.49 26.50

17.77 8.48 8.48 6.31 14.45 7.05 8.95 8.99 5.44 7.72 5.08

0.00 1.96 1.96 0.84 8.04 3.58 2.00 3.80 1.64 2.00 1.27

0.05 0.18 0.14 0.31 0.17 0.03 0.12 0.04 0.03 0.41 0.43

79.72 18.58 15.49 15.94 18.15 10.46 21.12 19.85 14.01 16.56 10.63

– – – – – – – – – – –

– – – – – – – – – – –

– – – – – – – – – – –

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25

naïve RSA spatmed-xyz(16,2) objmed-xyz(16,2) objmed(16,2) OSAH(16,2) OSAH-RMI(16,2) OSAH-xyz(16,2) OSAH(8,1) OSAH(8,2) OSAH(16,1) OSAH(16,2) OSAH(24,1) OSAH(24,2) OSAH(atc) OSAH2(atc) OSAH+LC(atc) OSAH+TPC(atc) OSAH+TPC+LC(atc) OSAH+LC(16,1) OSAH+TPC(16,1) OSAH+TPC+LC(16,1) OSAH+PR(atc) OSAH+SC(atc) OSAH+GCM(atc) OSAH+GCM2(atc) OSAH+GCM3(atc)

26 27 28 29 30 31 32

Table 3: Experimental results for scene “jacks3”.

$N = 657$, TPD: $N_{prim} = 263169$, $N_{hit} = 191115$, $N_{prim}^{hit} = 72548$, $N_{sec} = 119316$, $N_{sec}^{hit} = 49502$, $N_{shad} = 97328$, $N_{shad}^{hit} = 33063$, $T_R^{MIN}$ [s] $= 1.99$, $T_{app}$ [s] $= 1.76$, $T_{RSA}^{MIN}$ [s] $= 0.23$.
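The tabulated ΘAPP and ΘRUN values appear to follow from the summary line by simple arithmetic. The sketch below is an inference from the numbers in Table 3, not a definition quoted from the thesis: it assumes ΘAPP = Tapp / TRSA and ΘRUN = (TR − Tapp) / TRSA, with Tapp and TRSA taken from the summary line and TR from the table row. The function and constant names are hypothetical.

```python
def theta_app(t_app: float, t_rsa: float) -> float:
    """Application time in units of the summary-line ray shooting time (assumed relation)."""
    return t_app / t_rsa


def theta_run(t_r: float, t_app: float, t_rsa: float) -> float:
    """Run time minus application time, in units of the summary-line ray shooting time (assumed relation)."""
    return (t_r - t_app) / t_rsa


# Values from the summary line of Table 3 (scene "jacks3"), in seconds.
T_APP = 1.76
T_RSA = 0.23

# T_R values from Table 3 for line 4 (OSAH(16,2)) and line 13 (OSAH(atc)), in seconds.
T_R_LINE_4 = 7.32
T_R_LINE_13 = 6.78

if __name__ == "__main__":
    print(round(theta_app(T_APP, T_RSA), 2))                # 7.65, the tabulated ΘAPP
    print(round(theta_run(T_R_LINE_4, T_APP, T_RSA), 2))    # 24.17, the tabulated ΘRUN of line 4
    print(round(theta_run(T_R_LINE_13, T_APP, T_RSA), 2))   # 21.83, the tabulated ΘRUN of line 13
```

Line 0 (naïve RSA) does not fit this pattern: its tabulated ΘRUN of 676.09 matches TR / TRSA = 155.50 / 0.23 instead, so the sketch should only be read against the other lines.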

Scene = “lattice6”

Minimum Testing Output: ∆, Σ, Θ

Columns: Line, Mnemonic Notation, $N_G$, $N_E$, $N_{EE}$, $N_{ER}$, $r_{ITM}$, $\tilde{N}_{TS}$, $T_B$, $T_R$, $\Theta_{APP}$, $\Theta_{rat}$, $\Theta_{RUN}$

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25

naïve RSA spatmed-xyz(16,2) objmed-xyz(16,2) objmed(16,2) OSAH(16,2) OSAH-RMI(16,2) OSAH-xyz(16,2) OSAH(8,1) OSAH(8,2) OSAH(16,1) OSAH(16,2) OSAH(24,1) OSAH(24,2) OSAH(atc) OSAH2(atc) OSAH+LC(atc) OSAH+TPC(atc) OSAH+TPC+LC(atc) OSAH+LC(16,1) OSAH+TPC(16,1) OSAH+TPC+LC(16,1) OSAH+PR(atc) OSAH+SC(atc) OSAH+GCM(atc) OSAH+GCM2(atc) OSAH+GCM3(atc)

0 9791 2680 1944 2171 2171 2037 255 255 8419 2171 8944 2171 6896 7239 7183 4886 5466 8619 8615 8615 6896 7271 7518 8425 6858

0 9792 2681 1945 2172 2172 2038 256 256 8420 2172 8945 2172 6897 7240 7184 4887 5467 8620 8616 8616 6897 7272 7519 8426 6859

1 0 388 0 200 200 95 1 1 682 200 682 200 682 577 682 675 669 685 683 683 682 688 737 399 441

1225 22056 4494 3818 3736 3736 3743 2019 2019 9502 3736 10027 3736 7979 8427 8266 5976 6570 9699 9697 9697 7979 8348 8546 9792 8182

– 16.02 11.21 13.91 10.37 10.37 10.98 37.03 37.03 4.96 10.37 4.94 10.37 5.18 5.57 5.11 6.03 5.61 4.95 4.90 4.90 5.18 5.05 4.61 5.88 6.80

0 35.05 29.70 28.46 26.82 26.82 25.92 15.76 15.76 35.70 26.82 35.73 26.82 35.41 36.04 35.49 34.07 34.34 35.72 35.83 35.83 35.41 35.52 36.30 36.65 36.50

0 6.21 5.32 5.26 4.88 4.88 4.69 3.28 3.28 6.11 4.88 6.11 4.88 6.10 6.19 6.10 6.01 5.95 6.07 6.05 6.05 6.10 6.11 6.07 6.34 5.97

0 0.00 0.88 0.00 0.72 0.72 0.36 0.01 0.01 2.54 0.72 2.54 0.72 2.54 2.29 2.54 2.53 2.49 2.51 2.51 2.51 2.54 2.60 2.93 2.11 1.42

0.01 0.12 0.11 0.13 0.14 0.14 0.09 0.08 0.08 0.26 0.14 0.28 0.14 0.24 0.33 0.28 0.21 0.29 0.35 0.31 0.37 0.42 0.62 19.42 22.15 17.83

1313.17 32.80 24.92 27.86 25.46 25.04 25.06 46.11 46.14 23.35 25.46 23.20 24.84 23.62 24.20 23.58 24.08 22.50 22.17 22.08 22.15 23.63 23.76 22.81 23.39 23.88

8.15 8.15 8.15 8.15 8.15 8.15 8.15 8.15 8.15 8.15 8.15 8.15 8.15 8.15 8.15 8.15 8.15 8.15 8.15 8.15 8.15 8.15 8.15 8.15 8.15 8.15

1.0 0.60 0.56 0.62 0.56 0.56 0.58 0.89 0.89 0.32 0.56 0.31 0.56 0.33 0.34 0.32 0.37 0.35 0.31 0.31 0.31 0.33 0.32 0.30 0.35 0.38

1193.79 21.67 14.51 17.18 15.00 14.62 14.64 33.77 33.80 13.08 15.00 12.95 14.44 13.33 13.85 13.29 13.75 12.31 12.01 11.93 11.99 13.34 13.45 12.59 13.12 13.56

26 27 28 29 30 31 32

OSAH+PAR(atc) PARSAH+PAR(atc) OSAH+PER(atc) PERSAH+PER(atc) SPHSAH+PER(atc) OSAH+SPH(atc) SPHSAH+SPH(atc)

6896 3919 6896 2011 3989 6896 3989

6897 3920 6897 2012 3990 6897 3990

682 1425 682 390 951 682 951

7979 4255 7979 3329 4863 7979 4863

11.00 11.36 5.05 64.50 14.82 5.04 14.47

45.40 26.51 47.88 17.98 36.42 47.70 36.30

8.65 6.16 9.35 3.48 6.79 9.30 6.75

4.69 3.06 4.75 0.50 2.52 4.72 2.52

0.28 0.24 0.28 9.42 0.71 0.28 0.73

4.09 3.68 6.73 15.54 7.82 7.18 8.22

8.11 8.11 8.18 8.18 8.18 8.30 8.30

0.35 0.55 0.42 0.94 0.68 0.47 0.69

13.42 11.26 6.78 26.36 9.20 6.98 9.19

33 34 35 36 37 38 39 40 41 42 43 44 45 46 47

OSAH+TAseq (16,2) OSAH+TAA rec (16,2) OSAH+TAB rec (16,2) OSAH+TASNL (16,2) OSAH+TANLT (16,2) OSAH+TAseq (18,2) OSAH+TAA rec (18,2) OSAH+TAB rec (18,2) OSAH+TASNL (18,2) OSAH+TANLT (18,2) OSAH+TAseq (atc) OSAH+TAA rec (atc) OSAH+TAB rec (atc) OSAH+TASNL (atc) OSAH+TANLT (atc)

2164 2164 2164 2164 7855 2164 2164 2164 2164 7855 6968 6968 6968 6968 29056

2165 2165 2165 2165 2165 2165 2165 2165 2165 2165 6969 6969 6969 6969 6969

200 200 200 200 200 200 200 200 200 200 685 685 685 685 685

3729 3729 3729 3729 3729 3729 3729 3729 3729 3729 8048 8048 8048 8048 8048

10.39 10.22 10.22 10.39 10.39 10.39 10.22 10.22 10.39 10.39 5.24 5.12 5.12 5.24 5.24

60.08 26.84 26.84 17.56 16.99 60.08 26.84 26.84 17.56 16.99 84.03 35.46 35.46 22.71 21.25

4.91 4.83 4.83 4.91 4.91 4.91 4.83 4.83 4.91 4.91 6.13 6.06 6.06 6.13 6.13

0.75 0.74 0.74 0.75 0.75 0.75 0.74 0.74 0.75 0.75 2.51 2.51 2.51 2.51 2.51

0.12 0.12 0.11 0.13 0.16 0.12 0.11 0.11 0.15 0.16 0.21 0.20 0.20 0.31 0.32

30.54 25.88 23.57 24.46 24.87 30.65 25.86 23.53 24.47 24.57 31.54 25.35 22.17 21.99 22.01

8.15 8.15 8.15 8.15 8.15 8.15 8.15 8.15 8.15 8.15 8.15 8.15 8.15 8.15 8.15

0.38 0.48 0.56 0.53 0.51 0.38 0.48 0.56 0.53 0.52 0.19 0.26 0.32 0.32 0.32

19.62 15.38 13.28 14.09 14.46 19.72 15.36 13.25 14.10 14.19 20.53 14.90 12.01 11.85 11.86

48 49 50 51 52 53 54 55 56 57 58

BVH O84 O89 BSP O93 UG AG HUG RG O84A KD

102 4177 4177 9791 4177 0 514 1 9 1769 2171

759 29240 29240 9792 29240 6859 1279 2197 1298 12384 2172

0 0 0 0 0 0 0 0 0 584 200

1225 45296 45296 22056 45296 9799 3366 4843 7064 17176 3736

45.03 15.96 15.84 16.02 15.30 14.19 26.92 19.07 35.28 10.46 10.37

43.87 41.30 30.17 35.05 31.02 7.82 26.75 6.93 5.54 35.86 26.82

34.39 7.67 7.64 6.21 13.98 7.82 21.82 5.93 4.50 6.93 4.88

0.00 0.00 0.00 0.00 6.62 0.00 0.00 0.00 0.00 1.60 0.72

0.13 0.16 0.14 1.42 0.16 0.08 0.16 0.05 0.02 0.25 1.14

471.90 53.84 46.73 83.74 56.70 33.24 270.32 48.26 62.30 45.47 42.97

– – – – – – – – – – –

– – – – – – – – – – –

– – – – – – – – – – –

$\tilde{N}_{ETS}$

$\tilde{N}_{EETS}$

Table 4: Experimental results for scene “lattice6”.

$N = 1225$, TPD: $N_{prim} = 263169$, $N_{hit} = 263169$, $N_{prim}^{hit} = 250614$, $N_{sec} = 214756$, $N_{sec}^{hit} = 134665$, $N_{shad} = 1124636$, $N_{shad}^{hit} = 788080$, $T_R^{MIN}$ [s] $= 10.06$, $T_{app}$ [s] $= 8.96$, $T_{RSA}^{MIN}$ [s] $= 1.10$.

Scene = “mount4”

Minimum Testing Output: ∆, Σ, Θ

Columns: Line, Mnemonic Notation, $N_G$, $N_E$, $N_{EE}$, $N_{ER}$, $r_{ITM}$, $\tilde{N}_{TS}$, $T_B$, $T_R$, $\Theta_{APP}$, $\Theta_{rat}$, $\Theta_{RUN}$

0 8348 797 318 424 424 482 146 143 1068 424 1082 433 842 1019 848 764 782 1068 1067 1067 842 880 878 1325 991

0 8349 798 319 425 425 483 147 144 1069 425 1083 434 843 1020 849 765 783 1069 1068 1068 843 881 879 1326 992

1 922 117 8 88 88 95 37 34 478 88 478 88 422 404 422 373 374 478 477 477 422 438 425 448 371

516 22667 1317 609 633 633 726 606 606 887 633 912 653 707 958 722 678 704 887 887 887 706 730 731 1395 994

– 15.60 16.24 19.16 6.89 6.89 7.27 12.31 12.31 5.80 6.89 5.37 6.66 6.65 6.87 6.31 8.21 9.13 5.80 5.79 5.79 6.65 6.48 6.53 6.39 7.37

0 27.03 27.81 31.24 17.25 17.25 17.47 12.39 12.39 18.95 17.25 19.60 17.58 14.97 17.21 16.72 13.03 16.70 18.94 18.91 18.91 14.97 14.98 15.46 20.64 18.87

0 5.13 4.88 5.86 3.17 3.17 3.27 2.45 2.45 3.39 3.17 3.46 3.21 2.88 3.24 3.12 2.53 3.28 3.39 3.38 3.38 2.88 2.90 2.97 3.81 3.46

0 1.01 0.50 0.18 0.81 0.81 0.73 0.60 0.60 1.14 0.81 1.14 0.81 1.10 0.87 1.10 1.06 1.06 1.14 1.13 1.13 1.10 1.11 1.25 0.97 0.97

0.00 0.10 0.03 0.03 0.04 0.03 0.02 0.03 0.02 0.05 0.04 0.04 0.03 0.04 0.06 0.04 0.04 0.05 0.05 0.05 0.05 0.05 0.05 2.37 3.81 2.86

253.97 16.39 15.80 17.58 12.64 12.14 12.04 12.69 12.66 12.42 12.64 12.44 12.36 11.53 12.21 11.99 11.26 11.70 11.33 11.24 11.24 11.51 11.66 11.40 12.40 12.09

15.85 15.85 15.85 15.85 15.85 15.85 15.85 15.85 15.85 15.85 15.85 15.85 15.85 15.85 15.85 15.85 15.85 15.85 15.85 15.85 15.85 15.85 15.85 15.85 15.85 15.85

1.0 0.38 0.38 0.39 0.29 0.29 0.30 0.51 0.51 0.24 0.29 0.22 0.28 0.32 0.29 0.28 0.40 0.36 0.24 0.24 0.24 0.32 0.31 0.31 0.24 0.29

746.97 32.35 30.62 35.85 21.32 19.85 19.56 21.47 21.38 20.68 21.32 20.74 20.50 18.06 20.06 19.41 17.26 18.56 17.47 17.21 17.21 18.00 18.44 17.68 20.62 19.71

OSAH+PAR(atc) PARSAH+PAR(atc) OSAH+PER(atc) PERSAH+PER(atc) SPHSAH+PER(atc) OSAH+SPH(atc) SPHSAH+SPH(atc)

842 824 842 819 842 842 842

843 825 843 820 843 843 843

422 416 422 381 422 422 422

707 688 707 714 707 707 707

5.07 5.05 5.21 6.41 5.21 5.20 5.20

20.67 20.87 22.38 19.74 22.38 22.02 22.02

4.31 4.35 4.77 4.48 4.77 4.69 4.69

2.91 2.96 2.67 2.49 2.67 2.64 2.64

0.05 0.05 0.05 3.22 0.06 0.04 0.09

1.97 1.98 2.68 2.60 2.51 2.81 2.95

13.50 13.50 14.25 14.25 14.25 14.50 14.50

0.21 0.21 0.35 0.40 0.27 0.28 0.35

19.33 19.50 19.25 18.25 17.12 13.60 15.00

33 34 35 36 37 38 39 40 41 42 43 44 45 46 47

OSAH+TAseq (16,2) OSAH+TAA rec (16,2) OSAH+TAB rec (16,2) OSAH+TASNL (16,2) OSAH+TANLT (16,2) OSAH+TAseq (18,2) OSAH+TAA rec (18,2) OSAH+TAB rec (18,2) OSAH+TASNL (18,2) OSAH+TANLT (18,2) OSAH+TAseq (atc) OSAH+TAA rec (atc) OSAH+TAB rec (atc) OSAH+TASNL (atc) OSAH+TANLT (atc)

424 424 424 424 1829 431 431 431 431 1856 842 842 842 842 3473

425 425 425 425 425 432 432 432 432 432 843 843 843 843 843

88 88 88 88 88 88 88 88 88 88 422 422 422 422 422

633 633 633 633 633 648 648 648 648 648 707 707 707 707 707

7.37 6.89 6.89 7.38 7.38 7.16 6.69 6.68 7.17 7.17 7.12 6.65 6.65 7.12 7.12

29.98 17.25 17.25 11.88 10.89 30.86 17.56 17.56 12.06 11.04 25.26 14.97 14.97 10.42 9.66

3.17 3.17 3.17 3.17 3.17 3.21 3.21 3.21 3.21 3.21 2.88 2.88 2.88 2.88 2.88

0.81 0.82 0.81 0.81 0.81 0.81 0.82 0.81 0.81 0.81 1.10 1.10 1.10 1.10 1.10

0.02 0.02 0.02 0.03 0.03 0.03 0.03 0.03 0.03 0.04 0.03 0.03 0.03 0.04 0.05

13.79 12.10 11.25 10.26 10.54 13.95 12.15 11.28 10.27 10.32 12.77 11.20 10.61 9.75 9.80

15.85 15.85 15.85 15.85 15.85 15.85 15.85 15.85 15.85 15.85 15.85 15.85 15.85 15.85 15.85

0.20 0.25 0.29 0.35 0.33 0.19 0.24 0.28 0.34 0.33 0.23 0.29 0.32 0.38 0.38

24.71 19.74 17.24 14.32 15.15 25.18 19.88 17.32 14.35 14.50 21.71 17.09 15.35 12.82 12.97

48 49 50 51 52 53 54 55 56 57 58

BVH O84 O89 BSP O93 UG AG HUG RG O84A KD

42 4085 4085 8348 4085 0 160 1 137 5793 424

304 28596 28596 8349 28596 2548 4002 343 2443 40552 425

0 5069 5069 922 5069 1839 1292 172 564 3378 88

516 61738 61738 22667 61738 3629 9819 1853 9200 98764 633

32.28 15.84 15.87 15.60 15.34 16.61 22.01 28.48 18.80 11.94 6.89

21.59 33.55 24.05 27.03 23.73 6.27 9.71 4.43 5.83 30.41 17.25

16.22 6.28 6.27 5.13 10.13 6.27 8.06 3.43 4.37 5.81 3.17

0.00 1.62 1.62 1.01 5.63 3.09 3.46 0.88 1.22 1.94 0.81

0.02 0.18 0.15 0.21 0.15 0.02 0.14 0.01 0.03 0.64 0.09

185.02 31.01 27.19 36.70 32.65 20.28 36.49 27.35 23.44 27.98 18.88

– – – – – – – – – – –

– – – – – – – – – – –

– – – – – – – – – – –

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25

naïve RSA spatmed-xyz(16,2) objmed-xyz(16,2) objmed(16,2) OSAH(16,2) OSAH-RMI(16,2) OSAH-xyz(16,2) OSAH(8,1) OSAH(8,2) OSAH(16,1) OSAH(16,2) OSAH(24,1) OSAH(24,2) OSAH(atc) OSAH2(atc) OSAH+LC(atc) OSAH+TPC(atc) OSAH+TPC+LC(atc) OSAH+LC(16,1) OSAH+TPC(16,1) OSAH+TPC+LC(16,1) OSAH+PR(atc) OSAH+SC(atc) OSAH+GCM(atc) OSAH+GCM2(atc) OSAH+GCM3(atc)

26 27 28 29 30 31 32

$\tilde{N}_{ETS}$

$\tilde{N}_{EETS}$

Table 5: Experimental results for scene “mount4”.

$N = 516$, TPD: $N_{prim} = 263169$, $N_{hit} = 257871$, $N_{prim}^{hit} = 173250$, $N_{sec} = 707764$, $N_{sec}^{hit} = 472407$, $N_{shad} = 358042$, $N_{shad}^{hit} = 24257$, $T_R^{MIN}$ [s] $= 5.73$, $T_{app}$ [s] $= 5.39$, $T_{RSA}^{MIN}$ [s] $= 0.34$.

Scene = “rings3”

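Minimum Testing Output: ∆, Σ, Θ

Columns: Line, Mnemonic Notation, $N_G$, $N_E$, $N_{EE}$, $N_{ER}$, $r_{ITM}$, $\tilde{N}_{TS}$, $\tilde{N}_{ETS}$, $\tilde{N}_{EETS}$, $T_B$, $T_R$, $\Theta_{APP}$, $\Theta_{rat}$, $\Theta_{RUN}$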

0 5839 11578 14628 2974 2974 3257 85 85 4250 2974 26501 15943 629 1894 1956 380 1802 4270 4269 4269 629 901 1742 3983 2016

0 5840 11579 14629 2975 2975 3258 86 86 4251 2975 26502 15944 630 1895 1957 381 1803 4271 4270 4270 630 902 1743 3984 2017

1 627 1175 1873 359 359 335 23 23 829 359 1794 498 98 212 98 55 55 835 829 829 98 316 341 134 268

841 18169 25092 28296 6519 6519 7287 1118 1118 7301 6519 44050 34813 2322 4295 7058 1892 7244 7323 7330 7330 1993 1961 3996 11430 5218

– 29.09 64.63 78.89 14.00 14.00 15.41 61.97 61.97 13.16 14.00 15.66 15.82 21.08 16.30 22.36 25.39 27.73 13.16 13.22 13.22 19.30 13.84 15.16 20.14 22.30

0 44.65 92.97 109.04 26.98 26.98 29.75 14.47 14.47 30.33 26.98 38.95 31.98 21.31 30.66 25.99 19.64 25.43 30.34 30.32 30.32 21.37 23.30 25.61 33.45 41.50

0 8.86 18.86 24.31 5.12 5.12 5.95 3.23 3.23 5.84 5.12 7.38 5.99 4.26 6.13 5.05 3.97 5.00 5.84 5.84 5.84 4.28 4.66 5.19 6.45 7.20

0 3.58 5.30 6.78 1.60 1.60 2.04 1.25 1.25 2.10 1.60 2.21 1.63 1.53 2.41 1.53 1.45 1.45 2.11 2.09 2.09 1.53 2.29 1.90 1.82 3.28

0.00 0.09 0.29 0.47 0.17 0.18 0.12 0.05 0.05 0.19 0.17 0.81 0.55 0.10 0.17 0.16 0.10 0.18 0.24 0.25 0.25 0.17 0.18 5.75 13.60 7.09

1014.57 40.62 78.94 94.97 25.30 24.70 26.35 47.78 47.73 24.89 25.30 29.95 28.21 27.46 26.95 30.35 29.55 32.76 23.89 23.92 24.01 25.83 22.27 24.91 31.07 33.00

4.63 4.63 4.63 4.63 4.63 4.63 4.63 4.63 4.63 4.63 4.63 4.63 4.63 4.63 4.63 4.63 4.63 4.63 4.63 4.63 4.63 4.63 4.63 4.63 4.63 4.63

1.0 0.62 0.64 0.64 0.57 0.57 0.56 0.91 0.91 0.52 0.57 0.50 0.55 0.71 0.57 0.68 0.76 0.73 0.52 0.52 0.52 0.69 0.60 0.60 0.60 0.57

798.87 27.35 57.53 70.15 15.29 14.82 16.12 32.99 32.95 14.97 15.29 18.95 17.58 16.99 16.59 19.27 18.64 21.17 14.18 14.20 14.28 15.71 12.91 14.98 19.83 21.35

OSAH+PAR(atc) PARSAH+PAR(atc) OSAH+PER(atc) PERSAH+PER(atc) SPHSAH+PER(atc) OSAH+SPH(atc) SPHSAH+SPH(atc)

629 1874 629 653 1557 629 1557

630 1875 630 654 1558 630 1558

98 91 98 116 300 98 300

2322 11039 2322 2412 6282 2322 6282

4.37 2.63 8.52 14.91 17.01 8.56 17.18

5.55 3.14 20.01 15.97 26.16 19.79 26.11

1.12 0.44 4.18 3.77 5.74 4.13 5.75

0.49 0.01 1.72 1.56 2.57 1.70 2.59

0.12 0.28 0.11 20.88 1.11 0.10 1.11

2.06 1.86 6.28 7.38 9.06 6.40 9.20

3.52 3.52 3.58 3.58 3.58 3.85 3.85

0.73 0.80 0.77 0.86 0.82 0.77 0.82

2.73 2.12 5.52 7.12 9.55 5.56 9.68

33 34 35 36 37 38 39 40 41 42 43 44 45 46 47

OSAH+TAseq (16,2) OSAH+TAA rec (16,2) OSAH+TAB rec (16,2) OSAH+TASNL (16,2) OSAH+TANLT (16,2) OSAH+TAseq (18,2) OSAH+TAA rec (18,2) OSAH+TAB rec (18,2) OSAH+TASNL (18,2) OSAH+TANLT (18,2) OSAH+TAseq (atc) OSAH+TAA rec (atc) OSAH+TAB rec (atc) OSAH+TASNL (atc) OSAH+TANLT (atc)

2974 2974 2974 2974 18194 5344 5344 5344 5344 31614 630 630 630 630 3574

2975 2975 2975 2975 2975 5345 5345 5345 5345 5345 631 631 631 631 631

358 358 358 358 358 465 465 465 465 465 99 99 99 99 99

6527 6527 6527 6527 6527 11012 11012 11012 11012 11012 2327 2327 2327 2327 2327

14.95 14.00 14.00 14.95 14.95 15.12 14.04 14.04 15.12 15.12 22.10 20.99 20.99 22.10 22.10

65.01 26.95 26.95 23.89 21.28 72.82 28.69 28.69 25.49 22.56 45.68 21.35 21.35 18.91 17.21

5.29 5.11 5.11 5.29 5.29 5.61 5.40 5.40 5.61 5.61 4.38 4.27 4.27 4.38 4.38

1.60 1.60 1.60 1.60 1.60 1.63 1.63 1.63 1.63 1.63 1.53 1.53 1.53 1.53 1.53

0.15 0.14 0.14 0.20 0.22 0.22 0.20 0.20 0.30 0.29 0.08 0.09 0.09 0.07 0.10

31.35 25.72 23.73 26.70 26.93 33.37 26.68 24.39 27.61 27.70 32.02 27.66 26.20 29.77 29.69

4.63 4.63 4.63 4.63 4.63 4.63 4.63 4.63 4.63 4.63 4.63 4.63 4.63 4.63 4.63

0.40 0.51 0.57 0.49 0.48 0.37 0.49 0.55 0.47 0.47 0.55 0.66 0.71 0.60 0.61

20.06 15.62 14.06 16.39 16.57 21.65 16.38 14.57 17.11 17.18 20.58 17.15 16.00 18.81 18.75

48 49 50 51 52 53 54 55 56 57 58

BVH O84 O89 BSP O93 UG AG HUG RG O84A KD

67 2989 2989 5839 2989 0 11 23 222 3947 2974

506 20924 20924 5840 20924 4046 2523 1280 3911 27630 2975

0 2962 2962 627 2962 3140 112 0 1018 1770 359

841 47428 47428 18169 47428 4414 9162 6406 11803 73146 6519

52.44 29.44 29.46 29.09 29.20 42.37 27.88 353.31 36.69 24.33 14.00

30.88 67.92 45.57 44.65 42.62 12.63 7.00 6.02 12.81 49.30 26.98

22.98 11.97 11.97 8.86 17.95 12.63 5.21 2.73 9.87 9.33 5.12

0.00 5.43 5.43 3.58 11.44 8.71 0.12 0.00 5.25 3.23 1.60

0.06 0.14 0.13 0.93 0.12 0.04 0.13 0.05 0.04 0.45 0.45

282.66 76.77 65.08 119.09 75.17 59.47 61.55 209.44 64.49 63.52 40.39

– – – – – – – – – – –

– – – – – – – – – – –

– – – – – – – – – – –

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25

naïve RSA spatmed-xyz(16,2) objmed-xyz(16,2) objmed(16,2) OSAH(16,2) OSAH-RMI(16,2) OSAH-xyz(16,2) OSAH(8,1) OSAH(8,2) OSAH(16,1) OSAH(16,2) OSAH(24,1) OSAH(24,2) OSAH(atc) OSAH2(atc) OSAH+LC(atc) OSAH+TPC(atc) OSAH+TPC+LC(atc) OSAH+LC(16,1) OSAH+TPC(16,1) OSAH+TPC+LC(16,1) OSAH+PR(atc) OSAH+SC(atc) OSAH+GCM(atc) OSAH+GCM2(atc) OSAH+GCM3(atc)

26 27 28 29 30 31 32

Table 6: Experimental results for scene “rings3”.

$N = 841$, TPD: $N_{prim} = 263169$, $N_{hit} = 263169$, $N_{prim}^{hit} = 263168$, $N_{sec} = 177299$, $N_{sec}^{hit} = 90933$, $N_{shad} = 937972$, $N_{shad}^{hit} = 319722$, $T_R^{MIN}$ [s] $= 7.15$, $T_{app}$ [s] $= 5.88$, $T_{RSA}^{MIN}$ [s] $= 1.27$.

Scene = “sombrero1”

Minimum Testing Output: ∆, Σ, Θ

Columns: Line, Mnemonic Notation, $N_G$, $N_E$, $N_{EE}$, $N_{ER}$, $r_{ITM}$, $\tilde{N}_{TS}$, $T_B$, $T_R$, $\Theta_{APP}$, $\Theta_{rat}$, $\Theta_{RUN}$

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25

naïve RSA spatmed-xyz(16,2) objmed-xyz(16,2) objmed(16,2) OSAH(16,2) OSAH-RMI(16,2) OSAH-xyz(16,2) OSAH(8,1) OSAH(8,2) OSAH(16,1) OSAH(16,2) OSAH(24,1) OSAH(24,2) OSAH(atc) OSAH2(atc) OSAH+LC(atc) OSAH+TPC(atc) OSAH+TPC+LC(atc) OSAH+LC(16,1) OSAH+TPC(16,1) OSAH+TPC+LC(16,1) OSAH+PR(atc) OSAH+SC(atc) OSAH+GCM(atc) OSAH+GCM2(atc) OSAH+GCM3(atc)

0 9763 3948 977 1099 1099 1843 159 158 3702 1099 3722 1099 3509 3117 3509 3510 3510 3702 3717 3717 3509 3515 3544 4110 3165

0 9764 3949 978 1100 1100 1844 160 159 3703 1100 3723 1100 3510 3118 3510 3511 3511 3703 3718 3718 3510 3516 3545 4111 3166

1 1314 359 0 136 136 564 64 63 1827 136 1827 136 1826 1571 1826 1831 1831 1827 1832 1832 1826 1828 1757 1406 1372

1922 35107 10348 1953 1926 1926 2552 1926 1926 2820 1926 2840 1926 2628 2527 2628 2630 2630 2820 2835 2835 2617 2632 2734 4248 2977

– 35.23 75.13 26.80 6.66 6.66 7.96 58.57 58.69 6.04 6.66 6.05 6.66 5.89 6.73 5.89 5.98 5.98 6.04 6.14 6.14 5.87 5.88 6.10 7.89 7.55

0 27.40 31.89 20.35 12.76 12.76 16.36 9.17 9.14 15.64 12.76 15.65 12.76 15.48 16.89 15.48 15.88 15.88 15.64 16.07 16.07 15.48 15.47 18.75 18.64 19.32

0 5.06 5.78 4.11 2.32 2.32 3.18 1.99 1.98 3.05 2.32 3.05 2.32 2.99 3.01 2.99 3.12 3.12 3.05 3.18 3.18 2.99 2.99 3.71 3.44 3.82

0 2.28 0.48 0.00 1.28 1.28 1.97 1.00 0.98 1.90 1.28 1.90 1.28 1.90 1.69 1.90 2.00 2.00 1.90 2.00 2.00 1.90 1.89 2.54 1.91 2.37

0.00 0.18 0.21 0.14 0.16 0.16 0.13 0.12 0.12 0.21 0.16 0.20 0.16 0.20 0.26 0.20 0.21 0.23 0.24 0.24 0.25 0.22 0.24 10.87 14.86 10.36

415.55 4.16 5.72 3.61 2.49 2.37 2.67 4.28 4.15 2.55 2.49 2.59 2.49 2.57 2.55 2.57 2.61 2.31 2.41 2.32 2.44 2.58 2.33 2.46 2.50 2.55

12.33 12.33 12.33 12.33 12.33 12.33 12.33 12.33 12.33 12.33 12.33 12.33 12.33 12.33 12.33 12.33 12.33 12.33 12.33 12.33 12.33 12.33 12.33 12.33 12.33 12.33

1.0 0.46 0.61 0.47 0.26 0.26 0.25 0.81 0.81 0.21 0.26 0.21 0.26 0.20 0.21 0.20 0.20 0.20 0.21 0.20 0.20 0.20 0.20 0.18 0.22 0.21

4617.22 33.89 51.22 27.78 15.33 14.00 17.33 35.22 33.78 16.00 15.33 16.44 15.33 16.22 16.00 16.22 16.67 13.33 14.44 13.44 14.78 16.33 13.56 15.00 15.44 16.00

26 27 28 29 30 31 32

OSAH+PAR(atc) PARSAH+PAR(atc) OSAH+PER(atc) PERSAH+PER(atc) SPHSAH+PER(atc) OSAH+SPH(atc) SPHSAH+SPH(atc)

3509 3407 3509 2536 1930 3509 1930

3510 3408 3510 2537 1931 3510 1931

1826 1724 1826 1275 990 1826 990

2628 2630 2628 2446 2241 2628 2241

2.89 2.89 2.95 2.91 52.20 2.95 51.59

11.70 10.29 11.10 10.53 10.66 11.12 10.70

2.28 2.05 2.15 2.12 1.85 2.16 1.86

1.50 1.29 1.38 1.37 0.92 1.39 0.93

0.23 0.25 0.24 8.93 0.45 0.23 0.57

1.58 1.59 1.58 1.60 3.35 2.05 3.72

9.50 9.50 11.38 11.38 11.38 11.30 11.30

0.40 0.48 0.30 0.35 0.82 0.49 0.83

10.25 10.38 8.38 8.62 30.50 9.20 25.90

33 34 35 36 37 38 39 40 41 42 43 44 45 46 47

OSAH+TAseq (16,2) OSAH+TAA rec (16,2) OSAH+TAB rec (16,2) OSAH+TASNL (16,2) OSAH+TANLT (16,2) OSAH+TAseq (18,2) OSAH+TAA rec (18,2) OSAH+TAB rec (18,2) OSAH+TASNL (18,2) OSAH+TANLT (18,2) OSAH+TAseq (atc) OSAH+TAA rec (atc) OSAH+TAB rec (atc) OSAH+TASNL (atc) OSAH+TANLT (atc)

1099 1099 1099 1099 4253 1099 1099 1099 1099 4253 3509 3509 3509 3509 13132

1100 1100 1100 1100 1100 1100 1100 1100 1100 1100 3510 3510 3510 3510 3510

136 136 136 136 136 136 136 136 136 136 1826 1826 1826 1826 1826

1926 1926 1926 1926 1926 1926 1926 1926 1926 1926 2628 2628 2628 2628 2628

6.66 6.66 6.66 6.66 6.66 6.66 6.66 6.66 6.66 6.66 5.89 5.89 5.89 5.89 5.89

25.54 12.76 12.76 12.03 11.56 25.54 12.76 12.76 12.03 11.56 38.10 15.48 15.48 14.25 13.33

2.32 2.32 2.32 2.32 2.32 2.32 2.32 2.32 2.32 2.32 2.99 2.99 2.99 2.99 2.99

1.28 1.28 1.28 1.28 1.28 1.28 1.28 1.28 1.28 1.28 1.90 1.90 1.90 1.90 1.90

0.13 0.11 0.13 0.14 0.14 0.12 0.13 0.13 0.13 0.15 0.26 0.16 0.16 0.22 0.22

2.82 2.33 2.16 2.17 2.17 2.76 2.34 2.21 2.30 2.17 3.18 2.48 2.26 2.36 2.24

12.33 12.33 12.33 12.33 12.33 12.33 12.33 12.33 12.33 12.33 12.33 12.33 12.33 12.33 12.33

0.16 0.22 0.26 0.26 0.26 0.17 0.23 0.26 0.24 0.27 0.11 0.17 0.20 0.18 0.20

19.00 13.56 11.67 11.78 11.78 18.33 13.67 12.22 13.22 11.78 23.00 15.22 12.78 13.89 12.56

48 49 50 51 52 53 54 55 56 57 58

BVH O84 O89 BSP O93 UG AG HUG RG O84A KD

205 5027 5027 9763 5027 0 360 1 319 6353 1099

1223 35190 35190 9764 35190 9583 11472 432 5788 44472 1100

0 8886 8886 1314 8886 7356 3905 205 2984 5260 136

1922 85306 85306 35107 85306 12374 30027 4448 15420 123036 1926

60.37 37.29 37.18 35.23 36.04 29.38 35.07 99.32 30.02 34.76 6.66

33.45 43.37 28.91 27.40 26.35 7.18 6.78 3.31 6.37 39.57 12.76

25.24 7.49 7.48 5.06 10.68 7.18 5.46 2.65 5.02 6.83 2.32

0.00 3.80 3.80 2.28 7.09 5.47 2.81 1.32 3.47 2.82 1.28

0.18 0.26 0.25 0.29 0.24 0.09 0.43 0.04 0.05 0.98 0.33

82.50 9.07 7.23 6.70 8.69 4.48 7.49 9.28 5.39 8.51 3.82

– – – – – – – – – – –

– – – – – – – – – – –

– – – – – – – – – – –

$\tilde{N}_{ETS}$

$\tilde{N}_{EETS}$

Table 7: Experimental results for scene “sombrero1”.

$N = 1922$, TPD: $N_{prim} = 263169$, $N_{hit} = 136638$, $N_{prim}^{hit} = 112209$, $N_{sec} = 0$, $N_{sec}^{hit} = 0$, $N_{shad} = 110241$, $N_{shad}^{hit} = 2247$, $T_R^{MIN}$ [s] $= 1.20$, $T_{app}$ [s] $= 1.11$, $T_{RSA}^{MIN}$ [s] $= 0.09$.

Scene = “teapot4”

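Minimum Testing Output: ∆, Σ, Θ

Columns: Line, Mnemonic Notation, $N_G$, $N_E$, $N_{EE}$, $N_{ER}$, $r_{ITM}$, $\tilde{N}_{TS}$, $\tilde{N}_{ETS}$, $\tilde{N}_{EETS}$, $T_B$, $T_R$, $\Theta_{APP}$, $\Theta_{rat}$, $\Theta_{RUN}$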

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25

naïve RSA spatmed-xyz(16,2) objmed-xyz(16,2) objmed(16,2) OSAH(16,2) OSAH-RMI(16,2) OSAH-xyz(16,2) OSAH(8,1) OSAH(8,2) OSAH(16,1) OSAH(16,2) OSAH(24,1) OSAH(24,2) OSAH(atc) OSAH2(atc) OSAH+LC(atc) OSAH+TPC(atc) OSAH+TPC+LC(atc) OSAH+LC(16,1) OSAH+TPC(16,1) OSAH+TPC+LC(16,1) OSAH+PR(atc) OSAH+SC(atc) OSAH+GCM(atc) OSAH+GCM2(atc) OSAH+GCM3(atc)

0 6074 3130 3917 2635 2635 2094 172 154 3743 2635 5591 4078 1534 2090 1787 1222 1681 3748 3755 3760 1534 1579 2006 3728 1777

0 6075 3131 3918 2636 2636 2095 173 155 3744 2636 5592 4079 1535 2091 1788 1223 1682 3749 3756 3761 1535 1580 2007 3729 1778

1 959 397 482 375 375 339 34 24 881 375 987 402 436 459 446 370 397 886 885 890 436 584 479 521 332

1008 21619 7407 8654 5373 5373 4566 1461 1461 5933 5373 10207 9322 3069 4082 3955 2767 4169 5933 5952 5952 2863 2741 3747 9721 4263

– 37.24 82.77 76.44 19.56 19.56 18.82 48.61 55.06 11.60 19.56 11.55 19.54 14.54 14.18 13.83 16.31 14.84 11.51 11.59 11.50 14.27 11.52 12.71 16.33 18.87

0 30.46 47.73 71.14 19.12 19.12 19.61 14.93 13.06 22.77 19.12 23.20 19.42 20.74 22.91 21.24 19.93 20.84 22.80 22.78 22.82 20.51 20.30 21.45 25.60 25.22

0 6.38 9.75 13.69 4.07 4.07 4.14 3.47 3.12 4.76 4.07 4.83 4.12 4.41 4.67 4.48 4.27 4.41 4.76 4.76 4.76 4.35 4.30 4.47 4.94 4.75

0 2.79 1.40 1.04 0.95 0.95 1.05 1.58 0.43 2.46 0.95 2.48 0.96 2.34 2.28 2.38 2.25 2.29 2.50 2.47 2.50 2.37 2.43 2.21 2.18 1.85

0.00 0.11 0.11 0.18 0.14 0.15 0.09 0.06 0.05 0.17 0.14 0.25 0.21 0.11 0.17 0.13 0.11 0.15 0.21 0.21 0.20 0.14 0.20 5.94 11.77 5.51

414.85 12.31 18.26 20.78 9.54 9.22 9.11 11.58 12.01 9.09 9.54 9.16 9.45 8.88 9.42 8.86 8.92 8.46 8.37 8.33 8.36 8.14 8.81 8.82 9.53 9.85

20.11 20.11 20.11 20.11 20.11 20.11 20.11 20.11 20.11 20.11 20.11 20.11 20.11 20.11 20.11 20.11 20.11 20.11 20.11 20.11 20.11 20.11 20.11 20.11 20.11 20.11

1.0 0.47 0.55 0.43 0.42 0.42 0.41 0.70 0.75 0.27 0.42 0.26 0.42 0.33 0.31 0.32 0.37 0.34 0.26 0.27 0.26 0.32 0.29 0.30 0.31 0.35

2183.42 44.68 76.00 89.26 30.11 28.42 27.84 40.84 43.11 27.74 30.11 28.11 29.63 26.63 29.47 26.53 26.84 24.42 23.95 23.74 23.89 22.74 26.26 26.32 30.05 31.74

26 27 28 29 30 31 32

OSAH+PAR(atc) PARSAH+PAR(atc) OSAH+PER(atc) PERSAH+PER(atc) SPHSAH+PER(atc) OSAH+SPH(atc) SPHSAH+SPH(atc)

1534 1452 1534 1504 1111 1534 1111

1535 1453 1535 1505 1112 1535 1112

436 392 436 399 291 436 291

3069 2971 3069 3002 2782 3069 2782

5.24 5.38 4.61 4.85 6.37 4.61 6.37

15.23 14.75 18.59 16.40 17.33 18.19 16.95

3.30 3.17 4.12 3.69 3.84 4.03 3.76

2.05 1.78 2.66 2.05 2.34 2.60 2.29

0.14 0.15 0.13 10.15 0.39 0.19 0.40

2.59 2.53 3.15 3.15 3.23 3.46 3.52

14.64 14.64 17.08 17.08 17.08 17.36 17.36

0.41 0.39 0.24 0.33 0.35 0.33 0.41

8.91 8.36 7.15 7.15 7.77 7.36 7.79

33 34 35 36 37 38 39 40 41 42 43 44 45 46 47

OSAH+TAseq (16,2) OSAH+TAA rec (16,2) OSAH+TAB rec (16,2) OSAH+TASNL (16,2) OSAH+TANLT (16,2) OSAH+TAseq (18,2) OSAH+TAA rec (18,2) OSAH+TAB rec (18,2) OSAH+TASNL (18,2) OSAH+TANLT (18,2) OSAH+TAseq (atc) OSAH+TAA rec (atc) OSAH+TAB rec (atc) OSAH+TASNL (atc) OSAH+TANLT (atc)

2635 2635 2635 2635 12142 3418 3418 3418 3418 14817 1534 1534 1534 1534 7550

2636 2636 2636 2636 2636 3419 3419 3419 3419 3419 1535 1535 1535 1535 1535

375 375 375 375 375 391 391 391 391 391 436 436 436 436 436

5373 5373 5373 5373 5373 7375 7375 7375 7375 7375 3069 3069 3069 3069 3069

19.69 19.56 19.56 19.70 19.70 19.65 19.52 19.52 19.66 19.66 14.70 14.54 14.54 14.70 14.70

42.00 19.12 19.12 15.62 14.26 42.93 19.34 19.34 15.75 14.37 44.85 20.74 20.74 16.74 15.28

4.09 4.07 4.07 4.09 4.09 4.13 4.11 4.11 4.13 4.13 4.42 4.41 4.41 4.42 4.42

0.95 0.95 0.95 0.95 0.95 0.96 0.96 0.96 0.96 0.96 2.34 2.34 2.34 2.34 2.34

0.12 0.12 0.13 0.16 0.16 0.15 0.14 0.15 0.20 0.19 0.09 0.09 0.08 0.12 0.13

11.69 9.43 8.68 8.78 8.91 11.75 9.47 8.73 8.83 8.80 11.60 9.17 8.35 8.47 8.44

20.11 20.11 20.11 20.11 20.11 20.11 20.11 20.11 20.11 20.11 20.11 20.11 20.11 20.11 20.11

0.26 0.36 0.42 0.41 0.40 0.26 0.36 0.42 0.41 0.41 0.19 0.28 0.33 0.32 0.32

41.42 29.53 25.58 26.11 26.79 41.74 29.74 25.84 26.37 26.21 40.95 28.16 23.84 24.47 24.32

48 49 50 51 52 53 54 55 56 57 58

BVH O84 O89 BSP O93 UG AG HUG RG O84A KD

113 2964 2964 6074 2964 0 472 303 626 5732 2635

623 20749 20749 6075 20749 4968 12374 2906 9861 40125 2636

0 4990 4990 959 4990 3717 3583 1566 2399 4606 375

1008 50544 50544 21619 50544 6466 30866 7550 38017 109596 5373

78.49 34.31 34.28 37.24 33.67 48.30 46.52 45.89 44.85 28.64 19.56

36.66 46.95 33.44 30.46 30.48 12.25 11.93 10.45 11.11 45.35 19.12

25.50 8.99 8.98 6.38 12.54 12.25 10.10 8.62 8.85 8.94 4.07

0.00 5.03 5.03 2.79 8.63 9.23 4.72 6.35 5.03 4.94 0.95

0.07 0.17 0.13 0.83 0.14 0.05 0.40 0.07 0.11 0.81 0.37

195.22 25.83 20.48 53.66 24.51 17.37 29.66 21.57 20.39 25.02 13.94

– – – – – – – – – – –

– – – – – – – – – – –

– – – – – – – – – – –

Table 8: Experimental results for scene “teapot4”.

$N = 1008$, TPD: $N_{prim} = 263169$, $N_{hit} = 226198$, $N_{prim}^{hit} = 161203$, $N_{sec} = 228252$, $N_{sec}^{hit} = 71224$, $N_{shad} = 408292$, $N_{shad}^{hit} = 37577$, $T_R^{MIN}$ [s] $= 4.01$, $T_{app}$ [s] $= 3.82$, $T_{RSA}^{MIN}$ [s] $= 0.19$.

Scene = “tetra5”

Minimum Testing Output: ∆, Σ, Θ

Columns: Line, Mnemonic Notation, $N_G$, $N_E$, $N_{EE}$, $N_{ER}$, $r_{ITM}$, $\tilde{N}_{TS}$, $T_B$, $T_R$, $\Theta_{APP}$, $\Theta_{rat}$, $\Theta_{RUN}$

0 13915 735 831 751 751 727 171 171 751 751 751 751 751 747 751 751 751 751 751 751 751 751 839 839 887

0 13916 736 832 752 752 728 172 172 752 752 752 752 752 748 752 752 752 752 752 752 752 752 840 840 888

1 4140 480 576 496 496 472 60 60 496 496 496 496 496 492 496 496 496 496 496 496 496 496 584 584 632

1024 46080 1024 1024 1024 1024 1024 1024 1024 1024 1024 1024 1024 1024 1024 1024 1024 1024 1024 1024 1024 1024 1024 1024 1024 1024

– 43.38 16.17 10.62 10.62 10.62 11.35 41.69 41.69 10.62 10.62 10.62 10.62 10.62 10.62 10.62 10.62 10.62 10.62 10.62 10.62 10.62 10.62 10.62 10.62 11.05

0 28.85 18.41 27.52 11.76 11.76 13.32 9.05 9.05 11.76 11.76 11.76 11.76 11.76 13.31 11.76 11.79 11.79 11.76 11.79 11.79 11.76 11.76 12.79 12.79 16.46

0 5.62 3.70 6.02 2.32 2.32 2.68 1.97 1.97 2.32 2.32 2.32 2.32 2.32 2.40 2.32 2.32 2.32 2.32 2.32 2.32 2.32 2.32 2.44 2.44 3.02

0 3.85 2.93 5.51 1.81 1.81 2.15 1.12 1.12 1.81 1.81 1.81 1.81 1.81 1.89 1.81 1.81 1.81 1.81 1.81 1.81 1.81 1.81 1.94 1.94 2.50

0.01 0.18 0.04 0.07 0.07 0.07 0.05 0.05 0.05 0.07 0.07 0.07 0.06 0.07 0.09 0.06 0.07 0.07 0.08 0.07 0.08 0.07 0.08 2.55 2.59 2.80

121.80 3.09 2.01 2.17 1.80 1.69 1.74 2.22 2.22 1.70 1.80 1.71 1.70 1.69 1.69 1.68 1.69 1.58 1.57 1.57 1.57 1.69 1.63 1.73 1.72 1.75

26.58 26.58 26.58 26.58 26.58 26.58 26.58 26.58 26.58 26.58 26.58 26.58 26.58 26.58 26.58 26.58 26.58 26.58 26.58 26.58 26.58 26.58 26.58 26.58 26.58 26.58

1.0 0.40 0.28 0.15 0.29 0.29 0.28 0.67 0.67 0.29 0.29 0.29 0.29 0.29 0.26 0.29 0.29 0.29 0.29 0.29 0.29 0.29 0.29 0.27 0.27 0.23

3690.91 67.06 34.33 39.18 27.97 24.64 26.15 40.70 40.70 24.94 27.97 25.24 24.94 24.64 24.64 24.33 24.64 21.30 21.00 21.00 21.00 24.64 22.82 25.85 25.55 26.45

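| Line | Mnemonic Notation | NG | NE | NEE | NER | rITM | ÑTS | ÑETS | ÑEETS | TB | TR | ΘAPP | Θrat | ΘRUN |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 26 | OSAH+PAR(atc) | 751 | 752 | 496 | 1024 | 8.37 | 10.32 | 2.12 | 1.75 | 0.08 | 1.38 | 17.67 | 0.60 | 28.33 |
| 27 | PARSAH+PAR(atc) | 803 | 804 | 548 | 1024 | 8.37 | 11.14 | 2.31 | 1.93 | 0.07 | 1.27 | 17.67 | 0.50 | 24.67 |
| 28 | OSAH+PER(atc) | 751 | 752 | 496 | 1024 | 7.53 | 10.54 | 2.15 | 1.76 | 0.08 | 1.30 | 31.00 | 0.49 | 34.00 |
| 29 | PERSAH+PER(atc) | 840 | 841 | 585 | 1024 | 7.53 | 11.28 | 2.34 | 1.96 | 1.99 | 1.42 | 31.00 | 0.53 | 40.00 |
| 30 | SPHSAH+PER(atc) | 641 | 642 | 425 | 1024 | 23.15 | 13.46 | 2.64 | 2.09 | 0.13 | 1.70 | 31.00 | 0.59 | 54.00 |
| 31 | OSAH+SPH(atc) | 751 | 752 | 496 | 1024 | 7.52 | 10.41 | 2.12 | 1.74 | 0.06 | 1.67 | 23.25 | 0.54 | 18.50 |
| 32 | SPHSAH+SPH(atc) | 641 | 642 | 425 | 1024 | 22.91 | 13.27 | 2.60 | 2.06 | 0.13 | 2.05 | 23.25 | 0.61 | 28.00 |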

33 34 35 36 37 38 39 40 41 42 43 44 45 46 47

OSAH+TAseq (16,2) OSAH+TAA rec (16,2) OSAH+TAB rec (16,2) OSAH+TASNL (16,2) OSAH+TANLT (16,2) OSAH+TAseq (18,2) OSAH+TAA rec (18,2) OSAH+TAB rec (18,2) OSAH+TASNL (18,2) OSAH+TANLT (18,2) OSAH+TAseq (atc) OSAH+TAA rec (atc) OSAH+TAB rec (atc) OSAH+TASNL (atc) OSAH+TANLT (atc)

751 751 751 751 2427 751 751 751 751 2427 751 751 751 751 2427

752 752 752 752 752 752 752 752 752 752 752 752 752 752 752

496 496 496 496 496 496 496 496 496 496 496 496 496 496 496

1024 1024 1024 1024 1024 1024 1024 1024 1024 1024 1024 1024 1024 1024 1024

10.62 10.62 10.62 10.62 10.62 10.62 10.62 10.62 10.62 10.62 10.62 10.62 10.62 10.62 10.62

23.98 11.76 11.76 11.15 10.46 23.98 11.76 11.76 11.15 10.46 23.98 11.76 11.76 11.15 10.46

2.32 2.32 2.32 2.32 2.32 2.32 2.32 2.32 2.32 2.32 2.32 2.32 2.32 2.32 2.32

1.81 1.81 1.81 1.81 1.81 1.81 1.81 1.81 1.81 1.81 1.81 1.81 1.81 1.81 1.81

0.05 0.05 0.06 0.06 0.07 0.04 0.05 0.05 0.06 0.07 0.05 0.06 0.05 0.06 0.07

2.16 1.71 1.55 1.65 1.64 2.15 1.72 1.55 1.64 1.64 2.17 1.72 1.55 1.64 1.64

26.58 26.58 26.58 26.58 26.58 26.58 26.58 26.58 26.58 26.58 26.58 26.58 26.58 26.58 26.58

0.15 0.23 0.29 0.25 0.26 0.15 0.23 0.29 0.26 0.26 0.15 0.23 0.29 0.26 0.26

38.88 25.24 20.39 23.42 23.12 38.58 25.55 20.39 23.12 23.12 39.18 25.55 20.39 23.12 23.12

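| Line | Mnemonic Notation | NG | NE | NEE | NER | rITM | ÑTS | ÑETS | ÑEETS | TB | TR | ΘAPP | Θrat | ΘRUN |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 48 | BVH | 102 | 623 | 0 | 1024 | 88.29 | 21.10 | 16.86 | 0.00 | 0.09 | 48.29 | – | – | – |
| 49 | O84 | 6997 | 48980 | 19428 | 128000 | 47.59 | 44.16 | 7.61 | 5.53 | 0.31 | 6.98 | – | – | – |
| 50 | O89 | 6997 | 48980 | 19428 | 128000 | 47.49 | 29.92 | 7.61 | 5.53 | 0.27 | 5.34 | – | – | – |
| 51 | BSP | 13915 | 13916 | 4140 | 46080 | 43.38 | 28.85 | 5.62 | 3.85 | 0.30 | 4.95 | – | – | – |
| 52 | O93 | 6997 | 48980 | 19428 | 128000 | 45.83 | 25.13 | 9.36 | 7.35 | 0.28 | 6.55 | – | – | – |
| 53 | UG | 0 | 5832 | 4456 | 8984 | 51.51 | 8.05 | 8.05 | 6.47 | 0.06 | 3.56 | – | – | – |
| 54 | AG | 151 | 5941 | 2791 | 16868 | 72.05 | 9.69 | 8.29 | 5.45 | 0.19 | 6.66 | – | – | – |
| 55 | HUG | 1 | 3375 | 2429 | 6736 | 63.02 | 7.41 | 6.74 | 5.08 | 0.07 | 4.51 | – | – | – |
| 56 | RG | 261 | 5164 | 1116 | 30784 | 73.71 | 6.24 | 5.08 | 3.03 | 0.05 | 4.67 | – | – | – |
| 57 | O84A | 7257 | 50800 | 18656 | 138944 | 29.86 | 33.90 | 6.26 | 4.87 | 0.91 | 5.49 | – | – | – |
| 58 | KD | 751 | 752 | 496 | 1024 | 10.62 | 11.76 | 2.32 | 1.81 | 0.10 | 2.48 | – | – | – |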

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25

naïve RSA spatmed-xyz(16,2) objmed-xyz(16,2) objmed(16,2) OSAH(16,2) OSAH-RMI(16,2) OSAH-xyz(16,2) OSAH(8,1) OSAH(8,2) OSAH(16,1) OSAH(16,2) OSAH(24,1) OSAH(24,2) OSAH(atc) OSAH2(atc) OSAH+LC(atc) OSAH+TPC(atc) OSAH+TPC+LC(atc) OSAH+LC(16,1) OSAH+TPC(16,1) OSAH+TPC+LC(16,1) OSAH+PR(atc) OSAH+SC(atc) OSAH+GCM(atc) OSAH+GCM2(atc) OSAH+GCM3(atc)

26 27 28 29 30 31 32

ÑETS

ÑEETS

Table 9: Experimental results for scene “tetra5”.

$N = 1024$, TPD: $N_{prim} = 263169$, $N_{hit} = 159213$, $N^{hit}_{prim} = 53807$, $N_{sec} = 0$, $N^{hit}_{sec} = 0$, $N_{shad} = 50135$, $N^{hit}_{shad} = 4650$, $T^{MIN}_{R}\,[s] = 0.91$, $T_{app}\,[s] = 0.877$, $T^{MIN}_{RSA}\,[s] = 0.033$.

Scene = “tree8”

Line

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25

naïve RSA spatmed-xyz(16,2) objmed-xyz(16,2) objmed(16,2) OSAH(16,2) OSAH-RMI(16,2) OSAH-xyz(16,2) OSAH(8,1) OSAH(8,2) OSAH(16,1) OSAH(16,2) OSAH(24,1) OSAH(24,2) OSAH(atc) OSAH2(atc) OSAH+LC(atc) OSAH+TPC(atc) OSAH+TPC+LC(atc) OSAH+LC(16,1) OSAH+TPC(16,1) OSAH+TPC+LC(16,1) OSAH+PR(atc) OSAH+SC(atc) OSAH+GCM(atc) OSAH+GCM2(atc) OSAH+GCM3(atc)

26 27 28 29 30 31 32

OSAH+PAR(atc) PARSAH+PAR(atc) OSAH+PER(atc) PERSAH+PER(atc) SPHSAH+PER(atc) OSAH+SPH(atc) SPHSAH+SPH(atc)

33 34 35 36 37 38 39 40 41 42 43 44 45 46 47

OSAH+TAseq (16,2) OSAH+TAA rec (16,2) OSAH+TAB rec (16,2) OSAH+TASNL (16,2) OSAH+TANLT (16,2) OSAH+TAseq (18,2) OSAH+TAA rec (18,2) OSAH+TAB rec (18,2) OSAH+TASNL (18,2) OSAH+TANLT (18,2) OSAH+TAseq (atc) OSAH+TAA rec (atc) OSAH+TAB rec (atc) OSAH+TASNL (atc) OSAH+TANLT (atc)

48 49 50 51 52 53 54 55 56 57 58

BVH O84 O89 BSP O93 UG AG HUG RG O84A KD

Minimum Testing Output ∆

Σ

Mnemonic Notation

Θ

NG

NE

NEE

NER

rITM

ÑTS

TB

TR

ΘAPP

Θrat

ΘRUN

0 307 4427 5041 1107 1107 1114 57 54 1433 1107 8578 3306 521 900 527 514 538 1433 1438 1438 521 512 569 880 548

0 308 4428 5042 1108 1108 1115 58 55 1434 1108 8579 3307 522 901 528 515 539 1434 1439 1439 522 513 570 881 549

1 218 661 809 312 312 334 8 7 545 312 1747 376 192 234 192 193 193 545 548 548 192 228 220 180 110

1023 1569 7239 8477 1930 1930 1961 1126 1126 2019 1930 9546 5649 1416 1818 1431 1403 1463 2019 2021 2021 1388 1335 1436 2377 1895

– 480.05 198.68 342.38 21.51 21.51 19.34 99.09 100.14 19.16 21.51 18.47 20.99 22.98 20.10 22.99 23.03 22.95 19.16 19.11 19.11 23.04 22.15 22.33 23.97 47.63

0 61.95 87.03 125.39 11.96 11.96 13.98 9.95 9.69 12.59 11.96 13.03 12.15 12.04 13.19 12.06 11.98 12.04 12.59 12.58 12.58 11.75 11.97 12.21 13.65 16.19

0 12.99 19.39 30.85 3.17 3.17 3.90 2.84 2.79 3.28 3.17 3.36 3.20 3.19 3.39 3.19 3.17 3.18 3.28 3.27 3.27 3.11 3.18 3.20 3.47 3.56

0 6.69 3.33 5.32 0.48 0.48 1.65 0.12 0.12 0.59 0.48 0.62 0.48 0.51 0.91 0.51 0.51 0.51 0.59 0.59 0.59 0.52 0.61 0.58 0.56 0.43

0.01 0.02 0.14 0.22 0.16 0.15 0.08 0.09 0.08 0.15 0.16 0.33 0.22 0.12 0.16 0.12 0.14 0.14 0.18 0.19 0.19 0.14 0.17 1.96 3.24 2.23

1544.43 109.94 58.63 102.18 10.09 9.89 10.48 22.95 22.99 9.63 10.09 9.66 9.81 10.43 10.18 10.48 10.30 9.54 9.07 8.94 8.99 9.78 9.91 9.81 10.53 15.19

18.67 18.67 18.67 18.67 18.67 18.67 18.67 18.67 18.67 18.67 18.67 18.67 18.67 18.67 18.67 18.67 18.67 18.67 18.67 18.67 18.67 18.67 18.67 18.67 18.67 18.67

1.0 0.76 0.48 0.52 0.42 0.42 0.36 0.80 0.80 0.38 0.42 0.36 0.41 0.43 0.38 0.43 0.43 0.43 0.38 0.38 0.38 0.43 0.42 0.42 0.41 0.54

7354.43 504.86 260.52 467.90 29.38 28.43 31.24 90.62 90.81 27.19 29.38 27.33 28.05 31.00 29.81 31.24 30.38 26.76 24.52 23.90 24.14 27.90 28.52 28.05 31.48 53.67

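| Line | Mnemonic Notation | NG | NE | NEE | NER | rITM | ÑTS | ÑETS | ÑEETS | TB | TR | ΘAPP | Θrat | ΘRUN |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 26 | OSAH+PAR(atc) | 521 | 522 | 192 | 1416 | 4.75 | 15.97 | 4.46 | 0.82 | 0.14 | 5.60 | 18.24 | 0.57 | 8.43 |
| 27 | PARSAH+PAR(atc) | 566 | 567 | 227 | 1506 | 3.38 | 12.66 | 3.40 | 0.84 | 0.19 | 5.18 | 18.24 | 0.55 | 6.43 |
| 28 | OSAH+PER(atc) | 521 | 522 | 192 | 1416 | 8.33 | 18.59 | 4.95 | 1.08 | 0.14 | 4.48 | 17.87 | 0.51 | 12.00 |
| 29 | PERSAH+PER(atc) | 847 | 848 | 329 | 1827 | 10.11 | 14.83 | 4.12 | 1.94 | 34.83 | 4.62 | 17.87 | 0.63 | 12.93 |
| 30 | SPHSAH+PER(atc) | 639 | 640 | 227 | 2175 | 10.68 | 15.63 | 4.18 | 1.70 | 1.23 | 4.93 | 17.87 | 0.67 | 15.00 |
| 31 | OSAH+SPH(atc) | 521 | 522 | 192 | 1416 | 8.28 | 18.43 | 4.90 | 1.07 | 0.13 | 4.76 | 16.53 | 0.55 | 11.47 |
| 32 | SPHSAH+SPH(atc) | 639 | 640 | 227 | 2175 | 10.66 | 15.55 | 4.16 | 1.68 | 1.24 | 5.21 | 16.53 | 0.69 | 14.12 |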

1107 1107 1107 1107 7341 1630 1630 1630 1630 10377 521 521 521 521 3644

1108 1108 1108 1108 1108 1631 1631 1631 1631 1631 522 522 522 522 522

312 312 312 312 312 356 356 356 356 356 192 192 192 192 192

1930 1930 1930 1930 1930 2554 2554 2554 2554 2554 1416 1416 1416 1416 1416

21.60 21.51 21.51 21.61 21.61 21.17 21.08 21.08 21.18 21.18 23.07 22.98 22.98 23.08 23.08

25.40 11.96 11.96 11.82 10.99 25.81 12.06 12.06 11.92 11.08 25.16 12.04 12.04 11.89 11.06

3.18 3.17 3.17 3.18 3.18 3.20 3.19 3.19 3.20 3.20 3.19 3.19 3.19 3.20 3.20

0.48 0.48 0.48 0.48 0.48 0.48 0.48 0.48 0.48 0.48 0.51 0.51 0.51 0.51 0.51

0.13 0.12 0.12 0.15 0.14 0.15 0.14 0.15 0.14 0.18 0.10 0.10 0.10 0.10 0.12

13.13 9.99 9.35 9.77 10.42 13.12 9.95 9.29 9.71 9.87 13.55 10.37 9.70 10.69 10.29

18.67 18.67 18.67 18.67 18.67 18.67 18.67 18.67 18.67 18.67 18.67 18.67 18.67 18.67 18.67

0.25 0.38 0.42 0.39 0.35 0.24 0.36 0.41 0.38 0.37 0.26 0.39 0.43 0.37 0.39

43.86 28.90 25.86 27.86 30.95 43.81 28.71 25.57 27.57 28.33 45.86 30.71 27.52 32.24 30.33

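| Line | Mnemonic Notation | NG | NE | NEE | NER | rITM | ÑTS | ÑETS | ÑEETS | TB | TR | ΘAPP | Θrat | ΘRUN |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 48 | BVH | 163 | 672 | 0 | 1023 | 32.72 | 10.33 | 6.59 | 0.00 | 0.08 | 99.60 | – | – | – |
| 49 | O84 | 145 | 1016 | 859 | 2013 | 419.41 | 128.56 | 21.27 | 14.00 | 0.03 | 176.30 | – | – | – |
| 50 | O89 | 145 | 1016 | 859 | 2013 | 418.74 | 83.80 | 21.27 | 14.00 | 0.03 | 152.31 | – | – | – |
| 51 | BSP | 307 | 308 | 218 | 1569 | 480.05 | 61.95 | 12.99 | 6.69 | 1.31 | 232.96 | – | – | – |
| 52 | O93 | 145 | 1016 | 859 | 2013 | 418.25 | 71.05 | 28.38 | 21.12 | 0.03 | 167.79 | – | – | – |
| 53 | UG | 0 | 4324 | 2158 | 3217 | 1137.20 | 6.29 | 6.29 | 2.87 | 0.04 | 299.79 | – | – | – |
| 54 | AG | 170 | 6285 | 4014 | 5248 | 19.75 | 7.05 | 6.40 | 4.62 | 0.13 | 18.49 | – | – | – |
| 55 | HUG | 10 | 3064 | 2322 | 2550 | 55.55 | 4.53 | 1.94 | 1.18 | 0.06 | 35.93 | – | – | – |
| 56 | RG | 45 | 3511 | 1778 | 4177 | 53.77 | 14.39 | 11.75 | 6.48 | 0.02 | 29.01 | – | – | – |
| 57 | O84A | 471 | 3298 | 1364 | 4836 | 22.74 | 31.57 | 7.99 | 5.62 | 0.12 | 28.84 | – | – | – |
| 58 | KD | 1107 | 1108 | 312 | 1930 | 21.51 | 11.96 | 3.17 | 0.48 | 0.34 | 18.39 | – | – | – |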

ÑETS

ÑEETS

Table 10: Experimental results for scene “tree8”.

$N = 1023$, TPD: $N_{prim} = 263169$, $N_{hit} = 263169$, $N^{hit}_{prim} = 166900$, $N_{sec} = 0$, $N^{hit}_{sec} = 0$, $N_{shad} = 1088879$, $N^{hit}_{shad} = 632$, $T^{MIN}_{R}\,[s] = 4.13$, $T_{app}\,[s] = 3.92$, $T^{MIN}_{RSA}\,[s] = 0.21$.


Figure 1: Visualization of the G3SPD scenes using the testing procedure TPD. Panels: balls3, gears2, jacks3, lattice6, mount4.


Figure 2: Visualization of the G3SPD scenes using the testing procedure TPD. Panels: rings3, sombrero1, teapot4, tetra5, tree8.

Scene = “balls4”

Line

Minimum Testing Output ∆

Σ

Mnemonic Notation

Θ

NG

NE

NEE

NER

rITM

ÑTS

TB

TR

ΘAPP

Θrat

ΘRUN

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25

naïve RSA spatmed-xyz(16,2) objmed-xyz(16,2) objmed(16,2) OSAH(16,2) OSAH-RMI(16,2) OSAH-xyz(16,2) OSAH(8,1) OSAH(8,2) OSAH(16,1) OSAH(16,2) OSAH(24,1) OSAH(24,2) OSAH(atc) OSAH2(atc) OSAH+LC(atc) OSAH+TPC(atc) OSAH+TPC+LC(atc) OSAH+LC(16,1) OSAH+TPC(16,1) OSAH+TPC+LC(16,1) OSAH+PR(atc) OSAH+SC(atc) OSAH+GCM(atc) OSAH+GCM2(atc) OSAH+GCM3(atc)

0 583 31398 26852 4374 4374 4833 82 79 5243 4374 64019 29159 7892 12720 8014 7707 8253 5249 5264 5270 7892 7427 9515 16783 12955

0 584 31399 26853 4375 4375 4834 83 80 5244 4375 64020 29160 7893 12721 8015 7708 8254 5250 5265 5271 7893 7428 9516 16784 12956

1 258 1213 502 746 746 828 6 5 938 746 9756 2724 1479 1641 1485 1469 1476 944 959 965 1479 2003 1761 1067 1446

7382 11433 64190 55119 13784 13784 14839 7931 7930 14410 13784 79636 51935 17323 23484 17592 17092 18312 14410 14410 14410 14565 14055 19438 39824 27378

– 330.66 104.21 109.17 16.23 16.23 17.18 410.73 411.16 15.01 16.23 9.72 11.73 12.79 11.85 12.41 12.81 12.41 14.62 14.99 14.61 11.55 10.94 11.99 14.03 16.52

0 56.74 133.92 130.18 24.10 24.10 25.44 13.16 12.61 26.09 24.10 31.83 27.47 27.13 33.66 27.35 27.08 27.36 26.29 26.11 26.31 27.21 26.67 28.35 31.60 34.89

0 10.76 28.30 27.86 4.77 4.77 5.13 3.12 2.99 5.18 4.77 6.13 5.29 5.34 6.02 5.35 5.33 5.35 5.18 5.18 5.18 5.37 5.27 5.55 6.02 6.21

0 5.96 2.98 1.12 1.26 1.26 1.40 0.36 0.27 1.51 1.26 1.99 1.44 1.61 2.34 1.80 1.61 1.80 1.71 1.52 1.71 1.65 1.94 1.85 1.40 1.43

0.02 0.18 1.24 1.71 1.38 1.35 0.78 0.93 0.92 1.40 1.38 2.85 2.12 1.48 2.07 1.49 1.51 1.81 1.58 1.67 1.67 1.72 2.26 28.87 49.25 39.85

6248.25 81.39 57.94 60.17 16.07 15.68 15.84 87.10 85.93 15.94 16.07 16.55 15.91 15.74 16.61 15.67 15.80 14.72 14.92 14.93 15.00 14.81 15.32 15.59 16.78 17.73

15.05 15.05 15.05 15.05 15.05 15.05 15.05 15.05 15.05 15.05 15.05 15.05 15.05 15.05 15.05 15.05 15.05 15.05 15.05 15.05 15.05 15.05 15.05 15.05 15.05 15.05

1.0 0.77 0.31 0.32 0.28 0.28 0.28 0.95 0.95 0.25 0.28 0.15 0.19 0.21 0.17 0.20 0.21 0.20 0.24 0.25 0.24 0.19 0.19 0.19 0.20 0.21

16887.16 204.92 141.54 147.57 28.38 27.32 27.76 220.35 217.19 28.03 28.38 29.68 27.95 27.49 29.84 27.30 27.65 24.73 25.27 25.30 25.49 24.97 26.35 27.08 30.30 32.86

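| Line | Mnemonic Notation | NG | NE | NEE | NER | rITM | ÑTS | ÑETS | ÑEETS | TB | TR | ΘAPP | Θrat | ΘRUN |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 26 | OSAH+PAR(atc) | 7892 | 7893 | 1479 | 17323 | 6.55 | 31.18 | 6.86 | 2.39 | 1.66 | 4.58 | 12.85 | 0.31 | 10.05 |
| 27 | PARSAH+PAR(atc) | 7751 | 7752 | 1465 | 17185 | 6.41 | 29.76 | 6.49 | 2.09 | 2.06 | 4.57 | 12.85 | 0.34 | 10.00 |
| 28 | OSAH+PER(atc) | 7892 | 7893 | 1479 | 17323 | 6.30 | 30.98 | 6.87 | 2.66 | 1.69 | 4.61 | 14.11 | 0.34 | 11.50 |
| 29 | PERSAH+PER(atc) | 13264 | 13265 | 2942 | 22531 | 4.93 | 32.53 | 7.04 | 3.22 | 302.22 | 4.65 | 14.11 | 0.32 | 11.72 |
| 30 | SPHSAH+PER(atc) | 7535 | 7536 | 1447 | 17392 | 6.42 | 30.42 | 6.58 | 2.46 | 10.21 | 4.57 | 14.11 | 0.33 | 11.28 |
| 31 | OSAH+SPH(atc) | 7892 | 7893 | 1479 | 17323 | 6.21 | 30.51 | 6.76 | 2.61 | 1.71 | 4.99 | 12.30 | 0.37 | 9.39 |
| 32 | SPHSAH+SPH(atc) | 7535 | 7536 | 1447 | 17392 | 6.35 | 30.02 | 6.50 | 2.42 | 10.27 | 4.88 | 12.30 | 0.35 | 8.91 |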

33 34 35 36 37 38 39 40 41 42 43 44 45 46 47

OSAH+TAseq (16,2) OSAH+TAA rec (16,2) OSAH+TAB rec (16,2) OSAH+TASNL (16,2) OSAH+TANLT (16,2) OSAH+TAseq (18,2) OSAH+TAA rec (18,2) OSAH+TAB rec (18,2) OSAH+TASNL (18,2) OSAH+TANLT (18,2) OSAH+TAseq (atc) OSAH+TAA rec (atc) OSAH+TAB rec (atc) OSAH+TASNL (atc) OSAH+TANLT (atc)

4374 4374 4374 4374 31999 9171 9171 9171 9171 65310 7892 7892 7892 7892 56963

4375 4375 4375 4375 4375 9172 9172 9172 9172 9172 7893 7893 7893 7893 7893

746 746 746 746 746 1545 1545 1545 1545 1545 1479 1479 1479 1479 1479

13784 13784 13784 13784 13784 19724 19724 19724 19724 19724 17323 17323 17323 17323 17323

17.23 16.23 16.23 17.24 17.23 13.66 12.87 12.87 13.68 13.67 13.56 12.78 12.79 13.59 13.57

59.40 24.10 24.10 21.84 19.53 65.76 25.67 25.67 23.28 20.70 69.73 27.13 27.13 24.46 21.66

4.92 4.77 4.77 4.92 4.92 5.18 5.01 5.01 5.19 5.19 5.50 5.34 5.34 5.51 5.50

1.27 1.26 1.26 1.27 1.27 1.38 1.37 1.37 1.38 1.38 1.61 1.61 1.61 1.61 1.61

1.15 1.12 1.14 1.21 1.21 1.28 1.28 1.28 1.44 1.53 1.22 1.22 1.22 1.35 1.41

21.33 16.29 14.76 15.50 15.87 21.63 16.30 14.49 15.35 15.49 22.37 16.53 14.69 15.56 15.66

15.05 15.05 15.05 15.05 15.05 15.05 15.05 15.05 15.05 15.05 15.05 15.05 15.05 15.05 15.05

0.16 0.24 0.28 0.26 0.25 0.12 0.18 0.22 0.20 0.20 0.11 0.17 0.21 0.19 0.19

42.59 28.97 24.84 26.84 27.84 43.41 29.00 24.11 26.43 26.81 45.41 29.62 24.65 27.00 27.27

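| Line | Mnemonic Notation | NG | NE | NEE | NER | rITM | ÑTS | ÑETS | ÑEETS | TB | TR | ΘAPP | Θrat | ΘRUN |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 48 | BVH | 1107 | 4967 | 0 | 7382 | 122.94 | 141.01 | 97.54 | 0.00 | 0.80 | 1251.74 | – | – | – |
| 49 | O84 | 295 | 2066 | 1032 | 15491 | 210.05 | 118.54 | 19.17 | 12.37 | 0.22 | 120.07 | – | – | – |
| 50 | O89 | 295 | 2066 | 1032 | 15491 | 211.91 | 75.40 | 19.16 | 12.37 | 0.21 | 96.63 | – | – | – |
| 51 | BSP | 583 | 584 | 258 | 11433 | 330.66 | 56.74 | 10.76 | 5.96 | 1.03 | 164.56 | – | – | – |
| 52 | O93 | 295 | 2066 | 1032 | 15497 | 207.90 | 64.76 | 25.30 | 18.59 | 0.22 | 116.43 | – | – | – |
| 53 | UG | 0 | 40836 | 33867 | 15959 | 215.52 | 10.25 | 10.25 | 7.11 | 0.26 | 78.59 | – | – | – |
| 54 | AG | 1762 | 48624 | 13627 | 58410 | 23.49 | 13.37 | 11.57 | 7.41 | 3.16 | 32.25 | – | – | – |
| 55 | HUG | 21 | 22226 | 17429 | 17954 | 67.25 | 15.51 | 11.92 | 8.61 | 0.42 | 73.71 | – | – | – |
| 56 | RG | 1578 | 40061 | 11669 | 122122 | 31.07 | 16.32 | 12.63 | 7.41 | 0.36 | 38.12 | – | – | – |
| 57 | O84A | 2291 | 16038 | 3486 | 34692 | 16.57 | 46.88 | 9.12 | 4.97 | 1.05 | 37.96 | – | – | – |
| 58 | KD | 4374 | 4375 | 746 | 13784 | 16.23 | 24.10 | 4.77 | 1.26 | 1.78 | 27.06 | – | – | – |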

ÑETS

ÑEETS

Table 11: Experimental results for scene “balls4”.

$N = 7382$, TPD: $N_{prim} = 263169$, $N_{hit} = 263169$, $N^{hit}_{prim} = 263169$, $N_{sec} = 179881$, $N^{hit}_{sec} = 134360$, $N_{shad} = 959197$, $N^{hit}_{shad} = 285156$, $T^{MIN}_{R}\,[s] = 5.94$, $T_{app}\,[s] = 5.57$, $T^{MIN}_{RSA}\,[s] = 0.37$.

Scene = “gears4”

Line

Minimum Testing Output ∆

Σ

Mnemonic Notation

Θ

NG

NE

NEE

NER

rITM

ÑTS

ÑETS

ÑEETS

TB

TR

ΘAPP

Θrat

ΘRUN

0 7647 29232 44931 10703 10703 6487 155 155 13258 10703 123517 64863 27471 28542 35245 17292 33368 13310 13592 13621 27471 27823 40823 67659 29857

0 7648 29233 44932 10704 10704 6488 156 156 13259 10704 123518 64864 27472 28543 35246 17293 33369 13311 13593 13622 27472 27824 40824 67660 29858

1 800 1942 619 254 254 366 0 0 399 254 5805 1509 964 1158 1133 493 646 451 794 823 964 1034 1429 3167 1773

9345 61644 88854 100913 34667 34667 24030 10146 10146 35902 34667 187329 137422 53681 57549 71987 42666 74810 35902 36039 36039 45855 53622 74910 125698 60050

– 29.81 45.42 72.90 9.85 9.85 11.92 148.88 148.88 8.41 9.85 6.09 8.29 7.55 7.14 7.33 8.22 7.71 8.41 8.37 8.37 7.37 7.59 7.33 8.49 12.78

0 23.49 63.23 95.02 19.73 19.73 24.96 10.84 10.84 21.02 19.73 23.81 21.23 21.85 23.89 22.37 21.16 22.05 21.02 21.06 21.06 21.86 21.92 23.29 27.24 31.51

0 3.42 12.48 18.89 3.04 3.04 4.43 1.99 1.99 3.19 3.04 3.62 3.23 3.29 3.57 3.35 3.19 3.29 3.19 3.19 3.19 3.31 3.30 3.54 4.02 4.57

0 0.89 3.38 0.76 0.55 0.55 2.12 0.00 0.00 0.64 0.55 0.76 0.58 0.68 1.00 0.69 0.64 0.65 0.65 0.69 0.69 0.68 0.68 0.49 0.71 0.95

0.01 0.41 1.66 2.50 1.60 1.61 0.86 1.03 0.91 1.63 1.60 4.55 3.06 1.99 2.87 2.51 1.85 3.00 1.83 1.98 1.86 5.18 3.43 133.51 329.59 108.55

11753.20 29.66 47.83 73.31 21.78 21.43 23.23 69.48 69.67 21.00 21.78 21.04 21.22 21.26 21.38 21.31 21.30 19.96 20.24 20.08 20.69 20.36 21.10 21.26 22.52 25.07

6.93 6.93 6.93 6.93 6.93 6.93 6.93 6.93 6.93 6.93 6.93 6.93 6.93 6.93 6.93 6.93 6.93 6.93 6.93 6.93 6.93 6.93 6.93 6.93 6.93 6.93

1.0 0.62 0.48 0.50 0.39 0.39 0.38 0.95 0.95 0.34 0.39 0.25 0.33 0.31 0.28 0.30 0.33 0.31 0.34 0.34 0.34 0.30 0.31 0.29 0.29 0.34

9794.33 17.78 32.92 54.16 11.22 10.93 12.43 50.97 51.12 10.57 11.22 10.60 10.75 10.78 10.88 10.82 10.82 9.70 9.93 9.80 10.31 10.03 10.65 10.78 11.83 13.96

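| Line | Mnemonic Notation | NG | NE | NEE | NER | rITM | ÑTS | ÑETS | ÑEETS | TB | TR | ΘAPP | Θrat | ΘRUN |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 26 | OSAH+PAR(atc) | 27471 | 27472 | 964 | 53681 | 2.83 | 15.09 | 2.33 | 0.31 | 2.26 | 5.79 | 6.76 | 0.62 | 3.22 |
| 27 | PARSAH+PAR(atc) | 30419 | 30420 | 1190 | 58126 | 2.57 | 14.88 | 2.25 | 0.35 | 2.72 | 5.90 | 6.76 | 0.65 | 3.41 |
| 28 | OSAH+PER(atc) | 27471 | 27472 | 964 | 53681 | 3.25 | 17.26 | 2.50 | 0.37 | 2.27 | 5.52 | 6.74 | 0.57 | 3.48 |
| 29 | PERSAH+PER(atc) | 32118 | 32119 | 1362 | 60436 | 2.90 | 17.38 | 2.48 | 0.52 | 180.33 | 5.46 | 6.74 | 0.55 | 3.37 |
| 30 | SPHSAH+PER(atc) | 33660 | 33661 | 3526 | 73892 | 6.10 | 23.71 | 3.51 | 0.98 | 8.70 | 6.18 | 6.74 | 0.56 | 4.70 |
| 31 | OSAH+SPH(atc) | 27471 | 27472 | 964 | 53681 | 3.24 | 17.03 | 2.47 | 0.36 | 2.24 | 5.87 | 6.89 | 0.59 | 3.40 |
| 32 | SPHSAH+SPH(atc) | 33660 | 33661 | 3526 | 73892 | 6.06 | 23.59 | 3.50 | 1.00 | 8.65 | 6.50 | 6.89 | 0.57 | 4.51 |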

33 34 35 36 37 38 39 40 41 42 43 44 45 46 47

OSAH+TAseq (16,2) OSAH+TAA rec (16,2) OSAH+TAB rec (16,2) OSAH+TASNL (16,2) OSAH+TANLT (16,2) OSAH+TAseq (18,2) OSAH+TAA rec (18,2) OSAH+TAB rec (18,2) OSAH+TASNL (18,2) OSAH+TANLT (18,2) OSAH+TAseq (atc) OSAH+TAA rec (atc) OSAH+TAB rec (atc) OSAH+TASNL (atc) OSAH+TANLT (atc)

10703 10703 10703 10703 52747 23394 23394 23394 23394 105680 27471 27471 27471 27471 120779

10704 10704 10704 10704 10704 23395 23395 23395 23395 23395 27472 27472 27472 27472 27472

254 254 254 254 254 371 371 371 371 371 964 964 964 964 964

34667 34667 34667 34667 34667 57919 57919 57919 57919 57919 53681 53681 53681 53681 53681

13.54 9.85 9.85 13.54 13.54 11.71 8.80 8.80 11.71 11.71 10.18 7.56 7.55 10.18 10.18

39.72 19.73 19.73 17.96 16.98 42.84 20.75 20.75 18.81 17.68 45.22 21.85 21.85 19.61 18.15

3.12 3.04 3.04 3.12 3.12 3.23 3.16 3.16 3.23 3.23 3.34 3.29 3.29 3.34 3.34

0.56 0.55 0.55 0.56 0.56 0.57 0.56 0.56 0.57 0.57 0.68 0.68 0.68 0.68 0.68

1.24 1.27 1.36 1.42 1.48 1.76 1.59 1.59 2.03 2.14 1.67 1.66 1.67 2.16 2.28

24.99 22.14 19.95 20.37 21.41 24.79 22.22 20.13 19.94 20.39 24.58 22.24 20.10 19.65 20.05

6.93 6.93 6.93 6.93 6.93 6.93 6.93 6.93 6.93 6.93 6.93 6.93 6.93 6.93 6.93

0.27 0.33 0.39 0.38 0.35 0.25 0.30 0.35 0.36 0.34 0.22 0.26 0.31 0.32 0.31

13.89 11.52 9.69 10.04 10.91 13.72 11.58 9.84 9.68 10.06 13.55 11.60 9.82 9.44 9.78

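| Line | Mnemonic Notation | NG | NE | NEE | NER | rITM | ÑTS | ÑETS | ÑEETS | TB | TR | ΘAPP | Θrat | ΘRUN |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 48 | BVH | 1166 | 6104 | 0 | 9345 | 123.90 | 131.42 | 95.25 | 0.00 | 1.76 | 1764.88 | – | – | – |
| 49 | O84 | 4033 | 28232 | 10996 | 111612 | 23.11 | 28.43 | 5.15 | 2.50 | 0.62 | 47.29 | – | – | – |
| 50 | O89 | 4033 | 28232 | 10996 | 111612 | 22.10 | 22.01 | 5.10 | 2.50 | 0.51 | 41.45 | – | – | – |
| 51 | BSP | 7647 | 7648 | 800 | 61644 | 29.81 | 23.49 | 3.42 | 0.89 | 1.54 | 93.96 | – | – | – |
| 52 | O93 | 4217 | 29520 | 12092 | 114496 | 18.78 | 17.73 | 6.62 | 4.45 | 0.56 | 46.64 | – | – | – |
| 53 | UG | 0 | 46284 | 35148 | 39226 | 22.86 | 12.15 | 12.15 | 8.96 | 0.44 | 38.07 | – | – | – |
| 54 | AG | 2374 | 64448 | 5992 | 126321 | 14.90 | 5.14 | 3.78 | 0.25 | 5.12 | 48.29 | – | – | – |
| 55 | HUG | 25 | 27984 | 17304 | 50030 | 22.89 | 11.36 | 6.37 | 3.26 | 0.89 | 91.52 | – | – | – |
| 56 | RG | 8503 | 117394 | 28116 | 512128 | 23.41 | 11.62 | 9.04 | 5.52 | 1.31 | 50.42 | – | – | – |
| 57 | O84A | 6937 | 48560 | 11408 | 148456 | 13.69 | 33.34 | 6.38 | 3.64 | 2.32 | 45.75 | – | – | – |
| 58 | KD | 10703 | 10704 | 254 | 34667 | 9.85 | 19.73 | 3.04 | 0.55 | 2.38 | 36.24 | – | – | – |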

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25

naïve RSA spatmed-xyz(16,2) objmed-xyz(16,2) objmed(16,2) OSAH(16,2) OSAH-RMI(16,2) OSAH-xyz(16,2) OSAH(8,1) OSAH(8,2) OSAH(16,1) OSAH(16,2) OSAH(24,1) OSAH(24,2) OSAH(atc) OSAH2(atc) OSAH+LC(atc) OSAH+TPC(atc) OSAH+TPC+LC(atc) OSAH+LC(16,1) OSAH+TPC(16,1) OSAH+TPC+LC(16,1) OSAH+PR(atc) OSAH+SC(atc) OSAH+GCM(atc) OSAH+GCM2(atc) OSAH+GCM3(atc)

26 27 28 29 30 31 32

Table 12: Experimental results for scene “gears4”.

$N = 9345$, TPD: $N_{prim} = 263169$, $N_{hit} = 263169$, $N^{hit}_{prim} = 332$, $N_{sec} = 181015$, $N^{hit}_{sec} = 122023$, $N_{shad} = 1388067$, $N^{hit}_{shad} = 356484$, $T^{MIN}_{R}\,[s] = 9.52$, $T_{app}\,[s] = 8.32$, $T^{MIN}_{RSA}\,[s] = 1.20$.

Scene = “jacks4”

Line

Minimum Testing Output ∆

Σ

Mnemonic Notation

Θ

NG

NE

NEE

NER

rITM

ÑTS

ÑETS

ÑEETS

TB

TR

ΘAPP

Θrat

ΘRUN

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25

naïve RSA spatmed-xyz(16,2) objmed-xyz(16,2) objmed(16,2) OSAH(16,2) OSAH-RMI(16,2) OSAH-xyz(16,2) OSAH(8,1) OSAH(8,2) OSAH(16,1) OSAH(16,2) OSAH(24,1) OSAH(24,2) OSAH(atc) OSAH2(atc) OSAH+LC(atc) OSAH+TPC(atc) OSAH+TPC+LC(atc) OSAH+LC(16,1) OSAH+TPC(16,1) OSAH+TPC+LC(16,1) OSAH+PR(atc) OSAH+SC(atc) OSAH+GCM(atc) OSAH+GCM2(atc) OSAH+GCM3(atc)

0 20041 32567 29962 13793 13793 14392 239 227 20374 13793 92622 46495 17878 27834 28499 10982 25060 20374 20419 20419 17878 20021 32010 71773 47308

0 20042 32568 29963 13794 13794 14393 240 228 20375 13794 92623 46496 17879 27835 28500 10983 25061 20375 20420 20420 17879 20022 32011 71774 47309

1 1042 1664 645 1343 1343 1384 30 18 4485 1343 7933 1731 4883 3456 4883 2929 2939 4485 4506 4506 4883 11476 6011 1265 3604

5265 60390 70776 64557 28098 28098 29831 7969 7969 31264 28098 133860 94355 25902 39802 52653 20034 54064 31264 31306 31306 15941 14186 45479 130911 79081

– 36.75 44.85 52.25 24.62 24.62 24.93 180.92 181.64 19.68 24.62 18.63 24.93 21.04 20.86 21.83 24.15 24.90 19.68 19.66 19.66 14.82 8.90 18.83 24.07 24.60

0 40.56 58.12 60.54 36.71 36.71 36.91 15.09 14.72 41.58 36.71 53.91 43.23 39.51 45.75 43.52 35.72 41.80 41.58 41.57 41.57 39.71 40.38 45.63 53.33 53.50

0 7.49 11.09 11.98 6.70 6.70 6.73 3.00 2.91 7.58 6.70 9.64 7.80 7.20 8.06 7.92 6.52 7.61 7.58 7.56 7.56 7.24 7.36 8.22 9.38 9.08

0 1.47 1.42 0.24 1.42 1.42 1.56 0.71 0.45 3.05 1.42 3.36 1.46 3.08 2.78 3.08 2.71 2.71 3.05 3.04 3.04 3.08 4.78 3.11 1.98 2.57

0.01 0.36 1.12 1.62 1.25 1.24 0.76 0.61 0.61 1.38 1.25 3.25 2.18 1.33 2.09 1.78 1.27 2.06 1.63 1.68 1.71 2.25 2.56 89.81 201.24 140.55

2844.89 17.95 22.03 24.66 14.73 14.45 14.45 38.40 38.38 13.85 14.73 15.82 15.85 13.70 14.78 14.79 13.90 14.94 13.56 13.59 13.60 11.98 10.32 14.22 17.24 16.74

8.56 8.56 8.56 8.56 8.56 8.56 8.56 8.56 8.56 8.56 8.56 8.56 8.56 8.56 8.56 8.56 8.56 8.56 8.56 8.56 8.56 8.56 8.56 8.56 8.56 8.56

1.0 0.65 0.61 0.64 0.58 0.58 0.58 0.96 0.96 0.49 0.58 0.41 0.54 0.52 0.48 0.50 0.58 0.55 0.49 0.49 0.49 0.43 0.31 0.46 0.48 0.48

8890.28 47.53 60.28 68.50 37.47 36.59 36.59 111.44 111.38 34.72 37.47 40.87 40.97 34.25 37.62 37.66 34.88 38.12 33.81 33.91 33.94 28.88 23.69 35.88 45.31 43.75

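| Line | Mnemonic Notation | NG | NE | NEE | NER | rITM | ÑTS | ÑETS | ÑEETS | TB | TR | ΘAPP | Θrat | ΘRUN |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 26 | OSAH+PAR(atc) | 17878 | 17879 | 4883 | 25902 | 11.14 | 24.19 | 4.71 | 2.86 | 1.54 | 2.64 | 7.17 | 0.40 | 14.83 |
| 27 | PARSAH+PAR(atc) | 35331 | 35332 | 3093 | 182417 | 7.22 | 8.15 | 0.59 | 0.16 | 4.64 | 2.12 | 7.17 | 0.71 | 10.50 |
| 28 | OSAH+PER(atc) | 17878 | 17879 | 4883 | 25902 | 12.09 | 28.95 | 5.58 | 3.48 | 1.64 | 3.22 | 6.79 | 0.43 | 16.21 |
| 29 | PERSAH+PER(atc) | 18430 | 18431 | 4913 | 32776 | 8.27 | 18.73 | 3.30 | 2.04 | 322.41 | 2.65 | 6.79 | 0.51 | 12.14 |
| 30 | SPHSAH+PER(atc) | 40984 | 40985 | 7339 | 134683 | 46.45 | 32.21 | 6.21 | 2.93 | 21.44 | 6.12 | 6.79 | 0.72 | 36.93 |
| 31 | OSAH+SPH(atc) | 17878 | 17879 | 4883 | 25902 | 12.06 | 28.79 | 5.55 | 3.45 | 1.54 | 3.53 | 7.84 | 0.37 | 10.74 |
| 32 | SPHSAH+SPH(atc) | 40984 | 40985 | 7339 | 134683 | 46.27 | 32.02 | 6.17 | 2.90 | 21.43 | 6.32 | 7.84 | 0.71 | 25.42 |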

33 34 35 36 37 38 39 40 41 42 43 44 45 46 47

OSAH+TAseq (16,2) OSAH+TAA rec (16,2) OSAH+TAB rec (16,2) OSAH+TASNL (16,2) OSAH+TANLT (16,2) OSAH+TAseq (18,2) OSAH+TAA rec (18,2) OSAH+TAB rec (18,2) OSAH+TASNL (18,2) OSAH+TANLT (18,2) OSAH+TAseq (atc) OSAH+TAA rec (atc) OSAH+TAB rec (atc) OSAH+TASNL (atc) OSAH+TANLT (atc)

13793 13793 13793 13793 107720 22195 22195 22195 22195 158186 17878 17878 17878 17878 135333

13794 13794 13794 13794 13794 22196 22196 22196 22196 22196 17879 17879 17879 17879 17879

1343 1343 1343 1343 1343 1694 1694 1694 1694 1694 4883 4883 4883 4883 4883

28098 28098 28098 28098 28098 42201 42201 42201 42201 42201 25902 25902 25902 25902 25902

25.22 24.62 24.62 25.23 25.23 24.82 24.21 24.21 24.83 24.83 21.67 21.04 21.04 21.67 21.67

98.26 36.71 36.71 29.26 24.12 111.17 39.72 39.72 31.59 25.93 107.51 39.51 39.51 31.43 25.92

6.81 6.70 6.70 6.81 6.81 7.35 7.22 7.22 7.35 7.35 7.30 7.20 7.20 7.30 7.30

1.42 1.42 1.42 1.42 1.42 1.46 1.46 1.46 1.46 1.46 3.08 3.08 3.08 3.08 3.08

1.07 1.04 1.04 1.29 1.38 1.30 1.23 1.25 1.61 1.76 1.17 1.14 1.13 1.42 1.55

18.64 15.10 13.80 15.55 15.40 19.74 15.66 14.23 16.06 15.85 18.38 14.50 13.05 14.77 14.71

8.56 8.56 8.56 8.56 8.56 8.56 8.56 8.56 8.56 8.56 8.56 8.56 8.56 8.56 8.56

0.40 0.52 0.58 0.50 0.51 0.37 0.49 0.55 0.47 0.48 0.34 0.46 0.52 0.45 0.45

49.69 38.62 34.56 40.03 39.56 53.12 40.38 35.91 41.62 40.97 48.87 36.75 32.22 37.59 37.41

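| Line | Mnemonic Notation | NG | NE | NEE | NER | rITM | ÑTS | ÑETS | ÑEETS | TB | TR | ΘAPP | Θrat | ΘRUN |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 48 | BVH | 580 | 3387 | 0 | 5265 | 187.42 | 145.28 | 108.17 | 0.00 | 0.82 | 647.28 | – | – | – |
| 49 | O84 | 10071 | 70498 | 8366 | 145262 | 33.89 | 64.27 | 10.24 | 2.92 | 0.51 | 31.37 | – | – | – |
| 50 | O89 | 10071 | 70498 | 8366 | 145262 | 33.82 | 41.11 | 10.23 | 2.92 | 0.49 | 25.86 | – | – | – |
| 51 | BSP | 20041 | 20042 | 1042 | 60390 | 36.75 | 40.56 | 7.49 | 1.47 | 0.63 | 25.49 | – | – | – |
| 52 | O93 | 10071 | 70498 | 8366 | 145262 | 32.80 | 39.01 | 16.72 | 9.59 | 0.46 | 31.06 | – | – | – |
| 53 | UG | 0 | 25792 | 16345 | 36966 | 38.00 | 11.86 | 11.86 | 7.27 | 0.32 | 19.82 | – | – | – |
| 54 | AG | 360 | 19909 | 7652 | 37576 | 51.68 | 19.93 | 17.02 | 7.61 | 3.91 | 39.96 | – | – | – |
| 55 | HUG | 131 | 16092 | 9937 | 21771 | 59.71 | 20.02 | 15.43 | 7.92 | 0.36 | 38.66 | – | – | – |
| 56 | RG | 1627 | 24050 | 3588 | 85556 | 51.46 | 12.85 | 9.60 | 3.73 | 0.28 | 29.39 | – | – | – |
| 57 | O84A | 12833 | 89832 | 7534 | 184006 | 31.29 | 65.26 | 10.54 | 3.10 | 1.83 | 31.04 | – | – | – |
| 58 | KD | 13793 | 13794 | 1343 | 28098 | 24.62 | 36.71 | 6.70 | 1.42 | 2.15 | 20.22 | – | – | – |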

Table 13: Experimental results for scene “jacks4”.

$N = 5265$, TPD: $N_{prim} = 263169$, $N_{hit} = 220685$, $N^{hit}_{prim} = 98411$, $N_{sec} = 206471$, $N^{hit}_{sec} = 123014$, $N_{shad} = 162163$, $N^{hit}_{shad} = 73920$, $T^{MIN}_{R}\,[s] = 3.06$, $T_{app}\,[s] = 2.74$, $T^{MIN}_{RSA}\,[s] = 0.32$.

Scene = “lattice12”

Line

Minimum Testing Output ∆

Σ

Mnemonic Notation

Θ

NG

NE

NEE

NER

rITM

ÑTS

TB

TR

ΘAPP

Θrat

ΘRUN

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25

naïve RSA spatmed-xyz(16,2) objmed-xyz(16,2) objmed(16,2) OSAH(16,2) OSAH-RMI(16,2) OSAH-xyz(16,2) OSAH(8,1) OSAH(8,2) OSAH(16,1) OSAH(16,2) OSAH(24,1) OSAH(24,2) OSAH(atc) OSAH2(atc) OSAH+LC(atc) OSAH+TPC(atc) OSAH+TPC+LC(atc) OSAH+LC(16,1) OSAH+TPC(16,1) OSAH+TPC+LC(16,1) OSAH+PR(atc) OSAH+SC(atc) OSAH+GCM(atc) OSAH+GCM2(atc) OSAH+GCM3(atc)

0 22015 18238 14529 15225 15225 14185 255 255 32452 15225 61634 15645 42624 41769 43079 32549 36455 32557 32756 32756 42624 43552 43635 52846 44736

0 22016 18239 14530 15226 15226 14186 256 256 32453 15226 61635 15646 42625 41770 43080 32550 36456 32558 32757 32757 42625 43553 43636 52847 44737

1 464 2462 0 2044 2044 981 0 0 5155 2044 5155 2044 5155 4139 5155 4961 4991 5197 5245 5245 5155 5152 5404 3776 3506

8281 49328 31902 27981 25350 25350 25856 11103 11103 39466 25350 68648 25770 49638 49812 50093 39757 43782 39529 39680 39680 49638 50569 50400 61268 53405

– 15.01 11.65 14.05 9.79 9.79 10.91 103.39 103.39 5.49 9.79 4.83 9.78 5.13 5.69 5.12 5.79 5.38 5.36 5.33 5.33 5.13 5.06 4.77 6.06 7.08

0 38.15 36.15 35.67 34.14 34.14 33.35 12.23 12.23 41.93 34.14 43.32 34.16 42.76 42.55 42.78 40.83 41.74 41.83 41.83 41.83 42.76 42.85 43.11 45.04 44.19

0 6.44 6.00 5.96 5.71 5.71 5.51 2.07 2.07 6.88 5.71 6.95 5.71 6.93 6.86 6.93 6.52 6.73 6.79 6.77 6.77 6.93 6.94 6.82 7.23 6.96

0 0.20 1.01 0.00 1.33 1.33 0.75 0.00 0.00 3.04 1.33 3.04 1.33 3.04 2.53 3.04 3.19 3.01 3.12 3.12 3.12 3.04 3.08 3.22 2.41 1.69

0.02 0.43 0.88 1.26 1.26 1.26 0.82 0.72 0.71 1.63 1.26 2.28 1.26 1.84 2.53 2.02 1.77 2.34 2.04 1.97 2.08 2.98 5.17 111.58 134.96 114.56

10596.70 38.28 31.56 34.89 31.45 30.80 31.06 120.54 120.53 29.35 31.45 28.66 30.71 28.86 29.60 29.41 28.94 27.40 27.95 27.77 27.86 28.89 29.65 28.45 29.51 30.33

8.72 8.72 8.72 8.72 8.72 8.72 8.72 8.72 8.72 8.72 8.72 8.72 8.72 8.72 8.72 8.72 8.72 8.72 8.72 8.72 8.72 8.72 8.72 8.72 8.72 8.72

1.0 0.59 0.54 0.59 0.51 0.51 0.54 0.97 0.97 0.32 0.51 0.29 0.51 0.30 0.33 0.30 0.34 0.32 0.32 0.32 0.32 0.30 0.30 0.29 0.33 0.37

9057.01 24.00 18.26 21.10 18.16 17.61 17.83 94.31 94.30 16.37 18.16 15.78 17.53 15.95 16.58 16.42 16.02 14.70 15.17 15.02 15.09 15.97 16.62 15.60 16.50 17.21

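| Line | Mnemonic Notation | NG | NE | NEE | NER | rITM | ÑTS | ÑETS | ÑEETS | TB | TR | ΘAPP | Θrat | ΘRUN |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 26 | OSAH+PAR(atc) | 42624 | 42625 | 5155 | 49638 | 18.91 | 82.86 | 15.63 | 9.11 | 2.15 | 6.21 | 8.63 | 0.31 | 24.05 |
| 27 | PARSAH+PAR(atc) | 22812 | 22813 | 8645 | 26248 | 20.02 | 44.07 | 10.63 | 5.19 | 2.08 | 5.17 | 8.63 | 0.53 | 18.58 |
| 28 | OSAH+PER(atc) | 42624 | 42625 | 5155 | 49638 | 5.36 | 54.13 | 10.22 | 5.16 | 2.16 | 7.27 | 7.98 | 0.42 | 7.49 |
| 29 | PERSAH+PER(atc) | 14633 | 14634 | 3037 | 22943 | 316.69 | 17.46 | 3.19 | 0.33 | 65.51 | 66.80 | 7.98 | 0.99 | 134.15 |
| 30 | SPHSAH+PER(atc) | 24185 | 24186 | 5611 | 30861 | 61.92 | 45.95 | 8.12 | 3.43 | 5.12 | 23.69 | 7.98 | 0.91 | 42.43 |
| 31 | OSAH+SPH(atc) | 42624 | 42625 | 5155 | 49638 | 5.35 | 53.96 | 10.17 | 5.12 | 2.11 | 7.69 | 9.16 | 0.44 | 8.32 |
| 32 | SPHSAH+SPH(atc) | 24185 | 24186 | 5611 | 30861 | 59.31 | 45.80 | 8.06 | 3.40 | 4.91 | 23.48 | 9.16 | 0.91 | 44.20 |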

33 34 35 36 37 38 39 40 41 42 43 44 45 46 47

OSAH+TAseq (16,2) OSAH+TAA rec (16,2) OSAH+TAB rec (16,2) OSAH+TASNL (16,2) OSAH+TANLT (16,2) OSAH+TAseq (18,2) OSAH+TAA rec (18,2) OSAH+TAB rec (18,2) OSAH+TASNL (18,2) OSAH+TANLT (18,2) OSAH+TAseq (atc) OSAH+TAA rec (atc) OSAH+TAB rec (atc) OSAH+TASNL (atc) OSAH+TANLT (atc)

15233 15233 15233 15233 58797 15641 15641 15641 15641 60837 42904 42904 42904 42904 178081

15234 15234 15234 15234 15234 15642 15642 15642 15642 15642 42905 42905 42905 42905 42905

2053 2053 2053 2053 2053 2053 2053 2053 2053 2053 5197 5197 5197 5197 5197

25349 25349 25349 25349 25349 25757 25757 25757 25757 25757 49876 49876 49876 49876 49876

9.84 9.63 9.63 9.84 9.84 9.83 9.62 9.62 9.83 9.83 5.09 4.94 4.94 5.09 5.09

85.88 33.90 33.90 22.39 21.18 85.89 33.90 33.90 22.40 21.18 113.84 42.78 42.78 27.37 25.26

5.70 5.58 5.58 5.70 5.70 5.70 5.58 5.58 5.70 5.70 6.94 6.85 6.85 6.94 6.94

1.37 1.35 1.35 1.37 1.37 1.37 1.35 1.35 1.37 1.37 3.12 3.12 3.12 3.12 3.12

1.08 1.06 1.07 1.31 1.38 1.09 1.09 1.09 1.37 1.39 1.58 1.55 1.58 2.35 2.42

38.27 31.65 28.68 29.58 29.90 38.56 31.81 28.66 29.59 29.73 39.53 30.99 27.03 26.86 26.69

8.72 8.72 8.72 8.72 8.72 8.72 8.72 8.72 8.72 8.72 8.72 8.72 8.72 8.72 8.72

0.34 0.44 0.51 0.49 0.48 0.33 0.44 0.51 0.49 0.48 0.17 0.24 0.30 0.30 0.31

23.99 18.33 15.79 16.56 16.84 24.24 18.47 15.78 16.57 16.69 25.07 17.77 14.38 14.24 14.09

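| Line | Mnemonic Notation | NG | NE | NEE | NER | rITM | ÑTS | ÑETS | ÑEETS | TB | TR | ΘAPP | Θrat | ΘRUN |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 48 | BVH | 810 | 5374 | 0 | 8281 | 123.84 | 155.40 | 119.25 | 0.00 | 3.02 | 1908.83 | – | – | – |
| 49 | O84 | 8777 | 61440 | 7744 | 88184 | 11.58 | 51.96 | 8.44 | 1.71 | 0.53 | 59.89 | – | – | – |
| 50 | O89 | 8777 | 61440 | 7744 | 88184 | 11.52 | 33.60 | 8.41 | 1.71 | 0.48 | 48.88 | – | – | – |
| 51 | BSP | 22015 | 22016 | 464 | 49328 | 15.01 | 38.15 | 6.44 | 0.20 | 1.82 | 68.72 | – | – | – |
| 52 | O93 | 8777 | 61440 | 7744 | 88184 | 10.40 | 31.98 | 13.17 | 7.03 | 0.49 | 61.09 | – | – | – |
| 53 | UG | 0 | 42875 | 9616 | 61003 | 13.55 | 8.46 | 8.46 | 1.37 | 0.63 | 39.95 | – | – | – |
| 54 | AG | 3614 | 4614 | 0 | 14548 | 66.28 | 82.96 | 62.19 | 0.00 | 2.16 | 1005.04 | – | – | – |
| 55 | HUG | 1 | 12167 | 0 | 61535 | 35.88 | 6.83 | 5.83 | 0.00 | 0.32 | 85.77 | – | – | – |
| 56 | RG | 289 | 11568 | 0 | 61156 | 36.97 | 6.09 | 5.00 | 0.00 | 0.17 | 75.90 | – | – | – |
| 57 | O84A | 7441 | 52088 | 6928 | 66424 | 9.24 | 48.75 | 8.01 | 2.58 | 1.48 | 54.82 | – | – | – |
| 58 | KD | 15225 | 15226 | 2044 | 25350 | 9.79 | 34.14 | 5.71 | 1.33 | 3.97 | 52.88 | – | – | – |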

ÑETS

ÑEETS

Table 14: Experimental results for scene “lattice12”.

$N = 8281$, TPD: $N_{prim} = 263169$, $N_{hit} = 263169$, $N^{hit}_{prim} = 261170$, $N_{sec} = 243215$, $N^{hit}_{sec} = 178786$, $N_{shad} = 1180750$, $N^{hit}_{shad} = 943153$, $T^{MIN}_{R}\,[s] = 11.37$, $T_{app}\,[s] = 10.20$, $T^{MIN}_{RSA}\,[s] = 1.17$.

Appendix E

Scene = “mount6”

Line

Minimum Testing Output ∆

Σ

Mnemonic Notation

Θ

NG

NE

NEE

NER

rIT M

N˜ T S

TB

TR

ΘAPP

Θrat

ΘRUN

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25

naïve RSA spatmed-xyz(16,2) objmed-xyz(16,2) objmed(16,2) OSAH(16,2) OSAH-RMI(16,2) OSAH-xyz(16,2) OSAH(8,1) OSAH(8,2) OSAH(16,1) OSAH(16,2) OSAH(24,1) OSAH(24,2) OSAH(atc) OSAH2(atc) OSAH+LC(atc) OSAH+TPC(atc) OSAH+TPC+LC(atc) OSAH+LC(16,1) OSAH+TPC(16,1) OSAH+TPC+LC(16,1) OSAH+PR(atc) OSAH+SC(atc) OSAH+GCM(atc) OSAH+GCM2(atc) OSAH+GCM3(atc)

0 8837 12089 4414 6701 6701 8003 194 194 10102 6701 16495 7005 13191 16578 13203 12996 13028 10104 10094 10096 13191 14301 15020 20943 15240

0 8838 12090 4415 6702 6702 8004 195 195 10103 6702 16496 7006 13192 16579 13204 12997 13029 10105 10095 10097 13192 14302 15021 20944 15241

1 1989 1398 17 1302 1302 1848 43 43 3866 1302 6633 1311 5649 5795 5651 5562 5564 3868 3838 3840 5649 6193 6460 6166 5296

8196 43940 23189 8754 10382 10382 13451 9016 9016 11210 10382 14567 10409 12237 16317 12258 12137 12176 11210 11251 11251 11919 12837 13237 22986 16054

– 21.34 34.04 63.15 7.49 7.49 8.67 94.45 94.45 6.81 7.49 5.67 6.90 6.57 6.75 5.99 7.48 7.17 6.81 6.81 6.81 6.55 6.15 6.53 7.25 7.38

0 27.24 53.05 98.72 22.07 22.07 24.30 13.11 13.11 22.90 22.07 24.82 22.81 20.73 24.75 22.19 19.42 21.70 22.90 22.89 22.89 20.73 20.68 21.20 27.75 28.56

0 5.13 9.51 17.55 3.90 3.90 4.20 2.48 2.48 3.97 3.90 4.22 3.97 3.71 4.27 3.87 3.52 3.88 3.97 3.97 3.97 3.72 3.72 3.79 4.73 4.63

0 1.41 1.37 0.05 1.45 1.45 1.48 0.80 0.80 1.59 1.45 1.73 1.45 1.69 1.66 1.69 1.69 1.69 1.59 1.59 1.59 1.70 1.73 1.72 1.40 1.86

0.01 0.33 0.80 0.93 1.07 1.07 0.71 0.74 0.77 1.14 1.07 1.27 1.05 1.17 1.69 1.19 1.19 1.38 1.28 1.29 1.29 1.25 1.76 45.38 66.00 47.69

6971.35 18.72 25.87 46.40 13.73 13.27 13.66 35.12 34.96 13.50 13.73 13.57 13.64 12.74 13.82 13.05 12.66 12.26 12.52 12.37 12.41 12.74 12.71 12.51 14.07 14.05

16.03 16.03 16.03 16.03 16.03 16.03 16.03 16.03 16.03 16.03 16.03 16.03 16.03 16.03 16.03 16.03 16.03 16.03 16.03 16.03 16.03 16.03 16.03 16.03 16.03 16.03

1.0 0.38 0.34 0.34 0.21 0.21 0.22 0.85 0.85 0.19 0.21 0.15 0.19 0.20 0.18 0.18 0.23 0.21 0.19 0.19 0.19 0.20 0.19 0.20 0.17 0.17

20503.97 39.03 60.06 120.44 24.35 23.00 24.15 87.26 86.79 23.68 24.35 23.88 24.09 21.44 24.62 22.35 21.21 20.03 20.79 20.35 20.47 21.44 21.35 20.76 25.35 25.29

26 27 28 29 30 31 32

OSAH+PAR(atc) PARSAH+PAR(atc) OSAH+PER(atc) PERSAH+PER(atc) SPHSAH+PER(atc) OSAH+SPH(atc) SPHSAH+SPH(atc)

13191 13186 13191 13281 13191 13191 13191

13192 13187 13192 13282 13192 13192 13192

5649 5718 5649 5612 5649 5649 5649

12237 12127 12237 12503 12237 12237 12237

4.71 4.65 4.51 6.19 4.51 4.50 4.50

28.48 28.55 30.94 30.09 30.94 30.36 30.36

5.51 5.50 6.09 6.07 6.09 5.97 5.97

4.08 4.10 3.96 4.00 3.96 3.89 3.89

1.49 1.41 1.55 61.17 2.06 1.56 1.95

2.28 2.27 2.77 2.84 2.83 3.06 3.06

13.83 13.83 12.67 12.67 12.67 13.18 13.18

0.13 0.12 0.16 0.21 0.19 0.16 0.16

24.17 24.00 18.11 18.89 18.78 14.64 14.64

33 34 35 36 37 38 39 40 41 42 43 44 45 46 47

OSAH+TAseq (16,2) OSAH+TAA rec (16,2) OSAH+TAB rec (16,2) OSAH+TASNL (16,2) OSAH+TANLT (16,2) OSAH+TAseq (18,2) OSAH+TAA rec (18,2) OSAH+TAB rec (18,2) OSAH+TASNL (18,2) OSAH+TANLT (18,2) OSAH+TAseq (atc) OSAH+TAA rec (atc) OSAH+TAB rec (atc) OSAH+TASNL (atc) OSAH+TANLT (atc)

6701 6701 6701 6701 32628 7000 7000 7000 7000 33397 13191 13191 13191 13191 58078

6702 6702 6702 6702 6702 7001 7001 7001 7001 7001 13192 13192 13192 13192 13192

1302 1302 1302 1302 1302 1311 1311 1311 1311 1311 5649 5649 5649 5649 5649

10382 10382 10382 10382 10382 10397 10397 10397 10397 10397 12237 12237 12237 12237 12237

8.01 7.49 7.49 8.01 8.01 7.47 6.98 6.98 7.47 7.47 7.02 6.57 6.57 7.02 7.02

44.39 22.07 22.07 15.53 13.92 45.91 22.70 22.70 15.90 14.21 41.74 20.73 20.73 14.68 13.23

3.90 3.90 3.90 3.90 3.90 3.95 3.95 3.95 3.96 3.96 3.72 3.72 3.71 3.72 3.72

1.45 1.45 1.45 1.45 1.45 1.45 1.45 1.45 1.45 1.45 1.69 1.69 1.69 1.69 1.69

0.84 0.83 0.83 0.94 0.98 0.85 0.84 0.85 0.95 0.99 0.96 0.93 0.93 1.16 1.24

16.17 13.73 12.49 11.44 11.81 16.20 13.77 12.44 11.35 11.35 15.48 12.96 11.89 10.95 10.94

16.03 16.03 16.03 16.03 16.03 16.03 16.03 16.03 16.03 16.03 16.03 16.03 16.03 16.03 16.03

0.14 0.18 0.21 0.25 0.23 0.13 0.17 0.20 0.24 0.24 0.13 0.17 0.20 0.23 0.23

31.53 24.35 20.71 17.62 18.71 31.62 24.47 20.56 17.35 17.35 29.50 22.09 18.94 16.18 16.15

48 49 50 51 52 53 54 55 56 57 58

BVH O84 O89 BSP O93 UG AG HUG RG O84A KD

1058 4358 4358 8837 4358 0 3342 1 10035 7352 6701

5398 30507 30507 8838 30507 41650 91277 512 136153 51465 6702

0 10649 10649 1989 10649 36218 34893 310 16704 11955 1302

8196 85916 85916 43940 85916 37369 217830 14641 713095 145308 10382

170.39 17.68 17.75 21.34 16.91 24.23 29.47 131.92 26.98 13.32 7.49

175.62 34.39 24.43 27.24 23.46 14.90 21.48 4.92 12.47 33.32 22.07

124.24 6.37 6.37 5.13 9.65 14.90 18.96 3.93 10.30 6.22 3.90

0.00 2.27 2.27 1.41 5.71 9.53 12.50 1.31 5.22 2.69 1.45

0.97 0.46 0.44 0.50 0.42 0.40 3.76 0.15 1.67 2.19 1.48

1583.37 32.36 28.15 39.38 33.95 25.97 46.60 72.35 31.80 29.95 21.06

– – – – – – – – – – –

– – – – – – – – – – –

– – – – – – – – – – –

N˜ ET S

N˜ EET S

Table 15: Experimental results for scene “mount6”.

N = 8196, hit hit TPD: N prim = 263169, Nhit = 257871, N prim = 173685, Nsec = 707764, Nsec = 472358, hit MIN MIN Nshad = 361043, Nshad = 30032, TR [s] = 5.79, Tapp [s] = 5.45, TRSA [s] = 0.34.
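The timing columns of Table 15 can be cross-checked against the summary line above. The sketch below assumes two relations that are inferred here from the tabulated numbers rather than quoted from the definitions in the thesis: that ΘAPP is the ratio Tapp / TRSA, and that ΘRUN for a given table line is (TR − Tapp) / TRSA. The function names are hypothetical; the numeric values are copied from the table for scene “mount6” (Tapp = 5.45 s, TRSA = 0.34 s, tabulated ΘAPP = 16.03).

```python
# Consistency check for Table 15 (scene "mount6").
# ASSUMPTION: the two ratios below are inferred from the tabulated values,
# not taken from the thesis definitions.

T_APP = 5.45  # Tapp [s], from the summary line under the table
T_RSA = 0.34  # TRSA [s], from the same line

# (mnemonic, TR [s], tabulated ΘRUN) copied from two lines of the table
ROWS = [
    ("spatmed-xyz(16,2)", 18.72, 39.03),
    ("OSAH(16,2)", 13.73, 24.35),
]


def theta_app(t_app: float, t_rsa: float) -> float:
    """Assumed relation: application time relative to TRSA."""
    return t_app / t_rsa


def theta_run(t_r: float, t_app: float, t_rsa: float) -> float:
    """Assumed relation: ray shooting part of TR relative to TRSA."""
    return (t_r - t_app) / t_rsa


if __name__ == "__main__":
    print("ΘAPP:", round(theta_app(T_APP, T_RSA), 2))
    for name, t_r, tabulated in ROWS:
        computed = round(theta_run(t_r, T_APP, T_RSA), 2)
        print(name, "computed ΘRUN:", computed, "tabulated:", tabulated)
```

Under these assumptions the computed ratios agree with the tabulated ΘAPP and ΘRUN values to two decimal places for the lines shown. The naïve RSA line (line 0) does not follow the second relation and is left out of the check.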

Scene = “rings7”

Line

Minimum Testing Output ∆

Σ

Mnemonic Notation

Θ

NG

NE

NEE

NER

rIT M

N˜ T S

N˜ ET S

N˜ EET S

TB

TR

ΘAPP

Θrat

ΘRUN

0 12403 46174 48671 11835 11835 12417 182 182 13332 11835 195354 111055 14913 34413 29750 9466 31616 13348 13349 13351 14913 16196 28565 71878 29923

0 12404 46175 48672 11836 11836 12418 183 183 13333 11836 195355 111056 14914 34414 29751 9467 31617 13349 13350 13352 14914 16197 28566 71879 29924

1 1196 4973 4085 1600 1600 2063 32 32 2323 1600 16920 4959 2525 3882 2527 1591 1596 2328 2345 2347 2525 5846 5726 2372 3578

8401 63348 121417 120391 32785 32785 34704 9506 9506 33401 32785 303710 231712 35825 66152 82883 28073 97448 33432 33412 33412 27424 24791 55492 183905 67553

– 42.38 87.28 83.20 19.51 19.51 21.13 221.71 221.71 19.09 19.51 16.98 17.12 19.23 17.07 19.34 22.04 22.68 19.13 19.12 19.12 16.79 12.41 15.86 21.48 125.24

0 51.03 106.86 109.24 36.59 36.59 39.47 16.75 16.75 37.70 36.59 55.97 46.91 37.39 49.77 41.83 35.08 42.03 37.71 37.71 37.71 37.53 38.82 43.31 53.92 47.34

0 9.56 20.87 23.20 6.47 6.47 7.21 3.54 3.54 6.74 6.47 9.96 8.11 6.67 8.70 7.34 6.29 7.39 6.74 6.74 6.74 6.71 6.97 7.73 9.35 7.19

0 3.87 4.66 6.46 2.63 2.63 3.07 1.24 1.24 2.92 2.63 3.50 2.79 2.90 3.50 2.90 2.75 2.75 2.92 2.92 2.92 2.90 4.00 3.27 2.78 2.80

0.02 0.42 1.84 2.69 1.58 1.58 0.99 0.83 0.82 1.60 1.58 6.36 4.58 1.72 2.96 2.34 1.63 2.91 1.84 1.90 1.90 2.82 3.13 87.98 233.09 98.25

15987.20 71.81 141.02 137.09 42.62 42.19 44.81 196.30 196.74 41.94 42.62 47.68 44.91 42.05 45.25 44.56 44.28 46.84 40.78 41.12 40.81 39.08 34.36 40.69 52.27 144.65

5.33 5.33 5.33 5.33 5.33 5.33 5.33 5.33 5.33 5.33 5.33 5.33 5.33 5.33 5.33 5.33 5.33 5.33 5.33 5.33 5.33 5.33 5.33 5.33 5.33 5.33

1.0 0.72 0.72 0.70 0.62 0.62 0.63 0.98 0.98 0.61 0.62 0.49 0.53 0.62 0.52 0.59 0.66 0.63 0.61 0.61 0.61 0.58 0.50 0.53 0.55 0.89

10517.89 41.91 87.45 84.86 22.71 22.43 24.15 123.82 124.11 22.26 22.71 26.04 24.22 22.34 24.44 23.99 23.80 25.49 21.50 21.72 21.52 20.38 17.28 21.44 29.06 89.84

OSAH+PAR(atc) PARSAH+PAR(atc) OSAH+PER(atc) PERSAH+PER(atc) SPHSAH+PER(atc) OSAH+SPH(atc) SPHSAH+SPH(atc)

14913 24119 14913 4920 20335 14913 20335

14914 24120 14914 4921 20336 14914 20336

2525 1680 2525 1029 4429 2525 4429

35825 210932 35825 21254 74874 35825 74874

7.82 7.61 10.66 85.89 21.43 10.68 21.51

31.48 13.07 41.58 26.48 55.89 41.13 55.59

5.80 1.12 8.18 5.71 11.17 8.11 11.14

2.95 0.09 4.39 2.84 6.50 4.37 6.55

1.96 5.15 1.94 202.02 13.41 1.97 13.43

6.74 6.49 8.27 24.16 11.94 8.44 12.06

5.17 5.17 5.05 5.05 5.05 5.02 5.02

0.65 0.85 0.65 0.94 0.72 0.66 0.72

7.55 7.08 9.21 36.60 15.53 8.60 14.44

33 34 35 36 37 38 39 40 41 42 43 44 45 46 47

OSAH+TAseq (16,2) OSAH+TAA rec (16,2) OSAH+TAB rec (16,2) OSAH+TASNL (16,2) OSAH+TANLT (16,2) OSAH+TAseq (18,2) OSAH+TAA rec (18,2) OSAH+TAB rec (18,2) OSAH+TASNL (18,2) OSAH+TANLT (18,2) OSAH+TAseq (atc) OSAH+TAA rec (atc) OSAH+TAB rec (atc) OSAH+TASNL (atc) OSAH+TANLT (atc)

11842 11842 11842 11842 73110 24269 24269 24269 24269 150093 14951 14951 14951 14951 92519

11843 11843 11843 11843 11843 24270 24270 24270 24270 24270 14952 14952 14952 14952 14952

1598 1598 1598 1598 1598 2827 2827 2827 2827 2827 2527 2527 2527 2527 2527

32814 32814 32814 32814 32814 55360 55360 55360 55360 55360 35894 35894 35894 35894 35894

20.68 19.56 19.56 20.68 20.68 17.39 16.35 16.35 17.39 17.39 20.43 19.32 19.32 20.43 20.43

92.74 36.59 36.59 30.28 27.12 107.41 40.26 40.26 33.50 29.81 97.14 37.40 37.40 30.91 27.58

6.64 6.47 6.47 6.64 6.64 7.24 7.02 7.02 7.24 7.24 6.84 6.67 6.67 6.85 6.85

2.63 2.63 2.63 2.63 2.63 2.71 2.71 2.71 2.71 2.71 2.90 2.90 2.90 2.90 2.90

1.34 1.30 1.31 1.53 1.56 1.70 1.67 1.69 2.11 2.21 1.46 1.45 1.45 1.68 1.78

52.83 44.21 40.55 45.50 45.62 53.01 43.30 39.15 43.89 43.85 53.21 44.16 40.29 45.42 45.29

5.33 5.33 5.33 5.33 5.33 5.33 5.33 5.33 5.33 5.33 5.33 5.33 5.33 5.33 5.33

0.46 0.57 0.63 0.55 0.54 0.39 0.49 0.56 0.49 0.49 0.44 0.55 0.62 0.53 0.54

29.43 23.76 21.35 24.61 24.68 29.55 23.16 20.43 23.55 23.52 29.68 23.72 21.18 24.55 24.47

48 49 50 51 52 53 54 55 56 57 58

BVH O84 O89 BSP O93 UG AG HUG RG O84A KD

960 6476 6476 12403 6476 0 3998 1749 2083 11302 11835

5385 45333 45333 12404 45333 41650 73611 13765 35411 79115 11836

0 8289 8289 1196 8289 32408 11870 2305 8840 7387 1600

8401 130108 130108 63348 130108 50233 182079 61621 111974 193220 32785

185.15 35.22 35.20 42.38 34.97 45.40 47.32 65.25 48.54 26.00 19.51

179.96 79.26 51.13 51.03 47.10 19.54 14.76 13.71 17.93 65.41 36.59

128.64 13.05 13.05 9.56 19.18 19.54 10.60 9.03 14.43 11.03 6.47

0.00 6.28 6.28 3.87 12.44 14.45 0.19 4.90 7.75 4.49 2.63

1.45 0.65 0.52 1.32 0.55 0.50 2.65 0.50 0.35 2.25 2.38

2222.09 114.89 98.64 144.15 114.91 91.48 154.98 132.47 111.94 94.90 64.76

– – – – – – – – – – –

– – – – – – – – – – –

– – – – – – – – – – –

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25

naïve RSA spatmed-xyz(16,2) objmed-xyz(16,2) objmed(16,2) OSAH(16,2) OSAH-RMI(16,2) OSAH-xyz(16,2) OSAH(8,1) OSAH(8,2) OSAH(16,1) OSAH(16,2) OSAH(24,1) OSAH(24,2) OSAH(atc) OSAH2(atc) OSAH+LC(atc) OSAH+TPC(atc) OSAH+TPC+LC(atc) OSAH+LC(16,1) OSAH+TPC(16,1) OSAH+TPC+LC(16,1) OSAH+PR(atc) OSAH+SC(atc) OSAH+GCM(atc) OSAH+GCM2(atc) OSAH+GCM3(atc)

26 27 28 29 30 31 32

Table 16: Experimental results for scene “rings7”.

N = 8401, hit hit TPD: N prim = 263169, Nhit = 263169, N prim = 263168, Nsec = 312998, Nsec = 175756, hit MIN MIN Nshad = 1077448, Nshad = 510854, TR [s] = 9.62, Tapp [s] = 8.10, TRSA [s] = 1.52.

Scene = “sombrero2”

Line

Minimum Testing Output ∆

Σ

Mnemonic Notation

Θ

NG

NE

NEE

NER

rIT M

N˜ T S

TB

TR

ΘAPP

Θrat

ΘRUN

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25

naïve RSA spatmed-xyz(16,2) objmed-xyz(16,2) objmed(16,2) OSAH(16,2) OSAH-RMI(16,2) OSAH-xyz(16,2) OSAH(8,1) OSAH(8,2) OSAH(16,1) OSAH(16,2) OSAH(24,1) OSAH(24,2) OSAH(atc) OSAH2(atc) OSAH+LC(atc) OSAH+TPC(atc) OSAH+TPC+LC(atc) OSAH+LC(16,1) OSAH+TPC(16,1) OSAH+TPC+LC(16,1) OSAH+PR(atc) OSAH+SC(atc) OSAH+GCM(atc) OSAH+GCM2(atc) OSAH+GCM3(atc)

0 9405 16777 4025 4641 4641 4320 175 174 11616 4641 15712 4641 13927 14000 13927 13892 13896 11616 11618 11618 13927 14208 14312 15561 12464

0 9406 16778 4026 4642 4642 4321 176 175 11617 4642 15713 4642 13928 14001 13928 13893 13897 11617 11619 11619 13928 14209 14313 15562 12465

1 1892 1695 0 521 521 885 68 67 7124 521 7688 521 7677 7167 7677 7666 7666 7124 7126 7126 7677 7772 7379 5359 5199

7938 48754 40753 8049 8142 8142 10208 8016 8016 8482 8142 12008 8142 10234 11093 10234 10210 10214 8482 8482 8482 10170 10424 10916 15924 12888

– 46.92 96.63 50.09 7.78 7.78 13.68 240.30 240.30 5.82 7.78 6.36 7.78 6.05 8.07 6.05 6.06 6.06 5.82 5.83 5.83 6.03 5.89 6.22 8.15 11.31

0 27.46 51.02 38.64 16.10 16.10 19.39 9.57 9.57 18.30 16.10 19.29 16.10 18.88 22.81 18.88 18.85 18.85 18.30 18.29 18.29 18.88 18.91 21.58 22.60 24.33

0 5.08 9.38 7.69 2.85 2.85 3.61 2.09 2.09 3.35 2.85 3.64 2.85 3.51 3.94 3.51 3.50 3.51 3.35 3.35 3.35 3.51 3.51 4.04 3.92 4.47

0 2.77 0.80 0.00 1.59 1.59 2.14 1.05 1.05 2.35 1.59 2.39 1.59 2.39 2.24 2.39 2.39 2.39 2.35 2.35 2.35 2.39 2.42 2.86 2.28 2.95

0.02 0.35 1.09 0.83 1.01 1.12 0.72 0.75 0.74 1.10 1.01 1.20 1.11 1.12 1.47 1.15 1.13 1.47 1.38 1.28 1.25 1.17 1.37 42.23 55.24 41.00

2229.08 4.62 7.59 5.30 2.73 2.57 2.91 10.81 10.72 2.72 2.73 2.76 2.65 2.72 2.81 2.69 2.72 2.44 2.40 2.40 2.39 2.75 2.50 2.60 2.71 2.90

12.22 12.22 12.22 12.22 12.22 12.22 12.22 12.22 12.22 12.22 12.22 12.22 12.22 12.22 12.22 12.22 12.22 12.22 12.22 12.22 12.22 12.22 12.22 12.22 12.22 12.22

1.0 0.54 0.57 0.47 0.25 0.25 0.33 0.95 0.95 0.18 0.25 0.19 0.25 0.18 0.20 0.18 0.18 0.18 0.18 0.18 0.18 0.18 0.18 0.17 0.20 0.24

24767.56 39.11 72.11 46.67 18.11 16.33 20.11 107.89 106.89 18.00 18.11 18.44 17.22 18.00 19.00 17.67 18.00 14.89 14.44 14.44 14.33 18.33 15.56 16.67 17.89 20.00

26 27 28 29 30 31 32

OSAH+PAR(atc) PARSAH+PAR(atc) OSAH+PER(atc) PERSAH+PER(atc) SPHSAH+PER(atc) OSAH+SPH(atc) SPHSAH+SPH(atc)

13927 13729 13927 10072 6092 13927 6092

13928 13730 13928 10073 6093 13928 6093

7677 7287 7677 5396 3236 7677 3236

10234 10436 10234 9696 8683 10234 8683

2.89 2.92 2.98 2.94 56.40 2.97 55.54

14.08 11.88 13.42 12.53 11.65 13.46 11.72

2.60 2.23 2.48 2.38 2.27 2.49 2.29

1.83 1.48 1.71 1.63 1.62 1.72 1.63

1.38 1.27 1.36 40.18 2.31 1.46 2.30

1.80 1.61 1.68 1.69 4.05 2.07 4.32

9.62 9.62 9.64 9.64 9.64 10.62 10.62

0.46 0.44 0.14 0.21 0.85 0.23 0.84

12.88 10.50 5.64 5.73 27.18 5.31 22.62

33 34 35 36 37 38 39 40 41 42 43 44 45 46 47

OSAH+TAseq (16,2) OSAH+TAA rec (16,2) OSAH+TAB rec (16,2) OSAH+TASNL (16,2) OSAH+TANLT (16,2) OSAH+TAseq (18,2) OSAH+TAA rec (18,2) OSAH+TAB rec (18,2) OSAH+TASNL (18,2) OSAH+TANLT (18,2) OSAH+TAseq (atc) OSAH+TAA rec (atc) OSAH+TAB rec (atc) OSAH+TASNL (atc) OSAH+TANLT (atc)

4641 4641 4641 4641 18971 4641 4641 4641 4641 18971 13927 13927 13927 13927 55873

4642 4642 4642 4642 4642 4642 4642 4642 4642 4642 13928 13928 13928 13928 13928

521 521 521 521 521 521 521 521 521 521 7677 7677 7677 7677 7677

8142 8142 8142 8142 8142 8142 8142 8142 8142 8142 10234 10234 10234 10234 10234

7.79 7.78 7.78 7.79 7.79 7.79 7.78 7.78 7.79 7.79 6.05 6.05 6.05 6.05 6.05

35.79 16.10 16.10 14.87 14.06 35.79 16.10 16.10 14.87 14.06 49.54 18.88 18.88 17.01 15.81

2.86 2.85 2.85 2.86 2.86 2.86 2.85 2.85 2.86 2.86 3.51 3.51 3.51 3.51 3.51

1.59 1.59 1.59 1.59 1.59 1.59 1.59 1.59 1.59 1.59 2.39 2.39 2.39 2.39 2.39

0.77 0.75 0.74 0.84 1.03 0.77 0.77 0.77 0.82 0.82 0.89 0.89 0.87 1.12 1.19

3.23 2.62 2.36 2.52 2.35 3.19 2.63 2.41 2.43 2.52 3.64 2.75 2.43 2.55 2.40

12.22 12.22 12.22 12.22 12.22 12.22 12.22 12.22 12.22 12.22 12.22 12.22 12.22 12.22 12.22

0.15 0.21 0.25 0.22 0.25 0.16 0.21 0.25 0.25 0.23 0.09 0.15 0.18 0.17 0.18

23.67 16.89 14.00 15.78 13.89 23.22 17.00 14.56 14.78 15.78 28.22 18.33 14.78 16.11 14.44

48 49 50 51 52 53 54 55 56 57 58

BVH O84 O89 BSP O93 UG AG HUG RG O84A KD

1094 4741 4741 9405 4741 0 1534 1 1484 7034 4641

5255 33188 33188 9406 33188 39672 49014 675 29607 49239 4642

0 11479 11479 1892 11479 33848 18116 343 12558 10432 521

7938 96034 96034 48754 96034 38651 108836 14548 87792 150250 8142

181.14 41.16 40.96 46.92 39.13 35.69 35.28 242.76 37.16 39.81 7.78

135.81 42.70 28.70 27.46 25.40 11.34 9.50 3.67 9.71 42.40 16.10

92.42 7.45 7.43 5.08 9.84 11.34 8.01 3.01 7.86 7.32 2.85

0.00 4.54 4.54 2.77 7.04 9.71 5.26 1.53 5.65 3.92 1.59

1.38 0.50 0.43 0.50 0.44 0.37 2.38 0.15 0.28 2.07 1.38

325.22 9.16 7.29 7.40 8.80 4.97 7.80 19.37 6.43 8.91 4.00

– – – – – – – – – – –

– – – – – – – – – – –

– – – – – – – – – – –

N˜ ET S

N˜ EET S

Table 17: Experimental results for scene “sombrero2”.

N = 7938, hit hit TPD: N prim = 263169, Nhit = 136638, N prim = 112239, Nsec = 0, Nsec = 0, hit MIN MIN Nshad = 110608, Nshad = 2523, TR [s] = 1.19, Tapp [s] = 1.10, TRSA [s] = 0.09.

Scene = “teapot12”

Line

Minimum Testing Output ∆

Σ

Mnemonic Notation

Θ

NG

NE

NEE

NER

rIT M

N˜ T S

N˜ ET S

N˜ EET S

TB

TR

ΘAPP

Θrat

ΘRUN

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25

naïve RSA spatmed-xyz(16,2) objmed-xyz(16,2) objmed(16,2) OSAH(16,2) OSAH-RMI(16,2) OSAH-xyz(16,2) OSAH(8,1) OSAH(8,2) OSAH(16,1) OSAH(16,2) OSAH(24,1) OSAH(24,2) OSAH(atc) OSAH2(atc) OSAH+LC(atc) OSAH+TPC(atc) OSAH+TPC+LC(atc) OSAH+LC(16,1) OSAH+TPC(16,1) OSAH+TPC+LC(16,1) OSAH+PR(atc) OSAH+SC(atc) OSAH+GCM(atc) OSAH+GCM2(atc) OSAH+GCM3(atc)

0 6866 26569 22186 10198 10198 10248 213 213 11852 10198 51751 34811 23502 35521 24880 18888 22949 11858 11875 11881 23502 23822 30936 55335 28719

0 6867 26570 22187 10199 10199 10249 214 214 11853 10199 51752 34812 23503 35522 24881 18889 22950 11859 11876 11882 23503 23823 30937 55336 28720

1 1800 3712 2390 1942 1942 1983 26 26 3030 1942 11450 4295 6337 8102 6364 5132 5255 3036 3051 3057 6337 9466 8478 9907 6446

9264 41612 61429 47489 27347 27347 28294 11345 11345 27771 27347 79971 70563 38220 56119 43086 34319 45003 27771 27892 27892 33869 32983 46548 108251 52848

– 61.95 161.76 219.60 17.71 17.71 20.73 305.93 305.93 14.50 17.71 10.98 14.82 12.01 13.16 11.80 13.53 12.71 14.48 14.51 14.50 12.68 9.34 11.16 13.67 13.67

0 36.01 105.53 181.84 26.28 26.28 30.76 15.76 15.76 27.56 26.28 30.16 27.67 28.97 36.64 29.15 27.99 28.55 27.57 27.56 27.56 28.02 28.32 30.78 33.81 38.24

0 7.57 20.58 35.38 5.12 5.12 6.25 3.47 3.47 5.26 5.12 5.71 5.35 5.50 6.53 5.53 5.33 5.41 5.26 5.26 5.26 5.29 5.36 5.73 6.07 6.76

0 4.52 3.38 3.30 2.79 2.79 3.78 1.62 1.62 3.16 2.79 3.40 2.85 3.33 3.80 3.33 3.21 3.22 3.16 3.15 3.16 3.26 3.68 3.52 3.54 4.20

0.02 0.34 1.24 1.58 1.46 1.48 0.93 0.90 0.92 1.51 1.46 2.66 2.23 1.80 2.91 1.93 1.87 2.30 1.77 1.78 1.81 2.21 3.65 100.93 183.16 95.24

6036.10 15.11 34.04 52.53 10.32 9.99 10.58 36.32 36.55 10.03 10.32 10.06 10.13 9.72 10.90 9.69 9.63 9.29 9.29 9.30 9.32 7.28 9.64 9.77 10.36 10.88

17.32 17.32 17.32 17.32 17.32 17.32 17.32 17.32 17.32 17.32 17.32 17.32 17.32 17.32 17.32 17.32 17.32 17.32 17.32 17.32 17.32 17.32 17.32 17.32 17.32 17.32

1.0 0.52 0.49 0.43 0.30 0.30 0.30 0.92 0.92 0.25 0.30 0.19 0.25 0.21 0.18 0.20 0.23 0.22 0.25 0.25 0.25 0.19 0.17 0.18 0.20 0.18

27436.82 51.36 137.41 221.45 29.59 28.09 30.77 147.77 148.82 28.27 29.59 28.41 28.73 26.86 32.23 26.73 26.45 24.91 24.91 24.95 25.05 15.77 26.50 27.09 29.77 32.14

26 27 28 29 30 31 32

OSAH+PAR(atc) PARSAH+PAR(atc) OSAH+PER(atc) PERSAH+PER(atc) SPHSAH+PER(atc) OSAH+SPH(atc) SPHSAH+SPH(atc)

23502 23233 23502 21147 18627 23502 18627

23503 23234 23503 21148 18628 23503 18628

6337 6227 6337 5836 5154 6337 5154

38220 38196 38220 35610 34050 38220 34050

3.74 3.74 3.42 3.35 3.86 3.42 3.87

19.32 19.27 23.39 21.58 21.85 22.92 21.42

3.75 3.69 4.59 4.36 4.40 4.49 4.31

2.63 2.62 3.25 3.09 3.09 3.17 3.02

2.13 2.37 2.18 129.81 5.82 2.12 5.90

2.62 2.70 3.28 3.23 3.28 3.59 3.57

16.33 16.33 13.64 13.64 13.64 16.14 16.14

0.36 0.40 0.35 0.38 0.39 0.34 0.38

12.78 13.67 9.79 9.43 9.79 9.50 9.36

33 34 35 36 37 38 39 40 41 42 43 44 45 46 47

OSAH+TAseq (16,2) OSAH+TAA rec (16,2) OSAH+TAB rec (16,2) OSAH+TASNL (16,2) OSAH+TANLT (16,2) OSAH+TAseq (18,2) OSAH+TAA rec (18,2) OSAH+TAB rec (18,2) OSAH+TASNL (18,2) OSAH+TANLT (18,2) OSAH+TAseq (atc) OSAH+TAA rec (atc) OSAH+TAB rec (atc) OSAH+TASNL (atc) OSAH+TANLT (atc)

10198 10198 10198 10198 59876 18672 18672 18672 18672 100779 23502 23502 23502 23502 117412

10199 10199 10199 10199 10199 18673 18673 18673 18673 18673 23503 23503 23503 23503 23503

1942 1942 1942 1942 1942 3159 3159 3159 3159 3159 6337 6337 6337 6337 6337

27347 27347 27347 27347 27347 37728 37728 37728 37728 37728 38220 38220 38220 38220 38220

17.83 17.71 17.71 17.84 17.84 15.36 15.26 15.26 15.36 15.36 12.13 12.01 12.01 12.13 12.13

60.99 26.28 26.28 21.13 18.96 64.58 27.18 27.18 21.82 19.50 70.11 28.97 28.97 22.81 20.27

5.13 5.12 5.12 5.13 5.13 5.28 5.27 5.27 5.28 5.28 5.52 5.50 5.50 5.52 5.52

2.79 2.79 2.79 2.79 2.79 2.84 2.84 2.84 2.84 2.84 3.33 3.33 3.33 3.33 3.33

1.23 1.22 1.19 1.38 1.42 1.45 1.41 1.41 1.77 1.84 1.52 1.50 1.50 1.89 2.03

13.44 10.55 9.42 9.39 9.48 13.47 10.50 9.30 9.31 9.26 13.61 10.45 9.16 9.15 9.12

17.32 17.32 17.32 17.32 17.32 17.32 17.32 17.32 17.32 17.32 17.32 17.32 17.32 17.32 17.32

0.17 0.25 0.30 0.30 0.30 0.15 0.21 0.26 0.26 0.26 0.11 0.17 0.21 0.21 0.21

43.77 30.64 25.50 25.36 25.77 43.91 30.41 24.95 25.00 24.77 44.55 30.18 24.32 24.27 24.14

48 49 50 51 52 53 54 55 56 57 58

BVH O84 O89 BSP O93 UG AG HUG RG O84A KD

1393 3372 3372 6866 3372 0 2983 2305 5541 7066 10198

6001 23605 23605 6867 23605 48020 103883 30596 93034 49463 10199

0 9421 9421 1800 9421 42398 38995 21099 23297 12222 1942

9264 72493 72493 41612 72493 37819 248398 51033 461085 142944 27347

290.51 44.06 44.08 61.95 42.76 60.94 53.59 51.47 53.29 29.21 17.71

183.11 55.80 38.91 36.01 33.88 26.00 19.88 20.44 19.27 55.29 26.28

119.20 10.41 10.40 7.57 13.13 26.00 16.85 17.54 16.84 10.50 5.12

0.00 7.26 7.26 4.52 10.05 23.03 11.83 14.83 12.95 7.56 2.79

1.66 0.48 0.43 1.14 0.46 0.45 4.60 0.63 1.08 2.19 2.22

1121.33 29.25 22.91 58.51 27.54 20.93 32.59 26.97 23.24 27.43 15.66

– – – – – – – – – – –

– – – – – – – – – – –

– – – – – – – – – – –

Table 18: Experimental results for scene “teapot12”.

N = 9264, hit hit TPD: N prim = 263169, Nhit = 226198, N prim = 161546, Nsec = 226089, Nsec = 67517, hit MIN MIN Nshad = 406274, Nshad = 34744, TR [s] = 4.03, Tapp [s] = 3.81, TRSA [s] = 0.22.

Scene = “tetra6”

Line

Minimum Testing Output ∆

Σ

Mnemonic Notation

Θ

NG

NE

NEE

NER

rIT M

N˜ T S

TB

TR

ΘAPP

Θrat

ΘRUN

0 12623 2891 3199 2971 2971 2819 175 175 2971 2971 2971 2971 2971 2859 2971 2961 2961 2971 2961 2961 2971 2971 3210 3288 3467

0 12624 2892 3200 2972 2972 2820 176 176 2972 2972 2972 2972 2972 2860 2972 2962 2962 2972 2962 2962 2972 2972 3211 3289 3468

1 4392 1868 2176 1948 1948 1820 30 30 1948 1948 1948 1948 1948 1836 1948 1938 1938 1948 1938 1938 1948 1948 2187 2265 2447

4096 49152 4096 4096 4096 4096 4096 4096 4096 4096 4096 4096 4096 4096 4096 4096 4096 4096 4096 4096 4096 4096 4096 4096 4096 4096

– 53.58 16.28 10.47 10.47 10.47 13.24 166.63 166.63 10.47 10.47 10.47 10.47 10.47 10.47 10.47 10.47 10.47 10.47 10.47 10.47 10.47 10.47 10.47 10.47 11.02

0 29.07 30.27 52.50 14.83 14.83 17.20 9.37 9.37 14.83 14.83 14.83 14.83 14.83 18.51 14.83 14.81 14.81 14.83 14.81 14.81 14.83 14.83 16.03 16.38 25.52

0 5.75 6.24 11.07 2.79 2.79 3.36 2.05 2.05 2.79 2.79 2.79 2.79 2.79 3.04 2.79 2.78 2.78 2.79 2.78 2.78 2.79 2.79 2.96 3.01 4.44

0 4.10 5.50 10.59 2.31 2.31 2.79 0.98 0.98 2.31 2.31 2.31 2.31 2.31 2.56 2.31 2.30 2.30 2.31 2.30 2.30 2.31 2.31 2.48 2.53 3.94

0.01 0.27 0.25 0.36 0.40 0.39 0.26 0.30 0.28 0.40 0.40 0.43 0.39 0.40 0.49 0.38 0.38 0.44 0.44 0.44 0.42 0.40 0.43 10.39 10.77 11.29

821.40 3.30 2.37 2.86 1.83 1.84 1.87 4.43 4.44 1.78 1.83 1.78 1.77 1.77 1.92 1.77 1.75 1.76 1.76 1.66 1.76 1.85 1.82 1.72 1.73 2.03

29.00 29.00 29.00 29.00 29.00 29.00 29.00 29.00 29.00 29.00 29.00 29.00 29.00 29.00 29.00 29.00 29.00 29.00 29.00 29.00 29.00 29.00 29.00 29.00 29.00 29.00

1.0 0.39 0.16 0.07 0.20 0.20 0.21 0.86 0.86 0.20 0.20 0.20 0.20 0.20 0.17 0.20 0.20 0.20 0.20 0.20 0.20 0.20 0.20 0.19 0.18 0.13

27380.00 81.00 50.00 66.33 32.00 32.33 33.33 118.67 119.00 30.33 32.00 30.33 30.00 30.00 35.00 30.00 29.33 29.67 29.67 26.33 29.67 32.67 31.67 28.33 28.67 38.67

OSAH+PAR(atc) PARSAH+PAR(atc) OSAH+PER(atc) PERSAH+PER(atc) SPHSAH+PER(atc) OSAH+SPH(atc) SPHSAH+SPH(atc)

2971 3155 2971 3291 2459 2971 2459

2972 3156 2972 3292 2460 2972 2460

1948 2132 1948 2268 1636 1948 1636

4096 4096 4096 4096 4096 4096 4096

8.42 8.42 7.55 7.55 33.13 7.54 32.92

13.01 13.85 13.33 15.80 18.33 13.17 18.07

2.55 2.75 2.59 3.10 3.40 2.56 3.35

2.21 2.41 2.24 2.75 2.81 2.21 2.77

0.50 0.42 0.49 8.22 0.70 0.46 0.72

1.33 1.35 1.45 1.46 1.93 1.71 2.25

26.50 26.50 30.50 30.50 30.50 19.80 19.80

0.46 0.44 0.48 0.39 0.54 0.40 0.53

40.00 41.00 42.00 42.50 66.00 14.40 25.20

33 34 35 36 37 38 39 40 41 42 43 44 45 46 47

OSAH+TAseq (16,2) OSAH+TAA rec (16,2) OSAH+TAB rec (16,2) OSAH+TASNL (16,2) OSAH+TANLT (16,2) OSAH+TAseq (18,2) OSAH+TAA rec (18,2) OSAH+TAB rec (18,2) OSAH+TASNL (18,2) OSAH+TANLT (18,2) OSAH+TAseq (atc) OSAH+TAA rec (atc) OSAH+TAB rec (atc) OSAH+TASNL (atc) OSAH+TANLT (atc)

2971 2971 2971 2971 10067 2971 2971 2971 2971 10067 2971 2971 2971 2971 10067

2972 2972 2972 2972 2972 2972 2972 2972 2972 2972 2972 2972 2972 2972 2972

1948 1948 1948 1948 1948 1948 1948 1948 1948 1948 1948 1948 1948 1948 1948

4096 4096 4096 4096 4096 4096 4096 4096 4096 4096 4096 4096 4096 4096 4096

10.47 10.47 10.47 10.47 10.47 10.47 10.47 10.47 10.47 10.47 10.47 10.47 10.47 10.47 10.47

32.60 14.83 14.83 13.62 12.55 32.60 14.83 14.83 13.62 12.55 32.60 14.83 14.83 13.62 12.55

2.79 2.79 2.79 2.79 2.79 2.79 2.79 2.79 2.79 2.79 2.79 2.79 2.79 2.79 2.79

2.31 2.31 2.31 2.31 2.31 2.31 2.31 2.31 2.31 2.31 2.31 2.31 2.31 2.31 2.31

0.32 0.31 0.30 0.35 0.37 0.32 0.32 0.31 0.35 0.37 0.32 0.30 0.31 0.36 0.37

2.48 1.88 1.73 1.69 1.70 2.49 1.87 1.65 1.72 1.70 2.49 1.88 1.73 1.81 1.71

29.00 29.00 29.00 29.00 29.00 29.00 29.00 29.00 29.00 29.00 29.00 29.00 29.00 29.00 29.00

0.11 0.17 0.20 0.21 0.21 0.10 0.16 0.20 0.18 0.19 0.11 0.17 0.20 0.18 0.20

53.67 33.67 28.67 27.33 27.67 54.00 33.33 26.00 28.33 27.67 54.00 33.67 28.67 31.33 28.00

48 49 50 51 52 53 54 55 56 57 58

BVH O84 O89 BSP O93 UG AG HUG RG O84A KD

458 6189 6189 12623 6189 0 969 1 1565 6449 2971

2629 43324 43324 12624 43324 21952 35045 12167 40404 45144 2972

0 21100 21100 4392 21100 18444 18069 9913 14220 20976 1948

4096 110592 110592 49152 110592 25612 88028 19600 218944 120336 4096

196.66 47.07 46.96 53.58 44.82 70.84 80.99 78.62 118.31 25.43 10.47

60.58 44.33 30.14 29.07 24.96 12.82 15.88 11.22 12.31 35.86 14.83

45.93 7.71 7.71 5.75 9.12 12.82 14.03 10.56 9.98 6.59 2.79

0.00 5.96 5.97 4.10 7.45 11.08 10.95 8.91 6.83 5.55 2.31

0.76 0.40 0.32 0.41 0.36 0.21 1.60 0.26 0.47 1.33 0.48

148.45 6.91 5.27 5.36 6.46 4.16 7.48 4.98 6.96 5.54 2.66

– – – – – – – – – – –

– – – – – – – – – – –

– – – – – – – – – – –

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25

naïve RSA spatmed-xyz(16,2) objmed-xyz(16,2) objmed(16,2) OSAH(16,2) OSAH-RMI(16,2) OSAH-xyz(16,2) OSAH(8,1) OSAH(8,2) OSAH(16,1) OSAH(16,2) OSAH(24,1) OSAH(24,2) OSAH(atc) OSAH2(atc) OSAH+LC(atc) OSAH+TPC(atc) OSAH+TPC+LC(atc) OSAH+LC(16,1) OSAH+TPC(16,1) OSAH+TPC+LC(16,1) OSAH+PR(atc) OSAH+SC(atc) OSAH+GCM(atc) OSAH+GCM2(atc) OSAH+GCM3(atc)

26 27 28 29 30 31 32

N˜ ET S

N˜ EET S

Table 19: Experimental results for scene “tetra6”.

N = 4096, hit hit TPD: N prim = 263169, Nhit = 159213, N prim = 49950, Nsec = 0, Nsec = 0, hit MIN MIN Nshad = 46262, Nshad = 5552, TR [s] = 0.9, Tapp [s] = 0.87, TRSA [s] = 0.03.

Scene = “tree11”

Line

Minimum Testing Output ∆

Σ

Mnemonic Notation

Θ

NG

NE

NEE

NER

rIT M

N˜ T S

TB

TR

ΘAPP

Θrat

ΘRUN

0 318 24036 20719 2736 2736 2868 58 56 3014 2736 32483 17687 4369 9579 4397 4364 4463 3014 3073 3073 4369 4301 4834 8042 5868

0 319 24037 20720 2737 2737 2869 59 57 3015 2737 32484 17688 4370 9580 4398 4365 4464 3015 3074 3074 4370 4302 4835 8043 5869

1 226 4478 2295 937 937 955 6 6 1159 937 10236 3105 1743 2424 1743 1775 1775 1159 1215 1215 1743 1935 1885 1742 1517

8191 9377 41989 40789 10501 10501 10843 8451 8451 10543 10501 35110 27487 11327 16533 11396 11274 11511 10543 10540 10540 11107 10684 11764 19799 15894

– 2680.80 390.41 625.25 27.22 27.22 33.52 632.46 633.45 25.04 27.22 18.71 21.38 22.24 17.41 22.24 22.37 22.32 25.04 24.96 24.96 21.58 21.07 21.30 22.89 58.22

0 63.27 175.41 198.81 14.17 14.17 17.43 10.33 10.08 14.78 14.17 15.76 14.80 14.94 18.58 14.97 14.88 14.97 14.78 14.79 14.79 14.65 14.68 15.24 18.65 23.21

0 13.21 37.70 48.32 3.57 3.57 4.61 2.90 2.85 3.68 3.57 3.84 3.67 3.70 4.22 3.71 3.69 3.71 3.68 3.68 3.68 3.63 3.66 3.72 4.37 4.77

0 7.02 5.27 3.49 0.77 0.77 1.74 0.13 0.13 0.87 0.77 1.00 0.83 0.91 1.82 0.91 0.91 0.91 0.87 0.87 0.87 0.92 1.01 0.97 1.44 0.97

0.01 0.23 1.11 1.67 1.59 1.58 0.91 1.14 1.14 1.61 1.59 2.50 2.18 1.65 2.17 1.66 1.74 2.00 1.76 1.90 1.91 1.74 2.26 18.36 30.71 24.57

14902.70 746.48 125.86 198.83 12.30 12.04 14.44 155.95 157.25 11.74 12.30 10.69 10.86 11.53 11.39 11.39 11.40 10.82 11.19 11.17 11.19 10.59 10.76 10.69 11.87 18.80

18.14 18.14 18.14 18.14 18.14 18.14 18.14 18.14 18.14 18.14 18.14 18.14 18.14 18.14 18.14 18.14 18.14 18.14 18.14 18.14 18.14 18.14 18.14 18.14 18.14 18.14

1.0 0.94 0.45 0.54 0.41 0.41 0.41 0.96 0.96 0.38 0.41 0.30 0.35 0.35 0.26 0.35 0.36 0.35 0.38 0.38 0.38 0.34 0.34 0.34 0.31 0.48

67739.55 3374.95 553.95 885.64 37.77 36.59 47.50 690.73 696.64 35.23 37.77 30.45 31.23 34.27 33.64 33.64 33.68 31.05 32.73 32.64 32.73 30.00 30.77 30.45 35.82 67.32

OSAH+PAR(atc) PARSAH+PAR(atc) OSAH+PER(atc) PERSAH+PER(atc) SPHSAH+PER(atc) OSAH+SPH(atc) SPHSAH+SPH(atc)

4369 5196 4369 7047 4400 4369 4400

4370 5197 4370 7048 4401 4370 4401

1743 2031 1743 2750 1456 1743 1456

11327 12654 11327 14300 19110 11327 19110

4.59 3.64 7.94 4.62 126.16 7.88 126.27

19.90 16.05 23.89 19.69 21.46 23.69 21.33

5.18 4.01 5.96 5.13 5.25 5.90 5.21

1.44 0.92 1.95 2.75 2.30 1.94 2.26

1.87 2.44 1.91 376.04 13.69 1.93 13.67

5.79 5.48 4.76 4.16 26.10 5.04 26.92

16.22 16.22 11.76 11.76 11.76 16.00 16.00

0.51 0.54 0.56 0.54 0.96 0.45 0.96

8.96 7.61 16.24 12.71 141.76 12.00 133.56

33 34 35 36 37 38 39 40 41 42 43 44 45 46 47

OSAH+TAseq (16,2) OSAH+TAA rec (16,2) OSAH+TAB rec (16,2) OSAH+TASNL (16,2) OSAH+TANLT (16,2) OSAH+TAseq (18,2) OSAH+TAA rec (18,2) OSAH+TAB rec (18,2) OSAH+TASNL (18,2) OSAH+TANLT (18,2) OSAH+TAseq (atc) OSAH+TAA rec (atc) OSAH+TAB rec (atc) OSAH+TASNL (atc) OSAH+TANLT (atc)

2736 2736 2736 2736 19843 5370 5370 5370 5370 37771 4369 4369 4369 4369 31626

2737 2737 2737 2737 2737 5371 5371 5371 5371 5371 4370 4370 4370 4370 4370

937 937 937 937 937 1792 1792 1792 1792 1792 1743 1743 1743 1743 1743

10501 10501 10501 10501 10501 12405 12405 12405 12405 12405 11327 11327 11327 11327 11327

27.33 27.22 27.22 27.35 27.35 23.12 23.01 23.01 23.13 23.13 22.34 22.24 22.24 22.36 22.36

32.45 14.17 14.17 13.78 12.54 33.85 14.51 14.51 14.11 12.81 34.95 14.94 14.94 14.52 13.16

3.58 3.57 3.57 3.58 3.58 3.63 3.62 3.62 3.63 3.63 3.71 3.70 3.70 3.71 3.71

0.77 0.77 0.77 0.77 0.77 0.81 0.81 0.81 0.81 0.81 0.91 0.91 0.91 0.91 0.91

1.33 1.32 1.33 1.38 1.37 1.46 1.45 1.43 1.53 1.57 1.38 1.36 1.38 1.43 1.47

15.84 12.20 11.54 12.80 12.61 15.22 11.51 10.64 11.10 11.27 15.38 11.58 10.71 12.02 11.26

18.14 18.14 18.14 18.14 18.14 18.14 18.14 18.14 18.14 18.14 18.14 18.14 18.14 18.14 18.14

0.26 0.38 0.41 0.35 0.36 0.22 0.33 0.37 0.35 0.34 0.21 0.31 0.35 0.29 0.32

53.86 37.32 34.32 40.05 39.18 51.05 34.18 30.23 32.32 33.09 51.77 34.50 30.55 36.50 33.05

48 49 50 51 52 53 54 55 56 57 58

BVH O84 O89 BSP O93 UG AG HUG RG O84A KD

1411 150 150 318 150 0 916 9 498 931 2736

5513 1051 1051 319 1051 34968 52609 24680 32174 6518 2737

0 886 886 226 886 26207 37188 21625 19626 3001 937

8191 10402 10402 9377 10402 17297 37463 15584 60830 15157 10501

118.86 2002.70 2000.20 2680.80 1997.50 2311.90 19.63 58.40 74.48 32.13 27.22

50.11 132.22 85.94 63.27 72.66 12.40 13.31 6.70 14.55 38.13 14.17

33.17 21.82 21.81 13.21 28.90 12.40 12.57 4.07 12.83 9.10 3.57

0.00 14.57 14.57 7.02 21.65 8.68 10.65 3.18 8.25 6.10 0.77

1.38 0.26 0.25 1.54 0.26 0.30 1.78 0.50 0.27 1.16 2.00

495.04 676.30 653.08 908.03 664.61 679.65 20.61 38.62 32.55 35.03 22.26

– – – – – – – – – – –

– – – – – – – – – – –

– – – – – – – – – – –

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25

naïve RSA spatmed-xyz(16,2) objmed-xyz(16,2) objmed(16,2) OSAH(16,2) OSAH-RMI(16,2) OSAH-xyz(16,2) OSAH(8,1) OSAH(8,2) OSAH(16,1) OSAH(16,2) OSAH(24,1) OSAH(24,2) OSAH(atc) OSAH2(atc) OSAH+LC(atc) OSAH+TPC(atc) OSAH+TPC+LC(atc) OSAH+LC(16,1) OSAH+TPC(16,1) OSAH+TPC+LC(16,1) OSAH+PR(atc) OSAH+SC(atc) OSAH+GCM(atc) OSAH+GCM2(atc) OSAH+GCM3(atc)

26 27 28 29 30 31 32

N˜ ET S

N˜ EET S

Table 20: Experimental results for scene “tree11”.

$N = 8191$; $TP_D$: $N_{prim} = 263169$, $N_{hit} = 263169$, $N^{hit}_{prim} = 169904$, $N_{sec} = 0$, $N^{hit}_{sec} = 0$, $N_{shad} = 1097802$, $N^{hit}_{shad} = 42522$, $T^{MIN}_{R}\,[s] = 4.21$, $T_{app}\,[s] = 3.99$, $T^{MIN}_{RSA}\,[s] = 0.22$.


Figure 3: Visualization of the G4SPD scenes using the testing procedure $TP_D$. Panels: balls4, gears4, jacks4, lattice12, mount6.


Figure 4: Visualization of the G4SPD scenes using the testing procedure $TP_D$. Panels: rings7, sombrero2, teapot12, tetra6, tree11.

Scene = “balls5”

Minimum Testing Output: ∆, Σ, Θ

Columns: Line, Mnemonic Notation, $N_G$, $N_E$, $N_{EE}$, $N_{ER}$, $r_{ITM}$, $\tilde{N}_{TS}$, $T_B$, $T_R$, $\Theta_{APP}$, $\Theta_{rat}$, $\Theta_{RUN}$

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25

naïve RSA spatmed-xyz(16,2) objmed-xyz(16,2) objmed(16,2) OSAH(16,2) OSAH-RMI(16,2) OSAH-xyz(16,2) OSAH(8,1) OSAH(8,2) OSAH(16,1) OSAH(16,2) OSAH(24,1) OSAH(24,2) OSAH(atc) OSAH2(atc) OSAH+LC(atc) OSAH+TPC(atc) OSAH+TPC+LC(atc) OSAH+LC(16,1) OSAH+TPC(16,1) OSAH+TPC+LC(16,1) OSAH+PR(atc) OSAH+SC(atc) OSAH+GCM(atc) OSAH+GCM2(atc) OSAH+GCM3(atc)

0 589 65201 65390 7130 7130 8073 99 97 7949 7130 243317 161104 80093 139692 80573 79446 80549 7982 8065 8099 80093 73702 97257 184866 146592

0 590 65202 65391 7131 7131 8074 100 98 7950 7131 243318 161105 80094 139693 80574 79447 80550 7983 8066 8100 80094 73703 97258 184867 146593

1 260 506 19 888 888 1048 13 13 1105 888 50718 21479 14316 16599 14363 14231 14274 1138 1212 1246 14316 20822 17400 10488 16774

66431 77205 268019 234572 84927 84927 87604 68241 68240 85475 84927 347811 295835 173040 249757 174144 172312 174925 85475 85468 85468 136996 130584 196761 425021 305799

– 2094.20 290.57 288.46 46.33 46.33 47.58 3100.80 3100.80 45.76 46.33 11.44 12.82 13.52 12.24 12.95 13.52 12.95 45.19 45.70 45.13 11.12 10.69 12.96 14.49 28.27

0 55.39 178.26 194.32 27.48 27.48 29.90 13.68 13.46 28.90 27.48 37.63 34.38 35.17 44.56 35.49 35.16 35.49 29.21 28.96 29.27 35.69 34.42 36.04 41.04 50.31

0 10.41 37.57 42.20 5.26 5.26 5.89 3.15 3.08 5.55 5.26 7.01 6.38 6.59 7.40 6.60 6.59 6.60 5.56 5.57 5.57 6.70 6.42 6.68 7.36 8.42

0 5.73 0.86 0.02 1.44 1.44 2.28 0.32 0.32 1.58 1.44 2.36 1.99 2.14 3.10 2.44 2.14 2.44 1.87 1.60 1.90 2.19 2.74 2.25 2.10 2.67

0.13 2.00 9.21 14.37 14.57 14.64 8.72 10.27 10.17 14.63 14.57 22.50 20.81 18.02 25.30 18.16 19.12 21.76 16.38 17.20 17.26 21.02 32.29 293.40 545.81 469.09

56977.20 798.14 136.08 147.99 24.85 24.33 24.90 1387.11 1382.27 24.74 24.85 19.48 18.92 19.27 20.97 19.15 19.26 18.30 23.54 23.60 23.56 18.17 18.34 19.11 20.86 25.87

12.21 12.21 12.21 12.21 12.21 12.21 12.21 12.21 12.21 12.21 12.21 12.21 12.21 12.21 12.21 12.21 12.21 12.21 12.21 12.21 12.21 12.21 12.21 12.21 12.21 12.21

1.0 0.96 0.51 0.49 0.52 0.52 0.51 0.99 0.99 0.51 0.52 0.17 0.20 0.20 0.15 0.19 0.20 0.19 0.50 0.51 0.50 0.17 0.17 0.19 0.19 0.27

121228.09 1685.96 277.32 302.66 40.66 39.55 40.77 2939.09 2928.79 40.43 40.66 29.23 28.04 28.79 32.40 28.53 28.77 26.72 37.87 38.00 37.91 26.45 26.81 28.45 32.17 42.83

26 27 28 29 30 31 32

OSAH+PAR(atc) PARSAH+PAR(atc) OSAH+PER(atc) PERSAH+PER(atc) SPHSAH+PER(atc) OSAH+SPH(atc) SPHSAH+SPH(atc)

80093 80741 80093 134766 72559 80093 72559

80094 80742 80094 134767 72560 80094 72560

14316 14464 14316 26449 13467 14316 13467

173040 174575 173040 233120 168828 173040 168828

6.82 6.72 6.57 5.85 6.72 6.48 6.67

39.41 38.47 39.13 39.52 38.14 38.55 37.66

8.27 8.00 8.24 8.12 8.03 8.11 7.94

3.14 2.98 3.37 3.21 3.34 3.32 3.29

20.75 25.29 20.75 3515.50 118.20 20.68 118.09

5.09 5.07 5.06 5.27 5.07 5.36 5.36

10.65 10.65 9.92 9.92 9.92 10.74 10.74

0.33 0.34 0.29 0.34 0.31 0.29 0.31

11.48 11.39 9.54 10.35 9.58 9.11 9.11

33 34 35 36 37 38 39 40 41 42 43 44 45 46 47

OSAH+TAseq (16,2) OSAH+TAA rec (16,2) OSAH+TAB rec (16,2) OSAH+TASNL (16,2) OSAH+TANLT (16,2) OSAH+TAseq (18,2) OSAH+TAA rec (18,2) OSAH+TAB rec (18,2) OSAH+TASNL (18,2) OSAH+TANLT (18,2) OSAH+TAseq (atc) OSAH+TAA rec (atc) OSAH+TAB rec (atc) OSAH+TASNL (atc) OSAH+TANLT (atc)

7130 7130 7130 7130 55966 18889 18889 18889 18889 145833 80093 80093 80093 80093 590522

7131 7131 7131 7131 7131 18890 18890 18890 18890 18890 80094 80094 80094 80094 80094

888 888 888 888 888 2866 2866 2866 2866 2866 14316 14316 14316 14316 14316

84927 84927 84927 84927 84927 101020 101020 101020 101020 101020 173040 173040 173040 173040 173040

50.30 46.33 46.33 50.34 50.34 25.66 23.68 23.68 25.69 25.69 14.56 13.51 13.52 14.60 14.59

69.09 27.48 27.48 24.78 22.04 79.35 29.95 29.95 26.95 23.77 102.19 35.17 35.17 31.30 27.12

5.45 5.26 5.26 5.45 5.45 5.87 5.66 5.66 5.87 5.87 6.84 6.59 6.59 6.85 6.85

1.44 1.44 1.44 1.44 1.44 1.67 1.66 1.66 1.67 1.67 2.15 2.14 2.14 2.15 2.15

11.95 11.93 11.93 12.10 12.10 12.92 12.87 12.92 13.24 13.40 14.87 14.82 14.79 16.44 16.92

31.75 24.98 23.14 24.89 25.17 28.20 21.15 19.06 20.09 20.23 29.14 20.63 18.22 19.40 19.38

12.21 12.21 12.21 12.21 12.21 12.21 12.21 12.21 12.21 12.21 12.21 12.21 12.21 12.21 12.21

0.35 0.47 0.52 0.47 0.47 0.20 0.29 0.34 0.32 0.31 0.11 0.17 0.20 0.18 0.18

55.34 40.94 37.02 40.74 41.34 47.79 32.79 28.34 30.53 30.83 49.79 31.68 26.55 29.06 29.02

48 49 50 51 52 53 54 55 56 57 58

BVH O84 O89 BSP O93 UG AG HUG RG O84A KD

9936 301 301 589 301 0 13736 21 20704 3063 7130

45674 2108 2108 590 2108 329219 508722 199570 669181 21442 7131

0 1071 1071 260 1071 298274 147921 176549 127121 5275 888

66431 87856 87856 77205 87856 106007 583874 142830 5540218 120197 84927

827.35 1117.00 1131.40 2094.20 1107.10 403.52 31.44 36.00 65.97 37.28 46.33

1116.40 115.41 73.33 55.39 62.98 19.49 24.57 25.61 25.09 51.08 27.48

780.87 18.63 18.62 10.41 24.49 19.49 22.17 22.03 20.81 9.59 5.26

0.00 12.07 12.07 5.73 18.02 15.89 16.36 18.77 13.82 5.45 1.44

17.11 1.93 1.91 2.88 1.91 2.67 287.72 4.55 10.91 8.83 16.77

12335.30 463.25 447.03 862.52 464.01 157.70 43.76 67.30 60.05 48.61 42.00

– – – – – – – – – – –

– – – – – – – – – – –

– – – – – – – – – – –

N˜ ET S

N˜ EET S

Table 21: Experimental results for scene “balls5”.

$N = 66431$; $TP_D$: $N_{prim} = 263169$, $N_{hit} = 263169$, $N^{hit}_{prim} = 263169$, $N_{sec} = 195150$, $N^{hit}_{sec} = 150302$, $N_{shad} = 979728$, $N^{hit}_{shad} = 314095$, $T^{MIN}_{R}\,[s] = 6.21$, $T_{app}\,[s] = 5.74$, $T^{MIN}_{RSA}\,[s] = 0.47$.
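
As a rough cross-check of the normalized columns, the sketch below recomputes Θ_APP and Θ_RUN for line 4, OSAH(16,2), of Table 21 from the timings printed under the table (application time 5.74 s, minimum ray shooting time 0.47 s) and the measured running time T_R = 24.85 s. It assumes that Θ_APP is the application time divided by the minimum ray shooting time, and that Θ_RUN is the running time minus the application time, divided by the same minimum. These relations are inferred from the tabulated numbers only and are not stated in this appendix; the function names are hypothetical.

```python
def theta_app(t_app: float, t_rsa_min: float) -> float:
    """Application time expressed in units of the minimum ray shooting time.

    Assumed relation (inferred from the tables): T_app / T_RSA_min.
    """
    return t_app / t_rsa_min


def theta_run(t_r: float, t_app: float, t_rsa_min: float) -> float:
    """Ray shooting time of one run in units of the minimum ray shooting time.

    Assumed relation (inferred from the tables): (T_R - T_app) / T_RSA_min.
    """
    return (t_r - t_app) / t_rsa_min


# Values read from Table 21 (scene "balls5").
T_APP = 5.74      # application time [s], printed under the table
T_RSA_MIN = 0.47  # minimum ray shooting time [s], printed under the table
T_R = 24.85       # running time [s] of line 4, OSAH(16,2)

print(round(theta_app(T_APP, T_RSA_MIN), 2))       # 12.21, the tabulated Theta_APP
print(round(theta_run(T_R, T_APP, T_RSA_MIN), 2))  # 40.66, the tabulated Theta_RUN
```

Under the same assumptions, the footer values of Table 22 (8.44 s and 1.30 s) with T_R = 31.77 s give 6.49 and 17.95, which again match the tabulated entries for OSAH(16,2).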

Scene = “gears9”

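Minimum Testing Output: ∆, Σ, Θ

Columns: Line, Mnemonic Notation, $N_G$, $N_E$, $N_{EE}$, $N_{ER}$, $r_{ITM}$, $\tilde{N}_{TS}$, $\tilde{N}_{ETS}$, $\tilde{N}_{EETS}$, $T_B$, $T_R$, $\Theta_{APP}$, $\Theta_{rat}$, $\Theta_{RUN}$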

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25

naïve RSA spatmed-xyz(16,2) objmed-xyz(16,2) objmed(16,2) OSAH(16,2) OSAH-RMI(16,2) OSAH-xyz(16,2) OSAH(8,1) OSAH(8,2) OSAH(16,1) OSAH(16,2) OSAH(24,1) OSAH(24,2) OSAH(atc) OSAH2(atc) OSAH+LC(atc) OSAH+TPC(atc) OSAH+TPC+LC(atc) OSAH+LC(16,1) OSAH+TPC(16,1) OSAH+TPC+LC(16,1) OSAH+PR(atc) OSAH+SC(atc) OSAH+GCM(atc) OSAH+GCM2(atc) OSAH+GCM3(atc)

0 14427 57571 65535 21814 21814 16173 155 155 23474 21814 864109 498216 392145 425150 425416 271549 361216 23486 24355 24359 392145 399893 606852 997410 592775

0 14428 57572 65536 21815 21815 16174 156 156 23475 21815 864110 498217 392146 425151 425417 271550 361217 23487 24356 24360 392146 399894 606853 997411 592776

1 1760 4708 0 4114 4114 2576 0 0 4114 4114 37583 17194 16164 19119 17751 10789 12355 4126 4857 4861 16164 18269 26635 43167 27821

106435 344324 492923 386404 163635 163635 151279 109968 109968 163747 163635 1386750 1081377 757083 811415 838913 618988 811187 163747 164056 164056 603661 757150 1056116 1812599 1086071

– 79.83 198.54 187.29 19.53 19.53 27.50 1128.00 1128.00 18.77 19.53 6.52 8.96 7.56 7.53 7.42 8.40 8.02 18.77 18.48 18.48 7.03 7.51 7.71 7.83 13.27

0 20.55 74.87 79.83 19.97 19.97 25.01 9.59 9.59 20.49 19.97 27.47 24.91 26.16 27.58 26.43 25.37 25.97 20.49 20.57 20.57 26.35 26.18 28.26 30.39 36.44

0 2.76 14.40 18.06 2.62 2.62 4.04 1.56 1.56 2.66 2.62 3.65 3.29 3.41 3.55 3.44 3.30 3.37 2.66 2.68 2.68 3.52 3.41 3.74 4.04 4.60

0 0.74 1.99 0.00 0.48 0.48 2.08 0.00 0.00 0.48 0.48 0.76 0.58 0.66 0.79 0.67 0.63 0.63 0.48 0.50 0.50 0.68 0.67 0.54 0.70 0.84

0.22 4.12 16.71 21.34 19.36 19.90 12.91 14.28 14.30 20.02 19.36 67.35 46.74 38.07 54.57 41.07 33.68 47.72 22.69 23.20 23.44 106.75 79.88 1982.32 4111.98 2257.84

129294.00 45.91 127.53 130.08 31.77 32.83 33.93 560.61 563.81 30.03 31.77 42.04 35.97 31.51 34.85 33.57 28.37 33.15 27.96 27.83 29.68 35.90 33.99 38.46 47.49 41.25

6.49 6.49 6.49 6.49 6.49 6.49 6.49 6.49 6.49 6.49 6.49 6.49 6.49 6.49 6.49 6.49 6.49 6.49 6.49 6.49 6.49 6.49 6.49 6.49 6.49 6.49

1.0 0.93 0.91 0.89 0.78 0.78 0.80 1.00 1.00 0.77 0.78 0.46 0.56 0.51 0.50 0.50 0.54 0.53 0.77 0.76 0.76 0.48 0.51 0.50 0.48 0.57

99456.92 28.82 91.61 93.57 17.95 18.76 19.61 424.75 427.21 16.61 17.95 25.85 21.18 17.75 20.32 19.33 15.33 19.01 15.02 14.92 16.34 21.12 19.65 23.09 30.04 25.24

26 27 28 29 30 31 32

OSAH+PAR(atc) PARSAH+PAR(atc) OSAH+PER(atc) PERSAH+PER(atc) SPHSAH+PER(atc) OSAH+SPH(atc) SPHSAH+SPH(atc)

392145 408406 392145 472765 481568 392145 481568

392146 408407 392146 472766 481569 392146 481569

16164 17952 16164 20764 37096 16164 37096

757083 770637 757083 872136 1048342 757083 1048342

2.79 2.67 3.13 3.04 6.58 3.12 6.52

18.24 18.00 20.87 20.75 28.49 20.56 28.42

2.46 2.42 2.62 2.65 3.79 2.59 3.80

0.39 0.41 0.44 0.51 0.98 0.43 1.02

43.20 49.98 44.93 2379.36 133.25 45.11 136.83

12.56 14.30 13.55 13.20 16.93 13.27 18.80

6.55 6.55 6.05 6.05 6.05 6.03 6.03

0.89 0.91 0.89 0.89 0.89 0.89 0.90

15.10 18.10 15.81 15.24 21.26 14.70 23.34

33 34 35 36 37 38 39 40 41 42 43 44 45 46 47

OSAH+TAseq (16,2) OSAH+TAA rec (16,2) OSAH+TAB rec (16,2) OSAH+TASNL (16,2) OSAH+TANLT (16,2) OSAH+TAseq (18,2) OSAH+TAA rec (18,2) OSAH+TAB rec (18,2) OSAH+TASNL (18,2) OSAH+TANLT (18,2) OSAH+TAseq (atc) OSAH+TAA rec (atc) OSAH+TAB rec (atc) OSAH+TASNL (atc) OSAH+TANLT (atc)

21814 21814 21814 21814 106625 57821 57821 57821 57821 311851 392145 392145 392145 392145 1656236

21815 21815 21815 21815 21815 57822 57822 57822 57822 57822 392146 392146 392146 392146 392146

4114 4114 4114 4114 4114 4647 4647 4647 4647 4647 16164 16164 16164 16164 16164

163635 163635 163635 163635 163635 253474 253474 253474 253474 253474 757083 757083 757083 757083 757083

26.06 19.53 19.53 26.07 26.07 17.20 12.72 12.72 17.20 17.20 9.70 7.56 7.56 9.71 9.71

38.74 19.97 19.97 18.03 17.29 46.19 21.96 21.96 19.93 18.95 60.34 26.16 26.16 23.16 21.35

2.70 2.62 2.62 2.70 2.70 3.00 2.89 2.89 3.00 3.00 3.49 3.41 3.41 3.49 3.49

0.49 0.48 0.48 0.49 0.49 0.50 0.49 0.49 0.50 0.50 0.67 0.66 0.66 0.67 0.67

15.81 15.94 16.00 16.78 16.96 17.18 17.38 17.16 18.92 20.48 35.25 33.16 33.56 45.09 67.40

32.18 29.81 27.73 25.48 26.42 27.75 27.78 25.63 25.71 35.13 41.06 34.35 30.69 44.82 57.12

6.49 6.49 6.49 6.49 6.49 6.49 6.49 6.49 6.49 6.49 6.49 6.49 6.49 6.49 6.49

0.63 0.70 0.78 0.88 0.84 0.61 0.60 0.68 0.68 0.44 0.35 0.44 0.51 0.31 0.23

18.26 16.44 14.84 13.11 13.83 14.85 14.88 13.22 13.28 20.53 25.09 19.93 17.12 27.98 37.45

48 49 50 51 52 53 54 55 56 57 58

BVH O84 O89 BSP O93 UG AG HUG RG O84A KD

13245 7461 7461 14427 7461 0 12 1 20325 14601 21814

70137 52228 52228 14428 52228 528384 285159 8712 659907 102208 21815

0 16228 16228 1760 16228 420540 83202 5600 184514 25372 4114

106435 583908 583908 344324 583908 523636 879690 238592 4277637 709952 163635

456.47 56.62 54.38 79.83 41.55 25.16 28.40 270.82 45.72 29.41 19.53

459.87 22.89 18.62 20.55 14.26 22.64 7.47 7.14 17.57 32.77 19.97

340.89 4.13 4.13 2.76 5.04 22.64 5.73 6.14 15.16 5.99 2.62

0.00 1.89 1.89 0.74 3.26 19.67 0.48 4.12 11.38 3.47 0.48

160.09 5.22 4.81 5.93 4.67 7.78 306.02 2.43 15.10 22.52 22.97

10934.30 60.22 55.77 112.41 56.16 44.49 1308.26 184.99 134.40 54.41 40.59

– – – – – – – – – – –

– – – – – – – – – – –

– – – – – – – – – – –

Table 22: Experimental results for scene “gears9”.

$N = 106435$; $TP_D$: $N_{prim} = 263169$, $N_{hit} = 263169$, $N^{hit}_{prim} = 247173$, $N_{sec} = 163990$, $N^{hit}_{sec} = 111495$, $N_{shad} = 1365418$, $N^{hit}_{shad} = 365632$, $T^{MIN}_{R}\,[s] = 9.74$, $T_{app}\,[s] = 8.44$, $T^{MIN}_{RSA}\,[s] = 1.30$.

Scene = “jacks5”

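Minimum Testing Output: ∆, Σ, Θ

Columns: Line, Mnemonic Notation, $N_G$, $N_E$, $N_{EE}$, $N_{ER}$, $r_{ITM}$, $\tilde{N}_{TS}$, $\tilde{N}_{ETS}$, $\tilde{N}_{EETS}$, $T_B$, $T_R$, $\Theta_{APP}$, $\Theta_{rat}$, $\Theta_{RUN}$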

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25

naïve RSA spatmed-xyz(16,2) objmed-xyz(16,2) objmed(16,2) OSAH(16,2) OSAH-RMI(16,2) OSAH-xyz(16,2) OSAH(8,1) OSAH(8,2) OSAH(16,1) OSAH(16,2) OSAH(24,1) OSAH(24,2) OSAH(atc) OSAH2(atc) OSAH+LC(atc) OSAH+TPC(atc) OSAH+TPC+LC(atc) OSAH+LC(16,1) OSAH+TPC(16,1) OSAH+TPC+LC(16,1) OSAH+PR(atc) OSAH+SC(atc) OSAH+GCM(atc) OSAH+GCM2(atc) OSAH+GCM3(atc)

0 25669 64941 65146 36148 36148 36456 249 249 39427 36148 622889 350355 193204 291149 288425 128243 250710 39430 39609 39614 193204 200237 372561 888130 500942

0 25670 64942 65147 36149 36149 36457 250 250 39428 36149 622890 350356 193205 291150 288426 128244 250711 39431 39610 39615 193205 200238 372562 888131 500943

1 1678 940 67 2911 2911 2868 18 18 5273 2911 52924 11861 39615 26100 39623 27301 27328 5276 5443 5448 39615 106481 46743 8329 28712

42129 172204 267797 242024 132833 132833 138336 53569 53569 133547 132833 966375 737969 288851 438877 542868 222654 534275 133547 133601 133601 152791 140781 576245 1640883 887100

– 52.52 76.95 78.18 33.52 33.52 35.88 756.18 756.18 31.55 33.52 19.48 24.67 20.08 21.10 20.83 21.77 22.57 31.55 31.54 31.54 12.71 8.16 19.40 24.40 25.71

0 36.70 61.14 66.66 38.63 38.63 38.77 13.94 13.94 39.97 38.63 65.98 54.98 53.98 61.02 57.71 50.00 55.61 39.97 40.00 40.00 54.33 53.38 60.89 70.04 71.16

0 6.41 11.54 12.88 6.68 6.68 6.72 2.66 2.66 6.92 6.68 11.47 9.57 9.40 10.25 10.07 8.69 9.70 6.92 6.93 6.93 9.47 9.26 10.55 11.88 11.57

0 1.73 0.48 0.01 1.81 1.81 1.68 0.45 0.45 2.57 1.81 4.01 2.16 3.83 3.25 3.83 3.54 3.54 2.57 2.58 2.58 3.83 6.00 3.83 2.21 3.18

0.09 1.55 6.77 10.67 10.23 10.20 6.05 6.55 6.55 10.28 10.23 25.83 19.58 14.98 22.83 19.19 14.31 21.86 11.85 12.51 12.55 26.34 45.07 1111.27 2736.52 1584.85

26711.80 28.38 41.88 44.81 23.39 23.26 24.23 235.41 235.31 22.81 23.39 24.32 24.17 21.76 23.80 23.09 21.47 22.70 22.28 22.27 22.29 18.43 15.48 23.14 27.82 27.08

7.79 7.79 7.79 7.79 7.79 7.79 7.79 7.79 7.79 7.79 7.79 7.79 7.79 7.79 7.79 7.79 7.79 7.79 7.79 7.79 7.79 7.79 7.79 7.79 7.79 7.79

1.0 0.79 0.77 0.75 0.69 0.69 0.71 0.99 0.99 0.67 0.69 0.43 0.54 0.49 0.47 0.48 0.53 0.51 0.67 0.67 0.67 0.38 0.28 0.45 0.48 0.48

55649.58 51.33 79.46 85.56 40.94 40.67 42.69 482.65 482.44 39.73 40.94 42.88 42.56 37.54 41.79 40.31 36.94 39.50 38.62 38.60 38.65 30.60 24.46 40.42 50.17 48.62

26 27 28 29 30 31 32

OSAH+PAR(atc) PARSAH+PAR(atc) OSAH+PER(atc) PERSAH+PER(atc) SPHSAH+PER(atc) OSAH+SPH(atc) SPHSAH+SPH(atc)

193204 318387 193204 204890 525576 193204 525576

193205 318388 193205 204891 525577 193205 525577

39615 24304 39615 43813 70994 39615 70994

288851 3198608 288851 363136 1803037 288851 1803037

10.19 11.98 11.15 7.67 68.63 11.10 68.55

32.57 10.31 38.94 25.30 51.01 38.72 50.72

6.09 0.64 7.24 4.28 9.42 7.20 9.36

3.69 0.16 4.52 2.66 4.69 4.50 4.66

17.05 66.62 17.05 3176.03 253.52 17.11 253.71

3.42 3.10 4.10 3.29 10.85 4.53 11.09

4.09 4.09 5.83 5.83 5.83 7.16 7.16

0.40 0.78 0.40 0.47 0.76 0.43 0.76

11.45 10.00 16.94 12.44 54.44 16.68 51.21

33 34 35 36 37 38 39 40 41 42 43 44 45 46 47

OSAH+TAseq (16,2) OSAH+TAA rec (16,2) OSAH+TAB rec (16,2) OSAH+TASNL (16,2) OSAH+TANLT (16,2) OSAH+TAseq (18,2) OSAH+TAA rec (18,2) OSAH+TAB rec (18,2) OSAH+TASNL (18,2) OSAH+TANLT (18,2) OSAH+TAseq (atc) OSAH+TAA rec (atc) OSAH+TAB rec (atc) OSAH+TASNL (atc) OSAH+TANLT (atc)

36148 36148 36148 36148 312886 88006 88006 88006 88006 717084 193204 193204 193204 193204 1421499

36149 36149 36149 36149 36149 88007 88007 88007 88007 88007 193205 193205 193205 193205 193205

2911 2911 2911 2911 2911 7223 7223 7223 7223 7223 39615 39615 39615 39615 39615

132833 132833 132833 132833 132833 209203 209203 209203 209203 209203 288851 288851 288851 288851 288851

34.39 33.52 33.52 34.39 34.39 26.18 25.49 25.49 26.19 26.19 20.74 20.08 20.08 20.75 20.75

103.31 38.63 38.63 29.66 24.66 132.74 45.43 45.43 35.10 28.71 170.30 53.98 53.98 41.62 33.72

6.78 6.68 6.68 6.78 6.78 8.02 7.89 7.89 8.02 8.02 9.54 9.40 9.40 9.55 9.55

1.81 1.81 1.81 1.81 1.81 2.04 2.04 2.04 2.05 2.05 3.83 3.83 3.83 3.83 3.83

8.58 8.60 8.59 9.21 9.51 10.17 10.01 10.05 11.76 12.46 12.82 12.54 12.55 16.31 20.80

28.29 24.01 22.63 24.63 24.54 28.75 23.03 21.44 23.40 23.18 30.22 22.92 20.78 23.00 37.59

7.79 7.79 7.79 7.79 7.79 7.79 7.79 7.79 7.79 7.79 7.79 7.79 7.79 7.79 7.79

0.53 0.64 0.69 0.62 0.63 0.42 0.54 0.59 0.53 0.54 0.32 0.44 0.49 0.43 0.25

51.15 42.23 39.35 43.52 43.33 52.10 40.19 36.88 40.96 40.50 55.17 39.96 35.50 40.12 70.52

48 49 50 51 52 53 54 55 56 57 58

BVH O84 O89 BSP O93 UG AG HUG RG O84A KD

5214 14001 14001 25669 14001 0 4511 131 21873 22039 36148

27779 98008 98008 25670 98008 210145 194294 126560 324096 154274 36149

0 12164 12164 1678 12164 139360 72699 83138 30464 17066 2911

42129 329450 329450 172204 329450 299206 360849 194877 1493122 423070 132833

676.63 36.09 36.05 52.52 34.63 38.81 62.72 59.80 63.32 32.27 33.52

759.71 56.84 36.06 36.70 33.80 18.95 35.50 25.33 19.36 63.54 38.63

551.99 8.87 8.85 6.41 13.70 18.95 31.43 20.92 15.05 9.87 6.68

0.00 3.06 3.06 1.73 8.10 13.49 20.21 13.06 7.40 3.39 1.81

23.41 2.15 1.89 2.09 1.92 2.80 191.34 3.56 3.87 8.74 12.47

5106.45 39.53 33.80 39.28 40.03 30.69 64.66 55.27 50.37 40.20 31.82

– – – – – – – – – – –

– – – – – – – – – – –

– – – – – – – – – – –

Table 23: Experimental results for scene “jacks5”.

$N = 42129$; $TP_D$: $N_{prim} = 263169$, $N_{hit} = 231363$, $N^{hit}_{prim} = 115562$, $N_{sec} = 281028$, $N^{hit}_{sec} = 197780$, $N_{shad} = 218826$, $N^{hit}_{shad} = 116281$, $T^{MIN}_{R}\,[s] = 4.22$, $T_{app}\,[s] = 3.74$, $T^{MIN}_{RSA}\,[s] = 0.48$.

Scene = “lattice29”

Minimum Testing Output: ∆, Σ, Θ

Columns: Line, Mnemonic Notation, $N_G$, $N_E$, $N_{EE}$, $N_{ER}$, $r_{ITM}$, $\tilde{N}_{TS}$, $T_B$, $T_R$, $\Theta_{APP}$, $\Theta_{rat}$, $\Theta_{RUN}$

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25

naïve RSA spatmed-xyz(16,2) objmed-xyz(16,2) objmed(16,2) OSAH(16,2) OSAH-RMI(16,2) OSAH-xyz(16,2) OSAH(8,1) OSAH(8,2) OSAH(16,1) OSAH(16,2) OSAH(24,1) OSAH(24,2) OSAH(atc) OSAH2(atc) OSAH+LC(atc) OSAH+TPC(atc) OSAH+TPC+LC(atc) OSAH+LC(16,1) OSAH+TPC(16,1) OSAH+TPC+LC(16,1) OSAH+PR(atc) OSAH+SC(atc) OSAH+GCM(atc) OSAH+GCM2(atc) OSAH+GCM3(atc)

0 65535 65223 65535 64531 64531 64433 255 255 65159 64531 799987 188841 710837 710635 713210 574762 608309 65159 65159 65159 710837 764409 704739 795680 697883

0 65536 65224 65536 64532 64532 64434 256 256 65160 64532 799988 188842 710838 710636 713211 574763 608310 65160 65160 65160 710838 764410 704740 795681 697884

1 0 2640 0 4969 4969 2668 0 0 5597 4969 75291 23435 75291 53642 75291 75161 75168 5596 5592 5592 75291 78072 74907 40160 47683

105300 330512 243821 233480 216163 216163 221091 120512 120512 216163 216163 881297 322007 792147 813609 794520 656202 689767 216164 216168 216168 791392 842938 786433 912121 806865

– 28.28 21.18 20.73 19.62 19.62 18.80 708.01 708.01 19.61 19.62 5.01 11.57 5.11 6.24 5.11 5.66 5.40 18.93 20.03 20.03 5.11 4.98 4.81 7.19 7.73

0 35.71 34.70 35.70 36.86 36.86 35.43 11.58 11.58 36.87 36.86 54.96 42.61 54.80 54.50 54.80 53.99 53.81 36.86 36.88 36.88 54.82 54.49 52.26 52.24 52.79

0 5.03 4.78 5.17 5.29 5.29 4.98 1.53 1.53 5.29 5.29 7.63 5.87 7.62 7.60 7.62 7.58 7.43 5.23 5.29 5.29 7.63 7.57 7.45 7.52 7.84

0 0.00 0.02 0.00 0.06 0.06 0.02 0.00 0.00 0.06 0.06 3.50 0.76 3.50 2.55 3.50 3.46 3.47 0.07 0.07 0.07 3.51 3.53 3.62 1.58 1.89

0.22 3.75 12.70 19.46 19.39 19.39 12.65 14.40 14.39 19.46 19.39 37.82 22.80 34.46 46.59 36.84 32.65 42.59 23.43 23.69 23.61 50.74 142.20 1892.45 2130.64 1860.60

129143.00 53.39 40.54 41.70 42.76 42.89 41.71 835.01 831.06 43.29 42.76 35.69 34.30 31.86 34.37 31.90 32.72 30.78 41.80 43.39 43.20 31.84 33.20 32.18 35.18 35.12

8.73 8.73 8.73 8.73 8.73 8.73 8.73 8.73 8.73 8.73 8.73 8.73 8.73 8.73 8.73 8.73 8.73 8.73 8.73 8.73 8.73 8.73 8.73 8.73 8.73 8.73

1.0 0.72 0.67 0.66 0.64 0.64 0.64 1.00 1.00 0.64 0.64 0.23 0.47 0.23 0.27 0.23 0.26 0.25 0.63 0.64 0.64 0.23 0.23 0.23 0.31 0.32

107619.17 35.76 25.05 26.02 26.90 27.01 26.02 687.11 683.82 27.34 26.90 21.01 19.85 17.82 19.91 17.85 18.53 16.92 26.10 27.42 27.27 17.80 18.93 18.08 20.58 20.53

26 27 28 29 30 31 32

OSAH+PAR(atc) PARSAH+PAR(atc) OSAH+PER(atc) PERSAH+PER(atc) SPHSAH+PER(atc) OSAH+SPH(atc) SPHSAH+SPH(atc)

710837 371502 710837 242344 293285 710837 293285

710838 371503 710838 242345 293286 710838 293286

75291 114060 75291 42679 62804 75291 62804

792147 412682 792147 341976 395893 792147 395893

37.94 44.15 4.89 1971.70 11.07 4.87 10.69

190.83 95.94 65.95 17.40 48.59 65.77 48.21

35.81 24.04 10.02 3.06 8.36 9.98 8.29

22.44 10.08 5.26 0.63 2.18 5.23 2.18

39.62 40.72 39.74 997.97 73.71 39.34 73.69

12.57 10.08 7.33 577.37 7.96 8.74 8.15

8.56 8.56 9.28 9.28 9.28 7.86 7.86

0.34 0.57 0.33 1.00 0.58 0.49 0.58

61.28 47.44 9.51 1471.15 11.13 9.98 8.78

33 34 35 36 37 38 39 40 41 42 43 44 45 46 47

OSAH+TAseq (16,2) OSAH+TAA rec (16,2) OSAH+TAB rec (16,2) OSAH+TASNL (16,2) OSAH+TANLT (16,2) OSAH+TAseq (18,2) OSAH+TAA rec (18,2) OSAH+TAB rec (18,2) OSAH+TASNL (18,2) OSAH+TANLT (18,2) OSAH+TAseq (atc) OSAH+TAA rec (atc) OSAH+TAB rec (atc) OSAH+TASNL (atc) OSAH+TANLT (atc)

64529 64529 64529 64529 148943 142744 142744 142744 142744 519230 721357 721357 721357 721357 -

64530 64530 64530 64530 64530 142745 142745 142745 142745 142745 721358 721358 721358 721358 -

4966 4966 4966 4966 4966 22200 22200 22200 22200 22200 75310 75310 75310 75310 -

216164 216164 216164 216164 216164 277145 277145 277145 277145 277145 802648 802648 802648 802648 -

19.36 18.94 18.94 19.36 19.36 12.57 12.29 12.29 12.57 12.57 5.18 5.00 5.00 5.18 -

93.30 36.85 36.85 18.93 18.37 105.44 41.23 41.23 23.12 22.33 154.48 54.68 54.68 32.33 -

5.32 5.22 5.22 5.32 5.32 5.75 5.64 5.64 5.75 5.75 7.61 7.49 7.49 7.61 -

0.07 0.07 0.07 0.07 0.07 0.71 0.70 0.70 0.71 0.71 3.48 3.48 3.48 3.48 -

16.87 16.65 16.70 18.04 18.19 18.68 18.42 18.54 21.43 22.37 33.48 28.85 29.30 119.94 -

50.22 44.15 41.24 41.91 41.75 43.02 37.49 33.77 33.14 35.86 60.08 35.35 30.09 44.79 -

8.73 8.73 8.73 8.73 8.73 8.73 8.73 8.73 8.73 8.73 8.73 8.73 8.73 8.73 -

0.49 0.58 0.63 0.62 0.62 0.35 0.42 0.49 0.50 0.45 0.09 0.18 0.23 0.13 -

33.12 28.06 25.63 26.19 26.06 27.12 22.51 19.41 18.88 21.15 41.33 20.73 16.34 28.59 -

48 49 50 51 52 53 54 55 56 57 58

BVH O84 O89 BSP O93 UG AG HUG RG O84A KD

11839 37449 37449 65535 37449 0 0 1 1 37449 64531

69959 262144 262144 65536 262144 531441 0 166375 106032 262144 64532

0 26944 26944 0 26944 138248 0 0 0 14408 4969

105300 561600 561600 330512 561600 749107 0 1596184 1602704 507656 216163

490.70 20.31 20.21 28.28 17.34 14.09 0.00 78.06 84.67 17.90 19.62

758.26 51.66 32.17 35.71 29.71 9.23 0.00 7.37 5.87 50.76 36.86

583.97 7.38 7.32 5.03 11.87 9.23 0.00 6.37 4.87 7.25 5.29

0.00 0.29 0.29 0.00 5.67 2.81 0.00 0.00 0.00 0.07 0.06

294.53 4.82 4.51 5.51 4.44 8.54 0.00 5.53 4.23 18.48 24.42

13803.80 71.58 61.47 79.71 72.89 43.65 1.00 157.03 140.24 66.32 71.33

– – – – – – – – – – –

– – – – – – – – – – –

– – – – – – – – – – –

N˜ ET S

N˜ EET S

Table 24: Experimental results for scene “lattice29”.

$N = 105300$; $TP_D$: $N_{prim} = 263169$, $N_{hit} = 263169$, $N^{hit}_{prim} = 262875$, $N_{sec} = 253547$, $N^{hit}_{sec} = 193215$, $N_{shad} = 1143451$, $N^{hit}_{shad} = 927614$, $T^{MIN}_{R}\,[s] = 11.68$, $T_{app}\,[s] = 10.48$, $T^{MIN}_{RSA}\,[s] = 1.2$.

Scene = “mount8”

Minimum Testing Output: ∆, Σ, Θ

Columns: Line, Mnemonic Notation, $N_G$, $N_E$, $N_{EE}$, $N_{ER}$, $r_{ITM}$, $\tilde{N}_{TS}$, $T_B$, $T_R$, $\Theta_{APP}$, $\Theta_{rat}$, $\Theta_{RUN}$

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25

naïve RSA spatmed-xyz(16,2) objmed-xyz(16,2) objmed(16,2) OSAH(16,2) OSAH-RMI(16,2) OSAH-xyz(16,2) OSAH(8,1) OSAH(8,2) OSAH(16,1) OSAH(16,2) OSAH(24,1) OSAH(24,2) OSAH(atc) OSAH2(atc) OSAH+LC(atc) OSAH+TPC(atc) OSAH+TPC+LC(atc) OSAH+LC(16,1) OSAH+TPC(16,1) OSAH+TPC+LC(16,1) OSAH+PR(atc) OSAH+SC(atc) OSAH+GCM(atc) OSAH+GCM2(atc) OSAH+GCM3(atc)

0 7121 62451 65535 13321 13321 14912 161 161 13766 13321 253236 106592 199904 268007 199934 199330 199368 13776 13859 13868 199904 216996 242670 340273 240175

0 7122 62452 65536 13322 13322 14913 162 162 13767 13322 253237 106593 199905 268008 199935 199331 199369 13777 13860 13869 199905 216997 242671 340274 240176

1 1895 2156 0 3247 3247 3443 30 30 3595 3247 112343 22695 94141 101388 94157 93448 93465 3605 3652 3661 94141 104728 111817 106521 87303

131076 227180 415452 136619 153865 153865 180978 134845 134845 153961 153865 217276 160289 182738 262321 182770 183062 183094 153961 154140 154140 164729 190004 207413 379434 267534

– 58.19 222.20 324.91 18.63 18.63 22.80 921.70 921.70 18.17 18.63 5.42 6.37 6.11 5.83 5.63 6.30 5.80 18.17 18.19 18.18 6.08 5.53 6.07 5.93 6.16

0 22.19 83.55 420.70 18.90 18.90 20.65 13.14 13.14 19.78 18.90 22.63 20.84 20.84 25.69 21.79 20.20 21.31 19.78 19.83 19.83 20.98 21.02 22.29 23.74 26.23

0 4.23 16.01 69.76 3.44 3.44 3.60 2.60 2.60 3.54 3.44 4.00 3.76 3.77 4.14 3.88 3.67 3.79 3.54 3.54 3.54 3.81 3.74 3.69 4.13 4.32

0 1.05 0.27 0.00 1.23 1.23 1.55 0.64 0.64 1.24 1.23 1.66 1.44 1.63 1.86 1.63 1.63 1.63 1.24 1.24 1.25 1.66 1.76 1.86 1.54 1.84

0.29 3.76 17.34 24.44 24.31 24.39 16.02 18.57 18.53 24.39 24.31 30.18 27.36 29.01 41.45 29.15 30.54 34.64 28.81 28.81 28.84 31.55 71.46 767.14 1145.86 792.90

106528.00 27.68 99.02 282.94 16.21 15.99 17.33 458.56 459.26 16.64 16.21 12.92 13.09 12.90 13.44 12.85 12.58 11.98 15.19 15.22 15.54 12.96 12.68 12.85 13.11 14.34

16.94 16.94 16.94 16.94 16.94 16.94 16.94 16.94 16.94 16.94 16.94 16.94 16.94 16.94 16.94 16.94 16.94 16.94 16.94 16.94 16.94 16.94 16.94 16.94 16.94 16.94

1.0 0.72 0.73 0.44 0.50 0.50 0.52 0.99 0.99 0.48 0.50 0.19 0.23 0.23 0.19 0.21 0.24 0.21 0.48 0.48 0.48 0.22 0.21 0.21 0.20 0.19

343638.71 72.35 302.48 895.77 35.35 34.65 38.97 1462.29 1464.55 36.74 35.35 24.74 25.29 24.68 26.42 24.52 23.65 21.71 32.06 32.16 33.19 24.87 23.97 24.52 25.35 29.32

26 27 28 29 30 31 32

OSAH+PAR(atc) PARSAH+PAR(atc) OSAH+PER(atc) PERSAH+PER(atc) SPHSAH+PER(atc) OSAH+SPH(atc) SPHSAH+SPH(atc)

199904 199293 199904 189614 199904 199904 199904

199905 199294 199905 189615 199905 199905 199905

94141 94879 94141 89494 94141 94141 94141

182738 180812 182738 178835 182738 182738 182738

4.46 4.42 4.43 5.14 4.43 4.42 4.42

29.63 29.84 30.07 29.23 30.07 29.54 29.54

5.50 5.55 5.83 5.92 5.83 5.74 5.74

4.50 4.56 4.04 4.21 4.04 3.98 3.98

34.27 36.79 34.35 1086.66 43.82 34.35 43.42

2.38 2.37 2.75 2.82 2.74 3.04 3.03

12.33 12.33 17.83 17.83 17.83 15.22 15.22

0.14 0.13 0.15 0.21 0.15 0.16 0.16

27.33 27.17 28.00 29.17 27.83 18.56 18.44

33 34 35 36 37 38 39 40 41 42 43 44 45 46 47

OSAH+TAseq (16,2) OSAH+TAA rec (16,2) OSAH+TAB rec (16,2) OSAH+TASNL (16,2) OSAH+TANLT (16,2) OSAH+TAseq (18,2) OSAH+TAA rec (18,2) OSAH+TAB rec (18,2) OSAH+TASNL (18,2) OSAH+TANLT (18,2) OSAH+TAseq (atc) OSAH+TAA rec (atc) OSAH+TAB rec (atc) OSAH+TASNL (atc) OSAH+TANLT (atc)

13321 13321 13321 13321 83081 35967 35967 35967 35967 202751 199904 199904 199904 199904 866851

13322 13322 13322 13322 13322 35968 35968 35968 35968 35968 199905 199905 199905 199905 199905

3247 3247 3247 3247 3247 9251 9251 9251 9251 9251 94141 94141 94141 94141 94141

153865 153865 153865 153865 153865 157995 157995 157995 157995 157995 182738 182738 182738 182738 182738

20.04 18.64 18.63 20.05 20.05 10.90 10.13 10.13 10.90 10.90 6.58 6.11 6.11 6.58 6.58

34.36 18.90 18.90 12.68 11.42 37.85 19.77 19.77 13.38 11.98 43.72 20.84 20.84 14.13 12.60

3.44 3.44 3.44 3.44 3.44 3.58 3.58 3.58 3.58 3.58 3.77 3.77 3.77 3.77 3.77

1.23 1.23 1.23 1.23 1.23 1.34 1.34 1.34 1.34 1.34 1.63 1.63 1.63 1.63 1.63

20.29 20.23 20.24 20.45 20.49 21.12 21.12 21.20 21.79 21.93 24.08 23.75 23.81 28.16 30.10

18.18 16.03 15.04 14.32 13.95 16.01 14.10 13.03 12.71 11.81 15.30 13.06 12.00 10.80 16.60

16.94 16.94 16.94 16.94 16.94 16.94 16.94 16.94 16.94 16.94 16.94 16.94 16.94 16.94 16.94

0.38 0.45 0.50 0.54 0.56 0.25 0.30 0.34 0.35 0.40 0.15 0.20 0.23 0.28 0.14

41.71 34.77 31.58 29.26 28.06 34.71 28.55 25.10 24.06 21.16 32.42 25.19 21.77 17.90 36.61

48 49 50 51 52 53 54 55 56 57 58

BVH O84 O89 BSP O93 UG AG HUG RG O84A KD

19371 3450 3450 7121 3450 0 0 1 1 8048 13321

88075 24151 24151 7122 24151 649522 0 2925 129792 56337 13322

0 11599 11599 1895 11599 614359 0 2328 116818 20760 3247

131076 301800 301800 227180 301809 393479 0 220930 685163 493257 153865

531.51 34.28 34.35 58.19 32.41 37.48 0.00 407.04 97.90 22.09 18.63

713.61 26.68 19.53 22.19 19.37 38.27 0.00 7.81 23.23 28.71 18.90

500.34 5.19 5.19 4.23 8.02 38.27 0.00 6.82 22.24 5.63 3.44

0.00 1.60 1.60 1.05 4.57 27.56 0.00 3.62 15.09 2.54 1.23

189.59 4.36 4.36 4.69 4.20 6.37 0.00 2.59 1.87 21.19 25.82

10436.00 35.60 31.82 53.49 36.61 37.94 1.00 186.87 56.14 30.69 25.14

– – – – – – – – – – –

– – – – – – – – – – –

– – – – – – – – – – –

N˜ ET S

N˜ EET S

Table 25: Experimental results for scene “mount8”.

$N = 131076$; $TP_D$: $N_{prim} = 263169$, $N_{hit} = 256915$, $N^{hit}_{prim} = 145240$, $N_{sec} = 707764$, $N^{hit}_{sec} = 461009$, $N_{shad} = 290405$, $N^{hit}_{shad} = 20438$, $T^{MIN}_{R}\,[s] = 5.56$, $T_{app}\,[s] = 5.25$, $T^{MIN}_{RSA}\,[s] = 0.31$.

Scene = “rings17”

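Minimum Testing Output: ∆, Σ, Θ

Columns: Line, Mnemonic Notation, $N_G$, $N_E$, $N_{EE}$, $N_{ER}$, $r_{ITM}$, $\tilde{N}_{TS}$, $\tilde{N}_{ETS}$, $\tilde{N}_{EETS}$, $T_B$, $T_R$, $\Theta_{APP}$, $\Theta_{rat}$, $\Theta_{RUN}$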

0 18516 65529 65521 27155 27155 30217 214 214 27282 27155 1104245 671174 304937 782981 443984 185686 415491 27287 27282 27287 304937 346502 723135 2066109 759588

0 18517 65530 65522 27156 27156 30218 215 215 27283 27156 1104246 671175 304938 782982 443985 185687 415492 27288 27283 27288 304938 346503 723136 2066110 759589

1 1248 323 34 4828 4828 2798 34 34 4918 4828 174514 60011 54285 92071 54291 32735 32780 4931 4917 4922 54285 103509 140521 54421 99396

107101 306974 488212 408570 201209 201209 216262 114951 114951 201228 201209 1703735 1389504 615672 1347114 1052788 448858 1166698 201228 201262 201262 415765 438117 1199435 4462230 1425964

– 101.10 230.62 166.27 39.12 39.12 41.21 1460.10 1460.10 39.09 39.12 16.55 17.19 17.93 18.03 17.75 19.85 19.64 39.04 39.09 39.09 14.86 10.48 16.00 22.36 902.25

0 50.36 89.41 81.06 40.25 40.25 44.34 15.82 15.82 40.28 40.25 66.00 57.92 54.64 70.28 57.85 51.82 57.12 40.28 40.27 40.28 55.05 55.97 63.80 79.27 58.92

0 9.12 17.19 15.94 6.88 6.88 7.84 3.34 3.34 6.89 6.88 11.35 9.65 9.33 11.73 9.77 8.85 9.60 6.89 6.89 6.89 9.42 9.51 10.80 13.23 7.66

0 3.85 0.21 0.01 3.70 3.70 4.25 0.75 0.75 3.71 3.70 4.97 4.18 4.55 5.05 4.55 4.36 4.36 3.71 3.71 3.71 4.56 5.91 5.09 4.70 3.69

0.23 3.63 16.10 23.69 21.23 21.22 13.68 15.32 15.30 21.19 21.23 56.93 43.38 33.04 59.81 39.47 30.25 46.30 24.18 24.78 24.80 53.26 87.78 2182.55 7393.55 2492.78

231049.00 145.11 355.92 260.66 69.33 69.94 74.08 1795.05 1813.31 69.67 69.33 70.27 54.19 52.27 59.37 53.85 53.69 55.14 68.46 68.41 68.44 48.43 43.00 56.09 104.61 1173.18

5.58 5.58 5.58 5.58 5.58 5.58 5.58 5.58 5.58 5.58 5.58 5.58 5.58 5.58 5.58 5.58 5.58 5.58 5.58 5.58 5.58 5.58 5.58 5.58 5.58 5.58

1.0 0.87 0.89 0.87 0.76 0.76 0.75 1.00 1.00 0.76 0.76 0.45 0.49 0.51 0.45 0.50 0.55 0.52 0.76 0.76 0.76 0.46 0.38 0.45 0.47 0.98

132028.00 77.34 197.81 143.37 34.04 34.39 36.75 1020.17 1030.60 34.23 34.04 34.58 25.39 24.29 28.35 25.19 25.10 25.93 33.54 33.51 33.53 22.10 18.99 26.47 54.20 664.81

OSAH+PAR(atc) PARSAH+PAR(atc) OSAH+PER(atc) PERSAH+PER(atc) SPHSAH+PER(atc) OSAH+SPH(atc) SPHSAH+SPH(atc)

304937 278251 304937 165924 480358 304937 480358

304938 278252 304938 165925 480359 304938 480359

54285 32148 54285 46239 96766 54285 96766

615672 4029917 615672 416731 1472702 615672 1472702

19.37 36.45 11.62 312.51 64.70 11.61 68.72

77.76 19.34 71.19 46.13 111.80 70.46 110.89

14.82 1.19 13.84 9.29 21.64 13.73 21.51

6.25 0.06 8.41 5.64 14.35 8.41 14.35

37.97 92.79 37.99 3801.09 256.53 37.87 256.39

13.36 20.58 10.14 82.71 26.95 10.29 27.80

5.41 5.41 5.26 5.26 5.26 5.37 5.37

0.66 0.95 0.55 0.97 0.79 0.55 0.80

17.24 29.47 12.22 137.34 41.21 11.23 39.47

33 34 35 36 37 38 39 40 41 42 43 44 45 46 47

OSAH+$TA_{seq}$(16,2) OSAH+$TA^{A}_{rec}$(16,2) OSAH+$TA^{B}_{rec}$(16,2) OSAH+$TA_{SNL}$(16,2) OSAH+$TA_{NLT}$(16,2) OSAH+$TA_{seq}$(18,2) OSAH+$TA^{A}_{rec}$(18,2) OSAH+$TA^{B}_{rec}$(18,2) OSAH+$TA_{SNL}$(18,2) OSAH+$TA_{NLT}$(18,2) OSAH+$TA_{seq}$(atc) OSAH+$TA^{A}_{rec}$(atc) OSAH+$TA^{B}_{rec}$(atc) OSAH+$TA_{SNL}$(atc) OSAH+$TA_{NLT}$(atc)

27154 27154 27154 27154 145233 78599 78599 78599 78599 466434 304350 304350 304350 304350 1897332

27155 27155 27155 27155 27155 78600 78600 78600 78600 78600 304351 304351 304351 304351 304351

4835 4835 4835 4835 4835 13421 13421 13421 13421 13421 54206 54206 54206 54206 54206

201209 201209 201209 201209 201209 297456 297456 297456 297456 297456 614753 614753 614753 614753 614753

40.78 39.07 39.07 40.78 40.78 25.77 24.51 24.51 25.77 25.77 18.98 17.94 17.94 18.98 18.98

102.45 40.24 40.24 32.46 29.27 129.29 46.65 46.65 37.61 33.57 167.09 54.63 54.63 44.46 39.10

6.99 6.88 6.88 6.99 6.99 8.10 7.94 7.94 8.10 8.10 9.55 9.33 9.33 9.55 9.55

3.70 3.70 3.70 3.70 3.70 3.93 3.93 3.93 3.93 3.93 4.56 4.56 4.56 4.56 4.56

17.36 17.28 17.31 17.77 17.99 19.82 19.79 19.77 21.28 21.69 27.75 27.33 27.46 33.48 48.70

83.38 72.27 67.71 109.78 76.33 72.33 60.60 55.12 61.06 61.02 72.06 56.91 50.63 56.42 87.07

5.58 5.58 5.58 5.58 5.58 5.58 5.58 5.58 5.58 5.58 5.58 5.58 5.58 5.58 5.58

0.60 0.70 0.76 0.44 0.66 0.46 0.56 0.63 0.56 0.56 0.33 0.44 0.51 0.45 0.27

42.07 35.72 33.11 57.15 38.04 35.75 29.05 25.92 29.31 29.29 35.60 26.94 23.35 26.66 44.18

48 49 50 51 52 53 54 55 56 57 58

BVH O84 O89 BSP O93 UG AG HUG RG O84A KD

13107 10123 10123 18516 10123 0 44528 24356 26722 19177 27155

70607 70862 70862 18517 70862 536726 757129 174237 478957 134240 27156

0 7846 7846 1248 7846 417221 159833 30198 112574 17096 4828

107101 493503 493503 306974 493503 677753 1625581 791828 1526782 573705 201209

738.71 59.94 60.08 101.10 59.86 48.27 91.19 71.18 54.53 39.03 39.12

912.51 83.55 50.66 50.36 46.03 32.10 32.03 19.37 26.07 71.95 40.25

664.19 12.77 12.77 9.12 18.14 32.10 23.39 13.62 22.17 11.32 6.88

0.00 6.77 6.77 3.85 12.15 25.76 0.44 9.20 14.42 6.03 3.70

145.68 4.90 4.45 5.14 4.42 7.51 45.51 6.99 4.95 20.09 23.55

18030.50 168.69 149.94 234.40 168.63 116.81 371.98 170.66 144.67 128.81 106.61

– – – – – – – – – – –

– – – – – – – – – – –

– – – – – – – – – – –

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25

naïve RSA spatmed-xyz(16,2) objmed-xyz(16,2) objmed(16,2) OSAH(16,2) OSAH-RMI(16,2) OSAH-xyz(16,2) OSAH(8,1) OSAH(8,2) OSAH(16,1) OSAH(16,2) OSAH(24,1) OSAH(24,2) OSAH(atc) OSAH2(atc) OSAH+LC(atc) OSAH+TPC(atc) OSAH+TPC+LC(atc) OSAH+LC(16,1) OSAH+TPC(16,1) OSAH+TPC+LC(16,1) OSAH+PR(atc) OSAH+SC(atc) OSAH+GCM(atc) OSAH+GCM2(atc) OSAH+GCM3(atc)

26 27 28 29 30 31 32

Table 26: Experimental results for scene “rings17”.

$N = 107101$. TPD: $N_{prim} = 263169$, $N_{hit} = 263169$, $N^{hit}_{prim} = 263168$, $N_{sec} = 386859$, $N^{hit}_{sec} = 220131$, $N_{shad} = 1147577$, $N^{hit}_{shad} = 609678$, $T^{MIN}_{R}$ [s] $= 11.55$, $T_{app}$ [s] $= 9.76$, $T^{MIN}_{RSA}$ [s] $= 1.75$.

Appendix E

Scene = “sombrero4”

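Line: number of the tested algorithm. Minimum Testing Output (parameter groups ∆, Σ, Θ), Mnemonic Notation, one row of values per parameter in this order: $N_G$, $N_E$, $N_{EE}$, $N_{ER}$, $r_{ITM}$, $\tilde{N}_{TS}$, $\tilde{N}_{ETS}$, $\tilde{N}_{EETS}$, $T_B$, $T_R$, $\Theta_{APP}$, $\Theta_{rat}$, $\Theta_{RUN}$.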

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25

naïve RSA spatmed-xyz(16,2) objmed-xyz(16,2) objmed(16,2) OSAH(16,2) OSAH-RMI(16,2) OSAH-xyz(16,2) OSAH(8,1) OSAH(8,2) OSAH(16,1) OSAH(16,2) OSAH(24,1) OSAH(24,2) OSAH(atc) OSAH2(atc) OSAH+LC(atc) OSAH+TPC(atc) OSAH+TPC+LC(atc) OSAH+LC(16,1) OSAH+TPC(16,1) OSAH+TPC+LC(16,1) OSAH+PR(atc) OSAH+SC(atc) OSAH+GCM(atc) OSAH+GCM2(atc) OSAH+GCM3(atc)

0 8904 63668 65243 18329 18329 14525 203 200 19544 18329 259695 85618 232128 250211 232130 232027 232033 19546 19566 19568 232128 240183 260242 241938 217347

0 8905 63669 65244 18330 18330 14526 204 201 19545 18330 259696 85619 232129 250212 232131 232028 232034 19547 19567 19569 232129 240184 260243 241939 217348

1 2759 1280 0 2966 2966 2594 72 69 4113 2966 127424 16730 127080 122973 127082 126996 126998 4115 4132 4134 127080 137075 128272 87307 97756

130050 230502 390996 130696 135207 135207 168265 131812 131812 135263 135207 198890 135619 171667 198040 171667 171663 171667 135263 135303 135303 161899 169784 201278 242076 203990

– 217.75 315.83 190.33 41.70 41.70 101.52 3645.70 3645.80 41.29 41.70 7.14 9.21 6.85 8.82 6.82 6.85 6.82 41.26 41.27 41.25 6.29 6.24 7.12 8.52 10.57

0 26.97 84.09 159.75 19.17 19.17 27.41 10.12 10.10 19.32 19.17 26.44 23.02 26.02 38.33 26.02 26.01 26.02 19.32 19.32 19.33 26.02 26.04 29.27 31.28 34.68

0 4.98 15.17 29.25 3.28 3.28 5.07 2.18 2.17 3.31 3.28 4.66 3.84 4.53 5.85 4.53 4.53 4.53 3.31 3.31 3.31 4.53 4.53 4.97 5.15 5.73

0 3.23 0.49 0.00 1.81 1.81 2.68 1.10 1.08 1.90 1.81 3.20 2.27 3.20 3.92 3.21 3.20 3.21 1.91 1.91 1.92 3.20 3.36 3.53 3.42 3.92

0.27 4.17 19.55 21.77 24.98 24.95 17.52 17.46 17.59 24.93 24.98 30.20 27.00 29.58 41.42 29.72 31.38 34.89 29.21 29.39 29.37 32.20 41.31 797.48 850.17 713.35

39868.90 11.73 19.51 21.08 4.23 4.14 6.92 268.58 268.46 4.39 4.23 3.37 3.23 3.27 3.75 3.27 3.26 3.03 3.92 3.93 3.93 3.25 3.09 3.23 3.40 3.65

11.70 11.70 11.70 11.70 11.70 11.70 11.70 11.70 11.70 11.70 11.70 11.70 11.70 11.70 11.70 11.70 11.70 11.70 11.70 11.70 11.70 11.70 11.70 11.70 11.70 11.70

1.0 0.86 0.73 0.47 0.61 0.61 0.73 1.00 1.00 0.61 0.61 0.17 0.23 0.16 0.14 0.16 0.16 0.16 0.61 0.61 0.61 0.15 0.15 0.15 0.17 0.18

398689.00 105.60 183.40 199.10 30.60 29.70 57.50 2674.10 2672.90 32.20 30.60 22.00 20.60 21.00 25.80 21.00 20.90 18.60 27.50 27.60 27.60 20.80 19.20 20.60 22.30 24.80

26 27 28 29 30 31 32

OSAH+PAR(atc) PARSAH+PAR(atc) OSAH+PER(atc) PERSAH+PER(atc) SPHSAH+PER(atc) OSAH+SPH(atc) SPHSAH+SPH(atc)

232128 235137 232128 165968 196334 232128 196334

232129 235138 232129 165969 196335 232129 196335

127080 125205 127080 90071 105204 127080 105204

171667 176143 171667 160178 166309 171667 166309

3.08 2.88 3.24 2.96 24.95 3.24 24.88

18.97 15.51 18.16 17.06 19.82 18.25 19.92

3.26 2.71 3.12 3.01 3.50 3.15 3.52

2.40 1.96 2.25 2.25 2.75 2.27 2.76

35.24 33.27 35.23 846.63 65.01 35.15 64.97

2.06 2.08 2.07 2.13 3.54 2.46 3.94

8.10 8.10 7.55 7.55 7.55 7.67 7.67

0.42 0.53 0.44 0.50 0.72 0.47 0.73

12.50 12.70 11.27 11.82 24.64 8.73 18.60

33 34 35 36 37 38 39 40 41 42 43 44 45 46 47

OSAH+$TA_{seq}$(16,2) OSAH+$TA^{A}_{rec}$(16,2) OSAH+$TA^{B}_{rec}$(16,2) OSAH+$TA_{SNL}$(16,2) OSAH+$TA_{NLT}$(16,2) OSAH+$TA_{seq}$(18,2) OSAH+$TA^{A}_{rec}$(18,2) OSAH+$TA^{B}_{rec}$(18,2) OSAH+$TA_{SNL}$(18,2) OSAH+$TA_{NLT}$(18,2) OSAH+$TA_{seq}$(atc) OSAH+$TA^{A}_{rec}$(atc) OSAH+$TA^{B}_{rec}$(atc) OSAH+$TA_{SNL}$(atc) OSAH+$TA_{NLT}$(atc)

18329 18329 18329 18329 87481 50412 50412 50412 50412 238564 232128 232128 232128 232128 981401

18330 18330 18330 18330 18330 50413 50413 50413 50413 50413 232129 232129 232129 232129 232129

2966 2966 2966 2966 2966 13578 13578 13578 13578 13578 127080 127080 127080 127080 127080

135207 135207 135207 135207 135207 135600 135600 135600 135600 135600 171667 171667 171667 171667 171667

41.72 41.70 41.70 41.72 41.72 16.83 16.83 16.83 16.84 16.84 6.85 6.85 6.85 6.85 6.85

45.29 19.17 19.17 17.74 16.39 55.23 21.69 21.69 19.80 18.17 77.55 26.02 26.02 23.13 20.94

3.28 3.28 3.28 3.28 3.28 3.68 3.68 3.68 3.68 3.68 4.53 4.53 4.53 4.53 4.53

1.81 1.81 1.81 1.81 1.81 2.21 2.21 2.21 2.21 2.21 3.20 3.20 3.20 3.20 3.20

20.35 20.30 20.34 20.63 20.74 21.29 21.27 21.20 22.09 22.27 24.19 23.98 24.14 28.65 31.82

5.00 4.32 3.91 3.93 3.94 4.43 3.57 3.13 3.15 3.16 4.80 3.52 3.00 3.01 7.22

11.70 11.70 11.70 11.70 11.70 11.70 11.70 11.70 11.70 11.70 11.70 11.70 11.70 11.70 11.70

0.44 0.53 0.61 0.61 0.60 0.22 0.29 0.36 0.36 0.35 0.08 0.12 0.16 0.16 0.05

38.30 31.50 27.40 27.60 27.70 32.60 24.00 19.60 19.80 19.90 36.30 23.50 18.30 18.40 60.50

48 49 50 51 52 53 54 55 56 57 58

BVH O84 O89 BSP O93 UG AG HUG RG O84A KD

20050 4422 4422 8904 4422 0 41042 1 11610 8706 18329

87858 30955 30955 8905 30955 643860 1275612 1936 491228 60943 18330

0 15168 15168 2759 15168 606223 386764 1332 264152 20848 2966

130050 305664 305664 230502 305664 396633 2950955 173298 2172750 461512 135207

1048.30 134.48 133.70 217.75 125.40 58.06 42.64 1202.80 74.94 105.39 41.70

906.27 41.07 27.87 26.97 24.02 28.11 22.34 4.95 21.03 46.40 19.17

614.35 7.23 7.22 4.98 8.85 28.11 19.21 4.29 18.83 7.93 3.28

0.00 5.12 5.12 3.23 6.83 26.43 15.43 3.00 16.17 5.19 1.81

172.53 4.63 4.27 4.65 4.38 6.45 205.66 2.87 5.62 22.41 27.63

3324.59 14.40 12.91 19.08 13.74 7.08 23.11 87.69 9.58 12.96 6.90

– – – – – – – – – – –

– – – – – – – – – – –

– – – – – – – – – – –


Table 27: Experimental results for scene “sombrero4”.

$N = 130050$. TPD: $N_{prim} = 263169$, $N_{hit} = 136638$, $N^{hit}_{prim} = 112239$, $N_{sec} = 0$, $N^{hit}_{sec} = 0$, $N_{shad} = 110608$, $N^{hit}_{shad} = 2622$, $T^{MIN}_{R}$ [s] $= 1.27$, $T_{app}$ [s] $= 1.17$, $T^{MIN}_{RSA}$ [s] $= 0.10$.
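As a sanity check on the first two value rows of Table 27, the following Python sketch (not part of the original experiments) verifies that, for lines 1 to 25, the second row exceeds the first by exactly one. This is what one expects if the first row counts interior nodes and the second row counts leaves of a full binary tree; that reading of the two rows is an assumption made here, and the variable and function names are illustrative only.

```python
# First two value rows of Table 27 (sombrero4), lines 1-25, copied from the table.
# Assumption: the first row counts interior nodes, the second row counts leaves.
interior = [8904, 63668, 65243, 18329, 18329, 14525, 203, 200, 19544, 18329,
            259695, 85618, 232128, 250211, 232130, 232027, 232033, 19546,
            19566, 19568, 232128, 240183, 260242, 241938, 217347]
leaves = [8905, 63669, 65244, 18330, 18330, 14526, 204, 201, 19545, 18330,
          259696, 85619, 232129, 250212, 232131, 232028, 232034, 19547,
          19567, 19569, 232129, 240184, 260243, 241939, 217348]


def is_full_binary_tree_count(n_interior: int, n_leaves: int) -> bool:
    """A full binary tree with k interior nodes has exactly k + 1 leaves."""
    return n_leaves == n_interior + 1


# Collect the line numbers (starting at 1) where the relation does not hold.
mismatches = [line
              for line, (g, e) in enumerate(zip(interior, leaves), start=1)
              if not is_full_binary_tree_count(g, e)]
print(mismatches)  # prints []
```

Line 0 (naïve RSA) is left out because it builds no hierarchy, and the neighbor-link-tree columns of lines 33 to 47 do not follow this relation.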

Scene = “teapot40”

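Line: number of the tested algorithm. Minimum Testing Output (parameter groups ∆, Σ, Θ), Mnemonic Notation, one row of values per parameter in this order: $N_G$, $N_E$, $N_{EE}$, $N_{ER}$, $r_{ITM}$, $\tilde{N}_{TS}$, $\tilde{N}_{ETS}$, $\tilde{N}_{EETS}$, $T_B$, $T_R$, $\Theta_{APP}$, $\Theta_{rat}$, $\Theta_{RUN}$.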

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25

naïve RSA spatmed-xyz(16,2) objmed-xyz(16,2) objmed(16,2) OSAH(16,2) OSAH-RMI(16,2) OSAH-xyz(16,2) OSAH(8,1) OSAH(8,2) OSAH(16,1) OSAH(16,2) OSAH(24,1) OSAH(24,2) OSAH(atc) OSAH2(atc) OSAH+LC(atc) OSAH+TPC(atc) OSAH+TPC+LC(atc) OSAH+LC(16,1) OSAH+TPC(16,1) OSAH+TPC+LC(16,1) OSAH+PR(atc) OSAH+SC(atc) OSAH+GCM(atc) OSAH+GCM2(atc) OSAH+GCM3(atc)

0 8598 63340 65496 18593 18593 18737 225 225 19630 18593 425323 290100 264722 456849 266473 247373 257162 19652 19765 19786 264722 264471 344056 616614 194922

0 8599 63341 65497 18594 18594 18738 226 226 19631 18594 425324 290101 264723 456850 266474 247374 257163 19653 19766 19787 264723 264472 344057 616615 194923

1 2943 2356 798 3495 3495 4197 30 30 3883 3495 117627 49364 75531 111778 75670 70794 71050 3905 4021 4042 75531 106643 101428 119164 43680

103680 180581 371822 265120 181085 181085 185251 112307 112307 181133 181085 638251 574014 435017 707674 443763 418298 446531 181133 181186 181186 377180 375965 523736 1234612 467933

– 251.00 670.84 1372.80 57.40 57.40 67.47 2803.00 2803.00 55.42 57.40 11.87 15.55 12.77 12.97 12.73 13.27 13.08 55.42 55.41 55.40 12.77 10.17 11.89 14.16 19.98

0 40.45 142.05 486.86 30.48 30.48 36.55 16.73 16.73 31.04 30.48 39.24 36.91 38.29 52.13 38.32 37.94 38.12 31.04 31.06 31.06 37.35 37.64 39.81 45.00 50.58

0 8.61 29.53 98.65 5.58 5.58 7.20 3.68 3.68 5.61 5.58 6.96 6.63 6.79 8.42 6.79 6.73 6.76 5.61 5.62 5.62 6.61 6.63 6.94 7.69 8.01

0 5.97 0.88 1.15 3.41 3.41 4.79 1.89 1.89 3.54 3.41 4.49 3.97 4.42 5.75 4.42 4.38 4.38 3.54 3.55 3.55 4.36 4.83 4.59 4.93 5.18

0.22 3.24 12.99 18.40 19.38 19.38 12.10 13.97 14.01 19.37 19.38 31.41 28.58 27.20 43.85 27.59 28.43 32.72 22.42 22.83 22.83 32.04 92.72 1138.43 2144.94 701.48

67216.90 35.97 110.67 318.57 14.80 14.68 16.06 454.21 455.98 14.92 14.80 11.70 11.77 11.31 13.28 11.33 11.33 11.05 14.15 14.28 14.13 9.88 11.21 11.50 12.74 13.77

17.32 17.32 17.32 17.32 17.32 17.32 17.32 17.32 17.32 17.32 17.32 17.32 17.32 17.32 17.32 17.32 17.32 17.32 17.32 17.32 17.32 17.32 17.32 17.32 17.32 17.32

1.0 0.80 0.75 0.64 0.54 0.54 0.54 0.99 0.99 0.53 0.54 0.16 0.21 0.17 0.13 0.17 0.18 0.18 0.53 0.53 0.53 0.16 0.14 0.16 0.16 0.20

305531.36 146.18 485.73 1430.73 49.95 49.41 55.68 2047.27 2055.32 50.50 49.95 35.86 36.18 34.09 43.05 34.18 34.18 32.91 47.00 47.59 46.91 27.59 33.64 34.95 40.59 45.27

26 27 28 29 30 31 32

OSAH+PAR(atc) PARSAH+PAR(atc) OSAH+PER(atc) PERSAH+PER(atc) SPHSAH+PER(atc) OSAH+SPH(atc) SPHSAH+SPH(atc)

264722 256679 264722 233036 215290 264722 215290

264723 256680 264723 233037 215291 264723 215291

75531 72330 75531 68652 64843 75531 64843

435017 428579 435017 393621 386727 435017 386727

3.61 3.54 3.31 3.28 3.62 3.32 3.64

23.84 23.42 29.31 27.60 27.60 28.73 27.09

4.32 4.17 5.34 5.11 5.10 5.23 5.00

3.19 3.09 3.98 3.80 3.75 3.90 3.68

32.11 35.37 33.42 1653.61 80.40 32.14 80.36

2.95 2.91 3.74 3.60 3.62 3.95 3.93

14.70 14.70 12.50 12.50 12.50 16.14 16.14

0.39 0.38 0.36 0.34 0.35 0.35 0.38

14.80 14.40 10.88 10.00 10.12 12.07 11.93

33 34 35 36 37 38 39 40 41 42 43 44 45 46 47

OSAH+$TA_{seq}$(16,2) OSAH+$TA^{A}_{rec}$(16,2) OSAH+$TA^{B}_{rec}$(16,2) OSAH+$TA_{SNL}$(16,2) OSAH+$TA_{NLT}$(16,2) OSAH+$TA_{seq}$(18,2) OSAH+$TA^{A}_{rec}$(18,2) OSAH+$TA^{B}_{rec}$(18,2) OSAH+$TA_{SNL}$(18,2) OSAH+$TA_{NLT}$(18,2) OSAH+$TA_{seq}$(atc) OSAH+$TA^{A}_{rec}$(atc) OSAH+$TA^{B}_{rec}$(atc) OSAH+$TA_{SNL}$(atc) OSAH+$TA_{NLT}$(atc)

18593 18593 18593 18593 122836 48421 48421 48421 48421 313831 264722 264722 264722 264722 1371881

18594 18594 18594 18594 18594 48422 48422 48422 48422 48422 264723 264723 264723 264723 264723

3495 3495 3495 3495 3495 9840 9840 9840 9840 9840 75531 75531 75531 75531 75531

181085 181085 181085 181085 181085 227380 227380 227380 227380 227380 435017 435017 435017 435017 435017

57.83 57.40 57.40 57.84 57.84 30.41 30.18 30.18 30.42 30.42 12.88 12.77 12.77 12.89 12.89

70.26 30.48 30.48 23.81 21.34 81.23 33.18 33.18 25.88 23.00 101.70 38.29 38.29 29.39 25.78

5.59 5.58 5.58 5.59 5.59 6.05 6.04 6.04 6.05 6.05 6.80 6.79 6.79 6.80 6.80

3.41 3.41 3.41 3.41 3.41 3.68 3.68 3.68 3.68 3.68 4.42 4.42 4.42 4.42 4.42

16.04 15.98 16.04 16.47 16.62 17.39 17.28 17.26 18.56 18.57 22.64 22.18 22.37 27.79 36.00

18.93 15.68 14.15 13.95 13.90 16.94 13.46 11.84 11.61 11.60 16.94 12.61 10.76 10.94 32.08

17.32 17.32 17.32 17.32 17.32 17.32 17.32 17.32 17.32 17.32 17.32 17.32 17.32 17.32 17.32

0.37 0.47 0.54 0.55 0.55 0.22 0.30 0.36 0.37 0.37 0.09 0.13 0.17 0.17 0.04

68.73 53.95 47.00 46.09 45.86 59.68 43.86 36.50 35.45 35.41 59.68 40.00 31.59 32.41 128.50

48 49 50 51 52 53 54 55 56 57 58

BVH O84 O89 BSP O93 UG AG HUG RG O84A KD

16404 4066 4066 8598 4066 0 20337 19260 4820 8704 18593

69920 28463 28463 8599 28463 515570 1006860 316436 352189 60929 18594

0 14364 14364 2943 14364 487803 495117 264487 196005 21144 3495

103680 231325 231325 180581 231325 289369 2225562 380083 2025201 360490 181085

1011.90 132.13 132.05 251.00 126.61 94.13 64.22 77.49 92.30 64.62 57.40

835.87 64.02 43.01 40.45 36.52 57.12 45.94 42.19 37.92 58.95 30.48

557.33 11.48 11.48 8.61 13.67 57.12 42.08 37.10 35.69 10.98 5.58

0.00 8.84 8.84 5.97 11.11 54.15 36.38 34.02 31.86 8.43 3.41

118.35 3.68 3.41 4.27 3.47 4.94 287.18 6.51 4.43 17.01 21.77

7116.55 43.98 37.49 93.80 41.44 31.46 41.59 42.48 31.62 33.27 23.85

– – – – – – – – – – –

– – – – – – – – – – –

– – – – – – – – – – –

Table 28: Experimental results for scene “teapot40”.

$N = 103680$. TPD: $N_{prim} = 263169$, $N_{hit} = 226226$, $N^{hit}_{prim} = 161581$, $N_{sec} = 225988$, $N^{hit}_{sec} = 67302$, $N_{shad} = 406161$, $N^{hit}_{shad} = 33950$, $T^{MIN}_{R}$ [s] $= 4.03$, $T_{app}$ [s] $= 3.81$, $T^{MIN}_{RSA}$ [s] $= 0.22$.
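The Θ rows of Table 28 appear to be normalized by the RSA time given in the footnote. The Python sketch below checks this reading on lines 1 to 4, using Θ_APP = T_app / T_RSA and Θ_RUN = (T_R − T_app) / T_RSA. Both relations are inferred from the tabulated numbers rather than quoted from a definition, and the function names are hypothetical.

```python
# Values copied from Table 28 (teapot40) and its footnote.
T_APP = 3.81       # T_app [s], from the footnote
T_RSA_MIN = 0.22   # T_RSA [s], from the footnote

t_r = {1: 35.97, 2: 110.67, 3: 318.57, 4: 14.80}               # T_R row, lines 1-4
theta_run_table = {1: 146.18, 2: 485.73, 3: 1430.73, 4: 49.95}  # Theta_RUN row
theta_app_table = 17.32                                         # Theta_APP row


def theta_app(t_app: float, t_rsa_min: float) -> float:
    """Assumed relation: application time in units of the RSA time."""
    return t_app / t_rsa_min


def theta_run(t_run: float, t_app: float, t_rsa_min: float) -> float:
    """Assumed relation: ray shooting time in units of the RSA time."""
    return (t_run - t_app) / t_rsa_min


# Largest deviation between recomputed and tabulated values (two-decimal data).
max_dev = max(abs(theta_run(t_r[i], T_APP, T_RSA_MIN) - theta_run_table[i])
              for i in t_r)
app_dev = abs(theta_app(T_APP, T_RSA_MIN) - theta_app_table)
print(max_dev < 0.01, app_dev < 0.01)
```

Line 0 (naïve RSA) is left out because its tabulated Θ_RUN matches T_R / T_RSA without subtracting T_app.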

Scene = “tetra8”

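Line: number of the tested algorithm. Minimum Testing Output (parameter groups ∆, Σ, Θ), Mnemonic Notation, one row of values per parameter in this order: $N_G$, $N_E$, $N_{EE}$, $N_{ER}$, $r_{ITM}$, $\tilde{N}_{TS}$, $\tilde{N}_{ETS}$, $\tilde{N}_{EETS}$, $T_B$, $T_R$, $\Theta_{APP}$, $\Theta_{rat}$, $\Theta_{RUN}$.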

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25

naïve RSA spatmed-xyz(16,2) objmed-xyz(16,2) objmed(16,2) OSAH(16,2) OSAH-RMI(16,2) OSAH-xyz(16,2) OSAH(8,1) OSAH(8,2) OSAH(16,1) OSAH(16,2) OSAH(24,1) OSAH(24,2) OSAH(atc) OSAH2(atc) OSAH+LC(atc) OSAH+TPC(atc) OSAH+TPC+LC(atc) OSAH+LC(16,1) OSAH+TPC(16,1) OSAH+TPC+LC(16,1) OSAH+PR(atc) OSAH+SC(atc) OSAH+GCM(atc) OSAH+GCM2(atc) OSAH+GCM3(atc)

0 11651 42239 49151 19097 19097 19932 179 179 19097 19097 48264 48264 48264 44203 48264 48252 48252 19097 19003 19003 48264 48264 51845 51682 48503

0 11652 42240 49152 19098 19098 19933 180 180 19098 19098 48265 48265 48265 44204 48265 48253 48253 19098 19004 19004 48265 48265 51846 51683 48504

1 4588 25856 32768 7026 7026 7865 28 28 7026 7026 31881 31881 31881 27820 31881 31869 31869 7026 7068 7068 31881 31881 35462 35299 33020

65536 151552 65536 65536 65536 65536 65536 65536 65536 65536 65536 65536 65536 65536 65536 65536 65536 65536 65536 65536 65536 65536 65536 65536 65536 65536

– 214.14 42.04 23.77 27.34 27.34 30.07 2948.70 2948.70 27.34 27.34 10.12 10.12 10.12 10.12 10.12 10.12 10.12 27.34 27.51 27.51 10.12 10.12 10.12 10.12 40.06

0 30.12 75.87 215.24 20.17 20.17 24.48 9.70 9.70 20.17 20.17 22.00 22.00 22.00 34.13 22.00 21.98 21.98 20.17 20.14 20.14 22.00 22.00 26.10 25.84 44.63

0 6.04 14.84 42.25 3.71 3.71 4.66 2.17 2.17 3.71 3.71 3.98 3.98 3.98 4.74 3.98 3.98 3.98 3.71 3.70 3.70 3.98 3.98 4.71 4.50 7.06

0 4.44 13.06 41.24 2.92 2.92 3.79 1.01 1.01 2.92 2.92 3.54 3.54 3.54 4.30 3.54 3.54 3.54 2.92 2.92 2.92 3.54 3.54 4.28 4.06 6.40

0.13 2.05 6.54 10.85 9.97 9.99 6.69 8.20 8.18 9.93 9.97 10.74 10.87 10.82 13.73 10.73 11.39 12.72 11.85 11.90 11.79 11.44 13.15 176.69 180.27 173.04

12855.80 6.46 4.61 9.27 2.40 2.39 2.56 82.19 83.23 2.45 2.40 2.08 2.09 2.08 2.32 2.10 2.07 1.96 2.28 2.28 2.28 2.07 2.03 2.13 2.13 3.28

28.67 28.67 28.67 28.67 28.67 28.67 28.67 28.67 28.67 28.67 28.67 28.67 28.67 28.67 28.67 28.67 28.67 28.67 28.67 28.67 28.67 28.67 28.67 28.67 28.67 28.67

1.0 0.72 0.17 0.04 0.33 0.33 0.30 0.99 0.99 0.33 0.33 0.14 0.14 0.14 0.10 0.14 0.14 0.14 0.33 0.33 0.33 0.14 0.14 0.12 0.12 0.24

428526.67 186.67 125.00 280.33 51.33 51.00 56.67 2711.00 2745.67 53.00 51.33 40.67 41.00 40.67 48.67 41.33 40.33 36.67 47.33 47.33 47.33 40.33 39.00 42.33 42.33 80.67

26 27 28 29 30 31 32

OSAH+PAR(atc) PARSAH+PAR(atc) OSAH+PER(atc) PERSAH+PER(atc) SPHSAH+PER(atc) OSAH+SPH(atc) SPHSAH+SPH(atc)

48264 48232 48264 51060 35813 48264 35813

48265 48233 48265 51061 35814 48265 35814

31881 31849 31881 34677 22688 31881 22688

65536 65536 65536 65536 65536 65536 65536

8.52 8.52 7.53 7.53 73.92 7.51 72.43

19.23 19.38 19.84 23.54 28.15 19.62 27.79

3.62 3.70 3.71 4.43 4.98 3.67 4.92

3.32 3.41 3.40 4.12 4.38 3.36 4.33

12.57 13.08 12.60 154.96 17.97 12.55 17.95

1.57 1.59 1.64 1.81 3.04 1.97 3.31

18.33 18.33 19.67 19.67 19.67 30.67 30.67

0.38 0.38 0.38 0.36 0.62 0.38 0.62

34.00 34.67 35.00 40.67 81.67 35.00 79.67

33 34 35 36 37 38 39 40 41 42 43 44 45 46 47

OSAH+$TA_{seq}$(16,2) OSAH+$TA^{A}_{rec}$(16,2) OSAH+$TA^{B}_{rec}$(16,2) OSAH+$TA_{SNL}$(16,2) OSAH+$TA_{NLT}$(16,2) OSAH+$TA_{seq}$(18,2) OSAH+$TA^{A}_{rec}$(18,2) OSAH+$TA^{B}_{rec}$(18,2) OSAH+$TA_{SNL}$(18,2) OSAH+$TA_{NLT}$(18,2) OSAH+$TA_{seq}$(atc) OSAH+$TA^{A}_{rec}$(atc) OSAH+$TA^{B}_{rec}$(atc) OSAH+$TA_{SNL}$(atc) OSAH+$TA_{NLT}$(atc)

19097 19097 19097 19097 85575 41905 41905 41905 41905 151873 48264 48264 48264 48264 168703

19098 19098 19098 19098 19098 41906 41906 41906 41906 41906 48265 48265 48265 48265 48265

7026 7026 7026 7026 7026 25556 25556 25556 25556 25556 31881 31881 31881 31881 31881

65536 65536 65536 65536 65536 65536 65536 65536 65536 65536 65536 65536 65536 65536 65536

27.34 27.34 27.34 27.34 27.34 11.81 11.81 11.81 11.81 11.81 10.12 10.12 10.12 10.12 10.12

50.29 20.17 20.17 18.39 16.36 56.11 21.71 21.71 19.42 17.21 57.23 22.00 22.00 19.63 17.39

3.71 3.71 3.71 3.71 3.71 3.94 3.94 3.94 3.94 3.94 3.98 3.98 3.98 3.98 3.98

2.92 2.92 2.92 2.92 2.92 3.43 3.43 3.43 3.43 3.43 3.54 3.54 3.54 3.54 3.54

8.39 8.34 8.46 10.06 8.71 8.86 8.80 8.76 9.53 9.69 9.03 8.96 8.95 9.85 9.92

3.38 2.59 2.26 2.79 2.40 3.22 2.36 1.99 2.08 2.10 3.20 2.34 1.95 2.09 2.09

28.67 28.67 28.67 28.67 28.67 28.67 28.67 28.67 28.67 28.67 28.67 28.67 28.67 28.67 28.67

0.18 0.27 0.33 0.24 0.30 0.08 0.12 0.16 0.15 0.15 0.07 0.10 0.14 0.12 0.12

84.00 57.67 46.67 64.33 51.33 78.67 50.00 37.67 40.67 41.33 78.00 49.33 36.33 41.00 41.00

48 49 50 51 52 53 54 55 56 57 58

BVH O84 O89 BSP O93 UG AG HUG RG O84A KD

9339 5697 5697 11651 5697 0 16921 1 8281 7121 19097

43903 39880 39880 11652 39880 328509 656550 195112 386424 49848 19098

0 22496 22496 4588 22496 307819 376506 180318 151012 25744 7026

65536 212992 212992 151552 212992 232864 1455360 202552 3447984 311008 65536

1109.90 119.72 119.29 214.14 110.35 105.60 86.38 136.68 253.12 134.85 27.34

450.18 47.28 31.85 30.12 26.16 32.84 38.10 28.28 25.12 44.46 20.17

326.98 8.22 8.22 6.04 9.50 32.84 35.04 27.63 22.48 7.86 3.71

0.00 6.61 6.61 4.44 7.97 31.22 31.83 25.92 18.98 6.39 2.92

88.62 2.41 2.20 2.51 2.22 3.17 57.81 4.09 6.37 10.78 10.61

1418.19 9.48 7.33 9.86 8.56 5.97 9.64 7.42 11.63 9.40 3.57

– – – – – – – – – – –

– – – – – – – – – – –

– – – – – – – – – – –


Table 29: Experimental results for scene “tetra8”.

$N = 65536$. TPD: $N_{prim} = 263169$, $N_{hit} = 159213$, $N^{hit}_{prim} = 43709$, $N_{sec} = 0$, $N^{hit}_{sec} = 0$, $N_{shad} = 40256$, $N^{hit}_{shad} = 7098$, $T^{MIN}_{R}$ [s] $= 0.89$, $T_{app}$ [s] $= 0.86$, $T^{MIN}_{RSA}$ [s] $= 0.03$.

Scene = “tree15”

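Line: number of the tested algorithm. Minimum Testing Output (parameter groups ∆, Σ, Θ), Mnemonic Notation, one row of values per parameter in this order: $N_G$, $N_E$, $N_{EE}$, $N_{ER}$, $r_{ITM}$, $\tilde{N}_{TS}$, $\tilde{N}_{ETS}$, $\tilde{N}_{EETS}$, $T_B$, $T_R$, $\Theta_{APP}$, $\Theta_{rat}$, $\Theta_{RUN}$.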

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25

naïve RSA spatmed-xyz(16,2) objmed-xyz(16,2) objmed(16,2) OSAH(16,2) OSAH-RMI(16,2) OSAH-xyz(16,2) OSAH(8,1) OSAH(8,2) OSAH(16,1) OSAH(16,2) OSAH(24,1) OSAH(24,2) OSAH(atc) OSAH2(atc) OSAH+LC(atc) OSAH+TPC(atc) OSAH+TPC+LC(atc) OSAH+LC(16,1) OSAH+TPC(16,1) OSAH+TPC+LC(16,1) OSAH+PR(atc) OSAH+SC(atc) OSAH+GCM(atc) OSAH+GCM2(atc) OSAH+GCM3(atc)

0 319 65425 65531 5019 5019 5281 75 70 5356 5019 143468 114142 72922 192425 73046 72735 73268 5362 5774 5783 72922 70835 78022 114150 82132

0 320 65426 65532 5020 5020 5282 76 71 5357 5020 143469 114143 72923 192426 73047 72736 73269 5363 5775 5784 72923 70836 78023 114151 82133

1 227 996 252 1364 1364 1394 7 4 1618 1364 56123 34755 28766 48787 28784 29186 29203 1624 2017 2026 28766 30429 30619 26050 24601

131071 134391 314658 267726 140681 140681 142057 132226 132225 140755 140681 231368 223827 184136 299524 184414 183277 184559 140755 140538 140538 171576 174110 188289 288861 237759

– 17792.00 1659.90 1871.50 113.69 113.69 133.18 9127.30 9128.50 110.96 113.69 21.75 25.35 23.56 17.02 21.98 23.69 22.05 109.38 109.05 107.47 22.94 22.14 22.86 23.69 102.49

0 63.02 274.45 346.00 18.00 18.00 21.21 12.31 12.10 18.61 18.00 20.83 19.84 20.44 26.25 20.71 20.35 20.67 18.86 18.67 18.93 20.26 19.88 20.29 23.43 29.76

0 13.12 59.06 74.89 4.31 4.31 5.36 3.32 3.27 4.41 4.31 4.80 4.63 4.73 5.31 4.74 4.72 4.73 4.41 4.43 4.43 4.69 4.61 4.66 5.15 5.56

0 6.99 0.60 0.02 0.94 0.94 1.88 0.14 0.09 1.11 0.94 1.52 1.25 1.48 2.78 1.74 1.48 1.73 1.37 1.14 1.39 1.51 1.67 1.45 1.93 1.71

0.26 4.81 17.19 27.43 31.83 31.90 18.62 21.03 21.04 31.82 31.83 40.28 39.90 37.42 49.95 37.53 39.89 44.70 35.50 37.71 37.67 39.45 57.57 324.74 472.11 408.52

232025.00 10901.20 554.75 651.41 33.78 33.96 39.41 2757.39 2751.83 33.48 33.78 12.70 13.18 13.14 13.61 13.02 13.14 12.34 32.03 31.62 31.35 12.24 12.12 12.65 14.04 33.54

18.59 18.59 18.59 18.59 18.59 18.59 18.59 18.59 18.59 18.59 18.59 18.59 18.59 18.59 18.59 18.59 18.59 18.59 18.59 18.59 18.59 18.59 18.59 18.59 18.59 18.59

1.0 0.99 0.65 0.62 0.66 0.66 0.65 1.00 1.00 0.64 0.66 0.24 0.28 0.26 0.16 0.24 0.26 0.24 0.64 0.64 0.63 0.25 0.24 0.25 0.23 0.51

1054659.09 49532.32 2503.00 2942.36 134.95 135.77 160.55 12515.00 12489.73 133.59 134.95 39.14 41.32 41.14 43.27 40.59 41.14 37.50 127.00 125.14 123.91 37.05 36.50 38.91 45.23 133.86

26 27 28 29 30 31 32

OSAH+PAR(atc) PARSAH+PAR(atc) OSAH+PER(atc) PERSAH+PER(atc) SPHSAH+PER(atc) OSAH+SPH(atc) SPHSAH+SPH(atc)

72922 80420 72922 111065 64191 72922 64191

72923 80421 72923 111066 64192 72923 64192

28766 30990 28766 44207 20383 28766 20383

184136 200666 184136 226182 314352 184136 314352

5.10 3.50 8.38 4.51 17361.00 8.29 17545.00

27.24 20.16 32.43 24.36 31.00 32.11 30.90

6.81 4.79 7.75 5.78 6.91 7.66 6.85

2.29 1.69 3.04 3.28 3.09 3.01 3.05

42.74 54.22 42.78 9227.67 294.09 42.77 294.16

6.11 5.66 5.20 4.52 4036.76 5.51 4055.84

20.83 20.83 17.40 17.40 17.40 17.18 17.18

0.42 0.47 0.37 0.36 1.00 0.37 1.00

13.11 10.61 17.27 12.73 26894.33 15.24 23840.71

33 34 35 36 37 38 39 40 41 42 43 44 45 46 47

OSAH+$TA_{seq}$(16,2) OSAH+$TA^{A}_{rec}$(16,2) OSAH+$TA^{B}_{rec}$(16,2) OSAH+$TA_{SNL}$(16,2) OSAH+$TA_{NLT}$(16,2) OSAH+$TA_{seq}$(18,2) OSAH+$TA^{A}_{rec}$(18,2) OSAH+$TA^{B}_{rec}$(18,2) OSAH+$TA_{SNL}$(18,2) OSAH+$TA_{NLT}$(18,2) OSAH+$TA_{seq}$(atc) OSAH+$TA^{A}_{rec}$(atc) OSAH+$TA^{B}_{rec}$(atc) OSAH+$TA_{SNL}$(atc) OSAH+$TA_{NLT}$(atc)

5019 5019 5019 5019 39548 12900 12900 12900 12900 99423 72922 72922 72922 72922 528558

5020 5020 5020 5020 5020 12901 12901 12901 12901 12901 72923 72923 72923 72923 72923

1364 1364 1364 1364 1364 3804 3804 3804 3804 3804 28766 28766 28766 28766 28766

140681 140681 140681 140681 140681 147446 147446 147446 147446 147446 184136 184136 184136 184136 184136

113.93 113.68 113.68 114.20 114.20 52.00 51.88 51.88 52.16 52.16 23.65 23.56 23.56 23.72 23.72

43.09 18.00 18.00 17.16 15.39 46.42 18.79 18.79 17.87 15.94 53.12 20.44 20.44 19.35 17.14

4.32 4.31 4.31 4.33 4.33 4.46 4.45 4.45 4.46 4.46 4.74 4.73 4.73 4.75 4.75

0.94 0.94 0.94 0.94 0.94 1.08 1.08 1.08 1.08 1.08 1.48 1.48 1.48 1.48 1.48

25.97 25.93 25.93 26.05 26.04 27.40 27.48 27.49 27.67 27.74 30.53 30.49 30.54 31.94 32.31

38.40 33.42 32.15 43.74 42.56 24.68 19.60 18.54 22.25 24.75 19.00 13.77 12.51 13.32 13.46

18.59 18.59 18.59 18.59 18.59 18.59 18.59 18.59 18.59 18.59 18.59 18.59 18.59 18.59 18.59

0.54 0.63 0.66 0.47 0.48 0.32 0.42 0.45 0.36 0.31 0.15 0.23 0.26 0.24 0.23

155.95 133.32 127.55 180.23 174.86 93.59 70.50 65.68 82.55 93.91 67.77 44.00 38.27 41.95 42.59

48 49 50 51 52 53 54 55 56 57 58

BVH O84 O89 BSP O93 UG AG HUG RG O84A KD

21992 151 151 319 151 0 2 9 41 1328 5019

91171 1058 1058 320 1058 443680 393009 394682 275886 9297 5020

0 892 892 227 892 388122 372191 370210 230133 3935 1364

131071 137338 137338 134391 137340 188973 182880 200497 670903 150854 140681

803.76 6108.80 6076.50 17792.00 6032.40 6723.40 79.01 74.93 236.59 144.65 113.69

404.27 131.76 85.62 63.02 72.36 29.36 30.31 12.70 29.32 42.82 18.00

274.16 21.72 21.72 13.12 28.74 29.36 29.83 10.05 27.62 9.72 4.31

0.00 14.53 14.53 6.99 21.55 25.14 27.56 9.00 22.80 6.14 0.94

107.54 5.18 5.16 6.10 4.91 6.43 370.43 10.65 2.61 23.06 36.57

5220.49 9452.42 9153.53 13032.90 9261.85 2347.53 43.38 46.15 86.53 73.28 52.10

– – – – – – – – – – –

– – – – – – – – – – –

– – – – – – – – – – –


Table 30: Experimental results for scene “tree15”.

$N = 131071$. TPD: $N_{prim} = 263169$, $N_{hit} = 263169$, $N^{hit}_{prim} = 173215$, $N_{sec} = 0$, $N^{hit}_{sec} = 0$, $N_{shad} = 1107745$, $N^{hit}_{shad} = 48134$, $T^{MIN}_{R}$ [s] $= 4.31$, $T_{app}$ [s] $= 4.09$, $T^{MIN}_{RSA}$ [s] $= 0.22$.


[Figure 5 panels: balls5, gears9, jacks5, lattice29, mount8]

Figure 5: Visualization of the G5SPD scenes using the testing procedure TPD.


[Figure 6 panels: rings17, sombrero4, teapot40, tetra8, tree15]

Figure 6: Visualization of the G5SPD scenes using the testing procedure TPD.

Scenes = group G3SPD

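Line: number of the tested algorithm. Minimum Testing Output (parameter groups ∆, Σ, Θ), Mnemonic Notation, one row of values per parameter in this order: $N_G$, $N_E$, $N_{EE}$, $N_{ER}$, $r_{ITM}$, $\tilde{N}_{TS}$, $\tilde{N}_{ETS}$, $\tilde{N}_{EETS}$, $T_B$, $T_R$, $\Theta_{APP}$, $\Theta_{rat}$, $\Theta_{RUN}$.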

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25

naïve RSA spatmed-xyz(16,2) objmed-xyz(16,2) objmed(16,2) OSAH(16,2) OSAH-RMI(16,2) OSAH-xyz(16,2) OSAH(8,1) OSAH(8,2) OSAH(16,1) OSAH(16,2) OSAH(24,1) OSAH(24,2) OSAH(atc) OSAH2(atc) OSAH+LC(atc) OSAH+TPC(atc) OSAH+TPC+LC(atc) OSAH+LC(16,1) OSAH+TPC(16,1) OSAH+TPC+LC(16,1) OSAH+PR(atc) OSAH+SC(atc) OSAH+GCM(atc) OSAH+GCM2(atc) OSAH+GCM3(atc)

0 6590 4300 4490 1870 1870 1825 140 137 3690 1870 8978 4303 2043 2400 2385 1622 2160 3712 3714 3715 2043 2142 2552 3836 2433

0 6591 4301 4491 1871 1871 1826 141 138 3691 1871 8979 4304 2044 2401 2386 1623 2161 3713 3715 3716 2044 2143 2553 3837 2434

1 881 455 465 255 255 287 29 26 767 255 1133 298 501 469 502 458 460 769 769 770 501 602 554 419 428

1020 20030 8808 8544 3590 3590 3496 1292 1292 4844 3590 12429 8663 2937 3525 3905 2504 3856 4866 4871 4871 2740 2793 3612 6850 3925

– 78.33 57.76 71.42 12.93 12.93 13.12 52.09 52.95 10.30 12.93 10.30 13.02 12.80 11.92 12.86 14.12 14.24 10.29 10.30 10.29 12.11 10.31 11.48 13.80 17.41

0.00 36.90 48.37 57.73 18.97 18.97 20.45 12.44 12.18 22.03 18.97 23.72 19.83 19.25 22.16 20.25 18.40 19.95 22.04 22.09 22.09 19.20 19.54 20.98 24.09 25.43

0.00 7.15 9.83 12.49 3.71 3.71 4.14 2.76 2.71 4.24 3.71 4.52 3.85 3.78 4.25 3.94 3.65 3.92 4.23 4.24 4.24 3.77 3.84 4.10 4.55 4.62

0.00 2.81 2.06 2.01 1.05 1.05 1.45 0.74 0.59 1.69 1.05 1.73 1.05 1.55 1.67 1.55 1.50 1.50 1.69 1.69 1.70 1.55 1.77 1.73 1.43 1.60

0.00 0.10 0.14 0.19 0.13 0.13 0.08 0.07 0.06 0.17 0.13 0.31 0.20 0.13 0.18 0.15 0.13 0.17 0.21 0.20 0.21 0.21 0.22 7.39 13.23 7.35

671.54 28.39 27.91 36.98 13.10 12.80 13.17 21.02 21.03 12.61 13.10 13.22 13.19 12.83 13.12 13.24 13.09 12.96 11.88 11.88 12.01 12.37 12.08 12.45 13.72 14.58

13.29 13.29 13.29 13.29 13.29 13.29 13.29 13.29 13.29 13.29 13.29 13.29 13.29 13.29 13.29 13.29 13.29 13.29 13.29 13.29 13.29 13.29 13.29 13.29 13.29 13.29

1.00 0.51 0.47 0.46 0.39 0.39 0.38 0.77 0.77 0.31 0.39 0.29 0.38 0.37 0.33 0.36 0.40 0.38 0.30 0.31 0.30 0.36 0.33 0.34 0.34 0.36

2279.43 83.12 63.22 87.61 20.57 19.48 20.33 37.51 37.59 19.43 20.57 20.23 20.19 19.51 20.15 20.07 19.68 18.53 17.30 17.14 17.41 18.19 17.65 18.85 21.33 23.97

26 27 28 29 30 31 32

OSAH+PAR(atc) PARSAH+PAR(atc) OSAH+PER(atc) PERSAH+PER(atc) SPHSAH+PER(atc) OSAH+SPH(atc) SPHSAH+SPH(atc)

2043 2135 2043 1626 1892 2043 1892

2044 2136 2044 1627 1893 2044 1893

501 540 501 450 517 501 517

2937 4413 2937 2731 3907 2937 3907

6.24 5.22 6.40 12.79 16.38 6.39 16.26

17.31 13.85 20.19 15.57 19.78 20.00 19.63

3.62 2.93 4.23 3.40 4.10 4.19 4.07

1.77 1.37 2.01 1.56 1.84 1.99 1.83

0.15 0.20 0.15 16.59 0.75 0.15 0.77

3.12 2.93 3.84 4.81 4.56 4.14 4.88

11.32 11.32 13.12 13.12 13.12 12.30 12.30

0.47 0.50 0.45 0.54 0.58 0.49 0.61

11.44 9.99 11.55 13.99 16.98 9.51 13.61

33 34 35 36 37 38 39 40 41 42 43 44 45 46 47

OSAH+$TA_{seq}$(16,2) OSAH+$TA^{A}_{rec}$(16,2) OSAH+$TA^{B}_{rec}$(16,2) OSAH+$TA_{SNL}$(16,2) OSAH+$TA_{NLT}$(16,2) OSAH+$TA_{seq}$(18,2) OSAH+$TA^{A}_{rec}$(18,2) OSAH+$TA^{B}_{rec}$(18,2) OSAH+$TA_{SNL}$(18,2) OSAH+$TA_{NLT}$(18,2) OSAH+$TA_{seq}$(atc) OSAH+$TA^{A}_{rec}$(atc) OSAH+$TA^{B}_{rec}$(atc) OSAH+$TA_{SNL}$(atc) OSAH+$TA_{NLT}$(atc)

1869 1869 1869 1869 9737 2526 2526 2526 2526 12989 2050 2050 2050 2050 9509

1870 1870 1870 1870 1870 2527 2527 2527 2527 2527 2051 2051 2051 2051 2051

255 255 255 255 255 282 282 282 282 282 502 502 502 502 502

3591 3591 3591 3591 3591 4806 4806 4806 4806 4806 2945 2945 2945 2945 2945

13.50 12.92 12.92 13.51 13.51 13.43 12.83 12.83 13.43 13.43 13.35 12.78 12.78 13.35 13.35

41.39 18.97 18.97 15.97 14.61 43.10 19.38 19.38 16.32 14.89 41.84 19.26 19.26 16.01 14.65

3.75 3.70 3.70 3.75 3.75 3.83 3.77 3.77 3.83 3.83 3.82 3.78 3.78 3.82 3.82

1.05 1.05 1.05 1.05 1.05 1.06 1.06 1.05 1.06 1.06 1.54 1.54 1.54 1.54 1.54

0.11 0.10 0.11 0.13 0.14 0.13 0.12 0.13 0.16 0.18 0.11 0.11 0.10 0.13 0.15

15.75 13.07 12.05 12.54 12.88 16.07 13.23 12.13 12.65 12.72 15.80 13.11 12.05 12.56 12.54

13.29 13.29 13.29 13.29 13.29 13.29 13.29 13.29 13.29 13.29 13.29 13.29 13.29 13.29 13.29

0.25 0.34 0.39 0.37 0.35 0.24 0.33 0.38 0.36 0.36 0.24 0.32 0.37 0.34 0.35

27.84 19.98 17.30 18.44 19.09 28.22 20.29 17.49 18.69 18.66 27.88 19.89 17.15 18.51 18.26

48 49 50 51 52 53 54 55 56 57 58

BVH O84 O89 BSP O93 UG AG HUG RG O84A KD

111 3197 3197 6590 3214 0 201 44 224 4095 1870

637 22382 22382 6591 22503 5127 5368 1979 4340 28666 1871

0 4791 4791 881 4860 3148 1756 804 1323 4592 255

1020 49927 49927 20030 50094 6020 12312 4657 14902 69335 3590

52.78 71.71 71.56 78.33 70.75 153.68 32.55 143.50 38.76 21.62 12.93

28.45 60.18 40.75 36.90 36.96 8.12 10.14 6.78 8.84 38.09 18.97

21.06 10.50 10.47 7.15 15.18 8.12 8.40 4.61 6.87 7.46 3.71

0.00 5.23 5.23 2.81 10.05 4.76 2.63 1.99 3.19 3.50 1.05

0.08 0.16 0.13 0.76 0.14 0.05 0.20 0.05 0.04 0.53 0.42

208.71 52.07 43.16 75.73 50.86 53.43 52.04 73.10 30.34 30.31 21.19

– – – – – – – – – – –

– – – – – – – – – – –

– – – – – – – – – – –

Table 31: Experimental results, summary for the G3SPD scenes; average values are reported.

Scenes = group G4SPD

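Line, Mnemonic Notation, and the Minimum Testing Output (∆, Σ, Θ): NG, NE, NEE, NER, rITM, ÑTS, ÑETS, ÑEETS, TB, TR, ΘAPP, Θrat, ΘRUN.

Each block below lists the line numbers, then the mnemonic notations, then thirteen rows of values, one row per parameter in the order above.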

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25

naïve RSA spatmed-xyz(16,2) objmed-xyz(16,2) objmed(16,2) OSAH(16,2) OSAH-RMI(16,2) OSAH-xyz(16,2) OSAH(8,1) OSAH(8,2) OSAH(16,1) OSAH(16,2) OSAH(24,1) OSAH(24,2) OSAH(atc) OSAH2(atc) OSAH+LC(atc) OSAH+TPC(atc) OSAH+TPC+LC(atc) OSAH+LC(16,1) OSAH+TPC(16,1) OSAH+TPC+LC(16,1) OSAH+PR(atc) OSAH+SC(atc) OSAH+GCM(atc) OSAH+GCM2(atc) OSAH+GCM3(atc)

0 10073 23997 21948 8317 8317 8057 172 171 12421 8317 65655 33433 16873 22381 20396 13109 19204 12440 12500 12504 16873 17462 22286 38410 23053

0 10074 23998 21949 8318 8318 8058 173 172 12422 8318 65656 33434 16874 22382 20397 13110 19205 12441 12501 12505 16874 17463 22287 38411 23054

1 1405 2540 1272 1263 1263 1311 24 22 3042 1263 8352 2414 3836 3960 3856 3351 3384 3054 3101 3106 3836 5282 4672 3708 3481

7635 43897 54859 47813 19515 19515 19615 8757 8757 21654 19515 91893 66189 25848 34095 35627 21965 37141 21663 21710 21710 22458 22824 33228 71266 38924

– 331.92 99.25 130.01 15.07 15.07 17.59 250.54 250.75 13.03 15.07 10.84 13.32 12.31 11.85 12.26 13.29 13.18 12.98 13.01 12.97 11.30 9.78 11.45 13.66 28.78

0.00 39.30 81.36 100.12 24.47 24.47 26.92 12.62 12.50 26.67 24.47 31.37 26.92 26.70 31.69 27.79 25.58 27.39 26.68 26.66 26.68 26.62 26.81 28.85 33.03 35.13

0.00 7.44 16.21 20.79 4.49 4.49 5.10 2.67 2.64 4.86 4.49 5.65 4.88 4.86 5.52 5.03 4.67 4.97 4.85 4.85 4.85 4.85 4.89 5.21 5.81 5.91

0.00 3.22 2.98 2.60 1.61 1.61 2.08 0.69 0.65 2.14 1.61 2.35 1.66 2.19 2.42 2.21 2.14 2.14 2.17 2.16 2.17 2.19 2.56 2.35 2.05 2.34

0.02 0.33 1.12 1.51 1.26 1.27 0.77 0.80 0.78 1.34 1.26 2.73 2.02 1.45 2.12 1.65 1.43 1.97 1.55 1.59 1.59 2.17 2.60 66.90 129.40 72.16

7839.09 102.73 49.61 63.60 16.76 16.43 17.29 75.45 75.52 16.18 16.76 16.86 16.77 16.01 16.85 16.43 16.24 16.04 15.46 15.44 15.50 15.03 14.82 15.75 17.91 28.32

13.73 13.73 13.73 13.73 13.73 13.73 13.73 13.73 13.73 13.73 13.73 13.73 13.73 13.73 13.73 13.73 13.73 13.73 13.73 13.73 13.73 13.73 13.73 13.73 13.73 13.73

1.00 0.61 0.47 0.46 0.38 0.38 0.39 0.94 0.94 0.32 0.38 0.26 0.33 0.31 0.28 0.30 0.33 0.32 0.32 0.32 0.32 0.29 0.27 0.28 0.29 0.35

22297.46 392.16 121.40 171.67 25.98 25.12 27.46 175.32 175.62 24.75 25.98 25.45 25.27 24.14 26.39 24.66 24.18 23.33 22.82 22.45 22.86 21.84 22.04 23.41 27.05 38.10

26 27 28 29 30 31 32

OSAH+PAR(atc) PARSAH+PAR(atc) OSAH+PER(atc) PERSAH+PER(atc) SPHSAH+PER(atc) OSAH+SPH(atc) SPHSAH+SPH(atc)

16873 17893 16873 13820 17146 16873 17146

16874 17894 16874 13821 17147 16874 17147

3836 3946 3836 3514 3948 3836 3948

25848 57241 25848 23614 40987 25848 40987

7.16 6.72 6.41 44.33 36.64 6.38 36.26

27.96 19.95 27.79 21.23 29.24 27.50 28.99

5.49 3.93 5.51 4.28 5.70 5.45 5.65

2.98 1.94 2.92 2.32 3.01 2.88 2.98

1.69 2.46 1.72 168.79 8.35 1.71 8.32

3.98 3.77 4.28 11.71 9.07 4.60 9.34

12.31 12.31 11.89 11.89 11.89 11.70 11.70

0.42 0.50 0.41 0.50 0.62 0.40 0.61

15.85 14.64 13.97 28.72 37.44 9.63 30.29

33 34 35 36 37 38 39 40 41 42 43 44 45 46 47

OSAH+TAseq (16,2) OSAH+TAA rec (16,2) OSAH+TAB rec (16,2) OSAH+TASNL (16,2) OSAH+TANLT (16,2) OSAH+TAseq (18,2) OSAH+TAA rec (18,2) OSAH+TAB rec (18,2) OSAH+TASNL (18,2) OSAH+TANLT (18,2) OSAH+TAseq (atc) OSAH+TAA rec (atc) OSAH+TAB rec (atc) OSAH+TASNL (atc) OSAH+TANLT (atc)

8319 8319 8319 8319 46575 13332 13332 13332 13332 74109 16905 16905 16905 16905 85673

8320 8320 8320 8320 8320 13333 13333 13333 13333 13333 16906 16906 16906 16906 16906

1264 1264 1264 1264 1264 1722 1722 1722 1722 1722 3840 3840 3840 3840 3840

19518 19518 19518 19518 19518 27372 27372 27372 27372 27372 25879 25879 25879 25879 25879

15.79 15.06 15.06 15.80 15.80 14.16 13.54 13.54 14.17 14.17 12.89 12.30 12.30 12.90 12.90

58.22 24.45 24.45 20.07 18.10 62.58 25.56 25.56 20.99 18.84 66.24 26.70 26.70 21.64 19.36

4.54 4.48 4.48 4.54 4.54 4.72 4.65 4.65 4.72 4.72 4.92 4.86 4.86 4.92 4.92

1.62 1.61 1.61 1.62 1.62 1.65 1.65 1.65 1.65 1.65 2.20 2.20 2.20 2.20 2.20

1.04 1.02 1.03 1.17 1.22 1.20 1.16 1.17 1.40 1.46 1.22 1.20 1.20 1.49 1.57

20.72 17.04 15.53 16.43 16.61 20.83 16.96 15.31 16.07 16.14 20.87 16.80 15.11 15.87 15.78

13.73 13.73 13.73 13.73 13.73 13.73 13.73 13.73 13.73 13.73 13.73 13.73 13.73 13.73 13.73

0.25 0.33 0.38 0.35 0.35 0.22 0.30 0.35 0.33 0.32 0.19 0.27 0.31 0.29 0.29

36.61 26.41 22.94 24.42 24.51 36.78 26.20 22.24 23.56 23.67 37.06 25.82 21.95 23.72 22.88

48 49 50 51 52 53 54 55 56 57 58

BVH O84 O89 BSP O93 UG AG HUG RG O84A KD

1003 4846 4846 10073 4864 0 2185 424 3320 6963 8317

5001 33924 33924 10074 34053 38369 54303 16086 55985 48746 8318

0 8996 8996 1405 9105 28449 18640 10026 13861 9532 1263

7635 86609 86609 43897 86898 36013 111948 32831 243468 120079 19515

170.08 246.65 246.45 331.92 244.60 283.89 42.26 81.41 50.17 23.67 15.07

135.83 65.19 43.14 39.30 38.09 13.97 21.62 11.44 13.31 46.46 24.47

96.38 10.98 10.97 7.44 15.16 13.97 17.56 8.77 10.85 8.48 4.49

0.00 6.04 6.04 3.22 10.40 10.12 6.67 5.44 6.33 4.46 1.61

1.40 0.46 0.41 1.04 0.42 0.39 3.11 0.42 0.62 1.79 2.02

1146.82 112.75 102.82 151.56 111.15 100.36 139.56 58.44 40.67 37.13 26.68

– – – – – – – – – – –

– – – – – – – – – – –

– – – – – – – – – – –

Table 32: Experimental results, summary for G4SPD scenes, average values are reported.

Scenes = group G5SPD

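Line, Mnemonic Notation, and the Minimum Testing Output (∆, Σ, Θ): NG, NE, NEE, NER, rITM, ÑTS, ÑETS, ÑEETS, TB, TR, ΘAPP, Θrat, ΘRUN.

Each block below lists the line numbers, then the mnemonic notations, then thirteen rows of values, one row per parameter in the order above.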

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25

naïve RSA spatmed-xyz(16,2) objmed-xyz(16,2) objmed(16,2) OSAH(16,2) OSAH-RMI(16,2) OSAH-xyz(16,2) OSAH(8,1) OSAH(8,2) OSAH(16,1) OSAH(16,2) OSAH(24,1) OSAH(24,2) OSAH(atc) OSAH2(atc) OSAH+LC(atc) OSAH+TPC(atc) OSAH+TPC+LC(atc) OSAH+LC(16,1) OSAH+TPC(16,1) OSAH+TPC+LC(16,1) OSAH+PR(atc) OSAH+SC(atc) OSAH+GCM(atc) OSAH+GCM2(atc) OSAH+GCM3(atc)

0 16132 61558 63808 23113 23113 22873 181 180 24068 23113 476453 251440 249915 356130 277145 203940 252635 24077 24243 24252 249915 262549 348137 629685 348085

0 16133 61559 63809 23114 23114 22874 182 181 24069 23114 476454 251441 249916 356131 277146 203941 252636 24078 24244 24253 249916 262550 348138 629686 348086

1 1735 4176 3393 3580 3580 3145 23 22 4124 3580 83642 28940 55707 62027 55889 51251 51449 4134 4291 4300 55707 73790 71380 53090 50674

98880 215542 331923 237074 147514 147514 155665 104396 104396 147680 147514 663728 498597 366588 519386 445138 314085 442823 147680 147725 147725 304152 328496 500124 1146337 575455

– 2088.90 372.87 452.42 41.69 41.69 52.60 2659.95 2660.08 40.80 41.69 11.53 14.18 12.36 11.99 12.13 12.94 12.64 40.50 40.63 40.41 10.90 9.60 11.89 13.87 115.65

0.00 38.15 109.84 208.61 26.99 26.99 30.38 12.66 12.62 27.54 26.99 38.32 33.74 35.23 43.45 36.11 34.28 35.61 27.60 27.57 27.63 35.28 35.10 37.90 42.23 45.55

0.00 7.07 22.01 40.90 4.71 4.71 5.54 2.62 2.61 4.79 4.71 6.55 5.76 6.01 6.90 6.14 5.86 6.05 4.78 4.80 4.80 6.04 5.97 6.42 7.07 7.08

0.00 3.37 1.89 4.25 1.78 1.78 2.50 0.63 0.62 1.91 1.78 3.00 2.21 2.90 3.33 2.95 2.84 2.89 1.97 1.92 1.98 2.90 3.40 3.10 2.72 3.13

0.21 3.31 13.51 19.24 19.52 19.60 12.50 14.01 14.01 19.60 19.52 35.32 28.70 27.26 39.95 28.95 27.16 33.99 22.63 23.20 23.22 40.48 66.34 1066.65 2171.18 1145.45

103166.96 1205.40 149.05 190.85 26.35 26.44 28.11 883.41 884.45 26.24 26.35 23.46 21.09 19.94 21.98 20.41 19.79 20.04 25.16 25.28 25.44 19.32 18.51 21.13 28.14 137.11

13.40 13.40 13.40 13.40 13.40 13.40 13.40 13.40 13.40 13.40 13.40 13.40 13.40 13.40 13.40 13.40 13.40 13.40 13.40 13.40 13.40 13.40 13.40 13.40 13.40 13.40

1.00 0.84 0.68 0.59 0.60 0.60 0.61 0.99 0.99 0.59 0.60 0.26 0.34 0.29 0.26 0.28 0.31 0.29 0.59 0.59 0.59 0.26 0.25 0.27 0.28 0.39

304702.66 5192.23 427.09 639.95 46.27 46.09 53.52 2696.34 2698.10 46.44 46.27 31.60 30.14 28.78 33.00 29.28 28.49 27.55 43.20 43.23 43.26 26.87 26.11 29.78 36.30 111.59

26 27 28 29 30 31 32

OSAH+PAR(atc) PARSAH+PAR(atc) OSAH+PER(atc) PERSAH+PER(atc) SPHSAH+PER(atc) OSAH+SPH(atc) SPHSAH+SPH(atc)

249915 227704 249915 197143 256487 249915 256487

249916 227705 249916 197144 256488 249916 256488

55707 55818 55707 50704 58838 55707 58838

366588 963815 366588 325145 600446 366588 600446

10.19 12.48 6.42 232.42 1762.56 6.40 1781.15

47.77 29.04 36.59 27.09 41.47 36.23 41.11

9.10 5.72 6.77 5.17 7.76 6.71 7.69

5.16 2.84 3.87 3.03 4.35 3.84 4.33

31.55 44.81 31.88 2683.95 133.65 31.71 133.96

6.21 6.77 5.56 69.67 411.75 5.91 414.12

10.96 10.96 11.13 11.13 11.13 12.40 12.40

0.44 0.54 0.42 0.54 0.62 0.44 0.62

21.83 21.59 16.64 175.09 2717.62 15.13 2410.13

33 34 35 36 37 38 39 40 41 42 43 44 45 46 47

OSAH+TAseq (16,2) OSAH+TAA rec (16,2) OSAH+TAB rec (16,2) OSAH+TASNL (16,2) OSAH+TANLT (16,2) OSAH+TAseq (18,2) OSAH+TAA rec (18,2) OSAH+TAB rec (18,2) OSAH+TASNL (18,2) OSAH+TANLT (18,2) OSAH+TAseq (atc) OSAH+TAA rec (atc) OSAH+TAB rec (atc) OSAH+TASNL (atc) OSAH+TANLT (atc)

23113 23113 23113 23113 118817 57566 57566 57566 57566 316687 250908 250908 250908 250908 1053664

23114 23114 23114 23114 23114 57567 57567 57567 57567 57567 250909 250909 250909 250909 198637

3581 3581 3581 3581 3581 11238 11238 11238 11238 11238 55701 55701 55701 55701 53522

147514 147514 147514 147514 147514 187225 187225 187225 187225 187225 367546 367546 367546 367546 319202

43.17 41.61 41.61 43.21 43.21 22.93 21.95 21.95 22.95 22.95 12.92 12.35 12.35 12.94 13.80

65.02 26.99 26.99 21.36 19.25 76.98 30.04 30.04 23.91 21.36 98.77 35.22 35.22 27.85 23.90

4.76 4.70 4.70 4.76 4.76 5.24 5.17 5.17 5.24 5.24 6.08 6.00 6.00 6.09 5.92

1.78 1.78 1.78 1.78 1.78 2.06 2.06 2.06 2.06 2.06 2.90 2.89 2.89 2.90 2.83

16.16 16.12 16.15 16.76 16.73 17.48 17.44 17.43 18.63 19.06 23.46 22.61 22.75 35.77 32.66

30.97 26.73 25.00 30.54 27.10 26.53 22.31 20.36 21.52 22.88 29.18 21.55 19.06 22.86 30.29

13.40 13.40 13.40 13.40 13.40 13.40 13.40 13.40 13.40 13.40 13.40 13.40 13.40 13.40 13.92

0.45 0.54 0.60 0.54 0.57 0.31 0.38 0.44 0.42 0.39 0.17 0.24 0.29 0.25 0.17

58.86 47.46 43.02 52.82 49.04 47.69 35.63 30.83 33.55 35.40 48.11 32.13 26.51 30.41 54.49

48 49 50 51 52 53 54 55 56 57 58

BVH O84 O89 BSP O93 UG AG HUG RG O84A KD

14049 8712 8712 16132 8712 0 14108 4378 11437 13023 23113

66508 60985 60985 16133 60985 471705 507733 158654 387369 91166 23114

0 12877 12877 1735 12877 381796 209423 111416 141279 17164 3580

98880 324543 324543 215542 324544 385702 1026475 414167 2344246 411170 147514

769.52 781.94 779.80 2088.90 768.76 754.85 48.60 241.48 106.91 62.75 41.69

731.69 64.12 41.87 38.15 36.52 28.81 23.63 18.07 23.06 49.14 26.99

519.51 10.56 10.55 7.07 14.20 28.81 20.89 15.50 20.49 8.61 4.71

0.00 6.08 6.08 3.37 9.92 24.21 14.87 12.07 15.19 4.71 1.78

131.74 3.93 3.70 4.38 3.65 5.67 175.17 4.98 6.00 17.31 22.26

8772.62 1035.91 999.11 1453.74 1016.39 282.33 190.84 100.59 72.52 49.79 40.39

– – – – – – – – – – –

– – – – – – – – – – –

– – – – – – – – – – –

Table 33: Experimental results, summary for G5SPD scenes, average values are reported.

Scenes = group G3SPD, G4SPD, and G5SPD

Line, Mnemonic Notation, and the Minimum Testing Output (∆, Σ, Θ): NG, NE, NEE, NER, rITM, ÑTS, ÑETS, ÑEETS, TB, TR, ΘAPP, Θrat, ΘRUN.

Each block below lists the line numbers, then the mnemonic notations, then thirteen rows of values, one row per parameter in the order above.

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25

naïve RSA spatmed-xyz(16,2) objmed-xyz(16,2) objmed(16,2) OSAH(16,2) OSAH-RMI(16,2) OSAH-xyz(16,2) OSAH(8,1) OSAH(8,2) OSAH(16,1) OSAH(16,2) OSAH(24,1) OSAH(24,2) OSAH(atc) OSAH2(atc) OSAH+LC(atc) OSAH+TPC(atc) OSAH+TPC+LC(atc) OSAH+LC(16,1) OSAH+TPC(16,1) OSAH+TPC+LC(16,1) OSAH+PR(atc) OSAH+SC(atc) OSAH+GCM(atc) OSAH+GCM2(atc) OSAH+GCM3(atc)

0 10932 29952 30082 11100 11100 10918 165 162 13393 11100 183695 96392 89610 126970 99975 72890 91333 13410 13486 13490 89610 94051 124325 223977 124524

0 10933 29953 30083 11101 11101 10919 166 163 13394 11101 183696 96393 89611 126971 99976 72891 91334 13411 13487 13491 89611 94052 124326 223978 124525

1 1340 2390 1710 1700 1700 1581 25 23 2644 1700 31042 10551 20014 22152 20082 18353 18431 2652 2720 2725 20014 26558 25535 19072 18194

35845 93156 131863 97811 56873 56873 59592 38149 38148 58059 56873 256017 191150 131791 185669 161557 112851 161273 58070 58102 58102 109783 118038 178988 408151 206101

– 833.05 176.63 217.95 23.23 23.23 27.77 987.52 987.93 21.38 23.23 10.89 13.51 12.49 11.92 12.42 13.45 13.36 21.26 21.31 21.22 11.44 9.90 11.61 13.77 53.95

0.00 38.11 79.85 122.15 23.48 23.48 25.92 12.57 12.43 25.41 23.48 31.14 26.83 27.06 32.43 28.05 26.09 27.65 25.44 25.44 25.47 27.04 27.15 29.25 33.12 35.37

0.00 7.22 16.02 24.73 4.30 4.30 4.93 2.68 2.65 4.63 4.30 5.57 4.83 4.89 5.56 5.04 4.73 4.98 4.62 4.63 4.63 4.89 4.90 5.24 5.81 5.87

0.00 3.13 2.31 2.95 1.48 1.48 2.01 0.69 0.62 1.91 1.48 2.36 1.64 2.21 2.48 2.24 2.16 2.18 1.94 1.92 1.95 2.22 2.58 2.39 2.06 2.36

0.07 1.25 4.92 6.98 6.97 7.00 4.45 4.96 4.95 7.04 6.97 12.79 10.31 9.61 14.08 10.25 9.57 12.04 8.13 8.33 8.34 14.29 23.05 380.31 771.27 408.32

37225.86 445.51 75.52 97.15 18.74 18.56 19.52 326.62 327.00 18.34 18.74 17.85 17.01 16.26 17.31 16.70 16.37 16.35 17.50 17.54 17.65 15.57 15.14 16.44 19.92 60.00

13.47 13.47 13.47 13.47 13.47 13.47 13.47 13.47 13.47 13.47 13.47 13.47 13.47 13.47 13.47 13.47 13.47 13.47 13.47 13.47 13.47 13.47 13.47 13.47 13.47 13.47

1.00 0.65 0.54 0.50 0.46 0.46 0.46 0.90 0.90 0.41 0.46 0.27 0.35 0.32 0.29 0.31 0.35 0.33 0.41 0.41 0.41 0.30 0.28 0.30 0.30 0.37

109759.85 1889.17 203.90 299.74 30.94 30.23 33.77 969.73 970.44 30.20 30.94 25.76 25.20 24.14 26.51 24.67 24.12 23.14 27.77 27.60 27.84 22.30 21.94 24.01 28.23 57.89

26 27 28 29 30 31 32

OSAH+PAR(atc) PARSAH+PAR(atc) OSAH+PER(atc) PERSAH+PER(atc) SPHSAH+PER(atc) OSAH+SPH(atc) SPHSAH+SPH(atc)

89610 82577 89610 70863 91842 89610 91842

89611 82578 89611 70864 91843 89611 91843

20014 20101 20014 18223 21101 20014 21101

131791 341823 131791 117163 215113 131791 215113

7.86 8.14 6.41 96.51 605.19 6.39 611.22

31.01 20.95 28.19 21.29 30.16 27.91 29.91

6.07 4.19 5.50 4.28 5.85 5.45 5.80

3.30 2.05 2.93 2.30 3.07 2.91 3.05

11.13 15.82 11.25 956.44 47.58 11.19 47.68

4.43 4.49 4.56 28.73 141.79 4.88 142.78

11.53 11.53 12.05 12.05 12.05 12.13 12.13

0.44 0.52 0.42 0.53 0.60 0.44 0.62

16.37 15.41 14.05 72.60 924.01 11.42 818.01

33 34 35 36 37 38 39 40 41 42 43 44 45 46 47

OSAH+TAseq (16,2) OSAH+TAA rec (16,2) OSAH+TAB rec (16,2) OSAH+TASNL (16,2) OSAH+TANLT (16,2) OSAH+TAseq (18,2) OSAH+TAA rec (18,2) OSAH+TAB rec (18,2) OSAH+TASNL (18,2) OSAH+TANLT (18,2) OSAH+TAseq (atc) OSAH+TAA rec (atc) OSAH+TAB rec (atc) OSAH+TASNL (atc) OSAH+TANLT (atc)

11100 11100 11100 11100 58376 24475 24475 24475 24475 134595 89955 89955 89955 89955 359820

11101 11101 11101 11101 11101 24476 24476 24476 24476 24476 89956 89956 89956 89956 68183

1700 1700 1700 1700 1700 4414 4414 4414 4414 4414 20014 20014 20014 20014 18107

56874 56874 56874 56874 56874 73134 73134 73134 73134 73134 132123 132123 132123 132123 109002

24.16 23.20 23.20 24.17 24.17 16.84 16.11 16.11 16.85 16.85 13.06 12.48 12.48 13.06 13.33

54.88 23.47 23.47 19.13 17.32 60.89 24.99 24.99 20.40 18.37 68.95 27.06 27.06 21.83 19.15

4.35 4.29 4.29 4.35 4.35 4.60 4.53 4.53 4.60 4.60 4.94 4.88 4.88 4.94 4.85

1.48 1.48 1.48 1.48 1.48 1.59 1.59 1.58 1.59 1.59 2.21 2.21 2.21 2.21 2.17

5.77 5.75 5.76 6.02 6.03 6.27 6.24 6.24 6.73 6.90 8.26 7.97 8.02 12.46 10.73

22.48 18.94 17.52 19.84 18.86 21.14 17.50 15.93 16.75 17.25 21.95 17.15 15.41 17.10 19.17

13.47 13.47 13.47 13.47 13.47 13.47 13.47 13.47 13.47 13.47 13.47 13.47 13.47 13.47 13.64

0.32 0.40 0.46 0.42 0.42 0.26 0.34 0.39 0.37 0.36 0.20 0.28 0.32 0.29 0.27

41.11 31.28 27.75 31.89 30.88 37.56 27.37 23.52 25.27 25.91 37.69 25.95 21.87 24.21 31.10

48 49 50 51 52 53 54 55 56 57 58

BVH O84 O89 BSP O93 UG AG HUG RG O84A KD

5055 5585 5585 10932 5597 0 5498 1615 4994 8027 11100

24049 39097 39097 10933 39180 171734 189135 58906 149231 56192 11101

0 8888 8888 1340 8947 137798 76606 40749 52154 10429 1700

35845 153693 153693 93156 153845 142578 383578 150552 867538 200194 56873

330.80 366.77 365.94 833.05 361.37 397.47 41.14 155.46 65.28 36.01 23.23

298.66 63.16 41.92 38.11 37.19 16.97 18.46 12.10 15.07 44.57 23.48

212.32 10.68 10.66 7.22 14.85 16.97 15.61 9.62 12.74 8.19 4.30

0.00 5.78 5.78 3.13 10.12 13.03 8.06 6.50 8.24 4.22 1.48

44.41 1.52 1.41 2.06 1.41 2.03 59.49 1.81 2.22 6.54 8.23

3376.05 400.24 381.70 560.34 392.80 145.37 127.48 77.38 47.84 39.08 29.42

– – – – – – – – – – –

– – – – – – – – – – –

– – – – – – – – – – –

Table 34: Experimental results, summary for G3SPD, G4SPD, and G5SPD scenes, average values are reported.
