[brlcad-devel] Region tests before boolean evaluation

Discussion:

Marco Domingues

2017-05-17 17:20:38 UTC

Vasco Alexandre da Silva Costa

2017-05-17 23:42:44 UTC

Post by Marco Domingues
Hi,
When looking at Boolean evaluation code i noticed the use of the struct
bu_bitv (struct bu_bitv *solidbits) to test if a given region is ready to
be evaluated, by checking if every solid in the region has been intersected
(function âbool_partition_eligibleâ).
From my understanding, these tests are not necessary to be ported to
OpenCl, because at the moment of boolean evaluation, every solid is
count_hits()
store_segs()
weave_segs()*
eval_partitions()*
shade_segs()
*yet to be implemented
Is this correct? Or the struct bu_bitv is used for something else in the
background other than just checking if every solid has been intersected?

Right. If I remember correctly that bitvector is used to ensure that we do
not intersect each object more than once. It is basically a performance
optimization because, if the acceleration structure is a spatial partition
like a kd-tree or a grid, it is possible that you will compute multiple
intersections with the same object in the same ray otherwise. But the thing
is, I implemented a BVH object partition acceleration structure for the
OpenCL version of the raytracer code. In a BVH you will never compute an
intersection with any object more than once along the same ray. That's just
how it is. So the final list should have no duplicates already.

Hence we should not need the bitvector at all. That is why I used a BVH in
the first place. Having a per thread bitvector with one bit per primitive
in a GPU, with possibly thousands of threads in flight, would use a lot of
cache space which would make the intersection computations really slow. You
want to minimize the amount of temporaries per thread on a GPU so you can
have more threads in flight without waiting for accesses to DRAM which.are
really painfully slow in comparison to the cache.

Regards,

--
Vasco Alexandre da Silva Costa
PhD in Computer Engineering (Computer Graphics)
Instituto Superior TÃ©cnico/University of Lisbon, Portugal

Marco Domingues

2017-05-18 12:25:29 UTC

Permalink

Hi Vasco,

Thanks for help and for the explanation!

Regards,

Hi,
When looking at Boolean evaluation code i noticed the use of the struct bu_bitv (struct bu_bitv *solidbits) to test if a given region is ready to be evaluated, by checking if every solid in the region has been intersected (function âbool_partition_eligibleâ).
count_hits()
store_segs()
weave_segs()*
eval_partitions()*
shade_segs()
*yet to be implemented
Is this correct? Or the struct bu_bitv is used for something else in the background other than just checking if every solid has been intersected?
Right. If I remember correctly that bitvector is used to ensure that we do not intersect each object more than once. It is basically a performance optimization because, if the acceleration structure is a spatial partition like a kd-tree or a grid, it is possible that you will compute multiple intersections with the same object in the same ray otherwise. But the thing is, I implemented a BVH object partition acceleration structure for the OpenCL version of the raytracer code. In a BVH you will never compute an intersection with any object more than once along the same ray. That's just how it is. So the final list should have no duplicates already.
Hence we should not need the bitvector at all. That is why I used a BVH in the first place. Having a per thread bitvector with one bit per primitive in a GPU, with possibly thousands of threads in flight, would use a lot of cache space which would make the intersection computations really slow. You want to minimize the amount of temporaries per thread on a GPU so you can have more threads in flight without waiting for accesses to DRAM which.are really painfully slow in comparison to the cache.
Regards,
--
Vasco Alexandre da Silva Costa
PhD in Computer Engineering (Computer Graphics)
Instituto Superior TÃ©cnico/University of Lisbon, Portugal
------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org <http://slashdot.org/>! http://sdm.link/slashdot_______________________________________________ <http://sdm.link/slashdot_______________________________________________>
BRL-CAD Developer mailing list
https://lists.sourceforge.net/lists/listinfo/brlcad-devel <https://lists.sourceforge.net/lists/listinfo/brlcad-devel>

Vasco Alexandre da Silva Costa

2017-05-20 07:30:24 UTC

Permalink

If it turns out that you do need to query if a given primitive is
intersected by any given ray or not, you can, for now, just iterate the
intersections list produced by store_segs() for that ray to see if the
primitive is intersected. A lot of scenes do not have significant depth
complexity so it should still be ok as an initial step. In the longer run
we can think of some other way to speed up that computation if it is
required (e.g. create a sorted intersected primitives list and do a binary
search; or use some kind of hash table).

Post by Marco Domingues
Hi Vasco,
Thanks for help and for the explanation!
Regards,
On 18 May 2017, at 00:42, Vasco Alexandre da Silva Costa <

Post by Marco Domingues
From my understanding, these tests are not necessary to be ported to

OpenCl, because at the moment of boolean evaluation, every solid is
count_hits()
store_segs()
weave_segs()*
eval_partitions()*
shade_segs()
*yet to be implemented
Is this correct? Or the struct bu_bitv is used for something else in the
background other than just checking if every solid has been intersected?

Right. If I remember correctly that bitvector is used to ensure that we do
not intersect each object more than once. It is basically a performance
optimization because, if the acceleration structure is a spatial partition
like a kd-tree or a grid, it is possible that you will compute multiple
intersections with the same object in the same ray otherwise. But the thing
is, I implemented a BVH object partition acceleration structure for the
OpenCL version of the raytracer code. In a BVH you will never compute an
intersection with any object more than once along the same ray. That's just
how it is. So the final list should have no duplicates already.
Hence we should not need the bitvector at all. That is why I used a BVH in
the first place. Having a per thread bitvector with one bit per primitive
in a GPU, with possibly thousands of threads in flight, would use a lot of
cache space which would make the intersection computations really slow. You
want to minimize the amount of temporaries per thread on a GPU so you can
have more threads in flight without waiting for accesses to DRAM which.are
really painfully slow in comparison to the cache.
--

Vasco Alexandre da Silva Costa
PhD in Computer Engineering (Computer Graphics)
Instituto Superior TÃ©cnico/University of Lisbon, Portugal

Vasco Alexandre da Silva Costa

2017-05-20 07:41:40 UTC

Permalink

On Sat, May 20, 2017 at 8:30 AM, Vasco Alexandre da Silva Costa <

Post by Vasco Alexandre da Silva Costa
If it turns out that you do need to query if a given primitive is
intersected by any given ray or not, you can, for now, just iterate the
intersections list produced by store_segs() for that ray to see if the
primitive is intersected. A lot of scenes do not have significant depth
complexity so it should still be ok as an initial step. In the longer run
we can think of some other way to speed up that computation if it is
required (e.g. create a sorted intersected primitives list and do a binary
search; or use some kind of hash table).

PS: Another alternative is to just re-compute the intersection. That would
use the least amount of memory at the expense of aditional computation cost.

Post by Vasco Alexandre da Silva Costa
On Thu, May 18, 2017 at 1:25 PM, Marco Domingues <

Post by Marco Domingues
Hi Vasco,
Thanks for help and for the explanation!
Regards,
On 18 May 2017, at 00:42, Vasco Alexandre da Silva Costa <

Post by Marco Domingues
From my understanding, these tests are not necessary to be ported to

Right. If I remember correctly that bitvector is used to ensure that we
do not intersect each object more than once. It is basically a performance
optimization because, if the acceleration structure is a spatial partition
like a kd-tree or a grid, it is possible that you will compute multiple
intersections with the same object in the same ray otherwise. But the thing
is, I implemented a BVH object partition acceleration structure for the
OpenCL version of the raytracer code. In a BVH you will never compute an
intersection with any object more than once along the same ray. That's just
how it is. So the final list should have no duplicates already.
Hence we should not need the bitvector at all. That is why I used a BVH
in the first place. Having a per thread bitvector with one bit per
primitive in a GPU, with possibly thousands of threads in flight, would use
a lot of cache space which would make the intersection computations really
slow. You want to minimize the amount of temporaries per thread on a GPU so
you can have more threads in flight without waiting for accesses to DRAM
which.are really painfully slow in comparison to the cache.
--

Vasco Alexandre da Silva Costa
PhD in Computer Engineering (Computer Graphics)
Instituto Superior TÃ©cnico/University of Lisbon, Portugal

--
Vasco Alexandre da Silva Costa
PhD in Computer Engineering (Computer Graphics)
Instituto Superior TÃ©cnico/University of Lisbon, Portugal

Marco Domingues

2017-05-20 13:36:23 UTC

Permalink

I will keep in mind those solutions when implementing the code.

Although i donât think that it will be necessary to know if a primitive is intersected by any given ray to evaluate the partitions, it may be necessary later on when implementing the statistics method for CSG coherence.

Regards,

If it turns out that you do need to query if a given primitive is intersected by any given ray or not, you can, for now, just iterate the intersections list produced by store_segs() for that ray to see if the primitive is intersected. A lot of scenes do not have significant depth complexity so it should still be ok as an initial step. In the longer run we can think of some other way to speed up that computation if it is required (e.g. create a sorted intersected primitives list and do a binary search; or use some kind of hash table).
PS: Another alternative is to just re-compute the intersection. That would use the least amount of memory at the expense of aditional computation cost.
Hi Vasco,
Thanks for help and for the explanation!
Regards,

--
Vasco Alexandre da Silva Costa
PhD in Computer Engineering (Computer Graphics)
Instituto Superior TÃ©cnico/University of Lisbon, Portugal
--
Vasco Alexandre da Silva Costa
PhD in Computer Engineering (Computer Graphics)
Instituto Superior TÃ©cnico/University of Lisbon, Portugal
------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot_______________________________________________
BRL-CAD Developer mailing list
https://lists.sourceforge.net/lists/listinfo/brlcad-devel