__GCC_HAVE_SYNC_COMPARE_AND_SWAP_n are not defined on this architecture
for some reason, however those functions are available.
Adding a workaround for newly used __sync_fetch_and_and() function.
Couple of issues here:
- Was a bug in heap memory allocation when run out
of allowed stack memory.
- Debug MSVC was failing because it uses separate
allocator for some sort of internal proxy thing,
which seems to be unable to be using stack memory
because allocator is being created in non-persistent
stack location.
Issue was that before writing to disk grids are clipped against the
density field's tree to optimize for memory/disk space, which in the
case of simulations with no density field results in all grids having
empty trees.
For now avoid clipping against empty grids, but perhaps in the future it
can be interresting to have some UI parameters to let the user choose
the grid used for clipping.
This mainly touches extern libraries and few debug-only places in intern.
Some summary:
- External libraries are not strict at all about missing declarations,
so we can rather safely remove such warning together with other strict
flags.
- Bullet has some static functions which are not used.
Those were commented out.
- Carve now has some unused debug-only functions commented out as well.
While we're on the way of getting rid of Carve, it makes sense to make
things a bit cleaner for the time being.
- In LZMA we have some parts disabled which gives some set but unused
variables which is rather correct.
- Elbeem had quite some variables set and never used because their usage
is inside of debug-only code which is commented out.
Note about patching upstream libraries: surely one might say that we
have to make local patchset against this, but own experience says it
only gives extra work trying to merge such tweaks to a new upstream
version and usually it's just faster to re-apply such fixes again after
bundling new upstream library.
Mainly makes logging less verbose when doing progressive sampling in viewport.
Such kind of verbosity is not really possible to be filtered out with `grep`
so let's reshuffle few lines of code.
Simple idea, use threads when dealing with "Copying Transformations to device"
scene update step. Only do it if there's enough objects in the scene.
Hopefully only brings less synchronization time and doesn't break anything.
From tests on my desktop this brings down transform update time from 58sec to
11sec on victor_cpu.blend scene from out benchmark.
This is an attempt to gracefully handle out-of-memory events
and stop rendering with an error message instead of a crash.
It uses bad_alloc exception, and usually i'm not really fond
of exceptions, but for such limited use for errors from which
we can't recover it should be fine.
Ideally we'll need to stop full Cycles Session, so viewport
render and persistent images frees all the memory, but that
we can support later, since it'll mainly related on telling
Blender what to do.
General rules are:
- Use as less exception handles as possible, try to find a
most geenric pace where to handle those.
For example, ccl::Session.
- Threads needs own handling, exception trap from one thread
will not catch exceptions from other threads.
That's why BVH build needs own thing.
Reviewers: brecht, juicyfruit, dingto, lukasstockner97
Differential Revision: https://developer.blender.org/D1898
This mimics behavior of default allocators in STL and allows all the routines
to catch out-of-memory exceptions and hopefully recover from that situation/
This commits implements threaded sorting of references when looking for
object spatial split. It mainly useful when doing initial binning, which
happens from main thread.
Gives nice speedup of BVH build for Bunny.blend: 36sec vs. 55sec for
the Rabbit mesh BVH build.
On more complex scenes the speedup is probably minimal, but still nice
to have more instant rendering for simplier scenes.
Some further tests with production scenes would be interesting.
Reviewers: juicyfruit, dingto, lukasstockner97, brecht
Reviewed By: brecht
Differential Revision: https://developer.blender.org/D1928
Note that this only removes the actual dependencies of Cycles on the
particle code in Blender, but not the internal "particle" definition
or the curve type handling inside Cycles. These structures may be in need
of some improvement themselves, but that is out of scope here.
- Fix wrong current sample reported in the log
- Also includes fix for progressive refine log
- Explicitly print to the stdout that resumable render is enabled
- Print error message and abort when passing wrong values for the
resumable render. Never waste someone's compute power for wrong
render!
Fixes T48185: Cycles resumable num chunks breaks sample counter
This is to prevent situations such as when the camera gets very close to a mesh
and causes it to be tessellated into an excessive amount of micropolygons. In
REYES this is known as the eye-splits problem.
Reviewed By: brecht
Differential Revision: https://developer.blender.org/D1922
This makes it easier to control overall dicing rate without having to tweak
every object. The preview rate makes viewport editing more interactive. The
default preview rate of 8 is roughly 64 times faster for some operations.
Reviewed By: brecht
Differential Revision: https://developer.blender.org/D1919
Instead of treating Fermi GPU limits as default,
and overriding them for other devices,
we now nicely set them for each platform.
* Due to setting values for all platforms,
we don't have to offset the slot id for OpenCL anymore,
as the image manager wont add float images for OpenCL now.
* Bugfix: TEX_NUM_FLOAT_IMAGES was always 5, even for CPU,
so the code in svm_image.h clamped float textures with alpha on CPU after the 5th slot.
Reviewers: #cycles, brecht
Reviewed By: #cycles, brecht
Subscribers: brecht
Differential Revision: https://developer.blender.org/D1925
Seems particular CUDA implementations has some precision issues,
which made integer coordinate (which was expected to always be
positive) to go negative.
Simple fix: release lock earlier.
Reduces spatial split build time from 96 to 53sec on the Bunny.blend
(using studio Intel for benchmark).
NOTE: Timing difference is not that spectacular when comparing numbers
with builds before memory optimization, but even then it's about 20%
faster build.
Similar to velocity, it was kind of supported by the mesh manager but
was missing a code in BlenderSession to get actual values.
In Cycles Heat is an attribute which goes from -1 to 1, where -1 is
the coldest ever temperature, 1 is the hottest ever one.
This was only visible on systems with lots of threads and root of the issue
was that we've been pre-allocating too much memory for all the threads.
Now we only pre-allocate data for the main thread and rest of the threads
does allocation on-demand.
This brings down memory usage from 36Gig to 6.9Gig when building spatial
split for the Bunny.blend file on our Intel beast.
Originally regression was happened by the threaded spacial split builder
commit.
Couple of things:
- No need to use string streams to format the version string,
we can do it at compile time and don't bother with anything
at runtime.
- Function declaration was wring and would have caused linking
conflicts in cases when util_version.h was included from
multiple places.
We should have an utility function to get Cycles version so
applications which are linked to Cycles dynamically can query
the version, but that can't be done as an inlined function in
header and would need to be a function properly exported to a
global symbol table (aka, be implemented in a .cpp file).
Now Cycles has its own versioning, that is mainly interesting for external projects, which integrate the engine.
We start with version 1.7.0. Reasons for that:
* The engine is too mature for a 1.0 release.
* We assume that Cycles inside of Blender 2.61 was version 0.1. We count upwards in 0.1 steps, therefore Cycles inside of Blender 2.77 would be 1.7.
We use a common versioning scheme here, with 3 decimals for the major, minor and patch level.
At the moment cycles --version can be used to display the version, easy to parse for external projects. The info will be added to the UI later aswell.
This way we prevent cracks in the model due to discontinuous normals, by using
smooth normals for displacement instead of always getting flat normals after
linear subdivision.
Reviewed By: brecht
Differential Revision: https://developer.blender.org/D1916