Subdivision surface efficiency relies on caching pre-computed topology
data for evaluation between frames. However, while eed45d2a23
introduced a second GPU subdiv evaluator type, it still only kept
one slot for caching this runtime data per mesh.
The result is that if the mesh is also needed on CPU, for instance
due to a modifier on a different object (e.g. shrinkwrap), the two
evaluators are used at the same time and fight over the single slot.
This causes the topology data to be discarded and recomputed twice
per frame.
Since avoiding duplicate evaluation is a complex task, this fix
simply adds a second separate cache slot for the GPU data, so that
the cost is simply running subdivision twice, not recomputing topology
twice.
To help diagnostics, I also add a message to show when GPU evaluation
is actually used to the modifier panel. Two frame counters are used
to suppress flicker in the UI panel.
Differential Revision: https://developer.blender.org/D17117
Pull Request #104441
The ear clipping method used by polyfill_2d only excluded concave ears
which meant ears exactly co-linear edges created zero area triangles
even when convex ears are available.
While polyfill_2d prioritizes performance over *pretty* results,
there is no need to pick degenerate triangles with other candidates
are available. As noted in code-comments, callers that require higher
quality tessellation should use BLI_polyfill_beautify.
Sockets after the geometry socket were ignored when cycling through
the node's output sockets. If there are multiple geometry sockets, the
behavior could still be refined probably, but this should at least make
basic non-geometry socket cycling work.
Now a single script to generate both links and release notes. It also includes
the issue ID for the LTS releases, so only the release version needs to be
specified.
Pull Request #104402
Minor change to [0], prefer calling em_setup_viewcontext,
even though there is no functional difference at the moment,
if this function ever performs additional operations than assigning
`ViewContext.em`, it would have to be manually in-lined in
`view3d_circle_select_recalc`.
[0]: 430cc9d7bf
Added missing documentation for `draw_cursor_add` and
`draw_cursor_remove` methods for `WindowManager`.
Differential Revision: https://developer.blender.org/D14860
Discard is not always treated as an explicit return and flow control can continue for required derivative calculations. This behaviour is different in Metal vs OpenGL. Adding return after discards ensures consistency in expectation as behaviour is well-defined.
Authored by Apple: Michael Parkin-White
Ref T96261
Reviewed By: fclem
Maniphest Tasks: T96261
Differential Revision: https://developer.blender.org/D17199
Host memory fallback in CUDA and HIP devices is almost identical.
We remove duplicated code and create a shared generic version that
other devices (oneAPI) will be able to use.
Reviewed By: brecht
Differential Revision: https://developer.blender.org/D17173
Straightforward port. I took the oportunity to remove some C vector
functions (ex: copy_v2_v2).
This makes some changes to DRWView to accomodate the alignement
requirements of the float4x4 type.
`9c14039a8f4b5f` broke blenlib tests in release builds, due to how
`EXPECT_BLI_ASSERT` works (in release builds it just calls the given
function, so if that crashes then the test fails).
For now remove that check in the test.
Remove the use of a separate contiguous positions array now that
they are stored that way in the first place. This allows removing the
complexity of tracking whether it is allocated and deformed in the
mesh modifier stack.
Instead of deferring the creation of the final mesh until after the
positions have been copied and deformed, create the final mesh
first and then deform its positions.
Since vertex and face normals are calculated lazily, we can rely on
individual modifiers to calculate them as necessary and simplify
the modifier stack. This was hard to change before because of the
separate array of deformed positions.
Differential Revision: https://developer.blender.org/D16971
When activating a rotation with the Transform gizmo for example, some
gizmos are hidden but they don't reappear when changing the mode.
Make sure the gizmos corresponding to the mode always reappear.
This patch optimises subsurface intersection queries on MetalRT. Currently intersect_local traverses from the scene root, retrospectively discarding all non-local hits. Using a lookup of bottom level acceleration structures, we can explicitly query only the relevant instance. On M1 Max, with MetalRT selected, this can give a render speedup of 15-20% for scenes like Monster which make heavy use of subsurface scattering.
Patch authored by Marco Giordano.
Reviewed By: brecht
Differential Revision: https://developer.blender.org/D17153
`em_setup_vivewcontext` cannot be used in this function now as it
expects `obedit` to be a mesh. It also duplicated the viewcontext init.
Instead `BKE_editmesh_from_object` is called only when type is a mesh.
Existing `BKE_main_namemap_destroy` is too specific when a entire Main
needs to have its namemaps cleared, since it would not handle the
Library ones.
While in regular situation current code is fine, ID management code that
may also edit linked data needs a wider, simpler clearing tool.
Current `BKE_id_remapper_add` would not replace an already existing
mapping rule, now `BKE_id_remapper_add_overwrite` allows that behavior
if necessary.
If identity pairs (i.e. old ID pointer being same as new one) was
forbidden, then this should be asserted against in code defining
remapping, not in code applying it.
But it is actually sometimes usefull to allow/use identity pairs, so
simply early-return on these instead of asserting.
The goal is to give technical artists the ability to optimize modifier usage
and/or geometry node groups for performance. In the long term, it
would be useful if Blender could provide its own UI to display profiling
information to users. However, right now, there are too many open
design questions making it infeasible to tackle this in the short term.
This commit uses a simpler approach: Instead of adding new ui for
profiling data, it exposes the execution-time of modifiers in the Python
API. This allows technical artists to access the information and to build
their own UI to display the relevant information. In the long term this
will hopefully also help us to integrate a native ui for this in Blender
by observing how users use this information.
Note: The execution time of a modifier highly depends on what other
things the CPU is doing at the same time. For example, in many more
complex files, many objects and therefore modifiers are evaluated at
the same time by multiple threads which makes the measurement
much less reliable. For best results, make sure that only one object
is evaluated at a time (e.g. by changing it in isolation) and that no
other process on the system keeps the CPU busy.
As shown below, the execution time has to be accessed on the
evaluated object, not the original object.
```lang=python
import bpy
depsgraph = bpy.context.view_layer.depsgraph
ob = bpy.context.active_object
ob_eval = ob.evaluated_get(depsgraph)
modifier_eval = ob_eval.modifiers[0]
print(modifier_eval.execution_time, "s")
```
Differential Revision: https://developer.blender.org/D17185