Commit Graph

389 Commits

Author SHA1 Message Date
Jacques Lucke 51f8bf53b2 Geometry Nodes: use xxhash for compute context hash
Previously, md5 was used which is significantly slower. In almost all cases
this does not have a significant performance impact in practice. However,
it's possible to build geometry nodes setups that become a few percent
faster (by combining lots of cheap node groups). Using xxhash instead of
md5 should never be slower.

Pull Request: https://projects.blender.org/blender/blender/pulls/120225
2024-04-03 20:11:09 +02:00
Campbell Barton 937776b555 Cleanup: sort CMake file lists 2024-04-01 16:48:44 +11:00
Jacques Lucke 7314c86869 BLI: add fixed width integer type
This is intended to be used in the new exact mesh boolean algorithm by @howardt.
The new `BLI_fixed_width_int.hh` header provides types like `Int256` and
`UInt256` which are like e.g. `uint64_t` but with higher precision. The code
supports many different integer sizes.

The following operations are supported:
* Addition
* Subtraction
* Multiplication
* Comparisons
* Negation
* Conversion to and from other number types
* Conversion to and from string (based on `GMP`)

Division is not implemented. It could be implemented, but it's more complex and
is not required for the new mesh boolean algorithm.

Some alternatives to having a custom implementation have been discussed in
https://devtalk.blender.org/t/fixed-length-multiprecision-arithmetic/29189/.

Generally, the implementation is fairly straightforward. The main complexity is
the addition/multiplication algorithm which isn't too complicated. It's nice to
have control over this part as it allows us to optimize the code more if
necessary. Also, from what I understand, we might be able to benefit from some
special cases like multiplying a large integer with a smaller one.
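
As a rough illustration of the addition part, a fixed width unsigned integer can be stored as an array of 64-bit limbs and added limb by limb while propagating the carry. This is only a minimal sketch: `UInt256Sketch` and `add` are hypothetical names, and the actual code in `BLI_fixed_width_int.hh` may be structured differently.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

/* Hypothetical sketch type: 256 bits stored as four 64-bit limbs,
 * least significant limb first. */
struct UInt256Sketch {
  std::array<uint64_t, 4> limbs{};
};

/* Adds two values limb by limb and propagates the carry.
 * Overflow past 256 bits wraps around, like with `uint64_t`. */
inline UInt256Sketch add(const UInt256Sketch &a, const UInt256Sketch &b)
{
  UInt256Sketch result;
  uint64_t carry = 0;
  for (size_t i = 0; i < result.limbs.size(); i++) {
    const uint64_t partial = a.limbs[i] + b.limbs[i];
    /* Unsigned addition wrapped around if the result is smaller than an input. */
    const uint64_t carry_from_partial = partial < a.limbs[i];
    const uint64_t sum = partial + carry;
    const uint64_t carry_from_sum = sum < partial;
    result.limbs[i] = sum;
    /* At most one of the two carries can be set at the same time. */
    carry = carry_from_partial | carry_from_sum;
  }
  return result;
}
```

Without a native 128-bit type, the carry has to be detected with comparisons like the ones above, or with intrinsics.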

I tried some different ways to optimize this already, but so far the normal
compiler optimization turned out to work best. Not sure if the same is true on
Windows though, as it doesn't have native support for an `int128` which helps
the compiler understand what I'm doing. Alternatives I tried so far are using
intrinsics directly (mainly `_addcarry_u64` and similar), writing inline
assembly manually and copying the assembly output from the compiler. I assume
the assembly implementation didn't help for me because it prohibited other
compiler optimizations.

Pull Request: https://projects.blender.org/blender/blender/pulls/119528
2024-03-25 23:39:42 +01:00
Jacques Lucke ee1fa8e1ca BLI: support set operations on index masks
The `IndexMask` data structure was designed to allow us to implement set
operations like `union`, `intersection` and `difference` efficiently
(2cfcb8b0b8). This patch adds an evaluator for
arbitrary expressions involving the mentioned operations. The evaluator makes
use of the design of the `IndexMask` data structure to be quite efficient.

In some common cases, the evaluator runs in constant time. So it's very fast
even if the mask contains many millions of indices. If possible the evaluator
works on entire segments at once instead of looking at the individual indices.
This results in a very low constant factor even if the evaluation time is
linear. If the evaluator has to look at the individual indices to be able to
perform the operation, it can make use of multi-threading.

The evaluation consists of the following steps:
1. A coarse evaluation that looks at entire segments at once.
2. All segments that couldn't be fully evaluated by the coarse evaluation are
   evaluated exactly by looking at the actual indices. There are two evaluators
   for this case. One that is based on `std::set_union` etc. The other one first
   converts the index masks to bit spans, then does bit operations to evaluate
   the expression, and then converts the bits back into indices. Depending on
   the expression, one or the other can be more efficient.
3. Construct an index mask from the evaluated segments.
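
The two exact evaluators from step 2 can be sketched for a plain union of two sorted index lists. This is a simplified illustration under assumptions: it uses `std::vector` instead of `IndexMask` segments, and `union_by_merge` and `union_by_bits` are hypothetical names that do not exist in the patch.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <iterator>
#include <vector>

/* Exact evaluation on sorted index lists, based on `std::set_union`. */
inline std::vector<int64_t> union_by_merge(const std::vector<int64_t> &a,
                                           const std::vector<int64_t> &b)
{
  std::vector<int64_t> result;
  std::set_union(a.begin(), a.end(), b.begin(), b.end(), std::back_inserter(result));
  return result;
}

/* Exact evaluation through bits: indices to bits, bit operation, bits to indices.
 * Assumes that all indices are in the range [0, universe_size). */
inline std::vector<int64_t> union_by_bits(const std::vector<int64_t> &a,
                                          const std::vector<int64_t> &b,
                                          const int64_t universe_size)
{
  std::vector<bool> bits_a(static_cast<size_t>(universe_size), false);
  std::vector<bool> bits_b(static_cast<size_t>(universe_size), false);
  for (const int64_t i : a) {
    bits_a[static_cast<size_t>(i)] = true;
  }
  for (const int64_t i : b) {
    bits_b[static_cast<size_t>(i)] = true;
  }
  std::vector<int64_t> result;
  for (int64_t i = 0; i < universe_size; i++) {
    if (bits_a[static_cast<size_t>(i)] || bits_b[static_cast<size_t>(i)]) {
      result.push_back(i);
    }
  }
  return result;
}
```

Both functions return the same sorted result. Which one is cheaper depends on how dense the indices are and how many operations the expression contains.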

Showing the performance of the evaluator is kind of difficult because it highly
depends on the input data. Comparing the performance to something that does not
short-circuit when there are full ranges is meaningless, because one can
construct an example where the new evaluator is arbitrarily faster. I'm still
working on a case where performance can be compared to e.g. using
`std::set_union`. This comparison is only fair when constructing a case where
the new evaluator can't short-circuit.

One of the main remaining bottlenecks is the calls to `slice_content` on large
index masks. I think the impact of those can still be reduced.

We are not using this evaluator much yet, except through `IndexMask::complement`
calls. I intend to use it when I get to refactoring the field evaluator for
geometry nodes to optimize the evaluation of selections.

Pull Request: https://projects.blender.org/blender/blender/pulls/117805
2024-03-17 09:52:32 +01:00
Campbell Barton 9796805bb8 Cleanup: sort CMake source files 2024-03-07 13:29:09 +11:00
Hans Goudey 139607dd26 Cleanup: Move BLI_bitmap_draw_2d.h to C++ 2024-03-05 10:28:17 -05:00
Hans Goudey 164eb3c25b Cleanup: Move lasso utility files to C++ 2024-03-05 10:23:11 -05:00
Jacques Lucke fe2a47b5a7 BLI: add chunked list data structure that uses linear allocator
This adds a new special purpose container data structure that can be
used to gather many elements into many (potentially small) lists efficiently.

I originally worked on this data structure because I might want to use it
in #118772. However, it's also useful in the geometry nodes logger already.
I'm measuring a 10-20% speed improvement in my many-math-nodes file
when I enable logging for all sockets (not just the ones that are currently visible).

Pull Request: https://projects.blender.org/blender/blender/pulls/118774
2024-02-28 22:22:21 +01:00
Jacques Lucke 1e20f06c21 BLI: add utility to simplify creating proper random access iterator
The difficulty of implementing this iterator is that it requires lots of operator
overloads which are usually very simple to implement, but result in a lot of code.
The goal of this patch is to abstract the common parts so that it becomes easier
to implement random access iterators. Many algorithms can work more
efficiently with random access iterators than with other iterator types.

Also see https://en.cppreference.com/w/cpp/iterator/random_access_iterator
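
One way to abstract the common operators is a CRTP base class. The following is a minimal sketch under assumptions, not the utility added by this patch: `RandomAccessMixinSketch`, `index()` and `EvenIterator` are hypothetical names, and only a subset of the required operators is shown.

```cpp
#include <cstddef>
#include <iterator>

/* Hypothetical CRTP helper. `Derived` only has to provide `index()` (a mutable
 * and a const overload) and its own `operator*`. */
template<typename Derived> class RandomAccessMixinSketch {
 public:
  using iterator_category = std::random_access_iterator_tag;
  using difference_type = std::ptrdiff_t;

  Derived &operator++() { ++self().index(); return self(); }
  Derived &operator--() { --self().index(); return self(); }
  Derived &operator+=(const difference_type n) { self().index() += n; return self(); }
  Derived &operator-=(const difference_type n) { self().index() -= n; return self(); }

  friend Derived operator+(Derived it, const difference_type n) { it += n; return it; }
  friend Derived operator-(Derived it, const difference_type n) { it -= n; return it; }
  friend difference_type operator-(const Derived &a, const Derived &b)
  {
    return a.index() - b.index();
  }
  friend bool operator==(const Derived &a, const Derived &b) { return a.index() == b.index(); }
  friend bool operator!=(const Derived &a, const Derived &b) { return a.index() != b.index(); }
  friend bool operator<(const Derived &a, const Derived &b) { return a.index() < b.index(); }

 private:
  Derived &self() { return static_cast<Derived &>(*this); }
};

/* Example iterator over the even numbers 0, 2, 4, ... */
class EvenIterator : public RandomAccessMixinSketch<EvenIterator> {
  std::ptrdiff_t i_ = 0;

 public:
  using value_type = std::ptrdiff_t;
  using pointer = const std::ptrdiff_t *;
  using reference = std::ptrdiff_t;

  explicit EvenIterator(const std::ptrdiff_t i) : i_(i) {}
  std::ptrdiff_t &index() { return i_; }
  std::ptrdiff_t index() const { return i_; }
  std::ptrdiff_t operator*() const { return i_ * 2; }
};
```

The concrete iterator only contains the parts that are specific to it. A complete version would also need the post-increment and post-decrement operators, the remaining comparisons and `operator[]`.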

Pull Request: https://projects.blender.org/blender/blender/pulls/118113
2024-02-17 20:59:45 +01:00
Campbell Barton 7747b8c944 Fix convexhull_2d_test for macOS & re-enable the test
Use EXPECT_NEAR instead of EXPECT_EQ to account for differences in the
atan2 implementation on macOS. More generally, relying on exact
float comparison in tests is error prone.
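
For illustration, a tolerance based check boils down to comparing the absolute difference against an allowed error. The helper below is a hypothetical stand-in that mirrors this idea without depending on GTest, it is not part of the test file.

```cpp
#include <cmath>

/* Hypothetical helper: true if `a` and `b` differ by at most `abs_error`. */
inline bool is_near(const float a, const float b, const float abs_error)
{
  return std::fabs(a - b) <= abs_error;
}
```

With such a check, a result like `std::atan2(1.0f, 1.0f)` can be compared against a reference value even if the last bits differ between platforms.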
2024-02-13 14:07:26 +11:00
Hans Goudey 1394907474 Cleanup: Move uvproject.c to C++ 2024-02-12 20:43:24 -05:00
Campbell Barton fb81bbaa60 Tests: disable BLI_convexhull_2d_test which fails on macOS 2024-02-13 00:34:44 +11:00
Campbell Barton b91918564d Tests: add tests for convexhull_2d
Move BLI_convexhull_aabb_fit_points_2d to a public function to be able
to compare fitting one convex hull with a simple reference
method.

One test is disabled as it exposes an error in convex hull calculation
which needs further investigation.
2024-02-12 20:17:19 +11:00
Campbell Barton 727d47c015 Cleanup: move convexhull_2d to C++ 2024-02-10 22:40:46 +11:00
Jacques Lucke 47cf827049 Cleanup: add forward declaration header for IndexMask and VArray
This avoids duplicating the declaration in multiple places.
2024-02-04 11:55:45 +01:00
Ray Molenkamp fc409e4388 Cleanup: CMake: Modernize extern_fmtlib dependencies
Pretty straightforward

- Remove any fmtlib paths from INC
- Add a dependency through LIB when missing

context: https://devtalk.blender.org/t/cmake-cleanup/30260

Pull Request: https://projects.blender.org/blender/blender/pulls/117787
2024-02-03 18:55:09 +01:00
Jacques Lucke 319b911784 Cleanup: move hash and ghash utils to C++
Also see #103343.

Pull Request: https://projects.blender.org/blender/blender/pulls/117761
2024-02-02 19:55:06 +01:00
Jacques Lucke 311ca3e6af Core: rename Session UUID to Session UID
`UUID` generally stands for "universally unique identifier". The session identifier that
we use is neither universally unique, nor does it follow the standard. Therefore, the term
"session uuid" is confusing and should be replaced.

In #116888 we briefly talked about a better name and ended up with "session uid".
The reason for "uid" instead of "id" is that the latter is a very overloaded term in Blender
already.

This patch changes all uses of "uuid" to "uid" where it's used in the context of a
"session uid". It's not always trivial to see whether a specific mention of "uuid" refers
to an actual uuid or something else. Therefore, I might have missed some renames.
I can't think of an automated way to differentiate the case.

BMesh also uses the term "uuid" sometimes in the wrong context (e.g. `UUIDFaceStepItem`)
but there it also does not mean "session uid", so it's *not* changed by this patch.

Pull Request: https://projects.blender.org/blender/blender/pulls/117350
2024-01-22 13:47:13 +01:00
Jacques Lucke 4b47b46f9c Cleanup: rename PIL to BLI
The term `PIL` stands for "platform independent library". It has existed since the
`Initial Revision` commit from 2002. Nowadays, we generally just use the `BLI` (blenlib)
prefix for such code and the `PIL` prefix feels more confusing than useful. Therefore,
this patch renames `PIL` to `BLI`.

Pull Request: https://projects.blender.org/blender/blender/pulls/117325
2024-01-19 14:32:28 +01:00
Aras Pranckevicius 709b00179f VSE: add Bicubic filtering option, and optimize bicubic performance
Part of overall "improve filtering situation" (#116980) task:

* Add Bicubic filtering option to strip Transform "Filter" setting.
  Previously this option only existed in Transform Effect "Interpolation"
  setting.
  - With this addition, it feels like the transform effect could
    possibly be marked as legacy/deprecated, since the regular Transform
    that is on all strips can do everything that Transform Effect did?
* Speed up bicubic filtering (used now in VSE, but also in CPU Compositor,
  image paint, etc.) by slightly simplifying the code and using some SIMD.
  Upscaling 96x54 image to 3840x2160 resolution, using Bicubic filtering:
  - Windows (VS2022, Ryzen 5950X): 35.5ms -> 15.1ms
  - Mac (clang 15, M1 Max): 29.6ms -> 24.4ms
* Add gtest coverage for bicubic functionality.

Pull Request: https://projects.blender.org/blender/blender/pulls/117100
2024-01-15 16:38:41 +01:00
Hans Goudey 9704d5e468 BLI: Add "numbers" math header, decouple C API
Adds a header that defines the same constants as the C++ 20
<numbers> header.

Benefits:
- Decouple our C++ and C math APIs
- Avoid using macros everywhere, nicer syntax
- Less header parsing during compilation
- Can be replaced by `std::numbers` with C++ 20

Downsides:
- There are fewer numbers defined in the C++ standard header
- Maybe we should just wait until we can use C++ 20
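
A header of this kind can be sketched with C++17 variable templates that follow the naming of `std::numbers`. The namespace `numbers_sketch` is a made-up name for this example, and only a few constants are shown.

```cpp
namespace numbers_sketch {

/* Constants as variable templates, so that every floating point type gets a
 * value rounded to its own precision. */
template<typename T> inline constexpr T pi_v = T(3.141592653589793238462643383279502884L);
template<typename T> inline constexpr T e_v = T(2.718281828459045235360287471352662498L);
template<typename T> inline constexpr T sqrt2_v = T(1.414213562373095048801688724209698079L);

/* Convenience aliases for `double`, like in the standard header. */
inline constexpr double pi = pi_v<double>;
inline constexpr double e = e_v<double>;
inline constexpr double sqrt2 = sqrt2_v<double>;

}  // namespace numbers_sketch
```

Because the names match the standard ones, switching to `std::numbers` later would mostly be a change of namespace.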

Pull Request: https://projects.blender.org/blender/blender/pulls/116805
2024-01-09 18:05:12 +01:00
Hans Goudey 5179e66756 Cleanup: Move array store files to C++ 2024-01-06 09:02:55 -05:00
Brecht Van Lommel 364beee159 Tests: add option to build one binary per GTest file
Bundling many tests in a single binary reduces build time and disk space
usage, but is less convenient for running individual tests from the command
line as filter flags need to be used.

This adds WITH_TESTS_SINGLE_BINARY to generate one executable file per
source file. Note that enabling this option requires a significant amount
of disk space.

Due to refactoring, the resulting ctest names are a bit different than
before. The number of tests is also a bit different depending on whether this
option is used, as one uses gtests discovery and the other is organized
purely by filename, which isn't always 1:1.

Co-authored-by: Sergey Sharybin <sergey@blender.org>
Pull Request: https://projects.blender.org/blender/blender/pulls/114604
2024-01-03 18:35:50 +01:00
Brecht Van Lommel f63accd3b6 Cleanup: move CMake test utility functions into testing.cmake
Combining functions from macros.cmake and Modules/GTestTesting.cmake.
It was unusual to have Blender specific code in the Modules folder.

Pull Request: https://projects.blender.org/blender/blender/pulls/116719
2024-01-03 14:49:11 +01:00
Brecht Van Lommel 4ce14a639f Revert "Cleanup: move CMake test utility functions into testing.cmake"
This breaks execution of some Windows tests.

This reverts commit 4190a61020.
2024-01-02 19:06:39 +01:00
Brecht Van Lommel 4190a61020 Cleanup: move CMake test utility functions into testing.cmake
Combining functions from macros.cmake and Modules/GTestTesting.cmake.
It was unusual to have Blender specific code in the Modules folder.
2024-01-02 15:34:52 +01:00
Hans Goudey 7d44065f73 Cleanup: Revert replacement of GSQueue with std::queue
There are some tragic design flaws with the Microsoft STL
implementation of `std::deque`. Unless we implement our
own similar data structure or use an implementation from
another library, the change isn't worth it.

This reverts commit b26cd6a4b9.
This reverts commit cc11ba33d9.
This reverts commit c929d75054.
This reverts commit bd3d5a750d.
2023-12-27 12:34:49 -05:00
Hans Goudey b26cd6a4b9 Cleanup: Remove unused GSQueue container
GSQueue dates back over 21 years, past the initial git commit. Nowadays
we generally prefer to use data structures from the C++ standard library
or our own C++ data structures. Previous commits replaced this container
with `std::queue` in a few areas. Now it is unused and can be removed.
2023-12-26 23:26:32 -05:00
Hans Goudey 451aa56d9c Cleanup: Move BLI_delaunay_2d.hh to C++ 2023-12-14 10:05:35 -05:00
Aras Pranckevicius 1e0bf33b00 ImBuf: optimize IMB_transform
IMB_transform is used by Sequencer (and other places) to do image
translation/rotation/scale on the CPU. This PR speeds up parts of it,
particularly when bilinear filtering is used. No behavior changes are
expected.

- Don't use virtual function calls inside inner loop. The code was using
  class hierarchies with virtual calls just to do equivalent of "outside
  of image? ignore" and "wrap UV coordinates or not?" decisions. Make those
  use non-virtual function based code.
- Simplify pixel sampling functions to only do the work as needed by
  anything within Blender codebase. For example, bilinear sampling of uchar
  images always uses 4 RGBA channels and never does "UV wrap" logic.
- Bilinear interpolation uchar: completely branchless SIMD code now.
- Bilinear interpolation float: 2x floor() calls instead of 4x floor() +
  2x ceil(), and final sample blending is done with SIMD.
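
The float case can be illustrated with a scalar sketch that calls `floor()` only twice, once per coordinate, and derives the second sample position by adding one. This is an assumption-laden simplification, not the code from this PR: it handles a single channel, uses no SIMD, clamps out-of-range coordinates to the image edge, and `sample_bilinear_sketch` is a hypothetical name.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

/* Bilinear sample of a single channel float image stored row by row.
 * Pixel (x, y) is treated as a sample at the integer coordinate (x, y). */
inline float sample_bilinear_sketch(
    const std::vector<float> &pixels, const int width, const int height, const float u, const float v)
{
  /* Only two floor() calls. The neighbors are derived by adding one. */
  const float u_floor = std::floor(u);
  const float v_floor = std::floor(v);
  const int x1 = static_cast<int>(u_floor);
  const int y1 = static_cast<int>(v_floor);
  const int x2 = x1 + 1;
  const int y2 = y1 + 1;

  /* Interpolation weights within the cell. */
  const float a = u - u_floor;
  const float b = v - v_floor;

  /* Assumption for this sketch: coordinates outside of the image are clamped. */
  const auto texel = [&](const int x, const int y) {
    const int cx = std::clamp(x, 0, width - 1);
    const int cy = std::clamp(y, 0, height - 1);
    return pixels[static_cast<size_t>(cy) * static_cast<size_t>(width) + static_cast<size_t>(cx)];
  };

  const float top = texel(x1, y1) * (1.0f - a) + texel(x2, y1) * a;
  const float bottom = texel(x1, y2) * (1.0f - a) + texel(x2, y2) * a;
  return top * (1.0f - b) + bottom * b;
}
```

The final blending of the four samples is the part that maps well to SIMD when all color channels are processed at once.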

Sequencer at 4K UHD resolution, with two image strips that need a transform,
playback framerate:

- Windows Ryzen 5950X: 18.7fps -> 26.2fps (IMB_transform time per frame goes
  26.3ms -> 11.2ms)
- Mac M1 Max: 27.3fps -> 31.4fps

At that point the IMB_transform is not the slowest part of where playback
takes time (but rather sequencer effect application etc.).

Note: the amount of _actual code_ got a bit smaller. But I've added 100 lines
of unit tests in BLI_math_interp_test.cc, the bilinear interpolation
functions were only tested very indirectly by CPU compositor template
image tests.

Pull Request: https://projects.blender.org/blender/blender/pulls/115653
2023-12-14 15:10:30 +01:00
Hans Goudey 991486c37f Cleanup: Remove/replace SmallHash data structure
Use blender::Set which is similar but offers better type safety
and likely better performance as well. The only remaining user
was the mesh edit mode knife tool, and replacing that usage
with `Set` and `Map` was straightforward.
2023-11-28 14:03:53 -05:00
Hans Goudey 573b4728cb Cleanup: Remove unused BLI voronoi code
Unused after 75c947a467
2023-11-28 13:24:45 -05:00
Wannes Malfait b162281caf Mesh: add index-independent test for mesh equality
This adds a new function, `compare_meshes`,
as a replacement for `BKE_mesh_cmp`.

The main benefits of the new version are the following:
- The code is written in C++, and makes use of the new attributes API.
- It adds an additional check, to see if the meshes only differ by
  their indices. This is useful to verify correctness of new algorithmic
  changes in mesh code, which might produce mesh elements in a different
  order than the original algorithm. The tests will still fail, but the
  error will show that the indices changed.

Some downsides:
- The code is more complex, due to having to be index-independent.
- The code is probably slower due to having to do comparisons "index-
  independently". I have not tested this, as correctness was my priority
  for this patch. A future update could look to improve the speed,
  if that is desired.
- This is technically a breaking API change, since it changes the
  returned values of `rna_Mesh_unit_test_compare`. I don't think that
  there are many people (if any) using this, besides our own unit tests.

All tests that pass with `BKE_mesh_cmp` still pass with the new version.

**NOTE:**
Currently, mesh edge indices are allowed to be different in the
comparison, because `BKE_mesh_cmp` also allowed this. There are some
tests which would fail otherwise. These tests should be updated, and
then the corresponding code as well.

I wrote up a more detailed explanation of the algorithm here:
https://hackmd.io/@bo-JY945TOmvepQ1tAWy6w/SyuaFtay6

Pull Request: https://projects.blender.org/blender/blender/pulls/112794
2023-11-27 16:10:43 +01:00
Aras Pranckevicius 8a75b54735 BLI: change timeit to use fmtlib instead of direct cout
Especially on Windows, direct output to `cout` via `<<` is very expensive.
Instead, use fmtlib to do all formatting into a no-alloc `fmt::memory_buffer`,
and output that with one call to `cout`.

timeit utilities are not used much by default, but during development or
profiling one often uncomments macros like `DEBUG_TIME` that then enable
`SCOPED_TIMER` or `SCOPED_TIMER_AVERAGED`.

Having one `SCOPED_TIMER_AVERAGED` inside sequencer `draw_channels`, with
empty timeline and all default channels; the overhead in % of `draw_channels`
duration of said scoped timer before and after this change:

- Windows: 29% -> 5%
- Mac: 5.0% -> 4.4%

Pull Request: https://projects.blender.org/blender/blender/pulls/115233
2023-11-21 18:35:42 +01:00
Jacques Lucke a976cf4876 Cleanup: reduce boilerplate for equality operators for structs
Pull Request: https://projects.blender.org/blender/blender/pulls/115088
2023-11-20 09:39:13 +01:00
Campbell Barton a615dcdfa8 Cleanup: add missing header to CMake, sort file lists 2023-11-10 09:40:05 +11:00
Aras Pranckevicius 03bbdd804c Cleanup: move math_geom.c to c++ 2023-11-06 20:51:13 +02:00
Ray Molenkamp 9c0bffcc89 Merge remote-tracking branch 'origin/blender-v4.0-release' 2023-10-31 18:51:26 -06:00
Ray Molenkamp f6c52849b5 Fix #112729: Update pinned blender shortcut
Windows allows people to pin an application to their taskbar. When a
user pins blender, the data we set in
`GHOST_WindowWin32::registerWindowAppUserModelProperties` is used,
which includes the path to the `blender-launcher.exe`. Once that
shortcut is created on the taskbar, it is never updated. If
people remove blender and install it again to a different path
(happens often when using nightly builds), this leads to the
situation where the shortcut on the taskbar points to a no longer
existing blender installation. Now you may think, just un-pin and
re-pin, that should clear that right up! It doesn't, it'll keep using
the outdated path till the end of time and there's no Windows API call
we can do to update this information. However, this shortcut is stored
in the user profile in a sub-folder we can easily query. From there, we
can iterate over all files, look for the one that has our appid in it, and
when we find it, update the path to the blender launcher to the
current installation. It's a bit of a hack, but Microsoft seemingly offers no
other way to deal with this problem.

Pull Request: https://projects.blender.org/blender/blender/pulls/113859
2023-11-01 01:44:51 +01:00
Brecht Van Lommel 39107b3133 Revert changes from main commits that were merged into blender-v4.0-release
The last good commit was 8474716abb.

After this, commits from main were pushed to blender-v4.0-release. These are
being reverted.

Commits from a4880576dc to b26f176d1a that happened afterwards were meant for
4.0, and their contents are preserved.
2023-10-30 21:40:35 +01:00
Sergey Sharybin 7afa5aaa59 BLI: Add std::string variant of BLI_uniquename_cb
Allows ensuring a unique name in cases where the name is a dynamically
sized string.
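
The general idea of a callback based unique name function on `std::string` can be sketched as follows. This is a simplified illustration under assumptions, not the signature or behavior of the function added here: `unique_name_sketch` is a hypothetical name, and unlike a real implementation it does not strip an existing numeric suffix from the base name.

```cpp
#include <functional>
#include <string>

/* Returns `base` if it is free, otherwise the first free name of the form
 * "base.001", "base.002", ... The callback decides whether a name is taken. */
inline std::string unique_name_sketch(const std::function<bool(const std::string &)> &is_taken,
                                      const std::string &base)
{
  if (!is_taken(base)) {
    return base;
  }
  for (int i = 1;; i++) {
    /* Pad the number with zeros to at least three digits. */
    std::string number = std::to_string(i);
    if (number.size() < 3) {
      number.insert(0, 3 - number.size(), '0');
    }
    const std::string candidate = base + "." + number;
    if (!is_taken(candidate)) {
      return candidate;
    }
  }
}
```

Since the result is a `std::string`, the caller does not have to provide a fixed size buffer for the name.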

Pull Request: https://projects.blender.org/blender/blender/pulls/114052
2023-10-24 11:35:52 +02:00
Campbell Barton 87a969e7fc Cleanup: sort files in CMake 2023-10-23 10:09:52 +11:00
Ray Molenkamp fece71fa0a Fix: Build error on windows
winstuff.c got converted to .cc recently, and a merge was done of
#113674 from 4.0 which still was using C style code.

Small update in code style was required here.
2023-10-20 10:44:34 -06:00
Clément Foucault b0e7a6db56 Merge branch 'blender-v4.0-release'
# Conflicts:
#	source/blender/gpu/opengl/gl_backend.cc
2023-10-20 17:23:53 +02:00
Anthony Roberts 4e69e49e7e Add check for Qualcomm devices on Windows
Some of these devices are not capable of running >=4.0, due to issues
with Mesa's Compute Shaders and their D3D drivers.

This PR marks those GPUs as unsupported, and prints info to stdout.

A driver update will be available for 8cx Gen3 on the 17th October
from here:
https://www.qualcomm.com/products/mobile/snapdragon/pcs-and-tablets/snapdragon-8-series-mobile-compute-platforms/snapdragon-8cx-gen-3-compute-platform#Software

It will take longer via the standard MS Windows Update channels,
as there is certification, testing, etc required, but it is possible
to get the drivers, at least.

This issue applies even when using emulated x64.

If this does not get merged, all WoA devices will break with 4.0,
where older ones will just launch a grey screen and crash, and newer
ones will open, but scenes will not render correctly in Workbench.

These devices work by using Mesa's D3D12 Gallium driver ("GLOn12"),
which is why we have to read the DirectX driver version - the version
reported by OpenGL is the mesa version, which is independent of the
driver (which is the part with the bug).

Pull Request: https://projects.blender.org/blender/blender/pulls/113674
2023-10-20 17:18:35 +02:00
Sergey Sharybin 21c8af467d Cleanup: Convert winfunc and utfconv to C++
Basically, the intern/utfconv directory, as well as users of
these headers.

Pull Request: https://projects.blender.org/blender/blender/pulls/113901
2023-10-20 10:27:31 +02:00
Sergey Sharybin 85c557ffa2 Cleanup: Rename BLI_string_utils.h to BLI_string_utils.hh
All users of it are now C++, which opens doors to add C++ to the
public API.
2023-10-20 10:27:26 +02:00
Sergey Sharybin 37838e4ed3 Cleanup: Convert fileops to C++
Use fileops_c.cc as a name since there is already fileops.cc.

Pretty much as-is conversion, so it is C code in a C++ file.
2023-10-20 10:27:26 +02:00
Sergey Sharybin 4d02a14e5a Cleanup: Convert BLI_filelist to C++
Pretty much as-is conversion, so it is C code in a C++ file.
2023-10-20 10:27:26 +02:00
Sergey Sharybin 5954fc45f8 Cleanup: Convert path_util to C++
Pretty much as-is conversion, so it is C code in a C++ file.
2023-10-20 10:27:26 +02:00