[SPH][Setup] allow speculative load balancing during injection#1797

Draft
tdavidcl wants to merge 9 commits into Shamrock-code:main from tdavidcl:speculative-setup

@tdavidcl
Member

@tdavidcl tdavidcl commented May 8, 2026

No description provided.

@tdavidcl tdavidcl added the draft label May 8, 2026
@github-actions
Contributor

github-actions Bot commented May 8, 2026

Thanks @tdavidcl for opening this PR!

You can do multiple things directly here:
1 - Comment pre-commit.ci run to run pre-commit checks.
2 - Comment pre-commit.ci autofix to apply fixes.
3 - Add label autofix.ci to fix authorship & pre-commit for every commit made.
4 - Add label light-ci to only trigger a reduced & faster version of the CI (need the full one before merge).
5 - Add label trigger-ci to create an empty commit to trigger the CI.

Once the workflow completes, a message will appear displaying information related to the run.

The PR is also reviewed automatically by Gemini. You can:
1 - Comment /gemini review to trigger a review
2 - Comment /gemini summary for a summary
3 - Tag it using @gemini-code-assist either in the PR or in review comments on files

Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request introduces a speculative load balancing mechanism and adds support for discontinuous HCP lattice generation in the SPH setup. The speculative balancing logic calculates particle counts per patch using a device kernel and MPI reduction to improve load distribution during initialization. Feedback identifies a performance bottleneck in the load-counting kernel, which currently uses a brute-force $O(N \times M)$ approach instead of leveraging existing tree structures for $O(\log M)$ lookups. Additionally, reviewers noted that load values should be recomputed more frequently to avoid stale data during particle injection and that speculative counts should be added to existing patch loads rather than replacing them.

Comment on lines +383 to +394
for (size_t j = 0; j < npatch; j++) {
    shammath::CoordRange<Tvec> patch_coord
        = {patch_aabb_min[j], patch_aabb_max[j]};
    if (patch_coord.contain_pos(pos)) {
        sycl::atomic_ref<
            u64,
            sycl::memory_order::relaxed,
            sycl::memory_scope::device>
            atomic_local_load_values(local_load_values[j]);
        atomic_local_load_values++;
    }
}

high

The speculative load counting kernel uses a brute-force approach with a nested loop over all global patches for every local particle, resulting in $O(N_{particles} \times N_{patches})$ complexity. This will become a significant performance bottleneck as the simulation scale increases. Since SerialPatchTree is already available and used elsewhere in this file (e.g., line 635) to perform $O(\log N_{patches})$ lookups on the device, it should be leveraged here as well to improve efficiency.

References
  1. Refactor duplicated logic into a helper function or lambda to improve readability and maintainability.
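
To make the complexity difference concrete, here is a minimal host-side sketch. It is not the SerialPatchTree API and not the PR's kernel: patches are reduced to hypothetical 1D half-open intervals (`Interval`), and a binary search over sorted intervals stands in for the tree lookup. The names `count_brute_force` and `count_sorted_lookup` are illustrative only, and the half-open boundary convention is an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical 1D stand-in for a patch: half-open interval [min, max).
struct Interval {
    double min;
    double max;
};

// Brute force: test every particle against every patch, O(N * M).
std::vector<std::uint64_t> count_brute_force(
    const std::vector<double> &positions, const std::vector<Interval> &patches) {
    std::vector<std::uint64_t> counts(patches.size(), 0);
    for (double x : positions) {
        for (std::size_t j = 0; j < patches.size(); j++) {
            if (x >= patches[j].min && x < patches[j].max) {
                counts[j]++;
            }
        }
    }
    return counts;
}

// Sorted lookup: patches must be sorted by min and must not overlap.
// Each particle costs one binary search, so O(N log M) overall.
std::vector<std::uint64_t> count_sorted_lookup(
    const std::vector<double> &positions, const std::vector<Interval> &patches) {
    assert(std::is_sorted(
        patches.begin(), patches.end(), [](const Interval &a, const Interval &b) {
            return a.min < b.min;
        }));
    std::vector<std::uint64_t> counts(patches.size(), 0);
    for (double x : positions) {
        // First patch whose min is strictly greater than x.
        auto it = std::upper_bound(
            patches.begin(), patches.end(), x, [](double v, const Interval &p) {
                return v < p.min;
            });
        if (it == patches.begin()) {
            continue; // x lies to the left of every patch
        }
        const std::size_t j = static_cast<std::size_t>(it - patches.begin()) - 1;
        if (x < patches[j].max) {
            counts[j]++;
        }
    }
    return counts;
}
```

Both functions return the same counts for disjoint patches; only the per-particle cost changes. In the real kernel the lookup would go through the existing tree on the device, and the increment would stay atomic.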

u64 npatch = scheduler().patch_list.global.size();

// check if the number of patches has changed, rebuild otherwise
if (npatch != speculative_last_npatch) {

high

The speculative load values are only recomputed when the number of patches (npatch) changes. However, during the injection process, particles are extracted from to_insert and moved between ranks. If the load values are not updated to reflect the current state of to_insert, the load balancer will make decisions based on stale distribution data, potentially leading to severe load imbalance. The counting logic should be executed whenever compute_load is called to ensure it reflects the current "pending" load.
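
The staleness problem can be reproduced in a small host-side sketch. Everything here is a hypothetical stand-in rather than the PR's code: `PendingList` tags each pending particle with its target patch index, `count_pending` plays the role of the counting kernel, and `CachedOnPatchCount` mimics a cache that is rebuilt only when the patch count changes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in: each pending particle carries its target patch index.
using PendingList = std::vector<std::size_t>;

// Count pending particles per patch (the role of the counting kernel).
std::vector<std::uint64_t> count_pending(const PendingList &pending, std::size_t npatch) {
    std::vector<std::uint64_t> counts(npatch, 0);
    for (std::size_t p : pending) {
        assert(p < npatch); // every pending particle must target a valid patch
        counts[p]++;
    }
    return counts;
}

// Cache that is rebuilt only when the number of patches changes.
struct CachedOnPatchCount {
    std::size_t last_npatch = 0;
    std::vector<std::uint64_t> counts;

    const std::vector<std::uint64_t> &get(const PendingList &pending, std::size_t npatch) {
        if (npatch != last_npatch) {
            counts = count_pending(pending, npatch);
            last_npatch = npatch;
        }
        // If only `pending` changed, the previous counts are returned unchanged.
        return counts;
    }
};
```

If the pending list shrinks from four particles to one while the patch count stays at three, `get` keeps returning the counts for the original four, whereas calling `count_pending` directly reflects the current list. Recounting on every load computation avoids this at the price of one extra pass over the pending particles.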

Comment on lines +423 to +425
scheduler().update_local_load_value([&](shamrock::patch::Patch p) {
    return speculative_load_values.get(p.id_patch);
});

medium

The speculative load balancing logic currently replaces the patch load value with the count from to_insert. This ignores any particles that have already been successfully injected into the patches. To accurately represent the total expected load for a patch, the speculative count (particles yet to be injected) should be added to the current count of particles already present in the patch.

            scheduler().update_local_load_value([&](shamrock::patch::Patch p) {
                u64 current_load = scheduler().patch_data.owned_data.get(p.id_patch).get_obj_cnt();
                return current_load + speculative_load_values.get(p.id_patch);
            });

@github-actions
Contributor

github-actions Bot commented May 8, 2026

Workflow report

workflow report corresponding to commit eb6c610
Committer email is timothee.davidcleris@proton.me
GitHub page artifact URL: GitHub page artifact link (can expire)

Pre-commit check report

Pre-commit check: ✅

trim trailing whitespace.................................................Passed
fix end of files.........................................................Passed
check for merge conflicts................................................Passed
check that executables have shebangs.....................................Passed
check that scripts with shebangs are executable..........................Passed
check for added large files..............................................Passed
check for case conflicts.................................................Passed
check for broken symlinks................................................Passed
check yaml...............................................................Passed
detect private key.......................................................Passed
No-tabs checker..........................................................Passed
Tabs remover.............................................................Passed
cmake-format.............................................................Passed
Validate GitHub Workflows................................................Passed
clang-format.............................................................Passed
ruff check...............................................................Passed
ruff format..............................................................Passed
Check doxygen headers....................................................Passed
Check license headers....................................................Passed
Check #pragma once.......................................................Passed
Check SYCL #include......................................................Passed
No ssh in git submodules remote..........................................Passed
No UTF-8 in files (except for authors)...................................Passed

Test pipeline can run.

Clang-tidy diff report


305 warnings generated.
Suppressed 306 warnings (305 in non-user code, 1 NOLINT).
Use -header-filter=.* to display errors from all non-system headers. Use -system-headers to display errors from system headers as well.

359 warnings generated.
Suppressed 360 warnings (359 in non-user code, 1 NOLINT).
Use -header-filter=.* to display errors from all non-system headers. Use -system-headers to display errors from system headers as well.

708 warnings generated.
Suppressed 709 warnings (616 in non-user code, 92 due to line filter, 1 NOLINT).
Use -header-filter=.* to display errors from all non-system headers. Use -system-headers to display errors from system headers as well.

856 warnings generated.
Suppressed 857 warnings (734 in non-user code, 122 due to line filter, 1 NOLINT).
Use -header-filter=.* to display errors from all non-system headers. Use -system-headers to display errors from system headers as well.

Doxygen diff with main

Removed warnings : 33
New warnings : 37
Warnings count : 8232 → 8236 (0.0%)

Detailed changes :
- src/shammodels/sph/include/shammodels/sph/modules/SPHSetup.hpp:100: warning: Member make_modifier_split_part(SetupNodePtr parent, u64 n_split, u64 seed, Tscal h_scaling) (function) of class shammodels::sph::modules::SPHSetup is not documented.
+ src/shammodels/sph/include/shammodels/sph/modules/SPHSetup.hpp:101: warning: Member make_modifier_split_part(SetupNodePtr parent, u64 n_split, u64 seed, Tscal h_scaling) (function) of class shammodels::sph::modules::SPHSetup is not documented.
- src/shammodels/sph/include/shammodels/sph/modules/SPHSetup.hpp:52: warning: Member apply_setup_new(SetupNodePtr setup, bool part_reordering, std::optional< u32 > gen_count_per_step=std::nullopt, std::optional< u32 > insert_count_per_step=std::nullopt, std::optional< u64 > max_msg_count_per_rank_per_step=std::nullopt, std::optional< u64 > max_data_count_per_rank_per_step=std::nullopt, std::optional< u64 > max_msg_size=std::nullopt, bool do_setup_log=false) (function) of class shammodels::sph::modules::SPHSetup is not documented.
+ src/shammodels/sph/include/shammodels/sph/modules/SPHSetup.hpp:52: warning: Member apply_setup_new(SetupNodePtr setup, bool part_reordering, std::optional< u32 > gen_count_per_step=std::nullopt, std::optional< u32 > insert_count_per_step=std::nullopt, std::optional< u64 > max_msg_count_per_rank_per_step=std::nullopt, std::optional< u64 > max_data_count_per_rank_per_step=std::nullopt, std::optional< u64 > max_msg_size=std::nullopt, bool do_setup_log=false, bool speculative_balancing=false) (function) of class shammodels::sph::modules::SPHSetup is not documented.
- src/shammodels/sph/include/shammodels/sph/modules/SPHSetup.hpp:62: warning: Member make_generator_lattice_hcp(Tscal dr, std::pair< Tvec, Tvec > box) (function) of class shammodels::sph::modules::SPHSetup is not documented.
+ src/shammodels/sph/include/shammodels/sph/modules/SPHSetup.hpp:63: warning: Member make_generator_lattice_hcp(Tscal dr, std::pair< Tvec, Tvec > box, bool discontinuous=true) (function) of class shammodels::sph::modules::SPHSetup is not documented.
- src/shammodels/sph/include/shammodels/sph/modules/SPHSetup.hpp:65: warning: Member make_generator_lattice_cubic(Tscal dr, std::pair< Tvec, Tvec > box) (function) of class shammodels::sph::modules::SPHSetup is not documented.
+ src/shammodels/sph/include/shammodels/sph/modules/SPHSetup.hpp:66: warning: Member make_generator_lattice_cubic(Tscal dr, std::pair< Tvec, Tvec > box) (function) of class shammodels::sph::modules::SPHSetup is not documented.
- src/shammodels/sph/include/shammodels/sph/modules/SPHSetup.hpp:68: warning: Member make_generator_disc_mc(Tscal part_mass, Tscal disc_mass, Tscal r_in, Tscal r_out, std::function< Tscal(Tscal)> sigma_profile, std::function< Tscal(Tscal)> H_profile, std::function< Tscal(Tscal)> rot_profile, std::function< Tscal(Tscal)> cs_profile, std::mt19937_64 eng, Tscal init_h_factor) (function) of class shammodels::sph::modules::SPHSetup is not documented.
+ src/shammodels/sph/include/shammodels/sph/modules/SPHSetup.hpp:69: warning: Member make_generator_disc_mc(Tscal part_mass, Tscal disc_mass, Tscal r_in, Tscal r_out, std::function< Tscal(Tscal)> sigma_profile, std::function< Tscal(Tscal)> H_profile, std::function< Tscal(Tscal)> rot_profile, std::function< Tscal(Tscal)> cs_profile, std::mt19937_64 eng, Tscal init_h_factor) (function) of class shammodels::sph::modules::SPHSetup is not documented.
- src/shammodels/sph/include/shammodels/sph/modules/SPHSetup.hpp:80: warning: Member make_generator_from_context(ShamrockCtx &context_other) (function) of class shammodels::sph::modules::SPHSetup is not documented.
+ src/shammodels/sph/include/shammodels/sph/modules/SPHSetup.hpp:81: warning: Member make_generator_from_context(ShamrockCtx &context_other) (function) of class shammodels::sph::modules::SPHSetup is not documented.
- src/shammodels/sph/include/shammodels/sph/modules/SPHSetup.hpp:82: warning: Member make_combiner_add(SetupNodePtr parent1, SetupNodePtr parent2) (function) of class shammodels::sph::modules::SPHSetup is not documented.
+ src/shammodels/sph/include/shammodels/sph/modules/SPHSetup.hpp:83: warning: Member make_combiner_add(SetupNodePtr parent1, SetupNodePtr parent2) (function) of class shammodels::sph::modules::SPHSetup is not documented.
- src/shammodels/sph/include/shammodels/sph/modules/SPHSetup.hpp:85: warning: Member make_modifier_warp_disc(SetupNodePtr parent, Tscal Rwarp, Tscal Hwarp, Tscal inclination, Tscal posangle) (function) of class shammodels::sph::modules::SPHSetup is not documented.
+ src/shammodels/sph/include/shammodels/sph/modules/SPHSetup.hpp:86: warning: Member make_modifier_warp_disc(SetupNodePtr parent, Tscal Rwarp, Tscal Hwarp, Tscal inclination, Tscal posangle) (function) of class shammodels::sph::modules::SPHSetup is not documented.
- src/shammodels/sph/include/shammodels/sph/modules/SPHSetup.hpp:88: warning: Member make_modifier_custom_warp(SetupNodePtr parent, std::function< Tscal(Tscal)> inc_profile, std::function< Tscal(Tscal)> psi_profile, std::function< Tvec(Tscal)> k_profile) (function) of class shammodels::sph::modules::SPHSetup is not documented.
+ src/shammodels/sph/include/shammodels/sph/modules/SPHSetup.hpp:89: warning: Member make_modifier_custom_warp(SetupNodePtr parent, std::function< Tscal(Tscal)> inc_profile, std::function< Tscal(Tscal)> psi_profile, std::function< Tvec(Tscal)> k_profile) (function) of class shammodels::sph::modules::SPHSetup is not documented.
- src/shammodels/sph/include/shammodels/sph/modules/SPHSetup.hpp:94: warning: Member make_modifier_add_offset(SetupNodePtr parent, Tvec offset_postion, Tvec offset_velocity) (function) of class shammodels::sph::modules::SPHSetup is not documented.
+ src/shammodels/sph/include/shammodels/sph/modules/SPHSetup.hpp:95: warning: Member make_modifier_add_offset(SetupNodePtr parent, Tvec offset_postion, Tvec offset_velocity) (function) of class shammodels::sph::modules::SPHSetup is not documented.
- src/shammodels/sph/include/shammodels/sph/modules/SPHSetup.hpp:97: warning: Member make_modifier_filter(SetupNodePtr parent, std::function< bool(Tvec)> filter) (function) of class shammodels::sph::modules::SPHSetup is not documented.
+ src/shammodels/sph/include/shammodels/sph/modules/SPHSetup.hpp:98: warning: Member make_modifier_filter(SetupNodePtr parent, std::function< bool(Tvec)> filter) (function) of class shammodels::sph::modules::SPHSetup is not documented.
- src/shammodels/sph/include/shammodels/sph/modules/setup/GeneratorLatticeHCP.hpp:29: warning: Compound shammodels::sph::modules::GeneratorLatticeHCP is not documented.
+ src/shammodels/sph/include/shammodels/sph/modules/setup/GeneratorLatticeHCP.hpp:29: warning: Compound shammodels::sph::modules::IteratorTypeGetter is not documented.
+ src/shammodels/sph/include/shammodels/sph/modules/setup/GeneratorLatticeHCP.hpp:30: warning: Member type (typedef) of struct shammodels::sph::modules::IteratorTypeGetter is not documented.
+ src/shammodels/sph/include/shammodels/sph/modules/setup/GeneratorLatticeHCP.hpp:34: warning: Compound shammodels::sph::modules::IteratorTypeGetter< Tvec, false > is not documented.
+ src/shammodels/sph/include/shammodels/sph/modules/setup/GeneratorLatticeHCP.hpp:35: warning: Member type (typedef) of struct shammodels::sph::modules::IteratorTypeGetter< Tvec, false > is not documented.
+ src/shammodels/sph/include/shammodels/sph/modules/setup/GeneratorLatticeHCP.hpp:39: warning: Compound shammodels::sph::modules::GeneratorLatticeHCP is not documented.
- src/shammodels/sph/include/shammodels/sph/modules/setup/GeneratorLatticeHCP.hpp:49: warning: Member GeneratorLatticeHCP(ShamrockCtx &context, Tscal dr, std::pair< Tvec, Tvec > box) (function) of class shammodels::sph::modules::GeneratorLatticeHCP is not documented.
+ src/shammodels/sph/include/shammodels/sph/modules/setup/GeneratorLatticeHCP.hpp:59: warning: Member GeneratorLatticeHCP(ShamrockCtx &context, Tscal dr, std::pair< Tvec, Tvec > box) (function) of class shammodels::sph::modules::GeneratorLatticeHCP is not documented.
- src/shammodels/sph/src/modules/SPHSetup.cpp:203: warning: Compound SetupLog is not documented.
- src/shammodels/sph/src/modules/SPHSetup.cpp:204: warning: Compound SetupLog::State is not documented.
- src/shammodels/sph/src/modules/SPHSetup.cpp:205: warning: Member count_per_rank (variable) of struct SetupLog::State is not documented.
- src/shammodels/sph/src/modules/SPHSetup.cpp:206: warning: Member msg_list (variable) of struct SetupLog::State is not documented.
- src/shammodels/sph/src/modules/SPHSetup.cpp:207: warning: Member state (variable) of struct SetupLog is not documented.
- src/shammodels/sph/src/modules/SPHSetup.cpp:209: warning: Member step_counter (variable) of struct SetupLog is not documented.
- src/shammodels/sph/src/modules/SPHSetup.cpp:211: warning: Member json_data (variable) of struct SetupLog is not documented.
+ src/shammodels/sph/src/modules/SPHSetup.cpp:212: warning: Compound SetupLog is not documented.
+ src/shammodels/sph/src/modules/SPHSetup.cpp:213: warning: Compound SetupLog::State is not documented.
- src/shammodels/sph/src/modules/SPHSetup.cpp:213: warning: Member log_state() (function) of struct SetupLog is not documented.
+ src/shammodels/sph/src/modules/SPHSetup.cpp:214: warning: Member count_per_rank (variable) of struct SetupLog::State is not documented.
+ src/shammodels/sph/src/modules/SPHSetup.cpp:215: warning: Member msg_list (variable) of struct SetupLog::State is not documented.
+ src/shammodels/sph/src/modules/SPHSetup.cpp:216: warning: Member state (variable) of struct SetupLog is not documented.
+ src/shammodels/sph/src/modules/SPHSetup.cpp:218: warning: Member step_counter (variable) of struct SetupLog is not documented.
+ src/shammodels/sph/src/modules/SPHSetup.cpp:220: warning: Member json_data (variable) of struct SetupLog is not documented.
- src/shammodels/sph/src/modules/SPHSetup.cpp:221: warning: Member dump_state() (function) of struct SetupLog is not documented.
+ src/shammodels/sph/src/modules/SPHSetup.cpp:222: warning: Member log_state() (function) of struct SetupLog is not documented.
+ src/shammodels/sph/src/modules/SPHSetup.cpp:230: warning: Member dump_state() (function) of struct SetupLog is not documented.
- src/shammodels/sph/src/modules/SPHSetup.cpp:234: warning: Member update_count_per_rank(u64 count) (function) of struct SetupLog is not documented.
+ src/shammodels/sph/src/modules/SPHSetup.cpp:243: warning: Member update_count_per_rank(u64 count) (function) of struct SetupLog is not documented.
- src/shammodels/sph/src/modules/SPHSetup.cpp:244: warning: Member update_msg_list(std::vector< std::tuple< u32, u32, u64 > > &msg_list) (function) of struct SetupLog is not documented.
- src/shammodels/sph/src/modules/SPHSetup.cpp:252: warning: Member golden_number (variable) of file SPHSetup.cpp is not documented.
+ src/shammodels/sph/src/modules/SPHSetup.cpp:253: warning: Member update_msg_list(std::vector< std::tuple< u32, u32, u64 > > &msg_list) (function) of struct SetupLog is not documented.
+ src/shammodels/sph/src/modules/SPHSetup.cpp:261: warning: Member golden_number (variable) of file SPHSetup.cpp is not documented.
- src/shammodels/sph/src/pySPHModel.cpp:1318: warning: Member add_analysisBarycenter_instance(py::module &m, const std::string &name_model) (function) of file pySPHModel.cpp is not documented.
+ src/shammodels/sph/src/pySPHModel.cpp:1325: warning: Member add_analysisBarycenter_instance(py::module &m, const std::string &name_model) (function) of file pySPHModel.cpp is not documented.
- src/shammodels/sph/src/pySPHModel.cpp:1336: warning: Member add_analysisEnergyKinetic_instance(py::module &m, const std::string &name_model) (function) of file pySPHModel.cpp is not documented.
+ src/shammodels/sph/src/pySPHModel.cpp:1343: warning: Member add_analysisEnergyKinetic_instance(py::module &m, const std::string &name_model) (function) of file pySPHModel.cpp is not documented.
- src/shammodels/sph/src/pySPHModel.cpp:1352: warning: Member add_analysisEnergyPotential_instance(py::module &m, const std::string &name_model) (function) of file pySPHModel.cpp is not documented.
+ src/shammodels/sph/src/pySPHModel.cpp:1359: warning: Member add_analysisEnergyPotential_instance(py::module &m, const std::string &name_model) (function) of file pySPHModel.cpp is not documented.
- src/shammodels/sph/src/pySPHModel.cpp:1368: warning: Member add_analysisTotalMomentum_instance(py::module &m, const std::string &name_model) (function) of file pySPHModel.cpp is not documented.
+ src/shammodels/sph/src/pySPHModel.cpp:1375: warning: Member add_analysisTotalMomentum_instance(py::module &m, const std::string &name_model) (function) of file pySPHModel.cpp is not documented.
- src/shammodels/sph/src/pySPHModel.cpp:1384: warning: Member add_analysisAngularMomentum_instance(py::module &m, const std::string &name_model) (function) of file pySPHModel.cpp is not documented.
+ src/shammodels/sph/src/pySPHModel.cpp:1391: warning: Member add_analysisAngularMomentum_instance(py::module &m, const std::string &name_model) (function) of file pySPHModel.cpp is not documented.
- src/shammodels/sph/src/pySPHModel.cpp:1401: warning: Member analysis_impl(shammodels::sph::Model< Tvec, SPHKernel > &model) -> Analysis (function) of file pySPHModel.cpp is not documented.
- src/shammodels/sph/src/pySPHModel.cpp:1406: warning: Member register_analysis_impl_for_each_kernel(py::module &msph, const char *name_class) (function) of file pySPHModel.cpp is not documented.
+ src/shammodels/sph/src/pySPHModel.cpp:1408: warning: Member analysis_impl(shammodels::sph::Model< Tvec, SPHKernel > &model) -> Analysis (function) of file pySPHModel.cpp is not documented.
+ src/shammodels/sph/src/pySPHModel.cpp:1413: warning: Member register_analysis_impl_for_each_kernel(py::module &msph, const char *name_class) (function) of file pySPHModel.cpp is not documented.
- src/shammodels/sph/src/pySPHModel.cpp:1466: warning: Member Register_pymod(pysphmodel) (function) of file pySPHModel.cpp is not documented.
+ src/shammodels/sph/src/pySPHModel.cpp:1473: warning: Member Register_pymod(pysphmodel) (function) of file pySPHModel.cpp is not documented.
