Skip to content

Comments

FCHL19 with periodic boundary conditions and revised OpenMP parallelization#6

Open
kvkarandashev wants to merge 9 commits intoqmlcode:mainfrom
kvkarandashev:main
Open

FCHL19 with periodic boundary conditions and revised OpenMP parallelization#6
kvkarandashev wants to merge 9 commits intoqmlcode:mainfrom
kvkarandashev:main

Conversation

@kvkarandashev
Copy link

Adds support for periodic boundary conditions when generating FCHL19 representations. Adds the verification script for checking PBC works correctly (temporarily placed into tests).

facsf.f90 revised to avoid using REDUCTION construct in FCHL19's OpenMP parallelization while minimizing the additional CPU and memory usage compared to a serial implementation. Admittedly, I have doubts about how necessary including "!$OMP CRITICAL" was, but I cannot see a better alternative.

Unfortunately, I had trouble getting qmllib to work with OpenMP for some reason, hence correctness of fascf.f90's parallelization was verified by using the same file in my fork of original qml.

@charnley charnley requested a review from andersx April 1, 2024 16:02
@charnley
Copy link
Member

charnley commented Apr 1, 2024

Unfortunately, I had trouble getting qmllib to work with OpenMP for some reason

can you expand on this? Did you have to remove the compiler flags or notice that it wasn't allocating all the resources?

@kvkarandashev
Copy link
Author

To compile with OpenMP I edited several lines in _compile.py as follows:

COMPILER_FLAGS = ["-O3", "-fopenmp", "-m64", "-march=native", "-fPIC",
                     "-Wno-maybe-uninitialized", "-Wno-unused-function", "-Wno-cpp"]
extra_flags = ["-lgomp", "-lpthread", "-lm", "-ldl"] + COMPILER_FLAGS

flags = ["-L/usr/lib/", "-lblas", "-llapack"] + extra_flags

After compiling the code I compared its performance on a test problem to performance of the qml fork with the same fascf.f90 file. qmllib never seemed to use more than one core, as supported by the timings:
OMP_NUM_THREADS=1:
qml: 0.7533 mins
qmllib: 0.8178 mins
OMP_NUM_THREADS=20:
qml: 0.1390 mins
qmllib: 0.8145 mins

@charnley charnley mentioned this pull request Apr 1, 2024
Verification script fixed and made more thorough.
@kvkarandashev
Copy link
Author

Found a mistake in the way gradients for PBC were calculated and verified.
Fixed the verification script and added verification via finite difference.

@charnley
Copy link
Member

charnley commented Nov 8, 2024

qmllib is finally updated to work with numpy2.0 and openmp flags are fixed.

  • Can you pull from main?
  • Use pathlib.Path instead of os and glob

@kvkarandashev
Copy link
Author

I successfully pulled from main and replaced references to glob with pathlib.Path; everything seems to work now.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you rename the test file to start with test_? It looks like the file was skipped during the test run.

@kvkarandashev
Copy link
Author

I renamed the script and also changed it in order its CPU time was more in line with the other test scripts. TBH I am still not happy about the test’s CPU time consumption, but I do not see a way to make it better without augmenting conftest.py with functions from its qml2 analogue.

andersx added a commit that referenced this pull request Feb 21, 2026
* Fix CMake build system and remove legacy build files

- Fix CMakeLists.txt pybind11 linking issues
  - Link qmllib_kernels to qmllib_common for pybind11 headers
  - Remove references to non-existent kf_common target
  - Fix Python file installation path from python/ to src/
  - Install entire Python package directory structure
- Fix module name mismatch in bindings.cpp (_qmllib_kernel -> _qmllib)
- Migrate from setuptools to scikit-build-core
- Simplify Makefile to use pip install
- Remove legacy build files (setup.py, _compile.py, MANIFEST.in, etc.)
- Update dependencies and build requirements in pyproject.toml

* Add pybind11 bindings for Fortran solver functions

- Add C ABI wrapper functions to fsolvers.f90
  - c_fcho_solve, c_fcho_invert for Cholesky decomposition
  - c_fbkf_solve, c_fbkf_invert for Bunch-Kaufman decomposition
- Create bindings_solvers.cpp with pybind11 bindings
  - Handles Fortran-ordered arrays correctly
  - Copies input arrays to preserve them
  - Properly handles symmetric matrix triangle copying
- Update CMakeLists.txt to build _solvers module
  - Add qmllib_solvers object library
  - Add _solvers pybind11 module
  - Link BLAS/LAPACK and OpenMP
- Update solvers/__init__.py to use new bindings
  - Import from qmllib._solvers
  - Simplified wrapper functions
  - Fallback to f2py for compatibility

All tests in test_solvers.py pass successfully.

* Refactor fsolvers.f90 to eliminate code duplication

- Convert original subroutines to use C-compatible types (c_double, c_int)
- Add bind(C) attribute directly to original functions
- Remove duplicate c_* wrapper functions (132 lines removed)
- Update C++ bindings to call functions by original names
- File size reduced from 486 to 354 lines
- All tests still pass

* Add pybind11 bindings for representations and utils modules

- Convert frepresentations.f90 to use C-compatible types with bind(C)
  - Converted all 6 representation functions (coulomb matrix variants, BOB)
  - Removed 42 lines of code duplication by using minimal wrapper approach
  - Functions now use explicit array sizes instead of assumed-shape arrays

- Create bindings_representations.cpp with pybind11 wrappers
  - Handles Fortran column-major arrays with proper strides
  - Creates output arrays with correct memory layout
  - Makes copies of parameters that Fortran modifies (cutoff values)

- Convert fsettings.f90 to use C-compatible types with bind(C)
  - check_openmp function now returns int instead of logical
  - Both functions use iso_c_binding

- Create bindings_utils.cpp for utility functions
  - Wraps check_openmp and get_threads functions

- Update CMakeLists.txt to build new modules
  - Added qmllib_representations and qmllib_utils object libraries
  - Added _representations and _utils pybind11 modules
  - Linked BLAS/LAPACK and OpenMP to all modules

- Update Python imports to use new pybind11 modules
  - representations.py now imports from qmllib._representations
  - utils/__init__.py now imports from qmllib._utils
  - Temporarily commented out facsf, fslatm, arad, and fchl imports

- Fix BOB function type issues
  - Convert nmax to numpy array with int32 dtype
  - Convert n to integer before passing to Fortran

All 10 representation tests now passing.

* Add input validation back to representation functions

The conversion to bind(C) removed the automatic size() validation that
was present in the original f2py code. Added back explicit validation
to check that natoms <= nmax in:
- fgenerate_coulomb_matrix
- fgenerate_unsorted_coulomb_matrix
- fgenerate_eigenvalue_coulomb_matrix

This ensures early error detection if the caller passes inconsistent
dimensions, which was the original intent of the removed code.

* Add comprehensive input validation to all representation functions

Restored all input validators that were removed during bind(C) conversion:

1. Dimension validation (natoms <= nmax) added to:
   - fgenerate_coulomb_matrix
   - fgenerate_unsorted_coulomb_matrix
   - fgenerate_eigenvalue_coulomb_matrix
   - fgenerate_local_coulomb_matrix
   - fgenerate_atomic_coulomb_matrix

2. Central atom indices validation (1 <= index <= natoms) added to:
   - fgenerate_local_coulomb_matrix
   - fgenerate_atomic_coulomb_matrix

3. Sanity check (natoms > 0) added to:
   - fgenerate_bob

All validators now work with the explicit parameter passing required
by bind(C), providing the same safety guarantees as the original f2py
code which used size() intrinsic functions.

Note: Existing validators for cutoff_count > nmax were preserved during
conversion and continue to function correctly.

* Remove example kernel files and build configuration

Removed example/test files that were not part of the actual qmllib implementation:
- src/qmllib/kernels/kernel.f90 (example Fortran kernels)
- src/qmllib/kernels/bindings.cpp (_qmllib module bindings)
- src/qmllib/kernels/bindings_kernels.cpp (_kernels module bindings)
- src/qmllib/kernels/kernels.cpp (example C++ kernels)

Updated CMakeLists.txt to remove all references to these files:
- Removed qmllib_fortran object library
- Removed qmllib_kernels object library
- Removed _qmllib Python extension module
- Removed _kernels Python extension module
- Removed associated OpenMP and BLAS/LAPACK linking
- Removed compiler optimization flags for removed targets
- Updated install targets to only include active modules

The actual qmllib kernel implementation (fkernels.f90, fgradient_kernels.f90,
etc.) remains untouched and will be converted to pybind11 in future commits.

All existing tests continue to pass.

* Restore compiler optimization flags for all modules

Added back the optimization flags that were accidentally removed when
cleaning up example kernel files. The same aggressive optimization
flags are now applied to all Fortran and C++ targets:

Fortran optimization (GNU):
  -O3 -fopenmp -mcpu=native -mtune=native -ffast-math -ftree-vectorize

Fortran optimization (Intel/IntelLLVM):
  -O3 -ipo -xHost -fp-model fast=2 -no-prec-div -fno-alias -qopenmp

C++ optimization (GNU/Clang):
  -O3 -march=native -ffast-math -fopenmp -mtune=native -ftree-vectorize

C++ optimization (Intel):
  -O3 -qopt-report=3 -qopenmp -xHost

Applied to:
- qmllib_solvers (Fortran object library)
- qmllib_representations (Fortran object library)
- qmllib_utils (Fortran object library)
- _solvers (C++ pybind11 module)
- _representations (C++ pybind11 module)
- _utils (C++ pybind11 module)

All tests continue to pass with optimizations enabled.

* Add pybind11 bindings for kernels

* Convert gradient kernels from f2py to pybind11 (#3)

* Convert SLATM to pybind11 and remove ARAD representation (#4)

* WIP: Migrate FCHL representations from f2py to pybind11 (#5)

* WIP: Add FCHL pybind11 infrastructure and Fortran wrappers

- Add FCHL modules to CMakeLists.txt with OpenMP and BLAS/LAPACK support
- Create ffchl_wrappers.f90 with bind(C) wrappers for all 7 scalar kernel functions
- Create bindings_ffchl.cpp with pybind11 bindings (partial, first function only)
- Expose ffchl_kernel_types constants (GAUSSIAN, LINEAR, etc.)

All 7 scalar kernel Fortran wrappers are complete:
1. fget_kernels_fchl_wrapper
2. fget_symmetric_kernels_fchl_wrapper
3. fget_global_symmetric_kernels_fchl_wrapper
4. fget_global_kernels_fchl_wrapper
5. fget_atomic_kernels_fchl_wrapper
6. fget_atomic_symmetric_kernels_fchl_wrapper
7. fget_atomic_local_kernels_fchl_wrapper

Next: Complete C++ pybind11 bindings for remaining 6 functions

* WIP: Add complete FCHL pybind11 bindings for scalar kernels

- Implement all 7 scalar kernel C++ wrappers with pybind11
- Add both uppercase and lowercase kernel type constants
- Use Fortran-style array handling (f_style | forcecast)
- Fix nsigmas parameter type mismatch in atomic_local wrapper
- Temporarily disable force and electric field kernel imports

Status:
- ✅ All Fortran wrappers created (ffchl_wrappers.f90)
- ✅ All 7 C++ bindings implemented
- ✅ CMake configuration updated
- ✅ Code compiles successfully
- ✅ Module imports without errors
- ❌ Segfault when calling functions (needs debugging)

Next steps:
- Debug segfault in function calls
- Verify array dimensions and strides
- Compare with working SLATM implementation
- Add force and electric field kernel bindings

* Migrate fget_kernels_fchl from f2py to pybind11

- Modified fget_kernels_fchl in ffchl_scalar_kernels.f90 to add C bindings
  - Added bind(C) with explicit-shape arrays and dimension parameters
  - Converted logical parameters to integer for C compatibility
  - Used iso_c_binding types throughout
- Created bindings_fchl_simple.cpp with pybind11 wrapper
  - Extracts dimensions from numpy arrays
  - Forces Fortran-style memory layout
  - Properly passes dimension parameters first
- Updated CMakeLists.txt to build with new bindings
- Added stubs for remaining 6 FCHL functions not yet migrated
- Tests pass: KRR with 100 molecules gives MAE=1.20 kcal/mol

* Migrate fget_symmetric_kernels_fchl from f2py to pybind11

- Modified fget_symmetric_kernels_fchl in ffchl_scalar_kernels.f90:
  - Added bind(C) with explicit-shape arrays and dimension parameters
  - Converted logical parameters to integer for C compatibility
  - Added proper variable declarations for internal counters
  - Used iso_c_binding types throughout
- Added C++ binding in bindings_fchl_simple.cpp
- Registered function with pybind11 module
- Removed stub in fchl_scalar_kernels.py
- Test test_krr_fchl_local now passes (0.62s)

* Migrate global FCHL kernels to pybind11 (Fortran only)

- Modified fget_global_symmetric_kernels_fchl:
  - Added bind(C) with explicit-shape arrays
  - Changed logical to integer for C compatibility
  - Added logical conversion variables
- Modified fget_global_kernels_fchl:
  - Added bind(C) with explicit-shape arrays
  - Changed logical to integer for C compatibility
  - Added logical conversion variables
- Updated all internal function calls to use _logical versions
- C++ bindings and Python wrappers to be added next

* Complete migration of global FCHL kernels to pybind11

- Added C++ wrapper functions for both global kernels:
  - fget_global_symmetric_kernels_fchl_py
  - fget_global_kernels_fchl_py
- Registered both functions with pybind11 module
- Removed Python stubs for global kernel functions
- Updated imports in fchl_scalar_kernels.py
- Test test_krr_fchl_global now passes (0.78s)

Both global kernel functions are now fully working!

* Migrate atomic FCHL kernels and add all kernel type constants

- Migrated fget_atomic_kernels_fchl to pybind11
  - Modified Fortran signature with bind(C) and dimension parameters first
  - Converted logical to integer(c_int) for C compatibility
  - Added C++ wrapper and registered with pybind11
  - Updated Python wrapper to remove na1/na2 parameters

- Migrated fget_atomic_symmetric_kernels_fchl to pybind11
  - Applied same bind(C) pattern as atomic kernels
  - Created C++ wrapper function
  - Registered with pybind11 module
  - Updated Python imports and removed stub

- Added all kernel type constants to pybind11 module
  - POLYNOMIAL, SIGMOID, MULTIQUADRATIC, INV_MULTIQUADRATIC
  - BESSEL, L2, MATERN, CAUCHY, POLYNOMIAL2
  - Added lowercase aliases for consistency
  - Fixes AttributeError in kernel function tests

All 15 tests in test_fchl_scalar.py now pass!

* Complete FCHL migration from f2py to pybind11 (#6)

* Feature/fix gradient kernels (#7)

* Fix/fchl openmp race conditions (#8)

* Fix local kernel functions to handle arbitrary representation sizes (#9)

* Fix test_gdml_derivative and add diagnostics to remaining xfail tests (#10)

* Convert README from RST to Markdown

- Converted README.rst to README.md with proper Markdown formatting
- Maintained all content and structure
- Marked 'Convert readme to markdown' as completed in todo list
- Improved readability with Markdown syntax for code blocks, headers, and lists

* Remove README.rst in favor of README.md (#11)

* Add strict ruff and mypy linting (#12)

* Migrate from mypy to ty for type checking (#13)

* Migrate fqrlq_solve, fcond, and fcond_ge from f2py to pybind11 (#14)

* Separate integration tests from unit tests for faster CI (#15)

* Separate integration tests from unit tests

* Optimize FCHL kernel tests with symmetry

* WAdded ci build for macos (#17)

* Added ci+build for macos

* Feature/release workflow (#18)

* Add release workflow for PyPI wheel publishing

* Last mile 1 (#19)

* Jimmyfy in progress ...
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants