-
Notifications
You must be signed in to change notification settings - Fork 164
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Sparse algebra support with OOP API #760
base: master
Are you sure you want to change the base?
Conversation
Wouldn't it be useful if |
This could be done even shorter with a |
Agreed, |
Is it a good idea to have matvec product that do not initialize the result vector? I have to say that I have spent a good hour debugging a code and the source of divergence in the solver was the fact that I was not initializing the result vector before passing it to the subroutine. While I understand that this is a great way to perform Axpy operation, I would advise to have separate matvec (with initialization) and Axpy (without) routines.
|
Hi @ftucciarone, thanks for checking out the current draft and sorry for the inconvenience. Yes, I have intentionally designed the API to not initialize the vector, in the current version within FSPARSE https://github.com/jalvesz/FSPARSE?tab=readme-ov-file#sparse-matrix-vector-product I updated the README to be clear about it but forgot to update here. The reason for that choice is because it allows maximum flexibility for preprocessing linear systems needing operations with the same matrix operator that will be used for the solver. So it basically places the "responsability" on the user to place a This is of course totally open to debate, I just proposed following my experience reworking solvers. My feeling is that doing that can avoid having to manage different implementations or optionals, when the solution can be to explicitly state that the interface is updating (y = y + M * x) instead of overwriting (y = M * x). |
Hi @jalvesz, it was actually a pleasure to look into this draft, I am learning a lot and I look forward to discuss it with you if possible. I have to say that I have particularly strong feelings about the "non-initialization" problem, mostly twofold. However, I think that separating matvec and Axpy could be feasible at this stage (I can take care of that with the correct guidance) and potentially doable by preprocessing as well (I'm thinking at a fypp if). Cheers, Francesco Edit: example of splitting matvec and matAxpy with fypp #:include "../include/common.fypp"
#:set RANKS = range(1, 2+1)
#:set KINDS_TYPES = REAL_KINDS_TYPES
#:set OPERATIONS = ['matvec', 'matAxpy'] <--- DEFINE THE NAMES
#! define ranks without parentheses
#:def rksfx2(rank)
#{if rank > 0}#${":," + ":," * (rank - 1)}$#{endif}#
#:enddef
!! matvec_csr
#:for k1, t1 in (KINDS_TYPES)
#:for rank in RANKS
#:for idx, opname in enumerate(OPERATIONS) <--- ITERATE OVER THE NAMES
subroutine ${opname}$_csr_${rank}$d_${k1}$(vec_y,matrix,vec_x)
type(CSR_${k1}$), intent(in) :: matrix
${t1}$, intent(in) :: vec_x${ranksuffix(rank)}$
${t1}$, intent(inout) :: vec_y${ranksuffix(rank)}$
integer :: i, j
#:if rank == 1
${t1}$ :: aux
#:else
${t1}$ :: aux(size(vec_x,dim=1))
#:endif
associate( data => matrix%data, col => matrix%col, rowptr => matrix%rowptr, &
& nnz => matrix%nnz, nrows => matrix%nrows, ncols => matrix%ncols, sym => matrix%sym )
if( sym == k_NOSYMMETRY) then
do concurrent(i=1:nrows)
do j = rowptr(i), rowptr(i+1)-1
#:if idx==0 <--- IF MATVEC
vec_y(${rksfx2(rank-1)}$i) = data(j) * vec_x(${rksfx2(rank-1)}$col(j))
#:else <--- IF AXPY
vec_y(${rksfx2(rank-1)}$i) = vec_y(${rksfx2(rank-1)}$i) + data(j) * vec_x(${rksfx2(rank-1)}$col(j))
#:endif
end do
end do
else if( sym == k_SYMTRIINF )then
do i = 1 , nrows
aux = 0._${k1}$
do j = rowptr(i), rowptr(i+1)-2
aux = aux + data(j) * vec_x(${rksfx2(rank-1)}$col(j))
vec_y(${rksfx2(rank-1)}$col(j)) = vec_y(${rksfx2(rank-1)}$col(j)) + data(j) * vec_x(${rksfx2(rank-1)}$i)
end do
aux = aux + data(j) * vec_x(${rksfx2(rank-1)}$i)
#:if idx==0 <--- IF MATVEC
vec_y(${rksfx2(rank-1)}$i) = aux
#:else <--- IF AXPY
vec_y(${rksfx2(rank-1)}$i) = vec_y(${rksfx2(rank-1)}$i) + aux
#:endif
end do
else if( sym == k_SYMTRISUP )then
do i = 1 , nrows
aux = vec_x(${rksfx2(rank-1)}$i) * data(rowptr(i))
do j = rowptr(i)+1, rowptr(i+1)-1
aux = aux + data(j) * vec_x(${rksfx2(rank-1)}$col(j))
vec_y(${rksfx2(rank-1)}$col(j)) = vec_y(${rksfx2(rank-1)}$col(j)) + data(j) * vec_x(${rksfx2(rank-1)}$i)
end do
#:if idx==0 <--- IF MATVEC
vec_y(${rksfx2(rank-1)}$i) = aux
#:else <--- IF AXPY
vec_y(${rksfx2(rank-1)}$i) = vec_y(${rksfx2(rank-1)}$i) + aux
#:endif
end do
end if
end associate
end subroutine
#:endfor
#:endfor
#:endfor |
@ftucciarone thanks for the idea! I will also bring parts of a discussion I had with @ivan-pi on exactly this topic, where he brought to my attention the following:
https://psctoolkit.github.io/psblasguide/userhtmlse4.html#x18-550004 Taking all those ideas together, I could imagine using optionals to cover:
Or the default to be overwrite and an optional for addition, or simply with beta =1 Let me know your thoughts on that |
Thanks for the PR! I think it looks in the right direction. In order for me to start using these routines (for parallel sparse matmul, various conversions, etc.), I would need a version that does not use the stdlib's custom derived type, since I have my own derived type on my code. Is there a way to do that? The only way I know is to split the API into two parts:
Maybe there is another way, but the above way seems robust. Then I can start using the low level API, and just give it the CSR arrays that I have in my code from my own derived type. Why not structure this PR in this low level / high level style? It's almost there, but not quite. |
@certik Thank you for your comment. I agree with you that these features should be splitted into 2 parts (I have also my own sparse library, and I might be interested to call some of the stdlib features) :
However, @jalvesz started with the OO approach, and I think it is quite close to be merged as is, pending a few "minor" changes. Restructuring this PR might be a huge task. Therefore, I advice to review and merge this PR as is, maybe after implementing some suggestions where we think the API level could be already lowered a bit. I am afraid that if we go for the 2-API level approach in one go, we will have nothing in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here are some comments for the specs only (note: I did not look to the code yet; so some comments might just be dum).
doc/specs/stdlib_sparse.md
Outdated
|
||
`COO`, `intent(inout)`: Shall be any `COO` type. The same object will be returned with the arrays reallocated to the correct size after removing duplicates. | ||
|
||
`sort_data`, `logical(in)`, `optional`:: Shall be an optional `logical` argument to determine whether data in the COO graph should be sorted while sorting the index array, default `.false.`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is not mentioned in the previous syntax section.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
here are some additional quick comments.
Co-authored-by: Jeremie Vandenplas <[email protected]>
Co-authored-by: Jeremie Vandenplas <[email protected]>
Co-authored-by: Jeremie Vandenplas <[email protected]>
Co-authored-by: Jeremie Vandenplas <[email protected]>
Co-authored-by: Jeremie Vandenplas <[email protected]>
Co-authored-by: Jeremie Vandenplas <[email protected]>
Co-authored-by: Jeremie Vandenplas <[email protected]>
Co-authored-by: Jeremie Vandenplas <[email protected]>
@certik I understand and agree that a low level API is also desirable. Ideally we should eventually be able to link agains MKL following their interfaces which are quite similar but not exactly to Sparse BLAS syntax: https://www.intel.com/content/www/us/en/docs/onemkl/developer-reference-fortran/2024-0/sparse-blas-level-2-and-level-3-routines-002.html Now, this is a huge work. My feeling is that it is doable by changing the low-level back-ends without having to modify the high-level API (or with minimal breaking changes). I tried to bring this proposal to kickoff the interest and see if that could happen eventually and progressively once the DTs are in shape. Also, the idea of the DTs is to have sparse objects at the same level as a dense array such that higher-level interfaces can then be designed to work on either dense or sparse matrices. |
PR related to:
#38
#749
#189
Here I try to propose the OOP API upfront as a means to centralize data-containment format within stdlib, which I feel would be a (the) big added value. I'm migrating the proposal started here FSPARSE to adapt it to stdlib.
Current proposal design:
sparse_type
(abstract base type)COO_type
,CSR_type
,CSC_type
,ELL_type
,SELLC_type
(DTs with no data buffer)COO_?p_type
,CSR_?p_type
,CSC_?p_type
,ELL_?p_type
,SELLC_?p_type
(DTs with data buffer, where?
kind precision including complex values)