Quda work ndg force #612

Merged 23 commits on May 8, 2024
Commits
afbf25c
start ndcloverrat derivative quda
Marcogarofalo Dec 15, 2023
725c28f
remove reset of mom_quda
Marcogarofalo Dec 15, 2023
5dfe316
passing the spinors to quda
Marcogarofalo Dec 18, 2023
c138b01
Merge branch 'quda_work_clover_force' into quda_work_ndg_force
Marcogarofalo Dec 19, 2023
b72a1c9
Merge branch 'quda_work_clover_force' into quda_work_ndg_force
Marcogarofalo Mar 7, 2024
f04b49b
pass a doublet to quda
Marcogarofalo Mar 7, 2024
74d95a3
more printing
Marcogarofalo Mar 7, 2024
a9c7c8c
add TODO
Marcogarofalo Mar 7, 2024
b8923aa
allocate vector of coeffecients
Marcogarofalo Mar 19, 2024
e4c7760
correct compilation error
Marcogarofalo Mar 19, 2024
e492a33
setup coefficient for ndg force
Marcogarofalo Apr 3, 2024
298339c
trace log term
Marcogarofalo Apr 3, 2024
799bdb5
compute trlog when needed
Marcogarofalo Apr 3, 2024
971243e
remove printing for quda debug
Marcogarofalo Apr 3, 2024
1bb347c
use new parameter evmax
Marcogarofalo Apr 12, 2024
a5f6fef
free memory
Marcogarofalo Apr 15, 2024
161587f
cleaning up
Marcogarofalo Apr 18, 2024
d198f15
remove some warning
Marcogarofalo Apr 18, 2024
b415eb6
NDCLOVERRAT derivative on QUDA only if also the inversion is done on …
Marcogarofalo Apr 18, 2024
e29573f
do not compute sw_term when possible
Marcogarofalo Apr 23, 2024
f0556da
Merge remote-tracking branch 'origin/master' into quda_work_ndg_force
kostrzewa May 8, 2024
fffd452
update QUDA docs for NDG force
kostrzewa May 8, 2024
cc59ad5
remove unnecessary tm_stopwatch_push
kostrzewa May 8, 2024
1 change: 1 addition & 0 deletions compare_derivative.c
@@ -25,6 +25,7 @@
# include <omp.h>
#endif
#include <stdio.h>
#include <math.h>
#include "global.h"
#include "monomial/monomial.h"

12 changes: 5 additions & 7 deletions doc/quda.tex
@@ -12,11 +12,11 @@ \subsubsection{Design goals of the interface}
The QUDA interface has been designed with the following goals in mind, sorted by priority:
\begin{enumerate}
\item \emph{Safety.} Naturally, highest priority is given to the correctness of the output of the interface.
This is trivially achieved by always checking the final residual on the CPU with the default tmLQCD routines.
For pure inversions this is trivially achieved by always checking the final residual on the CPU with the default tmLQCD routines. When QUDA is used in the HMC, however, the residual is only checked for {\ttfamily DebugLevel > 2} or when {\ttfamily StrictResidualCheck} is enabled.
\item \emph{Ease of use.} Within the operator declarations of the input file (between {\ttfamily BeginOperator} and {\ttfamily EndOperator}) a simple flag {\ttfamily UseExternalInverter} is introduced which, when set to {\ttfamily quda}, will let QUDA perform the inversion of that operator. The operators {\ttfamily TMWILSON, WILSON, DBTMWILSON} and {\ttfamily CLOVER, DBCLOVER} are supported.
Within the monomial declarations of the input file (between {\ttfamily BeginMonomial} and {\ttfamily EndMonomial}) the same flag can be used to offload solves for the \texttt{DET, DETRATIO, CLOVERDET, CLOVERDETRATIO, RAT, RATCOR, NDRAT, NDRATCOR, NDCLOVERRAT} and \texttt{NDCLOVERRATCOR} monomials in the HMC.
Further, the flag {\ttfamily UseExternalLibrary} is introduced which, when set to {\ttfamily quda}, will let QUDA perform the force calculation for the given monomial with support currently limited to {\ttfamily GAUGE, CLOVERDET, CLOVERDETRATIO}.
\item \emph{Minimality.} Minimal changes in the form of {\ttfamily \#ifdef QUDA} precompiler directives to the tmLQCD code base. The main bulk of the interface lies in a single separate file {\ttfamily quda\_interface.c} (with corresponding header file). The QUDA interface is entered .
Further, the flag {\ttfamily UseExternalLibrary} is introduced which, when set to {\ttfamily quda}, will let QUDA perform the force calculation for the given monomial with support currently limited to {\ttfamily GAUGE, CLOVERDET, CLOVERDETRATIO} and {\ttfamily NDCLOVERRAT}.
\item \emph{Minimality.} Minimal changes in the form of {\ttfamily \#ifdef TM\_USE\_QUDA} precompiler directives to the tmLQCD code base. The main bulk of the interface lies in a single separate file {\ttfamily quda\_interface.c} (with corresponding header file). The QUDA interface is entered .
\item \emph{Performance.} The higher priority of the previous items results in small performance detriments. In particular:
\begin{itemize}
\item tmLQCD's $\theta$-boundary conditions are not compatible with QUDA's 8 and 12 parameter reconstruction of the gauge fields (as of QUDA-1.1.0). Therefore reconstruction/compression is deactivated by default, although it may be activated via the input file, see below.
@@ -83,9 +83,7 @@ \subsubsection{QUDA versions}
\end{verbatim}
so that the wrapper to the QUDA fermionic forces is not compiled.

Thus, if \texttt{--enable-quda\_fermionic\_forces=no} setting {\ttfamily UseExternalLibrary=yes} in the inputfile for the {\ttfamily CLOVERDET, CLOVERDETRATIO} monomials
is not supported and tmLQCD will stop with an error.

Thus, if \texttt{--enable-quda\_fermionic\_forces=no}, setting {\ttfamily UseExternalLibrary=yes} in the inputfile for the {\ttfamily CLOVERDET, CLOVERDETRATIO} and {\ttfamily NDCLOVERRAT} monomials is not supported and tmLQCD will stop with an error.

\subsubsection{Usage}
Any main program that reads and handles the operator declaration from an input file can easily be set up to use the QUDA inverter by setting the {\ttfamily UseExternalInverter} flag to {\ttfamily quda}. For example, in the input file for the {\ttfamily invert} executable, add the flag to the operator declaration as
@@ -131,7 +129,7 @@ \subsubsection{Usage}
\item \texttt{RefinementPrecision}: When the operator or monomial uses the multishift (\texttt{cgmms[nd]}) solver and offloads to QUDA, this parameter sets the inner solver precision of shift-by-shift refinement solves. In practice, one might set \texttt{UseSloppyPrecision = single} and \texttt{RefinementPrecision = half}. This will iterate the residuals in the multishift solver up to single precision and then refine each solution using a double-half mixed-precision CG.
\end{itemize}

In addition, for the gauge monomial, the parameter \texttt{UseExternalLibrary = quda} can be used to offload the gauge force to QUDA.
In addition, for the \texttt{GAUGE, CLOVERDET, CLOVERDETRATIO} and \texttt{NDCLOVERRAT} monomials, the parameter \texttt{UseExternalLibrary = quda} can be used to offload the force calculation to QUDA.

Finally, for the \texttt{GRADIENTFLOW} online measurement, the parameter \texttt{UseExternalLibrary = quda} will offload the gradient flow to QUDA.

58 changes: 31 additions & 27 deletions monomial/cloverdet_monomial.c
@@ -62,17 +62,6 @@ void cloverdet_derivative(const int id, hamiltonian_field_t * const hf) {
monomial * mnl = &monomial_list[id];
int N = VOLUME/2;
tm_stopwatch_push(&g_timers, __func__, mnl->name);
tm_stopwatch_push(&g_timers, "su3_zero", "");
#ifdef TM_USE_OMP
#pragma omp parallel for
#endif
for(int i = 0; i < VOLUME; i++) {
for(int mu = 0; mu < 4; mu++) {
_su3_zero(swm[i][mu]);
_su3_zero(swp[i][mu]);
}
}
tm_stopwatch_pop(&g_timers, 0, 1, "");

mnl->forcefactor = 1.;
/*********************************************************************
@@ -89,16 +78,29 @@ void cloverdet_derivative(const int id, hamiltonian_field_t * const hf) {
g_kappa = mnl->kappa;
boundary(mnl->kappa);

// we compute the clover term (1 + T_ee(oo)) for all sites x
sw_term( (const su3**) hf->gaugefield, mnl->kappa, mnl->c_sw);
// we invert it for the even sites only
if(!mnl->even_odd_flag) {
N = VOLUME;
}
else {
sw_invert(EE, mnl->mu);
if( g_debug_level > 2 || g_strict_residual_check || !(mnl->external_library == QUDA_LIB && mnl->solver_params.external_inverter == QUDA_INVERTER) ){
tm_stopwatch_push(&g_timers, "su3_zero", "");
#ifdef TM_USE_OMP
#pragma omp parallel for
#endif
for(int i = 0; i < VOLUME; i++) {
for(int mu = 0; mu < 4; mu++) {
_su3_zero(swm[i][mu]);
_su3_zero(swp[i][mu]);
}
}
tm_stopwatch_pop(&g_timers, 0, 1, "");

// we compute the clover term (1 + T_ee(oo)) for all sites x
sw_term( (const su3**) hf->gaugefield, mnl->kappa, mnl->c_sw);
// we invert it for the even sites only
if(!mnl->even_odd_flag) {
N = VOLUME;
}
else {
sw_invert(EE, mnl->mu);
}
}

// Invert Q_{+} Q_{-}
// X_o -> w_fields[1]
chrono_guess(mnl->w_fields[1], mnl->pf, mnl->csg_field, mnl->csg_index_array,
@@ -274,15 +276,17 @@ double cloverdet_acc(const int id, hamiltonian_field_t * const hf) {
g_kappa = mnl->kappa;
boundary(mnl->kappa);

sw_term( (const su3**) hf->gaugefield, mnl->kappa, mnl->c_sw);
if( g_debug_level > 2 || g_strict_residual_check || !(mnl->external_library == QUDA_LIB && mnl->solver_params.external_inverter == QUDA_INVERTER) ){

if(!mnl->even_odd_flag) {
N = VOLUME;
}
else {
sw_invert(EE, mnl->mu);
}
sw_term( (const su3**) hf->gaugefield, mnl->kappa, mnl->c_sw);

if(!mnl->even_odd_flag) {
N = VOLUME;
}
else {
sw_invert(EE, mnl->mu);
}
}
g_sloppy_precision_flag = 0;

if( mnl->solver == MG || mnl->solver == BICGSTAB ){
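
Both hunks above wrap the host-side clover setup (`sw_term` and `sw_invert`) in the same condition. As a minimal sketch of that predicate in isolation, the snippet below restates it as a standalone function. The enum types and the function name `needs_host_sw_term` are hypothetical stand-ins introduced here for illustration; in tmLQCD the condition reads `g_debug_level`, `g_strict_residual_check`, `mnl->external_library` and `mnl->solver_params.external_inverter` directly.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the tmLQCD constants of the same name;
 * the real definitions live in the tmLQCD headers. */
typedef enum { NO_EXT_LIB = 0, QUDA_LIB = 1 } ext_lib_t;
typedef enum { NO_EXT_INV = 0, QUDA_INVERTER = 1 } ext_inv_t;

/* Returns true when the clover term still has to be computed on the host:
 * either a CPU-side residual check may run (debug level above 2 or strict
 * residual checking), or the monomial is not offloaded to QUDA for both
 * the force calculation and the inversion. */
static bool needs_host_sw_term(int debug_level, bool strict_residual_check,
                               ext_lib_t lib, ext_inv_t inv) {
  const bool fully_offloaded = (lib == QUDA_LIB) && (inv == QUDA_INVERTER);
  return debug_level > 2 || strict_residual_check || !fully_offloaded;
}
```

With these assumptions, `needs_host_sw_term(0, false, QUDA_LIB, QUDA_INVERTER)` is false, so the host-side work is skipped only when both the force and the inversion run on QUDA and no CPU residual check is requested.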
6 changes: 6 additions & 0 deletions monomial/monomial.c
@@ -453,6 +453,12 @@ int init_monomials(const int V, const int even_odd_flag) {
monomial_list[i].name,
no_monomials);
}
if(monomial_list[i].external_library==QUDA_LIB){
if(monomial_list[i].solver_params.external_inverter != QUDA_INVERTER){
tm_debug_printf(0,0,"Error: NDCLOVERRAT monomial of UseExternalLibrary = quda is not supported without UseExternalInverter = quda\n");
exit(1);
}
}
}
else if(monomial_list[i].type == NDRATCOR) {
monomial_list[i].hbfunction = &ndratcor_heatbath;
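
The check added to `init_monomials` above rejects an NDCLOVERRAT monomial that offloads the force to QUDA without also offloading the inversion. The following is a minimal sketch of the same consistency rule as a standalone function. The struct `monomial_cfg_t`, the enum types and the function name `check_ndcloverrat_offload` are hypothetical and exist only for this illustration; the actual code operates on `monomial_list[i]` and calls `exit(1)` on failure, whereas this sketch returns an error code so the rule can be exercised in isolation.

```c
#include <stdio.h>

/* Hypothetical stand-ins for the tmLQCD constants of the same name. */
typedef enum { NO_EXT_LIB = 0, QUDA_LIB = 1 } ext_lib_t;
typedef enum { NO_EXT_INV = 0, QUDA_INVERTER = 1 } ext_inv_t;

/* Hypothetical, reduced view of the monomial fields the check reads. */
typedef struct {
  const char *name;
  ext_lib_t external_library;   /* set by UseExternalLibrary  */
  ext_inv_t external_inverter;  /* set by UseExternalInverter */
} monomial_cfg_t;

/* Returns 0 when the configuration is consistent and -1 when the force is
 * offloaded to QUDA but the inversion is not. */
static int check_ndcloverrat_offload(const monomial_cfg_t *m) {
  if (m->external_library == QUDA_LIB &&
      m->external_inverter != QUDA_INVERTER) {
    fprintf(stderr,
            "Error: monomial %s: UseExternalLibrary = quda requires "
            "UseExternalInverter = quda\n",
            m->name);
    return -1;
  }
  return 0;
}
```

Returning a status instead of exiting is a choice made for this sketch only; it lets a caller decide how to abort, while the PR keeps the simpler fail-fast behaviour at initialisation time.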