
Theory: pyOZ algorithm



This page describes the algorithm used in pyOZ to solve the Ornstein-Zernike equation. It covers only the “heart” of the program – input and output handling and similar details are skipped.

Before discussing the algorithm itself, a few points have to be clarified. We start from the multicomponent Ornstein-Zernike equation

\displaystyle h(\mathbf{r}_{ij}) = c(\mathbf{r}_{ij}) + \sum\limits_{n=1}^N \rho_n (c_{in} * h_{nj})(\mathbf{r}_{ij})

In Fourier space (using the convolution theorem)

\displaystyle \hat{h}(\mathbf{k}_{ij}) = \hat{c}(\mathbf{k}_{ij}) + \sum\limits_{n=1}^N \rho_n \hat{c}(\mathbf{k}_{in}) \hat{h}(\mathbf{k}_{nj})

All correlation functions are discretized on an evenly spaced grid in both real and Fourier space. This means that we actually have a set of P matrix equations (P being the number of discretization points), each of dimension (ncomponents x ncomponents). All matrices are square and symmetric, since pair additivity is assumed for the interaction potentials.

In order to absorb the densities, we multiply by \sqrt{\rho_i \rho_j} to get

\displaystyle \sqrt{\rho_i \rho_j} \hat{h}(\mathbf{k}_{ij}) = \sqrt{\rho_i \rho_j} \hat{c}(\mathbf{k}_{ij}) + \sum\limits_{n=1}^N \sqrt{\rho_i \rho_n} \hat{c}(\mathbf{k}_{in}) \sqrt{\rho_n \rho_j} \hat{h}(\mathbf{k}_{nj})

and recognize that the sum is the definition of matrix multiplication. We then define new functions that have the density prefactor included, e.g.,

\hat{H}(\mathbf{k}_{ij}) = \sqrt{\rho_i \rho_j} \hat{h}(\mathbf{k}_{ij})

and write everything in matrix form

\displaystyle \mathbf{\hat{H}} = \mathbf{\hat{C}} + \mathbf{\hat{C}} \mathbf{\hat{H}}

Solving this matrix equation for every discretization point and subtracting the direct correlation function in Fourier space yields the indirect correlation function \tilde{\Gamma}, also in Fourier space.

\displaystyle \tilde{\Gamma} = \mathbf{\hat{H}} - \mathbf{\hat{C}} = (\mathbf{E} - \mathbf{\hat{C}})^{-1} \mathbf{\hat{C}} - \mathbf{\hat{C}}

E in the previous equation is the identity matrix. Division by the \sqrt{\rho_i \rho_j} density factor yields \hat{\Gamma}. Its inverse Fourier transform provides the Γ function in real space, where it is used for further calculation, as described below.
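
To illustrate, the matrix solve above can be written compactly with NumPy, treating each of the P discretization points independently. The array layout and the name solve_oz_fourier are assumptions made for this sketch, not pyOZ internals.

import numpy as np

# Sketch: solve H = C + C H, i.e. H = (E - C)^(-1) C, independently at each
# of the P discretization points. C_hat holds the density-scaled direct
# correlation functions with shape (P, ncomp, ncomp); layout and names are
# illustrative assumptions, not pyOZ internals.
def solve_oz_fourier(C_hat):
    P, ncomp, _ = C_hat.shape
    E = np.eye(ncomp)
    H_hat = np.empty_like(C_hat)
    for p in range(P):
        H_hat[p] = np.linalg.solve(E - C_hat[p], C_hat[p])
    return H_hat - C_hat        # Gamma-tilde = H_hat - C_hat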

pyOZ algorithm outline

The whole algorithm consists of repeating several steps until convergence is reached. This is the simplest version of the calculation procedure, and pyOZ currently follows these steps exactly. This might change in the future; in that case, the change will be noted here and the exact procedure will be discussed thoroughly in the Theory section. (The handling of the density prefactors is meanwhile done transparently by the Fourier transforms themselves, which are normalized by these constants.) A minimal Python sketch of the cycle is given after the list.

  1. set the starting Gamma function \Gamma_{ij}^{\mathrm{in}} – use either zeros or the Gamma function from a previous calculation
  2. calculate the direct correlation function c in real space using the selected closure relation and Γ
  3. Fourier transform the direct correlation function to Fourier space

    we use the Fourier sine transformation; the operation can be described as follows (DFT and other normalization factors omitted)

    \displaystyle \hat{c}_{ij} = \Delta r \frac{1}{k} \mathrm{DST}(r \cdot c_{ij})

    where DST is the discrete sine transform, k is the reciprocal distance and Δr is the step size in real space. Then, multiply by \sqrt{\rho_i \rho_j} to get the final \hat{C}_{ij} in Fourier space

  4. solve the matrix problem in Fourier space and get \tilde{\Gamma}_{ij} – the Gamma function in Fourier space with the density prefactor applied
  5. divide \tilde{\Gamma}_{ij} by \sqrt{\rho_i \rho_j} to get \hat{\Gamma}_{ij} – the Gamma function in Fourier space without the density prefactor applied
  6. Fourier transform the Gamma function to real space using the inverse discrete sine transform (DFT and other normalization factors omitted)

    \displaystyle \Gamma_{ij}^{\mathrm{out}} = \Delta k \frac{1}{r} \mathrm{iDST}(k \cdot \hat{\Gamma}_{ij})

    where iDST is the inverse discrete sine transform, r is the distance, and Δk is the step size in Fourier space

  7. check the convergence (DSQN) of Γ

    \displaystyle \mathrm{DSQN} = \sqrt{\frac{1}{N\,\mathrm{ncomp}^2} \sum \left(\Gamma^{\mathrm{in}} - \Gamma^{\mathrm{out}}\right)^2}

    DSQN is normalized by the number of discretization points (N) and by the number of combinations of the ncomp interacting components – note that Γ has dimensions of (ncomp x ncomp x N) and the sum runs over all its elements.

  8. if not converged, get a new Γ – currently it is calculated by mixing the old values (input to step 2) and the new ones (from step 6) using a mixing parameter α – more sophisticated approaches exist

    \displaystyle \Gamma_{ij}^{\mathrm{in, new}} = (1 -\alpha) \Gamma_{ij}^{\mathrm{in}} + \alpha \Gamma_{ij}^{\mathrm{out}}

    return to step 2 and repeat the cycle

  9. if convergence is achieved, then finish the calculation
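
For concreteness, here is a minimal sketch of this cycle for a one-component fluid with the PY closure, using the type-I discrete sine transform from SciPy. The grid conventions, the 4π normalization factors (deliberately omitted in the formulas above) and all names are assumptions made for this sketch, not pyOZ's actual code.

import numpy as np
from scipy.fft import dst

# Minimal sketch of the Picard cycle for a one-component fluid with the PY
# closure; all names and the DST-I grid conventions are our own assumptions.
def picard_cycle(U, beta, rho, dr, alpha=0.4, tol=1e-9, max_iter=10000):
    N = len(U)
    r = dr * np.arange(1, N + 1)              # real-space grid r_i = i*dr
    dk = np.pi / (dr * (N + 1))               # DST-I reciprocal grid spacing
    k = dk * np.arange(1, N + 1)
    mayer = np.exp(-beta * U) - 1.0           # Mayer f-function
    gamma = np.zeros(N)                        # step 1: starting Gamma

    for _ in range(max_iter):
        c = mayer * (1.0 + gamma)                              # step 2: PY closure
        c_hat = 2.0 * np.pi * dr / k * dst(r * c, type=1)      # step 3: forward FT
        C = rho * c_hat                                        # density prefactor
        G = C * C / (1.0 - C)                                  # step 4: Gamma~ = C^2/(1-C)
        g_hat = G / rho                                        # step 5: remove prefactor
        gamma_out = dk / (4.0 * np.pi**2 * r) * dst(k * g_hat, type=1)  # step 6
        dsqn = np.sqrt(np.mean((gamma - gamma_out) ** 2))      # step 7: DSQN
        if dsqn < tol:
            return gamma_out                                   # step 9: converged
        gamma = (1.0 - alpha) * gamma + alpha * gamma_out      # step 8: mixing
    raise RuntimeError("Picard iteration did not converge")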

Newton-Raphson/conjugate gradients (NR/CG) modification of the algorithm (J. Comput. Phys. 1985, 61, 280-285; J. Chem. Phys. 1988, 88, 5143)

The Picard iteration (mixing) is generally not very stable and requires many iterations to reach convergence. We therefore look for another way to obtain a better estimate of the final function.

One such possibility is the so-called Newton-Raphson (NR) procedure. NR finds the root of a function f(x) by linearization, using the derivative at the trial point x_i

f(x_{i+1}) = f(x_i) + f'(x_i) (x_{i+1}-x_i)

and finding the new value x_{i+1} by setting f(x_{i+1}) = 0. For a linear function, this procedure leads directly to the solution; for a nonlinear function, it has to be iterated.

The situation is more complicated in the multidimensional case. There we have

F_i(x_0,x_1,\dots,x_{N-1}) = 0 \hspace{1cm} i=0,1,\dots,N-1

In matrix form

\mathbf{F}(\mathbf{x}) = 0

The NR method is then written as

\displaystyle F_i(\mathbf{x}+d\mathbf{x}) = F_i(\mathbf{x}) + \sum_{j=0}^{N-1} \frac{\partial F_i}{\partial x_j} dx_j + \mathcal{O}(d\mathbf{x}^2)

Let’s define the Jacobian matrix \mathbf{J}

\displaystyle J_{ij} = \frac{\partial F_i}{\partial x_j}

The whole method can then be written as

\displaystyle \mathbf{F}(\mathbf{x}+d\mathbf{x}) = \mathbf{F}(\mathbf{x}) + \mathbf{J}\,d\mathbf{x} + \mathcal{O}(d\mathbf{x}^2)

Neglecting the second-order terms and setting the function value at the new set of trial points to zero (as was done in the one-dimensional case) leads to

\mathbf{J}d\mathbf{x} = -\mathbf{F}(\mathbf{x})

The new, improved set of trial points can then be obtained via

\mathbf{x}_{\mathrm{new}} = \mathbf{x}_{\mathrm{old}} + d\mathbf{x} = \mathbf{x}_{\mathrm{old}} - \mathbf{J}^{-1} \mathbf{F}

This step is quite problematic, since the Jacobian matrix is usually very large and cannot be inverted efficiently. A way to solve the equation (albeit approximately) without inverting J – the iterative non-symmetric conjugate gradient method – is shown later; a small dense-Jacobian sketch follows below.
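
To make the NR step concrete, here is a minimal dense-Jacobian sketch (feasible only for small N). The toy system and all names are our own illustration, not pyOZ internals.

import numpy as np

# Minimal multidimensional Newton-Raphson sketch with a small dense Jacobian.
# The toy system below (intersection of the unit circle with the line
# x0 = x1) and all names are illustrative.
def newton(F, J, x, tol=1e-12, max_iter=50):
    for _ in range(max_iter):
        dx = np.linalg.solve(J(x), -F(x))   # solve J dx = -F(x) instead of inverting J
        x = x + dx
        if np.linalg.norm(dx) < tol:
            return x
    raise RuntimeError("Newton-Raphson did not converge")

F = lambda x: np.array([x[0]**2 + x[1]**2 - 1.0, x[0] - x[1]])
J = lambda x: np.array([[2.0 * x[0], 2.0 * x[1]], [1.0, -1.0]])
root = newton(F, J, np.array([1.0, 0.0]))   # converges to [0.7071..., 0.7071...]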

We now return to the Ornstein-Zernike equation and formulate it slightly differently:

(1+\hat{H})(1-\hat{C}) = 1
or
S(1-\hat{C}) = 1

Here, 1 is the identity matrix (ones on the diagonal) and S is the so-called partial structure factor matrix. Convergence of the normal cycle described above is achieved when the input and output Gamma functions are the same:

\Gamma^\mathrm{in} = \Gamma^\mathrm{out}

If we instead start the calculation with some other function, very similar to the exact one, we obtain

\Gamma^\mathrm{in} + d\Gamma^\mathrm{in} = \Gamma^\mathrm{out} + d\Gamma^\mathrm{out}

The last term is linear in d\Gamma^\mathrm{in}, to which it is coupled via a complicated linear operator A'. Rearrangement yields

\displaystyle d\Gamma^\mathrm{in} - d\Gamma^\mathrm{out} = \Gamma^\mathrm{out} - \Gamma^\mathrm{in}

\displaystyle (1-A')\, d\Gamma^\mathrm{in} = \Gamma^\mathrm{out} - \Gamma^\mathrm{in}

which is a system of linear equations of the form AX = B that can be solved by appropriate methods.

We now have to show what the operator A' looks like, which requires linearizing both the OZ equation and the closure. For HNC, the derivation leads to

c_{ij} \rightarrow dc_{ij} = \exp(-\beta U_{ij} + \Gamma_{ij}^\mathrm{in})\, d\Gamma_{ij}^\mathrm{in} - d\Gamma_{ij}^\mathrm{in} = h_{ij}\, d\Gamma_{ij}^\mathrm{in}

whereas for PY, we obtain

c_{ij} \rightarrow dc_{ij} = \exp(-\beta U_{ij})\, d\Gamma_{ij}^\mathrm{in} - d\Gamma_{ij}^\mathrm{in} = (\exp(-\beta U_{ij})-1)\, d\Gamma_{ij}^\mathrm{in} = M_{ij}\, d\Gamma_{ij}^\mathrm{in}

where h is the original total correlation function without the density prefactors applied and M is the respective Mayer function. In the following text, to ease the discussion of the two closures, the factor multiplying d\Gamma will be called \Xi.

The Fourier transformation is a linear operator; therefore (we assume that the forward FT takes care of the density prefactor – it is “normalized” by it, so the correlation functions in Fourier space are dimensionless):

\hat{C}_{ij} = \mathrm{FT}(c_{ij}) \rightarrow d\hat{C}_{ij} = \mathrm{FT}(dc_{ij}) = \mathrm{FT}(\Xi\, d\Gamma_{ij})

Differentiation of the OZ equation yields

\displaystyle S(1-\hat{C}) = 1 \rightarrow \frac{d}{d\hat{C}} \left[S(1-\hat{C})\right] = 0

Some straightforward manipulations follow:

\displaystyle \frac{d}{d\hat{C}} \left[S(1-\hat{C})\right] = \frac{dS}{d\hat{C}} (1-\hat{C}) + S\frac{d}{d\hat{C}} (1-\hat{C}) = \frac{dS}{d\hat{C}} (1-\hat{C}) - S = 0

\displaystyle \Rightarrow \frac{dS}{d\hat{C}} (1-\hat{C}) = S

Further rearrangement yields

dS\,(1-\hat{C}) = S\,d\hat{C} \rightarrow dS = S\,d\hat{C}\,(1-\hat{C})^{-1} = S\,d\hat{C}\,S

Please note that this expression involves matrix products, as opposed to all the steps performed so far, which always involve only functions for the same combination ij.

The new Gamma function is calculated easily, and so is its small change:

\hat{\Gamma}^\mathrm{out} = S - 1 - \hat{C} \rightarrow d\hat{\Gamma}^\mathrm{out} = dS - d\hat{C} = S\,d\hat{C}\,S - d\hat{C} = (S^2 - 1)\,d\hat{C}

The last expression is valid only when the correct form of multiplication is used (matrix product!).

The inverse Fourier transform is linear as well (we assume it takes care of removing the density prefactor):

\Gamma_{ij}^\mathrm{out} = \mathrm{FT}^{-1}(\hat{\Gamma}_{ij}^\mathrm{out}) \rightarrow d\Gamma_{ij}^\mathrm{out} = \mathrm{FT}^{-1}(d\hat{\Gamma}_{ij}^\mathrm{out})

Putting all these steps together, we can write the total form of the operator A’:

d\Gamma^\mathrm{out} = A'\, d\Gamma^\mathrm{in} = \mathrm{FT}^{-1}\left(S\, \mathrm{FT}(\Xi\, d\Gamma^\mathrm{in})\, S - \mathrm{FT}(\Xi\, d\Gamma^\mathrm{in})\right)

The total operator A is then defined as A = 1 - A'. A sketch of how A can be applied to a trial function follows.
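
The sketch below assumes hypothetical callables ft/ift for the density-normalized forward/inverse transforms, arrays of shape (P, ncomp, ncomp), and an element-wise Ξ factor; none of these names are pyOZ's actual API.

import numpy as np

# Sketch of applying A = 1 - A' to a trial dGamma. S, dgamma and Xi are
# assumed arrays of shape (P, ncomp, ncomp); ft/ift stand in for the
# density-normalized forward/inverse sine transforms acting element-wise.
def apply_A(dgamma, Xi, S, ft, ift):
    dC = ft(Xi * dgamma)                              # linearized closure, then FT
    SdCS = np.einsum('pij,pjk,pkl->pil', S, dC, S)    # matrix products per k-point
    return dgamma - ift(SdCS - dC)                    # A dGamma = dGamma - A' dGamma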

As noted above, inverting the Jacobian matrix needed to solve the problem AX = B with the Newton-Raphson method is computationally prohibitive. Therefore, we use an iterative method that allows quite fast determination of the solution – the so-called conjugate gradient method in its non-symmetric, iterative form. We repeat the following procedure until the residuals

R_N = B -AX_N

are zero. The following definitions are used in the process:

X_{N+1} = X_{N-1} + \omega_{N+1} ( \alpha_N A^+ R_N + X_N -X_{N-1})

\displaystyle \alpha_N = \frac{(R_N,R_N)}{(A^+R_N, A^+R_N)}

\displaystyle\omega_{N+1} = \left[ 1-\frac{\alpha_N}{\alpha_{N-1}\omega_N}\frac{(R_N,R_N)}{(R_{N-1},R_{N-1})} \right]^{-1}

where (X,Y) denotes the inner product

(X,Y) = \sum_{i,j} \int_0^\infty X_{ij}Y_{ij} 4 \pi r^2 dr

and A^+ is the adjoint of the operator A, such that

(AX,Y) = (X,A^+Y)

To find the adjoint of A, we note that it can be ‘separated’ into several substeps (as shown above)

AX = X -A_1 A_2 A_3 A_4 X

and that (a standard property of the operator adjoint):

A^+Y = Y -A_4^+ A_3^+ A_2^+ A_1^+ Y

We also assume that the Fourier transform is self-adjoint. This does not have to hold for approximate discrete transforms, but tests indicate that the assumption is correct for the FT employed in pyOZ. Note that the ability of this method to converge is very dependent on the quality of the adjoint. The adjoint of the operator A is then

A^+ d\Gamma^\mathrm{in} = d\Gamma^\mathrm{in} - \mathrm{FT}^{-1}\left(S\, \mathrm{FT}(d\Gamma^\mathrm{in})\, S - \mathrm{FT}(d\Gamma^\mathrm{in})\right) \Xi

where the products involving the partial structure factors are done matrix-wise and the other products element-wise.

For the first iteration, X = X_0 is set to a fraction of B (in pyOZ, the default fraction is zero). Also, X_0 = X_{-1} and \omega_1 = 1. A sketch of the recursion follows.
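
The sketch below assumes hypothetical callables apply_A and apply_A_adj for A and A^+, and a weight array w carrying the 4πr²Δr quadrature factors of the inner product; names and signatures are our own, not pyOZ's API.

import numpy as np

# Non-symmetric CG recursion for A X = B as defined above; illustrative only.
def inner(X, Y, w):
    return np.sum(X * Y * w)          # discretized (X,Y) inner product

def cg_nonsym(apply_A, apply_A_adj, B, w, frac=0.0, tol=1e-8, max_iter=200):
    X_prev = X = frac * B                      # X_0 = X_{-1} = fraction of B
    omega, alpha_prev, rr_prev = 1.0, None, None
    R = B - apply_A(X)                         # residual R_N = B - A X_N
    for _ in range(max_iter):
        rr = inner(R, R, w)
        if np.sqrt(rr) < tol:
            return X
        AR = apply_A_adj(R)
        alpha = rr / inner(AR, AR, w)          # alpha_N
        if alpha_prev is None:
            omega_new = 1.0                    # omega_1 = 1
        else:
            omega_new = 1.0 / (1.0 - alpha / (alpha_prev * omega) * rr / rr_prev)
        X_new = X_prev + omega_new * (alpha * AR + X - X_prev)
        X_prev, X = X, X_new
        omega, alpha_prev, rr_prev = omega_new, alpha, rr
        R = B - apply_A(X)
    raise RuntimeError("CG did not converge")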

NR/CG pyOZ algorithm outline

The modified algorithm is very similar to the original one. The calculation of the Gamma function is the same; the difference comes at the moment when the Gamma for the new iteration is calculated.

  1. set the starting Gamma function \Gamma_{ij}^{\mathrm{in}} – use either zeros or the Gamma function from a previous calculation
  2. calculate the direct correlation function c in real space using the selected closure relation and Γ
  3. Fourier transform the direct correlation function to Fourier space
  4. solve the matrix problem in Fourier space and get \tilde{\Gamma}_{ij} – the Gamma function in Fourier space with the density prefactor applied
  5. Fourier transform the Gamma function to real space using the inverse discrete sine transform
  6. check the convergence (DSQN) of Γ as defined above – comparing the old and new gamma
  7. if not converged, we use the Newton-Raphson/conjugate gradient procedure as described earlier. We stop when the residual R has converged (using DSQN(R,0)) to the desired accuracy, given as a fraction of the DSQN of the main (outer) cycle. In this way, we avoid unnecessary inner iterations at the beginning, when DSQN is still large (a sketch of this outer loop is given after the list)
  8. the new gamma is then calculated by
    \displaystyle \Gamma_{ij}^{\mathrm{in, new}} = \Gamma_{ij}^{\mathrm{in}} + X_{ij}
    then the algorithm returns to step 2 and another cycle is started
  9. if convergence is achieved, then finish the calculation
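
Putting the pieces together, the outer NR/CG loop could look like the following sketch, reusing the hypothetical cg_nonsym helper from above; compute_gamma_out stands in for steps 2-5 and make_apply_A for building the linearized operator and its adjoint at the current Γ. None of these names are pyOZ's actual API.

import numpy as np

# Sketch of the outer NR/CG loop; all names are illustrative assumptions.
def nrcg_outer(gamma, compute_gamma_out, make_apply_A, w, tol=1e-9, cg_frac=0.1):
    while True:
        gamma_out = compute_gamma_out(gamma)               # steps 2-5
        dsqn = np.sqrt(np.mean((gamma - gamma_out) ** 2))  # step 6: DSQN
        if dsqn < tol:
            return gamma_out                               # step 9: converged
        apply_A, apply_A_adj = make_apply_A(gamma)         # linearize at current Gamma
        B = gamma_out - gamma                              # right-hand side of A X = B
        X = cg_nonsym(apply_A, apply_A_adj, B, w,
                      tol=cg_frac * dsqn)                  # step 7: inner CG cycle
        gamma = gamma + X                                  # step 8: new Gamma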
