Guide to the Stokes Equations using Finite Elements in PETSc

This guide accompanies SNES Example 62.

The Stokes equations for a fluid, a steady-state form of the Navier-Stokes equations, start with the balance of momentum, just as in elastostatics,

\[\nabla \cdot \sigma + f = 0,\]

where \(\sigma\) is the stress tensor and \(f\) is the body force, combined with the conservation of mass

\[\nabla \cdot (\rho u) = 0,\]

where \(\rho\) is the density and \(u\) is the fluid velocity. If we assume that the density is constant, making the fluid incompressible, and that the rheology is Newtonian, meaning that the viscous stress is linearly proportional to the local strain rate, then we have

\[\begin{aligned} \nabla \cdot \mu \left( \nabla u + \nabla u^T \right) - \nabla p + f &= 0 \\ \nabla \cdot u &= 0 \end{aligned}\]

where \(p\) is the pressure, \(\mu\) is the dynamic shear viscosity, with units \(N\cdot s/m^2\) or \(Pa\cdot s\). If we divide by the constant density, we would have the kinematic viscosity \(\nu\) and a force per unit mass. The second equation demands that the velocity field be divergence-free, indicating that the flow is incompressible. The pressure in this case can be thought of as the Lagrange multiplier enforcing the incompressibility constraint. In the compressible case, we would need an equation of state to relate the pressure to the density, and perhaps temperature.

We will discretize our Stokes equations with finite elements, so the first step is to write a variational weak form of the equations. We choose to use a Ritz-Galerkin setup, so let our velocity \(u \in V\) and pressure \(p \in Q\), so that

\[\begin{aligned} \left< \nabla v, \mu \left( \nabla u + \nabla u^T \right) \right> + \left< v, \frac{\partial\sigma}{\partial n} \right>_\Gamma - \left< \nabla\cdot v, p \right> - \left< v, f \right> &= 0 & \text{for all} \ v \in V\\ \left< q, -\nabla \cdot u \right> &= 0 & \text{for all} \ q \in Q \end{aligned}\]

where integration by parts has added a boundary integral over the normal derivative of the stress (traction), and natural boundary conditions correspond to stress-free boundaries. We have multiplied the continuity equation by minus one in order to preserve symmetry.

Equation Definition

The test functions \(v, q\) and their derivatives are determined by the discretization, whereas the form of the integrand is determined by the physics. Given a quadrature rule to evaluate the form integral, we would only need the evaluation of the physics integrand at the quadrature points, given the values of the fields and their derivatives. The entire scheme is detailed in [KnepleyBrownRuppSmith13]. The kernels paired with test functions we will call \(f_0\) and those paired with gradients of test functions will be called \(f_1\).

For example, the kernel for the continuity equation, paired with the pressure test function, is called f0_p and can be seen here

static void f0_p(PetscInt dim, PetscInt Nf, PetscInt NfAux,
                 const PetscInt uOff[], const PetscInt uOff_x[], const PetscScalar u[], const PetscScalar u_t[], const PetscScalar u_x[],
                 const PetscInt aOff[], const PetscInt aOff_x[], const PetscScalar a[], const PetscScalar a_t[], const PetscScalar a_x[],
                 PetscReal t, const PetscReal x[], PetscInt numConstants, const PetscScalar constants[], PetscScalar f0[])
{
  PetscInt d;
  for (d = 0, f0[0] = 0.0; d < dim; ++d) f0[0] -= u_x[d*dim+d];
}

We use the components of the Jacobian of \(u\) to build up its divergence. For the balance of momentum excluding body force, we test against the gradient of the test function, as seen in f1_u,

static void f1_u(PetscInt dim, PetscInt Nf, PetscInt NfAux,
                 const PetscInt uOff[], const PetscInt uOff_x[], const PetscScalar u[], const PetscScalar u_t[], const PetscScalar u_x[],
                 const PetscInt aOff[], const PetscInt aOff_x[], const PetscScalar a[], const PetscScalar a_t[], const PetscScalar a_x[],
                 PetscReal t, const PetscReal x[], PetscInt numConstants, const PetscScalar constants[], PetscScalar f1[])
{
  const PetscReal mu = PetscRealPart(constants[0]);
  const PetscInt  Nc = uOff[1]-uOff[0];
  PetscInt        c, d;

  for (c = 0; c < Nc; ++c) {
    for (d = 0; d < dim; ++d) {
      f1[c*dim+d] = mu * (u_x[c*dim+d] + u_x[d*dim+c]);
    }

Notice how the pressure \(p\) is referred to using u[uOff[1]] so that we can have many fields with different numbers of components. DMPlex uses these point functions to construct the residual. A similar set of point functions is also used to build the Jacobian. The last piece of our physics specification is the construction of exact solutions using the Method of Manufactured Solutions (MMS).

MMS Solutions

An MMS solution is chosen to elucidate some property of the problem, and to check that it is being solved accurately, since the error can be calculated explicitly. For our Stokes problem, we first choose a solution with quadratic velocity and linear pressure,

\[u = \begin{pmatrix} x^2 + y^2 \\ 2 x^2 - 2 x y \end{pmatrix} \quad \mathrm{or} \quad \begin{pmatrix} 2 x^2 + y^2 + z^2 \\ 2 x^2 - 2xy \\ 2 x^2 - 2xz \end{pmatrix}\]
static PetscErrorCode quadratic_u(PetscInt dim, PetscReal time, const PetscReal x[], PetscInt Nc, PetscScalar *u, void *ctx)
{
  PetscInt c;

  u[0] = (dim-1)*PetscSqr(x[0]);
  for (c = 1; c < Nc; ++c) {
    u[0] += PetscSqr(x[c]);
    u[c]  = 2.0*PetscSqr(x[0]) - 2.0*x[0]*x[c];
  }
  return 0;
}
\[p = x + y - 1 \quad \mathrm{or} \quad x + y + z - \frac{3}{2}\]
static PetscErrorCode quadratic_p(PetscInt dim, PetscReal time, const PetscReal x[], PetscInt Nc, PetscScalar *u, void *ctx)
{
  PetscInt d;

  u[0] = -0.5*dim;
  for (d = 0; d < dim; ++d) u[0] += x[d];
  return 0;
}

By plugging these solutions into our equations, assuming that the velocity we choose is divergence-free, we can determine the body force necessary to make them satisfy the Stokes equations. For the quadratic solution above, we find

\[f = \begin{pmatrix} 1 - 4\mu \\ 1 - 4\mu \end{pmatrix} \quad \mathrm{or} \quad \begin{pmatrix} 1 - 8\mu \\ 1 - 4\mu \\ 1 - 4\mu \end{pmatrix}\]

which is implemented in our f0_quadratic_u pointwise function

static void f0_quadratic_u(PetscInt dim, PetscInt Nf, PetscInt NfAux,
                           const PetscInt uOff[], const PetscInt uOff_x[], const PetscScalar u[], const PetscScalar u_t[], const PetscScalar u_x[],
                           const PetscInt aOff[], const PetscInt aOff_x[], const PetscScalar a[], const PetscScalar a_t[], const PetscScalar a_x[],
                           PetscReal t, const PetscReal x[], PetscInt numConstants, const PetscScalar constants[], PetscScalar f0[])
{
  const PetscReal mu = PetscRealPart(constants[0]);
  PetscInt        d;

  f0[0] = (dim-1)*4.0*mu - 1.0;
  for (d = 1; d < dim; ++d) f0[d] = 4.0*mu - 1.0;
}

We let PETSc know about these solutions

  ierr = PetscDSSetExactSolution(ds, 0, exactFuncs[0], user);CHKERRQ(ierr);
  ierr = PetscDSSetExactSolution(ds, 1, exactFuncs[1], user);CHKERRQ(ierr);

These solutions will be captured exactly by the \(P_2-P_1\) finite element space. We can use the -dmsnes_check option to activate function space checks. It gives the \(L_2\) error, or discretization error, of the exact solution, the residual computed using the interpolation of the exact solution into our finite element space, and uses a Taylor test to check that our Jacobian matches the residual. It should converge at order 2, or be exact in the case of linear equations like Stokes. Our \(P_2-P_1\) runs in the PETSc test section at the bottom of the source file

    suffix: 2d_p2_p1_check
    requires: triangle
    args: -sol quadratic -vel_petscspace_degree 2 -pres_petscspace_degree 1 -dmsnes_check 0.0001
    suffix: 3d_p2_p1_check
    requires: ctetgen
    args: -sol quadratic -dm_plex_box_dim 3 -dm_plex_box_faces 2,2,2 -vel_petscspace_degree 2 -pres_petscspace_degree 1 -dmsnes_check 0.0001

verify these claims, as we can see from the output files

L_2 Error: [2.08577e-16, 3.51044e-17]
L_2 Residual: 3.30808e-15
Function appears to be linear
L_2 Error: [8.33588e-16, 9.09348e-17]
L_2 Residual: 2.40406e-15
Function appears to be linear

We can carry out the same tests for the \(Q_2-Q_1\) element,

    suffix: 2d_q2_q1_check
    args: -sol quadratic -dm_plex_box_simplex 0 -vel_petscspace_degree 2 -pres_petscspace_degree 1 -dmsnes_check 0.0001
    suffix: 3d_q2_q1_check
    args: -sol quadratic -dm_plex_box_simplex 0 -dm_plex_box_dim 3 -dm_plex_box_faces 2,2,2 -vel_petscspace_degree 2 -pres_petscspace_degree 1 -dmsnes_check 0.0001

The quadratic solution, however, cannot tell us whether our discretization is attaining the correct order of convergence, especially for higher order elements. Thus, we will define another solution based on trigonometric functions.

\[u = \begin{pmatrix} \sin(\pi x) + \sin(\pi y) \\ -\pi \cos(\pi x) y \end{pmatrix} \quad \mathrm{or} \quad \begin{pmatrix} 2 \sin(\pi x) + \sin(\pi y) + \sin(\pi z) \\ -\pi \cos(\pi x) y \\ -\pi \cos(\pi x) z \end{pmatrix}\]
static PetscErrorCode trig_u(PetscInt dim, PetscReal time, const PetscReal x[], PetscInt Nc, PetscScalar *u, void *ctx)
{
  PetscInt c;

  u[0] = (dim-1)*PetscSinReal(PETSC_PI*x[0]);
  for (c = 1; c < Nc; ++c) {
    u[0] += PetscSinReal(PETSC_PI*x[c]);
    u[c]  = -PETSC_PI*PetscCosReal(PETSC_PI*x[0]) * x[c];
  }
  return 0;
}
\[p = \sin(2 \pi x) + \sin(2 \pi y) \quad \mathrm{or} \quad \sin(2 \pi x) + \sin(2 \pi y) + \sin(2 \pi z)\]
static PetscErrorCode trig_p(PetscInt dim, PetscReal time, const PetscReal x[], PetscInt Nc, PetscScalar *u, void *ctx)
{
  PetscInt d;

  for (d = 0, u[0] = 0.0; d < dim; ++d) u[0] += PetscSinReal(2.0*PETSC_PI*x[d]);
  return 0;
}
\[f = \begin{pmatrix} 2 \pi \cos(2 \pi x) + \mu \pi^2 \sin(\pi x) + \mu \pi^2 \sin(\pi y) \\ 2 \pi \cos(2 \pi y) - \mu \pi^3 \cos(\pi x) y \end{pmatrix} \quad \mathrm{or} \quad \begin{pmatrix} 2 \pi \cos(2 \pi x) + 2\mu \pi^2 \sin(\pi x) + \mu \pi^2 \sin(\pi y) + \mu \pi^2 \sin(\pi z) \\ 2 \pi \cos(2 \pi y) - \mu \pi^3 cos(\pi x) y \\ 2 \pi \cos(2 \pi z) - \mu \pi^3 \cos(\pi x) z \end{pmatrix}\]
static void f0_quadratic_u(PetscInt dim, PetscInt Nf, PetscInt NfAux,
                           const PetscInt uOff[], const PetscInt uOff_x[], const PetscScalar u[], const PetscScalar u_t[], const PetscScalar u_x[],
                           const PetscInt aOff[], const PetscInt aOff_x[], const PetscScalar a[], const PetscScalar a_t[], const PetscScalar a_x[],
                           PetscReal t, const PetscReal x[], PetscInt numConstants, const PetscScalar constants[], PetscScalar f0[])
{
  const PetscReal mu = PetscRealPart(constants[0]);
  PetscInt        d;

  f0[0] = (dim-1)*4.0*mu - 1.0;
  for (d = 1; d < dim; ++d) f0[d] = 4.0*mu - 1.0;
}
}

We can now use -snes_convergence_estimate to determine the convergence exponent for the discretization. This options solves the problem on a series of refined meshes, calculates the error on each mesh, and determines the slope on a logarithmic scale. For example, we do this in two dimensions, refining our mesh twice using -convest_num_refine 2 in the following test.

    suffix: 2d_p2_p1_conv
    requires: triangle
    # Using -dm_refine 3 gives L_2 convergence rate: [3.0, 2.1]
    args: -sol trig -vel_petscspace_degree 2 -pres_petscspace_degree 1 -snes_convergence_estimate -convest_num_refine 2 -ksp_error_if_not_converged \
      -ksp_atol 1e-10 -ksp_error_if_not_converged -pc_use_amat \
      -pc_type fieldsplit -pc_fieldsplit_type schur -pc_fieldsplit_schur_fact_type full -pc_fieldsplit_schur_precondition a11 -pc_fieldsplit_off_diag_use_amat \
        -fieldsplit_velocity_pc_type lu -fieldsplit_pressure_ksp_rtol 1e-10 -fieldsplit_pressure_pc_type lu

However, the test needs an accurate linear solver. Sparse LU factorizations do not, in general, do full pivoting. Thus we must deal with the zero pressure block explicitly. We use the PCFIELDSPLIT preconditioner and the full Schur complement factorization, but we still need a preconditioner for the Schur complement \(B^T A^{-1} B\). We can have PETSc construct that matrix automatically, but the cost rises steeply as the problem size increases. Instead, we use the fact that the Schur complement is spectrally equivalent to the pressure mass matrix \(M_p\). We can make a preconditioning matrix, which has the diagonal blocks we will use to build the preconditioners, letting PETSc know that we get the off-diagonal blocks from the original system with -pc_fieldsplit_off_diag_use_amat and to build the Schur complement from the original matrix using -pc_use_amat,

  ierr = PetscDSSetJacobianPreconditioner(ds, 0, 0, NULL, NULL, NULL, g3_uu);CHKERRQ(ierr);
  ierr = PetscDSSetJacobianPreconditioner(ds, 1, 1, g0_pp, NULL, NULL, NULL);CHKERRQ(ierr);

Putting this all together, and using exact solvers on the subblocks, we have

    suffix: 2d_p2_p1_conv
    requires: triangle
    # Using -dm_refine 3 gives L_2 convergence rate: [3.0, 2.1]
    args: -sol trig -vel_petscspace_degree 2 -pres_petscspace_degree 1 -snes_convergence_estimate -convest_num_refine 2 -ksp_error_if_not_converged \
      -ksp_atol 1e-10 -ksp_error_if_not_converged -pc_use_amat \
      -pc_type fieldsplit -pc_fieldsplit_type schur -pc_fieldsplit_schur_fact_type full -pc_fieldsplit_schur_precondition a11 -pc_fieldsplit_off_diag_use_amat \
        -fieldsplit_velocity_pc_type lu -fieldsplit_pressure_ksp_rtol 1e-10 -fieldsplit_pressure_pc_type lu

and we see it converges, however it is superconverging in the pressure,

L_2 convergence rate: [2.6, 3.3]

If we refine the mesh using -dm_refine 3, the convergence rates become [3.0, 2.1].

Dealing with Parameters

Like most physical problems, the Stokes problem has a parameter, the dynamic shear viscosity, which determines what solution regime we are in. To handle these parameters in PETSc, we first define a C struct to hold them,

typedef struct {
  PetscScalar mu; /* dynamic shear viscosity */
} Parameter;

and then add a PetscBag object to our application context. We then setup the parameter object,

static PetscErrorCode SetupParameters(MPI_Comm comm, AppCtx *ctx)
{
  Parameter     *p;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  /* setup PETSc parameter bag */
  ierr = PetscBagCreate(PETSC_COMM_SELF, sizeof(Parameter), &ctx->bag);CHKERRQ(ierr);
  ierr = PetscBagGetData(ctx->bag, (void **) &p);CHKERRQ(ierr);
  ierr = PetscBagSetName(ctx->bag, "par", "Stokes Parameters");CHKERRQ(ierr);
  ierr = PetscBagRegisterScalar(ctx->bag, &p->mu, 1.0, "mu", "Dynamic Shear Viscosity, Pa s");CHKERRQ(ierr);
  ierr = PetscBagSetFromOptions(ctx->bag);CHKERRQ(ierr);
  {
    PetscViewer       viewer;
    PetscViewerFormat format;
    PetscBool         flg;

    ierr = PetscOptionsGetViewer(comm, NULL, NULL, "-param_view", &viewer, &format, &flg);CHKERRQ(ierr);
    if (flg) {
      ierr = PetscViewerPushFormat(viewer, format);CHKERRQ(ierr);
      ierr = PetscBagView(ctx->bag, viewer);CHKERRQ(ierr);
      ierr = PetscViewerFlush(viewer);CHKERRQ(ierr);
      ierr = PetscViewerPopFormat(viewer);CHKERRQ(ierr);
      ierr = PetscViewerDestroy(&viewer);CHKERRQ(ierr);
    }
  }
  PetscFunctionReturn(0);
}

which will allow us to set the value from the command line using -mu. The PetscBag can also be persisted to disk with PetscBagLoad/View(). We can make these values available as constant to our pointwise functions through the PetscDS object.

  {
    Parameter  *param;
    PetscScalar constants[1];

    ierr = PetscBagGetData(user->bag, (void **) &param);CHKERRQ(ierr);
    constants[0] = param->mu; /* dynamic shear viscosity, Pa s */
    ierr = PetscDSSetConstants(ds, 1, constants);CHKERRQ(ierr);
  }

Bibliography

KnepleyBrownRuppSmith13

M. G. Knepley, J. Brown, K. Rupp, and B. F. Smith. Achieving high performance with unified residual evaluation. ArXiv e-prints, September 2013. arXiv:1309.1204.