1 - BGVToOpenfhe

-bgv-to-openfhe

Lower bgv to openfhe dialect.

This pass lowers the bgv dialect to the openfhe dialect.

2 - BGVToPolynomial

-bgv-to-polynomial

Lower bgv to polynomial dialect.

This pass lowers the bgv dialect to polynomial dialect.

3 - CGGIPasses

-cggi-set-default-parameters

Set default parameters for CGGI ops

This pass adds default parameters to all CGGI ops as cggi_params named attributes, overriding any existing attribute set with that name.

This pass is primarily for testing purposes, and as a parameter provider before a proper parameter selection mechanism is added. This pass should not be used in production.

The specific parameters are hard-coded in lib/Dialect/CGGI/Transforms/SetDefaultParameters.cpp.

-cggi-straight-line-vectorizer

A straight-line vectorizer for CGGI bootstrapping ops.

This pass vectorizes CGGI ops. It ignores control flow and only vectorizes straight-line programs within a given region.

4 - CGGIToTfheRust

-cggi-to-tfhe-rust

Lower cggi to tfhe_rust dialect.

5 - CombToCGGI

-comb-to-cggi

Lower comb to cggi dialect.

This pass lowers the comb dialect to cggi dialect.

6 - ForwardStoreToLoadPasses

-forward-store-to-load

Forward stores to loads within a single block

This pass is a simplified version of mem2reg and similar passes. It analyzes an operation, finding all basic blocks within that op that have memrefs whose stores can be forwarded to loads.

Does not support complex control flow within a block, nor ops with arbitrary subregions.
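The idea of forwarding within a single block can be sketched in Python. This is only an illustrative model, not the pass's implementation: the tuple encoding of ops and the name forward_stores_to_loads are assumptions made for this example.

```python
def forward_stores_to_loads(block):
    """Forward stores to later loads in one straight-line block.

    Hypothetical op encoding (not the pass's data structures):
      ("store", value, memref, index)
      ("load", result, memref, index)
      ("use", value)
    """
    last_store = {}  # (memref, index) -> most recently stored value
    rename = {}      # removed load result -> forwarded value
    out = []
    for op in block:
        if op[0] == "store":
            _, value, memref, index = op
            value = rename.get(value, value)
            last_store[(memref, index)] = value
            out.append(("store", value, memref, index))
        elif op[0] == "load":
            _, result, memref, index = op
            if (memref, index) in last_store:
                # The load is redundant: reuse the stored value instead.
                rename[result] = last_store[(memref, index)]
            else:
                out.append(op)
        else:
            out.append(("use", rename.get(op[1], op[1])))
    return out


block = [
    ("store", "%a", "%m", 0),
    ("store", "%b", "%m", 1),
    ("load", "%x", "%m", 0),   # forwarded from the store of %a
    ("use", "%x"),
    ("load", "%y", "%m", 2),   # no earlier store at index 2, so it stays
    ("use", "%y"),
]
print(forward_stores_to_loads(block))
```

In this model the load of %x disappears and its use refers to %a directly, while the load of %y is kept because no store in the block writes index 2.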

7 - LWEPasses

-lwe-set-default-parameters

Set default parameters for LWE ops

This pass adds default parameters to all lwe types as the lwe_params attribute, and for lwe ops as the params attribute, overriding any existing attributes set with those names.

This pass is primarily for testing purposes, and as a parameter provider before a proper parameter selection mechanism is added. This pass should not be used in production.

The specific parameters are hard-coded in lib/Dialect/LWE/Transforms/SetDefaultParameters.cpp.

8 - MemrefToArith

-expand-copy

Expands memref.copy ops to explicit affine loads and stores

This pass removes memref copy operations by expanding them to affine loads and stores. This pass introduces affine loops over the dimensions of the MemRef, so must be run prior to any affine loop unrolling in a pipeline.

Input

module {
  func.func @memref_copy() {
    %alloc = memref.alloc() : memref<2x3xi32>
    %alloc_0 = memref.alloc() : memref<2x3xi32>
    memref.copy %alloc, %alloc_0 : memref<2x3xi32> to memref<2x3xi32>
  }
}

Output

module {
  func.func @memref_copy() {
    %alloc = memref.alloc() : memref<2x3xi32>
    %alloc_0 = memref.alloc() : memref<2x3xi32>
    affine.for %arg0 = 0 to 2 {
      affine.for %arg1 = 0 to 3 {
        %1 = affine.load %alloc[%arg0, %arg1] : memref<2x3xi32>
        affine.store %1, %alloc_0[%arg0, %arg1] : memref<2x3xi32>
      }
    }
  }
}

When --disable-affine-loop=true is set, the output becomes

module {
  func.func @memref_copy() {
    %alloc = memref.alloc() : memref<2x3xi32>
    %alloc_0 = memref.alloc() : memref<2x3xi32>
    %c0 = arith.constant 0 : index
    %c1 = arith.constant 1 : index
    %c2 = arith.constant 2 : index
    %0 = affine.load %alloc[%c0, %c0] : memref<2x3xi32>
    affine.store %0, %alloc_0[%c0, %c0] : memref<2x3xi32>
    %1 = affine.load %alloc[%c0, %c1] : memref<2x3xi32>
    affine.store %1, %alloc_0[%c0, %c1] : memref<2x3xi32>
    %2 = affine.load %alloc[%c0, %c2] : memref<2x3xi32>
    affine.store %2, %alloc_0[%c0, %c2] : memref<2x3xi32>
    [...]
  }
}

Options

-disable-affine-loop : Use this to disable the use of affine loops
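As a rough Python analogy for what the expansion does, the sketch below enumerates the element indices of a 2x3 buffer in row-major order and copies one element per load/store pair. The helper names expanded_indices and copy_elementwise are hypothetical and exist only for this illustration.

```python
from itertools import product


def expanded_indices(shape):
    """Index tuples touched by an element-wise copy over `shape`, row-major."""
    return list(product(*(range(dim) for dim in shape)))


def copy_elementwise(src, dst, shape):
    """Copy a 2-D nested list one element at a time."""
    for i, j in expanded_indices(shape):
        value = src[i][j]   # plays the role of affine.load
        dst[i][j] = value   # plays the role of affine.store


src = [[1, 2, 3], [4, 5, 6]]
dst = [[0, 0, 0], [0, 0, 0]]
copy_elementwise(src, dst, (2, 3))
print(expanded_indices((2, 3))[:3])  # [(0, 0), (0, 1), (0, 2)]
```

The first three index pairs match the first three load/store pairs of the unrolled output shown above.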

-extract-loop-body

Extracts the logic of loop bodies into functions.

This pass extracts logic in the inner body of for loops into functions.

This pass requires that tensors are lowered to memref. It expects that a loop body contains a number of affine.load statements used as inputs to the extracted function, and a single affine.store used as the extracted function’s output.

Input

module {
  func.func @loop_body() {
    %c-128_i8 = arith.constant -128 : i8
    %c127_i8 = arith.constant 127 : i8
    %alloc_7 = memref.alloc() {alignment = 64 : i64} : memref<25x20x8xi8>
    affine.for %arg1 = 0 to 25 {
      affine.for %arg2 = 0 to 20 {
        affine.for %arg3 = 0 to 8 {
          %98 = affine.load %alloc_6[%arg1, %arg2, %arg3] : memref<25x20x8xi8>
          %99 = arith.cmpi slt, %98, %c-128_i8 : i8
          %100 = arith.select %99, %c-128_i8, %98 : i8
          %101 = arith.cmpi sgt, %98, %c127_i8 : i8
          %102 = arith.select %101, %c127_i8, %100 : i8
          affine.store %102, %alloc_7[%arg1, %arg2, %arg3] : memref<25x20x8xi8>
        }
      }
    }
  }
}

Output

module {
  func.func @loop_body() {
    %alloc_7 = memref.alloc() {alignment = 64 : i64} : memref<25x20x8xi8>
    affine.for %arg1 = 0 to 25 {
      affine.for %arg2 = 0 to 20 {
        affine.for %arg3 = 0 to 8 {
          %98 = affine.load %alloc_6[%arg1, %arg2, %arg3] : memref<25x20x8xi8>
          %102 = func.call @__for_loop(%98) : (i8) -> i8
          affine.store %102, %alloc_7[%arg1, %arg2, %arg3] : memref<25x20x8xi8>
        }
      }
    }
  }
  func.func private @__for_loop(%arg0: i8) -> i8 {
    %c-128_i8 = arith.constant -128 : i8
    %c127_i8 = arith.constant 127 : i8
    %99 = arith.cmpi slt, %arg0, %c-128_i8 : i8
    %100 = arith.select %99, %c-128_i8, %arg0 : i8
    %101 = arith.cmpi sgt, %arg0, %c127_i8 : i8
    %102 = arith.select %101, %c127_i8, %100 : i8
    return %102 : i8
  }
}

Options

-min-loop-size : Use this to control the minimum loop size to apply this pass
-min-body-size : Use this to control the minimum loop body size to apply this pass

-memref-global-replace

MemrefGlobalReplacePass forwards global memref accessors to arithmetic values

This pass forwards constant global MemRef values to referencing affine loads. This pass requires that the MemRef global values are initialized as constants and that the affine load access indices are constants (i.e. not variadic). Unroll affine loops prior to running this pass.

MemRef removal is required to remove any memory allocations from the input model (for example, TensorFlow models contain global memory holding model weights) to support FHE transpilation.

Input

module {
  memref.global "private" constant @__constant_8xi16 : memref<2x4xi16> = dense<[[-10, 20, 3, 4], [5, 6, 7, 8]]>
  func.func @main() -> i16 {
    %c1 = arith.constant 1 : index
    %c2 = arith.constant 2 : index
    %0 = memref.get_global @__constant_8xi16 : memref<2x4xi16>
    %1 = affine.load %0[%c1, %c1 + %c2] : memref<2x4xi16>
    return %1 : i16
  }
}

Output

module {
  func.func @main() -> i16 {
    %c1 = arith.constant 1 : index
    %c2 = arith.constant 2 : index
    %c8_i16 = arith.constant 8 : i16
    return %c8_i16 : i16
  }
}
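The example above can be checked by hand with a small Python model: the global is a constant 2x4 table, the access indices are the constants 1 and 1 + 2, and the load folds to the element at row 1, column 3. The function name fold_global_load and the dictionary of globals are assumptions made for this sketch, not part of the pass.

```python
# Constant globals, keyed by symbol name (values copied from the example).
GLOBALS = {"__constant_8xi16": [[-10, 20, 3, 4], [5, 6, 7, 8]]}


def fold_global_load(globals_, name, indices):
    """Replace a load from a constant global at constant indices by its value."""
    value = globals_[name]
    for index in indices:
        value = value[index]
    return value


c1, c2 = 1, 2
print(fold_global_load(GLOBALS, "__constant_8xi16", (c1, c1 + c2)))  # 8
```

This only works because both the table and the indices are known statically, which is why loops need to be unrolled first.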

-unroll-and-forward

Loop unrolls and forwards stores to loads.

This pass processes the first function in a given module, and, starting from the first loop, iteratively does the following:

  1. Fully unroll the loop.
  2. Scan for load ops. For each load op with a statically-inferrable access index:
       1. Backtrack to the original memref alloc.
       2. Find all store ops at the corresponding index (possibly transitively through renames/subviews of the underlying alloc).
       3. Find the last store that occurs and forward it to the load.
       4. If the original memref is an input memref, then forward through any renames to make the target load load directly from the argument memref (instead of any subviews, say).
  3. Apply the same logic to any remaining loads not inside any for loop.

This pass requires that tensors are lowered to memref, and only supports affine loops with affine.load/store ops.

Memrefs that result from memref.get_global ops are excluded from forwarding, even if they are loaded with a static index, and are instead handled by memref-global-replace, which should be run after this pass.
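To illustrate why unrolling comes first, here is a hedged Python sketch of a running-sum loop. Once unrolled, every access has a static index, so each load from the accumulator can be replaced by the value of the last store. The op tuples and the names unroll, forward, and body are hypothetical, and renames/subviews of input memrefs are not modeled.

```python
def unroll(trip_count, body):
    """Fully unroll a loop by instantiating body(i) for each iteration."""
    ops = []
    for i in range(trip_count):
        ops.extend(body(i))
    return ops


def forward(ops):
    """Forward the last store at each (memref, index) to later loads."""
    last_store, rename, out = {}, {}, []
    for op in ops:
        if op[0] == "store":
            _, value, memref, index = op
            last_store[(memref, index)] = rename.get(value, value)
            out.append(("store", rename.get(value, value), memref, index))
        elif op[0] == "load":
            _, result, memref, index = op
            if (memref, index) in last_store:
                rename[result] = last_store[(memref, index)]
            else:
                out.append(op)  # e.g. loads from an input memref stay
        else:  # ("add", result, lhs, rhs)
            _, result, lhs, rhs = op
            out.append(("add", result, rename.get(lhs, lhs), rename.get(rhs, rhs)))
    return out


def body(i):
    # sum[0] = sum[0] + in[i]
    return [
        ("load", f"%acc{i}", "%sum", 0),
        ("load", f"%x{i}", "%in", i),
        ("add", f"%s{i}", f"%acc{i}", f"%x{i}"),
        ("store", f"%s{i}", "%sum", 0),
    ]


ops = [("store", "%zero", "%sum", 0)] + unroll(2, body)
result = forward(ops)
for op in result:
    print(op)
```

After forwarding, no load from %sum remains and the second add consumes %s0 directly; only the loads from the input %in are left.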

9 - PolynomialToStandard

-polynomial-to-standard

Lower polynomial to standard MLIR dialects.

This pass lowers the polynomial dialect to standard MLIR, a mixture of affine, tensor, and arith.

10 - SecretizePasses

-secretize

Adds secret argument attributes to entry function

Adds a secret.secret argument attribute to each argument in the entry function of an MLIR module. By default, the entry function is main. This may be overridden with the option -entry-function=top_level_func.

Options

-entry-function : entry function of the module

-wrap-generic

Wraps regions using secret args in secret.generic bodies

This pass wraps function regions of func.func that use secret arguments in secret.generic bodies.

Secret arguments are annotated using a secret.secret argument attribute. This pass converts these to secret types and then inserts a secret.generic body to hold the function’s region. The output type is also converted to a secret.

Example input:

  func.func @main(%arg0: i32 {secret.secret}) -> i32 {
    %0 = arith.constant 100 : i32
    %1 = arith.addi %0, %arg0 : i32
    return %1 : i32
  }

Output:

  func.func @main(%arg0: !secret.secret<i32>) -> !secret.secret<i32> {
    %0 = secret.generic ins(%arg0 : !secret.secret<i32>) {
    ^bb0(%arg1: i32):
      %1 = arith.constant 100 : i32
      %2 = arith.addi %1, %arg1 : i32
      secret.yield %2 : i32
    } -> !secret.secret<i32>
    return %0 : !secret.secret<i32>
  }

11 - SecretPasses

-secret-capture-generic-ambient-scope

Capture the ambient scope used in a secret.generic

For each value used in the body of a secret.generic op, which is defined in the ambient scope outside the generic, add it to the argument list of the generic.

-secret-distribute-generic

Distribute generic ops through their bodies.

Converts generic ops whose region contains many ops into smaller sequences of generic ops whose regions contain a single op, dropping the generic part from any resulting generic ops that have no secret.secret inputs. If the op has associated regions, and the operands are not secret, then the generic is distributed recursively through the op’s regions as well.

This pass is intended to be used as part of a front-end pipeline, where a program that operates on a secret type annotates the input to a region as secret, and then wraps the contents of the region in a single large secret.generic, then uses this pass to simplify it.

The distribute-through option allows one to specify a comma-separated list of op names (e.g., distribute-through="affine.for,scf.if"), which limits the distribution to only pass through those ops. If unset, all ops are distributed through when possible.

Options

-distribute-through : comma-separated list of ops that should be distributed through
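A simplified Python model of the splitting step is sketched below. It treats a generic body as a flat list of ops, wraps each op that touches a secret value in its own single-op generic, and leaves ops with no secret inputs unwrapped. The tuple encoding and the name distribute_generic are assumptions for illustration; ops with regions and the distribute-through option are not modeled.

```python
def distribute_generic(body, secret_inputs):
    """Split one multi-op generic body into per-op pieces.

    body: list of (result, op_name, operands) in program order.
    Returns a list of ("generic", op) or ("plain", op) entries.
    """
    secret = set(secret_inputs)
    out = []
    for result, name, operands in body:
        if any(operand in secret for operand in operands):
            secret.add(result)  # a value computed from a secret is secret
            out.append(("generic", (result, name, operands)))
        else:
            # No secret inputs: the generic wrapper is dropped.
            out.append(("plain", (result, name, operands)))
    return out


body = [
    ("%1", "arith.constant", ()),
    ("%2", "arith.addi", ("%1", "%arg1")),
    ("%3", "arith.muli", ("%2", "%1")),
]
for entry in distribute_generic(body, ["%arg1"]):
    print(entry)
```

Here the constant ends up outside any generic, while the addi and the muli, which depend on the secret %arg1, each get their own generic.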

-secret-forget-secrets

Convert secret types to standard types

Drop the secret<...> type from the IR, replacing it with the contained type and the corresponding cleartext computation.

-secret-generic-absorb-constants

Copy constants into a secret.generic body

For each constant value used in the body of a secret.generic op, which is defined in the ambient scope outside the generic, add its definition into the generic body.

-secret-merge-adjacent-generics

Merge two adjacent generics into a single generic

This pass merges two immediately sequential generics into a single generic. Useful as a sub-operation in some passes, and extracted into its own pass for testing purposes.

12 - YosysOptimizerPasses

-yosys-optimizer

Invoke Yosys to perform circuit optimization.

This pass invokes Yosys to convert an arithmetic circuit to an optimized boolean circuit that uses the arith and comb dialects.

Note that booleanization changes the function signature: multi-bit integers are transformed to a tensor of booleans; for example, an i8 is converted to tensor<8xi1>.
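The signature change can be pictured with a small Python sketch that splits an 8-bit integer into eight booleans and reassembles it. The helper names to_bits and from_bits are hypothetical, and the least-significant-bit-first ordering is an assumption of this example rather than something stated above.

```python
def to_bits(value, width=8):
    """Split a `width`-bit integer into booleans, least significant bit first.

    The bit order is an assumption made for this illustration.
    """
    return [bool((value >> i) & 1) for i in range(width)]


def from_bits(bits):
    """Reassemble an unsigned integer from LSB-first booleans."""
    return sum(int(bit) << i for i, bit in enumerate(bits))


bits = to_bits(5)
print(len(bits))        # 8, one boolean per bit of an i8
print(from_bits(bits))  # 5
```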

The optimizer will be applied to each secret.generic op containing arithmetic ops that can be optimized.

Optional parameters:

  • abc-fast: Run the abc optimizer in “fast” mode, getting faster compile time at the expense of a possibly larger output circuit.
  • unroll-factor: Before optimizing the circuit, unroll loops by a given factor. If unset, this pass will not unroll any loops.
  • print-stats: Prints statistics about the optimized circuits.

Statistics

total circuit size : The total circuit size for all optimized circuits, after optimization is done.