Simulating Workbench Programs¶

In this tutorial, we will explore the various simulators available in Workbench, their capabilities and limitations.

As we saw in the earlier tutorial, Workbench programs are compiled and executed via the filter pipeline - a sequence of filters that process quantum instructions. One of the most prominent types of filter is the simulator - a classical program that simulates the high-level behavior of a quantum computer. The core function of a quantum simulator is to track the internal state of the quantum program and the way it changes with each gate or measurement applied. Additionally, some simulators provide extra features for inspecting their internal state in ways that are impossible on a real quantum computer.

We use simulators to run small quantum programs or specialized subroutines of larger quantum algorithms and verify that they produce correct results. This allows us to develop, test, and optimize quantum software before we build real fault-tolerant quantum computers.

In this tutorial, we'll go over the main simulators available in Workbench:

State vector simulator
Bit-vector simulator, useful for simulating large computations that don't introduce superposition
CUDA-Q simulators, useful for accelerating state vector and tensor network simulation using a GPU

In the next tutorial, we will dive deeper into testing and debugging Workbench programs using the additional capabilities provided by the Workbench simulators.

State vector simulator¶

The state vector simulator allows you to simulate small programs (up to about 30 qubits). This is the default simulator in Workbench: when you create a QPU without specifying the list of filters to use, it uses the state vector simulator (along with a few utility filters).

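The following example shows how to use the state vector simulator. It prepares a W state on four qubits using a sequence of controlled rotation gates. As a reminder, the W state on four qubits is the equal superposition of all basis states in which exactly one qubit is in the $|1\rangle$ state:

$$|W_4\rangle = \frac{1}{2}\left(|1000\rangle + |0100\rangle + |0010\rangle + |0001\rangle\right)$$

Then, it prints the state vector of the system using the print_state_vector method of the QPU object to demonstrate that it is indeed in the W state. This is a simulator-specific capability: you cannot print the program state when running on a real quantum device!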

Finally, the program repeatedly prepares the W state and measures the qubits to get the frequencies of different outcomes. You will get the outcomes $1$, $2$, $4$, and $8$ with approximately equal frequency - the little-endian representations of the basis states that comprise the W state. If you ran this program on a QPU object without a simulator in the list of filters, all calls to read would've returned $0$.

In [1]:

from math import asin, sqrt
from psiqworkbench import QPU, Qubits, Units

def prep_w_state(reg: Qubits):
    n = reg.num_qubits
    reg[0].ry(2 * asin(1 / sqrt(n)) * Units.rad)
    for j in range(1, n):
        reg[j].ry(2 * asin(1 / sqrt(n - j)) * Units.rad, cond=~reg[:j])

n = 4
qpu = QPU(num_qubits=n)
reg = Qubits(n, "reg", qpu)
prep_w_state(reg)
qpu.print_state_vector()
qpu.nop(repeat=10)
qpu.draw()

freq = [0] * 2**n  # one counter per possible measurement outcome
for _ in range(100):
    reg.write(0)
    prep_w_state(reg)
    freq[reg.read()] += 1
print(freq)

|reg>
|1>    0.500000+0.000000j
|2>    0.500000+0.000000j
|4>    0.500000+0.000000j
|8>    0.500000+0.000000j

[Image: circuit diagram of the W state preparation, as drawn by qpu.draw()]

[0, 22, 30, 0, 26, 0, 0, 0, 22, 0, 0, 0, 0, 0, 0, 0]
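To see why the rotation angles in prep_w_state give equal amplitudes, the cascade can be traced by hand in plain Python, without Workbench. This is a minimal sketch, assuming the usual convention $R_y(\theta)|0\rangle = \cos(\theta/2)|0\rangle + \sin(\theta/2)|1\rangle$ and assuming that cond=~reg[:j] applies the rotation only when all earlier qubits are $0$. The helper name w_state_amplitudes is made up for this illustration.

```python
from math import asin, cos, sin, sqrt

def w_state_amplitudes(n):
    """Trace the rotation cascade and return (amplitudes, leftover).

    amplitudes maps each little-endian outcome 1 << j to its amplitude;
    leftover is the amplitude still on the all-zeros state at the end.
    """
    amplitudes = {}
    remaining = 1.0  # amplitude of the branch where qubits 0..j-1 are all 0
    for j in range(n):
        theta = 2 * asin(1 / sqrt(n - j))  # same angle as in prep_w_state
        amplitudes[1 << j] = remaining * sin(theta / 2)  # qubit j flips to 1
        remaining *= cos(theta / 2)                      # qubit j stays 0
    return amplitudes, remaining

amplitudes, leftover = w_state_amplitudes(4)
for outcome, amp in amplitudes.items():
    print(f"|{outcome}>  {amp:.6f}")
```

For four qubits, every amplitude comes out as 0.5 up to floating-point rounding, and nothing is left on the all-zeros state, which agrees with the print_state_vector output above.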

If you allocate more than 30 qubits, QPU initialization will fail due to memory limitations. For example, the following code snippet

qpu = QPU(num_qubits=31)

will raise the following runtime error:

RAM size for 31 qubits requires 32 GB, which exceeds qc.reset(max_ram_size_gb) limit of 16 GB.
To fix this, either increase max_ram_size_gb, reduce the number of qubits,
or remove the >>state-vector-sim>>/>>qpu>> filter (possibly replace it with >>bit-sim>> if applicable)

The state vector simulator has a default maximum state vector size of 16 GB (30 qubits). This limit can be adjusted if needed. To do this, call the reset method of the QPU with both the number of qubits and the new max_ram_size_gb limit:

qpu = QPU()
qpu.reset(31, max_ram_size_gb=32)

However, you need to make sure that the machine that's running your script has enough RAM! If you set max_ram_size_gb to a value that's higher than the RAM available, the simulator will crash.
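The limits quoted above follow from simple arithmetic: a state vector on n qubits holds 2^n complex amplitudes. The sketch below assumes 16 bytes per amplitude (two 64-bit floats) and 1 GB = 2^30 bytes. That assumption is consistent with the 16 GB and 32 GB figures in the error message, but it is not stated in the text. The function names are illustrative and are not part of Workbench.

```python
def state_vector_ram_gb(num_qubits, bytes_per_amplitude=16):
    """Estimated RAM in GB (2**30 bytes) for a full state vector."""
    return 2 ** num_qubits * bytes_per_amplitude / 2 ** 30

def max_qubits_for_ram(ram_gb, bytes_per_amplitude=16):
    """Largest qubit count whose state vector fits in ram_gb."""
    num_qubits = 0
    while state_vector_ram_gb(num_qubits + 1, bytes_per_amplitude) <= ram_gb:
        num_qubits += 1
    return num_qubits

for n in (30, 31, 100):
    print(f"{n} qubits: {state_vector_ram_gb(n):g} GB")
```

Because each extra qubit doubles the requirement, raising max_ram_size_gb buys only a few more qubits before you run out of physical memory.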

Bit-vector simulator¶

For programs on a useful scale (hundreds or thousands of qubits), large portions of the circuit often consist of reversible computations - quantum computations that implement classical computations, such as arithmetic functions. These subroutines can be simulated independently using the bit-vector simulator.

The bit-vector simulator supports only a limited subset of quantum operations: the X gate, its controlled variants, and measurements. Since the freshly allocated qubits start in the $|0\rangle$ state, the limited set of gates means that the state of the simulated quantum system is always a basis state, never a superposition. This allows the simulator to represent the system state as a single bit string of 0s and 1s and to simulate very large quantum programs very effectively.
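The idea can be illustrated with a toy model in plain Python. This is only a sketch of the principle, not the Workbench implementation: the class ToyBitSim and its methods are hypothetical names, and the whole system state is stored as a single Python integer.

```python
class ToyBitSim:
    """Toy basis-state simulator: the state is one integer, bit j = qubit j."""

    def __init__(self, num_qubits):
        self.num_qubits = num_qubits
        self.state = 0  # every qubit starts in |0>

    def x(self, target, controls=()):
        """Apply X to target if all control qubits are 1 (no controls: always)."""
        if all((self.state >> c) & 1 for c in controls):
            self.state ^= 1 << target

    def read(self):
        """Measuring a basis state is deterministic: return the bit string."""
        return self.state

sim = ToyBitSim(1000)
sim.x(0)                      # qubit 0 becomes 1
sim.x(999, controls=[0])      # CNOT fires because qubit 0 is 1
sim.x(5, controls=[0, 1])     # Toffoli does not fire because qubit 1 is 0
print(sim.read() == (1 << 999) | 1)
```

Each gate costs a handful of bit operations regardless of the register size, which is why a thousand qubits pose no memory problem for this kind of simulation.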

The following example shows how to set up a QPU to use the BIT_DEFAULT filter preset, which includes the bit-vector simulator, and to run an adder example. You wouldn't be able to run the same program on a state vector simulator - simulating a hundred qubits is way beyond your computer's capabilities! (The state vector for 100 qubits would require 1.88895e+22 GB of RAM!)

In [2]:

from psiqworkbench import QPU, Qubits
from psiqworkbench.filter_presets import BIT_DEFAULT
from psiqworkbench.qubricks import NaiveAdd
from random import randrange
import time

n = 50
bit_qpu = QPU(num_qubits=2 * n, filters=BIT_DEFAULT)
reg1, reg2 = Qubits(n, 'reg1', bit_qpu), Qubits(n, 'reg2', bit_qpu)
adder = NaiveAdd()

val1, val2 = randrange(0, 2 ** 49), randrange(0, 2 ** 49)

tic = time.monotonic()
reg1.write(val1)
reg2.write(val2)
adder.compute(reg1, reg2)
total = reg1.read()  # named total to avoid shadowing the built-in sum
print(f'Computation finished in {time.monotonic() - tic} seconds')
assert val1 + val2 == total

bit_qpu.print_state_vector()

tic = time.monotonic()
bit_qpu.draw()
print(f'Drawing finished in {time.monotonic() - tic} seconds')

Computation finished in 0.06076729000000114 seconds
|reg1|reg2>
|276211007272690|141684700015608>    1.000000+0.000000j

Drawing finished in 1.1205059060000053 seconds

You can see that for this example drawing the circuit diagram takes much longer than the simulation itself! If you don't need to draw the visual representation of the circuit, for example, if you're running large-scale tests of arithmetic subroutines, the bit-vector simulator can handle much larger programs. The following example shows how to simulate addition of an integer to a 1,000-qubit register.

In [3]:

from psiqworkbench import QPU, Qubits
from psiqworkbench.qubricks import NaiveAdd
from random import randrange
import time

n = 1000
qpu = QPU(num_qubits=n, filters=[">>bit-sim>>"])
reg = Qubits(n, 'reg', qpu)
adder = NaiveAdd()

val1, val2 = randrange(0, 2 ** (n - 1)), randrange(0, 2 ** (n - 1))

tic = time.monotonic()
reg.write(val1)
adder.compute(reg, val2)
total = reg.read()  # named total to avoid shadowing the built-in sum
print(f'Computation finished in {time.monotonic() - tic} seconds')
qpu.print_state_vector()
assert val1 + val2 == total

Computation finished in 31.532571730999962 seconds
|reg>
|8539160238641207131757618179122977020768528758926605838133596991900907720652961055268293596417362415056972847976777791999210689319990786209474348044461922346413536720293432079625718111061239754313304636742231770950630002807662360037141430631093655214956549230936983469589439922179183903580251183769188>    1.000000+0.000000j

CUDA-Q simulators¶

Workbench contains built-in support for the CUDA-Q software package from NVIDIA, which enables rapid QPU simulation on GPU hardware. Many Workbench programs can run on it without any modification.

Important Notes:

Not all QPU programs benefit from GPU simulation. State vector simulations of 28 to 35 qubits are typically good candidates for acceleration on a GPU via the CUDA-Q integration.

The CUDA-Q support in Workbench will only function on an instance where the CUDA-Q library is installed.

Activating the GPU¶

To activate the CUDA-Q simulators, import them as shown here and pass the desired preset to your QPU constructor.

from psiqworkbench.filter_presets import CUDAQ_STATE_VEC  # To use NVIDIA's state vector simulator
from psiqworkbench.filter_presets import CUDAQ_TENSORNET  # To use NVIDIA's tensor network simulator

qpu = QPU(num_qubits=n, filters=CUDAQ_STATE_VEC)

The rest of your program can remain unmodified.

Alternative method: Environment variable override¶

If you want to try CUDA-Q state vector simulation without modifying your Workbench code, you can set the OVERRIDE_SV_TO_CUDAQ_SV environment variable by typing the following command into your terminal:

export OVERRIDE_SV_TO_CUDAQ_SV=1

This tells Workbench to override all >>state-vector-sim>> and SV_DEFAULT usage with the CUDAQ_STATE_VEC backend. It does not affect programs using BIT_DEFAULT or other filter presets.
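If you would rather apply the override to a single run than to the whole shell session, one option is to launch the script from Python with a modified copy of the environment. This is a sketch using only the standard library. The helper names are illustrative, my_script.py is a placeholder, and it assumes Workbench picks the variable up from the environment of the process that runs your script.

```python
import os
import subprocess
import sys

def env_with_cudaq_override(base_env=None):
    """Return a copy of the environment with the CUDA-Q override switched on."""
    env = dict(os.environ if base_env is None else base_env)
    env["OVERRIDE_SV_TO_CUDAQ_SV"] = "1"
    return env

def run_with_cudaq_override(script_path):
    """Run a script in a child process that sees the override variable."""
    return subprocess.run(
        [sys.executable, script_path],
        env=env_with_cudaq_override(),
        check=True,  # raise if the script exits with an error
    )

# Example (my_script.py is a placeholder for your own Workbench program):
# run_with_cudaq_override("my_script.py")
```

The parent process and your shell are left untouched, so other runs keep using the default state vector simulator.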

Activating multiple GPUs¶

CUDA-Q allows some simulations to be spread across multiple GPUs, allowing either increased speed or a larger number of simulated qubits. To enable this, use:

from psiqworkbench.filter_presets import CUDAQ_STATE_VEC_MGPU

qpu = QPU(num_qubits=n, filters=CUDAQ_STATE_VEC_MGPU)

Then, instead of python my_script.py, launch your program with mpiexec -np <m> python my_script.py, replacing <m> with the number of GPUs you plan to use.
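As a rough guide to how many GPUs a simulation might need, the following sketch estimates each GPU's share of the state vector. It assumes the amplitudes are split evenly across GPUs with no overhead. Bytes per amplitude is a parameter because the precision used by the CUDA-Q backend is not stated here. Both the function name and the 16-byte default are illustrative assumptions.

```python
def per_gpu_state_gb(num_qubits, num_gpus, bytes_per_amplitude=16):
    """Estimated share of the state vector per GPU, in GB (2**30 bytes)."""
    total_bytes = 2 ** num_qubits * bytes_per_amplitude
    return total_bytes / num_gpus / 2 ** 30

for gpus in (1, 2, 4, 8):
    share = per_gpu_state_gb(33, gpus)
    print(f"33 qubits on {gpus} GPU(s): {share:g} GB each")
```

Under these assumptions, each doubling of the GPU count makes room for one more qubit at the same per-GPU memory.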

Limitations¶

While the majority of workflows are fully supported, please note that Workbench's CUDA-Q integration does not support the following:

Partial-state push/pull: pull_state() and push_state() can currently be called on a QPU instance, but not on a Qubits instance.
Postselect: The QPU.postselect() and Qubits.postselect() calls are not operational.
Allocation debugging: The QPU.enable_qubit_allocation_debugging() feature is not yet supported.
PPM Probability: The Qubits.peek_ppm_probability() feature is not yet supported.

Next steps¶

In this tutorial, you've learned to use Workbench simulators. The next tutorial will discuss using various simulators to test and debug your programs.