





















HyPER optimization flow: Vivado HLS high-level synthesis → characterize basic building blocks → static optimization (ILP), optimized architecture for each level ($\mathcal{H}_1$, $\mathcal{H}_2$, $\mathcal{H}_3$) → bitstreams → reconfigure at runtime.

| Building Blocks                                                                        | LUT                             | FF                                   | BRAM        | DSP              | ns/val.           |
|----------------------------------------------------------------------------------------|---------------------------------|--------------------------------------|-------------|------------------|-------------------|
| Increment Generator:<br>Mersenne Twister<br>ICDF<br>Antithetic Core                    | 301<br>451<br>225               | 1 323<br>1 592<br>3 258              |             | 010              | -                 |
| Path Generators:<br>Single-Level Kernel<br>Multilevel Kernel                           | 4 153<br>5 607                  | 3 4241<br>7 5326                     | 266         | 38<br>43         | -                 |
| Payoff Features F<sub>i</sub>:<br>Barrier                                              | 180                             | 0 158                                | 8 0         | 0                | -                 |
| Payoff h:<br>Call/Put                                                                  | 440                             | 396                                  | 6 0         | 2                | 6                 |
| Feature Serializer k×1<br>Exponential<br>Multilevel Difference<br>Statistics II=1      | 30k+65<br>900<br>2 372<br>2 170 | 5 65k+45<br>384<br>2 355<br>1615     |             | 0<br>7<br>2<br>9 | 250<br>5<br>6     |
| Statistics II=2                                                                        | 1 454                           | 1 1164                               | 2           | 6                | 3                 |
| Com. Interface $\Psi$<br>FPGA $\rightarrow$ CPU                                        | LUT                             | FF                                   | BRAM        |                  | Bandwidth in MB/s |
| Config-Bus 1×k<br>Streaming-Fifo<br>DMA-Core                                           | 30k+50<br>654<br>1 864          | 2k+40<br>4 611<br>4 3 122            | 0<br>4<br>4 |                  | < 1<br>20<br>350  |
| Hybrid Chip F                                                                          | LUT                             | FF                                   | BRAM        | DSP              | ARM               |
| Xilinx Zynq 7020<br>Synthesis weight $\alpha$                                          | 53 200<br>0.8                   | 106 400<br>0.5                       | 280<br>1    | 220<br>1         | 2 cores           |
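
The Xilinx Zynq 7020 row and the synthesis weight $\alpha$ row can be read together as a resource budget for the static optimization. The following Python sketch illustrates one possible reading, under the assumption that $\alpha$ scales each raw capacity down to the share that can be filled in practice; the table itself does not state this. The names `Resources`, `weighted_budget`, and `max_instances` are hypothetical, and the example block is an illustrative placeholder rather than a value from the table.

```python
import math
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class Resources:
    """Resource counts (or weights) for the four FPGA resource types."""
    lut: float
    ff: float
    bram: float
    dsp: float


# Device capacity and synthesis weights as listed in the table.
ZYNQ_7020 = Resources(lut=53_200, ff=106_400, bram=280, dsp=220)
ALPHA = Resources(lut=0.8, ff=0.5, bram=1.0, dsp=1.0)


def weighted_budget(capacity=ZYNQ_7020, alpha=ALPHA):
    """Scale each capacity by its weight (assumed meaning of alpha)."""
    return Resources(*(
        getattr(capacity, f.name) * getattr(alpha, f.name)
        for f in fields(Resources)
    ))


def max_instances(block, budget=None):
    """Largest number of identical blocks that fits into the budget."""
    if budget is None:
        budget = weighted_budget()
    # Only resource types the block actually uses can limit the count.
    limits = [
        math.floor(getattr(budget, f.name) / getattr(block, f.name))
        for f in fields(Resources)
        if getattr(block, f.name) > 0
    ]
    if not limits:
        raise ValueError("block uses no resources; count is unbounded")
    return min(limits)


# Hypothetical block, not taken from the table.
example_block = Resources(lut=5_000, ff=8_000, bram=10, dsp=40)
print(max_instances(example_block))
```

For the hypothetical block, DSP slices are the limiting resource under this reading. An actual optimization would also have to account for the communication interface and statistics blocks that share the device.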







