{ "cells": [ { "cell_type": "markdown", "id": "cef49fd5", "metadata": { "origin_pos": 0 }, "source": [ "# Compilers and Interpreters\n", ":label:`sec_hybridize`\n", "\n", "So far, this book has focused on imperative programming, which makes use of statements such as `print`, `+`, and `if` to change a program's state. Consider the following example of a simple imperative program.\n" ] }, { "cell_type": "code", "execution_count": 1, "id": "2c48ab86", "metadata": { "execution": { "iopub.execute_input": "2023-08-18T19:32:08.456290Z", "iopub.status.busy": "2023-08-18T19:32:08.456017Z", "iopub.status.idle": "2023-08-18T19:32:08.467816Z", "shell.execute_reply": "2023-08-18T19:32:08.466792Z" }, "origin_pos": 1, "tab": [ "pytorch" ] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "10\n" ] } ], "source": [ "def add(a, b):\n", " return a + b\n", "\n", "def fancy_func(a, b, c, d):\n", " e = add(a, b)\n", " f = add(c, d)\n", " g = add(e, f)\n", " return g\n", "\n", "print(fancy_func(1, 2, 3, 4))" ] }, { "cell_type": "markdown", "id": "6dafbc49", "metadata": { "origin_pos": 2 }, "source": [ "Python is an *interpreted language*. When evaluating the above `fancy_func` function it performs the operations making up the function's body *in sequence*. That is, it will evaluate `e = add(a, b)` and store the results as variable `e`, thereby changing the program's state. The next two statements `f = add(c, d)` and `g = add(e, f)` will be executed similarly, performing additions and storing the results as variables. :numref:`fig_compute_graph` illustrates the flow of data.\n", "\n", "![Data flow in an imperative program.](../img/computegraph.svg)\n", ":label:`fig_compute_graph`\n", "\n", "Although imperative programming is convenient, it may be inefficient. On the one hand, even if the `add` function is repeatedly called throughout `fancy_func`, Python will execute the three function calls individually. If these are executed, say, on a GPU (or even on multiple GPUs), the overhead arising from the Python interpreter can become overwhelming. Moreover, it will need to save the variable values of `e` and `f` until all the statements in `fancy_func` have been executed. This is because we do not know whether the variables `e` and `f` will be used by other parts of the program after the statements `e = add(a, b)` and `f = add(c, d)` are executed.\n", "\n", "## Symbolic Programming\n", "\n", "Consider the alternative, *symbolic programming*, where computation is usually performed only once the process has been fully defined. This strategy is used by multiple deep learning frameworks, including Theano and TensorFlow (the latter has acquired imperative extensions). It usually involves the following steps:\n", "\n", "1. Define the operations to be executed.\n", "1. Compile the operations into an executable program.\n", "1. Provide the required inputs and call the compiled program for execution.\n", "\n", "This allows for a significant amount of optimization. First, we can skip the Python interpreter in many cases, thus removing a performance bottleneck that can become significant on multiple fast GPUs paired with a single Python thread on a CPU. \n", "Second, a compiler might optimize and rewrite the above code into `print((1 + 2) + (3 + 4))` or even `print(10)`. This is possible since a compiler gets to see the full code before turning it into machine instructions. For instance, it can release memory (or never allocate it) whenever a variable is no longer needed. 
"\n", "To get a better idea, consider the following simulation of symbolic programming (it is Python after all): we build the program as a string, compile it, and only then execute it.\n" ] },
{ "cell_type": "code", "execution_count": 2, "id": "27a95270", "metadata": { "execution": { "iopub.execute_input": "2023-08-18T19:32:08.472772Z", "iopub.status.busy": "2023-08-18T19:32:08.472202Z", "iopub.status.idle": "2023-08-18T19:32:08.479140Z", "shell.execute_reply": "2023-08-18T19:32:08.478119Z" }, "origin_pos": 3, "tab": [ "pytorch" ] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", "def add(a, b):\n", " return a + b\n", "\n", "def fancy_func(a, b, c, d):\n", " e = add(a, b)\n", " f = add(c, d)\n", " g = add(e, f)\n", " return g\n", "print(fancy_func(1, 2, 3, 4))\n", "10\n" ] } ], "source": [ "def add_():\n", " return '''\n", "def add(a, b):\n", " return a + b\n", "'''\n", "\n", "def fancy_func_():\n", " return '''\n", "def fancy_func(a, b, c, d):\n", " e = add(a, b)\n", " f = add(c, d)\n", " g = add(e, f)\n", " return g\n", "'''\n", "\n", "def evoke_():\n", " # Assemble the full program as a single string\n", " return add_() + fancy_func_() + 'print(fancy_func(1, 2, 3, 4))'\n", "\n", "prog = evoke_()\n", "print(prog)\n", "# Compile the program once, then execute the compiled object\n", "y = compile(prog, '', 'exec')\n", "exec(y)" ] },
{ "cell_type": "markdown", "id": "1139822b", "metadata": { "origin_pos": 4 }, "source": [ "The differences between imperative (interpreted) programming and symbolic programming are as follows:\n", "\n", "* Imperative programming is easier. When imperative programming is used in Python, the majority of the code is straightforward and easy to write. It is also easier to debug, since it is straightforward to obtain and print intermediate variable values or to use Python's built-in debugging tools.\n", "* Symbolic programming is more efficient and easier to port. Symbolic programming makes it easier to optimize the code during compilation, while also having the ability to port the program into a format independent of Python. This allows the program to be run in a non-Python environment, thus avoiding any potential performance issues related to the Python interpreter.\n", "\n", "\n", "## Hybrid Programming\n", "\n", "Historically, most deep learning frameworks chose between an imperative and a symbolic approach. For example, Theano, TensorFlow (inspired by the former), Keras, and CNTK formulate models symbolically. Conversely, Chainer and PyTorch take an imperative approach. An imperative mode was added to TensorFlow 2.0 and Keras in later revisions.\n" ] },
{ "cell_type": "markdown", "id": "0e58166e", "metadata": { "origin_pos": 6, "tab": [ "pytorch" ] }, "source": [ "As mentioned above, PyTorch is based on imperative programming and uses dynamic computation graphs. In an effort to leverage the portability and efficiency of symbolic programming, developers considered whether it would be possible to combine the benefits of both programming paradigms. This led to torchscript, which lets users develop and debug using pure imperative programming, while retaining the ability to convert most programs into symbolic programs to be run when product-level computing performance and deployment are required.\n" ] },
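{ "cell_type": "markdown", "id": "3fa9b0e2", "metadata": {}, "source": [ "Before applying this to a full network, it may help to see that `torch.jit.script` can also compile a plain Python function, control flow included. The following is a minimal sketch (the function `relu_like` is our own illustrative example, not part of any API):\n", "\n", "```python\n", "import torch\n", "\n", "# Scripting compiles the function, including its if branch,\n", "# into a program that no longer needs the Python interpreter\n", "@torch.jit.script\n", "def relu_like(x: torch.Tensor) -> torch.Tensor:\n", "    if bool(x.sum() > 0):\n", "        return x\n", "    return -x\n", "\n", "print(relu_like.code)  # inspect the compiled form\n", "```\n" ] },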
{ "cell_type": "markdown", "id": "69c139d7", "metadata": { "origin_pos": 8 }, "source": [ "## Hybridizing the `Sequential` Class\n", "\n", "The easiest way to get a feel for how hybridization works is to consider deep networks with multiple layers. Conventionally, the Python interpreter will need to execute the code for all layers to generate instructions that can then be forwarded to a CPU or a GPU. For a single (fast) computing device this does not cause any major issues. On the other hand, if we use an advanced 8-GPU server such as an AWS P3dn.24xlarge instance, Python will struggle to keep all GPUs busy. The single-threaded Python interpreter becomes the bottleneck here. Let's see how we can address this for significant parts of the code by compiling a `Sequential` network with `torch.jit.script`. We begin by defining a simple MLP.\n" ] },
{ "cell_type": "code", "execution_count": 3, "id": "4cc8af33", "metadata": { "execution": { "iopub.execute_input": "2023-08-18T19:32:08.484297Z", "iopub.status.busy": "2023-08-18T19:32:08.483690Z", "iopub.status.idle": "2023-08-18T19:32:11.518419Z", "shell.execute_reply": "2023-08-18T19:32:11.517515Z" }, "origin_pos": 10, "tab": [ "pytorch" ] }, "outputs": [ { "data": { "text/plain": [ "tensor([[-0.1602, 0.0003]], grad_fn=<AddmmBackward0>)" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import torch\n", "from torch import nn\n", "from d2l import torch as d2l\n", "\n", "\n", "# Factory for networks\n", "def get_net():\n", " net = nn.Sequential(nn.Linear(512, 256),\n", " nn.ReLU(),\n", " nn.Linear(256, 128),\n", " nn.ReLU(),\n", " nn.Linear(128, 2))\n", " return net\n", "\n", "x = torch.randn(size=(1, 512))\n", "net = get_net()\n", "net(x)" ] },
{ "cell_type": "markdown", "id": "dcbd5cb8", "metadata": { "origin_pos": 13, "tab": [ "pytorch" ] }, "source": [ "By converting the model with the `torch.jit.script` function, we are able to compile and optimize the computation in the MLP. The model's computation result remains unchanged.\n" ] },
{ "cell_type": "code", "execution_count": 4, "id": "8934d85c", "metadata": { "execution": { "iopub.execute_input": "2023-08-18T19:32:11.522293Z", "iopub.status.busy": "2023-08-18T19:32:11.521613Z", "iopub.status.idle": "2023-08-18T19:32:11.669261Z", "shell.execute_reply": "2023-08-18T19:32:11.668397Z" }, "origin_pos": 16, "tab": [ "pytorch" ] }, "outputs": [ { "data": { "text/plain": [ "tensor([[-0.1602, 0.0003]], grad_fn=<AddmmBackward0>)" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "net = torch.jit.script(net)\n", "net(x)" ] },
{ "cell_type": "markdown", "id": "c52d0bc2", "metadata": { "origin_pos": 19, "tab": [ "pytorch" ] }, "source": [ "This seems almost too good to be true: write the same code as before and simply convert the model using `torch.jit.script`. Once this happens, the network is optimized (we will benchmark the performance below).\n" ] },
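{ "cell_type": "markdown", "id": "7d2e4a91", "metadata": {}, "source": [ "If you are curious about what the compiled program looks like, the scripted module exposes it for inspection. A brief sketch, to be run after the scripting step above (the exact printed representation varies across PyTorch versions):\n", "\n", "```python\n", "# The scripted module carries a compiled program that is\n", "# independent of the Python interpreter\n", "print(net.code)   # TorchScript source of the forward pass\n", "print(net.graph)  # the underlying intermediate representation\n", "```\n" ] },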
{ "cell_type": "markdown", "id": "17420099", "metadata": { "origin_pos": 21 }, "source": [ "### Acceleration by Hybridization\n", "\n", "To demonstrate the performance improvement gained by compilation, we compare the time needed to evaluate `net(x)` before and after hybridization. Let's first define a class to measure this time. It will come in handy throughout the chapter as we set out to measure (and improve) performance.\n" ] },
{ "cell_type": "code", "execution_count": 5, "id": "74c63619", "metadata": { "execution": { "iopub.execute_input": "2023-08-18T19:32:11.672972Z", "iopub.status.busy": "2023-08-18T19:32:11.672389Z", "iopub.status.idle": "2023-08-18T19:32:11.677426Z", "shell.execute_reply": "2023-08-18T19:32:11.676627Z" }, "origin_pos": 22, "tab": [ "pytorch" ] }, "outputs": [], "source": [ "#@save\n", "class Benchmark:\n", " \"\"\"For measuring running time.\"\"\"\n", " def __init__(self, description='Done'):\n", " self.description = description\n", "\n", " def __enter__(self):\n", " self.timer = d2l.Timer()\n", " return self\n", "\n", " def __exit__(self, *args):\n", " print(f'{self.description}: {self.timer.stop():.4f} sec')" ] },
{ "cell_type": "markdown", "id": "1f3fd6d6", "metadata": { "origin_pos": 24, "tab": [ "pytorch" ] }, "source": [ "Now we can invoke the network twice, once with and once without torchscript.\n" ] },
{ "cell_type": "code", "execution_count": 6, "id": "69585cb8", "metadata": { "execution": { "iopub.execute_input": "2023-08-18T19:32:11.680884Z", "iopub.status.busy": "2023-08-18T19:32:11.680338Z", "iopub.status.idle": "2023-08-18T19:32:17.913447Z", "shell.execute_reply": "2023-08-18T19:32:17.912391Z" }, "origin_pos": 27, "tab": [ "pytorch" ] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Without torchscript: 2.1447 sec\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "With torchscript: 4.0545 sec\n" ] } ], "source": [ "net = get_net()\n", "with Benchmark('Without torchscript'):\n", " for i in range(1000): net(x)\n", "\n", "net = torch.jit.script(net)\n", "with Benchmark('With torchscript'):\n", " for i in range(1000): net(x)" ] },
{ "cell_type": "markdown", "id": "89a3097d", "metadata": { "origin_pos": 30, "tab": [ "pytorch" ] }, "source": [ "As the above results show, scripting an `nn.Sequential` instance with the `torch.jit.script` function does not guarantee a speedup: in this particular run the scripted network is actually slower. The gains from symbolic programming depend on the workload; they are largest when the Python interpreter overhead dominates, for example with many small operations or when a single Python thread has to drive multiple fast GPUs.\n" ] },
{ "cell_type": "markdown", "id": "e10e5f06", "metadata": { "origin_pos": 32 }, "source": [ "### Serialization\n" ] },
{ "cell_type": "markdown", "id": "380c9842", "metadata": { "origin_pos": 34, "tab": [ "pytorch" ] }, "source": [ "One of the benefits of compiling the models is that we can serialize (save) the model and its parameters to disk. This lets us store a model in a manner that is independent of the front-end language of choice, deploy trained models to other devices, and easily use other front-end programming languages. At the same time the code is often faster than what can be achieved in imperative programming. Let's see the `save` function in action.\n" ] },
{ "cell_type": "code", "execution_count": 7, "id": "e4d801da", "metadata": { "execution": { "iopub.execute_input": "2023-08-18T19:32:17.918181Z", "iopub.status.busy": "2023-08-18T19:32:17.917287Z", "iopub.status.idle": "2023-08-18T19:32:18.088454Z", "shell.execute_reply": "2023-08-18T19:32:18.086861Z" }, "origin_pos": 37, "tab": [ "pytorch" ] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "-rw-r--r-- 1 ci ci 651K Aug 18 19:32 my_mlp\r\n" ] } ], "source": [ "net.save('my_mlp')\n", "!ls -lh my_mlp*" ] },
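{ "cell_type": "markdown", "id": "5c8f1b36", "metadata": {}, "source": [ "The saved file can later be reloaded, in a fresh Python process or even from C++, without the original class definition of the network. A minimal sketch of the Python side (using the file saved above):\n", "\n", "```python\n", "import torch\n", "\n", "# Reload the compiled model; no Python source for the MLP is needed\n", "loaded = torch.jit.load('my_mlp')\n", "loaded(torch.randn(1, 512))\n", "```\n" ] },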
{ "cell_type": "markdown", "id": "e9e55a0a", "metadata": { "origin_pos": 50 }, "source": [ "## Summary\n", "\n", "\n", "* Imperative programming makes it easy to design new models, since it is possible to write code with control flow and to use a large part of the Python software ecosystem.\n", "* Symbolic programming requires that we specify the program and compile it before executing it. The benefit is improved performance.\n" ] },
{ "cell_type": "markdown", "id": "22d2b11e", "metadata": { "origin_pos": 52 }, "source": [ "## Exercises\n" ] },
{ "cell_type": "markdown", "id": "e004259d", "metadata": { "origin_pos": 54, "tab": [ "pytorch" ] }, "source": [ "1. Review the models that interest you in the previous chapters. Can you improve their computational performance by reimplementing them?\n" ] },
{ "cell_type": "markdown", "id": "e1d0a88d", "metadata": { "origin_pos": 56, "tab": [ "pytorch" ] }, "source": [ "[Discussions](https://discuss.d2l.ai/t/2490)\n" ] } ], "metadata": { "language_info": { "name": "python" }, "required_libs": [] }, "nbformat": 4, "nbformat_minor": 5 }