Jekyll2020-02-05T22:04:19+00:00https://arnoutdevos.github.io/feed.xmlArnoutDevos.netArnout's personal website.Arnout DevosRandom Seeding In Pytorch2018-10-30T00:00:00+00:002018-10-30T00:00:00+00:00https://arnoutdevos.github.io/Random-seeding-in-PyTorch<p>When trying to write reproducible code that involves randomness, it’s a good idea to introduce a seed. As long as this seed is kept the same, the same ‘random’ things (most often numbers) will be generated repeatedly.</p>
<p>In PyTorch, this can be done using the following code:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>seed = 123
torch.manual_seed(seed)
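# For full reproducibility you may also want to seed the other
# sources of randomness (a sketch, not from the original post):
# import random; random.seed(seed)
# import numpy as np; np.random.seed(seed)
# torch.cuda.manual_seed_all(seed)  # if CUDA is used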
</code></pre></div></div>Arnout DevosWhen trying to write reproducible code that involves randomness, it’s a good idea to introduce a seed. As long as this seed is kept the same, the same ‘random’ things (most often numbers) will be generated repeatedly.Installing Jekyll on OSX without Xcode2018-01-01T00:00:00+00:002018-01-01T00:00:00+00:00https://arnoutdevos.github.io/Installing-Jekyll-on-OSX-without-Xcode<p>Using Jekyll is awesome! Actually this whole blog is based on it. Installing it on OSX and getting it to work, however, is another story. Officially you need to install Xcode and the Command-Line Tools it ships with. Since I never use Xcode, an alternative for installing Jekyll is as follows with the Ruby Version Manager (RVM):</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ xcode-select --install
$ \curl -sSL https://get.rvm.io | bash -s stable
$ source ~/.rvm/scripts/rvm
$ rvm install ruby --latest
$ gem install jekyll
</code></pre></div></div>
<p>These commands make sure you get the latest version of Ruby, which ensures compatibility with the latest version of Jekyll.</p>Arnout DevosUsing Jekyll is awesome! Actually this whole blog is based on it. Installing it on OSX and getting it to work, however, is another story. Officially you need to install Xcode and the Command-Line Tools it ships with. Since I never use Xcode, an alternative for installing Jekyll is as follows with the Ruby Version Manager (RVM):Learning to generate images with Generative Adversarial Networks2017-12-18T00:00:00+00:002017-12-18T00:00:00+00:00https://arnoutdevos.github.io/Learning-to-generate-images-with-Generative-Adversarial-Networks<p>Generative models try to model the distribution of the data in an explicit way, enabling us to easily sample new data points from this model. This is in contrast to discriminative models that try to infer the output from the input. A classic deep generative model is the Variational Autoencoder (VAE). Here, another generative model that has risen to prominence in recent years, the Generative Adversarial Network (GAN), will be discussed.</p>
<p>As the maths of Generative Adversarial Networks is somewhat tedious, a story is often told of a forger and a policeman to illustrate the idea.</p>
<blockquote>
<p>Imagine a forger that makes fake bills, and a policeman that tries to find these forgeries. If the forger were a VAE, his goal would be to take some real bills, and try to replicate the real bills as precisely as possible. In GAN, he has a different idea in his mind: rather than trying to replicate the real bills, it suffices to make fake bills such that people <em>think</em> they are real.</p>
<p>Now let’s start. In the beginning, the policeman knows nothing about how to distinguish between real and fake bills. The forger knows nothing either and only produces white paper.</p>
<p>In the first round, the policeman gets the fake bill and learns that the forgeries are white while the real bills are green. The forger then finds out that white papers can no longer fool the policeman and starts to produce green papers.</p>
<p>In the second round, the policeman learns that real bills have denominations printed on them while the forgeries do not. The forger then finds out that plain papers can no longer fool the policeman and starts to print numbers on them.</p>
<p>In the third round, the policeman learns that real bills have watermarks on them while the forgeries do not. The forger then has to reproduce the watermarks on his fake bills.</p>
<p>…</p>
<p>Finally, the policeman is able to spot the tiniest difference between real and fake bills and the forger has to make perfect replicas of real bills to fool the policeman.</p>
</blockquote>
<p>Now in a GAN, the forger becomes the generator and the policeman becomes the discriminator. The discriminator is a binary classifier with the two classes being “taken from the real data” (“real”) and “generated by the generator” (“fake”). Its objective is to minimize the classification loss. The generator’s objective is to generate samples such that the discriminator misclassifies them as real.</p>
<p>Here we have some complications: the goal is not to find one perfect fake sample. Such a sample will not actually fool the discriminator: if the forger makes hundreds of the exact same fake bill, they will all have the same serial number and the policeman will soon find out that they are fake. Instead, we want the generator to be able to generate a variety of fake samples such that when presented as a distribution alongside the distribution of real samples, these two are indistinguishable by the discriminator.</p>
<p>So how do we generate different samples with a deterministic generator? We provide it with random numbers as input.</p>
<p>Typically, for the discriminator we use binary cross entropy loss with label 1 being real and 0 being fake. For the generator, the input is a random vector drawn from a standard normal distribution. Denote the generator by @@G_{\phi}(z)@@, discriminator by @@D_{\theta}(x)@@, the distribution of the real samples by @@p(x)@@ and the input distribution to the generator by @@q(z)@@. Recall that the binary cross entropy loss with classifier output @@y@@ and label @@\hat{y}@@ is</p>
<script type="math/tex; mode=display">L(y, \hat{y}) = -\hat{y} \log y - (1 - \hat{y}) \log (1 - y)</script>
<p>For the discriminator, the objective is</p>
<script type="math/tex; mode=display">\min_{\theta} \mathrm{E}_{x \sim p(x)}[L(D_{\theta}(x), 1)] + \mathrm{E}_{z \sim q(z)}[L(D_{\theta}(G_{\phi}(z)), 0)]</script>
<p>For the generator, the objective is</p>
<script type="math/tex; mode=display">\max_{\phi} \mathrm{E}_{z \sim q(z)}[L(D_{\theta}(G_{\phi}(z)), 0)]</script>
<p>The generator’s objective corresponds to maximizing the classification loss of the discriminator on the generated samples. Alternatively, we can <strong>minimize</strong> the classification loss of the discriminator on the generated samples <strong>when labelled as real</strong>:</p>
<script type="math/tex; mode=display">\min_{\phi} \mathrm{E}_{z \sim q(z)}[L(D_{\theta}(G_{\phi}(z)), 1)]</script>
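<p>As an illustrative sketch (not the implementation this post refers to), the alternating updates with this relabelled generator loss might look as follows in PyTorch; the toy data distribution, network sizes and learning rates below are arbitrary assumptions:</p>

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
G = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 2))  # generator G_phi(z)
D = nn.Sequential(nn.Linear(2, 16), nn.ReLU(), nn.Linear(16, 1))  # discriminator D_theta(x)
bce = nn.BCEWithLogitsLoss()  # binary cross entropy on logits
opt_G = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_D = torch.optim.Adam(D.parameters(), lr=1e-3)

for step in range(20):
    real = 4.0 + torch.randn(32, 2)  # samples x ~ p(x), a toy distribution
    z = torch.randn(32, 8)           # inputs z ~ q(z)
    ones, zeros = torch.ones(32, 1), torch.zeros(32, 1)

    # Discriminator step: real samples labelled 1, generated samples labelled 0
    d_loss = bce(D(real), ones) + bce(D(G(z).detach()), zeros)
    opt_D.zero_grad(); d_loss.backward(); opt_D.step()

    # Generator step: generated samples labelled as real (label 1)
    g_loss = bce(D(G(z)), ones)
    opt_G.zero_grad(); g_loss.backward(); opt_G.step()
```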
<p>This relabelled loss is what we will use in our implementation. The strengths of the two networks should be balanced, so we train them alternately, updating the parameters of both networks once in each iteration.</p>Arnout DevosGenerative models try to model the distribution of the data in an explicit way, enabling us to easily sample new data points from this model. This is in contrast to discriminative models that try to infer the output from the input. A classic deep generative model is the Variational Autoencoder (VAE). Here, another generative model that has risen to prominence in recent years, the Generative Adversarial Network (GAN), will be discussed.Adding math to Jekyll themed (GitHub) website2017-12-17T00:00:00+00:002017-12-17T00:00:00+00:00https://arnoutdevos.github.io/Adding-math-to-Jekyll-themed-(Github)-website<p>In order to use MathJax code in your Jekyll website you can add the following code to <code class="language-plaintext highlighter-rouge">_includes/scripts.html</code>:
<script src="https://gist.github.com/ArnoutDevos/51796733969a3412bfdcb8d079b71bfd.js"></script></p>
<p>All the (math) text that is between <code class="language-plaintext highlighter-rouge">$$</code> tags will be displayed as a formula on a separate line, and the <code class="language-plaintext highlighter-rouge">@@</code> tag is used for inline math.</p>
<p>Then, in each post where you want to use math you need to add <code class="language-plaintext highlighter-rouge">mathjax: true</code> to the YAML header. This lets us use inline math such as @@\int^b_a \int f(x,y)\,\mathrm{d}x\mathrm{d}y@@, and math on a separate line:</p>
<p><script type="math/tex">{n+1\choose k} = {n\choose k} + {n \choose k-1}</script>.</p>
<p>Note that, contrary to regular LaTeX practice, you want to avoid using a single <code class="language-plaintext highlighter-rouge">$</code> as the inline math delimiter. The reason is that you might want to use the dollar sign <code class="language-plaintext highlighter-rouge">$</code> for prices without the text in between being parsed as math. For example, the sentence <code class="language-plaintext highlighter-rouge">Each apple costs $1 and a pear $2.</code> would be parsed as: Each apple costs @@1 and a pear @@2.</p>Arnout DevosIn order to use MathJax code in your Jekyll website you can add the following code to _includes/scripts.html:Modeling Language with Recurrent Neural Networks2017-12-17T00:00:00+00:002017-12-17T00:00:00+00:00https://arnoutdevos.github.io/Modeling-Language-with-Recurrent-Neural-Networks<p>Recurrent neural networks (RNNs), such as long short-term memory networks (LSTMs), serve as a fundamental building block for many sequence learning tasks, including machine translation, language modeling, and question answering. In this post a basic recurrent neural network (RNN), a deep neural network structure, is implemented from scratch in Python. An improved version, the Long Short-Term Memory (LSTM) architecture, which can deal better with long-term information, is implemented as well.</p>
<h2 id="recurrent-neural-network-rnn">Recurrent Neural Network (RNN)</h2>
<h1 id="forward-pass">Forward pass</h1>
<p>A forward pass of an RNN takes as input both the current features <code class="language-plaintext highlighter-rouge">x</code> and the hidden state output by the previous step, <code class="language-plaintext highlighter-rouge">prev_h</code>. A hyperbolic tangent (tanh) activation function is used.</p>
<script type="math/tex; mode=display">h_{next} = \tanh(W_x x + W_h h_{prev} + b)</script>
<p>Note that, although they could be concatenated and multiplied by one big weight matrix, for interpretability a separation in weights @@W_x@@ and @@W_h@@ is chosen.</p>
<script src="https://gist.github.com/ArnoutDevos/9c71c0114ebfa4f83ac2d74711e53f6d.js"></script>
<p>In the code, <code class="language-plaintext highlighter-rouge">x</code> is the input feature vector of size <code class="language-plaintext highlighter-rouge">(N,D)</code> and <code class="language-plaintext highlighter-rouge">prev_h</code> is the hidden state from the previous timestep of size <code class="language-plaintext highlighter-rouge">(N,H)</code>. The <code class="language-plaintext highlighter-rouge">meta</code> variable stores those variables needed for the backward pass.</p>
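<p>Since the gist is hosted externally, here is a minimal NumPy sketch of this forward step (variable names mirror the post’s; an illustration, not the gist’s exact code):</p>

```python
import numpy as np

def rnn_step_forward(x, prev_h, Wx, Wh, b):
    """Single RNN step: x (N, D), prev_h (N, H), Wx (D, H), Wh (H, H), b (H,)."""
    next_h = np.tanh(x @ Wx + prev_h @ Wh + b)
    meta = (x, prev_h, Wx, Wh, next_h)  # cache what the backward pass needs
    return next_h, meta
```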
<h1 id="backward-pass">Backward pass</h1>
<p>In the backward pass of the RNN, all the necessary gradients are calculated to update the weights in both the @@W_x@@ and @@W_h@@ matrices. Here the variables stored in the <code class="language-plaintext highlighter-rouge">meta</code> variable of the forward pass are used again. For completeness, all gradients are calculated, even those, such as @@dx@@, that are not strictly needed for updating the weights.</p>
<script type="math/tex; mode=display">\text{d}\tanh = \text{d}h_{next}(1-h_{next}^2)</script>
<script type="math/tex; mode=display">\text{d}W_x = x^T\text{d}\tanh</script>
<script type="math/tex; mode=display">\text{d}W_h = h_{prev}^T\text{d}\tanh</script>
<script type="math/tex; mode=display">\text{d}x = \text{d}\tanh\,W_x^T</script>
<script type="math/tex; mode=display">\text{d}h_{prev} = \text{d}\tanh\,W_h^T</script>
<script type="math/tex; mode=display">\text{d}b = \textstyle\sum_n \text{d}\tanh_n</script>
<script src="https://gist.github.com/ArnoutDevos/29f6afb5b6da3091a7c4696e31f85004.js"></script>
<p>In the code,</p>
<ul>
<li><code class="language-plaintext highlighter-rouge">dnext_h</code>: gradients with respect to the next hidden state,</li>
<li><code class="language-plaintext highlighter-rouge">meta</code>: the variables needed for the backward pass,</li>
<li><code class="language-plaintext highlighter-rouge">dx</code>: gradients of the input features <code class="language-plaintext highlighter-rouge">(N, D)</code></li>
<li><code class="language-plaintext highlighter-rouge">dprev_h</code>: gradients of previous hidden state <code class="language-plaintext highlighter-rouge">(N, H)</code></li>
<li><code class="language-plaintext highlighter-rouge">dWh</code>: gradients w.r.t. hidden-to-hidden weights <code class="language-plaintext highlighter-rouge">(H, H)</code></li>
<li><code class="language-plaintext highlighter-rouge">dWx</code>: gradients w.r.t. feature-to-hidden weights <code class="language-plaintext highlighter-rouge">(D, H)</code></li>
<li><code class="language-plaintext highlighter-rouge">db</code>: gradients w.r.t bias <code class="language-plaintext highlighter-rouge">(H,)</code></li>
</ul>
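<p>Putting the gradients together, the backward step can be sketched in NumPy as follows (an illustration, not the gist’s exact code):</p>

```python
import numpy as np

def rnn_step_backward(dnext_h, meta):
    """dnext_h: (N, H) upstream gradient; meta: cache from the forward pass."""
    x, prev_h, Wx, Wh, next_h = meta
    dtanh = dnext_h * (1 - next_h ** 2)  # backprop through tanh
    dWx = x.T @ dtanh                    # (D, H)
    dWh = prev_h.T @ dtanh               # (H, H)
    dx = dtanh @ Wx.T                    # (N, D)
    dprev_h = dtanh @ Wh.T               # (N, H)
    db = dtanh.sum(axis=0)               # (H,)
    return dx, dprev_h, dWx, dWh, db
```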
<h2 id="long-short-term-network">Long Short-Term Memory (LSTM)</h2>Arnout DevosRecurrent neural networks (RNNs), such as long short-term memory networks (LSTMs), serve as a fundamental building block for many sequence learning tasks, including machine translation, language modeling, and question answering. In this post a basic recurrent neural network (RNN), a deep neural network structure, is implemented from scratch in Python. An improved version, the Long Short-Term Memory (LSTM) architecture, which can deal better with long-term information, is implemented as well.A Neural Algorithm of Artistic Style2017-12-03T00:00:00+00:002017-12-03T00:00:00+00:00https://arnoutdevos.github.io/A-Neural-Algorithm-of-Artistic-Style<p>In this post we will implement the style transfer technique from the paper <a href="https://arxiv.org/abs/1508.06576">A Neural Algorithm of Artistic Style</a>. The general idea is to take two images, and produce a new image that reflects the content of one but the artistic “style” of the other. We will do this by first formulating a loss function that matches the content and style of each respective image in the feature space of a deep neural network, and then performing gradient descent on the pixels of the image itself.</p>
<p>A (less verbose) runnable Python file can be found <a href="https://github.com/ArnoutDevos/StyleTransferTensorFlow">on GitHub</a>.</p>
<figure class="third ">
<a href="https://raw.githubusercontent.com/ArnoutDevos/StyleTransferTensorFlow/master/images/muse.jpg" title="Style image: Muse">
<img src="https://raw.githubusercontent.com/ArnoutDevos/StyleTransferTensorFlow/master/images/muse.jpg" alt="" />
</a>
<a href="https://raw.githubusercontent.com/ArnoutDevos/StyleTransferTensorFlow/master/images/hollywood_sign.jpg" title="Content image: Hollywood sign">
<img src="https://raw.githubusercontent.com/ArnoutDevos/StyleTransferTensorFlow/master/images/hollywood_sign.jpg" alt="" />
</a>
<a href="https://raw.githubusercontent.com/ArnoutDevos/StyleTransferTensorFlow/master/output/200.png" title="Result of StyleTransfer after 200 iterations">
<img src="https://raw.githubusercontent.com/ArnoutDevos/StyleTransferTensorFlow/master/output/200.png" alt="" />
</a>
<figcaption>Two input images and one output image of the Style Transfer algorithm after 200 iterations.
</figcaption>
</figure>
<h2 id="model">Model</h2>
<p>We want to take advantage of a CNN structure which (implicitly) understands image contents and styles. Rather than training a completely new model from scratch, we will use a pre-trained model to achieve our purpose, an approach called “transfer learning”.</p>
<p>We will use the VGG19 model. Since the model itself is very large (>500 MB), you will need to download it and put it under the model/ folder. The comments below describe the dimensions of the VGG19 model. We will replace the max pooling layers with average pooling layers as the paper suggests, and discard all fully connected layers.</p>
<p>We use the VGG 19-layer model from the paper “Very Deep Convolutional Networks for Large-Scale Image Recognition” and store its path in the variable VGG_MODEL. In order to use this VGG model, we need to subtract the mean of the images originally used to train it from our new input images, to be consistent. This affects performance greatly.</p>
<p>The <code class="language-plaintext highlighter-rouge">load_vgg_model()</code> function returns a model for the purpose of ‘painting’ the picture. It takes only the convolutional layer weights and wraps them using the TensorFlow Conv2d, Relu and AveragePooling layers.</p>
<script src="https://gist.github.com/ArnoutDevos/fb9654a0e9908f7e320046dfee36791a.js"></script>
<h2 id="input-images">Input images</h2>
<p>We need to define some constants for the inputs. We will be using RGB images with a 640 x 480 resolution, but you can easily modify the code to accommodate different sizes. Because the mean subtraction distorts the image visually, two helper functions <code class="language-plaintext highlighter-rouge">load_image()</code> and <code class="language-plaintext highlighter-rouge">recover_image()</code> are used.</p>
<script src="https://gist.github.com/ArnoutDevos/e2e3a1734b81930f0719138bd156fd6b.js"></script>
<p>Using the previously defined helper functions we can load the input images. The VGG model expects image data with <code class="language-plaintext highlighter-rouge">MEAN_VALUES</code> subtracted to function correctly; <code class="language-plaintext highlighter-rouge">load_image()</code> already handles this. The subtracted images will look funny.</p>
<script src="https://gist.github.com/ArnoutDevos/85c4862df37582f7380269585a94b90e.js"></script>
<h2 id="random-image-generator">Random Image Generator</h2>
<p>The first step of style transfer is to generate a starting image. The model will then gradually adjust this starting image towards the target content/style. We will need a random image generator. The generated image can be arbitrary and doesn’t necessarily have anything to do with the content image, but generating something similar to the content image will reduce our computing time.</p>
<script src="https://gist.github.com/ArnoutDevos/84d5e1d3c93e781bae71d0c900292bb3.js"></script>
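<p>Since the gist is external, a plausible NumPy sketch of such a generator follows; the uniform noise range of ±20 is an assumption for illustration, not necessarily the gist’s exact values:</p>

```python
import numpy as np

def generate_noise_image(content_image, noise_ratio=0.6):
    """Blend uniform noise with the (mean-subtracted) content image."""
    noise = np.random.uniform(-20.0, 20.0, content_image.shape).astype(np.float32)
    return noise_ratio * noise + (1.0 - noise_ratio) * content_image
```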
<p>Now we can check by visualizing the images we generated. Keep in mind that <code class="language-plaintext highlighter-rouge">noise_ratio = 0.0</code> produces the original subtracted image, while <code class="language-plaintext highlighter-rouge">noise_ratio = 1.0</code> produces completely random noise. Notice that the visualized images are not necessarily clearer with a lower <code class="language-plaintext highlighter-rouge">noise_ratio</code>; the image sometimes looks sharper when an intermediate level of noise is added. Because the noise is random, borders and edges stick out compared to more regular surfaces, which leads to a ‘sharper’ perception. This of course only holds when the noise level is low enough not to make the borders and edges disappear completely (the noise is summed with the original image).</p>
<script src="https://gist.github.com/ArnoutDevos/7f323379ca6664252fb93a7a0f87afc5.js"></script>
<h1 id="loss-functions">Loss functions</h1>
<p>Once we generate a new image, we would like to evaluate by how much it maintains contents while approaching the target style. This can be defined by a loss function. The loss function is a weighted sum of two terms: <em>content loss</em> and <em>style loss</em>.</p>
<h2 id="content-loss">Content Loss</h2>
<p>Let’s first write the content loss function of equation (1) from the paper. Content loss measures how much the feature map of the generated image differs from the feature map of the source image. We only care about the content representation of one layer of the network (say, layer @@\ell@@), that has feature maps @@A^\ell \in \mathbb{R}^{1 \times H_\ell \times W_\ell \times N_\ell}@@. @@N_\ell@@ is the number of filters/channels in layer @@\ell@@, @@H_\ell@@ and @@W_\ell@@ are the height and width. We will work with reshaped versions of these feature maps that combine all spatial positions into one dimension. Let @@F^\ell \in \mathbb{R}^{M_\ell \times N_\ell}@@ be the feature map for the current image and @@P^\ell \in \mathbb{R}^{M_\ell \times N_\ell}@@ be the feature map for the content source image where @@M_\ell=H_\ell\times W_\ell@@ is the number of elements in each feature map. Each column of @@F^\ell@@ or @@P^\ell@@ represents the vectorized activations of a particular filter, convolved over all positions of the image.</p>
<p>Then the content loss is given by:</p>
<script type="math/tex; mode=display">L_c = \frac{1}{2} \sum_{i,j} (F_{ij}^{\ell} - P_{ij}^{\ell})^2</script>
<p>We are only concerned with the “conv4_2” layer of the model.</p>
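<p>In NumPy terms, this loss is a simple half sum of squared differences (an illustrative sketch; the gist below implements the TensorFlow version):</p>

```python
import numpy as np

def content_loss(F, P):
    """F, P: (M_l, N_l) reshaped feature maps of the generated and content images."""
    return 0.5 * np.sum((F - P) ** 2)
```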
<script src="https://gist.github.com/ArnoutDevos/6bcc3e5f5baff5703aee969150c7acfc.js"></script>
<h2 id="style-loss">Style Loss</h2>
<p>Now we can tackle the style loss of equation (5) from the paper. For a given layer @@\ell@@, the style loss is defined as follows:</p>
<p>First, compute the Gram matrix G which represents the correlations between the responses of each filter, where F is as above. The Gram matrix is an approximation to the covariance matrix – we want the activation statistics of our generated image to match the activation statistics of our style image, and matching the (approximate) covariance is one way to do that. There are a variety of ways you could do this, but the Gram matrix is nice because it’s easy to compute and in practice shows good results.</p>
<p>Given a feature map @@F^\ell@@ of shape @@(1, M_\ell, N_\ell)@@, the Gram matrix has shape @@(1, N_\ell, N_\ell)@@ and its elements are given by:</p>
<script type="math/tex; mode=display">G_{ij}^\ell = \sum_k F^{\ell}_{ki} F^{\ell}_{kj}</script>
<p>Assuming @@G^\ell@@ is the Gram matrix from the feature map of the current image, @@A^\ell@@ is the Gram Matrix from the feature map of the source style image, then the style loss for the layer @@\ell@@ is simply the Euclidean distance between the two Gram matrices:</p>
<script type="math/tex; mode=display">E_\ell = \frac{1}{4 N^2_\ell M^2_\ell} \sum_{i, j} \left(G^\ell_{ij} - A^\ell_{ij}\right)^2</script>
<p>In practice we usually compute the style loss at a set of layers @@\mathcal{L}@@ rather than just a single layer @@\ell@@; then the total style loss is the weighted sum of style losses at each layer by @@w_\ell@@:</p>
<script type="math/tex; mode=display">L_s = \sum_{\ell \in \mathcal{L}} w_\ell E_\ell</script>
<p>In our case it is a summation from conv1_1 (a lower layer) to conv5_1 (a higher layer). Intuitively, the style loss across multiple layers captures everything from lower-level features (hard strokes, points, etc.) to higher-level ones (styles, patterns, even objects).
<script src="https://gist.github.com/ArnoutDevos/0cb0328aa09633d0abb057de7362234d.js"></script></p>
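<p>For illustration, the Gram matrix and the per-layer style loss can be sketched in NumPy as follows (a sketch; the gist above implements the TensorFlow version):</p>

```python
import numpy as np

def gram_matrix(F):
    """F: (M_l, N_l) reshaped feature map -> (N_l, N_l) Gram matrix."""
    return F.T @ F

def layer_style_loss(F, F_style):
    """Style loss E_l between generated and style feature maps of shape (M_l, N_l)."""
    M, N = F.shape
    G = gram_matrix(F)
    A = gram_matrix(F_style)
    return np.sum((G - A) ** 2) / (4.0 * (N ** 2) * (M ** 2))
```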
<h2 id="building-a-tensorflow-session-and-model">Building a TensorFlow session and model</h2>
<script src="https://gist.github.com/ArnoutDevos/f42414e971d159a42e34255b35369701.js"></script>
<h2 id="total-loss">Total loss</h2>
<p>The last thing we need to do is the most important: define the loss function to optimize. The total loss is a weighted sum of the content loss and the style loss, weighted by @@\alpha@@ and @@\beta@@ respectively.</p>
<script type="math/tex; mode=display">L = \alpha L_c + \beta L_s</script>
<script src="https://gist.github.com/ArnoutDevos/92c9a9a096762a5bf668e0889840d7e6.js"></script>
<h2 id="run">Run</h2>
<p>Now we run the model which saves the resulting image every 50 iterations. You can find those intermediate images in the <code class="language-plaintext highlighter-rouge">output/</code> folder.</p>
<script src="https://gist.github.com/ArnoutDevos/ca35f2f5ac8c860f3a171eba113fa9c5.js"></script>
<p>Note that it usually takes almost half an hour to run just 50 iterations on the CPU of a 2015 MacBook Pro 15” base model. Personally, I preferred using Google Cloud Platform, which sped up the calculations a lot (50 iterations took me 5 minutes instead).</p>Arnout DevosIn this post we will implement the style transfer technique from the paper A Neural Algorithm of Artistic Style. The general idea is to take two images, and produce a new image that reflects the content of one but the artistic “style” of the other. We will do this by first formulating a loss function that matches the content and style of each respective image in the feature space of a deep neural network, and then performing gradient descent on the pixels of the image itself.Humanoid Imitation Learning from Diverse Sources2017-12-01T00:00:00+00:002017-12-01T00:00:00+00:00https://arnoutdevos.github.io/Humanoid-Imitation-Learning-from-Diverse-Sources<p><img src="https://uscresl.github.io/humanoid-gail/architecture.svg" class="image mod-full-width" /></p>
<p><a href="https://uscresl.github.io/humanoid-gail/">This link</a> describes our experience implementing a system which learns locomotion skills for humanoid skeletons from imitation, and all of the supporting infrastructure and data processing necessary to do so. Our system employs a generative adversarial imitation learning (GAIL) architecture, which is a type of generative adversarial network. We successfully trained our GAIL to control a custom-designed humanoid skeleton, using expert demonstrations from a reinforcement-learned (RL) policy using that skeleton. We also explored several methods for deriving real human motion demonstrations from video and developed a preprocessing pipeline for motion capture data. Our system is a work in progress, making it the foundation for several possible future research projects.</p>
<p>The complete post can be found <a href="https://uscresl.github.io/humanoid-gail/">here</a>.</p>Arnout DevosViterbi Pseudocode2017-11-07T00:00:00+00:002017-11-07T00:00:00+00:00https://arnoutdevos.github.io/Viterbi-Pseudocode<p>This version of the popular Viterbi algorithm assumes that all the input values are given in log-probabilities. Therefore summations are used instead of multiplications. The input sentence has <code class="language-plaintext highlighter-rouge">N</code> words, and we are trying to assign a label to each word, chosen from a set of <code class="language-plaintext highlighter-rouge">L</code> labels.</p>
<h2 id="viterbi-algorithm-pseudocode">Viterbi algorithm pseudocode</h2>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>viterbi(Emission, Trans, Start, End):
# Set internal variables
s <- 0.0
Y <- []
Trellis <- empty NxL matrix
Backpointers <- empty (N-1)xL matrix
# Set first row of Trellis
Trellis[0, :] <- Start + Emission[0,:]
# Construct rest of Trellis table and keep Backpointers table
for each word i in {1, ..., N-1}:
for each label j in {0, ..., L-1}:
Trellis[i,j] <- Emission[i,j] + max_k(Trans[k,j] + Trellis[i-1,k])
Backpointers[i-1,j] <- argmax_k(Trans[k,j] + Trellis[i-1,k])
# Calculate total score s, last backpointer b_next, add b_next to the result Y
s <- max_k(End[k] + Trellis[N-1,k])
b_next <- argmax_k(End[k] + Trellis[N-1,k])
Y[N-1] <- b_next
# Backtrack through the Backpointers table for the remaining N-1 words
for word i in {N-2, ..., 0}:
b_next <- Backpointers[i, b_next]
Y[i] <- b_next
return (s,Y)
</code></pre></div></div>
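<p>The pseudocode above translates fairly directly into NumPy; the following is an illustrative sketch under the same log-probability conventions:</p>

```python
import numpy as np

def viterbi(Emission, Trans, Start, End):
    """All inputs are log-probabilities; see the explanation below."""
    N, L = Emission.shape
    Trellis = np.empty((N, L))
    Backpointers = np.empty((N - 1, L), dtype=int)
    # First row of the Trellis
    Trellis[0, :] = Start + Emission[0, :]
    # Fill the rest of the Trellis and keep Backpointers
    for i in range(1, N):
        scores = Trans + Trellis[i - 1][:, None]  # scores[k, j] = Trans[k,j] + Trellis[i-1,k]
        Trellis[i] = Emission[i] + scores.max(axis=0)
        Backpointers[i - 1] = scores.argmax(axis=0)
    # Total score and last backpointer
    s = (End + Trellis[N - 1]).max()
    b_next = int((End + Trellis[N - 1]).argmax())
    Y = [0] * N
    Y[N - 1] = b_next
    # Backtrack for the remaining N-1 words
    for i in range(N - 2, -1, -1):
        b_next = int(Backpointers[i, b_next])
        Y[i] = b_next
    return s, Y
```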
<p><strong>Explanation</strong></p>
<p>The algorithm takes 2 matrices and 2 vectors as input:</p>
<ul>
<li>
<p><em><code class="language-plaintext highlighter-rouge">Emission</code></em> is an <code class="language-plaintext highlighter-rouge">NxL</code> matrix storing the log-probability of observing word n, given a label l</p>
<p><code class="language-plaintext highlighter-rouge">P(n|l) = Emission[n,l]</code></p>
</li>
<li>
<p><em><code class="language-plaintext highlighter-rouge">Trans</code></em> is an <code class="language-plaintext highlighter-rouge">LxL</code> matrix storing the transition log-probability from the previous label (Yp) to the current label (Yc)</p>
<p><code class="language-plaintext highlighter-rouge">P(Yc|Yp) = Trans[Yp,Yc]</code></p>
</li>
<li>
<p><em><code class="language-plaintext highlighter-rouge">Start</code></em> is a <code class="language-plaintext highlighter-rouge">Lx1</code> vector storing the transition log-probability of the beginning of a sentence <code class="language-plaintext highlighter-rouge"><s></code> to every label <code class="language-plaintext highlighter-rouge">l</code></p>
<p><code class="language-plaintext highlighter-rouge">P(l|<s>) = Start[l]</code></p>
</li>
<li>
<p><em><code class="language-plaintext highlighter-rouge">End</code></em> is a <code class="language-plaintext highlighter-rouge">Lx1</code> vector storing the transition log-probability from the label <code class="language-plaintext highlighter-rouge">l</code> of the last word to the end of the sentence <code class="language-plaintext highlighter-rouge"></s></code></p>
<p><code class="language-plaintext highlighter-rouge">P(</s>|l) = End[l]</code></p>
</li>
</ul>
<p>Internally, the <code class="language-plaintext highlighter-rouge">NxL</code> matrix <em><code class="language-plaintext highlighter-rouge">Trellis[i,l]</code></em> stores the score of the <em>best</em> sequence from 1 … i such that l<sub>i</sub> = l. The <code class="language-plaintext highlighter-rouge">(N-1)xL</code> <em><code class="language-plaintext highlighter-rouge">Backpointers</code></em> matrix tracks from which previous label the calculated optimal score for each cell came. Note that <em><code class="language-plaintext highlighter-rouge">Backpointers</code></em> has one row fewer than <em><code class="language-plaintext highlighter-rouge">Trellis</code></em> as the last backpointer can be stored in a single variable (<code class="language-plaintext highlighter-rouge">b_next</code>), before the backtracking starts.</p>Arnout DevosThis version of the popular Viterbi algorithm assumes that all the input values are given in log-probabilities. Therefore summations are used instead of multiplications. The input sentence has N words, and we are trying to assign a label to each word, chosen from a set of L labels. Viterbi algorithm pseudocodeStudent Startup Forum 20172017-03-27T00:00:00+00:002017-03-27T00:00:00+00:00https://arnoutdevos.github.io/Student-Startup-Forum-2017<p>On March 27th, 2017, <a href="http://www.afcleuven.be/">Academics For Companies</a> organized the sixth edition of Belgium’s biggest startup event for students: <a href="https://arnoutdevos.github.io/assets/html/stst2017/">Student Startup Forum 2017</a>. It served as a platform for students to think about entrepreneurship as a career option. Six keynote speakers, from ambitious youngster to industry leader, shared their experiences and outlook. Personal testimonies were given by 15 carefully selected startups, from high-tech spinoff to social entrepreneur, while workshops provided a more hands-on experience.</p>
<iframe width="560" height="315" src="https://www.youtube.com/embed/kjtW_oByZCA?rel=0&showinfo=0" frameborder="0" gesture="media" allow="encrypted-media" allowfullscreen=""></iframe>Arnout DevosOn March 27th, 2017, Academics For Companies organized the sixth edition of Belgium’s biggest startup event for students: Student Startup Forum 2017. It served as a platform for students to think about entrepreneurship as a career option. 6 keynote speakers, from ambitious youngster to industry leader, shared their experiences and outlook. Personal testimonies were given by 15 carefully selected startups, from high-tech spinoff to social entrepreneur, while workshops provided a more hands-on experience.Dragon self-balancing two-wheel robot RF link2014-05-01T00:00:00+00:002014-05-01T00:00:00+00:00https://arnoutdevos.github.io/Dragon-EECS-robot<p>In 2013, <a href="https://www.kuleuven.be/wieiswie/en/person/00101844">Jona Beysens</a>, <a href="https://www.kuleuven.be/wieiswie/en/person/00104326">Andreas Van Barel</a>, and I were concerned with constructing a radiofrequency (RF) link from scratch on a Xilinx FPGA, using MATLAB Simulink for our Bachelor’s in EECS project. Our design and its progress can be seen in the following two posters:</p>
<figure class="half">
<a href="https://arnoutdevos.github.io/assets/pdf/dragonrfA1.pdf"><img src="https://raw.githubusercontent.com/ArnoutDevos/ArnoutDevos.github.io/master/assets/images/DragonPoster1.png" height="150" /></a>
<a href="https://arnoutdevos.github.io/assets/pdf/dragonrfA1Testing.pdf"><img src="https://raw.githubusercontent.com/ArnoutDevos/ArnoutDevos.github.io/master/assets/images/DragonPoster2.png" height="150" /></a>
<figcaption>Two posters describing our design and its progress.</figcaption>
</figure>
<p>Our RF link was part of a bigger project with 20 students to build a two-wheel self-balancing robot (similar to a segway). The final robot is shown in the video below:</p>
<iframe width="560" height="315" src="https://www.youtube.com/embed/RHjh0bwaSz8?rel=0&showinfo=0" frameborder="0" gesture="media" allow="encrypted-media" allowfullscreen=""></iframe>
<p>In addition to our functional RF link, I created an HTML5 web app which uses the gyroscope and accelerometer of any mobile phone (Android, iOS, …) to steer the robot. It uses a PHP and Perl backend to send commands from the webserver through a serial port (USB) to our base station FPGA, which then transmits the data wirelessly to the robot FPGA.</p>Arnout DevosIn 2013, Jona Beysens, Andreas Van Barel, and I were concerned with constructing a radiofrequency (RF) link from scratch on a Xilinx FPGA, using MATLAB Simulink for our Bachelor’s in EECS project. Our design and its progress can be seen in the following two posters: