Hello! Welcome to the latest blog post in my Google Summer of Code series. With this blog, we’ll be marking the end of the second phase of evaluations! Additionally, this blog is going to be a little different from all the previous ones, as we finally start to work with and discuss Python and the actual RADIS code base.
Hey! Continuing from where we left off last time, in this blog I’ll be talking about the work I did between 1st and 15th July. As I mentioned in the previous blog, my primary task for this period was to get the Cython/CuPy version of our proof-of-work code to produce the correct output. I had initially expected the work to be relatively easy, as the program already produced output; I simply had to check its logic to identify why that output wasn’t correct. However, the work turned out to be a lot more painful than I had imagined. I touched upon this briefly in the last blog as well, but the first thing I had to do was make sure that the debugging tools and methods were in place. This, fortunately, was one of the easier parts of the task. The problem was twofold, and thus required two different solutions. First, loading the dataset into RAM was a challenging task for my computer. I am not sure exactly why that was the case, but it was nowhere near as performant as the C++ code, which loaded the exact same data in a relative breeze. I looked around the internet to understand what could possibly be slowing down the loading step so much, given that it was quite simply a single line: v0 = np.load(dir+path+'v0.npy'). Each file was around 400 MB, and there were 8 of them in total, pushing the memory required to around 3 GB. The solution to this problem was relatively straightforward, and felt almost as if it had been hiding in plain sight when I did find it. The core idea is that when we load a numpy array the way I was doing, without having previously declared the variable as a C type, Cython quite naturally assumes it to be a pure Python variable and therefore fails to deliver the performance boost it promises on compilation. This, however, was a trivial issue to resolve.
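To make the loading step and the memory estimate concrete, here is a minimal Python sketch. It is not the project code: the helper names load_and_report and estimated_total_gb are hypothetical, the float32 dtype is an assumption, and a small temporary array stands in for the real 400 MB files so that it runs anywhere.

```python
import os
import tempfile
import time

import numpy as np


def load_and_report(path):
    """Load one .npy file and return the array plus the load time in seconds."""
    start = time.perf_counter()
    arr = np.load(path)
    elapsed = time.perf_counter() - start
    return arr, elapsed


def estimated_total_gb(bytes_per_file, n_files):
    """Estimate the RAM needed for n_files equally sized arrays, in GB (10**9 bytes)."""
    return bytes_per_file * n_files / 1e9


# Small stand-in file (assumed float32); the real files were around 400 MB each.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "v0.npy")
    np.save(path, np.zeros(1_000_000, dtype=np.float32))
    v0, seconds = load_and_report(path)

print(f"loaded {v0.nbytes / 1e6:.1f} MB in {seconds:.4f} s")
print(f"8 files of 400 MB need about {estimated_total_gb(400e6, 8):.1f} GB")
```

Timing a single load like this only gives a rough picture, since the number depends heavily on the machine and on whatever else is running.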
All I had to do to leverage the advantage promised by Cython was to declare the numpy arrays before loading the datasets into them. This was done with a simple line: cdef np.ndarray[dtype=np.float32_t, ndim=1] v0 = np.zeros(N_points, dtype=np.float32). That was it! With just this single addition, the array was now a C-typed variable and was processed significantly faster than the older pure numpy arrays. Unfortunately, I didn’t benchmark the difference, as there are still some things about this part that I am not confident about. I am not sure if it’s due to observation bias or some other external factor, but I felt that the speed of loading the data varied quite significantly even with the same binaries. I am not sure if this is due to some caching/optimizations being done under the hood by the compiler itself, but whatever the cause, it would make benchmarking without controlling for these external factors a futile exercise. The second issue that made debugging difficult was that my GPU did not automatically kill the Python/CuPy tasks once the program finished execution. I searched around Stack Overflow and found that it is actually a rather common issue with CuPy. As a result, it didn’t take long before I found a makeshift solution for this problem as well. All I had to do once the program had finished execution was to call some specific CuPy methods to free the memory, and it worked just fine! With these two issues sorted, I had a much better setup in place to debug the code without being forced to restart the computer or wait 10 minutes for the RAM to clear up every time the program finished execution!
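For the GPU cleanup, the post does not name the exact calls, so the following is only a sketch of one common approach. It assumes that releasing the cached blocks of CuPy's default device and pinned memory pools is what was needed. The function name free_gpu_memory is hypothetical; the pool methods themselves are part of CuPy's documented API.

```python
def free_gpu_memory(cp=None):
    """Release cached blocks held by CuPy's default memory pools.

    Pass an already imported cupy module as cp, or leave it as None to
    import CuPy lazily. Returns True if the pools were released and
    False if CuPy is not installed.
    """
    if cp is None:
        try:
            import cupy as cp
        except ImportError:
            return False
    # Device memory cached by the default pool (assumed to be the pool in use).
    cp.get_default_memory_pool().free_all_blocks()
    # Pinned (page-locked) host memory cached for host/device transfers.
    cp.get_default_pinned_memory_pool().free_all_blocks()
    return True
```

Calling free_gpu_memory() as the last step of the script would hand the cached blocks back to the driver. Arrays that are still referenced are not freed, so any large CuPy arrays need to be deleted or go out of scope first.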
Hello! This blog marks the end of the first phase of Google Summer of Code. The journey so far has been challenging but also extremely rewarding. The knowledge I have gained as a by-product of the work on my project so far is unbelievable, but more importantly, it has been a better, more pleasant experience than the traditional approach of gaining knowledge by reading books and tutorials. In this blog, I will summarise the work that I’ve been doing for the past 2 weeks, update the readers on my current position, and give an idea of what lies ahead.
Hey there, welcome to the second blog of the series, and the first one to document the coding period. The Community bonding period, which I described in my previous blog, ended on 31st May and paved the way for the official coding period of the Google Summer of Code. These past two weeks were the first in which I spent most of my time working on the actual code that will be a part of my project. My primary objective over these two weeks was to study the proof-of-work code that implements the spectral matrix algorithm to compute the spectra, and to execute it on a GPU. This was followed by a period of studying the different mechanisms with which RADIS calculates spectra, and understanding the differences between them. This was important because implementing GPU-compatible methods for all these distinct pipelines is my final objective, and it is essential for me to understand the differences between these methods at the very outset of my project. Finally, the remaining time was spent on back-and-forth discussions with my mentors about the various languages and libraries that were possible choices for undertaking this project. Once we had made our decision, I spent the time going through the library’s documentation, source code and tutorials to familiarize myself with these tools.