Rene - Hi! Welcome to QuBites, your bite-sized pieces of quantum computing. My name is Rene from Valorem Reply, and today we're going to talk about what Nvidia is doing with quantum computing. For this, I'm very honored to have a special expert guest today, Jin-Sung Kim. Hi Jin, and welcome to the show. How are you today?
Jin-Sung - Hey Rene. It's great to be here, thanks for having me on the show!
Rene - Awesome! Well, first of all, tell us a little bit about yourself and your background as it relates to the topics about Quantum Computing, computer science, and all of these things we're talking about today.
Jin-Sung - Yeah, so I'm the developer relations manager for quantum computing at Nvidia, which means I'm in charge of the quantum ecosystem in regard to Nvidia's quantum products and software. My job is to enable researchers who are working on quantum and make sure that they have the best tools available to them. That means understanding what direction the field is going and making sure that we're building products and features that best enable folks within the field of quantum computing. A little bit about my background: it comes directly from the field of quantum computing. I did my PhD at Princeton in electrical engineering and materials science with Steve Lyon, where I studied electron spins in silicon quantum devices. I actually made some of the cleanest silicon quantum devices in the literature. I've been making this joke since I was a grad student, but the electron mobilities from the devices I made in grad school are still higher than the ones reported by Intel today, so I'm still quite proud of that. I then spent about three years as a research scientist at IBM Quantum, where I worked a bit higher in the stack, looking at qubits and characterizing noise in quantum systems before moving on to more applications and demonstrations. One of the papers I published while I was at IBM Quantum was an experimentally verified instance of quantum advantage, where we demonstrated that a quantum computer can actually do certain computations with higher fidelity than a classical computer when you restrict the computational space. For better or worse, the thing that people probably know me best for is my YouTube series from my time at IBM Quantum, Coding with Qiskit, where I taught folks how to program quantum algorithms using Qiskit. And in real life, people actually recognize me more for this than for any of my other work in quantum physics.
So, I started my career in materials science, with device-level quantum devices, then moved to qubit systems, improving gates and quantum applications, and now I'm in my role at Nvidia, where I have a bird's-eye view of the quantum ecosystem and help drive the software tools that enable the quantum-accelerated supercomputers of tomorrow.
Rene - Well, thank you, that's pretty impressive in fact. So you actually went through the whole stack, from materials science to developer relations.
Jin-Sung - I tried to cover the whole field from the ground up.
Rene - Awesome, awesome! Well, that's why we have you here, you're super knowledgeable. But let's talk about Nvidia and what Nvidia is doing in the quantum space. Of course, I'm watching GTC and all the other news and announcements, and throughout the last couple of years we have all been seeing that Nvidia is growing with quantum as well. Can you give us an overview and tell us which areas Nvidia is working on when it comes to quantum?
Jin-Sung - Yeah, so, in a nutshell, Nvidia is working on and has developed tools for everything in the quantum computing stack except for the actual quantum hardware. The first thing that we developed, which we released last year, is cuQuantum, a set of software libraries that accelerate quantum circuit simulation on Nvidia GPUs. A big part of driving the field forward is developing quantum algorithms and identifying what use cases will actually benefit from quantum acceleration when quantum advantage arrives. Fast simulation of large qubit systems in the gate model is a fantastic tool for this, since quantum computers today, you know, have gotten much, much better over the years, but they're still pretty noisy and expensive, and they're not really ubiquitous, so you can't just submit a job to a quantum processor and expect your results back immediately. In this sense, GPU-accelerated simulation is a great alternative tool. In terms of performance benchmarks, a lot of folks are not super familiar with how well GPUs work for these types of things. The benchmarks we've seen for quantum circuit simulation on GPUs are really, really fantastic. We've seen up to 300X improvements in performance running a quantum circuit simulation on a GPU system compared to a pretty powerful server-grade CPU. So, what this means is that if you have some large-scale simulation, say a 30-qubit variational quantum eigensolver, it might run on a CPU in, say, an hour, but with the 300X speedup it will run in a matter of seconds, which really, really improves your development cycle and saves researchers valuable time. The other main thrust that we've developed is a unified programming model and compiler for hybrid quantum-classical computing called QODA, or the Quantum Optimized Device Architecture.
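To make the idea of gate-model circuit simulation concrete, here is a minimal sketch in plain Python of what a state-vector simulator does: it stores one complex amplitude per basis state and updates pairs of amplitudes for every gate. This is an illustration only and does not use cuQuantum; the function names are hypothetical, and the convention that qubit 0 is the least-significant bit of the index is an assumption made for this example.

```python
import math


def apply_single_qubit_gate(state, gate, target, n_qubits):
    """Return a new state vector with a 2x2 gate applied to one qubit.

    Assumed convention: qubit 0 is the least-significant bit of the index.
    """
    new_state = list(state)
    stride = 1 << target
    for i in range(1 << n_qubits):
        if (i & stride) == 0:        # visit each amplitude pair exactly once
            j = i | stride           # partner index with the target bit set
            a0, a1 = state[i], state[j]
            new_state[i] = gate[0][0] * a0 + gate[0][1] * a1
            new_state[j] = gate[1][0] * a0 + gate[1][1] * a1
    return new_state


def apply_cnot(state, control, target, n_qubits):
    """Return a new state vector with a CNOT applied.

    Where the control bit is 1, the amplitudes with target bit 0 and 1 swap.
    """
    new_state = list(state)
    c, t = 1 << control, 1 << target
    for i in range(1 << n_qubits):
        if (i & c) and not (i & t):
            j = i | t
            new_state[i], new_state[j] = state[j], state[i]
    return new_state


def bell_state_probabilities():
    """Simulate H on qubit 0 followed by CNOT(0 -> 1), starting from |00>."""
    n = 2
    state = [0j] * (1 << n)
    state[0] = 1 + 0j
    h = 1 / math.sqrt(2)
    hadamard = [[h, h], [h, -h]]
    state = apply_single_qubit_gate(state, hadamard, 0, n)
    state = apply_cnot(state, 0, 1, n)
    return [abs(amplitude) ** 2 for amplitude in state]


if __name__ == "__main__":
    # Probability is split evenly between |00> and |11> (up to rounding).
    print(bell_state_probabilities())
```

Every added qubit doubles the length of the list, and every gate touches all of it, which is why this kind of workload benefits from the memory bandwidth and parallelism of GPUs.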
Rene - Got it, got it. So, we're going to dive a little bit deeper into the simulation part of this later. It's really impressive, what I've seen on the website, the videos, and some of the simulations that are running. It's super exciting stuff, but first let's hear a little bit more about this hybrid quantum-classical computing platform, which is a very long name, but it's called Nvidia QODA. So QODA, that's the short name. All right, so there's this unified programming platform. What is it, and why is this useful?
Jin-Sung - Yeah, so QODA, with the "Q", is really the quantum analog of the CUDA programming model for GPUs. What QODA does is allow the user to program quantum processors alongside the best classical tools in the same workflow and in a performant way. We announced this last summer, and we're partnering with a growing list of quantum hardware providers like QuTech, Rigetti, Pasqal, Quantum Brilliance, IQM, Xanadu, with more on the way. We've done this to enable people to program hybrid quantum-classical systems, and importantly, QODA will be open and open-sourced. So, why are we doing this? Why is hybrid quantum-classical programming so important? Well, it's really important because all valuable quantum applications of the future will be some sort of hybrid application. We imagine that the first instances of really useful quantum computing will use a quantum computer to accelerate some bottlenecked part of a classical computation running on CPUs and GPUs. If you want to think of a few examples, you can think of something like a classical molecular dynamics simulation where a large portion of the code runs really well on CPUs and GPUs, but maybe you also want to offload the actual quantum interactions, when the molecules collide with each other, to a quantum computer. Another example of a hybrid application is something like a quantum generative adversarial network. There's a lot of promise in quantum machine learning applications, and people have shown advantages in the expressivity of neural networks on a quantum computer. So, maybe one would want to use a quantum generator and a classical discriminator in this use case. QODA makes programming these kinds of heterogeneous resources simple and performant. In addition, what we also want to do is enable the very large community of HPC developers to program quantum computers in a familiar work environment.
If you think about how quantum computers are programmed today, it's essentially the equivalent of quantum assembly language, where you have to individually program quantum gates on individual qubits, and it's really the domain of experts in quantum physics. This, of course, is not very scalable when it comes to programming larger applications in the future. So, what we're doing is providing a library of higher-level quantum primitives, which helps domain scientists who are not necessarily experts in quantum physics program a quantum computer alongside CPUs and GPUs in the same workflow.
I do want to comment briefly on the name QODA, which again is Quantum Optimized Device Architecture. If you know something about the history of Nvidia, you'll notice that QODA sounds very similar to CUDA, that's CUDA with a "C", which stands for Compute Unified Device Architecture and is the platform released in 2007 for easily programming Nvidia GPUs. So, a bit of a history lesson. Once upon a time, GPUs were only used for gaming and generating graphics. This was the original purpose of these chips and the original purpose of the company. But then, in the early 2000s, some very smart researchers found out that you could actually program GPUs to do scientific computations, and it turns out they're really, really good at it. The problem at that point was that you had to map some hard scientific computational problem into a format that the GPUs of the time could understand, meaning you literally had to program shader APIs and essentially trick the GPU into thinking it was computing some graphics, and then take that output and map it back onto your scientific problem. This was obviously very difficult and annoying to do. So, what Nvidia did at that time was develop CUDA, that's CUDA with the "C", back in around 2006, 2007, which was a programming model that made it easy to program GPUs alongside CPUs in a high-level language. What CUDA did back then was immediately enable scientists and researchers to do scientific computations on GPUs. Then a few years later, a guy named Alex Krizhevsky famously used CUDA to build what's now known as AlexNet and demonstrated that deep neural networks actually run really well on GPUs. And this really was the thing that kicked off the AI revolution and essentially all of the modern AI workloads.
So, what we're hoping to do with QODA, that's QODA with the "Q", now the quantum version, is to arm scientists and researchers with the programming tools to enable hybrid quantum-classical applications. And when the ecosystem has the right tools at its disposal, amazing things will happen in the future.
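The hybrid pattern described here, a classical program that repeatedly calls out to a quantum processor for one step of the computation, can be sketched in a few lines of plain Python. This is not QODA code and does not reflect its API. The function quantum_expectation is a hypothetical stand-in for a QPU call and is simulated classically, and the one-qubit circuit, starting angle, and learning rate are assumptions chosen for illustration.

```python
import math


def quantum_expectation(theta):
    """Hypothetical stand-in for a QPU call, simulated classically.

    Prepares RY(theta)|0> on one qubit and returns the expectation of Z.
    """
    amp0 = math.cos(theta / 2)       # amplitude of |0>
    amp1 = math.sin(theta / 2)       # amplitude of |1>
    return amp0 ** 2 - amp1 ** 2     # <Z> = P(0) - P(1)


def parameter_shift_gradient(theta):
    """Estimate d<Z>/d(theta) from two extra circuit evaluations.

    Uses the parameter-shift rule for a rotation gate.
    """
    shift = math.pi / 2
    plus = quantum_expectation(theta + shift)
    minus = quantum_expectation(theta - shift)
    return 0.5 * (plus - minus)


def minimize_energy(theta=0.4, learning_rate=0.4, steps=100):
    """Classical gradient descent wrapped around the 'quantum' evaluation."""
    for _ in range(steps):
        theta -= learning_rate * parameter_shift_gradient(theta)
    return theta, quantum_expectation(theta)


if __name__ == "__main__":
    theta, energy = minimize_energy()
    print(f"theta = {theta:.4f}, <Z> = {energy:.4f}")
```

The optimizer never looks inside the circuit; it only sees numbers coming back. That separation is what lets the quantum step be swapped between a simulator and real hardware while the classical code around it stays the same.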
Rene - Oh, that's awesome. Thanks for giving an overview of QODA. And yeah, I very much remember, I'm old enough, you know, I was actually also using CUDA, and the general term is GPGPU.
Jin-Sung - Yeah, that's right, it wasn't even a GP back then.
Rene - Yeah, that's right. So GPGPU and CUDA, like you're saying, enabled so many folks in science, it's crazy, and this is basically the standard today. Let's see where QODA actually ends up, but yeah, if it takes its name from CUDA, it has a bright future for sure. So, yeah, awesome stuff, and I really liked what you mentioned about this hybrid approach. We actually had a few folks here on the QuBites show talking about quantum machine learning, QML, and we also discussed these kinds of hybrid architectures, where you have, for example, one layer, typically one of the later layers, maybe the one before the output layer, as your quantum layer. You just need a few qubits for it, and you can even run it on a real quantum computer, and the results are pretty impressive: higher precision and in fact better speed as well, right? Very promising. So, yeah, I'm really excited for this because, like you're saying, AlexNet and all these models enabled a lot of things, but now we have these super large transformer models like GPT-3, and Stable Diffusion and what have you, yeah, exactly, all of that stuff. These are huge, right, and it takes ages to train them, and it's costly as well, and these hybrid approaches might provide a novel solution there, right?
Jin-Sung - And even when quantum advantage arrives, GPUs and CPUs will still play a very important part in the hybrid computing stack.
Rene - Exactly, exactly. You know, I always keep on saying this because a lot of folks, when you talk with them about quantum, will say, "Oh, when will it be in my smartphone, when will I have a quantum computer in here?" And I'd say probably never. We need deterministic computing, right, and we will still have classical computers, and folks should rather think about the quantum computer like a GPU, right? The GPU is very good at specific things like large computations with matrices and whatnot, and the QPU, or quantum processing unit, is very good at other specific problems, but you still need this kind of generic classical CPU, the silicon-based chip, of course. And that also gives us the segue into the next topic, GPUs, because Nvidia is also providing large-scale quantum simulation using Nvidia DGX systems, and on the software side there's, like you were saying, the cuQuantum library, where you can develop quantum circuits more easily. So can you explain how this works and how folks can actually get started?
Jin-Sung - Yeah, so cuQuantum is our SDK of libraries to accelerate quantum circuit simulations. It consists of a state vector library and a tensor network library, and it's currently integrated in all of your favorite quantum computing frameworks like Qiskit, PennyLane, and many others. Our state vector library, cuStateVec, is great for simulating very deep and very entangled circuits at the expense of system size. With the DGX A100 system, which is 8 GPUs, you can simulate about 36 qubits, and even more with a supercomputer like a DGX SuperPOD. We also have a complementary library to cuStateVec called cuTensorNet, which is our tensor network library. This is great for simulating really large qubit systems, but at the expense of depth and entanglement. The easiest way to get started with cuQuantum is to download the cuQuantum Appliance, which is available for free. This is our software container with cuQuantum, which now comes with Cirq and Qiskit front ends and allows you to scale out your simulation size with multi-GPU and multi-node support. For those without an A100 GPU, you can spin one up on your favorite cloud service provider like AWS or GCP, and now Oracle has the cuQuantum Appliance deployed as an app on their cloud service. If you're a PennyLane user, it's really easy to install cuQuantum: you can just pip install cuQuantum, or go to Amazon Braket and use the cuQuantum integration, which is already available online. The back end for PennyLane is called lightning.gpu, which is our cuQuantum integration, and in the latest release of our cuQuantum Appliance we now support multi-node execution. What this means is you can easily do quantum circuit simulation at supercomputing scale. We actually have some really nice results that we recently announced coming out of ABCI, demonstrating a really large qubit simulation distributed across many GPU nodes at a supercomputing center.
So the performance is really good, and yeah, cuQuantum allows you to straightforwardly expand your problem size to multiple GPU nodes. On that note, the cuQuantum Appliance is already deployed at a few different supercomputing centers as well, like NREN, Seneca, and CSA, with more coming in the future. So if you're a user of one of those facilities, definitely check out cuQuantum.
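The figure of about 36 qubits on an eight-GPU system follows from simple arithmetic: a state vector for n qubits holds 2^n complex amplitudes. The sketch below is a rough estimate under assumptions that are not stated in the conversation: 80 GB of memory per GPU, 8 bytes per amplitude (single-precision complex), and no allowance for workspace or communication buffers. The function names are hypothetical.

```python
def state_vector_bytes(n_qubits, bytes_per_amplitude=8):
    """Memory needed to store 2**n complex amplitudes.

    8 bytes per amplitude corresponds to single-precision complex numbers,
    16 bytes to double precision.
    """
    return (2 ** n_qubits) * bytes_per_amplitude


def max_qubits(total_memory_bytes, bytes_per_amplitude=8):
    """Largest qubit count whose full state vector fits in the given memory."""
    n = 0
    while state_vector_bytes(n + 1, bytes_per_amplitude) <= total_memory_bytes:
        n += 1
    return n


if __name__ == "__main__":
    # Assumption: eight GPUs with 80 GB of memory each.
    total_memory = 8 * 80 * 10**9
    print("single precision:", max_qubits(total_memory, 8), "qubits")
    print("double precision:", max_qubits(total_memory, 16), "qubits")
```

Because each extra qubit doubles the memory, going meaningfully beyond this range with state-vector methods requires many nodes, which is where the multi-node support and the tensor network approach mentioned above come in.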
Rene - Wow, this is amazing. And can you tell me again which conference the big announcement happened at?
Jin-Sung - Oh, the ABCI result was announced at Supercomputing a couple of weeks ago, but there will be a blog post coming out that details the full performance benchmarks and results.
Rene - Awesome, and yeah, like you're saying, like 5,000 qubits simulated. This is much more than we have in physical systems these days.
Jin-Sung - That's right. Tensor network methods really excel for shallow circuits, and there's a broad class of interesting quantum applications, you know, various flavors of variational algorithms, that tend to have relatively shallow, low-entanglement circuits, so it's a great fit for any of those applications.
Rene - Absolutely, and you know, on that note about quantum simulation, we in fact also have an optimization problem for one of our clients, a large energy grid provider. We developed a custom quantum algorithm for them, and we run this quantum algorithm on GPUs on a simulator as well, basically quantum-inspired optimization, as you of course know. The interesting part is that already on this quantum simulator with quantum-inspired optimization, we're getting a 20-times saving for 20,000 people. This is already providing a crazy kind of optimization, and if you think about it, we can do this already today with quantum simulation because of the nature of the quantum algorithm: although we don't have the powerful machines yet, we can simulate it, and we already see benefits with QIO. So, that's fantastic. Thank you so much, Jin, for sharing these insights. It was really good and very much appreciated, because you went all over and explained all the pieces that Nvidia is working on. So, thank you for sharing all of that information.
Jin-Sung - Yeah, thank you so much for having me.
Rene - Well, and thanks everyone for joining us for yet another episode of QuBites, your bite-sized pieces of quantum computing. Watch our blog, follow our social media channels, and hear all about the next episodes. And of course, subscribe to our YouTube channel if you don't want to miss an episode. You can watch all previous episodes, from season one to season six, on the website. Until then, see you soon, take care, and have a good time!