
Phase Space Invaders (ψ)
With the convergence of data, computing power, and new methods, computational biology is at its most exciting moment. At PSI, we're asking the leading researchers in the field to discover where we're headed, and which exciting pathways will take us there. Whether you're just thinking of starting your research career or have been computing stuff for decades, come and join the conversation!
Episode 28 - Yuji Sugita: Replica exchange, software for massive simulations, and importance of long-distance collaborations
In Episode 28, Yuji Sugita shares the story of how he developed temperature replica exchange in the lab of Yuko Okamoto, connecting to his early experience from working with Nobuhiro Go, the father of Go models. We then talk about the process of building up workflows for simulating massive atomistic systems, a multi-year collaboration with Michael Feig, and ponder the question of when one should go about writing their own scientific software rather than reusing existing software packages. Talking about molecular crowding naturally brings us to current and future directions, which for Yuji include simulating increasingly multi-component condensates and exploring multi-resolution schemes in GENESIS. Towards the end, he highlights the need for young researchers to engage with the international community through long-distance collaborations, regardless of where one ends up living and working.
Welcome to the Phase Space Invaders Podcast. In episode 28, I'm talking to Yuji Sugita, chief scientist at the Theoretical Molecular Science Laboratory in RIKEN, Japan. If you have ever heard about replica exchange molecular dynamics simulations, Yuji's name should sound familiar, as he was the one who developed this algorithm and its many subsequent variants, starting in the late nineties. He later developed an interest in studying truly large-scale systems and started working on GENESIS, a molecular simulation program that enabled him to run massively parallel simulations on modern supercomputers. Indeed, already a decade ago, he managed to simulate systems comprising 100 million atoms, an effort that brought real software breakthroughs and deep insights into the nature of non-specific interactions in the crowded cell environment. So Yuji first shares the story of how he developed temperature replica exchange in the lab of Yuko Okamoto, connecting to his early experience from working with Nobuhiro Go, the father of Go models. We then talk about the process of building up workflows for simulating massive atomistic systems, a multi-year collaboration with Michael Feig, and ponder the question of when one should go about writing their own scientific software rather than reusing existing packages. Talking about molecular crowding naturally brings us to current and future directions, which for Yuji include simulating increasingly multi-component condensates and exploring multi-resolution schemes in GENESIS. Towards the end, he highlights the need for young researchers to engage with the international community through long-distance collaborations, regardless of where one ends up living and working. I can only agree with him there. Hope you enjoy today's conversation. Okay, so Yuji Sugita, welcome to the podcast.
Yuji Sugita:Yeah, thank you very much for the invitation.
Milosz:So Yuji, people can know your name from many angles, whether it's software development or cellular crowding or the many applied studies throughout the years, but your first breakthrough, which was in many ways absolutely seminal, was the development of the replica exchange algorithm in 1999, which later spawned an explosion of variants and adaptations. And I was wondering if you still remember where the original idea came from, what the environment was at the time, and, you know, how it evolved during the last 25 years?
Yuji Sugita:Yes. So actually, when we published the paper on replica exchange, I was just a young research associate in Yuko Okamoto's lab. He was already a PI at the Institute for Molecular Science at the time, and I joined his lab. And before I joined his group, he used the Monte Carlo method, but he suggested that I use molecular dynamics for the replica exchange calculation. So that was the starting point for our collaboration. And yeah, we actually worked very hard to implement the replica exchange method in a computer program. I mainly developed the code, and we worked together to write the paper very rapidly and publish it. At that time, replica exchange seemed to be becoming a hot topic, so we wanted to be the first group to publish an MD version of replica exchange. So we had to work, yeah, very quickly. Yeah, that was the beginning.
Milosz:Did you have to write the simulation code from scratch, or did they already have some working code that you could
Yuji Sugita:Yes. So in fact, when I was a graduate student in Professor Nobuhiro Go's laboratory, you know, the Go model, he was a very famous biophysicist in the field. And at that time, I worked on other developments, but using MD software. I couldn't publish many papers in Professor Go's laboratory when I was a graduate student, but I struggled to develop some new functions in one of the MD codes and tested it a lot. So in the end, I knew almost all the details of that MD code. This experience made it easier for me to develop the replica exchange code. So once the basic idea was given by Yuko Okamoto, I didn't spend too long implementing the replica exchange algorithm and idea in the MD code. So I could finish the code development very quickly.
Milosz:I see. I see. That's great. And then of course, there were so many variants, right? There was multicanonical replica exchange and then many other groups
Yuji Sugita:Yes.
Milosz:took over with things like solute tempering or Hamiltonian
Yuji Sugita:Mm-hmm.
Milosz:How do you see this field evolving all the way to this day? Do you follow it closely?
Yuji Sugita:So I mean, we first developed the temperature replica exchange algorithm for MD simulation, and soon after, Yuko Okamoto and I discussed extensions for further improvement of the sampling efficiency of many biological simulations. And we soon developed multicanonical replica exchange and also multidimensional replica exchange, such as replica exchange umbrella sampling and Hamiltonian replica exchange. At that time, I myself was a bit satisfied with my development in the first few years, and then I switched to a different topic, which was membrane protein simulation. But after a few years of membrane protein simulations, I had the opportunity to start my own group. So I returned to the field to develop some other enhanced sampling methods for various biological systems. For example, in our original application, we targeted only, I mean, protein folding, actually peptide folding. But after a few years, I knew that membrane proteins are very important, and glycans are also quite interesting. Nucleic acids are also important, and not only dynamics, but ligand binding is also an important subject in biology. So I'd like to develop more efficient algorithms for biological problems. So we continue until now, and I'm still discussing with my group members, trying to develop methods with better and better sampling efficiency. That is what we are now working on. And this field, I think, is a bit of a, I mean, classical approach, but nowadays people are using machine learning to some extent for further enhancement of the sampling efficiency in MD simulation. So I think this field is still growing to, I mean, improve the efficiency, yeah.
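For readers who have not met the method, the exchange step in temperature replica exchange can be sketched in a few lines of Python. This is only an illustrative sketch of the standard Metropolis exchange criterion, not code from GENESIS or from the original implementation; the function names, the kcal/mol units, and the even/odd pairing scheme are assumptions made for the example.

```python
import math
import random

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol*K); units are an assumption


def exchange_probability(energy_i, temp_i, energy_j, temp_j):
    """Metropolis probability of swapping the configurations held at temp_i and temp_j."""
    beta_i = 1.0 / (K_B * temp_i)
    beta_j = 1.0 / (K_B * temp_j)
    delta = (beta_i - beta_j) * (energy_j - energy_i)
    return 1.0 if delta <= 0.0 else math.exp(-delta)


def attempt_neighbor_swaps(energies, temperatures, rng, offset=0):
    """Try swaps between neighbouring temperatures (pairs starting at `offset`).

    Returns `order`, where order[k] is the index of the replica that sits at
    temperatures[k] after the attempts.
    """
    order = list(range(len(temperatures)))
    for k in range(offset, len(temperatures) - 1, 2):
        p = exchange_probability(
            energies[order[k]], temperatures[k],
            energies[order[k + 1]], temperatures[k + 1],
        )
        if rng.random() < p:
            order[k], order[k + 1] = order[k + 1], order[k]
    return order


if __name__ == "__main__":
    rng = random.Random(0)
    temps = [300.0, 350.0, 400.0, 450.0]
    potential_energies = [-120.0, -118.0, -125.0, -110.0]  # made-up values
    print(attempt_neighbor_swaps(potential_energies, temps, rng, offset=0))
```

In an MD setting the velocities are usually rescaled after an accepted swap so that each replica matches its new temperature, and the `offset` typically alternates between 0 and 1 on successive attempts so that every neighbouring pair gets a chance to exchange.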
Milosz:What are the lines along which you're working now on enhanced sampling? Is it still akin to replica simulations, or metadynamics, or another variant?
Yuji Sugita:Yeah, yeah. Yeah. So we still like to use the replica exchange idea. However, not only just the simple one, but combining it with other biased sampling. We realized that by the combination, we could develop better and more efficient new methods. That's what we are now working on. And of course we are discussing introducing more machine learning ideas into our new sampling methods.
Milosz:What are some specific examples? Like, do you have any specific example where there are clear problems that have to be addressed with machine learning?
Yuji Sugita:I mean, machine learning has already become very popular, and people are using machine learning techniques even in molecular dynamics simulation, for example for the selection of collective variables. And
Milosz:Mm-hmm.
Yuji Sugita:originally people selected some of the useful collective variables, but now people are discussing machine-learning-based collective variables that can be more useful. And also, I don't know any other good,
Milosz:I'm asking about your own research agenda, maybe if there are any exciting developments where you have a specific direction.
Yuji Sugita:Currently, we have a great interest in large-scale simulation. Not only atomistic, but integration of,
Milosz:Yes.
Yuji Sugita:QM/MM, atomistic, and coarse-grained simulation. So our goal is to install all these different models into GENESIS and also to quickly convert from one simulation to another resolution, like, I mean, QM/MM to atomistic and atomistic to coarse-grained, or vice versa. So I think that,
Milosz:Right.
Yuji Sugita:some of the machine learning techniques would be useful for such rapid switching between simulations at different resolutions. That's of particular interest, yeah.
Milosz:Is it like adaptive coarse-graining? So you dynamically change which part of the system is represented at the coarse-grained versus atomistic level, or do you launch multiple simulations, just detecting, you know, where potentially interesting events are happening? Both, maybe?
Yuji Sugita:Currently, I don't have such a clear idea about the future direction, but hopefully, yeah, as you mentioned, if some interesting event happens in a coarse-grained MD simulation, for example, then if we can easily convert from coarse-grained to atomistic and start to see the more detailed dynamics or reaction, that would be fantastic. And without us spending too much effort, if, I mean, the machine or computer can almost automatically detect such events and switch from one to the other, that would be fantastic. Yeah.
Milosz:Yeah, absolutely agree. I've long been looking forward to seeing such a thing happen in, you know, a semi-automatic manner rather than exactly writing custom scripts to terminate or restart, to convert, to upscale, downscale. I know Michael Feig, one of our collaborators, has great tools for multiscaling too. So I could see that now you have a coarse-grained model and make it all-atom instantaneously, which is great.
Yuji Sugita:Yes. So Michael is an expert, not only in MD simulation, but also in machine learning. And his group developed several quite useful tools for switching from coarse-grained to atomistic, and also generative models for disordered proteins. I think that's quite useful in this research field.
Milosz:Yeah, I am bringing up Michael because you also collaborated with him on those cell environment models, right? And that was a later phase of your research, after you studied membrane proteins. And yeah, I think we had one conversation on this podcast before, but it's always nice to hear another perspective on, you know, what we are missing in our simulations by not including the environment. So, like, what are the interesting effects that only show up when we put in other proteins, when we put in crowding, a cell-like environment?
Yuji Sugita:Yes, yes, yes. Yeah. So it was already, I don't know, more than 10 years ago that we started this collaboration. Why we started the collaboration was a bit of a, not so ordinary situation, I think. So, you know, Michael is German, but he has worked in the US for a very long time, right? I have been working in Japan all the time, so we are working in very different places. But more than 10 years ago, I think 15 years ago, I don't remember, he organized one of the Telluride workshops. You know, Telluride is in Colorado.
Milosz:Mm-hmm.
Yuji Sugita:He invited me as one of the speakers, and then at the conference we discussed a lot and realized that both of us had great interest in not only single-protein dynamics and reactions, but also the interactions with many other proteins or other biomolecules, and the environment too. And at that time we had many discussions on how to simulate such a complicated environment using MD simulation. That's not so easy. It is still not easy, but, you know, more than 10 years ago it was a bit of a crazy idea, right? So.
Milosz:Yes, yes.
Yuji Sugita:And at that time we had many discussions about which model we could use, a coarse-grained model or an atomistic model, or many other ideas. But at that time in Japan we could start using supercomputers. And also I'm working in RIKEN, and RIKEN develops such supercomputers. So we had resources. So, to use such resources very effectively, we thought that a very big calculation based on an atomistic model could be the highest-impact work. That's what we discussed, but at that time, it was difficult.
Milosz:From which standpoint? Like just running a stable simulation or scaling or changing the software.
Yuji Sugita:I think there were many, many barriers to completing the simulation. The first one was the modeling. I mean, at that time, we didn't have any useful tools for building the crowded system in atomistic detail. So
Milosz:Mm-hmm.
Yuji Sugita:yeah, basically Michael developed such tools from scratch. And also at that time, most of the MD programs were not so scalable toward a thousand or more CPUs. So we had to develop such a scalable code ourselves; that was the initial version of GENESIS. And when we had developed the model and the code, we thought that half of the task was done, but it was not. Even just loading the big data into the computer, I mean, we tried to simulate a hundred million atoms, just loading. So
Milosz:a
Yuji Sugita:yeah, it takes a very long time, and we had to develop parallel input and output code for such an especially big system. And also trajectory analysis was a huge, huge effort. We had never, I mean, treated such a big number of atoms in a simulation. Yeah. That was the nightmare.
Milosz:Yeah, I very much appreciate, yes, I can. I very much appreciate someone doing the first steps, 'cause every time I go to, like, million, several-million-atom systems, it is truly remarkable how many things start to fail, right? Atom numbering in PDB files, you know, all those big files, and, as you say, the amount of time it takes to load and do some initial checks. Yeah, again, I'm happy someone did it before and now people can just take those systems and run them with someone else's forethought.
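One concrete example of the kind of file-format failure large systems can trigger, assuming the remark about PDB files refers to atom numbering: in the fixed-width PDB format, the atom serial number of an ATOM record occupies columns 7 to 11, so only five digits are available. The Python sketch below just checks whether a serial number still fits; the helper names are hypothetical and it is not taken from any particular MD package.

```python
SERIAL_WIDTH = 5  # ATOM/HETATM serial number occupies columns 7-11 of a PDB line


def serial_field(serial):
    """Right-justify the serial number the way a fixed-width PDB writer would."""
    return f"{serial:>{SERIAL_WIDTH}d}"


def fits_pdb_serial(serial):
    """True if the serial number fits the five-column field without shifting the line."""
    return len(serial_field(serial)) == SERIAL_WIDTH


if __name__ == "__main__":
    for n_atoms in (99_999, 100_000, 100_000_000):
        print(n_atoms, fits_pdb_serial(n_atoms))
```

Any system beyond 99,999 atoms therefore needs a workaround, such as a non-decimal numbering scheme, wrapped serial numbers, or a different file format, and each tool in a workflow has to agree on which one is used.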
Yuji Sugita:Hmm.
Milosz:And so what were the findings that you could take out of these simulations?
Yuji Sugita:Yes. So, macromolecular crowding was actually not such a new research field even at that time, because the crowding effect had been investigated theoretically and experimentally for a very long time. Even in biology textbooks, the effect of macromolecular crowding was described. And at that time, the crowding effect was interpreted as a volume exclusion effect. So, you know, big molecules occupy a lot of space in the cell, so not much free volume is available. That was considered the main effect of crowding. Of course, this volume exclusion effect is still very important, one of the most important effects of crowding. But we used an atomistic MD model and also included the water, so we could investigate the interactions between proteins, or between proteins and metabolites, in great detail. Then we found that not only the specific protein-protein or protein-metabolite interactions, but many, I mean, important nonspecific and weak interactions exist in the crowded environment. And that is very important for determining many, many biological phenomena. Yeah.
Milosz:Yes. I remember there were questions exactly of stability, right? Of certain proteins that would unfold much more easily
Yuji Sugita:Yes, yes, yes.
Milosz:or change their conformational preference,
Yuji Sugita:Yeah. So if volume exclusion is the only effect, we cannot explain why the stability of some proteins is reduced in a crowded environment. We can understand such instability in the crowded environment through the nonspecific weak interactions, so,
Milosz:mm-hmm. Yeah, I think it very much changes our view on protein evolution, right? And what the constraints are when proteins evolve, because we think, oh, it looks quite happily stable, but maybe something that looks happily stable in a simulation doesn't look that stable in a cell, and the cell cannot do some things that would look completely sensible to us. But there were also studies of ligand diffusion, right, and drug
Yuji Sugita:Yes, yes, yes,
Milosz:drug diffusion, drug binding.
Yuji Sugita:yes. So in the beginning of our crowding research, we focused only on non-specific interactions between ligands and proteins because, you know, we focused on a very big system using a supercomputer, and this means that we could not simulate for such a long time. At that time we could simulate only 150 nanoseconds for the medium system, and for the biggest system we could simulate less than 50 nanoseconds. Very short, right? We could not observe specific binding to the active sites. So later, after the big simulation, we set up a simpler system, but including the crowder proteins, to study the crowding effect on protein-ligand binding. Then, using the US Anton, Anton 2 I think, and also Japanese supercomputers, we tried to simulate longer than 10 microseconds of MD simulation of the crowded system. Then, you know, we could study the effect of crowding on protein-ligand binding. Yeah. That was the history. Yeah.
Milosz:Yeah. Yeah, that's great that we also have those hardware developments in parallel that enable us to do bigger and bigger things, of course. So, well, you mentioned GENESIS quite a few times, and there's always this question of, you know, for someone like you who works a lot on method development, when is the right moment to start developing your own software? Because I imagine a lot of people face this question, you know, should I implement something in an existing package, or is my project big enough to warrant, you know, support for my own, for example, MD engine, right? How did you approach this problem?
Yuji Sugita:Yeah. So, even for us, when we decided to start developing our own code, there were many opposing ideas, because MD simulation itself is just based on Newton's equations of motion, high school physics, right? So,
Milosz:Mm-hmm.
Yuji Sugita:to implement a high-performance code on the latest computer environment, that's a totally different story. So we have to spend not a year, but several years, right, to develop the code. So it actually takes great effort from the postdocs and staff scientists. But at that time we had to develop our own code because RIKEN and also the community wanted to develop such high-performance code using supercomputers. Before our age, researchers in Japan didn't have such supercomputers and so just used existing software. And in that case, it is a bit difficult to take on the challenge of high-performance computing in MD simulations. So in our generation, we tried to develop such high-performance code on supercomputers. And in fact, in our age, we experienced a big change from small computer resources to huge supercomputer resources. When I developed the replica exchange MD software, I just used an eight-core machine, an eight-CPU machine. Because of this,
Milosz:Mm-hmm.
Yuji Sugita:in our first paper, the number of replicas was eight. Yeah, that was the reason. But now, the current Japanese supercomputer, so it is equipped with... CPU nodes. That's a big change, right?
Milosz:That's a lot.
Yuji Sugita:Yeah. So yeah, we try to utilize such modern computer architectures. But again, now the main computer resource is changing from CPU to GPU. So yeah, we had to change the GENESIS code a lot. It's kind of an endless game and yeah, it's really tough.
Milosz:Yeah. I think that's what a lot of people don't think about, right? You start developing something and it stays with you for the rest of your life because you have to either drop it or keep adapting it to the ever-changing environment.
Yuji Sugita:The advantage is, I mean, for example, not only atomistic, but coarse-grained models. So we can easily implement, not easily, but we can anyhow implement new models in GENESIS to test. But if we just modify another program's code, developing efficient code for a new model may not be so easy. So yeah, that is, I think, the advantage. Yeah.
Milosz:Mm-hmm. So when you know you will clearly use the adaptability for method development in the future, then that's when it really becomes a must. I gotta say, I'm really happy with GENESIS 'cause I started using it recently and yeah, especially the availability of coarse-grained models is a great feature, because many, many other coarse-grained models are implemented in smaller software packages that are not well documented or, again, belong to particular research groups and are not meant to be widely used. So yeah, I find this effort very much worthwhile on your side.
Yuji Sugita:Thank you. Yeah.
Milosz:And so the latest topic that I remember from your visit to our institute recently is condensation and those large-scale systems, right? So you are going towards, again, the biology of phase separation, if I remember well, and these cell-like but kind of complex environments. So what is the promise there and where are you going with this? If you can share with our listeners.
Yuji Sugita:Yeah. So you know that our smallest system is usually just a single biomolecule, like a protein or nucleic acid, in a fairly simple environment, in water or in a membrane for a membrane protein, right? But the most complicated system for us might be a whole cell, right, including everything.
Milosz:Mm-hmm.
Yuji Sugita:I actually almost gave up on whole-cell modeling using a too-simplified model. That is not our interest. So if we use a kind of reasonable model, such as an atomistic model or a semi-atomistic coarse-grained model, that is kind of the limit for the target. And in this sense, we'd like to focus on a part of the cell, where we could still observe many, many interesting biological phenomena, and which is made up of many different biological molecules like proteins, intrinsically disordered proteins, and, I mean, metabolites like ATP, or lipids. So we'd like to see such interplay of many different biomolecules, and hopefully we'd solve some mysteries of, I mean, molecular and cell biology using MD simulation.
Milosz:What are the lowest-hanging fruits that you can see there? Like, what are the first things that we should try to address that are completely, you know, unexplored or partially unexplored?
Yuji Sugita:that's, it's a difficult question because there are many, many, you know researchers.
Milosz:Yeah. I'm thinking if you would say phase separation, for example, because there are so many events that relate to phase separation, right? And this is still a pretty simple model where you very often have just two,
Yuji Sugita:hmm.
Milosz:two components, but I dunno if that's on your mind or you are thinking of something bigger and more complex.
Yuji Sugita:Yes. So currently we are still working on very few components of, I mean, flexible proteins, and how to make the condensate or how to destroy the existing condensate. That is the current work. But in the near future, we'd like to focus on more complicated condensates, for example including nucleic acids, or the interaction between condensates and lipids. And
Milosz:Mm-hmm.
Yuji Sugita:then so we can approach other important cell biology problems that is what we are thinking.
Milosz:Yeah. Yeah. I see there are so many interesting things, even those membraneless organelles and things in the cell that we haven't really explored, that will give us decades' worth of targets for exploration. I'm asking because I'm myself interested in going in that direction, so it's always good to see where the community is going. Yeah, I see this is very much on everyone's mind these days. So it's good that we're developing tools and,
Yuji Sugita:Yes, script.
Milosz:to that. And these are also very big systems, right, that you run with condensates. So are they mostly on the coarse-grained level or the atomistic level?
Yuji Sugita:Hmm. So for condensate formation and deformation, it is not only the size, but it also takes a very long time from the simulation point of view. So we mainly use a coarse-grained model. However, in a coarse-grained model we miss some of the most important factors because of the simplification. So, as I mentioned in this talk, we are now trying to switch sometimes from the coarse-grained to the atomistic model, and trying to see the missing factors in the condensate using the atomistic model. So we're using both atomistic and coarse-grained models. And we even have great interest in enzymatic reactions in condensates. And in that case, we really need to carry out QM/MM calculations in the condensate. That is partially done in our group, but not completely. So we'd like to study such enzyme reactions in condensates in the future. Yeah.
Milosz:Yeah. How does it change? Does it change the accessibility of the substrate or the conformations of the protein, or both? Or again, what's the premise there?
Yuji Sugita:Yeah. So you, you mean that diffusion of substrate?
Milosz:mm-hmm.
Yuji Sugita:Yeah, we could observe how the diffusion of ligands and proteins is changed in condensates and also in crowded cytoplasm. So yeah, I think that not only with the coarse-grained model, but by combining it with the atomistic model, we could see such detailed interaction dynamics.
Milosz:Right. So that would change the effective kinetics, even though, for example, the enzyme could be the same, right? The diffusion in and out would be significantly changed, or
Yuji Sugita:Yeah, I think, I think so. Yeah.
Milosz:Great. And, kind of going back to our earlier conversation, you wanted to mention the question of international collaborations, right? Because you've spent your entire career in Japan.
Yuji Sugita:Yes, yes. I graduated from Kyoto University and worked at several Japanese institutes, and started my group in RIKEN, and I'm still working in RIKEN. All my career has been in Japan, but I personally don't like to communicate and collaborate with the Japanese community only. So I try to make many friends and colleagues outside of Japan. Now I think that I can collaborate and communicate with many international researchers. And I like this style very much. Yeah.
Milosz:Do you have any advice for people who are perhaps, for different reasons, you know, stuck in countries that are not well connected to the international community, but want to get there, want to be part of that?
Yuji Sugita:I don't know if I can give advice to many people, but I tried to go to international conferences or even visit many research labs, at least as many as possible. And also I try to invite people to the RIKEN institute when we organize workshops. And I think that research collaboration, in particular international research collaboration, is difficult. In most cases, we met several times and discussed, or just talked. And then we realized that we have common interests or, I mean, useful techniques or skills which we don't currently have. So I think that meeting in person is very important. I realized that during COVID, when we couldn't attend in-person meetings. At that time, I couldn't start any international collaboration, but just after COVID, even from the first meeting, I could actually start new collaborations and communication. And yeah, in-person meetings are still quite useful for starting international collaborations. That's what I feel.
Milosz:That is very true. Yes. I think, well, I can add that from my perspective, your lab appeared on my map, I dunno how many years ago, but I met one of your postdocs, a Polish
Yuji Sugita:Yeah.
Milosz:Marta.
Yuji Sugita:Yeah. Yeah.
Milosz:she told me about your lab. And this is when I realized, oh, you know, there's a lab, there are people doing these things and so on. And everyone is an ambassador for your lab. In the end, when you collaborate, when you have people working for you, they also become your ambassadors, right? And that eventually, somehow, one way or another, connects you to the whole
Yuji Sugita:Hmm. Yeah. That's nice. Yeah.
Milosz:But then of course, yeah, then again you visited us and, uh, all those meetings and conference chats reinforce those connections. So yeah, it is hard to do just once. We need to find ways to, I guess, to, to strengthen participation.
Yuji Sugita:Yeah. So
Milosz:in-person meetings internationally.
Yuji Sugita:yeah, I'm now trying to get a Japanese grant which can be used for sending younger researchers, educated researchers or graduate students, to do research in other countries. So, yeah, once we get those grants, I'd like to send more Japanese researchers and graduate students to Europe or the US or other countries, to give them more opportunities to communicate with international researchers. I think in Japan it is quite difficult. It is an island, isolated by the sea. So inside Japan, there are many, many opportunities to meet someone else, but if we don't spend additional effort, it is very difficult. Yeah.
Milosz:It was great effort and of course everyone is very welcome and we would also love to, you know, come and visit and have this fruitful exchange of ideas.'cause
Yuji Sugita:Yes, yes. So,
Milosz:everyone.
Yuji Sugita:but interestingly, so we are especially invited, but I mean, many people come to Japan and RIKEN already, and we also welcome people to come, and the flux is currently not equal. So still, I mean, incoming is more dominant, and I think we need to increase the other direction more. Yeah, that's what I'm thinking.
Milosz:the same conversation in Poland about
Yuji Sugita:Ah.
Milosz:exposing ourselves to the international community. I can definitely see how this can be a long-standing concern in the local scientific community in a country that's not well connected at the moment. Yeah, of course. Yeah. So again, Yuji, thank you so much for sharing your story, your concerns, your outlook. It's been great talking to you. Thank
Yuji Sugita:Yeah. Thank you very much.
Milosz:and have a great day.
Yuji Sugita:A date. Yeah. Alright.
Thank you for listening. See you in the next episode of Phase Space Invaders.