Phase Space Invaders (ψ)

Episode 8 - Rossen Apostolov: Managing scientific collaboration, the biggest project of European biophysics, and seeding schools of thought

April 09, 2024 Miłosz Wieczór Season 1 Episode 8

In the eighth episode, Rossen Apostolov and I depart from the standard format to talk about the core concepts behind BioExcel, the European Center of Excellence for computational biology. We discuss their main objectives and challenges, from working with code to organizing schools and webinars, talk about the challenges of sustaining funding through maintaining excellence in research, and review ways to share the gained experience with the broader computational world to inspire similar ventures elsewhere. It is a partial attempt to answer questions raised in previous conversations by pointing to existing models for international collaboration across many subfields.

Milosz:

Welcome to the Phase Space Invaders podcast, where we explore the future of computational biology and biophysics by interviewing researchers working on exciting, transformative ideas. Today's episode marks a slight departure from the standard format, as I'm talking to Rossen Apostolov, the director of the BioExcel Center of Excellence, affiliated with KTH Royal Institute of Technology in Stockholm, Sweden. Rossen is in charge of coordinating the multidimensional collaborative effort to bring European computational biophysics closer together on many levels: from providing support for code development and ensuring code sustainability, to providing direct training through schools, tutorials, and webinars, to finally promoting interoperability between what would otherwise be scattered pieces of specialized software. So our conversation tries to address, at least in principle, many talking points we touched upon in previous episodes. We're discussing whether this model of collaboration can be replicated elsewhere and how it addresses the concerns around code maintenance, the hiring of technical staff, making sure quality tutorials are not produced at the expense of underpaid PhD students, as well as providing a framework where bespoke training can be provided by experts to highly motivated trainees. Of course, such enterprises require dedicated and enthusiastic people like Rossen to run in the first place. But I do believe this way of doing science should be an inspiration for communities across the globe, and I'm happy to spread his enthusiasm. So, if you're ready, here it comes.

milosz_1_04-07-2024_100233:

Rossen Apostolov, good to have you on the podcast.

Track 1:

to be here, Milosz.

milosz_1_04-07-2024_100233:

So, yeah, this has been a fairly common theme across previous guests: we often talk about the importance of collaboration and how, in our field of computational biology or biophysics, it's increasingly hard to bring about real breakthroughs on your own. Whether it's big integrative projects or the run-of-the-mill collaboration with your next-door experimenter, even this very podcast is an effort to make the point that we can do things better if we do them together. But here comes my question to someone who's taking this collaboration thing to a real professional level as the coordinator of the BioExcel consortium: how easy is it really to get scientists, especially computational scientists, to collaborate?

Track 1:

Milosz, thank you for inviting me to this podcast. It's a really great effort and I think something that's been missing in the field. Yes, so collaboration is indeed something really important, fundamental for science and technology, and in computational disciplines it's even more so, since nowadays we have to work with big and very complex supercomputers and with modern experimental machinery, analyzing big amounts of data. So it's really critical that we strengthen the collaboration between scientists and bring the communities together. In Europe and also around the world, there's been heavy investment in new supercomputers. Now we have exaflops of available capacity across the world, but it's really challenging to make good use of them. And that's why, some 10 or 20 years ago, those efforts started to encourage scientists to work together on these big projects. One of the recent examples is the COVID pandemic, where we saw that to solve grand challenges, you need global efforts among the communities.

milosz_1_04-07-2024_100233:

It's kind of interesting to think about it, that you can increase the power of the whole computational machinery across Europe by enhancing the human side, right? So not just making the machines more efficient, but also bringing about better collaboration or better training among the scientists themselves.

Track 1:

That's indeed the case. Yeah. And as we've seen in other fields as well, we also avoid repeating a lot of the same mistakes by learning from the experience of other people. As somebody famous said, we've seen further because we stand on the shoulders of giants, right? And that's what we try to do: to stand on the shoulders of other giants.

milosz_1_04-07-2024_100233:

Right. I see our field has always suffered from a bit of redundancy, right? I mean, I often bring up the question of quantum chemical codes that are often so scattered around many labs, with the same methods implemented in different software packages, but then every package excels at one particular thing and replicates all the other ones.

Track 1:

And same for molecular dynamics, right? You go to Wikipedia and you search for molecular dynamics codes and you get 50, 60 applications.

milosz_1_04-07-2024_100233:

Yes, although this field is a bit more consolidated, I think, and I was always happier to use it exactly because it feels like there's a community of people who care. There's a community of people who support, and seeing this grow and scale is, I guess, a great feeling, because it also makes it easier to enter, right?

Track 1:

Maybe there we also see the manifestation of evolution itself, where you have a lot of different codes, like different species that are trying to grow, and only a few will grow and take over larger parts of their environment. Yeah, and the question is, what makes one code or one application dominate? I would say, and this is also what we try to do in BioExcel, it happens if you focus on excellence, on doing things the right way: usability, developing the applications with a view of the long term and how this can be sustained. And in order to increase the adoption, we need to make it not only powerful, because that's what's usually most exciting, right? For scientists it's interesting to do the hard stuff, to make it faster, to make it scale better. But in order to increase the adoption, it also needs to be user-friendly. You also need to provide training material so that newcomers will be happy to take over.

milosz_1_04-07-2024_100233:

Yeah, totally. I think in my head, I divide software into two categories. One is the category that you absolutely have to use because it's everywhere. And then you have to learn it, and you can spend hours, you know, learning the details and so on. And then there's the software that you learn because it might be more convenient, but what most people do is they just try it out. If it doesn't work the first time, they go to the next option. If you're in this other category where you just want to make people's life easier and maybe provide a better solution, you totally have to think about how to make it work on the first attempt, so to say.

Track 1:

And also it's a bit of a balance between extending the scope and the features of an application to increase the usability and the adoption, while still keeping sufficient focus so that you don't stretch too much. There was a joke saying that every application grows to the stage where it starts sending emails. Our codes are still not sending emails, but one can imagine that if you keep on adding more and more to a certain application, it becomes bloated. If it does too little, it's too little; if it does too much, then it becomes too much. So there is some sweet spot where you do have the focus, it has enough flexibility, but it's just well balanced.

milosz_1_04-07-2024_100233:

Yes, especially in academia. I know that for GROMACS, the main concern was that very often people would write tools and then they would leave academia, and the tools would stay in this shape forever, even if there's a list of 20 bugs to correct, and nobody knows how to work with them. Right. So

Track 1:

That's an issue for all applications in general.

milosz_1_04-07-2024_100233:

yeah.

Track 1:

And that's why there are these initiatives on promoting best practices for software development, specifically for software sustainability. In the UK, the SSI did a lot of work on that, to answer the question: what makes software sustainable? And if you are developing such code, what do you need to do to ensure that? Obviously there is, you know, the bus factor: how many people can you afford to be hit by a bus and yet the software can still live on? You need to have sufficient redundancy in terms of developers and knowledge.

milosz_1_04-07-2024_100233:

Yeah. These are really hard questions, but I appreciate, for example, the adoption of standards for code writing. So probably every library or every code at some point has to get to a moment where it has templates, guidelines, as you say for organizing code for, um,

Track 1:

For file formats.

milosz_1_04-07-2024_100233:

writing classes.

Track 1:

And also file formats. But I think it happens naturally. In the beginning, everybody wants to have their own, say, file format, but over time you see it becomes cumbersome to write scripts that convert from one to the other. And then naturally developers start to adopt something common that works for everybody. It just saves more time and it's more convenient.

milosz_1_04-07-2024_100233:

yeah. That's how you get consolidation perhaps by packaging of individual components, right? So the trajectory engine will be managed by consortium A, the code will be managed by consortium B, the interfacing by consortium C and so

Track 1:

For example, yeah,

milosz_1_04-07-2024_100233:

You have, for example, PLUMED and Colvars, two that are trying to port into all the MD codes, right? So, for example, this question of defining collective variables is now partially taken away. Even though everyone wants to have their own as well, you can partially rely on third parties now to provide this.

Track 1:

They are a solution, sometimes very convenient, but then you get the third-party risk. What happens if this library ceases to be maintained?

milosz_1_04-07-2024_100233:

Well, so to go back to BioExcel: you have a bunch of groups working on different software, but I understand that the effort is to also make them more interoperable, right? So how do you approach this?

Track 1:

Indeed. So at the core of BioExcel, we have several applications that are very widely used: GROMACS for MD, HADDOCK for integrative modeling. Those are applications with thousands of users. And one of the topics of work in BioExcel is indeed this interoperability, because we know that to solve a certain scientific problem, you rarely use just one application. You almost always need a few tools, even if it's just to fetch input data or maybe convert the format into something usable. But in most cases you need to build, or you can build, more complex workflows that increase your productivity. That's why another part of the work of BioExcel is the development of, for example, the BioExcel Building Blocks, a very convenient, versatile and flexible platform to put together different applications and tools into a single package. This work is part of our effort to make the software applications more user-friendly, in addition to making them powerful and making them scale better to run at exascale, or any extreme scale, on different architectures. So usability is a big focus in our work.

milosz_1_04-07-2024_100233:

And do you think this format of a Center of Excellence is replicable elsewhere? Can it become a good template for other groups around the world, other, let's say, regional groups of software developers? Or at least, do you aim to make it a template for others?

Track 1:

I do believe that this is replicable, and we can see many examples of this working. But first and foremost, there needs to be a desire among the partners to work together and do something together, to do better software and better science collaboratively, because collaborations will always require a bit of flexibility on both sides, maybe making some compromises and maybe aligning on your goals. But I believe everybody's realizing the benefits of collaboration, so that's not an issue. What is important in any type of such initiative and joint work is to have very clear goals and objectives, and to strive to be excellent in all of the activities that you do: not only in the quality of the code that you write, but also in the quality of the training material that you produce and the quality of how you deliver the training. It's the whole chain. And that's why Centers of Excellence are a very good vehicle for this type of work.

milosz_1_04-07-2024_100233:

Yeah, I think that's a great case for, say, a white paper on this topic, right? To share this template. I believe there's plenty of internal insight that you have that's not even shareable in the podcast format, that, you know,

Track 1:

That's true. We, we,

milosz_1_04-07-2024_100233:

hours

Track 1:

Yeah, we talked about that quite a few times over the years, and it's always, "we have to do that, let's write." Maybe this podcast is a good kick for us to motivate

milosz_1_04-07-2024_100233:

start

Track 1:

you to me.

milosz_1_04-07-2024_100233:

I would be happy if that was the outcome. Great. On your personal side, because now I don't know what fraction of your day you dedicate to managing BioExcel: as a scientist, do you see managing such a project as a sort of separate branch of your day-to-day scientific work? Or do you think you specialize in something different than scientists now? Or

Track 1:

Uh,

milosz_1_04-07-2024_100233:

a typical scientific job that has a managerial side to it?

Track 1:

It's quite different from doing research, because it requires a different type of skills and activities. In research, usually you're focused on something: you go and do some deep work on the code or on the experiment. When one is running a big project, there's a lot of distraction; there are a lot of threads going on. For me, it's really satisfying to do that, because one gets to bring together excellent researchers and software developers, and it's a very positive feeling when you see their desire to work together to make much better products and services. So the main objective of my work is to ensure that we are all on the same page, that we have a very clear vision of where we want to go and how we want to do it, and to bring new people on board, because BioExcel has been around for nine years now and a lot of people have worked on the project through the years. We always have new people coming, and I'm really happy to see that everybody's very excited to be part of the team.

milosz_1_04-07-2024_100233:

And you still have continuous funding, right? How does it work on the funding side? I mean, I know the European Union is funding this whole initiative for, is it the third time in a row now?

Track 1:

Right. This is the third phase we are having now.

milosz_1_04-07-2024_100233:

Is it easy to ensure that the funding continues, that there will be no break at some point?

Track 1:

So our strategy is that if we do excellent work, if we build tools and applications and provide services that are very much needed by the communities, this is the vehicle towards long-term funding and sustainability, because it will be obvious that if initiatives like BioExcel cease to exist, it will considerably disrupt how research is being done. Some of these very popular pieces of software, with thousands of users not only in Europe but worldwide, will suffer a lot from lacking maintenance and support. The funding we receive specifically for BioExcel is critical to keep the center as it is, but it's not the only funding that is being used for the software development. Partners are complementing it from other projects, of course. But the CoE funding is this glue that is needed to bring all the parts together.

milosz_1_04-07-2024_100233:

So there's a baseline that's independent of the contingent... And yeah, there's the whole... Yeah, I think that sounds to me like a good strategy.

Track 1:

For example, national funding agencies provide a lot of funding for research. It's not so common to give funding for software development, let alone for improving software to apply best practices. They will not give you two people for two years just to restructure the code, to write documentation, to work on your architecture, without having a Nature paper saying that you discovered a new drug. So there is a big need for this type of funding for maintenance.

milosz_1_04-07-2024_100233:

Do you think it also benefits the ecosystem beyond BioExcel? Like, you have people who can work on side projects and, I don't know, open source projects that are not formally part of BioExcel, and, you know, maybe as a byproduct of these things it also happens.

Track 1:

Yes, because again, most of the time researchers use more than one piece of software. So if our core applications are needed for your work, naturally the other applications that you use in conjunction with them will benefit from this. Say you are developing your own small application for post-processing: well, you might as well use the same file format that we in BioExcel are using, because it will make your work easier. Then we share a lot of our experiences. Say, you know, we've published two white papers on best practices for software development, and we see that other groups start to notice that and start to adopt some of those recommendations.

milosz_1_04-07-2024_100233:

Yeah, I very much like the emergence of white papers in our field. I think it's something that's proliferating recently, exactly as a result of those broader approaches to collaboration, where people try not necessarily to enforce standardization, but exactly to share ideas on what came out of big collaborations that required, right, a lot of coordination, a lot of storage, a lot of thinking about these foundational questions. Yeah,

Track 1:

You don't need to go through a peer review process that is very lengthy. You don't have any restrictions; you just put all your knowledge there. You document it, and with platforms like Zenodo (we use Zenodo a lot) or bioRxiv, where you just store the material, it's a great way to share the knowledge with the community.

milosz_1_04-07-2024_100233:

On one of the previous episodes, we were actually discussing this question of whether we should have big papers that are researched and polished and so on, and smaller papers that are more like comments or, I dunno, mentioning negative results or all those things. I think these things are still citable, right? So you can still have your citations. It depends, of course, on every individual country how those things will be counted. But they still kind of count towards your credit in academia, if someone wants to think in those terms, and it still gets the knowledge out there.

Track 1:

Yeah. It, it ends up with, what is the funding model, like what is being

milosz_1_04-07-2024_100233:

yes, yes,

Track 1:

And, yeah, the topic of whether we should have a journal about negative results, Acta Negativica or similar: I think everybody who starts doing a master's or PhD would like to have such a journal, because most of the stuff you do doesn't work, and you feel like: it's still a result. It didn't work. Nobody will know about it. I don't want this to be wasted. But I don't think we've found a good way to handle that.

milosz_1_04-07-2024_100233:

That's true. I would love people who have more time on their hands, which is probably nobody in academia, to work more on this. But yes, I remember back in the day, it was always, you know, whatever protocol your PI had was what you implemented yourself. And there was no way of knowing whether this protocol was optimal or just something that someone came up with 10 years ago and it got frozen in time. So yeah, this is a very, very good thing to get with the tutorials and schools, right? I mean, now BioExcel has this whole breadth of activities, exactly with weekly webinars, with schools of biophysics, with tutorials for the individual software. How much do you think this can shape a sort of school of biophysics, you know, a school in the sense of people who have common training, a common background? I think that's a great outcome of this whole collaboration, right? That you produce a generation of scientists who at least have a common background in something.

Track 1:

Yes. Right. So the first, most established activity that we have is, as you said, the webinar series, where we cover a lot of the most popular tools, applications, or methods that are used in computational biophysics. They've generated big interest, with thousands of views. So that's a great way to share information. But the other one is, for example, the summer school. The first one was in 2016, I believe, and now we have had already eight or nine editions of the summer school. It's a week-long school in Sardinia, a great location. It gives a very comprehensive education to the students. We cover molecular dynamics, docking, free energy calculations, quantum mechanics, and we always expand the program a little. For example, now we also have AI topics. And as for what you mentioned about this generation of scientists, we were so pleased to hear from previous students who attended the school. They sent us unsolicited emails saying: thank you so much, your school was really helpful for me to get my PhD, and now I got my first tenure position at some university. We got a few of those already, and for us, this is one of the biggest appreciations that we get from our work. So the summer schools are indeed very valuable. One of the things there is that we have these follow-up feedback surveys with students, a year and two years after the school is finished, where we also try to see how useful what we taught them was for their career. We also have career sessions in the school as well. And later we see the students going to pharma companies, to universities. Eventually they come back to us in other projects, conferences

milosz_1_04-07-2024_100233:

Right.

Track 1:

creates a family for us.

milosz_1_04-07-2024_100233:

Yes. Yes. That sounds amazing. We can only advertise it, and people, if you know someone who could benefit from such training, tell them, you know, spread the word.

Track 1:

One other activity we started last year is what we call the ambassador program, across Europe and expanded across the world also, where we have representatives of BioExcel in every country in Europe who will be our connection with the local communities, because we have limited resources. Although the user base is really huge, we cannot provide support to everyone. And specifically with some communities, we don't always have a very direct link and we don't know what their needs are. So when we are organizing a new school, for example, we can tailor the sessions to the needs of the local communities if we know what they need. That's why we have these ambassadors across Europe who are organizing local regional events for, say, three or four neighboring countries in a certain region. They do a survey to see what the research is there and exactly what they would like to get training on. And then we provide this training to them.

milosz_1_04-07-2024_100233:

We both come from Central or Eastern Europe, right? So I think we know firsthand how the community can be centered around just a few selected countries in Europe. So yeah, it is very important to have this outreach to countries that are not traditionally part of the most connected, most collaborating blocks.

Track 1:

Yes. And it's very helpful for them to get access to the expertise that we can provide. It also helps these people to get to know each other. What we saw now with the ambassador program is that some people from the ambassador council now want to organize separate, non-BioExcel-affiliated events, because they met and got to know each other through working with us. So this is very important for all of Europe, to strengthen the links.

milosz_1_04-07-2024_100233:

Nucleation of local centers. That's great. That's very good to hear. Okay. So, Rossen Apostolov, thank you so much for sharing your stories and your insights into this amazing collaboration. I hope it keeps on going and we hear more great results from BioExcel.

Track 1:

Thank you, Milosz. And congratulations again on your initiative with this podcast. I hope a lot of quality presentations will be presented there.

milosz_1_04-07-2024_100233:

Thanks a

Track 1:

I'm looking forward to the new episodes.

milosz_1_04-07-2024_100233:

Thanks a lot. It was a pleasure to talk to you. Have a great day.

Thank you for listening. See you in the next episode of Phase Space Invaders.