Phase Space Invaders (ψ)
With the convergence of data, computing power, and new methods, computational biology is at its most exciting moment. At PSI, we're asking the leading researchers in the field to discover where we're headed, and which exciting pathways will take us there. Whether you're just thinking of starting your research career or have been computing stuff for decades, come and join the conversation!
Episode 30 - Zan Luthey-Schulten: Whole-cell modeling, integrating biology through computation, and why honest collaborators are the best
In Episode 30, Zan Luthey-Schulten tells us the story of her most ambitious project over the last fifteen years or so: creating whole-cell simulations. In a reminder that true science knows no boundaries, she ties together a whole range of scientific disciplines - hardware optimization, stochastic calculus, reaction rates, advanced Hamiltonians, synthetic biology, cell imaging data, to eventually approach bioengineering and medicine. Zan also shares her stories and thoughts about how to pick collaborators, how to work together, and what we computational scientists can learn both from each other and from the broader field of biology or biophysics.
[00:00:00]
**Milosz:** Welcome to the Phase Space Invaders podcast. After a long break, we're back with episode number 30, and my guest is Zan Luthey-Schulten, the Murchison-Mallory Professor of Chemistry at the University of Illinois Urbana-Champaign, which you'll frequently hear abbreviated as U of I. In the unlikely case you don't know Zan, her second surname, Schulten, should remind you of her late husband Klaus, whose untimely passing almost a decade ago has been one of the biggest losses for our field. But despite Zan working close to atomistic simulations or bioinformatics for many years, first on protein folding and then on the biochemistry of tRNA, her most ambitious project over the last 15 years or so has been creating whole-cell simulations. You heard that right, simulating entire cells, and the amazing story of that development is what we're talking about today.
So as you'll be able to tell [00:01:00] from the over an hour long conversation, or even more if you ever met her in person, Zan is an incredible storyteller, and the way she's talking about the challenges and solutions to them is a testament to a great scientific spirit and her wholesale dedication to simply outstanding science. She's famously an avid collaborator, so expect lots of acknowledgements and references. I tried to get all the names right in the transcript, so look them up there if you need them.
Another thing that might strike you while listening is the blending of essentially all scientific disciplines into a single project that requires hardware optimization, stochastic calculus, reaction rates, advanced Hamiltonians, synthetic biology, and cell imaging data to eventually approach bioengineering and medicine. And I think it's a beautiful reminder that even though no single person can understand or learn everything, true science really knows no borders between individual fields and [00:02:00] Zan's career highlights that.
A side note, um, we've had a few technical issues while recording this one, which is one reason it took so long to have it out. If anything sounds off, which I hope it doesn't, this is entirely my fault. This is also the longest episode so far, so I welcome any feedback on which time format you like more. This year, I will try to keep the episodes coming every other week; doing it weekly wasn't sustainable. And with that, let's not delay it any further.
Hope you enjoy our conversation.
**Milosz:** Zan Luthey-Schulten, welcome to the podcast.
**Zan:** Thanks for inviting me.
**Milosz:** So you oversaw a long history of development in computational [00:03:00] biophysics, starting with how the theory of protein folding unfolded in the nineties with Peter Wolynes, and then with an excursion into the biology of tRNAs. But in the last, I believe, 15 years or so, you've mostly focused on the computational modeling of entire cells. And, uh, talking to accomplished scientists, I often notice the brilliance of insight into when the right moment is to start a massive undertaking in a direction that, you know, had never even been possible to explore before. And my guess is that, uh, simulating a cell is something that has been on people's minds for decades.
Perhaps even going back to von Neumann's cellular automata. But do you remember any particular time when you realized we had enough resources and experimental data to actually pull it off?
**Zan:** Hmm, that's an interesting question. So, you know, I keep on remembering a, a quote from Stefan Zweig, sort of about [00:04:00] decisive moments in history. And he said a lot of the important things that you do that are lasting, they seem to come in just, uh, infrequent moments of inspiration. And it really is a bit like that in that, you know, when I was younger I had to learn a lot about physics and, uh, chemistry and biology.
And I was very fortunate I had somebody like Carl Woese when I was here. He, you know, he discovered the third domain of life and he was a colleague here at the University of Illinois in the life sciences department. And then, uh, Peter Wolynes, who'd actually been a classmate of mine at Harvard when we were getting our PhDs.
So I had worked with Peter on the protein folding, that was very, you know, molecular oriented and seemed like something we could do, uh, that there was enough information, as you know, in the sequence, maybe not in the [00:05:00] structure database, that has since come. And as you know, obviously there's enough, because that's why AlphaFold 3 is so successful.
There's just a lot of sequences, a lot of structures. But we looked at it more from the physics perspective, the energy landscape perspective, like what drives it down a particular pathway to, uh, lower energy. And, and then with Carl, uh, I got more into looking into sequences, but that was of the whole ribosome, because he used that to even set up the tree of life that we all know.
So he was comparing it and then, uh, it's just not enough to do the ribosome. You have to think about all the translational machinery. So some of my first grants with the National Science Foundation were on the evolution of the translational machinery, and that's a very fundamental process in a cell.
So by that, you know, looking not only at the sequence, I had to [00:06:00] convince Carl we should also look at the structure, because some of your sequence alignments are probably not right, uh, or have small errors here or there. And it was about the time that the Nobel Prize for the ribosome structure was given.
So he used to come into my lab and sit there and look at my 3D screen when I'd overlap the structures of two ribosomes, and I'd go: see, Carl, the alignment from structure doesn't quite agree with the sequence. And just trying to combine the sequence and structural world was a big thing at that time.
And, you know, but if you look at the whole evolution of the translational machinery, you know, you have to have a somewhat similar code. Although, I think, uh, there's a standard code that most people know, but then there are like, you know, 30 possible codes where you've used some of the codons to do some other substitutions.
But, um, and it seems natural. Yeah. It better be pretty similar if organisms are gonna [00:07:00] share information. So every time I started out on a sort of a major area, you know, like with, with Carl, I'd say it was more like the genetic information processing and all the molecules that played a role in that.
So I was, you know, looking at the structure. I even did molecular dynamics simulations on, uh, like the aminoacyl-tRNA synthetases, how they interact with tRNA. That became a module in our VMD, so that you could compare sequences and structures. So I was very proud of that. And more importantly, doing that work had me start, uh, interacting very, uh, strongly with our programmer that we had up at the NIH Resource for Macromolecular Modeling and Bioinformatics, uh, that had been started by my husband, Klaus Schulten, very famous for his work on molecular dynamics.
And, uh, what, uh, what we did there is I [00:08:00] added many of the bioinformatic tools and really tried to connect the two worlds of sequence and structure. And then, you know, every time you do this, you start looking at, uh, how completely do we know this process? I've looked at the charging of the tRNA, I also did it quantum mechanically, so we could add, um, a QM module to our molecular dynamics code.
And then you look at, you know, the, uh, elongation factors and you start asking yourself, well, is there enough information about all those? Sure, there's always some depth, uh, you know, aspect of it that's missing, but when you then start going and thinking about doing the whole cell, well, you know, you've gotta get genetic information processing covered and you also have to get metabolism.
Now I come from a chemistry department, so it's sort of natural that I look at reactions, right? So I did that and then, uh, because again with Carl, I'd looked at archaea [00:09:00] as well as bacteria, I started getting into comparison of all the metabolic networks. What's the same in archaea?
Is it anything similar, like, to what's going on in bacteria? You know, they have a whole different way of running their energy economy, most of them are called like methanogens, so they use, uh, methane. Anyhow, that got me into looking at what are the sources, what's out there that would allow me to compare whole pathways even.
And I'd say there's just so much work that has been going on. The, the KEGG, or the work that was done in Japan, those guys were really leading it, trying to, uh, you know, collect all the information on, on pathways. So I knew where to go if I wanted to look and create a model for metabolism.
So, okay, got that. I know where the structures are. I know where the sequences are, even if I don't have all the [00:10:00] information about any one cell, which you never do. I don't care if you work on E. coli. And I used to think it was a big mistake to work on E. coli because you just interpreted the whole world through E. coli.
And the
**Milosz:** We're all guilty of
**Zan:** It, you know, really, I think now one realizes there are Gram-negatives and there are Gram-positives. And so you really need to do a little bit of comparison, and thank goodness a lot more work started to be done on, uh, Bacillus subtilis, sort of the model system for the, uh, Gram-positive bacteria. Again, you have to just be aware of evolution at every aspect of it, 'cause then you know how to use the data. Like, oh, if I can't find it in this system, maybe I can find information about it, somebody has studied it over in another system. And the one thing that really helped with Carl, uh, was when we were looking at [00:11:00] this evolution of the aminoacyl-tRNA synthetases.
Because he felt they're the next step. You know, they, they set the genetic code, but, uh, it's like, you know, if you think of evolution like an onion and the ribosome's in the middle, what about the next layer that's feeding into it? Does it have the same canonical pattern? Uh, no. And then you start realizing some organisms have taken over perhaps a whole pathway or, or at least major steps in it from another domain of life.
So you start seeing mixing, going away from those canonical patterns. And that told me already, hey, I can use all the information that's out there. I try, when it's possible, when I start a project, to look at how they're, uh, related, maybe, to some other organisms. But as you know, when you do these evolutionary comparisons, they do it mostly on the ribosome.
And then maybe you concatenate that with putting all the genetic information processing in it [00:12:00] so you can push the, the evolutionary comparison fairly far. But most of the time what happens, like if you're working on certain bacteria, you better just look at all the bacteria, everything that's known about those sequences and structure within that sort of clade or, um, uh, you know, the closely related ones. And we did. So when we started, my next question was, well, what's missing? Dynamics, of course. How am I gonna get subcellular dynamics? Because already, what was that, uh, Elowitz and colleagues had, oh, even before 2010, started to look at, uh, some processes that you'd have to call stochastic.
Whenever you have a few numbers of something, the fluctuations become really important, and he demonstrated that in some classic experiments. I was, uh, friends with Sunney Xie. He was a professor [00:13:00] there at Harvard and he was doing these just incredible, beautiful experiments where he would look, like, at the lac genetic switch,
uh, in, I think, E. coli, or he would then make up libraries where he could label a thousand different proteins in E. coli. That's not enough, by the way. Uh, and he did experiments where he'd do light sheet microscopy on it, get the distribution of any protein within his population. So that became beautiful.
And then of course you start learning systems biology. And then again, I was extremely lucky. It was just at the pinnacle of this beautiful work from, uh, Bernhard Palsson and colleagues. You know, if you look at one of the others, uh, that sort of led us in trying to do a whole-cell simulation, it was Markus Covert.
And he had been, I think, a postdoc with, uh, Bernhard [00:14:00] Palsson. So I love his books. I love his tools. That meant I could try to build together a metabolic pathway, and that's what our first paper was on, when we worked on E. coli. That pathway, all the metabolism had been pretty well worked out, but that's when I came to appreciate, oh my God, you may know the network, but the fluxes are so different through parts of it.
Let's find where the main flux is. And E. coli was very interesting from that perspective, uh, 'cause I had control of the metabolic pathway. We had these beautiful experiments from Sunney Xie, um, and we could start putting the two things together. But that was very limited. That was like on a certain lac genetic switch.
But we could at least understand where, you know, the carbons are going, let's put it that way. Like when you take in sugar, what happens to it? [00:15:00] Well then, as I said, I felt it was way too complicated, even E. coli. So I looked for something simpler, and then came the Craig Venter Institute. They were gonna make a minimal bacterial cell, and damnit, that thing had less than 500 genes.
I invited them to give a talk at the, I think it was even the second synthetic biology meeting. It was in Germany, where I was on sabbatical in 2015. I was talking to, uh, Clyde Hutchison, and they were getting ready to publish their Science paper on this minimal cell. And it's in synthetic biology 'cause they are synthesizing the, the genome and have tried to remove all or as many of the, uh, non-essential genes as possible.
That's also a very important process. I don't care what organism you look at, [00:16:00] there are too many genes. Like, my God, E. coli has over 5,000 genes. And we still don't know what about 1700 of them do. So try to find out what's essential under some conditions that you can manage. And so that's how I felt about this minimal cell.
We still had some where we don't know exactly what they do. The number's getting smaller. We have structure predictions of everything thanks to AlphaFold 3, right? But, um, we still don't know a few of the proteins that do seem to have an essential function. They are membrane proteins, so I don't know if it's an alternative way of getting things into the cell or out of the cell, but, uh, alright, they made this cell, they were beginning to characterize it and I thought, well, if I cannot simulate a minimal cell, then I can't do anything.
And that seemed to just be the right number. So you just start all over again. The first postdoc went and developed the essential [00:17:00] metabolism of this organism. That was Marian Breuer, uh, he came from Germany and he was so methodical, and we already had to change that metabolic network.
We did steady state analysis. It seemed to agree with the growth rate that Craig Venter was measuring. It was fairly complete in that, you know, it didn't have what people call, like, holes, where you have to say there is a gene, we haven't identified it yet, but there must be something connecting it.
So it was a pretty good metabolic network. Plus that paper had the proteomics for this organism. It also had the essentiality tests done with transposon bombardments, um, plus the, the metabolic network and the steady state fluxes. And so I go, well, now we're ready to try a whole cell simulation. So we wrote a series of papers going [00:18:00] after, first, the genetic information processing, the kinetics for that.
Again, one was helped along the way by some beautiful work that had been done by others. I could take their models for, say, making a polymer. You know, uh, I think it was, uh, Lieberman and, oh, oh no, there's another, Hofmeyr, uh, rates and, uh, like once you start making a polymer, whether it be the messenger RNA, the DNA or even, like, uh, the protein, there's, you know, a certain sameness to the process and you can use a general formula that they had.
We modified it so, you know, depending on whether binding of, like, the enzyme or the monomers to the system is an equilibrium, or does, like, the polymerase bind first and then you have an elongation rate. So those were established over another series of papers, like in Frontiers, in molecular biophysics, and then [00:19:00] Frontiers in, I think, developmental and cell biology.
And most importantly, each of those papers came about through association with another experimentalist. So where did we get the shape of the cell? Well, voilà, cryo-electron tomography. Elizabeth Villa, who is also out at UCSD, she used to just cross the street and come over to the Craig Venter Institute and hand off the samples almost in the parking lot.
And her student did the first couple of tomograms of that. So we were fortunate to have, um, images of the cell. Uh, at that time it was harder to get really good images, uh, of this. Fast forward now to today and go look at the Chan Zuckerberg Biohub, their Imaging Institute, and they have a hundred tomograms of the minimal cell.
**Milosz:** Oh wow.
**Zan:** You know, uh, before we were lucky to get a couple, but [00:20:00] we had enough that we sort of saw the small system and then, uh, when it was much bigger, about twice the volume. And what do you see? In cryo-electron tomography, besides getting the shape, for the small enough things, you see the ribosomes.
So we also had the ribosome number, right? Now we're getting into genetic information processing. So we have good initial conditions, right? And it just, uh, so we had that in hand. Uh, we had the number, we had enough that we could extrapolate how it changed with size, and that's where the work of my, uh, my students really came in.
Uh, we took the tomograms and, uh, we analyzed them. We got the ribosome count. We also started looking at whether we have all the molecules, biomolecules we need to, to do this processing. And that's what the first couple papers were about. There was genetic information processing, then there was one just on tomography, [00:21:00] and then came yet another collaborator, Remus Dame in the Netherlands.
He, he did the 3C maps. So we had an idea how much structure is in, in the DNA, and the answer is very little. It's not like you're looking at a 3C map of E. coli at all.
**Milosz:** These are, for our listeners, the contacts within DNA, right, between different,
**Zan:** That's right, that's what a 3C map does.
I'll just call it a DNA contact map. So this part of the DNA interacts with another part of the DNA and, uh, so we were trying to understand, well, why is that? And again, you are working with a minimal cell. So there are not that many proteins that interact with the DNA.
So that really, uh, brought in yet another collaborator and, uh, so for us, it's essential to have extremely [00:22:00] good experimental collaborators because I need the information. Sometimes it's input, but also sometimes it's used to validate. So the same way, like with the tomography, you could use the small cell to start off with the initial state. You had the proteomics. You knew what's essential. You could, uh, come up with an initial state, and then you had the DNA, uh, contact map, which told us there's very little supercoiling. So replicating the DNA became an easier task to do, uh, but still it took a very gifted, uh, new physics graduate student who solved that problem.
Actually, uh, the first pass through was done by a student that came to me after he did a master's in mechanical engineering. So we developed a Hamiltonian for the dynamics of the DNA.
And then we knew this wonderful work of Cees Dekker, [00:23:00] again at TU Delft, who had measured the few remaining essential proteins that interact with DNA. Two of them are, uh, the SMCs, the structural maintenance proteins, that form like a big loop around a section of the DNA and ratchet it through.
So it, it really helps with getting large scale segregation. But you can also get entanglement. So you better have topoisomerases or a gyrase there, and you do. So those are the two essential proteins that this system still had, and through Cees's work and through the modeling he did with the colleague in England, I think his name was Michaletti,
we looked, we modified it slightly. We, we felt we had some better insight on the modeling part, but we were able to start replicating it as the system is growing. And now, how did we get it to grow? Well, we had the [00:24:00] metabolic network for lipid metabolism. We also had the, you know, the model for making proteins, particularly those membrane proteins.
Beautiful work had been done on what's the rate of inserting those proteins into the membrane and how much of every step, whether it's in metabolism or protein, uh, membrane insertion, is ATP-driven. So we had the energy economy sort of to, uh, to test if we're getting that right, more or less.
Uh, a, a colleague named, I think, Lynch had written an interesting article about what's the cost of replicating a gene. But the big missing number in there was, uh, getting the membrane proteins in. Now, this minimal cell, because it is really so minimal, it has to bring in nucleosides. It doesn't make amino acids, it has to bring them in too.
So you have a lot of membrane proteins that have [00:25:00] to be there to import the things it needs. So in a few steps, like, you start with a nucleoside and then in two or three steps you make up the nucleotides or the deoxynucleotides. Um, for comparison, E. coli I think takes about 47 steps to do that.
Uh, we do it in three, or this organism does it in three. Yeah. So you can model those and, and the kinetic parameters. Well, there's BRENDA, which is sort of the main, uh, repository of kinetic information. I had used that both for E. coli and yeast and saw it improve dramatically. But when we were searching it, we would search it based on a phylogenetic, uh, argument, trying to find parameters of any organism that's, or as close as possible to, the one that we're looking at, at least the, the parent one.
But when they simplified the genome, the Craig Venter Institute at that time had not done the transcriptomics, so they weren't [00:26:00] exactly sure about where to eliminate it. In fact, their technician told me when I asked, well, how do you know where to cut it? Well, we know where the genes are, then we take enough, whatever that meant.
So it, it turned out they did a pretty good job, but in one place they affected the control of expression for HU, actually it's called HUP, and that's why in our organism it had like less than 30 and it should have had 3000.
**Milosz:** And this is a histone per...
**Zan:** Yeah, this histone-like, uh, protein. So that was, uh, a key to understanding, at least for us, why it didn't, you know, really support pervasive, uh, supercoiling, which is also another way of controlling gene expression. So, uh, I think Remus Dame is going to repeat the experiments once they get the HUP gene added back in. [00:27:00] It's harder to do. Um, so they built something called Syn3B. So what, what the Craig Venter people published in 2016 was Syn3.0.
The trouble is, I told them, that thing has an inconsistent morphology. We saw that from Elizabeth Villa's work. Um, and if you can't get similarity in each generation of it, no physicist or biophysicist is going to want to touch it. And it's also so difficult for us to model. So, uh, that's why the paper that we wrote in eLife was so important, with Marian Breuer. Not only did we have the essential metabolism, we looked at what is really essential. We did the proteomics, and in doing that they had added about 20 genes back to Syn3.0. And then they went on and wrote some subsequent papers that said, well, did we really [00:28:00] need to add all those 20?
Uh, it turns out about half of them would've done it. Um, but one of the things they did add, by the way, were the majority of the genes they had taken out in the so-called dcw operon, which is the division and cell wall operon. So they had taken out FtsZ and SepF,
**Milosz:** Oh, yes.
**Zan:** uh, which are, I think, uh, FtsA was still there.
Those, uh, are the key ingredients to making the septum. So that's why sometimes it'd be very long and sometimes it would just bud off. Anyhow, they're back. And now we started to get more consistency in it. But we ourselves, even in our latest simulations that we've submitted, uh, for publication on the 4D simulation of the cell over the entire cell cycle, we could not come up with a good kinetic model for the formation of the septum. [00:29:00]
Um, there, too many of the parameters were unknown, and there's not an easy way to test it. So what we did is we turned to another really good experimentalist, Taekjip Ha. He had been here at the U of I and then Johns Hopkins, and now he's at Harvard Medical School. And they were able, with help from my colleagues here, to label the DNA in the cell, the membrane, and the FtsZ.
So, uh, the, uh, Venter Institute was able to put a label on the gene for the FtsZ, uh, I think mCherry, and they imaged it and we could catch it in sort of a beginning state and then as it goes into early pro late, and then when it doubles the volume. So we had that. But in looking through all the different cells they had,
they noticed, oh, sometimes there's even a cell within a cell. So this [00:30:00] organism, uh, still can have some difficulty, I think, but we're actively looking at this now, with the, um, rate of the surface area growth to the volume, if it gets outta step, which you can sort of understand if you've ever used the so-called Helfrich Hamiltonian to minimize cell surfaces.
And we, we've done that. We looked at it and worked on it with a colleague named Weria Pezeshkian. He's now in the physics department at Copenhagen, but he used to work as, I think, a postdoc with Siewert Jan Marrink, who's another one of our collaborators. 'Cause being a theoretician, yet another nice way to judge your model is to try to make an all-atom model of it.
So I was going to do that with my husband, but then he passed away. Um, as he put it, I'm gonna check your results, Zan, and, uh, so can you build a cell with all that stuff in [00:31:00] it, as you say? Uh, and not even a full year after my husband passed away, I was at a CECAM meeting when Siewert Jan Marrink came up and asked me, do you want to collaborate?
And we'll do it with the Martini model, which is a coarse-grained atomistic model. And I go, oh, sure. Well, we published in, in 2023 a perspective, uh, that it should be possible to do molecular dynamics on an entire cell. It was a perspective, but in 2024, at the first computational workshop of our newly funded Science and Technology Center for Quantitative Cell Biology, we made, uh, not quite the whole cell, but did a mini version, and we did it in a workshop in a week.
I mean, that's all due to Marrink's lab, uh, and particularly a very gifted graduate student, uh, physics graduate student, [00:32:00] uh, Jan Stevens, who's coming to work in my lab now as a postdoc. So we can go back and forth between our two methods, because for me to do an entire cell cycle along with the changes in morphology
and the replication of the DNA, I can't do it at an all-atom level. I mean, I can give them snapshots and they can put it in an all-atom or coarse-grained atomistic model, but I do it at a particle model.
**Milosz:** Yeah, exactly. I wanted to interrupt here, if you allow me, just to discuss the, the format of your model or the initial setup of your model. 'Cause as you alluded to, you had, uh, kind of a continuous space diffusion element to it, right? And there was a stochastic component where particles were moving in, in space stochastically. Uh, if you don't mind elaborating on where it came from and how it was put together.
**Zan:** Um, you know, I call it hybrid stochastic-deterministic methods, because for most of the metabolic reactions [00:33:00] we're talking, even in the minimal cell, of millimolars of something, you don't need to worry about the fluctuations in that. But when you only have a few messengers of a certain, uh, gene, or one or two copies of the gene, then the fluctuations that you see make a big difference to the outcome that you're going to have in terms of the protein distribution.
So, um, when we do the full 3D, it is what they call a reaction-diffusion master equation. So, uh, the master equation indicates we're doing something with the kinetics and the rate of change of states of the cell. So what we define as a state of the cell, uh, would be like, okay, how many ribosomes are there? They are an object, they have a certain size, they'll take up so much space in the cell.
We start with the full 3D, and then 4D [00:34:00] as we put in the time, uh, simulation of the cell by imposing a lattice onto the cell, where we have a certain size for the sublattice, uh, domains.
And so this thing is about a half a micron across. So you pick a lattice size that is maybe like 10 nanometers. Well, if you know the dimensions of a ribosome, it's bigger than our lattice sites, so you have to have other rules for how you move something like the ribosome over. Otherwise you pick, you know, the lattice size such that, uh, in any one move, an object that was in one lattice site can only move to the next one.
In your time step, you don't want it jumping over several of them. It's a much more complicated probabilistic model to calculate. So ribosomes are taken out of that and handled slightly differently. [00:35:00] The metabolism reactions are handled through ordinary differential equations, 'cause the cell is so small that you really don't have a, you know, you don't need a partial differential equation for that. You know, you assume they're all, you know, well mixed, but it's controlling the level of your metabolites. And the proteins that go into that, of course, are being created over in the genetic information processing of your cell.
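As a rough editorial illustration of why millimolar metabolites can be treated deterministically while a handful of messengers cannot, the following sketch counts molecules in a spherical cell and applies the Poisson estimate of relative noise, 1/sqrt(N). The spherical shape, the 1 mM concentration and the copy number of 2 are illustrative assumptions; only the half-micron size and the "millimolar" scale come from the conversation.

```python
import math

AVOGADRO = 6.02214076e23  # molecules per mole


def molecule_count(conc_molar, diameter_m):
    """Number of molecules at a given molar concentration in a spherical cell."""
    radius = diameter_m / 2
    volume_litres = (4 / 3) * math.pi * radius**3 * 1000  # m^3 -> litres
    return conc_molar * volume_litres * AVOGADRO


def relative_fluctuation(n):
    """Poisson estimate of relative noise: sigma / mean = 1 / sqrt(n)."""
    return 1 / math.sqrt(n)


# Assumed example values: 1 mM metabolite in a cell 0.5 micron across,
# compared with a gene that has only two messenger copies.
n_metabolite = molecule_count(1e-3, 0.5e-6)
n_mrna = 2

print(f"metabolite copies: {n_metabolite:.0f}")
print(f"metabolite noise:  {relative_fluctuation(n_metabolite):.2%}")
print(f"mRNA noise:        {relative_fluctuation(n_mrna):.2%}")
```

Under these assumptions the metabolite is present in tens of thousands of copies with noise well below one percent, while two messengers fluctuate by roughly 70 percent, which is the intuition behind splitting the model into a deterministic and a stochastic part.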
So you're treating, uh, lipid metabolism, central metabolism, nucleotide metabolism, import of, uh, amino acids and any cofactors that you need with one methodology. And then the genetic information processing, so that's DNA replication, transcription, translation, is being done within the RDME.
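The rule that a particle may only move to a neighbouring lattice site in one time step can be written as a simple constraint. The sketch below uses the standard lattice random-walk relation, where a particle with diffusion coefficient D on a cubic lattice of spacing h leaves its site with probability 2·d·D·dt/h² per step in d dimensions. This is a textbook estimate, not necessarily how the group's own software chooses its time step, and the diffusion coefficient of 1 µm²/s is an assumed value; only the 10 nm spacing and the half-micron cell size come from the conversation.

```python
def hop_probability(diff_coeff, spacing, dt, dims=3):
    """Probability that a particle leaves its lattice site in one time step."""
    return 2 * dims * diff_coeff * dt / spacing**2


def max_time_step(diff_coeff, spacing, dims=3):
    """Largest dt for which the hop probability does not exceed 1."""
    return spacing**2 / (2 * dims * diff_coeff)


SPACING = 10e-9        # 10 nm lattice spacing, as quoted in the conversation
CELL_DIAMETER = 0.5e-6  # about half a micron across
DIFF_COEFF = 1e-12      # 1 um^2/s in m^2/s; an assumed, illustrative value

sites_across = round(CELL_DIAMETER / SPACING)
dt_max = max_time_step(DIFF_COEFF, SPACING)

print(f"lattice sites across the cell: {sites_across}")
print(f"maximum time step: {dt_max * 1e6:.1f} microseconds")
```

With these numbers the cell spans about 50 lattice sites and the time step has to stay below roughly 17 microseconds; faster-diffusing species or a finer lattice would push it lower still.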
So you can have, like, the messenger: once it's formed, it has a [00:36:00] fate. Like, well, what can it do? It can diffuse through the cell and either be degraded at a degradosome, which in our case we know is anchored to the membrane, or it can diffuse to the ribosome and get translated into a protein. So, uh, we also had an idea about how many of those degradosomes there are.
It's a bigger complex, but it has like one endonuclease and two exonucleases that can chew up the mRNA, say from the midpoint or from a position interior to either the five prime end or the three prime end. We had all that
**Milosz:** Right. Anyway, every protein goes from the gene that is transcribed to being translated, actually
**Zan:** That's right. So if you think about genetic information processing, my initial state is, as I said, why proteomics was so important. Well, and we had it almost genome-wide [00:37:00]: of the 452, uh, genes that code for proteins (you know, the rest are coding for rRNA and tRNA or tmRNA), um, we had like 425.
So we took a guesstimate of the missing ones from other systems. And said, okay, 10, we know nothing, and uh, so we knew how many polymerases we had. You know what the stoichiometry of the complex is, so how many intact polymerases do you have? They have to go and diffuse to the beginning of the gene on the DNA in order to make the messenger. And then there's an average rate of elongation that we have, which is dependent, of course, on the length that you're copying. So diffusion and binding to it, and then elongation. Very important. And then actually, in a simple [00:38:00] view of the proteomics, at the end of the day, it's the balance of the competition of the messenger binding to the ribosome versus getting destroyed by the degradosome that determines, more or less, the proteomics that you see. So one of our tests is: at the end of the cell cycle, have we doubled the protein count? And it's peaked around two, but there are some [00:38:28] that we don't make quite as many of, and there are some that we make a little bit more of. And there's a little bit of a correlation between those ones that have, uh, what shall I say, bad, uh, proteomics values. Like if you said there should be, uh, 10 and it comes up with 40, looks really bad. Uh, but maybe it really should have had one, you know, for all we know, right? So, um, the ratio, we call it the scaled proteomics. Uh, it would be wonderful if they had single-cell proteomics on the system. They don't; you do mostly single-cell proteomics and single-cell transcriptomics on eukaryotic cells.
That's getting to be standard, and that's wonderful because even out of that, you can get what's called a pseudotime, like for transcriptomics. So if you look at enough cells in different states of the cell cycle, you can get an idea [00:39:28] about the pseudotime and order the data. Well, we can't do that here, but we do have transcriptomics data, and as usual with bacteria, the amount of mRNA is much, much less. It's, you know, a handful of messengers. I think in general, one messenger can make, uh, at most something up to like, uh, eight proteins, you know, before it gets destroyed. So there are a lot of things
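
*Editor's sketch (not from the episode).* The competition Zan describes, where each messenger either reaches a ribosome and is translated or reaches a degradosome and is destroyed, can be caricatured as a sequence of independent trials. The sketch below assumes a well-mixed toy model with two made-up rate constants (`k_translate`, `k_degrade`); under that assumption the number of proteins per messenger is geometrically distributed with mean `k_translate / k_degrade`. None of these names or numbers come from the whole-cell model itself.

```python
import random


def proteins_per_messenger(k_translate: float, k_degrade: float,
                           rng: random.Random) -> int:
    """Count translation events for one mRNA before it is degraded.

    At every encounter the messenger is translated with probability
    k_translate / (k_translate + k_degrade); otherwise it is degraded.
    """
    p_translate = k_translate / (k_translate + k_degrade)
    count = 0
    while rng.random() < p_translate:
        count += 1
    return count


def mean_proteins(k_translate: float, k_degrade: float) -> float:
    """Analytical mean of the geometric distribution sampled above."""
    return k_translate / k_degrade


# Hypothetical rates chosen only so that the mean is a handful of proteins.
rng = random.Random(0)
samples = [proteins_per_messenger(4.0, 1.0, rng) for _ in range(20000)]
sample_mean = sum(samples) / len(samples)

if __name__ == "__main__":
    print(f"analytical mean: {mean_proteins(4.0, 1.0):.2f}")
    print(f"sampled mean:    {sample_mean:.2f}")
```

The spatial model replaces these two lumped rates with explicit diffusion to ribosomes and membrane-anchored degradosomes, but the same balance sets the protein yield per messenger.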
**Milosz:** It's also polycistronic, right? You have, like, correlations between multiple proteins made from the same messenger, right?
**Zan:** Uh, yeah. One of the things I looked at, because I had a colleague who, you know, when I was making more analytical models say of genetic information processing, we just tried to solve this, uh, make it well stirred, and then you can write down a chemical master equation and in some cases you can solve those.
I joke, I tell my students, do you know Stochastic Chemical Kinetics? That was [00:40:28] the title of a really important paper written by a leading statistical mechanist named, um, McQuarrie, oh my God, back in the seventies. But nobody took it seriously,
**Milosz:** I remember the textbook from McQuarrie.
**Zan:** Yeah, yeah. No, everybody knows his stat mech book, but he wrote this beautiful article, and he was writing down how you would write these chemical master equations, and could you solve them analytically, and when should you get a different answer, uh, doing things stochastically than doing like an ODE analysis of the kinetics. Very, very important work, in my opinion. Of course, there's the Gillespie algorithm that came along. His work is just, uh, we use it all the time. And thank God I got to meet him once before he passed away. I was at a Gordon conference where they were, uh, honoring my husband and also honoring him, and unfortunately he passed away like six months later.
But I was talking to [00:41:28] him about how we had used it for various processes within the cell, and I felt really good when he was giving us all his blessings for it. Um, although I will argue that the equations are best derived for gas-phase reactions, um, but, you know, you make certain assumptions as you go to, you know, condensed media to be able to do that.
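
*Editor's sketch (not from the episode).* For readers who have not met the Gillespie algorithm, here is a minimal direct-method implementation for the simplest well-stirred example: one mRNA species that is produced at a constant rate and degraded in proportion to its copy number. The rate constants and the function name are illustrative assumptions; this is not the group's production code, which also handles space and many coupled reactions.

```python
import random


def gillespie_birth_death(k_make: float, k_decay: float, n0: int,
                          t_end: float, rng: random.Random):
    """Direct-method stochastic simulation of 0 -> mRNA -> 0.

    Propensities: production = k_make, degradation = k_decay * n.
    Returns the event times and the copy number after each event.
    """
    t, n = 0.0, n0
    times, counts = [t], [n]
    while True:
        a_make = k_make
        a_decay = k_decay * n
        a_total = a_make + a_decay
        if a_total <= 0.0:
            break  # nothing can happen any more
        t += rng.expovariate(a_total)  # exponentially distributed waiting time
        if t > t_end:
            break
        # Choose which reaction fires, weighted by its propensity.
        if rng.random() * a_total < a_make:
            n += 1
        else:
            n -= 1
        times.append(t)
        counts.append(n)
    return times, counts


# Hypothetical rates: steady-state mean copy number is k_make / k_decay = 10.
times, counts = gillespie_birth_death(5.0, 0.5, 0, 50.0, random.Random(0))

if __name__ == "__main__":
    print(f"{len(times) - 1} reaction events, final copy number {counts[-1]}")
```

Running it repeatedly with different seeds gives the cell-to-cell spread in copy number that a deterministic ODE for the same reactions cannot produce.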
But anyhow, to go back, uh, we could solve the chemical master equations in some cases, for like a single gene or a couple of genes, analytically. And many beautiful things came out of that. Uh, it did also depend on, you know, where that gene was sitting, 'cause how many copies of the DNA did you have?
And then the second copy of the gene, does that have all the same, um, transcription kinetics that the first copy had? Well, we just assumed it did, and I had a colleague who was trying to measure them, and said, [00:42:28] no, in fast-growing or slow-growing E. coli, there could be a difference. So we looked, and there is some difference, like when you make the second copy of the genome, and as they get further and further apart, of course you're gonna see more and more differences.
But they're not huge, 'cause the cell is only, you know, it grows from a diameter of about, you know, 400, say, to 500, and you've doubled the volume, and then it starts going through making this, uh, dumbbell shape. So, uh, up to doubling its volume.
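
*Editor's note (not from the episode).* The doubling is easy to check by hand if one assumes a roughly spherical cell and reads the two diameters as nanometers, which is consistent with the "half a micron" quoted earlier. Volume scales with the cube of the diameter, so:

$$
\frac{V_{\text{end}}}{V_{\text{start}}}
= \left(\frac{d_{\text{end}}}{d_{\text{start}}}\right)^{3}
= \left(\frac{500}{400}\right)^{3}
= \frac{125}{64} \approx 1.95
$$

A 25% increase in diameter is therefore already close to a factor of two in volume.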
We have many different ways of looking at our answers. And one of the nicest, of course, was these experiments of Taekjip Ha, uh, from the fluorescence of the protein, uh, of the DNA, the membrane and the septum. So, uh, just fantastic that we had that.
And from those measurements, we saw that it was symmetric cell [00:43:28] division. So we just, uh, we let the growth come from our lipid metabolism, um, and also the membrane metabolism. But we just assume that it is symmetric. So in that case, that part is what I would consider agent-based. We, uh, impose symmetric cell division.
Uh, we've almost got the model for the septum formation working. And I think that, together with the Helfrich Hamiltonian and the work with Pezeshkian, will perhaps allow us to get to any state of the cell that we see. So when I say state of the cell, for us it has to be like how many ribosomes there are,
um, and then how many of each protein, the, uh, number of every metabolite that you have. And, uh, so you transcribe something, you have changed the state of the cell. You've [00:44:28] gotten one more messenger than you had before. Or one dies, gets degraded: you have one less. You make a protein, then you've increased the number of proteins by one; that is a different state.
So, uh, what you're connecting in those reaction-diffusion master equations are transitions between these various cell states, and some of the information is being calculated from, um, you know, an ordinary differential equation for metabolism. Now, how do you pass information back and forth between these?
We wrote a whole paper how to do this,
and there we were working with, yeah, with computer scientists. so, uh, radi, uh, airon is very, very good. Uh, he was giving our program, to a rotating student in his lab just as we were writing this one paper. And he was automatically checking. our algorithms.
So, what happens, [00:45:28] um, is, we wrote a paper, I think it's for IET Systems Biology in 2018, on how you let just a chemical master equation talk to an ODE. And so, uh, maybe you do some processes for a while with the CME, uh, again 'cause it's stochastic, the variations in the time that a reaction takes place.
And then after a certain time tau, the communication time, that information gets given to, say, the ODE. So if the CME is telling you, well, how often do you make this one protein, then that number of proteins is communicated down to the ordinary differential equation, because it needs it for metabolism.
You know, it's concentration dependent. So, and then you do the ODE for a while, then what does that do? It changes the metabolite number, and then you go and pass that information back to the CME, or to the master [00:46:28] equations.
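
*Editor's sketch (not from the episode).* The hand-off Zan describes, stochastic steps for a while, then an exchange at the communication time tau, then deterministic steps, can be written down as a toy loop. Everything specific here is an assumption made for illustration: one protein species made stochastically, one metabolite governed by `dM/dt = k_cat * protein - k_use * M`, a saturating dependence of synthesis on the metabolite, explicit Euler integration, and all rate constants. It is a sketch of the general idea, not the method published by the group.

```python
import random


def cme_window(protein: int, metabolite: float, tau: float,
               k_make: float, rng: random.Random) -> int:
    """Stochastic protein synthesis over one communication window of length tau.

    The metabolite level is held fixed during the window; that is the
    approximation a hybrid scheme makes between exchanges.
    """
    propensity = k_make * metabolite / (metabolite + 1.0)  # toy saturation law
    t = 0.0
    while propensity > 0.0:
        t += rng.expovariate(propensity)  # waiting time to the next synthesis
        if t > tau:
            break
        protein += 1
    return protein


def ode_window(protein: int, metabolite: float, tau: float,
               k_cat: float, k_use: float, n_steps: int = 100) -> float:
    """Explicit Euler integration of dM/dt = k_cat * protein - k_use * M."""
    dt = tau / n_steps
    for _ in range(n_steps):
        metabolite += dt * (k_cat * protein - k_use * metabolite)
    return metabolite


def run_hybrid(n_windows: int, tau: float, rng: random.Random,
               protein: int = 1, metabolite: float = 1.0,
               k_make: float = 2.0, k_cat: float = 0.5, k_use: float = 1.0):
    """Alternate CME and ODE windows, exchanging state every tau."""
    history = [(0.0, protein, metabolite)]
    for i in range(1, n_windows + 1):
        # CME side: protein count changes, metabolite frozen.
        protein = cme_window(protein, metabolite, tau, k_make, rng)
        # ODE side: receives the new protein count, updates the metabolite.
        metabolite = ode_window(protein, metabolite, tau, k_cat, k_use)
        history.append((i * tau, protein, metabolite))
    return history


history = run_hybrid(n_windows=20, tau=0.5, rng=random.Random(1))

if __name__ == "__main__":
    for t, p, m in history[::5]:
        print(f"t={t:4.1f}  protein={p:3d}  metabolite={m:7.3f}")
```

The choice of tau is the design question: too long and the stochastic side works with stale metabolite levels, too short and the exchange overhead dominates.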
**Milosz:** Do you also have to keep track of things like post-translational modifications and interactions between proteins that change the states of proteins, like turn them on and off? Or is this model.
**Zan:** Yeah, some of those things we have. Um, some of those things, um, we've just submitted, um, for a special issue of the Journal of Physical Chemistry B, one on, um, assembling now all the complexes in the system. And it turns out many people know what the complexes are; they don't have very much information about the assembly kinetics.
Now, again, I'm really lucky 'cause I worked with one of the world's best experimentalists on this and made a model for ribosome biogenesis, which is already in our work. And that was Jamie Williamson from Scripps. He is sort of the guru for, uh, ribosome formation, and he did it on [00:47:28] bacteria. For the small subunit, that's where we wrote a paper together.
And so he did these pulse-chase experiments. So let's say, to assemble the small subunit, you have, you know, the 16S, the 23S, the 5S, and then the proteins start binding one after the other on it. And for the small subunit, it's been much clearer, um, the steps of that, ever since this beautiful work of Nomura in Japan years ago, when he would just add the proteins and you could see the size change.
So, uh, he would put in, like, let's say, one of the first primary binders is S4. You put that in, and now you start titrating in the next one. Uh, then you can see how it changes. Then, with these pulse-chase experiments, we took his data and we made what looks almost like a folding funnel for the ribosome, of kinetic pathways, and there are several paths that lead to Rome.
But you get almost so [00:48:28] many states. When we did that first paper, uh, it was surpassing our capacity to put it onto the GPUs to do it. So we trimmed it down and took just the main pathways, and now we try to get it down to the one with just the main flux. Do we get stuck in any one intermediate where we're waiting for a protein to be made
so it can be added onto the, you know, the growing small subunit? And then one of Jamie Williamson's, uh, postdocs, Joey Davis, is now a professor at, uh, I think MIT, in the chemistry department, and we used his model for the large subunit, from measurements that he had done. So we had models, and that's the most complicated, uh, complex to form, but it's the one where the kinetics of the individual assembly steps had been really well studied.
So [00:49:28] we could do that. Some of the others, nah, not so much. But we looked at a range of assembly kinetics, like it has to be faster than this, and if it's slower than this, then you won't get enough intact proteins at the end of the cell cycle. So what we haven't done to my complete satisfaction is look at the ATP synthase.
Oh my God, that's a beautiful, complicated machine. Oh, Walker rightfully got the Nobel Prize for getting that thing. Do you know how hard that is to assemble? And I just noticed, I don't think they measure how often there are incomplete assemblies of it.
**Milosz:** Oh, yes, we do work with that, but never with the assembly itself,
**Zan:** Yeah. Well, as you, well, it's not in every organism, but in ours and in many, there's like one operon that contains each of the components, but they're [00:50:28] not in one-to-one-to-one. No. In ours, it looks like we have to get about 10 of these so-called C subunits in the membrane, um, to, you know, be made, diffuse there, assemble, before you put on the external units.
Oh my God. Now, proteomics said we had no C subunits. If you put that in, the cell dies, of course. 'Cause I had a student years ago who had done that in E. coli, and he goes, it doesn't survive, something's wrong. And I go, it's an AND rule, right? And you put in zero. Oh my God. Uh, no wonder, right? Uh, and I think 'cause it's not a very big protein and it's a membrane protein, those are still very difficult to do with proteomics.
So, we put in the complexes, and then we're sort of trying to get an idea: well, at what rate do you have to form the F0 subunit, the F1, so that you can [00:51:28] get an intact ATP synthase? And you still will only get maybe up to 80% of them completely assembled, right? So, um, there are many places we've made predictions.
I would love it if the experimentalists could give us more insight, like, how often is the F, do you have more F0s than F1s? Or if it's incomplete? I don't know what experiment exactly you'd have to do, uh, to do this, but it'd be great if somebody could tell us, yes, you know, at certain times of the
they're incomplete.
**Milosz:** So given the complexity of, you know, each and every individual assembly stage, where do you think you can take this model? Like, what would be the ladder of complexity that you are envisioning? Going towards a eukaryotic cell in the end?
**Zan:** That, that's
**Milosz:** Is that the
goal?
**Zan:** So I feel like we've learned enough from here, and also [00:52:28] really standing on the shoulders of biochemists and decades of information that is out there, I think we had enough to attempt a model, enough to tell us what still needs to be confirmed. Let's put it this way. Right. Uh, the others have more complex networks. We have almost no regulation left in the cell. So this is just, if you get things to work, sort of what's the range of parameters you have to be in in order to have the cell grow and divide over the cell cycle and double its population.
And as I said, the big breakthrough for us was also the physics student, uh, Andrew Maytin, who was able to put the DNA replication onto its own GPU. So now you have to have it communicating with the rest of genetic information processing and, uh, the [00:53:28] ODE to do this. But that enables us to do the whole cell cycle within about four days.
So we're going to be learning a lot just on this, what needs to
**Milosz:** On how many GPUs, sorry, just to have an idea of the resources involved. How many GPUs are you using for a single simulation like
**Zan:** Uh, so you need at least two, and then they should be pretty good, uh, GPUs, like A100s from Nvidia. And they continue to help and to advise us, because my main programmer at the NIH was John Stone. He is a GPU guru. He was the main developer of all the VMD, and for us to make our movies in VMD, he enabled us to put in these little cubes to represent the lattice representation.
I think I told you already at the meeting that Nvidia tripled his salary and hired him away,
**Milosz:** Yes.
**Zan:** but he continues to help because [00:54:28] he knows this is really frontier work. I mean, uh, so many people asked me about this article that was called The Virtual Cell, in Cell, uh, and that was on eukaryotic systems.
Can you get enough imaging data and omics data, like if you have transcriptomics, proteomics, metabolomics, can you predict the state of the cell? Or if the cell's gonna undergo cell death, maybe you could predict cell death. But I tell those people that I also feel you need to understand the underlying chemistry to understand why it goes from one state to the next. Because I think for eukaryotic systems, you do have a lot of these alternate pathways, and particularly when it transitions into an unhealthy cell, it is because something has gotten screwed up, uh, with one of those pathways, and you can perhaps understand and try to create a [00:55:28] better solution to it,
if you can go in. And, you know, first, uh, I'd say treat the symptoms if you can. And then meanwhile look at what is causing this other pathway to disappear. Or, if it's a branched pathway, why does it go down one pathway and not the other? 'Cause you need both products, you know, something like that.
I think it just opens up the possibilities of finding cures so much more. So, in this STC, the Science and Technology Center that we have, uh, my associate director is the head of the Cancer Institute, and he's very interested in these transitions to unhealthy cells. So, now that I feel that we have a good basic understanding of the cell biology, at least for bacterial cells, we're trying to go forward looking at eukaryotic cells.
And first we're doing yeast, just because there have [00:56:28] been consortiums working on yeast for over, oh, two decades. I think there's even a thing called the yeast book. There are so many good yeast databases telling you everything, from, under these conditions, what are the metabolites? You know, it provides all the omics data you need.
And for a theoretician, I'm telling you, those are called initial conditions for us. We can put in the complexes, we can, uh, but we have to have reliable omics data to start it. Uh, and the difficulty in going forward, in my opinion, 'cause this is what I see happening to us with yeast: you know, people already know how metabolism is sort of split up between the various organelles.
That's all right. How the organelles interact with one another exactly? Hmm. There you have some insight. I'm very lucky. I work with another fantastic experimentalist, [00:57:28] and her name is Julia Mahamid, and she has a very good postdoc, Marie Spindler, at the EMBL, and they're the ones that told us, aha, where the ER is in yeast.
So if we have like a switch there, like a galactose switch, which has a transporter, which is a membrane protein, you know, that messenger had better be translated on a ribosome that's embedded in the ER. And their work is now so good that we can see the difference. Like, people used to try to do predictions:
how many ribosomes are within a certain distance of the ER? Maybe 40. Marie and Julia only counted it if that ribosome is sitting on the ER and oriented in the right way to be translating a protein. So that now goes into our model. What is going to be difficult and seems to [00:58:28] really need a lot of work is the folding of the 16 chromosomes
in the nucleus. There are various groups who are trying to do that. They use all the data, like ChIP-seq or these contact maps, um, to get an idea of what parts of the chromosomes are touching one another. And what about between the chromosomes? That's helpful. Uh, but it's a much bigger calculation.
So right now we're evaluating, trying to use other people's programs to come up with various packings of the DNA into the nucleus. Um, what's sort of a little bit less understood is what parts are more towards the inner lamina of the nucleus, you know, um, and then how it's folded and how things get expressed.
You know, and I've already looked at a model, like, of speckles inside of [00:59:28] the cell, and then the nucleoli. The condensates start playing a much bigger role, because you're just not making, uh, like, ribosomes at a standard rate. And so that's why, you know, you have a collection of all the intermediates sort of sitting there.
Um,
**Milosz:** We don't have a great grasp of what happens in the condensation, right?
**Zan:** Yeah. And you know, it's just trying to, uh, find what are the best experiments, what gives you the best kinetics. We've tested it ourselves a bit, trying to get an idea, well, it has to be faster than this, but slower than this. But what's missing for us oftentimes is a nice experiment.
You know, cell biology experiments are difficult to do, and, you know, if you have to label it, then, you know, you don't know if you've changed the dynamics of it. Uh, but label-free also has its problems. And again, we're very fortunate here [01:00:28] at the U of I, we have, uh, an NIH institute for label-free imaging.
Uh, and I know what they can do and cannot do. And we also have colleagues who will label it and look, because, uh, even before our Science and Technology Center was started, I worked with a very good colleague of mine, Martin Gruebele, to write two grants to finance getting a MINFLUX microscope here. Now, that was the one that was developed by Stefan Hell's lab in Goettingen, and he of course shared the Nobel Prize for super-resolution imaging. So those were the techniques that would allow you to see these fluctuations in the cell. You know, you don't use these stochastic chemical kinetics if you can't measure the variation. So he had that, but he could speed the whole damn thing up by a factor of 10, um, with the [01:01:28] MINFLUX microscope.
So it has tracking of, like, a hundred microseconds. The resolution is one to two nanometers. You know, that's like what we get from cryo-electron tomography,
**Milosz:** That's incredible. Bonus.
**Zan:** Yeah. So, techniques have come up, several new ones, or there have just been advances in them, that have now suddenly opened another couple of windows. Like, uh, with cryo-electron tomography, they're coupling it with, of course, machine learning and trying to get more data out of it, 'cause it didn't use to show you so well where the DNA is. It still is not great, but
**Milosz:** Now we can see,
huh?
**Zan:** Yeah. Oh my God, it is so much better. You also have soft X-ray tomography. You can see the whole cell in one shot, but only at 45 nanometers. So you don't see the ribosomes, but you see the organelles, and you certainly don't see the ER. So, [01:02:28] from the perspective of a theoretician, I gotta have friends in several camps.
So, what can I get from cryo-electron tomography? What can I get from soft X-ray tomography? And then the biochemists for some of the experiments. And, uh, then try and synthesize it all together.
**Milosz:** It is absolutely incredible, I gotta say, how you created this whole network or graph of collaborations over the years. Do you
**Zan:** You
**Milosz:** have any ideas for how this human environment shapes, you know.
your research? Maybe to wrap up the discussion with a more human side of, of research.
**Zan:** Well, it's funny you asked about setting up this network. Yeah, I'm pretty proud of it, and, as I say, that also has to do with my being older. You know, I've worked with many of these people along the way, and what you value most when you have a collaboration is that you're very honest with each other.
Either you deliver it or you don't deliver it, and you say so. And you also [01:03:28] say what qualifications you feel you have on the data. And I think because of that, even though we don't at times even share a grant together, working together helps each other. Uh, and that's invaluable. So you don't have to pay me to work with certain people.
Just talking with them is such a pleasure, because they're so open and so honest about the work. And, you know, as scientists, we just wanna figure out how the damn thing works, and then I'm hoping that other people can take that information and really figure out, well, what happens when it goes wrong? How does it become unhealthy? I tell my colleagues that are funded by the NIH, I cannot study a particular disease, because I'm such a hypochondriac, I'll get it. So I will be more than happy to work
**Milosz:** I.
**Zan:** for [01:04:28] the reference state. Right. And I can think about what would cause it to, you know, go off equilibrium, but don't ask me to really concentrate on one particular disease.
So, I think, uh, the other question I often get asked by my coworkers who join the lab: how do you keep in contact with all these people? Well, come on, folks. What did we all do during COVID? We zoomed, and Zoom got much better, right? But it still doesn't replace just getting in a bloody plane and going over and visiting one another.
And we do that. I try to do that once a year. I'll go visit a certain set of collaborators. Some of them I know really well, but like with Jamie Williamson, when I wanted to work on the ribosome biogenesis, I just took my student with me. We went to Scripps and we made an appointment.
We saw him, then we stayed [01:05:28] another day, worked with his postdoc to get all the data that we needed, and then we just made a model. Then we were invited back, presented it at Scripps, got sort of their blessings, and then we went on. You know, that's more or less how you do it. And as I said, I think what I was blessed with at the beginning, like to look at genetic information processes, you cannot underestimate how lovely it was to have Carl Woese here to talk to, you know?
I would start working on something and I would make a prediction, and he'd go, oh, Zan, that's a great idea. Then he'd get up, run to his office, come back with a paper that he had written about it 20 years ago. Then it was 10 years ago, and then five years ago, and then finally I came up with an idea and he goes, nah, you know, I never thought about that.
Right. And if I needed
**Milosz:** Takes time. Yes.
**Zan:** if I needed information, he would call up a colleague, 'cause he was so well known. And then, plus, [01:06:28] when he got the Crafoord Prize for the Tree of Life, or the discovery of the Archaea, it became even easier. And the other thing that helped us was, at least on this minimal cell, one of our collaborators was, um, Hamilton Smith.
Now, Craig Venter had paid him a lot of money to come and help start, with Clyde Hutchison, the synthetic biology group at, uh, JCVI. And in fact, our model to initiate DNA replication comes from some work he had done. So, uh, this was the importance of this molecule, this multi-domain protein called DnaA, binding near the origin.
And there are multiple domains. Some of it binds to the double-stranded and some of it binds to the single-stranded DNA. You have to have enough bind to the single-stranded that you get a filament that stabilizes the bulge to get the machinery in. Now, you can even check this with [01:07:28] AlphaFold 3, 'cause it also allows you to look at first the interactions between the DnaAs and then the DnaAs with the DNA.
And I gave this now as a homework problem in my, uh, class on concepts in biophysics, but that was only possible this year, right? Uh, 'cause AlphaFold 3, it's great. It's still not perfect, but, you know, there's a way to go about processing this. So it takes a while to get to the top of any field, so that you understand where the reliable data is,
who is doing this work, and what is still unknown. And then it's like, we'll try and make a prediction, and then oftentimes that will at least help the experimentalists: uh, well, we can perhaps look at that, you know. But there are also limitations to what they can do at the moment, and a [01:08:28] disruptive, uh, technology has to come along that suddenly allows you to get tracking a factor of 10 faster or get more information out of it. Maybe there are some things you just have to wait for, so that you can get an idea about something happening. But, uh, as I said, I've never enjoyed science more than I do right now, just because I finally know enough and I know enough people.
And, uh, I have some colleagues that keep wanting me to write a paper about, oh, what are the next important biological questions to answer? And I go, what am I, the Pope, you know, or what Hilbert was for math problems? No. I have a better idea for bacteria, but I think with, uh, eukaryotic cells, we're getting there. And I will say them particularly to my students and postdocs, 'cause I hope they will be the ones that [01:09:28] will lead that revolution into whole-cell modeling for eukaryotic systems. I think the imaging is helping enormously, and institutes like the Chan Zuckerberg Biohub, oh my God, have they contributed data, uh, under the leadership of Steve Quake.
I mean, there was a beautiful paper in Science a couple of years ago, I think Tabula Sapiens, where they were giving you the transcriptomics data for 500,000 cells from 20 organs, uh, from people of two different ages. He's done it now for, um, proteomics as well. So again, as I put it, the initial conditions and some intermediate points are there; it remains the business of the theoreticians
to come up with the right physics, the kinetics, the mechanism, and, uh, put it together and see if you can do it. And, you know, when our latest paper gets through the review [01:10:28] process, we always give out our software. But I have to say, it wasn't written so modularly before. You know, now we've spent a lot of time on it, so if you wanna come in and apply it to another bacterium with a little bit more complicated metabolic network, that module can be easily modified, and you can do that. If you have more information about the, uh, complex assembly, you can do that. So the thing is very modular, so hopefully that will invite young people to come forward and just try it.
**Milosz:** Yes. I think if we have this situation like with molecular dynamics codes, right, where you can just take a tool and put your own system inside, it's gonna be a really, really exciting field
**Zan:** No, it will, you know, and look how many great, uh, you know, in the heyday, at the beginning of molecular dynamics, you had so many. The theoreticians at all the major universities were contributing. Like, what does it mean to do an [01:11:28] NVT simulation? What does it mean to do an NPT? How do I keep the pressure constant?
How do I keep the temperature constant? Uh, what do the moves have to look like? You know, uh, you're totally right, and I think that's going to come with the cell simulations, together with machine learning and the AI. You know, I have computer scientists looking at our cell simulations, analyzing them, because there are so many moving parts.
There's metabolites, there's proteins, there's DNA states, and how their correlations are going. Um, I'd rather have them tell me what the phenotypes are than try to look at some correlation map, you know?
Uh, and they have ways of doing that. And so I think it's just gonna be such an exciting frontier.
But still, at least I tell my students, we are the lucky ones. We have the tools to go all the way down to the molecular level [01:12:28] and then up to the pathways, and then to the whole cell. Do it. Just don't start relying on the end product. If you want to test if that could even be possible, like with this DnaA interaction, I was just so happy when AlphaFold 3 gave the same answer that Hamilton Smith had come up with.
And, and you know, that
**Milosz:** I see.
**Zan:** studied here at the U of I, mathematics, by the way, when he went out to California, he decided he was not going to get the field prize in mathematics. Switched into biology and got the Nobel Prize or shared it for the discovery of restriction enzymes, which he proceeded to take out of the minimal cell, you know?
Uh, and he goes, he once said to me, Zan, if it would help you, I can put some things back in. And I go, no, no, no, no, no, no. That would only complicate my life. Right. But he was [01:13:28] wonderful. He could check our math, so to say. He was as interested in. the theoretical models we did as the, well, as the algorithms.
He was just an incredible collaborator, and it sort of ended abruptly, uh, with COVID, because, uh, his wife became ill and then he moved with her back to the East Coast. Uh, but I'm telling everybody who gets started, it's so important, those initial contacts you have, the people. And look at lucky me, I worked with Peter Wolynes on the protein folding funnel and, and Jose Onuchic.
And then I had Carl Woese for genetic information processing, uh, and then Cees Dekker and Taekjip Ha on some of the measurements. I mean, I've just been so fortunate to work with people who, how should I put it? Their experiments I can believe. Right, and then I will [01:14:28] figure out how I can use that information, if not as input, at least to test my simulations.
And that gives one so much confidence.
**Milosz:** Now I want to say, this conversation is very optimistic in the way that there's a promise of enjoying science more with time, with experience. And we have the promise of going into cellular models in the computational, uh, sphere, right? So using computers to maybe one day test new drugs, new therapies, this is really, really exciting and groundbreaking and, well, it really makes people excited for science, hopefully.
**Zan:** I find it really exciting, and each of my collaborators, like with Siewert Jan Marrink and Erik Lindahl, they have the Martini force field in GROMACS. You know, uh, hope my husband forgives me for not getting it into NAMD, but we're working on that too.
**Milosz:** It'll be there. I'm...
**Zan:** It'll be there eventually, but I'm saying, if you're coming from an MD world, then I think MD of a, [01:15:28] an entire cell is now possible.
And it opens up so many problems that have to be solved. You know, what works, what doesn't work? Where do we need to do it better? I also had a fantastic experimentalist who's just coming in within a few days. Again, we're gonna give a summer school together. He did all the lipidomics on the minimal cell. So how do you think I checked whether my lipid metabolism was working right? It only makes two lipids, and he measured them. Right. And, uh, the guy is fantastic, you know, and his lab is just growing and growing and growing. So is Siewert Jan Marrink's, you know; you now get more people applying to your lab than you can really handle.
'Cause I like to keep my group small enough that I can talk to everybody in the week. Right. And that's, for me, it's important. I'll never get a huge group. Maybe that's my downfall. I don't know. And also to have the same excitement and sense of [01:16:28] cooperation within the group.
You know, you'll know who's leading an effort. There's a, a student who was with me, he is now a postdoc up at the Beckman Institute, Zane Thornburg. He has been fantastic at, uh, working on the models of genetic information processing. And then along came Andrew, and then they are just a wonderful pair together to get it, or Andrew taking it onto this other GPU.
And then with the help of some advice from NVIDIA, you know, just like, oh, they can give you advice, but it's like everything. If you don't take it, that's your problem.
[01:17:04]
**Zan:** Well, I'm extremely optimistic that we're in the next age, that, you know, if you think of all the technology, and some of it is indeed disruptive and it's really changing things quickly. And I think the ability to get all the information about a cell, you know, and that in terms of omics data: transcriptomics data, proteomics data, metabolomics data.
You have the information about what's in there. The maps, the reactions, for the most part are known. But if you look at the human cell, if that's the ultimate eukaryotic system, they are on Recon 3, maybe Recon 4. You can model the human cell as a yeast cell to some degree. And on yeast we're up at Yeast 9.
And it's working, uh, to a certain degree quite well. Anytime you have a consortium, you have [01:18:04] to look at it: at first it improves, and then it goes downhill, 'cause too many people are changing things. But I think we're getting close to a steady state again. So can you use that information? Yes, you can.
And then there are so many new techniques, as I said, single-cell omics. I'd love to get more information about subcellular dynamics. That was one of the reasons for the foundation of our Science and Technology Center for Quantitative Cell Biology. Can we combine all of this information coming from omics along with dynamics and the reactions and go forward?
And if you concentrate on just the essential components, I think one can continue to steadily make contributions, but you do have to pay attention to what are the growth conditions. [01:19:04] And, some are easier to use experimentally and some of them, oh my God, this is how it really works in the cell.
That's a little harder. Um, 'cause that environment is so heterogeneous. But, you know, think about it, um, every time I go to a Gordon Conference, I am inspired. Right? I was at one by the way, in Barcelona a couple years ago. And I look up and there's a fellow giving a talk on the MINFLUX where he has measured the kinesin walking on the microtubule.
He's done it in the cell. Well, hmm, hmm. That's Jonas Ries. He was at that time with the EMBL. Now he's at Vienna, and he is also on my advisory board, because who could tell us better how to use the MINFLUX than he? Right. And I knew I had a colleague here trying to do it, and I go, maybe you need to be communicating with this guy.
'Cause he [01:20:04] showed beautiful work. Right? And I tell you, it was jaw-dropping. And you know, the, from Stefan Hell's lab, they've also measured kinesin walking on microtubules. Uh, the first paper, I think, on MINFLUX was about the diffusion of the ribosome in the cell, and they were measuring three different diffusion coefficients.
And think about it: that could correspond to different states of the ribosome, free with no messenger, maybe in a polysome, you know, so you start understanding what they're seeing. And you can, if you want, you can put that into your model. Right. Um, so as I said, I think it is just looking at what is a major step forward, and then trying to, you know, zoom in on a few of those things is going to be the answer.
And for me, going forward with eukaryotes, I'm gonna try and do first an understanding of yeast. And then I think if I understand that, that I [01:21:04] can move on to other human cells. But you know, the problem there is what human cell to do.
And if you look at the people who study cancer, they would love you to study breast cells, 'cause breast cancer research is very advanced here. But I think any area where there's a lot of people focused, uh, and they're doing excellent work, uh, that's a place to at least look at the information and try to make use of it. And I'm just hoping that with the present cuts in the NIH funding that we're having over here in America, that hopefully, uh, enough of the information and, uh, and people working on cancer research, that that will survive.
And, uh, continue to provide us with lots of information. So, uh, if I were a young person, I'd say, get into this field, but please, if you're a physicist, [01:22:04] learn a little bit of chemistry and biology. And, uh, if you are a biologist, well, almost all of them know how to do Python. Write a program and then one can share.
No, you're laughing, but sharing the data back and forth with my collaborators is very important. It's just like, you call that a spot? Uh, really? And, uh, we went back and forth looking at each other's data. I even got to use some of their, their tools, trying to understand how, yes, I was just above a threshold or below a threshold.
Um, that's very important if you're gonna have this very tight and very honest collaboration, uh, with people, right? And I think one shouldn't be worried about getting credit for it. Uh, the people in the know will figure out who was the main contributor for that. But, uh, like right now when I write a paper on the minimal cell, it looks like a high energy physics [01:23:04] paper.
There are a lot of collaborators that are on that, but I think anybody who would read it would know, oh, the imaging comes from Taekjip Ha, the lipidomics came from James Saenz. Right. And sometimes when the journals, you know how they ask you to give, what are the contributions of the authors? They should let you do a little bit more, uh, more,
**Milosz:** Nice.
**Zan:** there to be really frank.
Okay.
**Milosz:** Yeah, Zan Luthey-Schulten, it's been an absolute pleasure talking to you. I think we've got a lot of inspiration and a lot of interesting stories that you shared with us. So thank you so much for spending this time with me and, and talking.
**Zan:** Oh, if I can get more people into, uh, cell biology and quantitative cell biology, then I have done my job. Okay. Bye bye.
Thank you for listening. See you in the next episode of Phase Space Invaders.