Leading engineering for Postgres on Azure with Affan Dar
CLAIRE: 00:00:00
Welcome to Talking Postgres. In this podcast, we explore the human side of Postgres, databases, and open source, for the target audience of developers who want to hear from other Postgres community members. I want to say thank you to the team at Microsoft for sponsoring the community conversation about Postgres. And I am so pleased to be your host, Claire Giordano. I would like to introduce Affan Dar, who is our guest today. Affan started his career as an embedded software engineer. After studying computer science, he worked at Microsoft, at Zynga, at Meta, and now he's back at Microsoft again. He has worked as both an individual contributor and technical lead, as well as an engineering manager. And he switched back and forth between these roles a few times during his career. Currently, Affan is Vice President of the Postgres team, the Postgres engineering team, at Microsoft. Welcome, Affan.
AFFAN: 00:00:57
Hey, good morning, Claire, and thanks for having me on the podcast.
CLAIRE: 00:01:01
I am so glad that you're here to join us. All right. So let's dive in. Today's topic, the title of the episode, is Leading Engineering for Postgres on Azure. And I guess maybe a good place to start is today, and before we go back in time, and forwards in time. But let's just talk about what's the scope of your job as VP of Postgres engineering at Microsoft?
AFFAN: 00:01:25
Right, yeah. Well, I mean, in this role, I manage three distinct but very related engineering teams. Firstly, there's a team that builds and runs the Azure Postgres services that we offer. That's Flexible Server Postgres, and then the Single Server Postgres, which we're deprecating. Secondly, there's a team that continues to work on open source projects that include Citus and PgBouncer and Patroni and so on and so forth. And then there's a third team that makes contributions to upstream Postgres. So this team has a group of committers and contributors. So in terms of scope of my specific role, these are all pretty smart engineers and product managers. So my main goal, mine and my product management counterpart's, is to make sure that they have a clear vision and strategy and that they're set up for success, and then mainly to stay out of the way of these very smart people.
CLAIRE: 00:02:33
And just to get people listening who might be unfamiliar, when you talk about flexible server, you're talking about the Azure Database for PostgreSQL managed service, right?
AFFAN: 00:02:42
Correct, yes, exactly. Yes, Azure Database for PostgreSQL - Flexible Server.
CLAIRE: 00:02:48
OK, so the managed service, the extensions and other open source projects, as well as the upstream Postgres open source work [That is correct]. And I put the kind of work that my team does, where we focus on open source contributions, but not code, we focus on things like this podcast or the POSETTE event in that third bucket too. Even though I don't technically report into your org, that's kind of, I don't know, where my connections are.
AFFAN: 00:03:21
Yeah, it's definitely a very, very critical function that we have, definitely. I think generally our stance on Postgres open source is that we want to make Postgres a successful database, which includes many different things we need to do. Number one is contributing to the open source aspects of Postgres, but also contributing to the open source extensions that build the ecosystem of Postgres. But also the things that you're just mentioning, the role that you and your team play in our groups, which is basically promoting Postgres, supporting all the activities, building all of the content that you build out. So I think that's a very important part of what our collective team does as well at Microsoft.
CLAIRE: 00:04:12
So do you ever meet people or run into people who are surprised that there's a Postgres team at Microsoft?
AFFAN: 00:04:20
Well, yes. There's always an elephant in the room, which is that Microsoft does have, we do have, our own proprietary database called SQL Server, which people would obviously know about, which is incredibly successful also. So oftentimes, we do get that. Actually, we used to get this a lot more a few years ago. Lately, in the last couple of years, since we started actually making good contributions to Postgres, and again, with the great work that your team has done also, Claire, this has become less and less of a discussion point. Everybody kind of now assumes that Microsoft does do Postgres as well. But yeah, certainly, there has been some kind of, not struggle, but some kind of effort that we had to make to kind of make sure that people understand that, customers understand that, we are very, very serious about Postgres.
CLAIRE: 00:05:21
So what does it mean to be serious about Postgres? So why, I suppose I know why I care about Postgres, but why does Microsoft care about Postgres? What is the strategy and that vision that guides you and your team?
AFFAN: 00:05:36
Right, yeah, I think there are a couple of things here. Number one is that, obviously we, again, as I mentioned, we do have SQL Server, which is doing very well. But we have had a lot of demand signal from customers on supporting open source databases, like MySQL and Postgres. And this happened quite a few years ago, and which is why... Microsoft is a platform, we build products at multiple layers. We have Azure infrastructure. We build IaaS. We supply IaaS services to our customers. We supply PaaS services to our customers also. We build databases. We also run databases. So basically, Microsoft's goal, or Azure's goal as a whole, is to be the computer for the world. It's the world's computer, is what Satya kind of called us. And part of being the world's computer is to be able to run every application, whether it be a very top to bottom proprietary, hey, Microsoft built everything from soup to nuts, or it is an open source database like Postgres and MySQL that we pull in and kind of run as a managed service. So it is important. It is important for the completeness of the platform. And also, customers actually have asked us for supporting this. And customers have their own reasons. Sometimes, vendor lock-in has been a big concern, which is why I think this has become a thing. So I think, getting back to your question specifically, I think it is the completeness of the platform and the offering. We have demand here. We will support it.
CLAIRE: 00:07:14
OK. So you joined this team, I don't know, I want to go back in time now and talk about how did you get into engineering management. You started life as an embedded systems engineer. Is that right?
AFFAN: 00:07:30
Yes. Yes, that's correct. It's a long time ago, Claire.
CLAIRE: 00:07:34
Why embedded systems?
AFFAN: 00:07:37
So I was an electrical engineering major in college. So I think that was a wrong career for me. So I did not see myself as an electrical engineer. So I had more of an aptitude for computer science. So when I was looking around for a role which kind of married the two, my professional ambitions plus my academic experience, embedded software engineering seemed to be a good fit. So I ended up writing a bunch of code for, gosh, it was so long ago, I think it was MPEG-1 encoders and decoders, H.263 encoders, decoders, for the Philips TriMedia processor, which was something that I don't think even exists on the internet now. So yeah, it was fun. It was fun. I think it got me off to a very good start in computer science. So yeah, it was a long time ago.
CLAIRE: 00:08:48
Was open source on your radar or something that you worked with, either in academia or when you were first getting started, in embedded systems engineering? Or did that come later?
AFFAN: 00:09:00
Yeah, that came later. I think my experience with open source is interesting. So the first time I developed with open source was when I worked on this project called the Durable Task Framework, which is a workflow engine, basically, that we--
CLAIRE: 00:09:16
What was it? Can you say it again?
AFFAN: 00:09:18
Durable Task Framework. [OK, got it] Yeah. So that actually was something we created in a hackathon, me and a friend of mine, a colleague of mine. And we were excited about it. So we open sourced it back in 2015, I would like to say. And we use it pretty regularly within our teams at Microsoft. And it turned out to be a pretty interesting piece of tech. There were companies built on top of this, not on this specific framework, but companies built on top of the same idea later. So that was my first experience doing open source. But my second stint with open source was when I was working at Meta. And I was leading the team which built and ran the social graph database, which was all based on MySQL at Facebook. So yeah, that was MySQL. So I actually had a pretty interesting set of experiences in MySQL as well. Lasted about 3 and 1/2 years. And then I came back to Microsoft doing Postgres. So interesting experiences. I've built projects that went open source. And then I worked on MySQL and Postgres.
CLAIRE: 00:10:40
And then I know as we were getting ready for the podcast, you told me that you had switched back and forth a few times between management positions and individual contributor positions. And I think that those decisions about whether to switch gears and solve different kinds of problems, you're still problem solving as a manager, but it's not the same as being a pure engineer. And so why did you make those switches? And how did you make those decisions too? Because that is something I've seen so many people struggle with.
AFFAN: 00:11:14
Yeah, I think the first switch I made, from an IC to a manager, was back in 2007 or 2008. And that was an untimely switch. I think in hindsight, this was not the right decision. I mean, my general recommendation to other engineers also who want to go the path of management is get some good set of experiences as an IC under your belt, because that's very important in being a very effective engineering manager. You need to know what's going on. So the first time I became a manager, I don't think I was sufficiently technically grounded. And I'd not had the sort of experiences that you should have had when you become an EM. So that made me a bit uncomfortable as a manager. I wanted to know more. I wanted to kind of understand in depth what's going on. And it was just something that, actually, I just missed coding, basically. It was too early for me. So I went back. I went back into an IC role, coding, architecture, design, and technical leadership, and things like that. But at some point in your career also, it's that you just want to have an even more outsized impact. And at a certain point, I think the way to do that is also, I mean, obviously, as an IC, you can become a technical lead for a very large project as well. But for me, the path was more management, more coaching, mentoring. So the second time I became a manager was when I figured that to make more impact, I had to kind of drive a group of people instead of just myself. So that was my second stint. Again, people have different paths. Some ICs are incredibly good at leading very large projects. So that's, if that's the path, then that's the path. But for me, the path was management. So yeah, and it was not easy. Becoming a manager, I think, requires a mindset that you have to let go of things that you hold near and dear to your heart, like what's happening in this component, what's happening in this line of code, things like that.
But I think that was the key for me, letting go and treating the team not as an extension of my own coding, but as a disaggregated, decentralized group of people who can take up a task and then fully deliver on it and without me having to know everything about that particular thing.
CLAIRE: 00:14:04
And oftentimes, come up with solutions that surprise you, I would expect. Like, that's the cool thing about letting go. And I know it's hard to let go. I still struggle with that sometimes. But when you do, people do amazing things. [Exactly] That I think, oh, I wouldn't have thought of that.
AFFAN: 00:14:24
Yeah. I think, again, the realization that I think I also got was interesting. Like, it was a very eye-opening event in my career where somebody was explaining something to me. This very, very super, super good IC engineer who was going over, like, there was some issue and he was explaining to me that, "hey, this, this, this, this, this." And I told him that, oh, I didn't know that. And he said, "Affan, there's a lot of things you don't know." And that, I mean, he was joking. But that really stuck with me, like, oh, my god, this guy is so correct. There's so many things I don't know and so many things I don't have to know because there's so many smart people around us, who are smarter than you as well. And so you just need to figure out how to get into a situation where you can trust them and then just do, just checkpoint. But that was, I think, the eye opener for me. "Affan, there's a lot of things you don't know."
CLAIRE: 00:15:29
Well, I think trust is a really interesting thing because I feel like people, and this isn't true of everybody, of course, I have to be careful not to stereotype. But I feel like people do some of their best work when they're feeling trusted, when they feel like their boss has their back, when their boss understands them. And so I'm just curious if you have any, I don't know, perspectives on how do you make your team feel trusted?
AFFAN: 00:15:56
Yeah, I think my mental model, again, it's not like I'm doing it perfectly, but my mental model is just to assume that my manager does the same thing to me. He has the same conversation with me. How would I feel? And that has helped me make some decisions very easily. For example, if there's a project that is not doing super well, or there's some issues there, then should I come in and start micromanaging it? Should I do something else? Should I do coaching and what not? And then I just put myself in, I just assume that I'm the person and my manager is talking to me. And that really helps ground me. All these things I don't want my manager to say. And so I think that has been very useful on that. But generally, I think trusting people and just having a good frequency of checkpointing them and actually coaching them, I think, always goes a long way. People, when they feel empowered, again, no surprise, people, when they feel empowered, they do their best work. Micromanaging or not empowering people actually is very stressful for the manager as well as the person, obviously. And also, the system does not work. The system, you cannot scale out like that. So if you are interested in growing your scope as a manager or as a leader, not even a manager, even as a technical leader, it's very important to figure out, have the right people in the right spots, and then empower them and just let them be.
CLAIRE: 00:17:28
Makes sense. OK, let's switch gears back to your current job at Microsoft, head of Postgres engineering. What keeps you up at night? Is it too soon to ask that? Do I need to wait till later in the conversation?
AFFAN: 00:17:46
No, not many things. I sleep like a baby most of the time.
CLAIRE: 00:17:51
Oh, that's good.
AFFAN: 00:17:52
But you have to do that. I mean, you've got to have a little bit of a thick skin for high-stress jobs, like what we do in databases. But if I were to pick something, it's probably that Azure Postgres is running a lot of mission-critical workloads for customers and just want to make sure they're behaving properly. So I would be lying if I said that I don't sometimes wake up and look at a few health dashboards here and there and just make sure things are working, look at the live site and incidents.
CLAIRE: 00:18:26
In the middle of the night.
AFFAN: 00:18:27
Yeah, I mean, yeah. It's a good, yeah, again, I said I would be lying if I said I don't do that. But mostly it's good. Again, the team is great. Postgres is awesome. So things just work out. But it's definitely a little bit stressful, just understanding the impact of what Postgres has on, I think everybody would relate to that. Postgres needs to be rock solid. A lot of things will not work properly if Postgres did not. So I think that's what, I guess, keeps me up at night sometimes.
CLAIRE: 00:19:05
Boriss Mejías in the chat is commenting that the expression "sleep like a baby" just doesn't make sense. So many babies wake up so many times in the middle of the night. [100%] But I think you meant that you sleep really, really well.
AFFAN: 00:19:17
I sleep like a log, how's that?
CLAIRE: 00:19:19
OK, that's better. I like that one. OK. Well, it's interesting that you say that when you do wake up in the middle of the night, which I understand is rare, that you might occasionally look at the health dashboards. And that just reinforces how important it is for Postgres to have good monitoring and notifications, alerts, metrics, observability, all of the above.
AFFAN: 00:19:43
Exactly. I think there's a lot of interesting things that have come up from that. Like, again, manageability, monitoring, self-healing, ability to diagnose, enough tooling for customers to be able to diagnose problems if they run into them, workload problems, specifically, if they run into them themselves, making people more reliant on their own skills. I think that has been super important. I think these are challenges every big cloud vendor would have in this.
CLAIRE: 00:20:17
Well, so that leads into, are there database challenges with Postgres, or I suppose other relational databases, that are unique to hyperscalers? Running all of these workloads for all of these customers, obviously, we're not the only big hyperscaler. But are there things that we have to think about within the Azure team?
AFFAN: 00:20:40
I think it's probably going to be mostly around manageability at scale, I would say. With the fleet sizes that these large cloud vendors have, I think it is important to be able to run or manage these servers efficiently. But it has to be done with the right controls for security and privacy in mind. We can't peek inside the workload. We can't mess with anything in the customer workload. So how do we manage systems efficiently and make sure that they're running as effectively as possible with all these constraints? So basically, a lot of this comes down to what I was saying just before this, that customers have to have all the diagnostics and tooling to figure out solutions by themselves. The other interesting point is that some of these databases, and I have experience with MySQL also on this one, some of these databases are not really built with cloud scale in mind, which is fine. I think that is the nature of this beast. But it oftentimes makes it hard for us to run this at scale efficiently. So the key thing with these large vendors like us, like Azure, is that we want to be able to leverage economies of scale. And we want to pass on the cost savings to the customers. So how do we securely and performantly leverage common infrastructure for running these servers? So for example, many cloud vendors have disaggregated the storage and compute. And they have gone for a shared storage model and potentially, in some cases, shared compute also. Does the database support disaggregating storage and compute and all of these things easily? Or is it an uphill battle for the database? And Postgres is a fantastic database, as evidenced by its success. But there's certainly some room for improvement in all of these aspects, I believe. I don't know. Better interfaces for storage, improvements to the security model potentially. So really, for example, we're a service operator.
So we'd love to have some notion of a service operator, some security model improvements that can help us build a service operator kind of role so we can securely manage the databases without, again, getting more access to the data itself. So some thinking could be done there. So a bunch of these challenges are thought through upfront when you have a cloud-first database. But these things, I think, kind of lag in some of these open source databases. So yeah. And there might be a few more things. But this is what I think our challenge has been so far.
CLAIRE: 00:23:27
OK. One of the questions on the chat from Adam Wølk is whether working on a managed service that's based on an open source project, do you ever feel constrained as a leader where the upstream open source project might decide to not accept changes or patches and where that could be an issue or an impact or could hold back the managed service? Is that something that's on your mind ever?
AFFAN: 00:23:58
Yeah, I think it's a balance. I think at the end of the day, it's all like we're all solving customer problems. If a customer has a problem x and if the upstream Postgres doesn't have enough controls for solving the problem x, then we have a few options. Number one is that we could, obviously, we can build the control plane around it. We can use the database in a way where it solves that problem. Or number two is that we could lean into and see if there's an extension that could solve that problem, which is a great thing about Postgres. Postgres is good at extensions. So I think that's a great thing. The extension is not available? Can we build an extension that can solve that problem? That's another interesting aspect. If even that is not possible, then can we, then is there something that is applicable to upstream as well? Would it help everyone in the community and industry to solve that problem? If the answer is yes, then definitely we do push on that. But in cases where the priorities do not align and we still have to solve the problem, then we obviously have the option of creating a fork with the intention of figuring out over time how to contribute this back to the community as well. But we always try to keep that as a minimum, because I think our goal is to be as true to Postgres upstream as possible. It has all kinds of benefits. The benefits of vendor lock-in that customers do not want, I think--
CLAIRE: 00:25:39
The benefits of vendor lock-in, that's like an oxymoron.
AFFAN: 00:25:44
The benefits of not having vendor lock-in. Maybe it's a triple negative that I'm trying to pull off here, Claire. Yeah, so that, and then also the fact that if we have any changes that are not upstream, then our rebase time, our time to snap to the latest and greatest Postgres minor and major versions, is higher. So it's in our interest to have it upstream. If all else fails, then we investigate things. For example, maybe it's an extensibility point. Can we still do an extension with a minor addition to the extensibility of Postgres? How can we minimize the amount of work that needs to be done in that one? So it's not like, oh, it's not going to be taken, so let's not do it. I think it's a whole gradient. It's a whole spectrum of all the way from extensions, to control plane, to a bunch of other controls that we have. But our main thrust is always that we want to upstream any and all interesting changes that will benefit everyone. Postgres is a huge pie. We love to contribute to open source, as is shown, hopefully, by our commitment and our contributions over the last few releases.
CLAIRE: 00:27:05
So anybody who's a regular listener knows that I got my start in Postgres when I joined Citus Data. And I worked on the Citus open source extension. But as I learned more and more about Postgres, I remain convinced, and I'm not the only person who thinks this, that part of the health and vibrancy of the whole Postgres ecosystem is enabled by this ability to create these runtime extensions and that ability to, "Hey, I need this feature. It can be a small thing, right?" I need this capability. It's not in Postgres today. But that ability to add that shim and tack that on and create that new thing, whether it's pg_cron, or whatever it is, is so powerful. And I think it's led to a lot of innovation and problem solving and has made Postgres better and better. So that's my bias.
AFFAN: 00:28:01
And Claire, I think to your point also, just to add to that, for example, pgvector, great example. I think Postgres led all of the other relational databases in terms of vector support because there happened to be a pgvector extension that was already there. And would pgvector have made it in the open source if there was not an extensibility point that was available? I don't know. Maybe, I don't think so. I don't know what priority the core team would have given to vector data type support. But the extensibility mechanism is pretty rich, and it enables these kinds of things. So definitely, I think it is one of the strengths, as you said, exactly, the extensibility. Also, it's a double-edged sword, though. I think the QA, the quality control for extensions, is an interesting topic. So how do we maintain, how do we know which extension versions are good, which are not good? A lot of testing overhead goes with it, which is why I think there is only a limited set of extensions that all the cloud vendors support and the ones that we test and validate. So another area that would be interesting to experiment with, or have some good ideas on, is how we do QA. Is it always going to be crowdsourced? Is there a better model here possible for this? I think we can do, [QA for extensions, you mean?] Extensions, yes. How do we do quality control for extensions? All the version mismatches. Does this extension work? Does extension A work with extension B? Sometimes we have seen those problems also. But I think the benefits of extensions far outweigh any of this additional work that needs to be done there. I think it has helped Postgres move faster as a database, which has been very critical to its success.
CLAIRE: 00:30:02
There's a talk I've given a couple of times at Postgres conferences called the Map of Amazing Postgres Extensions You Might Not Know About. And I gave it recently in Seattle at the PASS Summit. And what's interesting is a lot of times, the people who attend this talk are people who are experienced database practitioners, but maybe in the process of migrating over from some other database like Oracle, or even sometimes SQL Server. And maybe they're starting up a new project or new workloads, but their skills are not necessarily in Postgres. So they're not familiar yet, growth mindset there, with all of the various extensions and how to choose between them. And so the question I will often get asked is, well, how do I know which extensions I can rely on? How do I know, and sometimes these people are in on-prem scenarios, so they're not looking to run their app in a managed service yet. They're focused on doing the migration on-prem. And it's hard to answer that question. And my answer will often be, well, I give a little cheat. I say, there's a hack you can use. You can go look at the hyperscalers. Go look at Azure Database for PostgreSQL. See what extensions the managed service providers are supporting. And that gives you a clue as to maybe what's well-maintained and what can be QA'd and is deemed high quality. And then I'll also tell them, obviously, go look at the GitHub repo for that project. See how frequently it's being updated. See how active the team is, how responsive they are to issues that get filed, things like that. Because when you're relying on an extension, what you're really relying on are the people who are working on the extension. And that's who you're trusting. So who are they? Go find out. Anyway, those are the two answers I give. If you have better suggestions, tell me, because I know I'll get that question again in the future.
AFFAN: 00:32:04
Claire, I just look at your talk and then answer this question. I mean, in my opinion, you've hit the nail on the head. I think there's only a heuristics-based approach to figuring out what's a good extension or not, at least in my opinion. But to your point, all of these clouds, the hyperscalers, if something is running in Azure Postgres, then it probably means that we have done some amount of testing and all of these things. So we are confident about the quality of this extension. And if push comes to shove, we will fix things also. We'll commit, we'll make changes to extension also to make it work. So that's always a good cheat sheet. But I guess the question is that beyond these, if you have an on-prem setup and you just want to run an extension, then all the points you mentioned are very valid. It's more crowdsourced. Look at the popularity. Look at the bug reports. Look at the activity on the repos, things like that. But again, it's not a science. It's an art right now. I hope at some point it becomes a science also.
CLAIRE: 00:33:09
OK. So I still want to talk a lot more about your management philosophy and more about your learnings from customers who run their applications and their workloads in the cloud. But while we're on the topic of extensions, I know that Citus was acquired by Microsoft. And Citus has had a few lives within the Azure ecosystem. Originally, there was this deployment option to Azure Database for PostgreSQL called Hyperscale (Citus). And then we rebranded and basically moved that deployment option over into Azure Cosmos DB for PostgreSQL. And that was a couple of years later. And most recently, there was an announcement like two weeks ago. [Right. Yes.] OK. And so now the Citus capability, the Citus extension, is available in Azure Database for PostgreSQL again as a feature called "Elastic clusters". Did I get that right? [Yes, that's correct.] OK. So talk to me. How do you think about that? Why did we do that? Not why did we do what we did before, but why did we do this here now this year?
AFFAN: 00:34:20
Yeah. I think we have a class of workloads. And again, this is completely based on customer demand, like a demand signal. We have a class of workloads that are more like, think of them as maybe multi-tenant or consumer-facing workloads, internet class workloads, internet scale workloads, which are easily shardable. So I think we got a lot of demand from customers that they want the distributed nature or the sharding capability that Citus gives within the infrastructure or server that they're already used to managing and kind of keeping healthy and all of those, monitoring all of those things, which happened to be Flexible Server. Flexible Server is a widely popular Postgres service. A lot of our customers use it. And a lot of these customers wanted this capability within Flex, not as a separate service. So I think we just responded to that demand. And Citus is a great tech. It's a great foundational tech. As people on this forum would already know, you can use it to scale horizontally. You can use it to run distributed SQL queries. And if the data model is done a bit carefully, then it can pretty much scale linearly. As you add nodes, you can keep adding nodes. It will rebalance your shards across this cluster and all of that. So it was basically a demand signal from customers on this one. So I think that's the main thing. I think of Citus as an enabling technology, which is currently being shipped in multiple Postgres offerings within Microsoft. At some point, I think we'll have more capabilities showing up in Citus also. There's a roadmap for Citus that, I think, either is published already or will be published. And those will benefit all the services that are supporting Citus at this point. Did that answer your question?
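[Editor's sketch] The sharding and rebalancing idea described above can be illustrated with a toy model. This is not Citus code and does not reflect how Citus is implemented; the shard count, the node names, and the functions `shard_for`, `place_shards`, and `rebalance` are all hypothetical, chosen only to show why rows keyed by a tenant stay together and why adding a node means moving only some shards.

```python
from __future__ import annotations

import hashlib
from collections import Counter

SHARD_COUNT = 32  # hypothetical fixed number of shards, chosen for illustration


def shard_for(tenant_id: str, shard_count: int = SHARD_COUNT) -> int:
    """Map a distribution-column value (here, a tenant id) to a shard via a stable hash."""
    digest = hashlib.sha256(tenant_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


def place_shards(nodes: list[str], shard_count: int = SHARD_COUNT) -> dict[int, str]:
    """Initial placement: spread the shards round-robin over the given nodes."""
    return {shard: nodes[shard % len(nodes)] for shard in range(shard_count)}


def rebalance(placement: dict[int, str], nodes: list[str]) -> dict[int, str]:
    """Move shards from the fullest node to the emptiest one until
    the per-node shard counts differ by at most one."""
    new_placement = dict(placement)
    while True:
        counts = Counter({node: 0 for node in nodes})
        counts.update(new_placement.values())
        fullest = max(nodes, key=lambda n: counts[n])
        emptiest = min(nodes, key=lambda n: counts[n])
        if counts[fullest] - counts[emptiest] <= 1:
            return new_placement
        # Pick the lowest-numbered shard on the fullest node and move it.
        shard = next(s for s, n in sorted(new_placement.items()) if n == fullest)
        new_placement[shard] = emptiest


if __name__ == "__main__":
    before = place_shards(["node-1", "node-2"])
    after = rebalance(before, ["node-1", "node-2", "node-3"])
    moved = [s for s in before if before[s] != after[s]]
    print("shards per node:", dict(Counter(after.values())))
    print("shards moved:", len(moved))
```

In this toy model the rows themselves never change shard when a node is added; only the shard-to-node assignment changes, which is why the amount of data moved is a fraction of the total.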
CLAIRE: 00:36:35
Yeah, absolutely. I mean, it's interesting for me, having been with Citus for so long, the project. I obviously used to spend, I'd say, 100% of my time focused on Citus. Nowadays, more of my focus is on the Postgres open source project more generally, right? So Citus is a smaller piece of what I focus on. But it's been interesting to see us try to figure out at Microsoft, where's the right place for it to land? Where do customers need this as an enabling technology? And given the growth in Flexible Server on Azure Database for PostgreSQL, I'm really glad to see Citus kind of come back into that platform. And I'm not... Is it push button easy yet to go from a single-node Flexible Server to adopt elastic clusters? Or is that something that comes later, where it gets to be push button easy?
AFFAN: 00:37:36
That comes later. And that's part of our plan. Right now, it is in public preview. As we evolve towards GA, we plan to have a lot of interesting options whereby you could easily convert your single-node workloads into multi-node workloads. But it's a bit tricky, though, Claire. The one thing that I think we have to work around is that there's a data model that works very well for sharded database architectures, and there's a data model that does not. So it depends, really. And this is the same constraint, I guess, that every other distributed database has. If you are able to provide some information to the system on how to manage your shards, then we work very well. If there's no information, then we try to work as well as we can. But yeah, I think the trick in migrating from a single-node workload to a multi-node workload is going to be figuring out that thing. If the migration process or the update process can give us enough hints, then we can do a really good job. So I think it will come down to that. As you can imagine, it's not as simple as, oh, just add another node and it will magically convert.
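The point about "hints" can be illustrated with a small Python sketch. This is an assumption-laden toy and not the Citus planner: the shard count, the column names, and the functions `shard_of` and `shards_to_visit` are hypothetical. It shows why the choice of distribution key matters: a query that pins the key can be routed to one shard, while a query that gives the system no information about the key has to fan out to every shard.

```python
import zlib

SHARD_COUNT = 8  # assumed shard count for the sketch


def shard_of(value) -> int:
    """Stable hash of a distribution-key value to a shard number."""
    return zlib.crc32(str(value).encode("utf-8")) % SHARD_COUNT


def shards_to_visit(distribution_column: str, filters: dict) -> set[int]:
    """Shards a query must touch, given equality filters such as {"tenant_id": 7}."""
    if distribution_column in filters:
        # The filter pins the distribution key, so one shard holds every matching row.
        return {shard_of(filters[distribution_column])}
    # No hint about the key: the query has to fan out to all shards.
    return set(range(SHARD_COUNT))


# Table sharded by tenant_id; a per-tenant query stays on one shard.
per_tenant = shards_to_visit("tenant_id", {"tenant_id": 7})
# Same table, but the query filters only on a non-key column.
by_date = shards_to_visit("tenant_id", {"created_on": "2024-11-19"})
print(len(per_tenant), len(by_date))
```

A single-node schema that has no natural key of this kind is the hard case for an automatic conversion, which matches the caution in the answer above.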
CLAIRE: 00:38:59
Yeah, that makes sense. And that's always been the sticky point with Citus, right? And anybody who has the kind of data-intensive workload where they need that distributed database scale, they've been motivated to make sure their data model is going to distribute effectively, right? And anyone who doesn't need that sometimes creates a data model in the beginning that doesn't lend itself well to distributing. So yeah.
AFFAN: 00:39:30
Exactly. I think a good fit for Elastic clusters, or the flavor of Elastic clusters that we've shipped so far, is newer apps, right? One of the things that we observed in the previous release of the Citus Postgres service was that the migration is actually an effort, right? So if you have a consumer-facing app or a multi-tenant app, or an app that will probably grow to internet scale at some point, or that you expect to grow, then it's a good idea to review how to build a good data model that can be sharded across any tech which supports sharding, like Citus.
CLAIRE: 00:40:25
OK, so you mentioned pgvector earlier. So we have to talk about AI and Postgres, because doesn't everybody talk about AI and Postgres? Like, doesn't that always come up? I'm on the talk selection team for POSETTE: An Event for Postgres. And I was on the talk selection team for PGDay Chicago last year as well. And there are a lot of talk proposals about the topic of AI, because it's everywhere, in terms of people trying it, adopting it, talking about it. So how do you see AI playing with Postgres? What's your perspective on this right now?
AFFAN: 00:41:06
Yeah, so again, as I was mentioning, I think with the pgvector extension, Postgres happened to be in a leading position in terms of relational databases. I think it was a good spot for Postgres to be in. But the interesting thing here is that when customers come to thinking about Gen AI applications, obviously, everybody needs to build Gen AI apps using data. They are conversing with data, or they are using data to converse with other types of data. And LLMs happen to be the intelligent agent somewhere in the middle that can make sense of this data and give reasonable answers on it. We've found that, generally, customers who want to provide their data to the LLM to get back answers don't want to move this data from the database to another system that does vector search and similarity search and all of those things. They just want to do things in the database. That's the key. Bring the function to the database; do not pull the data out next to the logic. It's along the same lines. So a lot of customers are leveraging Postgres AI features, like the Postgres vector features, to do in-place similarity searches. And certainly, I see a lot of innovation happening on pgvector itself. The default algorithms keep improving. HNSW keeps improving. The recall keeps getting better. We just announced DiskANN at Ignite a couple of weeks ago, which is another algorithm that we feel is better in many respects compared with HNSW, and we'll keep working on that. So we'll keep seeing a lot of innovation within Postgres to support these Gen AI app-building efforts much more closely, much more efficiently. It will be in the form of faster searches on the data, more accurate searches on the data.
And it will also go into deeper domains, like maybe doing semantic searches, maybe doing hybrid searches where we mix the results from, for example, text searches and vector search. And we do some kind of algorithm where we mix these results to come up with an even better recall, more relevant answers to your questions, and all this. So I see a lot of movement in this, and Postgres is in the right spot. It already has a lot of developer mindshare. It has the most developer mindshare. And most developers are right now experimenting with AI applications. So this is like putting two and two together to come up with a five. So I think the industry is going to keep pushing towards more interesting AI features in Postgres.
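As a rough illustration of the hybrid search idea described above, here is a small self-contained Python sketch. It does not use pgvector and is not what any Azure feature does internally. The documents, the two-dimensional "embeddings", and the keyword scoring are made up for the example, and the way the two result lists are mixed, reciprocal rank fusion (RRF) with the conventional constant k=60, is one common choice that this sketch assumes rather than something named in the conversation.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either has zero length)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def vector_ranking(query_vec: list[float], docs: dict) -> list[str]:
    """Document IDs ordered by embedding similarity, best first."""
    return sorted(docs, key=lambda d: cosine_similarity(query_vec, docs[d]["embedding"]), reverse=True)


def text_ranking(query: str, docs: dict) -> list[str]:
    """Document IDs ordered by a toy keyword score (count of query terms), best first."""
    terms = query.lower().split()

    def score(doc_id: str) -> int:
        words = docs[doc_id]["text"].lower().split()
        return sum(words.count(term) for term in terms)

    return sorted(docs, key=score, reverse=True)


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge ranked lists: each list contributes 1 / (k + rank) per document."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)


# Made-up documents with made-up 2-d embeddings, purely for illustration.
docs = {
    "a": {"text": "tuning autovacuum settings", "embedding": [1.0, 0.0]},
    "b": {"text": "vector search with postgres", "embedding": [0.8, 0.6]},
    "c": {"text": "search tips", "embedding": [0.0, 1.0]},
}
by_vector = vector_ranking([1.0, 0.0], docs)
by_text = text_ranking("vector search", docs)
fused = reciprocal_rank_fusion([by_vector, by_text])
print(by_vector, by_text, fused)
```

In this toy data, document "b" is neither the closest vector match nor far from it, but it is the best keyword match, so the fused ranking places it first. That is the kind of "mix the results" effect the answer refers to.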
CLAIRE: 00:44:17
When I look at friends in the industry, and I've been around a while, worked at a few different companies, started my career at Sun Microsystems, which doesn't even exist anymore, my friends are all over the place. And it's interesting to see who is embracing these LLMs and incorporating them into their day-to-day work life and who is resisting it, focused on the drawbacks, or the negatives, or the mistakes you can make, versus other teams that are hiring people to focus on figuring out how do we adopt this technology into our services or into our products, or even just small things, like who's using Copilot and who's not. Are developers becoming more productive by augmenting their skills? I've often joked that my daughter uses her iPhone to augment her already super capable brain. And she's got this $1,000 supercomputer in her pocket that kind of makes her scale. And so I feel like Copilot does that for a lot of people, too. I don't know if that's your perspective.
AFFAN: 00:45:38
Yeah, I think the thing that we're observing, again talking to customers, and these are not just Postgres customers, but generally AI application developers and whatnot, is that the recall is actually very, very interesting. The accuracy is very interesting. So I think people are OK spending a bit more time, like having slightly lower TPS, if they can get better results. So that's one thing. And the reason for that is clear. If you use a particular chatbot or LLM and you do not get good results, you probably will not use it often again. You may be willing to wait for maybe 50 more milliseconds, but you want to get a good result. So with some of what you call resistors, I suspect this is playing into that also. The recall becomes more interesting. The number of AI applications that have been built over the last one or two years is immense, and I think the industry is still coming around to figuring out the right patterns and the right controls to manage the quality of responses and the quality of the data. And I think that's going to keep happening. But the signal we've gotten is that if the quality is not there, then the customers will churn. That will create a bad rep for the application. And generally, for somebody using AI, if this is their AI experience, then they will definitely go into the resistors camp a bit more. So it's all about recall, in my opinion.
CLAIRE: 00:47:28
All right. So let's switch gears for a sec. I want to talk a little bit more about the Postgres open source work that's being done at Microsoft and in your team and how you think about it. And I'll preface this with the fact that I'm biased. There are biases in any questions I ask you on this topic because this is what, I often talk about the toothbrush test. This is what motivates me in the morning when I'm brushing my teeth. I love the open source community that works on Postgres around the world across many different companies. I especially love the team that you've been building here at Microsoft. Some of my favorite people are on that team. But how do you think about that Postgres open source contributor work?
AFFAN: 00:48:16
Right, yeah, yeah. So I think there are two parts to our all-up strategy for Postgres at Microsoft. Part one is making Postgres the best relational database that it can be in the world. And part number two is making Azure the best place to run this Postgres database, the best relational database in the world. So I think both of these are very important aspects to winning in Postgres. And part one is what this very capable team of contributors and committers does for us. This team is fairly independent. They have access to a vast treasure of data, obviously within privacy constraints. They know how the fleet runs. They know the typical issues. They can look at the issues. So they get a lot of data, which they can use to identify patterns and problems, which they do. And oftentimes, that becomes something that is useful. So in some sense, our fleet of databases is, hopefully, helping the whole industry to move forward faster. So again, Postgres 17, a lot of interesting contributions. There's async I/O work going on. There were, I think, some performance improvements in partitioning of tables, which is a big scenario coming from Oracle-migrated workloads, some optimizations, memory allocation performance improvements, query planner improvements. A bunch of interesting things went into Postgres 17, which were all independently driven by our group of contributors to open source. So I think this is also important. This team is independent. They have access to data, and then they make their own decisions. And they work with the community to figure out what is the right model to work on there. And I think it's worked very well. I think as a large cloud vendor, and as an industry, I think industry has a responsibility here to support Postgres.
I think Postgres is doing well because the whole community, the whole industry, is behind it. So everybody is sharing and pushing the boundary forward. And we try to do our part on this one. And so far, it has worked well.
CLAIRE: 00:50:49
I think that's a really interesting phrase, that industry has a responsibility to support Postgres. I think I've heard you use the term, was it you that said that we need to be good open source citizens? That if we're going to build a managed service, we also have to be contributing, giving back, leading.
AFFAN: 00:51:10
Exactly. Again, the way we look at it, the pie is very, very big. We're solving the world's database problems. The addressable market is huge. And customers, actually, they want a database that can work across clouds. They want the portability of their database workloads. They want that freedom. Which is why a very important aspect of the work that we do is to avoid forking as much as we can. Because the more forks there are, if the behavior starts becoming different, then it is harder to maintain that promise to customers on vendor portability. So I think our promise to customers who are coming to our cloud to run Postgres is that there is going to be minimal vendor lock-in. And we need to maintain that. And the way to maintain that is to make improvements and contribute them to the open source upstream version of Postgres. I think that's very important to maintain the customer promise.
CLAIRE: 00:52:20
So you mentioned how independent the open source contributor team is. And by the way, I say contributor, even though a whole bunch of people in this team are committers. Because not everybody on this team is committers. We also have people who are kind of growing in their contributions and are kind of like tomorrow's future committers, right? So we have people at varying levels of experience and skill. But you talked about their independence. I was wondering, how much freedom do they really have? Do they pick, bottoms up, what their projects are going to be? I know that when people are doing planning for Postgres 18, there's no top-down roadmap that anybody sets, not even the core team, right? It is a very bottoms-up system.
AFFAN: 00:53:07
Right, right, right. So I think it's completely by design bottoms-up. I think this is a community-driven effort. Because I think these individuals have to work with the community to kind of make the case for whatever work they're doing. So it's better driven completely by these individuals. And as I said, the thing that we offer is the data sets. The data sets, customer problems, customer escalations even, right? I mean, sometimes our top engineers in the contributors team are involved in figuring out issues with customers. And oftentimes, there are improvements in Postgres that come out as a result of these as well. So that's basically the, I think, interesting thing that open source contributors kind of have access to. But otherwise, completely independent. There's no top-down, "hey, we've got to do this" kind of thing. Because I think the system is working this way. And there's no need to change that.
CLAIRE: 00:54:14
I know that within the Postgres world, there's a number of different companies that employ Postgres committers and Postgres contributors. And sometimes people move around between employers. I heard this phrase recently. I don't know what your reaction is going to be. But if I got the names right, I think when Melanie Plageman moved from Greenplum to Microsoft, somebody on the internet commented like, "oh, you changed your paycheck provider." Because there is this independence, if you will, to engineers who work on Postgres. So I don't know if you've ever heard that phrase, paycheck provider.
AFFAN: 00:54:54
I have not. But certainly, it's an interesting phrase.
CLAIRE: 00:54:58
Yeah, it is interesting. But what I also think is interesting is people who have joined this Postgres open source team here at Microsoft, they're all still here. Knock on wood. I don't want to say that and have that change here. I'm going to go knock on wood just a second. The team has grown. And it just keeps getting more amazing, which I think is cool. So kudos to you. You're doing something right.
AFFAN: 00:55:28
Yeah, hopefully.
CLAIRE: 00:55:30
You're doing more than something right. You're doing several things right. I mean, that's part of why I invited you on the show. One of the things that I think is cool is when a leader, when a manager, is accessible and approachable and you feel like you can go to them as a sounding board or to get advice or validation or whatever, I just think that's so wonderful versus a VP that people are annoyed with or don't want to talk to or would rather avoid or you see their name on your phone and you don't want to answer it. Like, I see your name on my phone, I want to answer the call. So that's why you're here, because I do think you're doing a bunch of things right.
AFFAN: 00:56:13
I appreciate it. Yeah. I mean, it's all the team, Claire. I think, again, the Postgres team at Microsoft is a fantastic group of people. Could not have asked for a better team. I think it's just like a standing on the shoulders of giants kind of thing. But yeah, absolutely.
CLAIRE: 00:56:29
All right. So are there any more stories that you had in your pocket that you wanted to share today about management philosophies or approach to leading people? I'm wondering if there are engineers or developers listening to this episode who are thinking four, five, six years ahead in their career about whether they want to move into management, about what those challenges will be, or I don't know, any seeds you want to plant.
AFFAN: 00:57:05
Yeah. So in terms of just generally moving to management, first off, I would not think of moving to management as a promotion. That's a mistake people oftentimes make. It's not a promotion, it's a different role. It's not like, oh, this is something better. It's just a different role. And this role comes with its pros and cons. The pros are that, yes, your impact can span more; you can have more impact through a group, right? But the cons are that, I mean, you're still technical and all this, but you're not as technical as you would want to be. So if your passion is just writing code, if you just want to go home at the end of the day and feel happy about the PR that you merged or the check-in that you made or the problem that you solved, the algorithm that you came up with, or some issue that you fixed in live site, right, then I think it's a good sign that you enjoy that job and you would want to grow in that. Technical leadership is a very viable career path as well. But if you actually like talking to people, if you like solving people problems, if you like, for example, to coach and mentor people, and by the way, I keep telling people in my team also that coaching and mentoring is not a cliche. It is actually a thing. It has to happen. You need to enjoy doing it. You need to actually go out and seek opportunities to coach people. That's a litmus test. If you actually go out and seek opportunities to coach people and that makes you happy, then probably you will enjoy your job as a manager. So be very clear on what you enjoy, whether that's management or technical leadership. Obviously, that would be the first thing. Second thing is generally management is a game of scale, how to scale out.
And it's just like an engineering project. You need to be decentralized. You need to be as distributed as possible to be successful. So just a couple of things, Claire, I think we're out of time as well, but that's the thing that came to my mind.
CLAIRE: 00:59:32
Very cool. Are you hiring?
AFFAN: 00:59:37
We're always hiring; the quantity depends. It varies from time to time. But we're always looking for good people with Postgres experience and knowledge. And not just Postgres, actually, distributed systems. Databases are good in that they are a microcosm of all the CS topics: operating systems, systems engineering, compilers, everything. So I think we're hiring all kinds of people. Again, the quantity varies over time. But always hiring.
CLAIRE: 01:00:04
Yeah, there's some seasonality I've observed at Microsoft in terms of which quarters, and it seems to be different year to year. It's not always predictably the same. But sometimes you're hiring like gangbusters, and sometimes you're hiring more slowly, but like you said, we're always hiring. [Exactly.] Well, that's good to know. All right. Well, I want to thank you for joining us today. I'm trying to make sure that I've talked about all the important things, about what it means to be leading Postgres engineering on Azure. You mentioned Ignite a few weeks ago. So that's like a very Microsoft event. It's something that happens every year. It spans all of the Azure technologies, the whole ecosystem?
AFFAN: 01:00:54
That's correct.
CLAIRE: 01:00:55
And it happened in Chicago this year. They've already announced the date for next year, which will be in San Francisco, yay, in my backyard. And I know that's where a lot of new features get announced. A lot of customers come. Charles Feddersen, who heads up PM, he's your counterpart for Postgres at Microsoft. He's head of the product team. I know he was there. I know that some of the people on his team gave joint presentations with customers like UBS. So it was a really big deal. And I don't know if there's anything we want to mention for that. I mean, it's not something that the Postgres open source community actively goes to, because historically, it's been this, I don't know, the focus is on parts of the ecosystem that are not Postgres. But what I thought was really cool this year is there were more Postgres sessions than ever in past Ignites. And more people in the room, 10 times as many people in the room as there had been in the past. So I thought it was an interesting inflection point for Postgres at Microsoft.
AFFAN: 01:02:09
I think it tracks the general adoption and growth of Postgres in the industry, as well as in Azure. I fully suspect we're going to start seeing more and more participation from customers and our partners in these forums. So yeah, the most interesting thing, at least for me, besides all the great work that the team did, which was released, was the customer case study. UBS came over and presented how they're building a Gen AI platform using Postgres. And that's a good presentation to see, even if you're not an Azure Postgres customer, or you're not on Microsoft Postgres. It's a good insight into how customers use Postgres to do AI applications.
CLAIRE: 01:02:59
There should be a video recording of that. I'll try to dig it up and include it in the show notes.
AFFAN: 01:03:04
Yeah. So that was a good one. And then we announced a lot of interesting things. Elastic clusters was announced. We announced automatic indexing and tuning, server parameter tuning. Gosh, I think there's a long list of things. DiskANN, we announced DiskANN, supporting pgvector, among a bunch of others. I'm sure I'm missing a few other things. So yeah, I may not be able to do it justice, but there were a lot of interesting things going on at Ignite.
CLAIRE: 01:03:34
I've been writing this blog post twice a year. I think I've done it, have I done it two or three times so far? Anyway, I'm working on the next version right now, and it's called "What's New with Postgres at Microsoft". And it tries to span kind of the whole area that you own, so the Azure Database for PostgreSQL investments, new features, capabilities, et cetera, and then our open source contributions, and then our community contributions, and any of the work we've done on extensions or things like Patroni. So I've got these different buckets, and there's a whole infographic. And oh my gosh, it's so much work to try to pull it all together into one place.
AFFAN: 01:04:17
Yeah, it's a great team delivering like clockwork. I think, again, I've got a great set of engineers and product managers working on this. So I'm very happy with what we keep delivering semester after semester.
CLAIRE: 01:04:32
Well, I am so glad that you joined us today. Thank you so much. I think we're going to wrap.
AFFAN: 01:04:41
Thank you for having the discussion.
CLAIRE: 01:04:44
Let me just go through a few logistics for everybody who's listening. The next episode, episode 23, is going to be recorded live on Wednesday, January 15th, at 10am PST. And our guest will be Daniel Gustafsson, who is one of the Postgres major contributors and committers who works on the team at Microsoft. And the topic is going to be How I got started as a developer and in Postgres. If you want to mark your calendar now, you can with this calendar invite, aka.ms/TalkingPostgres-Ep23-cal And you can always get to past episodes and get links to subscribe on whatever your favorite podcast platform is at talkingpostgres.com. And there are transcripts included on the episode pages on talkingpostgres.com as well. And before we leave, if you've enjoyed this, tell your friends, in person, on social media, in DMs, we love compliments, we love word of mouth recommendations. The hashtag is #TalkingPostgres, all one word, and that's it.