A weekly newsletter and podcast diving into Clojure programs and libraries by Daniel Compton.
Download: MP3 - 00:52:06
Daniel: Hello, welcome to The REPL, a podcast diving into Clojure programs and libraries. This week, I’m talking about InstantDB, a real-time database built with Clojure, with Stepan, one of the creators of InstantDB. Welcome to the show, Stepan.
Stepan: Glad to be here, Daniel.
Daniel: Great. So I came across InstantDB, not from the Clojure angle, but just from Hacker News. I saw InstantDB as this kind of real-time database syncing type thing, sort of like build your own Linear in a box. I was looking through the comments and I was just thinking, “Oh, this is a cool, interesting kind of project.” And then I realized, “Oh, this is built in Clojure.” Huh. Okay, this is now really interesting to me. And so that’s why I thought I should definitely get you on the podcast to talk more about InstantDB. So tell us a little bit about what InstantDB is. What’s your history with this project? Because it’s newly announced, but it looks like you’ve been working on this for quite some time at this point.
Stepan: Yeah, so Instant gives your front end a database that you can use. One way to think about this problem is that writing web apps is just full of schleps. You start with a database, you add endpoints, you take all this data, you normalize it in a store, then you write optimistic updates. If you want to think about offline mode, you have to consider that. Then there’s multiplayer, right? All these steps that we take are just kind of cruft that everybody has to do over and over again.
The insight that we had, the very first one, was around 2020. We realized a lot of these problems are just database problems in disguise. When you write an endpoint, that’s just kind of like writing a query, right? When you normalize and denormalize, that’s literally what databases do. When you put stuff in for offline mode, you might use a transaction queue; that’s actually how databases work. So if we give you that, then all of a sudden, a bunch of complexity compresses.
Daniel: Nice. And so what would be the benefits to a developer who uses Instant? Why would they care?
Stepan: Yeah, we’ve gotten pretty good at writing endpoints. So, why would I want to rethink every way I manage my data? I would say there are kind of two benefits. The first one is just plain productivity. If somebody asked me today, “What is the fastest way to build an app?” I would actually recommend that they use Instant. Instead of having to spin up infrastructure yourself and, for every single feature, go through the schlep of creating these endpoints and doing all these things, you can just start with Instant. You get queries out of the box. Transactions just work.
So I think at this point, you can literally, if you have a front-end app and you’re using React and you have something like useState, you can just change that to useQuery. With Instant, suddenly that will start to persist. So the first benefit is literally productivity.
But I think a second benefit is this: if you think about the best apps today, what comes to mind? It’s apps like Notion, Linear, and Figma. What’s really amazing about them is, one, they tend to be multiplayer by default, in places where you wouldn’t expect them to be multiplayer. Like in Linear, you can change your profile picture or your name, and everything will be reactive. That just creates a better experience. Secondly, they work offline, right? If you use Notion and you’re typing away on a plane, it would suck if you couldn’t type anymore. If you want to actually implement these features, you end up building what we built at Instant yourself.
The way that we came up with this architecture was just scouring article after article. I think the great-grandparent of a lot of these architectures, funny enough, came from Asana. We were looking at Asana’s Luna architecture and Figma’s LiveGraph and all these different ways of doing this in order to build Instant.
Daniel: So Instant is, I guess, like a complete system for managing your data, both front end and back end at the moment?
Stepan: Yes. One of our guiding lights was this: when I started programming, I’d say there were two tools available to me that really made it easy for me. One of them was Heroku, right? You didn’t need to know how to be a DevOps engineer. You could literally just use Heroku, and your Rails app would just be alive. A few years after that, I think Firebase came along. A lot of front-end developers didn’t know some of the complexities of the back end, and Firebase just let you start coding. So I think that’s kind of our inspiration. When we were building Instant, we wanted to make sure this wasn’t something that only big companies could use, where it would take you a few days to set it up. We wanted it to be something that you could just get started with right away.
Daniel: So tell us a bit about what the actual InstantDB is. There are multiple components to it. What does the stack look like from whichever side, front end or back end, whichever makes the most sense?
Stepan: Great question. Okay, I can kind of lay it out for you, and then we can go deep wherever you like. Let’s look at it from the perspective of a user. The very first thing they use is the client SDK. This is written in JavaScript, and inside this SDK, there is a database that can understand queries and transactions. The reason we need this inside the client SDK is so that we can do things like offline mode and optimistic updates.
There’s a reactive layer on top of that, which takes these queries and starts syncing them to the back end. When we move to the back end, there are two layers there. One is the sync engine, and this is responsible for keeping track of a session and its queries. Then there’s a write-ahead log that we subscribe to from Postgres, which lets us take a transaction and figure out all the queries that have become stale as a result of it. This section is what I would call the sync engine.
Then there’s a section below that, which is a multi-tenant database. One problem we wanted to solve was that if we gave people a free tier, we didn’t want to freeze them or do things like that. So we ended up creating a multi-tenant database. You can write queries to it, and the queries look like datalog. You can write transactions to it, and they look like add and retract statements. But we make sure that each app’s data is logically separate. This is what lets us spin up a database for free, basically, for users. That’s the outline of the structure.
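A minimal sketch of the multi-tenant idea described here, assuming every triple carries an app identifier and every read and write is scoped by it. The names (MultiTenantStore, appId, the op tuple shape) are hypothetical and are not taken from Instant's code; this only illustrates add and retract statements over logically separate tenants.

```typescript
// Hypothetical shapes for illustration; not Instant's actual schema.
type TripleValue = string | number | boolean;
type TenantTriple = { appId: string; e: string; a: string; v: TripleValue };
type TxOp =
  | ["add", string, string, TripleValue]
  | ["retract", string, string, TripleValue];

class MultiTenantStore {
  private triples: TenantTriple[] = [];

  // Apply add/retract statements, always scoped to a single app.
  transact(appId: string, ops: TxOp[]): void {
    for (const [kind, e, a, v] of ops) {
      if (kind === "add") {
        this.triples.push({ appId, e, a, v });
      } else {
        this.triples = this.triples.filter(
          (t) => !(t.appId === appId && t.e === e && t.a === a && t.v === v)
        );
      }
    }
  }

  // Every read filters by appId, so one tenant never sees another's data.
  entity(appId: string, e: string): Record<string, TripleValue> {
    const out: Record<string, TripleValue> = {};
    for (const t of this.triples) {
      if (t.appId === appId && t.e === e) out[t.a] = t.v;
    }
    return out;
  }
}

const tenantStore = new MultiTenantStore();
tenantStore.transact("app-1", [["add", "u1", "name", "Stepan"]]);
tenantStore.transact("app-2", [["add", "u1", "name", "Daniel"]]);
// Changing a value is a retract of the old triple plus an add of the new one.
tenantStore.transact("app-1", [
  ["retract", "u1", "name", "Stepan"],
  ["add", "u1", "name", "Stopa"],
]);
```

Both apps use the entity id "u1", but because every operation is keyed by appId the two records never collide.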
Daniel: Gotcha. And so this database, both on the front end and the back end, is a datalog, sort of Datomic-like style database. Is that right?
Stepan: Yes. That’s maybe unfamiliar for the general public, but for Clojure developers, it might be a little bit more well-known. I was thinking about maybe one day writing a post about this or something because what’s interesting about datalog and graph databases in particular is if you start thinking about the simplest way to express relations, you discover graph databases. We weren’t looking to write datalog. In fact, in the front end, users don’t write datalog. They write something that looks a lot like GraphQL, with the change that it’s actually just pure data structures that you can manipulate.
The reason we ended up doing this is if you think about the problem: “Hey, I want to be able to express relations.” With Firebase, you can save objects, but that’s all you can do. What if you want to say an object is related to another one? A user has posts. A post has comments. The simplest structure that you could create to express this turns out to be a graph database. Then a lot of these surprising and weird things came out of that. We started writing datalog. I remember my co-founder Joe and I used to work at Facebook, and the underlying database there is called TAO. A nitpicking person will say that TAO is different, but if you really read about it and squint, you’ll see there are a lot of similarities.
Daniel: One of the things that I think distinguishes Datomic from Instant is that Instant is an EAV database (entity-attribute-value), whereas Datomic is EAVT (entity-attribute-value-time). There’s a transaction attached to everything, and I don’t think Instant has that concept. Is that right?
Stepan: Yes. When we were first building Instant, one thing we realized is that I do think Datomic’s historical model is something that we eventually want to get to. It’s something that I used personally in 2014 when I was at the first company I ever worked at. It was very magical. But the problem with it is that in order to support that, Rich had to invent a custom data structure, a custom index that isn’t easily available right now. So what we wanted to do was make sure we could offer this database. We needed to know that it could scale to a certain number of transactions. We needed to know that if anything went wrong, we could look into the source code. So the choices we ended up making moved us more into an EAV model.
Daniel: Gotcha. And so that’s stored in our Postgres database at the moment?
Stepan: Yes. You can think of it as all the data actually being stored in this giant table of EAV, with some asterisks.
Daniel: And it looks like you’re using Aurora Postgres in production.
Stepan: Yeah.
Daniel: Are there some special Aurora features you needed for that, or does it just scale better than standard?
Stepan: Mainly it scales better. I think Aurora made a really great decision about separating storage and compute. Similarly, I mean, Rich also did that. So lots of smart ideas there. We just wanted to make sure that, in terms of write throughput, being on one Aurora machine will be quite fine. The space will just scale out. So I thought this is a nice 80-20 to make sure that at least for the next few years, even if we have a tremendous number of users, we’ll be okay.
Daniel: So on the back end only, I think you’re using Datascript?
Stepan: In the back end, we do actually use Datascript in a few places. The place where we use it is we have this in-memory data structure that keeps track of all the queries. We have this concept called topics, which you can think of as just an identifier that says, “This is the kind of thing that this query cares about.” Very often, when we’re doing these invalidations and making sure that things are working, we need to query all of our active connections. So we thought, “Oh, why not just use Datascript?” And that’s how we use it.
Daniel: I see. So you’ve had to write your own query planner on the back end for your datalog queries to kind of query Postgres efficiently.
Stepan: Yes. The front end query, the way that users see our system, is this GraphQL-looking object. The one change we made from GraphQL is that instead of using a custom language, we just use JavaScript. The benefit there is there’s no build step, right? There are a lot of benefits to that.
Daniel: Oh yeah, exactly.
Stepan: The Clojurians know the value of data, right? So we made that choice. On the back end, we can take this query and transform it into one datalog query. When I say datalog, it comes with an asterisk: we change it to be the way we need it to be. Then we can take this one datalog query and convert it into a CTE that goes and returns the triples that we care about.
Daniel: Then you return those triples back over the WebSocket down to the client.
Stepan: Exactly.
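To make the first step of that pipeline concrete, here is a rough sketch of turning a nested, GraphQL-looking query object into a flat list of datalog-style patterns. The pattern shape, the variable naming, and attribute names like users/posts are assumptions made for illustration; the real translation and the later conversion into a CTE are not shown in the conversation and are not attempted here.

```typescript
// A nested query is plain data: each key is a namespace, each value its children.
type NestedQuery = { [namespace: string]: NestedQuery };
// [entity variable, attribute, value variable] -- an assumed pattern shape.
type DatalogPattern = [string, string, string];

function toPatterns(query: NestedQuery, parent?: string): DatalogPattern[] {
  const patterns: DatalogPattern[] = [];
  for (const ns of Object.keys(query)) {
    const variable = `?${ns}`;
    if (parent === undefined) {
      // Root namespace: bind every entity that has an id in this namespace.
      patterns.push([variable, `${ns}/id`, `${variable}-id`]);
    } else {
      // Nested namespace: follow the link attribute from parent to child.
      patterns.push([`?${parent}`, `${parent}/${ns}`, variable]);
    }
    patterns.push(...toPatterns(query[ns], ns));
  }
  return patterns;
}

const userPatterns = toPatterns({ users: { posts: { comments: {} } } });
// userPatterns is:
// [ ["?users", "users/id", "?users-id"],
//   ["?users", "users/posts", "?posts"],
//   ["?posts", "posts/comments", "?comments"] ]
```

Because the query is an ordinary object, this walk is a few lines of recursion rather than a parser for a custom language.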
Daniel: Okay. So if you’re on the client, let’s say you’re watching a to-do app, something very simple, but a multi-tenant to-do app, that one client says, “Hey, I’m user one, give me all of my to-dos.” That’s going to send a watch query up to the server. You’re able to tell not just that some data in this app changed, but like data specifically for this user. How exactly do you do that?
Stepan: Yeah. I’ll explain that and also say this is one of the things I think that separates the kind of architecture we’re thinking about from some of the more traditional architectures and the local-first movement, for example. We are kind of semi-local first. The thing that we do is we make sure that from the perspective of the user, we only give them the data that their queries care about. When there’s an invalidation, we only invalidate and give them new data when it’s related to their queries.
So how does this work? The way it works is taken from the playbook of Asana and Figma. Imagine you write a query like “select users where name is equal to Stepan.” If somebody says, “I changed my name,” we can go and match on the where clause. So that’s effectively what we do. Whenever there’s a transaction, we generate a list of what we call topics. Queries also generate their own topics. We can find an intersection and say, “Oh, these queries are no longer valid.”
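The topic intersection described above could be sketched like this. The topic shape (an entity, attribute, value triple with "*" as a wildcard) and all names are assumptions for illustration, not Instant's actual representation.

```typescript
// Assumed topic shape: [entity, attribute, value], where "*" matches anything.
type TopicPart = string | number;
type Topic = [TopicPart, TopicPart, TopicPart];

const partMatches = (a: TopicPart, b: TopicPart): boolean =>
  a === "*" || b === "*" || a === b;

const topicsOverlap = (x: Topic, y: Topic): boolean =>
  x.every((part, i) => partMatches(part, y[i]));

// Return the ids of subscribed queries whose topics intersect the transaction's.
function staleQueries(subs: Map<string, Topic[]>, txTopics: Topic[]): string[] {
  const stale: string[] = [];
  subs.forEach((queryTopics, queryId) => {
    const hit = queryTopics.some((q) => txTopics.some((t) => topicsOverlap(q, t)));
    if (hit) stale.push(queryId);
  });
  return stale;
}

const subs = new Map<string, Topic[]>([
  // "select users where name = Stepan"
  ["users-named-stepan", [["*", "users/name", "Stepan"]]],
  // a query that only cares about post titles
  ["all-post-titles", [["*", "posts/title", "*"]]],
]);

// User u1 renames themselves: the old and the new value both produce a topic.
const txTopics: Topic[] = [
  ["u1", "users/name", "Stepan"],
  ["u1", "users/name", "Stopa"],
];

const stale = staleQueries(subs, txTopics);
```

Here only the users query is invalidated; the posts query is left alone because none of its topics overlap the transaction.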
The alternative to doing things like this is you could try to replicate transactions down to the client. But then the problem with that is you’ll have to do a form of partial replication. When many users change the same thing, generally the number of transactions is actually much larger than just the data that the user needs.
Daniel: Gotcha.
Stepan: And then the other thing is that the reason we only get the data that a user wants is that for many apps, it’s not quite practical to say, “Get all the data and put it into the browser.”
Daniel: So tell me more about that, kind of like limiting data input or scaling to very large amounts of data.
Stepan: This is something that, as a design decision from day one, we thought was going to be very important. We wanted to make it so that if a user uses us, they can build something like Twitter. The way that works is from the perspective of the client, you just write the queries that you care about. When you load a page, you generally don’t want to load all the data; you just want to load the data for your page. The front end keeps a cache of the most recent queries that you’ve made.
That way, when you load the page the first time, we’ll make that query. But the second time, we will be able to service it from the cache immediately. These queries start connecting to the server. The server starts keeping track. The invalidator starts getting to work. Suddenly, you have a reactive app that can scale out to be something like Twitter.
Daniel: Nice.
Stepan: Hopefully. If you ran Twitter on our services today, you know, we’d have to have an enterprise contract and, you know, you’d need to give us some heads up.
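A small sketch of a client-side cache that remembers the most recent queries, so a repeated query can be answered immediately. The class name, the size limit, and the use of the serialized query as the cache key are assumptions; the conversation does not say how Instant keys or bounds its cache.

```typescript
// Keeps the results of the most recent queries, evicting the oldest first.
class RecentQueryCache<R> {
  private entries = new Map<string, R>();
  private maxEntries: number;

  constructor(maxEntries: number) {
    this.maxEntries = maxEntries;
  }

  // Queries are plain data, so their JSON form can serve as a key.
  // (Assumption: callers build equal queries with the same key order.)
  get(query: object): R | undefined {
    return this.entries.get(JSON.stringify(query));
  }

  set(query: object, result: R): void {
    const key = JSON.stringify(query);
    this.entries.delete(key); // re-insert so this key becomes the newest
    this.entries.set(key, result);
    if (this.entries.size > this.maxEntries) {
      // Map preserves insertion order, so the first key is the oldest.
      const oldest = this.entries.keys().next().value;
      if (oldest !== undefined) this.entries.delete(oldest);
    }
  }
}

const cache = new RecentQueryCache<string[]>(2);
cache.set({ todos: {} }, ["buy milk"]);
cache.set({ users: {} }, ["stepan"]);
cache.set({ posts: {} }, ["hello world"]); // evicts the todos query

const cachedPosts = cache.get({ posts: {} });
const cachedTodos = cache.get({ todos: {} });
```

On a second page load the cached result can be rendered right away while the live query reconnects in the background.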
Daniel: One of the things that is interesting about Instant is that it is open source. I don’t know if there are maybe some other private repos, but you can run a full Instant setup yourself, even push it to AWS. It looks like a lot of the DevOps scripts are there. So what’s the thinking behind both the front end and back end being open source? What’s the business model there? How do you think about that?
Stepan: The way I think about it is as a user of a system, right? Like I can think, “Okay, let’s say I’m building my own startup. Would I bet my company on something where if I don’t like what they’re doing, I cannot get out of it? Or if there’s a bug and they’re not able to prioritize me, I cannot fix it?” So I think that’s the question we need to be able to answer. In my opinion, we should just go open source. This way, if somebody’s unhappy with us, they can always run their own instance. If there’s a bug and we don’t get to it, they can fix it, and it keeps us honest.
From the perspective of our user, that is the best experience that they can have. Now, how are we going to succeed as a business? I think realistically, if you just give people a great service, they’re not going to be like, “Oh, I actually want to have a team of 25 people running infrastructure for me just because.” I think we’re going to run this system in the most efficient, cost-effective, best way. We’ll have the best service. Being open source just keeps us accountable to that.
Daniel: And I guess there’s nothing stopping you from running a private, single-tenant version of this for a larger customer.
Stepan: Yeah, exactly. I think there’s more stuff coming for larger customers who will probably want to support their existing databases rather than say, “Oh, you must use our database.” The vision is interesting. If you think about it, the way I see Instant, there are kind of layers to it. One of them is just better Firebase. Another one is like a more productive tool. But I think another one is the way that we build software has been changing. We used to build desktop apps. Then we started building server-based apps. Then the web browser got much more powerful, and now we’re building these hybrid apps. What that means is there needs to be a different kind of infrastructure and a different kind of architecture for apps. That’s kind of what we want to build and bring to the world.
Daniel: Nice. At Whimsical, we have a sync engine, which is sort of semi-offline capable. It certainly does optimistic updates. But yeah, I know firsthand how difficult it is to build something like this. Well, tell us, how long have you been working on Instant?
Stepan: I would say it was like this. Our first idea, the inkling, was in 2021, and we wrote an essay, kind of like a “what if.” But at that point, I think we were a bit sheepish about following this as a business. We weren’t sure how big this business could be. It felt very, very challenging. We were working on another startup at the time. What ended up happening was that one year later, we realized while building this other startup that we were needing Instant ourselves. So we thought, “Okay, we have to do this.” Around 2022, we kind of kicked it off for real.
These last two years have been like a semi-cave because both of us, and now there’s three of us, you know, we’re pretty experienced engineers, but we had never built a database before. There was a lot of exploring and playing and making sure what we’re building is going to actually work and putting it in the hands of users and seeing how they’re reacting. So around 2022, we really started, and then like two months ago, we released an actual version. It was actually available to use before that. We just never publicized it because we wanted to have a small number of users to iterate with.
Daniel: Gotcha. Nice. Do you have any kind of public reference customers so far that you can talk about?
Stepan: We have some users on the homepage that you can look into. Right now, I would say we have startups, indie developers, and younger companies basically. We try not to go with the bigger companies right now because the thing to optimize at our stage is about getting feedback and making sure this thing is really, really good. The best way to do that is to work with smaller teams.
Daniel: Makes sense. One of the things I noticed was you’ve got auth built in, and I thought, “Oh, that’s maybe sort of limiting for people that have their own existing auth setup.” But then you also have custom back-end auth as well. So, you know, if someone had an existing users table and a whole app, they could still start integrating Instant on the side.
Stepan: Exactly. They totally could.
Daniel: Nice. Tell me about permissions and security and how that works.
Stepan: Yeah. So that’s another area we can go deep on. There were a few inspirations for permissions for us. One of them was this thing called the Ent framework at Facebook. The really cool thing about how permissions worked at Facebook was they worked at the object level. Most companies create an endpoint and do some checks in the endpoint, like, “Oh, is the user an admin or is the user this or that?” But the problem is, if you make any mistake, you might inadvertently show data that you’re not allowed to.
What Facebook did was they pushed all of the permissions onto the object itself. Even the backend developer, if they’re making a query and they’re not allowed to see this post, it just won’t show up. We really liked this idea. What we needed in this case was you should be able to write almost like code to say whether somebody is allowed to see something or not because sometimes it’s just much better to express it this way.
But the problem is if you let users write code, then you get all these issues, right? You need to sandbox it, and it might be slow and all these other things. So lo and behold, as we’re doing this, we find this thing called CEL. I don’t know if you’ve heard of this.
Daniel: I hadn’t before Instant.
Stepan: Yeah. So CEL is this language that Google developed. It’s not Turing complete. You can run it inside your system. It’s extremely fast. It compiles, and it’s pretty cool. It turns out they have a Java library, which means we can use it in Clojure. So that’s what we do. For every object, when you make a query or when you make a transaction, you can write a CEL rule. Then we can filter whether users are able to see it or whether they’re able to make a transaction or not.
Daniel: Gotcha. And so the level of your CEL evaluation is at the object or per namespace level?
Stepan: Namespace. If you make a query and you get 10 objects back, we’ll make sure that out of those 10, you only see the ones that you’re allowed to see.
Daniel: Is that kind of like an additional, like does that get compiled into your sort of datalog query, or is that kind of another layer on top?
Stepan: That’s such an interesting insight. Yeah. Right now, it’s another layer on top. But for the view permissions, we definitely want to take those queries and insert them as effectively like extra where conditions.
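As a rough illustration of permissions acting as a layer on top of query results, the sketch below filters rows through a per-namespace view rule. The rule here is an ordinary TypeScript predicate standing in for a CEL expression (something like `auth.id == data.ownerId`); the rule table, the function names, and the deny-by-default choice are all assumptions made for this example.

```typescript
type Entity = { id: string; [attr: string]: unknown };
type Auth = { id: string } | null;
// Stand-in for a compiled CEL rule: given who is asking and the object, allow or deny.
type ViewRule = (auth: Auth, data: Entity) => boolean;

// One view rule per namespace (hypothetical rule table).
const viewRules: Record<string, ViewRule> = {
  posts: (auth, data) => auth !== null && auth.id === data.ownerId,
};

// Applied after the query has run: only rows that pass the rule are returned.
function filterVisible(ns: string, auth: Auth, rows: Entity[]): Entity[] {
  const rule = viewRules[ns];
  if (!rule) return []; // assumption for this sketch: no rule means no access
  return rows.filter((row) => rule(auth, row));
}

const postRows: Entity[] = [
  { id: "p1", ownerId: "u1", title: "Mine" },
  { id: "p2", ownerId: "u2", title: "Someone else's" },
];

const visibleToU1 = filterVisible("posts", { id: "u1" }, postRows);
const visibleToGuest = filterVisible("posts", null, postRows);
```

Because the check is attached to the namespace rather than to an endpoint, any query that returns posts passes through the same rule.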
Daniel: So tell us more about developing a query language for the front end. You’ve got GraphQL as sort of a maybe inspiration, but this is not exactly GraphQL. So how do people query with it? What kind of things can they express with it?
Stepan: It’s surprisingly powerful at this point. You can do some crazy things. You have “in” queries, “or” queries, and you can go users, posts, comments. It’s also interesting. Side note: when you have users and you see what kind of queries they write, some of them can get very interesting, very big.
To walk through the design journey of the front end, it’s like this. A lot of people think, “Oh, wouldn’t it be great if SQL was the language on the front end?” But there are kind of two issues with that. One of them is if you want that, you do basically need to ship a SQL implementation, which means SQLite. That’s like 300 kilobytes gzipped. That just kind of changes the dynamic of what kind of app can start to use this.
But there’s another problem, which is that very often on the front end, you do want this deeply relational data. I load this page; I want the list of tasks, for each task its comments, and for each comment the profile, right? Writing that in SQL is not difficult, but it’s also not the most trivial SQL. But it is the most trivial query, if you think about it.
What we wanted to do was make sure that the most trivial query that a user wants to write should feel like a trivial query. I think GraphQL is the one that comes closest. You just write data in the shape that you want it, and we will get it to you in the most efficient way possible.
Now, I think there are two issues with GraphQL. One is it was built for bigger companies, so it’s a little bit harder to set up. Second, because it’s a different language, it’s a little bit harder to do metaprogramming on top of it. Wouldn’t it be nice if you had a table, and when you click the filter button, you just change your query? You can do that in Instant pretty trivially.
The other issue is that in GraphQL, there’s no optimistic update that comes for free. You have to write it. So what we said was, “You know what? We’re going to make our own transactions that the database can just understand.” That’s from the front-end perspective. You write these GraphQL-looking things. We call it InstaQL. To fulfill them, we actually have a datalog engine underneath too, which can take those and do the magic.
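Since queries are plain objects, the "click the filter button and change your query" idea needs nothing more than ordinary data manipulation. The sketch below is an assumption-laden illustration: the `$` and `where` keys are modeled on the GraphQL-looking query objects discussed here, but the exact shape Instant accepts should be checked against its documentation, and addWhere is a hypothetical helper.

```typescript
type Clause = Record<string, string | number | boolean>;
type NamespaceQuery = { $?: { where: Clause } };
type FilterableQuery = Record<string, NamespaceQuery>;

// Return a new query with extra conditions merged into one namespace's where clause.
function addWhere(query: FilterableQuery, ns: string, clause: Clause): FilterableQuery {
  const existing = query[ns]?.$?.where ?? {};
  return {
    ...query,
    [ns]: { ...query[ns], $: { where: { ...existing, ...clause } } },
  };
}

// The table starts by showing everything.
const baseQuery: FilterableQuery = { todos: {} };

// The user clicks "done only", then "mine only": each click just transforms the query.
const doneOnly = addWhere(baseQuery, "todos", { done: true });
const doneAndMine = addWhere(doneOnly, "todos", { owner: "u1" });
```

No string templating or query builder is involved; each step produces a new object and leaves the previous query untouched.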
Daniel: Gotcha. And you don’t use Datascript on the front end. Is that because of the sort of Clojure compile size penalty that you’d have to pay?
Stepan: It’s like two things. One is that. Though it’s like, you know, Datascript is a lot better than adding SQLite. But there’s another problem. I think when you’re making a developer tool, it’s very important that if there is an error, at least in my opinion, if there is an error, it should feel from the developer’s perspective like much more tenable to look deeper into it.
But the problem is, like, most front-end developers don’t know Clojure. So what will end up happening is they’re going to get the stack trace, right? And it’s going to look extremely daunting. So we built a very light, Datascript-like thing in JavaScript for this purpose. I studied Tonsky’s database very, very intensely and spoke with him as well. He has the essay, I think it was Next Web, which is in the similar spirit to what we really want to do.
Daniel: So the developer experience of Instant, like, let’s say I’ve got, I don’t know, like a sidebar with users or chat channels, maybe like a Slack. Let’s say like a Slack. So can I write one query that says, “List all channels” and have that sort of mounted in that channel component and then have a chat window component, which listens to whatever the currently selected channel is?
Stepan: Exactly. Yes, you can.
Daniel: Gotcha. So this is like, again, like the GraphQL co-located queries and components.
Stepan: Yeah, I would say GraphQL takes it further in that they do eventually take all those separate queries and generate one giant query that gets sent to the server. We don’t do that. The main reason we don’t is if we do it that way, there’s a lot more ceremony that the developer has to go through. But what we realized was, look, let’s say you do those two separate queries. Fine, you’ll make those queries. The first time you’ll see loading screens, but the second time, both of those queries will hit the cache.
So it’s going to look very fast and nice.
Daniel: Right. Okay. So they could, in theory, come back at different speeds.
Stepan: Exactly.
Daniel: And if I’m subscribing to a user’s profile picture through multiple queries, I guess the triples will come back multiple times, but they’ll end up living in the same spot in the client-side triple store?
Stepan: Effectively, with some asterisks. If you make the same query in many different places, those will all be cached together. We have this project called Single Store, where we want to take all the different queries and put them in one place. The benefit of that will be, let’s say you make a list of profile pictures, and then you make another query that just fetches one profile. We should just immediately service that one. Right now we don’t, because if you do that, there are a few other question marks that will show up, like what if the data is only partially there, etc. So we just want to get that experience really good, and then we’re going to open up that option as well.
Daniel: Let’s talk more about optimistic updates. How does that work? What does that look like for a developer?
Stepan: From the developer’s perspective, it’s really cool. You just write the query, you write a transaction, and immediately the query just gets updated. Now, how does that all actually work? One of the biggest problems you need to solve is one that Figma famously talked about in their multiplayer essay. Let’s say you change the color of a circle to blue. An update comes in saying it’s red. It should actually stay blue, because your optimistic update is the latest version of that information: it hasn’t come back from the server yet, right? That means it must be blue.
This is the kind of stuff that you have to handle. We solve it in a way that I think Clojurians will love. What we do is we keep a separate list of transactions that are considered pending. We have a store, like a database that’s on the client, and it’s immutable. When a transaction happens, we apply the transaction and create a new database that has that in it.
Even if there’s an update, let’s say that I changed the color to blue, and somebody changes it to red. The red will go into the committed store, but the blue will stay in the pending queue, which will override the red, and it will stay blue.
Daniel: Right. Okay. So you can still apply transactions while you have pending ones. It’s just they won’t show up if you’ve got an outstanding transaction.
Stepan: Yeah. If one overrides the other, it will do the correct thing, reasoning that, “By deduction, since I have not received information back for my pending transaction, mine must be the latest.”
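The committed store plus pending queue described in this exchange can be sketched as follows. All names are hypothetical, and the acknowledgement handling is simplified; the point is only that the visible state is the immutable committed snapshot with pending transactions replayed on top, which is why the circle stays blue.

```typescript
type Attrs = Record<string, unknown>;
type Snapshot = Readonly<Record<string, Attrs>>; // entity id -> attributes
type PendingTx = { id: number; entity: string; attrs: Attrs };

// Pure function: returns a new snapshot and never mutates the old one.
function applyTx(snapshot: Snapshot, tx: PendingTx): Snapshot {
  return { ...snapshot, [tx.entity]: { ...snapshot[tx.entity], ...tx.attrs } };
}

class OptimisticClient {
  private committed: Snapshot = {};
  private pending: PendingTx[] = [];
  private nextId = 1;

  // Local write: queue it; it becomes visible immediately through view().
  transact(entity: string, attrs: Attrs): number {
    const tx: PendingTx = { id: this.nextId++, entity, attrs };
    this.pending.push(tx);
    return tx.id;
  }

  // Data pushed from the server only ever touches the committed snapshot.
  receiveFromServer(entity: string, attrs: Attrs): void {
    this.committed = applyTx(this.committed, { id: 0, entity, attrs });
  }

  // The server confirmed one of our writes: fold it in and drop it from pending.
  acknowledge(txId: number): void {
    const tx = this.pending.find((t) => t.id === txId);
    if (!tx) return;
    this.committed = applyTx(this.committed, tx);
    this.pending = this.pending.filter((t) => t.id !== txId);
  }

  // What the UI renders: committed data with pending writes layered on top.
  view(): Snapshot {
    return this.pending.reduce(applyTx, this.committed);
  }
}

const client = new OptimisticClient();
client.receiveFromServer("circle", { color: "green" });
const txId = client.transact("circle", { color: "blue" }); // our optimistic write
client.receiveFromServer("circle", { color: "red" }); // someone else's change arrives

const visibleColor = client.view().circle.color; // "blue": pending overrides committed
```

The red update is not lost; it sits in the committed snapshot, and the pending blue write simply wins when the view is computed.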
Daniel: Is there a way, like it’s great that it is instantly acknowledged, but sometimes you really do want to know, does the server know about this as well? Is there a way to kind of wait or await on this?
Stepan: Yes. So each transaction is a promise. If you await it, it will return the state of the transaction. If you’re online, it will actually wait for the result to come back. If you’re offline, it will return saying, “This is queued.”
Daniel: Right. Okay. So you can decide what you want to do about that.
Stepan: Exactly.
Daniel: So what kind of data types can I store in Instant?
Stepan: Right now, you can store effectively blobs in columns, and links. With links, you can have all the kinds of relations that you would want, and you can express them in Instant. So a user has one profile, or a post has one owner, or a post has many comments, or tags. Posts have many tags; tags have many posts. All of these can be expressed.
Now, the blob value type, we are evolving that one. Eventually, you’ll be able to store strings, numbers, ints, whatever you like, but right now you can just store JSON.
Daniel: Okay. So on the back end and the Postgres table, it’s just a JSON blob value, right?
Stepan: Yes.
Daniel: If I query for like id equals three, how does that get turned into a query? Does that get queried in the JSON? Do you just do a JSON query in Postgres? Or how does it know?
Stepan: That one would look like a pattern that says, “question mark, user/id, three.” That will return the entity ID, which in this case will actually just be three.
Daniel: After that, I guess, you’d want to get all the attributes for this.
Stepan: Yeah. In which case, we have different indexes. There’s one index called the EA index, which will have all of what you could call the object values on it. You could say, “Give me all the triples in the EA index, where the ID is three.”
Daniel: And then the values are just JSON parsed, basically.
Stepan: Exactly.
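A sketch of that lookup in miniature, assuming each stored row holds an entity id, an attribute, a JSON-encoded value, and a flag saying whether the row belongs to the EA index. The row shape and names are hypothetical; in the real system this would be a query against Postgres rather than an in-memory scan.

```typescript
// Hypothetical row shape: the value column holds JSON text.
type TripleRow = { e: string; a: string; valueJson: string; ea: boolean };

// "Give me all the triples in the EA index where the entity id is X",
// then parse each JSON value to rebuild the object.
function entityFromEaIndex(rows: TripleRow[], entityId: string): Record<string, unknown> {
  const out: Record<string, unknown> = {};
  for (const row of rows) {
    if (row.ea && row.e === entityId) out[row.a] = JSON.parse(row.valueJson);
  }
  return out;
}

const tripleRows: TripleRow[] = [
  { e: "3", a: "user/id", valueJson: "3", ea: true },
  { e: "3", a: "user/name", valueJson: '"Stepan"', ea: true },
  { e: "4", a: "user/name", valueJson: '"Daniel"', ea: true },
];

const userThree = entityFromEaIndex(tripleRows, "3");
// userThree is { "user/id": 3, "user/name": "Stepan" }
```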
Daniel: So what are some of the bits that we haven’t talked about, perhaps, that you’re really proud of, and maybe things that people might not even realize were difficult or even know to think about?
Stepan: I think one that’s a fun one is how do you actually do these kinds of relations, right? Has one, has many, right? Many to one. What we do is we have these different indexes. One trick we do is we use, I don’t know if you know about Postgres partial indexes.
Daniel: Yes.
Stepan: Exactly. So we use partial indexes. The triple doesn’t just have EAV; it also has the indexes that it wants to be on. Based on the kind of link that you want to create, we can just insert them into their proper index and then generate a many-to-one or a one-to-many or a many-to-many relation. I think that one’s a very fun kind of surprise way to do it.
Daniel: Yeah, I was looking at the schema, and I saw these like A-V, E-A-V, and I was like, “Hold on, that sounds familiar.” But I was like, “But why is it on the tuple?” A few moments later, I realized, “Yeah, partial index. Very clever.”
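To show why putting index flags on the triple itself is useful, here is an in-memory analogy. Each triple lists the indexes it wants to be on, and an index is just the subset of triples that opted in, which is roughly what a Postgres partial index with a WHERE clause on a flag column gives you. The index names (in particular "vae" for reverse lookups) and the data are assumptions for illustration only.

```typescript
type IndexName = "ea" | "eav" | "av" | "vae";
type FlaggedTriple = { e: string; a: string; v: string; indexes: IndexName[] };

// Analogous to a partial index: only triples that opted in are part of it.
function partialIndex(rows: FlaggedTriple[], name: IndexName): FlaggedTriple[] {
  return rows.filter((row) => row.indexes.indexOf(name) !== -1);
}

// Reverse lookup through the "vae" subset: which entities link to `target` via `attr`?
function referencing(rows: FlaggedTriple[], attr: string, target: string): string[] {
  return partialIndex(rows, "vae")
    .filter((row) => row.a === attr && row.v === target)
    .map((row) => row.e);
}

const flagged: FlaggedTriple[] = [
  // Plain values only need forward lookup by entity.
  { e: "p1", a: "posts/title", v: "Hello", indexes: ["ea"] },
  { e: "p2", a: "posts/title", v: "World", indexes: ["ea"] },
  // Links also opt into the reverse index so "posts owned by u1" is cheap.
  { e: "p1", a: "posts/owner", v: "u1", indexes: ["eav", "vae"] },
  { e: "p2", a: "posts/owner", v: "u1", indexes: ["eav", "vae"] },
];

const postsOwnedByU1 = referencing(flagged, "posts/owner", "u1");
```

Only the link triples pay the cost of the reverse index; plain values stay out of it.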
Daniel: You’ve said that the front end is written in JavaScript, but it’s really, I guess, TypeScript compiled to JavaScript. Can you talk about the TypeScript typing that you’ve introduced? Because it seems, yeah, I’m not really much of a TypeScript developer. It seemed pretty advanced what you could do with kind of typing your schema and then having the types kind of flow back out of Instant.
Stepan: I will say it’s been interesting because I think the front end, I would say actually still to this day, it’s actually maybe more JavaScript than it is TypeScript, especially in the kind of heart of it, where we really needed to be free to change things around very quickly.
Now, there is a layer on top, which is the schema, and one of our co-workers, Mark, did an amazing job at this. I think TypeScript itself can be extremely powerful if you use it. We took inspiration from effectively like Zod, where you could just write these objects and we can start inferring types. Right now it’s in beta, mainly because when you use TypeScript so intensely, sometimes the output types don’t look as beautiful as you would want them to. Once we get that kind of beauty right, I’m really excited to get everything moved over to that flow.
Daniel: Nice. So I’m a ClojureScript developer. TypeScript is wonderful, but it’s not much use to me. What’s kind of the story, or is there a story, or how would I use Instant from Clojure?
Stepan: Beautiful. The other thing that we always want to maintain is Firebase does not have types by default. There is a zeitgeist now where you do want to have types. But I think the option of them is very important because one, you might be using something that doesn’t have types like ClojureScript. But even if you are using TypeScript or just JavaScript, you might want to just move very quickly.
What we do is if you don’t provide types, it’s actually still fine. It will just work. We’ll also infer, if you do a link, we’ll generate the link for you. You can use the admin dashboard to set the kind of schema that you want to use. It’s only an optional additive thing on the client where, if you want TypeScript to tell you early on, like, “This is what the return of this query is,” you can add the schema. But if you don’t, we’ll actually just do the best we can and give you the data that you have in the database.
Daniel: Gotcha. So one thing which we haven’t really touched on is why Clojure? How did you end up in Clojure? Because you’ve worked at large companies; Facebook was one of them. I assume you probably weren’t using Clojure there. So you know other languages. You had many choices. How did you end up using Clojure for this?
Stepan: Funny enough, the way I got to Facebook was by working at this company called Wit.ai. We were so new into AI that you could get any domain you want. We had voice AI, we had parse AI. That was the kind of stage of things. At that time, I was 20, and these four renegade engineers kind of took me in. They were all French, except for me. We all spoke Chinese randomly. I don’t know how that happened, but that was the situation. They all liked Clojure.
Before then, I was like a Ruby programmer, and that was kind of my thing. But Clojure just changed my perspective. After using it for years, you can see it has so many advantages. It really is a different way of programming. It’s great if you like exploratory programming, which for Instant was critical, right? Because we had so many problems that we just didn’t have solutions for until we started trying to solve them.
Being able to try things, being able to jump into a production instance and see the value of an object was just unbelievably important. One big advantage of Clojure for us is the REPL—well-named podcast, sir. The second big advantage is I think there was this craze around the last 10 years, for example, for microservices. I tend to actually disagree with that idea pretty intensely. I think it can add so much more complexity.
If it’s possible at all to keep things simple, you should, right? The problem is, if you only have one thread, it gets very hard to keep everything in one machine, for example. But with Clojure in the JVM, you have thread after thread after thread, whatever you need. You can make very complicated software as simply as possible. One thing I love about Clojure is that in other languages, you want to make a thread pool. It just feels like a serious thing, like, “Oh my God, I better write a doc about this.” In Clojure, it’s just like, “Do times eight, future, done.”
I think Clojure lets you compress very complicated problems as much as possible. The third amazing thing about Clojure is the environment, which is, you know, we needed to use CEL, right? What other esoteric language has a library for that? Well, Clojure does, because we’re on top of Java. The fourth, okay, the other big one is the community. I love the Clojure Slack. I also love going on there, and I always have to choose, like, which one am I today? Am I a Clojure or am I a beginner? I click beginners sometimes.
Daniel: I don’t know, would you call it stealth mode? Maybe it wasn’t that stealthy because you had a website that people could use for a while, but you’ve launched recently. What was the process of going through YC like?
Stepan: I think YC was probably one of the best wins. There’s kind of a saying nowadays, “Should you go to YC?” My answer is like 100% yes, you should. I think it was funny thinking about that YC experience because prior to YC, my co-founder and I actually quit our jobs for like two years. We were spending our savings and trying to build a startup. You would imagine, for example, that spending your savings, you would feel very stressed out. Turns out you actually don’t after a few months. You’re just going to forget about it.
What surprised me, for example, was when I joined YC, we actually worked way harder than when we were using our own savings. The reason is you have this group of people that you have to meet every two weeks and then say what you did. The social embarrassment is like a giant motivator. The other huge benefit is once you get in, the way that the system is structured is that the partner really has no incentive to treat you with kid gloves or to discount you. They’re just like benevolent effectively. They just want what’s best for you.
If you don’t want to do it, it’s also fine by them because they’re not tied to your success necessarily.
Daniel: So what is the structure? Do you have a single partner, or how does that work?
Stepan: The way it works is once you get in, you can join a group, and you’re assigned to like four group partners. Usually, one or two of them are the ones that you interface with the most. Ours were Aaron and Jared, and they were really, really good. They know things. Another funny thing with startups is some of the advice usually follows something like this: “Hey, we need to do X. We’re going to do it in six months.” They’ll say, “What if you do it tomorrow?” You would think that you’ll get used to this and then you’ll actually just keep doing that. But actually, no, saying that will make you move like 10 times faster.
There are just simple ideas that they can tell you that will really push you forward. In startups, for example, when you’re fundraising or when you’re doing certain kinds of things like deals or hiring, very often, in a negotiation with a VC, you’re going to be at a disadvantage because you’ve only done this once while the VC has done it a thousand times. Well, guess who’s done it 10,000 times? It’s the YC partners. They have your back. A lot of things change as a result of that. To summarize why I really love YC: it makes you work much harder, the partners are benevolent, and they have your back. They will tell you things that will affect the trajectory of your company.
Daniel: What is the current status of Instant? You’ve got the YC funding. I forget how much, but was it around?
Stepan: I think at some point now it’s around $400,000 or $500,000.
Daniel: Oh, right.
Stepan: So the way it worked for us was, two years ago, when we finished YC, we wrote this other essay called “A Graph-Based Firebase.” Effectively, at this point, we raised about $3.4 million. After we raised, we went through this adventure trying to figure out how to actually build Instant. If there’s one other thing I would say, I think we didn’t mean to be in a cave. If I was a better founder, it would have been faster. We just had to learn lessons that were painful. That’s why it became two years, when we probably could have done it in about nine months. But we went through all of that, and then we launched now.
Daniel: Cool. And so it looks like you’re hiring for developers. You’ve got one already, but you’re looking for more people?
Stepan: Yeah. For us, another thing that I really believe, and this is another Clojure thing, is I do think when you’re building things, a small team is generally more productive than a bigger team. We’ve always been very careful about hiring. Right now, we have this, in my opinion, the coolest team. All three of us have known each other for 10 years. We’ve worked across companies together. I think it’s very special when you can work with somebody who takes their craft very seriously and their word very seriously. When you need help, you have somebody that’s very good by your side.
That’s us right now. A lot of things have changed. We were in New York like a month ago. Now we’re in San Francisco. I’m making this the first video call podcast in our office. Things are happening. We need two more people. Right now, if you think about it, Instant is this large surface area. Auth by itself is a startup. But there are only three of us, right? Right now, we really want to make this core experience excellent. If we can get two more people to join us, it’ll be fantastic.
Daniel: Great. And if they’re a Clojure developer, that’s a plus, I assume?
Stepan: Certainly a plus.
Daniel: Okay. But not a requirement?
Stepan: Not necessarily a requirement, but hey, if you like Clojure and Rich Hickey, we’re going to get along.
Daniel: I mean, there aren’t that many other places apart from, I guess, Nubank where you get to work on a datalog database in production, so building one at least. Is there anything we haven’t covered before we start wrapping up?
Stepan: I think we’ve covered most of these things. Is there anything that you want to cover?
Daniel: Oh, there was one thing that you mentioned, which was about syncing arbitrary Postgres database rows.
Stepan: Oh, yeah.
Daniel: How would you do that? What does that look like?
Stepan: If you think about it, right? What we have right now is this three-layer architecture. On the front end, we have this client SDK that understands InstaQL. In the middle level, we have this sync engine that can take InstaQL queries and keep them up to date. At the very bottom, we have this multi-tenant architecture for a database. This one is an optimization for supporting lots of different people, making a free tier an actually viable thing that a user can use, rather than it freezing every few hours or something.
What if, instead of taking an InstaQL query and making a datalog query as a result, we could take an InstaQL query and make a SQL query as a result? Similarly, when you’re invalidating, instead of creating these topics that look more like datalog queries, we could generate topics from the SQL transactions. If we did that, then we could separate out the bottom layer and give this to larger companies that already have a Postgres database, for example.
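One way to picture topic-based invalidation is the sketch below. Everything in it is hypothetical: the `SyncEngine` class, the topic shape (an entity, attribute, value tuple with wildcards), and the `row_change_to_triples` helper that turns a changed SQL row into comparable facts are assumptions for illustration, not Instant's design.

```python
WILDCARD = object()  # sentinel meaning "any value in this position"


def topic_matches(topic, triple):
    """A topic matches a changed triple if every non-wildcard slot is equal."""
    return all(t is WILDCARD or t == x for t, x in zip(topic, triple))


def row_change_to_triples(table, primary_key, changed_columns):
    """Hypothetical helper: describe a changed SQL row as (id, table/column, value)."""
    return [
        (primary_key, f"{table}/{column}", value)
        for column, value in changed_columns.items()
    ]


class SyncEngine:
    """Tracks which topics each live query depends on."""

    def __init__(self):
        self.subscriptions = {}  # query_id -> list of topics

    def subscribe(self, query_id, topics):
        self.subscriptions[query_id] = list(topics)

    def stale_queries(self, changed_triples):
        """Return the ids of queries whose topics overlap with the changes."""
        stale = set()
        for query_id, topics in self.subscriptions.items():
            if any(topic_matches(t, c) for t in topics for c in changed_triples):
                stale.add(query_id)
        return stale


engine = SyncEngine()
engine.subscribe("todos-for-user-3", [(WILDCARD, "todos/owner_id", 3)])
engine.subscribe("user-3-profile", [(3, "users/name", WILDCARD)])

# A transaction inserted todo 17; only the todo list query should refresh.
changes = row_change_to_triples("todos", 17, {"owner_id": 3, "title": "Ship it"})
stale = engine.stale_queries(changes)
```

Under these assumptions, the sync engine in the middle layer stays the same whether the changes come from a datalog store or from SQL transactions; only the code that produces the change descriptions differs.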
Daniel: Nice. And would that be something, would they run sort of an agent on their own servers that kind of does this tailing of the logs for them?
Stepan: The vision for this one would be, we’d have to think about the security concerns, but one idea would be you could just give us your URL, and then we will spin everything up for you. That’s another great Clojure win, right? A lot of these things, in a wimpier language, I have to think, “Oh, we have to spin up a box,” but really we can just spin up a thread and have that work.
Daniel: All right. So if people want to get involved, they can start using Instant straight away. There’s a free tier, free plan, no credit card down. You just start writing code.
Stepan: Yeah. You get, is it a gigabyte free? Honestly, I don’t remember anymore, but it’s a lot of free. You don’t even have to sign up. You can actually go to instantdb.com/tutorial. If you click a button, we will make you a database.
Daniel: Right. Nice. And then there’s a pro plan. If you use enough storage, you start paying. How do you pay for this?
Stepan: Credit card, $30 a month.
Daniel: Yeah, that’s me. That’s how you move.
Daniel: Gotcha. Cool. And so you pay per gigabyte for your storage.
Stepan: Oh, yes. Sorry, as the storage goes, the price will go higher. I would say right now, the thing that we’re trying to optimize for isn’t, let’s say, enterprise value or contracts or things like that. What we really want is people building apps that are active and cool. We want to support even people who don’t have a lot. Maybe they’re in a hackathon. Maybe they’re a student. It still works for us.
We believe in the Patrick Collison way of, you know, maybe one person will be our Lyft.
Daniel: Right. And yeah, and by Lyft, I mean L-Y-F-T for the users. Great. One other thing is that you’re going to be speaking at the Conj in a little bit less than a month. So tell us what your talk is going to be about.
Stepan: It’ll be about building Instant and how Clojure was critical to it. I really do think, in one thing Rich shared in one of his talks, I don’t remember which one, he said, “One thing you want to get good at as a programmer is to solve things you’ve never done before.” Instant was for our team something we’ve literally just never done before, and Clojure was a big part of that, making it possible.
Daniel: Awesome. Is there anyone else you’d like to thank or mention before we wrap up?
Stepan: Yeah, thanks to my team, Joe, my co-founder, Daniel, our teammate. Us three have been in this together, and it’s really fun to work on this. There are our investors who took a shot on us, and there are so many people in our lives, right? I think there’s these guys at Wit.ai. I was 20 years old, and I didn’t have a visa and didn’t even have the ability to drink in the United States, but they had me join them as a second engineer.
Daniel: Nice. Actually, I’m looking at your list of investors, and I have to ask about a couple of these names. Paul Graham, obviously, is well known. He’s a co-founder of Y Combinator, but he individually invested in you? That wasn’t, like, by proxy, where he’s an investor because he’s involved in YC? That’s a separate thing?
Stepan: That’s a separate thing. Yeah. He individually invested in us.
Daniel: Amazing. And Jeff Dean, how did you even get in contact with Jeff?
Stepan: Jeff was introduced to us by one of our other investors.
Daniel: I see. Nice. Okay. Benefits of YC, perhaps?
Stepan: Yeah, I mean, you know, I think one other kind of Silicon Valleyism that’s interesting or that at least I’ve found to be true is most things take a very long time. For example, I actually cold emailed Paul Graham when I was 18, you know, so I’m proud of the question I asked him because I think I asked him something like, “Paul, who are your mentors now?” And he’s like, “I’m 50 years old. I don’t have mentors anymore.”
Okay. So I think the energy of the Valley is actually, it’s a very small community. Things take years, but your reputation does tend to build up. People are very helpful with each other. That’s kind of how a lot of these investments happened. Joe and I, for example, have been at it since, well, since I was an adult. I read PG’s essays about startups, and this is what I really like.
Daniel: Nice. All right. Well, thank you so much for talking about this. Thanks for making Instant open source. It’s really nice to be able to kind of look at the code and not just read the docs. You get a better sense of what’s going on underneath. We’ve got some work at Whimsical where we might need to make our app even more offline capable. I think we’ll probably take some inspiration from what you’ve shared here.
So, yeah, thanks so much, and I look forward to seeing what people build with Instant.
Stepan: Cheers, Daniel.