Don't Throw One Away - The Value Of Getting Things Right Early
“The management question, therefore, is not whether to build a pilot system and throw it away. You will do that. The only question is whether to plan in advance to build a throwaway, or to promise to deliver the throwaway to customers. Seen this way the answer is much clearer. [...]
Hence plan to throw one away; you will, anyhow.” – Fred Brooks, The Mythical Man-Month
This is one of the more famous quotes in software engineering, but I noticed a while ago that a lot of technical leaders at startups seemed to be applying this idea in a way that I actually think is wrong, or at least changing rapidly. Specifically I see this idea or some variation of it used to justify what look to me like straight-up bad technical choices made in the early days of many startups. The idea seems to be that “we know we’re going to throw it away anyway so it doesn’t matter if we use this technology/platform/tool that we know cannot possibly work for us if we succeed.”
Even Brooks himself revisited this later, saying that this concept is too simplistic and may be wrong in cases where rapid iterations can be used to fix software in place. However, this seems to me to be the other side of the same coin. While I don’t believe you should design a system to be thrown away in many cases, you also should not assume you will be able to fix everything gracefully or iteratively over time.
The challenges of designing a complex system when confronted with a blank text editor can be daunting to even experienced software professionals. I think that what is really happening here is that both of these concepts – “I don’t need to design carefully since we will throw it away later” and “it’s okay if it’s not right since we’ll just fix it as we go” – are really acting as psychological crutches to help technical leaders deal with the dread (and writer’s block) of going from zero to something by essentially forgiving any mistakes made in advance. As someone obsessed with the creative work of software engineering I find this fascinating. I also believe there is a better and more productive way to think about this if you find yourself in this situation.
What Are We Really Talking About?
First, it’s worth taking a moment to clarify what we are really talking about. Fred Brooks was originally writing about lessons learned during the development of OS/360 at IBM. Operating systems are obviously unforgiving and were relatively new to the world at that time. OS/360 specifically introduced several innovations in Operating System design and was estimated to have included 1 million lines of software in its first release. This was a massive project, demanding in terms of quality, and including multiple novel features.
In this context, when Fred Brooks says “plan to throw one away” the correct response is “of course.” This is not because you are not going to make every effort to build things the right way or because you are not going to use the best tools and methodologies. It is because you could not possibly have understood the design of the system correctly at the beginning of the project. As I like to say: your future, smarter self will hate almost all of the software you write. And sometimes that means you will need to start over, especially on large, complex, novel projects.
What Brooks was not saying – and my first objection when I hear this raised in the context of tech startups – is to deliberately choose the wrong tools (or frameworks, or platforms) with the intention of changing them later. The learning that he envisions leading to your next-generation design happens specifically because you are doing the best you can with the best tools and learning lessons about what works and what does not. You can’t benefit from experience if the experience itself is wrong or irrelevant to the problem you really want to be solving.
Also, let’s be clear here: a lot of software projects these days involve commodity solutions to commodity problems. We know how to build CRUD services for authenticated users to update a data store (though some variations of this problem can still be a headache). Is your problem really large? Is it really complex? Does it really include unsolved problems? If not, then you probably should just build it the right way from day one.
“Agile” and Iteration
If we aren’t going to assume up front that we can just trash the whole system and rebuild it later then we should probably just get started and assume that we will “fix forward” any mistakes we make along the way, right? This certainly fits with most of what we are told about your favorite flavor of “agile” development (I am constitutionally incapable of writing that without the scare quotes).
The problem with this is that you can’t really iterate on a ground-up rebuild. My co-founder likes to joke that anything interesting that you are doing will always be 21 story points (or whatever number is one size too large for your current scoping exercise). Since work like this never fits in a sprint it generally never gets done and you end up living with technical debt and incorrect architecture forever.
Unfortunately, even on fairly straightforward projects, some choices matter. The wrong storage layer, or in some cases the wrong language, framework or tool, can lead to drastically different outcomes, typically on scales so large that it can make a project effectively unworkable. Making mistakes on these kinds of choices leads engineers to tell product managers on mature projects that certain features “can’t be built” or that there is no way to make a system faster at any cost.
Data Storage Is Different
Most of what I am saying here is not particularly controversial and is probably pretty obvious in most cases: pay attention to the parts of the design that are hard to change later, remain flexible about as much as possible, and when problems have already been solved by other people, do what has worked before. But if this is all so obvious why does this idea of “throwing one away” come up at all?
My theory is that it comes from the current state of data infrastructure, particularly data warehouses (which I have written about here) as the centerpiece of a modern data architecture. Because of the limitations of data warehouses (too slow to power your application, too expensive to run around the clock) most sophisticated modern data architectures evolve by adding other components and systems to supply the missing features. That probably includes things like these for most projects:
An operational database for CRUD and application data retrieval
Nightly ETL/batch jobs to compute metrics from the data warehouse
A reporting database to power internal analytics
Stream processing components to provide intraday analysis
Caching of common content to reduce load on the database
Data queues to smooth out the rough edges between all those layers
Etc. etc. etc.
When you look at this you realize that it is completely rational for a technical leader at a startup to say that they do not have the time, the money, some of the skills or probably the patience needed to build all of that. That is correct: a small startup cannot build all of the above.
But it’s very easy for the understanding that all of this cannot realistically be built at that stage to turn into justifications for why you don’t really need to (or even maybe want to) do all that stuff anyway. “We can’t” starts to turn into “we wouldn’t even if we could.” This is where you hear things like “we don’t need all of this on day one”, “we can just start small and it will be good enough”, “we can always add those things later when we need them” and “we’re going to throw it away anyway.”
This is not correct. Of course your startup is going to want audit trails, log visibility, intraday metrics, integration with ML and AI training, automated data discovery and anomaly detection. Of course it would be better to have all of those things. And, if you are successful, you will need those things soon and it will be a bad day not to have them – or at least the ability to develop and deploy them quickly.
Getting Started While Planning for the Future
Essentially technical leaders at startups are stuck between a rock and a hard place: on the one hand any business, especially SaaS startups, will need all of these capabilities and of course on the other hand they can’t afford them (note: if you work at a startup that has unlimited time and money to address these challenges then I guess I don’t have any advice for you as I have no idea what that would be like). Because of this we are going to have to manage some tradeoffs, but we can at least approach it with some careful thought.
The most important thing is to make sure that we don’t make any choices that risk completely killing our business at any point. That means we should understand the cost (per unit time or per operation), the scalability and limitations, and the extensibility and adaptability of the technologies we choose. If your core storage has a per-query cost associated with it (e.g. DynamoDB or BigQuery) then you should understand what those costs will look like if the business suddenly has a spike in growth. If your storage has hard scaling limits: are those limits high enough for your business to be a success or not? Can the technologies you have chosen be deployed to all of the environments you must support for the business to succeed?
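To make the per-query cost question concrete, here is a minimal back-of-the-envelope sketch. The function names, the traffic figure and the price per million queries are all hypothetical placeholders, not real pricing for DynamoDB, BigQuery or any other product; substitute the numbers from your own bill.

```python
def monthly_query_cost(queries_per_day: int, price_per_million: float, days: int = 30) -> float:
    """Projected monthly spend for a store that bills per query.

    price_per_million is an assumed placeholder, not a vendor quote.
    """
    return queries_per_day * days * price_per_million / 1_000_000


def growth_scenarios(base_queries_per_day: int, price_per_million: float,
                     multipliers=(1, 10, 100, 1000)) -> dict:
    """Map each traffic multiplier to the projected monthly cost at that volume."""
    return {
        m: monthly_query_cost(base_queries_per_day * m, price_per_million)
        for m in multipliers
    }


if __name__ == "__main__":
    # Hypothetical starting point: 200,000 queries per day at $1.25 per million.
    for multiplier, cost in growth_scenarios(200_000, 1.25).items():
        print(f"{multiplier:>5}x traffic: ${cost:,.2f} per month")
```

The arithmetic is trivial on purpose. The point is to look at the 100x and 1000x rows before committing to a storage layer, and to ask whether the business model survives them.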
You can compromise (or “throw one away”) for the auxiliary parts of the business. Maybe that is reporting dashboards, maybe that is log aggregation. You can build off a platform that won’t scale infinitely if it scales enough for the business to succeed while you replace it later. But if you are selecting platforms that cannot meet the minimum threshold for success of the business you are doing it wrong.
Above all: do not make tradeoffs that you don’t have to. Are you sure you can’t provide intraday metrics? Why not? What would it cost (time and money)? What about supporting 2-3 orders of magnitude more traffic on your site/app? What would you want to use to handle that? Why can’t you use it from day one? If you can’t, what is the most frictionless way for you to switch later and what will that cost (again time and money)? Thinking this way can save a startup from its own success.
The Greatest Ability Is Fungibility
The single most important dimension to think about is probably your ability to operate, maintain and extend your technology platform over time. I think of this in terms of “fungibility”. Over time you will need to maintain multiple environments (think dev, test and prod), you will want new environments from time-to-time (for partnerships and internal groups with specific missions), you will want to move data seamlessly between them and you will want to manage deployment of major breaking versions of your software by deploying entirely new environments.
Developers have talked about “cattle not pets” for application layers for decades, but when it comes to data platforms most organizations cannot provide the same set of capabilities. I have seen many organizations where the absolute bottleneck was a single data platform that had been built by hand and maintained organically over years and probably could never be recreated correctly. If you can avoid this situation then you will have a lot less to worry about.
Fungibility (especially of data platforms) leads to adaptability of software projects and teams over time and either eliminates the day one design tradeoffs or reduces the potential risks massively.
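One way to picture fungibility is to treat every environment as a small piece of data from which the environment can be rebuilt, rather than as a machine someone set up by hand. The sketch below is purely illustrative: `EnvironmentSpec`, its fields and the `derive` helper are hypothetical names invented for this example and do not correspond to any particular product's API.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class EnvironmentSpec:
    """Everything needed to recreate an environment from scratch (hypothetical fields)."""
    name: str
    schema_version: str
    storage_nodes: int
    source_snapshot: Optional[str] = None  # where to reload data from, if anywhere


def derive(base: EnvironmentSpec, name: str, **overrides) -> EnvironmentSpec:
    """Create a new environment spec from an existing one, changing only what differs."""
    return replace(base, name=name, **overrides)


prod = EnvironmentSpec(name="prod", schema_version="v1", storage_nodes=6)

# A smaller test environment loaded from a copy of production data.
test = derive(prod, "test", storage_nodes=1, source_snapshot="prod-latest")

# A breaking schema change ships as an entirely new environment, not an in-place edit.
prod_v2 = derive(prod, "prod-v2", schema_version="v2", source_snapshot="prod-latest")
```

If every environment you run can be described this way and rebuilt from the description, then dev, test, partner and next-major-version environments are all variations on one spec instead of separate hand-maintained systems.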
Balancing Agility and Scalability
Software startups should want it all: all the capabilities, scalability, the ability to adapt and change over time, and predictable (and reasonable) costs. The key to avoiding "throwing one away" is to avoid assuming you can iterate your way out of every problem, and instead make thoughtful choices in a few key areas as early as possible:
Select the best tools for your specific needs – even if they require more initial investment.
Minimize the number of platforms and solutions to reduce complexity.
Avoid technologies that won't scale in critical dimensions of your business.
Plan for fungibility – ensure your systems can be easily replicated, modified, and scaled.
Build foundations that can accommodate future design changes without a complete overhaul.
The goal should be to create a system that can grow and evolve with your business and that doesn’t paint you into a corner.
Modern data infrastructure solutions are emerging to address these challenges. Our answer to these challenges was to develop MinusOneDB. It is a scalable, adaptable data platform that can grow from prototype to production without demanding major rewrites and without the need to rebuild core services that would be painful (or impossible at a practical level) to replace later. While m1db is just one of many data PaaS options in the market, it represents a new approach to data infrastructure that prioritizes both agility and scalability.
MinusOneDB is buildable (and rebuildable) from the ground up using service calls and configuration. Data can be reloaded at any scale in a matter of hours, whether you just need a copy of your data store or whether you are redesigning your schema to release a new major version. It scales from millions to billions to trillions of records with predictable, linear pricing. And it provides constant, fast performance so all of your queries work for all use cases at any scale.
In line with the above principles, we have built features for many common development needs under the theory that commodity engineering should be commoditized, including:
User logins and “forgot my password”
Resource isolation for risk management
Multi-tenancy
Audit (and transaction) logs
Data versioning and time-based queries
First-class support for text as a datatype
… more all the time
Starting something new is hard. You need the velocity to be first, the flexibility to adapt on the fly, and the scalability to grow with your market. The last thing you need is to have to do everything twice. Our goal is to help startups get off the ground quickly – and then scale forever – without ever having to “throw one away.”
“The management question, therefore, is not whether to build a pilot system and throw it away. You will do that. The only question is whether to plan in advance to build a throwaway, or to promise to deliver the throwaway to customers. Seen this way the answer is much clearer. [...]
Hence plan to throw one away; you will, anyhow.” – Fred Brooks, The Mythical Man Month
This is one of the more famous quotes in software engineering, but I noticed a while ago that a lot of technical leaders at startups seemed to be applying this idea in a way that I actually think is wrong, or at least changing rapidly. Specifically I see this idea or some variation of it used to justify what look to me like straight-up bad technical choices made in the early days of many startups. The idea seems to be that “we know we’re going to throw it away anyway so it doesn’t matter if we use this technology/platform/tool that we know cannot possibly work for us if we succeed.”
Even Brooks himself revisited this later, saying that this concept is too simplistic and may be wrong in cases where rapid iterations can be used to fix software in place. However, this seems to me to be the other side of the same coin. While I don’t believe you should design a system to be thrown away in many cases you also should not assume you will be able to fix everything gracefully or iteratively over time.
The challenges of designing a complex system when confronted with a blank text editor can be daunting to even experienced software professionals. I think that what is really happening here is that both of these concepts – “I don’t need to design carefully since we will throw it away later” and “it’s okay if it’s not right since we’ll just fix it as we go” – are really acting as psychological crutches to help technical leaders deal with the dread (and writer’s block) of going from zero to something by essentially forgiving any mistakes made in advance. As someone obsessed with the creative work of software engineering I find this fascinating. I also believe there is a better and more productive way to think about this if you find yourself in this situation.
What Are We Really Talking About?
First, it’s worth taking a moment to clarify what we are really talking about. Fred Brooks was originally writing about lessons learned during the development of OS/360 at IBM. Operating systems are obviously unforgiving and were relatively new to the world at that time. OS/360 specifically introduced several innovations in Operating System design and was estimated to have included 1 million lines of software in its first release. This was a massive project, demanding in terms of quality, and including multiple novel features.
In this context, when Fred Brooks says “plan to throw one away” the correct response is “of course.” This is not because you are not going to make every effort to build things the right way or because you are not going to use the best tools and methodologies. It is because you could not possibly have understood the design of the system correctly at the beginning of the project. As I like to say: your future, smarter self will hate almost all of the software you write. And sometimes that means you will need to start over, especially on large, complex, novel projects .
What Brooks was not saying – and my first objection when I hear this raised in the context of tech startups – is to deliberately choose the wrong tools (or frameworks, or platforms) with the intention of changing them later. The learning that he envisions leading to your next-generation design happens specifically because you are doing the best you can with the best tools and learning lessons about what works and what does not. You can’t benefit from experience if the experience itself is wrong or irrelevant to the problem you really want to be solving.
Also, let’s be clear here: a lot of software projects these days involve commodity solutions to commodity problems. We know how to build CRUD services for authenticated users to update a data store (though some variations of this problem can still be a headache). Is your problem really large? Is it really complex? Does it really include unsolved problems? If not, then you probably should just build it the right way from day one.
“Agile” and Iteration
If we aren’t going to assume up front that we can just trash the whole system and rebuild it later then we should probably just get started and assume that we will “fix forward” any mistakes we make along the way, right? This certainly fits with most of what we are told about your favorite flavor of “agile” development (I am constitutionally incapable of writing that without the scare quotes).
The problem with this is that you can’t really iterate on a ground-up rebuild. My co-founder likes to joke that anything interesting that you are doing will always be 21 story points (or whatever number is one size too large for your current scoping exercise). Since work like this never fits in a sprint it generally never gets done and you end up living with technical debt and incorrect architecture forever.
Unfortunately, even on fairly straightforward projects, some choices matter. The wrong storage layer, or in some cases the wrong language, framework or tool, can lead to drastically different outcomes, typically on scales so large that it can make a project effectively unworkable. Making mistakes on these kinds of choices lead engineers to tell product managers on mature projects that certain features “can’t be built” or that there is no way to make a system faster at any cost.
Data Storage Is Different
Most of what I am saying here is not particularly controversial and is probably pretty obvious in most cases: pay attention to the parts of the design that are hard to change later, remain flexible about as much as possible, when problems have been solved by other people try to do what has worked before. But if this is all so obvious why does this idea of “throwing one away” come up at all?
My theory is that it comes from the current state of data infrastructure, particularly data warehouses (which I have written about here) as the centerpiece of a modern data architecture. Because of the limitations of data warehouses (too slow to power your application, too expensive to use 24/7 continuously) most sophisticated modern data architectures evolve by adding other components and systems to add the other features needed. That probably includes things like these for most projects:
An operational database for CRUD and application data retrieval
Nightly ETL/batch jobs to compute metrics from the data warehouse
A reporting database to power internal analytics
Stream processing components to provide intraday analysis
Caching of common content to reduce load on the database
Data queues to smooth out the rough edges between all those layers
Etc. etc. etc.
When you look at this you realize that it is completely rational for a technical leader at a startup to say that they do not have the time, the money, some of the skills or probably the patience needed to build all of that. That is correct: a small startup cannotbuild all of the above.
But it’s very easy for the understanding that all of this cannot realistically be built at that stage to turn into justifications for why you don’t really need to (or even maybe want to) do all that stuff anyway. “We can’t” starts to turn into “we wouldn’t even if we could.” This is where you hear things like “we don’t need all of this on day one”, “we can just start small and it will be good enough”, “we can always add those things later when we need them” and “we’re going to throw it away anyway.”
This is not correct. Of course your startup is going to want audit trails, log visibility, intraday metrics, integration with ML and AI training, automated data discovery and anomaly detection. Of course it would be better to have all of those things. And, if you are successful, you will need those things soon and it will be a bad day not to have them – or at least the ability to develop and deploy them quickly.
Getting Started While Planning for the Future
Essentially technical leaders at startups are stuck between a rock and a hard place: on the one hand any business, especially SaaS startups, will need all of these capabilities and of course on the other hand they can’t afford them (note: if you work at a startup that has unlimited time and money to address these challenges then I guess I don’t have any advice for you as I have no idea what that would be like). Because of this we are going to have to manage some tradeoffs, but we can at least approach it with some careful thought.
The most important things are to make sure that we don’t make any choices that risk completely killing our business at any point. That means we should understand the cost (per unit time or per operation), the scalability and limitations, and the extensibility and adaptability of the technologies we choose. If your core storage has a per-query cost associated with it (e.g. DynamoDB or BigQuery) then you should understand what those costs will look like if the business suddenly has a spike in growth. If your storage has hard scaling limits: are those limits high enough for your business to be a success or not? Can the technologies you have chosen be deployed to all of the environments you must support for the business to succeed?
You can compromise (or “throw one away”) for the auxiliary parts of the business. Maybe that is reporting dashboards, maybe that is log aggregation. You can build off a platform that won’t scale infinitely if it scales enough for the business to succeed while you replace it later. But if you are selecting platforms that cannot meet the minimum threshold for success of the business you are doing it wrong.
Above all: do not make tradeoffs that you don’t have to. Are you sure you can’t provide intraday metrics? Why not? What would it cost (time and money)? What about supporting 2-3 orders of magnitude more traffic on your site/app? What would you want to use to handle that? Why can’t you use it from day one? If you can’t, what is the most frictionless way for you to switch later and what will that cost (again time and money)? Thinking this way can save a startup from its own success.
The Greatest Ability Is Fungibility
The single most important dimension to think about is probably your ability to operate, maintain and extend your technology platform over time. I think of this in terms of “fungibility”. Over time you will need to maintain multiple environments (think dev, test and prod), you will want new environments from time-to-time (for partnerships and internal groups with specific missions), you will want to move data seamlessly between them and you will want to manage deployment of major breaking versions of your software by deploying entirely new environments.
Developers have talked about “cattle not pets” for application layers for decades, but when it comes to data platforms most organizations cannot provide the same set of capabilities. I have seen many organizations where the absolute bottleneck was a single data platform that had been built by hand and maintained organically over years and probably could never be recreated correctly. If you can avoid this situation then you will have a lot less to worry about.
Fungibility (especially of data platforms) leads to adaptability of software projects and teams over time and either eliminates the day one design tradeoffs or reduces the potential risks massively.
Balancing Agility and Scalability
Software startups should want it all: all the capabilities, scalability, the ability to adapt and change over time, and predictable (and reasonable) costs. The key to avoiding "throwing one away" is to avoid assuming you can iterate your way out of every problem, and instead make thoughtful choices in a few key areas as early as possible:
Select the best tools for your specific needs – even if they require more initial investment.
Minimize the number of platforms and solutions to reduce complexity.
Avoid technologies that won't scale in critical dimensions of your business.
Plan for fungibility – ensure your systems can be easily replicated, modified, and scaled.
Build foundations that can accommodate future design changes without a complete overhaul.
The goal should be to create a system that can grow and evolve with your business and that doesn’t paint you into a corner.
Modern data infrastructure solutions are emerging to address these challenges. Our answer to these challenges was to develop MinusOneDB. It is a scalable, adaptable data platform that can grow from prototype to production without demanding major rewrites and without the need to rebuild core services that would be painful (or impossible at a practical level) to replace later. While m1db is just one of many data PaaS options in the market, it represents a new approach to data infrastructure that prioritizes both agility and scalability.
MinusOneDB is buildable (and rebuildable) from the ground up using service calls and configuration. Data can be reloaded at any scale in a matter of hours, whether you just need a copy of your data store or whether you are redesigning your schema to release a new major version. It scales from millions to billions to trillions of records with predictable, linear pricing. And it provides constant, fast performance so all of your queries work for all use cases at any scale.
In line with the above principles, we have built features for many common development needs under the theory that commodity engineering should be commoditized, including:
User logins and “forgot my password”
Resource isolation for risk management
Multi-tenancy
Audit (and transaction) logs
Data versioning and time-based queries
First-class support for text as a datatype
… more all the time
Starting something new is hard. You need the velocity to be first, the flexibility to adapt on the fly, and the scalability to grow with your market. The last thing you need is to have to do everything twice. Our goal is to help startups get off the ground quickly – and then scale forever – without ever having to “throw one away.”
“The management question, therefore, is not whether to build a pilot system and throw it away. You will do that. The only question is whether to plan in advance to build a throwaway, or to promise to deliver the throwaway to customers. Seen this way the answer is much clearer. [...]
Hence plan to throw one away; you will, anyhow.” – Fred Brooks, The Mythical Man Month
This is one of the more famous quotes in software engineering, but I noticed a while ago that a lot of technical leaders at startups seemed to be applying this idea in a way that I actually think is wrong, or at least changing rapidly. Specifically I see this idea or some variation of it used to justify what look to me like straight-up bad technical choices made in the early days of many startups. The idea seems to be that “we know we’re going to throw it away anyway so it doesn’t matter if we use this technology/platform/tool that we know cannot possibly work for us if we succeed.”
Even Brooks himself revisited this later, saying that this concept is too simplistic and may be wrong in cases where rapid iterations can be used to fix software in place. However, this seems to me to be the other side of the same coin. While I don’t believe you should design a system to be thrown away in many cases you also should not assume you will be able to fix everything gracefully or iteratively over time.
The challenges of designing a complex system when confronted with a blank text editor can be daunting to even experienced software professionals. I think that what is really happening here is that both of these concepts – “I don’t need to design carefully since we will throw it away later” and “it’s okay if it’s not right since we’ll just fix it as we go” – are really acting as psychological crutches to help technical leaders deal with the dread (and writer’s block) of going from zero to something by essentially forgiving any mistakes made in advance. As someone obsessed with the creative work of software engineering I find this fascinating. I also believe there is a better and more productive way to think about this if you find yourself in this situation.
What Are We Really Talking About?
First, it’s worth taking a moment to clarify what we are really talking about. Fred Brooks was originally writing about lessons learned during the development of OS/360 at IBM. Operating systems are obviously unforgiving and were relatively new to the world at that time. OS/360 specifically introduced several innovations in Operating System design and was estimated to have included 1 million lines of software in its first release. This was a massive project, demanding in terms of quality, and including multiple novel features.
In this context, when Fred Brooks says “plan to throw one away” the correct response is “of course.” This is not because you are not going to make every effort to build things the right way or because you are not going to use the best tools and methodologies. It is because you could not possibly have understood the design of the system correctly at the beginning of the project. As I like to say: your future, smarter self will hate almost all of the software you write. And sometimes that means you will need to start over, especially on large, complex, novel projects .
What Brooks was not saying – and my first objection when I hear this raised in the context of tech startups – is to deliberately choose the wrong tools (or frameworks, or platforms) with the intention of changing them later. The learning that he envisions leading to your next-generation design happens specifically because you are doing the best you can with the best tools and learning lessons about what works and what does not. You can’t benefit from experience if the experience itself is wrong or irrelevant to the problem you really want to be solving.
Also, let’s be clear here: a lot of software projects these days involve commodity solutions to commodity problems. We know how to build CRUD services for authenticated users to update a data store (though some variations of this problem can still be a headache). Is your problem really large? Is it really complex? Does it really include unsolved problems? If not, then you probably should just build it the right way from day one.
“Agile” and Iteration
If we aren’t going to assume up front that we can just trash the whole system and rebuild it later then we should probably just get started and assume that we will “fix forward” any mistakes we make along the way, right? This certainly fits with most of what we are told about your favorite flavor of “agile” development (I am constitutionally incapable of writing that without the scare quotes).
The problem with this is that you can’t really iterate on a ground-up rebuild. My co-founder likes to joke that anything interesting that you are doing will always be 21 story points (or whatever number is one size too large for your current scoping exercise). Since work like this never fits in a sprint it generally never gets done and you end up living with technical debt and incorrect architecture forever.
Unfortunately, even on fairly straightforward projects, some choices matter. The wrong storage layer, or in some cases the wrong language, framework or tool, can lead to drastically different outcomes, typically on scales so large that it can make a project effectively unworkable. Mistakes on these kinds of choices are what lead engineers to tell product managers on mature projects that certain features “can’t be built” or that there is no way to make a system faster at any cost.
Data Storage Is Different
Most of what I am saying here is not particularly controversial and is probably pretty obvious in most cases: pay attention to the parts of the design that are hard to change later, remain flexible about as much as possible and, when problems have been solved by other people, try to do what has worked before. But if this is all so obvious why does this idea of “throwing one away” come up at all?
My theory is that it comes from the current state of data infrastructure, particularly data warehouses (which I have written about here) as the centerpiece of a modern data architecture. Because of the limitations of data warehouses (too slow to power your application, too expensive to run 24/7) most sophisticated modern data architectures evolve by bolting on other components and systems to supply the missing features. That probably includes things like these for most projects:
An operational database for CRUD and application data retrieval
Nightly ETL/batch jobs to compute metrics from the data warehouse
A reporting database to power internal analytics
Stream processing components to provide intraday analysis
Caching of common content to reduce load on the database
Data queues to smooth out the rough edges between all those layers
Etc. etc. etc.
When you look at this you realize that it is completely rational for a technical leader at a startup to say that they do not have the time, the money, some of the skills or probably the patience needed to build all of that. That is correct: a small startup cannot build all of the above.
But it’s very easy for the understanding that all of this cannot realistically be built at that stage to turn into justifications for why you don’t really need to (or even maybe want to) do all that stuff anyway. “We can’t” starts to turn into “we wouldn’t even if we could.” This is where you hear things like “we don’t need all of this on day one”, “we can just start small and it will be good enough”, “we can always add those things later when we need them” and “we’re going to throw it away anyway.”
This is not correct. Of course your startup is going to want audit trails, log visibility, intraday metrics, integration with ML and AI training, automated data discovery and anomaly detection. Of course it would be better to have all of those things. And, if you are successful, you will need those things soon and it will be a bad day not to have them – or at least the ability to develop and deploy them quickly.
Getting Started While Planning for the Future
Essentially technical leaders at startups are stuck between a rock and a hard place: on the one hand any business, especially SaaS startups, will need all of these capabilities and of course on the other hand they can’t afford them (note: if you work at a startup that has unlimited time and money to address these challenges then I guess I don’t have any advice for you as I have no idea what that would be like). Because of this we are going to have to manage some tradeoffs, but we can at least approach it with some careful thought.
The most important thing is to make sure that we don’t make any choices that risk completely killing our business at any point. That means we should understand the cost (per unit time or per operation), the scalability and limitations, and the extensibility and adaptability of the technologies we choose. If your core storage has a per-query cost associated with it (e.g. DynamoDB or BigQuery) then you should understand what those costs will look like if the business suddenly has a spike in growth. If your storage has hard scaling limits: are those limits high enough for your business to be a success or not? Can the technologies you have chosen be deployed to all of the environments you must support for the business to succeed?
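One rough way to put numbers on these questions is a back-of-the-envelope model like the sketch below. Everything in it is an assumption made for illustration: the `StorageOption` type, the two example options, the prices, the traffic baseline and the hard limit are all placeholders, not real pricing for DynamoDB, BigQuery or any other product. The point is only to see what each candidate looks like at 10x, 100x and 1000x your current load before you commit to it.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class StorageOption:
    """Hypothetical storage option; every figure here is a placeholder."""
    name: str
    fixed_monthly_usd: float                  # flat fee per month
    usd_per_million_ops: float                # usage-based fee
    max_ops_per_month: Optional[int] = None   # None means no known hard limit


def monthly_cost(option: StorageOption, ops_per_month: int) -> Optional[float]:
    """Return the monthly cost in USD, or None if the load exceeds a hard limit."""
    if option.max_ops_per_month is not None and ops_per_month > option.max_ops_per_month:
        return None
    return option.fixed_monthly_usd + option.usd_per_million_ops * ops_per_month / 1_000_000


def growth_table(
    option: StorageOption,
    baseline_ops: int,
    multipliers: Tuple[int, ...] = (1, 10, 100, 1000),
) -> List[Tuple[int, Optional[float]]]:
    """Cost at each traffic multiplier applied to the baseline load."""
    return [(m, monthly_cost(option, baseline_ops * m)) for m in multipliers]


# Illustrative inputs only: substitute your own vendor quotes and traffic.
PER_QUERY = StorageOption("usage-priced store", 0.0, 1.25)
FIXED = StorageOption("fixed-capacity store", 2000.0, 0.0, max_ops_per_month=5_000_000_000)
BASELINE_OPS = 50_000_000

for option in (PER_QUERY, FIXED):
    print(option.name)
    for multiplier, cost in growth_table(option, BASELINE_OPS):
        label = "exceeds hard limit" if cost is None else f"${cost:,.2f}/month"
        print(f"  {multiplier:>5}x traffic: {label}")
```

A real comparison would need more dimensions (storage volume, egress, engineering time), but even a model this crude shows whether a choice gets expensive or simply stops working as the business grows.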
You can compromise (or “throw one away”) for the auxiliary parts of the business. Maybe that is reporting dashboards, maybe that is log aggregation. You can build off a platform that won’t scale infinitely if it scales enough for the business to succeed while you replace it later. But if you are selecting platforms that cannot meet the minimum threshold for success of the business you are doing it wrong.
Above all: do not make tradeoffs that you don’t have to. Are you sure you can’t provide intraday metrics? Why not? What would it cost (time and money)? What about supporting 2-3 orders of magnitude more traffic on your site/app? What would you want to use to handle that? Why can’t you use it from day one? If you can’t, what is the most frictionless way for you to switch later and what will that cost (again time and money)? Thinking this way can save a startup from its own success.
The Greatest Ability Is Fungibility
The single most important dimension to think about is probably your ability to operate, maintain and extend your technology platform over time. I think of this in terms of “fungibility”. Over time you will need to maintain multiple environments (think dev, test and prod), you will want new environments from time to time (for partnerships and internal groups with specific missions), you will want to move data seamlessly between them and you will want to manage deployment of major breaking versions of your software by deploying entirely new environments.
Developers have talked about “cattle not pets” for application layers for decades, but when it comes to data platforms most organizations cannot provide the same set of capabilities. I have seen many organizations where the absolute bottleneck was a single data platform that had been built by hand and maintained organically over years and probably could never be recreated correctly. If you can avoid this situation then you will have a lot less to worry about.
Fungibility (especially of data platforms) leads to adaptability of software projects and teams over time and either eliminates the day one design tradeoffs or reduces the potential risks massively.
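To make the idea a little more concrete, here is a minimal sketch of what “cattle not pets” can look like for a data environment: each environment is described entirely by a small declarative spec, and new environments are derived from an existing spec rather than assembled by hand. The `EnvironmentSpec` fields, the `snapshots/latest` path and the plan steps are hypothetical names invented for this example; they do not correspond to any particular product’s API.

```python
from dataclasses import dataclass, replace
from typing import List


@dataclass(frozen=True)
class EnvironmentSpec:
    """Hypothetical declarative description of one data environment."""
    name: str
    schema_version: str
    node_count: int
    data_source: str  # where the environment's data is loaded from


def derive(base: EnvironmentSpec, name: str, **overrides) -> EnvironmentSpec:
    """Create a new spec from an existing one, changing only what differs."""
    return replace(base, name=name, **overrides)


def build_plan(spec: EnvironmentSpec) -> List[str]:
    """Ordered steps a provisioning tool could run; the same spec always yields the same plan."""
    return [
        f"create environment {spec.name} with {spec.node_count} nodes",
        f"apply schema {spec.schema_version}",
        f"load data from {spec.data_source}",
    ]


# Illustrative values only.
BASE = EnvironmentSpec(name="prod", schema_version="v1", node_count=6,
                       data_source="snapshots/latest")

dev = derive(BASE, "dev", node_count=1)                 # small copy for development
prod_v2 = derive(BASE, "prod-v2", schema_version="v2")  # breaking release as a new environment

for spec in (BASE, dev, prod_v2):
    print("\n".join(build_plan(spec)))
    print()
```

Because every environment comes from a spec rather than from years of manual changes, rebuilding one or standing up a new one for a partner is a matter of re-running the plan, not of archaeology.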
Balancing Agility and Scalability
Software startups should want it all: all the capabilities, scalability, the ability to adapt and change over time, and predictable (and reasonable) costs. The key to avoiding “throwing one away” is to avoid assuming you can iterate your way out of every problem, and instead make thoughtful choices in a few key areas as early as possible:
Select the best tools for your specific needs – even if they require more initial investment.
Minimize the number of platforms and solutions to reduce complexity.
Avoid technologies that won’t scale in critical dimensions of your business.
Plan for fungibility – ensure your systems can be easily replicated, modified, and scaled.
Build foundations that can accommodate future design changes without a complete overhaul.
The goal should be to create a system that can grow and evolve with your business and that doesn’t paint you into a corner.
Modern data infrastructure solutions are emerging to address these challenges. Our answer to these challenges was to develop MinusOneDB. It is a scalable, adaptable data platform that can grow from prototype to production without demanding major rewrites and without the need to rebuild core services that would be painful (or impossible at a practical level) to replace later. While m1db is just one of many data PaaS options in the market, it represents a new approach to data infrastructure that prioritizes both agility and scalability.
MinusOneDB is buildable (and rebuildable) from the ground up using service calls and configuration. Data can be reloaded at any scale in a matter of hours, whether you just need a copy of your data store or whether you are redesigning your schema to release a new major version. It scales from millions to billions to trillions of records with predictable, linear pricing. And it provides constant, fast performance so all of your queries work for all use cases at any scale.
In line with the above principles, we have built features for many common development needs under the theory that commodity engineering should be commoditized, including:
User logins and “forgot my password”
Resource isolation for risk management
Multi-tenancy
Audit (and transaction) logs
Data versioning and time-based queries
First-class support for text as a datatype
… more all the time
Starting something new is hard. You need the velocity to be first, the flexibility to adapt on the fly, and the scalability to grow with your market. The last thing you need is to have to do everything twice. Our goal is to help startups get off the ground quickly – and then scale forever – without ever having to “throw one away.”
Author
William Wechtenhiser
Aug 8, 2024