SDO Opinion - Do we really need so many data companies? - Mark Freeman
An analysis of the current state of the data industry.
I just spent the past week at Data Council Austin, learning from some amazing data professionals and, more importantly, getting a snapshot of the data landscape. Among the great talks, one thing became very clear: we have a lot of data startups with substantial overlap, whether it be data catalogs, data observability, or now the emerging generative AI space. Is this fractured market sustainable with a looming recession and capital becoming expensive again?
This week I’m holding off on the interviews and diving deeper into the above question by synthesizing my observations of the market, conversations with previous guests, and the various conversations I had the past week in Austin. In this edition, I aim to answer the following:
How did we get to our current market in the data landscape?
What’s the outlook for data startups today?
What does the changing market mean for data professionals?
While I can’t predict the future, I hope to give you the market context so you can better navigate the changing landscape as it unfolds.
Upcoming SDO Interviews:
Black Swan Events in Housing Data - Jhakir Miah
Navigating Enterprise-Scale Data Challenges - Tiankai Feng
How to Upskill in DataOps - Sarah Floris
How did we get to our current market in the data landscape?
The past fifteen years have not been normal. Between the 2008 housing crisis and the coronavirus pandemic, we were put on a path of surplus that eventually has to end. Specifically, I argue that our current data market resulted from:
The creation of cheap capital via monetary policies enacted by the US government to curb crises.
The rise of the cloud which drastically sped up startups’ time to market for their solutions.
Snowflake’s IPO pushing investors heavily into data, with investment peaking in 2021.
With interest rates rising again, the underlying assumptions of the heavy investments into data no longer hold. Thus, the data industry is on a collision course with a harsh reality.
Though the 2008 crash feels like a completely different lifetime, its reverberations are still felt today in the form of extremely low interest rates that made capital cheap. For most of the period from 2008 to 2022, interest rates sat at or near zero, which completely changed how capital was viewed by financial institutions and, thus, venture-backed companies. This is most apparent in the rise in valuations for seed-stage and Series A companies and their ability to raise subsequent rounds with minimal revenue.
Fast forward to March 2020, and it looked like the decade-plus bull run would finally end as the entire globe ground to a halt. What followed was some of the largest gains in technology we have ever seen, with even more money subsequently being pumped into the data startup ecosystem by venture capital.
While one could state that the shift (shove?) to remote work increased the value of technology, I would argue that monetary and fiscal policies once again bolstered the US economy to curb a financial crisis, this time in the form of stimulus checks and PPP loans for businesses, ultimately kicking the can further down the road.
So with monetary policies making both capital cheap and bolstering the economy during crises, we understand how investors had the means to invest in data companies, but why did they decide to invest in data? In addition to the above “black swan events,” two market shifts specific to data were 1) the rise of cloud infrastructure and 2) the Snowflake IPO in 2020.
The cloud as we know it was first introduced by AWS in 2006 and didn’t reach a $100BN plus market size until 2016. In that time, we had an explosion of SaaS startups as the barrier to entry was reduced to a few button clicks and a credit card payment. The competitive advantage for companies was now speed to market, as getting your product in front of users and iterating would give you the “escape velocity” venture capital drooled over. This created the perfect environment for the emergence and rapid adoption of the Modern Data Stack and the various point solutions supporting the data lifecycle. Also, there was that HBR article resulting in every company rushing to hire a data team regardless of need.
These data companies fit perfectly within the venture capital model as well. Every company in the world has to use data; therefore, data companies have the $1BN plus total addressable market to perk up the ears of VCs. In addition, with the growing open-source and cloud ecosystem, the means to build these companies had relatively low initial costs compared to traditional hardware companies. But what accelerated investment into data companies to the peak we saw in 2021 was the 2020 Snowflake IPO. Below is an excerpt from my interview with Matt Turck, Managing Director at FirstMark Capital, explaining the situation:
In the wake of the Snowflake IPO, there was as we all know, an enormous amount of excitement around data infrastructure, the rise of the Modern Data Stack, all those things.
So you end up with a bunch of very interesting companies getting started and a lot of venture capital that was more than happy to fund those companies and then fund them again, and then six months later, fund them again. Occasionally again and again. And that was, a lot of fun, a little dizzying, but ultimately led to a lot of categories emerging overnight and getting very crowded overnight.
The Snowflake IPO “validated” (heavy emphasis on quotes) for venture capital the market opportunity in data infrastructure companies, leading to a race to get their own piece of the space. Matt Turck’s tweet comparing the landscape in 2012 with the landscape in 2023 (see Referenced Sources) best illustrates this.
All of which brings me to the present day, where I stood in the vendor room at Data Council Austin and saw numerous data startups essentially doing the same thing. Many of these companies raised at unbelievable valuations in the past few years in a fractured market where consumers have a swath of options for every step in the data lifecycle. Before, revenue didn’t matter as companies could be propped up by venture capital as they sold a vision of their market opportunity. Now, consumers are slashing their budgets in a down market, and financial institutions have less capital to deploy with rising interest rates (i.e., interest rates finally moving back to normal levels). In short, our world of surplus is over, the underlying assumptions behind these investments are no longer true, and these data startups with huge valuations have rough seas ahead of them.
What’s the outlook for data startups today?
The 2023 bank run on SVB was the wake-up call to data startups that their previous world of surplus was now over. Samir Kaji’s tweet thread (see Referenced Sources) gives a great summary of the SVB situation, but in short, the Fed’s interest rate increases turned SVB’s deep investment in long-term bonds into realized losses. This spooked VC investors, and a prisoner’s dilemma was presented to every founder who deposited more than $250k in SVB. Though the federal government thankfully stepped in to make depositors whole and avoid the collapse of the regional banking system in the US, a pointed question was put to our market: what happens when startups run out of capital?
My previous interview with Ethan Aaron, Founder & CEO at Portable, perfectly described the position many companies now find themselves in following SVB’s collapse:
… [Let's] rewind back to 2021 and 2022. Interest rates were very low, which means valuations of everything were very… high. You could have one dollar revenue and the value of your company was a thousand dollars. We saw the same thing happening in the data world. Companies with very little revenue seen unbelievably high valuations…
What does that mean? It means that those companies raised a lot of money. So if one of these companies raised a hundred million dollars… you don't really have to worry about these companies disappearing overnight. That's not the problem that is going to face companies that took on money at too high of a valuation.
The problem these data startups face now is growing into these unbelievably high valuations in a saturated market where businesses are slashing budgets and headcount. There is not enough available revenue in the market for all of these data companies to be healthy businesses. In addition, they can no longer buy time to reach product-market fit via an infusion from venture capital without accepting a down round (if they can even raise again).
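To make “growing into a valuation” concrete, here is a rough back-of-the-envelope sketch in Python. Every number in it is hypothetical (the $1BN valuation, $5M revenue, 10x multiple, and growth rates are illustrative assumptions, not figures from any company mentioned here), and the function name is made up for this example. It assumes revenue compounds at a constant annual rate and asks how long it takes until valuation divided by revenue falls to a multiple the market might accept.

```python
import math


def years_to_grow_into_valuation(valuation, annual_revenue, target_multiple, annual_growth):
    """Years of constant compounding growth until valuation / revenue <= target_multiple.

    Hypothetical helper for illustration only; real valuations depend on far
    more than a single revenue multiple.
    """
    if annual_revenue <= 0 or target_multiple <= 0 or annual_growth <= 0:
        raise ValueError("revenue, multiple, and growth rate must all be positive")
    required_revenue = valuation / target_multiple
    if annual_revenue >= required_revenue:
        return 0.0  # revenue already supports the valuation
    # Solve annual_revenue * (1 + g) ** t = required_revenue for t.
    return math.log(required_revenue / annual_revenue) / math.log(1 + annual_growth)


# Hypothetical startup: $1BN valuation, $5M revenue, 10x target multiple.
doubling = years_to_grow_into_valuation(1_000_000_000, 5_000_000, 10, 1.00)
slower = years_to_grow_into_valuation(1_000_000_000, 5_000_000, 10, 0.30)
print(f"Doubling revenue every year: {doubling:.1f} years")
print(f"Growing 30% per year: {slower:.1f} years")
```

Under these assumed numbers, even doubling revenue every year takes a little over four years to justify the valuation, and 30% annual growth takes more than eleven. That is a long time to hold on in a market where customers are cutting budgets.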
More importantly, Ethan also highlighted, in another one of our conversations off-mic, how these companies’ high valuations but low revenue make it unlikely for them to exit via acquisition. Instead, I argue that we will be left with highly valued zombie companies that can’t be acquired and limp along without product-market fit for years to come. Some may use their long runway to pivot, but ultimately they are chasing growth rather than creating it.
What does the changing market mean for data professionals?
I often hear people say, “we will finally have a much-needed consolidation of data companies,” but I disagree. As I stated earlier, we will have zombie companies limping along with their massive bank accounts and valuations but with minimal customers to cater to. I argue that it’s in data professionals’ best interest to understand the risk profile of the vendors they currently utilize and are considering. Some key questions to ask yourself when evaluating this risk:
Is the vendor solving a business problem tied to revenue or just providing a point solution for a specific area of the data lifecycle?
When was the respective vendor’s last round, and how much money did they raise?
Given their valuation, do you believe their total addressable market warrants such a valuation?
Has the solution provided by the vendor reached product-market fit, or is it on a solid path to such a state?
If a respective vendor were to go out of business or pivot, how difficult would it be to migrate to a new solution?
Even if you are not a vendor, it would be naive to think this massive market shift won’t change the relationship of data teams with the business. The time of POCs going nowhere, R&D without a strategy, and rising cloud costs is now over. We had a great run on the data hype cycle, but now we need to actually deliver the outsized value we promised or risk the same fate as vendors of being cut from budgets.
How does one deliver outsized value with data? You become a strategic partner to the business to determine how to mitigate risk in a changing market or generate revenue. Data professionals are in a unique position where we have asymmetrical access to information about the business that many in the org are not privy to. The quicker you can utilize this advantage to impact a business's bottom line, the more at ease you can feel about data’s role in your respective business. If you need examples of achieving this, Vin Vashishta has two great articles I highly recommend on repositioning your data career in the wake of potential layoffs (see Referenced Sources).
Closing Thoughts
Though I highlighted the role of venture capital in propping up our data industry, I don’t believe it’s fair to blame them for creating an unsustainable market in data. I hope this newsletter edition illustrates how their actions are mainly a symptom of unique market conditions. Specifically, historically low interest rates were used to curb financial crises in the US, while the shift of technology toward the cloud and data changed the dynamics of startups.
With that said, the symptoms still exist, and the data industry has a rising fever. Eventually, something has to give due to rising interest rates breaking all investment assumptions. Though the ideal solution is seeing a bundling of all these overlapping data vendors, this is just not feasible given their massive valuations and limited avenues to reach revenue to match, thus limiting acquisitions. We have a turbulent road ahead of us in the data industry, but I remain hopeful. The swing back to actually providing value and generating revenue means that our relatively young industry is being pushed to mature, which is better for all of us in the long run.
Referenced Sources:
Data scientist: The sexiest job of the 21st Century. Harvard Business Review. (2022, October 19). Retrieved April 1, 2023, from https://hbr.org/2012/10/data-scientist-the-sexiest-job-of-the-21st-century
FDIC acts to protect all depositors of the former Silicon Valley Bank, Santa Clara, California. FDIC. (n.d.). Retrieved April 1, 2023, from https://www.fdic.gov/news/press-releases/2023/pr23019.html
The Federal Reserve. (n.d.). Implementation note issued March 22, 2023. Board of Governors of the Federal Reserve System. Retrieved April 1, 2023, from https://www.federalreserve.gov/newsevents/pressreleases/monetary20230322a1.htm#:~:text=The%20Board%20of%20Governors%20of,%2C%20effective%20March%2023%2C%202023.
Freeman, M. (2023, March 10). SDO 017 - navigating the ML, AI, and data (MAD) landscape as a VC - Matt Turck. Scaling DataOps Newsletter. Retrieved April 1, 2023, from https://scalingdataops.substack.com/p/sdo-017-navigating-the-ml-ai-and
Freeman, M. (2023, March 4). SDO 016 - navigating uncertain markets as a data leader - Ethan Aaron. Scaling DataOps Newsletter. Retrieved April 1, 2023, from https://scalingdataops.substack.com/p/sdo-016-navigating-uncertain-markets
Kaji, S. (2023, March 9). A lot of panic re: SVB (you should see my phone/emails!). A bank run driven by panic is the real risk here, not the action of selling LT Securities at Loss. I have no inside information as I left SVB in 2012, but know enough about banking to piece together. quick 🧵. Twitter. Retrieved April 1, 2023, from https://twitter[.]com/Samirkaji/status/1633958266509336576?s=20
Lee, J. L. (2020, July 8). Data shows companies that raised funds in 2020 also approved for U.S. PPP Loans. Reuters. Retrieved April 1, 2023, from https://www.reuters.com/article/us-health-coronavirus-ppp-funding/data-shows-companies-that-raised-funds-in-2020-also-approved-for-u-s-ppp-loans-idUSKBN2490EW
Snowflake announces pricing of Initial Public Offering. Snowflake. (2021, July 7). Retrieved April 1, 2023, from https://www.snowflake.com/news/snowflake-announces-pricing-of-initial-public-offering/
Temkin, M. (2022, June 7). The market correction has come for series A and seed startups. PitchBook. Retrieved April 1, 2023, from https://pitchbook.com/news/articles/Series-A-seed-deals-venture-capital-market-turmoil
Turck, M. (2023, February 23). How it started (2012). How it's going (2023). pic.twitter.com/lce4rekxqb. Twitter. Retrieved April 1, 2023, from https://twitter[.]com/mattturck/status/1628879218535763968?s=20
Vailshery, L. S. (2022, December 6). Public cloud computing market worldwide 2008-2020. Statista. Retrieved April 1, 2023, from https://www.statista.com/statistics/510350/worldwide-public-cloud-computing/
Vashishta, V. (2022, December 4). Developing and evolving a data organizational structure to meet business needs. High ROI Data Science. Retrieved April 1, 2023, from https://vinvashishta.substack.com/p/developing-and-evolving-a-data-organizational
Vashishta, V. (2023, January 25). What to watch next in the layoff cycle. High ROI Data Science. Retrieved April 1, 2023, from https://vinvashishta.substack.com/p/what-to-watch-next-in-the-layoff
About On the Mark Data:
On the Mark Data helps brands connect to data professionals through captivating content, such as this newsletter and other featured content! Please feel free to check out my website to learn how I can support your data brand via influencer marketing or content and go-to-market strategy consulting.