SDO 020 - Managing Black Swan Events in Housing Data
Interview: Jhakir Miah, Director of Engineering at Amrock
What are your thoughts on being a “data-driven organization?”
Upon completing this newsletter edition, I can only think of one thing: I just stumbled across a hidden diamond in the data landscape that more people need to pay attention to. The type of organization that my guest, Jhakir Miah, described was nothing short of a dream company with respect to data maturity. For context, Amrock is within the Rocket Companies (previously branded as Quicken Loans) portfolio, and their work reflects the culture of the broader Rocket Companies ecosystem. Specifically, their relentless pursuit of automation and the lockstep between data teams and business leaders are next level, and they are one of the few companies that are truly “data-driven” throughout the entire portfolio. Below is a great example of what we aspire to accomplish with data!
— Mark
Hear from Jhakir Miah, Director of Engineering at Amrock:
LinkedIn never ceases to amaze me with its potential to connect you with amazing professionals in the data space. One of those professionals is Jhakir Miah, who popped up in my post feed with amazing insights on data leadership. Though this interview was our first conversation, I hope this will not be our last, as Jhakir has a wealth of insights on technical and business leadership. I learned a lot in our brief chat, and I’m excited for you to learn about him and his team’s amazing work.
Sponsorship:
Real quick… my goal is to keep this newsletter free for my audience and have sponsors pay. Engaging with the following would be a huge help if you want to support my newsletter!
This edition of Scaling DataOps is sponsored by SingleStore who is hosting the free webinar: Build a ChatGPT App on Your Own Data - April 25th, 10 AM PST.
You can register for the event using my featured link, where every signup goes a long way in supporting the Scaling DataOps Newsletter:
In addition to supply chains, Covid lockdown completely changed the real estate market essentially overnight. How do such drastic changes impact how your team works with housing data?
Jhakir: “So I work for a company that has multiple smaller sister companies within the portfolio. Rocket Companies as a whole has been in the mortgage industry and this housing market for over 35 years. So these changes that we saw in the market threw off a lot of the younger companies and startups because they have not gone through the cycles and the shifts in the market.
Rocket Companies has been through this for so many years that when the shift happened, we immediately knew how to redirect our businesses. When Covid hit, we knew that, okay, refinance is gonna start booming because the interest rate was going down. So immediately, we had the trained professionals and the staff, and we started engaging more people to focus on the refinance side and captured that part of the market while maintaining our purchase market as well.
It never impacted us, it didn't hinder us at that point because, like I said, we've relied heavily on technology as well. So we immediately started leveraging data. Where in the market do we need to go to, which areas do we need to focus on to really optimize our processes? Which business process can we automate using data that makes sense to say, "Okay, this can be done through computers, while these should be done with humans."
Because they've been through this so many times, our leaders like to say that this is like a baseball game. This is the fifth inning or sixth inning. So if we're down, that's okay. We still got six or seven more innings left to play. So at that point, they knew what to expect. They've been through this several times. They shifted our focus accordingly, and our strategy adjusted to ensure that we were able to sustain what we had while capturing the new market that was coming around. When that Covid pipeline happened, everyone was home, but the housing market saw a huge and significant rise in needs and demand, and we just needed to figure out how to support those business needs.”
Mark: “So, just to follow up on that. How do you align your data strategy with that? Because there's such a historical precedent already there, but data changes so quickly.”
Jhakir: “Yes, absolutely. Like you said, historically, the organization has always been data-driven. So it was always engaged with our business stakeholders and partners. Before they even made the decisions: what does the data say? Then we validated that against the market side. Does that make sense? And our senior leaders and our business partners, with their experience, said, "Okay, it does make sense, we will anticipate this." So the data teams automatically knew where to go because we were along with the business, and we were at the same table discussing those changes.
Before final decisions were made, data teams came in and validated a lot of those things. We said, "Okay, is this the right approach for us? Does that make sense? What does it say?" And the company is very tech-driven. Even though in the mortgage industry we service housing, many of our sister companies do other things, like Rocket Auto or Rocket Loans, and we have Rocket Money in our portfolio as well right now, but they're all tech-driven. So a lot of times they'll say, "Yes, this makes sense. We think this is a good market for us to go into. Let's validate that. What does the data say?" They bring in the data people to answer and validate those questions, then make decisions accordingly.”
One of the largest players to be impacted by AI models changing during the pandemic was Zillow Offers losing over $500MM due to a poor AI model. What considerations do you make to ensure your data processes don’t meet a similar fate?
Jhakir: “That was a big one. That was a wake-up call for a lot of organizations really leveraging AI and data science models to say, "This is the direction we go." For ourselves, it's ensuring the data quality aspect of it is there. We're validating these things at a much smaller scale first, to test and confirm them before we scale out. And because of that industry experience over the years, even before we make a decision to purchase $500 million, it is validated against industry experts and against senior leadership who've been through the cycle. We're not hesitant to make large investments where it makes sense, but we're thoughtful about it: does the data confirm what we believe? Is there evidence to support that? We should be able to explain what the model's doing.
So a lot of the time it's on the data scientist to explain what the models are doing, how they're doing it, and why they should be there. We've seen senior executives and our business partners in a meeting with the data scientist or a data analyst, walking through each piece. It doesn't trickle up for us; we don't say, "Oh, this analyst did it, so the VP's gonna go present it." No, bring the analyst to the table, let's have a discussion, because it's that person doing the analysis, doing the data checks, and validating everything. Let's talk it out. You're the expert at the table, you have the center seat, let's figure out if this is the right direction for us as a company to go.
I saw an article that resonated really well with me. Instead of it being data-driven, it should be decision-driven analytics. So here's the decision we wanna make. We'll find out everything that needs to be there. Find the data, validate it, and build your model. See if it supports the decision. The other way around is it's gonna be a tremendous amount of shifts that needs to happen, mindset and growth. The business becomes the reactive component as opposed to us, the data people being the proactive and saying, "Hey, this is the decision you wanna make? Here's how we proactively make it. Here are the data sets that support that or deny that."
A lot of times we've been in conversations where business partners want to make a decision about going one direction. We've looked at the dataset and said, "You can go, but it doesn't make sense. You're not gonna make as much money as you think you're going to make, because here's what the data says historically, and here's what the data says about how we did pre-Covid." So being able to have that type of voice in our organization is really impactful. Then you can say, "Okay, we are data-driven, because when we're wrong, we're equipped to admit that we're wrong. Okay, we'll try something else. But when we're right, we go all in on it."”
Mark: “Wow. And I imagine it probably takes a substantial amount of trust to be built over the years to have that. Because many times you'll have leaders who are like, "Well make the data work to align with my decision." Or I'm very anchored on this kind of sunk cost fallacy; even though you're giving me this data, I'm still moving ahead.”
Jhakir: “Absolutely. And that's where, like, I started with Rocket Companies, which used to be Quicken Loans, Rocket Mortgage; all of that rebranding happened and we became part of a larger whole. But I've been with Rocket Companies for about three years now in the mortgage industry. So the culture that's embedded here is that we are gonna automate what we need to automate because we want our people to focus on the most important and critical pieces. So if the data can support that 100%, go behind it. And when I started three years ago, I was a little taken aback by, for lack of a better word, the prestige that the data team has, right? Generally, it's always like, "All right, we'll make a decision and we'll look at the report after the fact," right?
It's always a retroactive look, and that's when data people come in. But even now I thought, okay, wow, this is great. We've made even more strides since I joined. Now data people are at the forefront of the decision-making. With any migration that happens, any technology upgrade, it's: what do the data people say? We need to make sure the data people are at the table so that we gain insights, as opposed to losing anything, when these migrations happen.”
What advice can you give other data leaders to help them better navigate the data challenges caused by black swan events?
Jhakir: “So I think we can think of black swan events as once-in-a-lifetime, never-happen things of that nature. But if you've been in the data space long enough, you are preconditioned to deal with things breaking out of the norm, out of nowhere, right? That's the nature of our work: it's just unbelievable how the smallest little thing can have the biggest effect and trickle down everywhere.
So we're very much used to these types of black swan events. Even though we may not categorize them as black swan events, we're very used to the type of reactive approaches that we have to take in order to ensure sustainability, reliability, and all of those components of our infrastructure. And it happens to us much more frequently than we'd like to admit. Our things break.
So the biggest advice is twofold. Go deeper into your tech stack, and when I say go deeper into it, I mean that when you're building a robust system, there are things you take into consideration. Failures will happen, things will break. Will it scale? Do you have the resources you need to train up and down your team members, right? Do you have the resources to expand your environment and infrastructure as needed? So go deeper into that and make sure it is sustainable. And when you build a system that is designed to do that, you already account for any type of black swans or production failures, whatever you wanna call them.
The second component is building that trust with your business partners. I've been fortunate enough to work with business partners where if I come to 'em and say, "Hey, listen, something broke. I can't deliver this today," they say, "Okay, don't worry about it. Can we get it next week?" If you don't have that relationship with them, they lose that trust, they lose that need to come to you. If they lose the need to come to data people, then you become disconnected from the people who are actually doing the work of driving the business, and that's when conflicts start to happen. That's when you really start to question whether you are a data-driven organization. So building that relationship and having transparent communication with your business partners is gonna be the second component that's gonna save you no matter what events happen.
Because as an organization, a black swan doesn't happen just to you as a data leader. It's gonna happen to the entire organization. And if you have those relationships, those strong, meaningful relationships, and you have that trust with them, you can say, "Listen, we're all in this together. I just happen to oversee the data components. You oversee sales or marketing, we're all gonna be impacted. How do we work together so that we as an organization can move forward?"
So, own your stuff. Make sure it's built to be resilient enough, and that goes into, like you said, what we already do as tech people. Lean into it and make sure you truly believe that it can scale, that when chaos happens it won't break and it will go as supported. And have the trust and communication with your business partners to be able to communicate, for lack of a better word, when shit hits the fan. That's when you really need to come down and have a kumbaya and say, "Okay, how do I make sure we are still sustaining and growing and moving to the future?"”
Person Profile:
Jhakir Miah is the Director of Engineering at Amrock. Feel free to connect with him on LinkedIn to learn more about his work.
What are others saying in the DataOps space?
Evolution of Data Platform at GoDaddy
What: Technical blog that covers the journey towards building a modern, low-cost cloud data platform that prioritizes scalability, reliability, cost-effectiveness, security, and governance. It includes the early days of data at GoDaddy and the best practices for establishing a well-defined data strategy.
Why: This blog provides valuable insights and guidance for organizations embarking on a similar journey to build a successful cloud data platform. By sharing the lessons learned and experiences from the journey at GoDaddy, readers can gain a deeper understanding of the key considerations and best practices for building a successful cloud data platform.
Who: This blog is a must-read for those who want to gain a deeper understanding of the key considerations and best practices for building a successful cloud data platform and learn from the lessons and experiences of GoDaddy.
The $500mm+ Debacle at Zillow Offers – What Went Wrong with the AI Models?
What: Zillow, an online real estate marketplace, closed down Zillow Offers due to inaccurate property valuations resulting in a $500 million reduction in Q3 and Q4's estimated value.
Why: Zillow's algorithms overestimated home values and didn't adjust when the housing market cooled down. The issue was caused by "concept drift" in machine learning models, which assume the past equals the future, and did not account for rapidly shifting values or market shocks.
Who: Technical leaders in the data industry should consider leveraging better tools to monitor and maintain AI models' quality, including measuring model accuracy, outputs, and inputs to detect potential model issues.
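To make the monitoring idea concrete, here is a minimal sketch of one way to watch model accuracy for drift: compare the model's recent pricing error against a baseline error and flag it for review when the gap grows. The function names, the 1.5x tolerance, and all the prices below are hypothetical illustrations, not Zillow's method or figures.

```python
from statistics import mean


def mape(predicted, actual):
    """Mean absolute percentage error between predictions and realized values."""
    return mean(abs(p - a) / abs(a) for p, a in zip(predicted, actual))


def drift_alert(baseline_error, recent_error, tolerance=1.5):
    """Return True when recent error exceeds baseline error by the tolerance factor.

    The tolerance of 1.5 is an assumed threshold; a real team would tune it.
    """
    return recent_error > baseline_error * tolerance


# Hypothetical home prices: model estimates vs. eventual sale prices.
baseline_pred = [300_000, 410_000, 255_000]
baseline_sale = [310_000, 400_000, 250_000]
recent_pred = [330_000, 450_000, 280_000]
recent_sale = [290_000, 395_000, 240_000]

baseline_error = mape(baseline_pred, baseline_sale)
recent_error = mape(recent_pred, recent_sale)

# In this made-up data the model has started overestimating, so it gets flagged.
needs_review = drift_alert(baseline_error, recent_error)
```

A check like this only catches drift after sale prices are known, so it would sit alongside input and output monitoring rather than replace it.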
Data Mesh And Strategy Tech Stack Alignment
What: A discussion on the challenges businesses face when implementing macro technology solutions, particularly in the data platform space.
Why: The article provides insight into the need for best-in-class ecosystems and the importance of capturing business context when building data platforms.
Who: You are responsible for implementing data solutions for the business and are interested in learning about the challenges associated with building technology platforms such as data mesh.
As for your question about "what do you think about being data driven," I think companies aspire to be data driven but they should aim to be DECISION driven instead. I think the difference is:
Decision-Driven teams search for questions worth asking. They don’t settle for the ones on hand.
Decision-Driven teams look wide first, then dive deep.
Decision-Driven teams are led by data humanists, not data scientists.
Decision-Driven teams are data dogs: they figure out what’s missing and go get it.
Decision-Driven teams explore the unknown, not only the known.
I wrote about this in the Data Leadership Collaborative, one of my other favorite sources of good stuff, like Scaling DataOps, about data leadership: https://www.dataleadershipcollaborative.com/data-culture/5-tips-make-data-literacy-program-stick-your-organization