How do you modernize your data stack while keeping the lights on?

Tarush Aggarwal
3 min read · Jul 26, 2021

In the next 10 years, every company will have a data team and use a modern stack to collect, analyze, and report on data. While we are still in the early adoption phase, many businesses have already started to invest in their data capabilities.

For many of these businesses, moving to a modern data stack is more complicated because they already have some degree of inertia, be it with people, tooling, process, reporting, or, most likely, a combination of all four. As many of you are aware, it’s far easier to build something from scratch than to migrate from one service to another.

This problem is easier in the software engineering world, where companies regularly build on top of or deprecate existing services. Software engineering is by nature an iterative process. Data, on the other hand, requires business continuity. Data is the bridge between different functional areas of the business, and data teams need to constantly connect the dots. Changing the underlying infrastructure on the data side while still trying to connect the dots makes everything far more complicated.

WeWork was using a similar data stack when we supported a 500-person organization as when we supported a 15,000-person organization. The few changes we made to the stack as we scaled were non-trivial, with migration plans of up to a year. The cost of these migrations is very high, and it only gets more expensive as the company grows. This means that if you need to modernize your infrastructure, the best time to do it is now.

The challenge with migrating to a new stack lies in maintaining business continuity. We don’t have the luxury of taking six months to re-architect everything. The business expects the data team to keep the lights on and continue to support analysis during the migration.

This means that we have to set up our new stack in parallel with the legacy stack. Teams continue to work on the legacy stack, and it’s business as usual. You will want to set up a small task force that is responsible for migrating data to the new stack piece by piece. When thinking through these mini migrations, it’s important to look through the lens of the business. Which use cases do we need to bucket together during each phase of the migration? Instead of thinking through it from a team’s perspective, think through it from a business use case perspective.

If you are still using the same BI tool across the company, some dashboards will be powered by the new stack and some will remain on the legacy stack. If you’re adopting a new BI tool, it can be more confusing, because some use cases will live on a different platform. Hence it’s important to think through each use case end to end and migrate everything it depends on together. With this approach, if you have distributed data teams, different teams will be migrated to the new stack at different times.
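One way to keep track of which dashboards sit on which stack is a small inventory keyed by business use case. The following is a minimal sketch, not a prescribed tool: the `UseCase` class, the `migrate` helper, and the example use case, table, and dashboard names are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Stack(Enum):
    LEGACY = "legacy"
    MODERN = "modern"


@dataclass
class UseCase:
    """A business use case and every asset it needs end to end (hypothetical model)."""
    name: str
    tables: list[str]
    dashboards: list[str]
    stack: Stack = Stack.LEGACY


def migrate(use_case: UseCase) -> None:
    """Flip the whole use case at once so its tables and dashboards never split across stacks."""
    use_case.stack = Stack.MODERN


def dashboard_inventory(use_cases: list[UseCase]) -> dict[str, str]:
    """Map each dashboard to the stack that currently powers it."""
    return {d: uc.stack.value for uc in use_cases for d in uc.dashboards}


# Illustrative data only; these names are assumptions, not from a real deployment.
use_cases = [
    UseCase("revenue_reporting", ["orders", "invoices"], ["Monthly Revenue"]),
    UseCase("member_churn", ["members", "cancellations"],
            ["Churn Overview", "Retention Cohorts"]),
]

migrate(use_cases[0])  # phase 1 moves revenue reporting only
inventory = dashboard_inventory(use_cases)
print(inventory)
```

Because the unit of migration is the use case rather than the team or the individual table, a dashboard never ends up reading half of its data from each stack.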

The length of the migration is often not important. It makes more sense to take your time to validate each migration and resolve data quality issues before moving on to the next use case. The only consideration here is the cost of running both stacks in parallel. That cost can be managed too, because many of these tools and vendors charge based on compute or volume, so a more gradual onboarding means you start with a smaller billing footprint.
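Validating a migrated use case can be as simple as computing the same metric on both stacks and comparing the results before switching anyone over. The sketch below assumes each stack's output has already been pulled into a dictionary keyed by period; the `compare_metric` function, the 0.1% tolerance, and the sample figures are hypothetical.

```python
import math


def compare_metric(legacy: dict[str, float], modern: dict[str, float],
                   rel_tol: float = 0.001) -> list[str]:
    """Return a list of discrepancies between the two stacks for one metric."""
    issues = []
    for key in sorted(legacy.keys() | modern.keys()):
        if key not in modern:
            issues.append(f"{key}: missing from new stack")
        elif key not in legacy:
            issues.append(f"{key}: missing from legacy stack")
        elif not math.isclose(legacy[key], modern[key], rel_tol=rel_tol):
            issues.append(f"{key}: legacy={legacy[key]} new={modern[key]}")
    return issues


# Made-up monthly revenue figures, for illustration only.
legacy = {"2021-05": 120000.0, "2021-06": 131500.0, "2021-07": 128250.0}
modern = {"2021-05": 120000.0, "2021-06": 131900.0}

issues = compare_metric(legacy, modern)
for issue in issues:
    print(issue)
```

An empty list is the signal to move on to the next use case. Anything else is a data quality issue to resolve while the legacy stack is still serving the business.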

Ultimately, the benefits of moving to a modern data stack will outweigh the inconveniences of a migration. Broadly, these benefits fall into three buckets:

a) Time efficiencies (being able to do the same thing with less effort)

b) Cost (pay less for more)

c) Feature set efficiencies (security, features, reliability, scale)

Migrating to a modern data stack also presents an opportunity to restructure your data teams. In general, the hybrid approach of having a centralized core data team along with distributed teams embedded in different products has many advantages. More about this in a future article.

I would love to hear from you in the comments below about your experience migrating to the modern data stack. Please feel free to reach out if this is something we can support you in.

Tarush Aggarwal

Passionate about digitalizing tech-enabled companies. Founder @5x. Previously @WeWork, @Salesforce.