TL;DR: Lakebridge is a free, open-source toolkit from Databricks Labs that handles the three hardest parts of a Databricks migration: figuring out how big the job really is, translating all that legacy SQL, and proving the numbers tie out afterwards. If you’re planning a migration, spend a day with it before you write the proposal.
## The part of a migration nobody budgets for
Here’s a scene you’ve probably lived through.
A client wants to move off Teradata, Oracle, Synapse, or Snowflake and onto Databricks. The business case is solid. Everyone’s excited. Then someone opens the source system and finds 80,000 lines of legacy SQL, orchestration jobs nobody documented, stored procedures written by a guy who left in 2017, and a quiet, creeping fear that when you finally cut over, the numbers won’t tie out.
The Databricks part is the easy bit. It’s everything around Databricks that eats the timeline.
That’s the part clients don’t budget for properly. And it’s exactly what Lakebridge is built to solve.
## What Lakebridge actually does
Lakebridge is a Databricks Labs toolkit. Free. Open source. Backed by the Labs team.
It covers the three phases where migrations actually fail:
| Phase | What it does | Why it matters |
|---|---|---|
| Assessment | Profiles your existing warehouse and analyzes the SQL | Turns guesswork into a real scope |
| Conversion | Three different transpilers to translate your SQL | Gets you out of “someone has to rewrite all this” hell |
| Reconciliation | Compares source vs. target data after migration | Proves the numbers tie out before go-live |
If you’re planning, scoping, or currently in the middle of a Databricks migration, this is a tool you should have opened a tab on already.

## Phase 1: Assessment - “how bad is it, really?”
Before you quote a client or commit to a timeline, you need two numbers:
- The TCO savings you’ll get by moving
- The effort it’ll take to get there
Most teams guess. Lakebridge ships two tools to stop you guessing.
### The Profiler
Connects to your existing warehouse and sizes up the workload. Table volumes, query complexity, features in use. It gives you a defensible number for “how much smaller will this be after we migrate?”
### The Analyzer
Scans the actual SQL and orchestration code. Flags the constructs that will bite you during translation: recursive CTEs, vendor-specific functions, the weird procedural bits.
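To make the idea concrete, here is a toy version of that kind of scan. This is NOT Lakebridge's Analyzer or its rule set; the pattern names and regexes below are made up for illustration, but they show why flagging risky constructs up front is cheap and valuable.

```python
import re

# Hypothetical, simplified rule set -- NOT Lakebridge's actual rules,
# just an illustration of what "flag the constructs that bite" means.
RISKY_PATTERNS = {
    "recursive CTE": r"\bWITH\s+RECURSIVE\b",
    "Teradata QUALIFY": r"\bQUALIFY\b",
    "Oracle CONNECT BY": r"\bCONNECT\s+BY\b",
}

def flag_constructs(sql: str) -> list[str]:
    """Return the names of risky constructs found in a SQL script."""
    upper = sql.upper()
    return [name for name, pattern in RISKY_PATTERNS.items()
            if re.search(pattern, upper)]

sample = """
WITH RECURSIVE org(id, parent) AS (SELECT 1, NULL)
SELECT * FROM org QUALIFY ROW_NUMBER() OVER (ORDER BY id) = 1
"""
print(flag_constructs(sample))  # ['recursive CTE', 'Teradata QUALIFY']
```

Run something like this across 80,000 lines of legacy SQL and you get a per-script risk profile instead of a vague sense of dread.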
**Why this matters:** This is the phase consultancies underinvest in the most. Running the Lakebridge assessment up front changes the conversation with stakeholders from “trust us” to “here are the numbers.” That’s a very different meeting.
## Phase 2: Conversion - “who’s going to rewrite all this?”
This is where Lakebridge gets genuinely interesting. It ships with three transpilers under one roof, and they’re not redundant. They’re complementary.
### BladeBridge
The mature, battle-tested option. Broad dialect coverage and some ETL handling built in. This is what you reach for on a Teradata or Netezza job where you need predictable results.
### Morpheus
Narrower dialect support today, but it has experimental dbt support. That’s a big deal if your client already lives in the dbt ecosystem.
### Switch
Converts SQL and other sources directly into Databricks notebooks using large language models.
This is the one to watch. For the long tail of weird, non-standard procedural SQL that rule-based transpilers choke on, a well-prompted LLM is often the pragmatic answer. Having that packaged as a first-class option inside an official Databricks Labs tool, rather than as a bespoke internal hack, is a genuine shift in how these migrations can be done.
## Phase 3: Reconciliation - “do the numbers match?”
This is the phase that keeps data leads up at night.
It’s also the one clients quietly assume “will just work.” It doesn’t.
Here’s what actually happens during cutover:
- Source and target systems are both live
- Row counts drift between them
- Nulls get coerced differently in the two engines
- Timezone handling diverges silently
- Finance asks why the monthly revenue number moved by 0.3%
Nobody has a good answer. Everybody panics.
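Two of those failure modes are easy to demonstrate in a few lines. The numbers below are invented for illustration, but the mechanics are exactly what bites during cutover: the same event lands in different months depending on timezone normalization, and the same column averages differently depending on how NULLs are handled.

```python
from datetime import datetime, timedelta, timezone

# Timezone divergence: an event at 23:30 local time (UTC-5) on Jan 31
# stays in January for an engine that keeps local time, but moves to
# February for an engine that normalizes to UTC.
event = datetime(2024, 1, 31, 23, 30, tzinfo=timezone(timedelta(hours=-5)))
month_source = event.month                           # local-time engine: 1
month_target = event.astimezone(timezone.utc).month  # UTC engine: 2

# NULL coercion: one engine skips NULLs in aggregates, a naive port
# coerces them to zero. SUM agrees; AVG quietly diverges.
values = [100.0, None, 300.0]
avg_skip_nulls = sum(v for v in values if v is not None) / 2    # 200.0
avg_coerced = sum(v or 0.0 for v in values) / len(values)       # ~133.3

print(month_source, month_target, avg_skip_nulls, round(avg_coerced, 1))
```

Neither discrepancy throws an error. Both quietly move a monthly revenue number, which is exactly how you end up with that 0.3% question from Finance.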
Lakebridge’s Reconciler is purpose-built for this moment. It compares source and target datasets and gives you a defensible answer to the “do the numbers tie out?” question.
On regulated or commercially sensitive workloads, that’s not a nice-to-have. It’s the difference between a successful cutover and an embarrassing rollback.
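The core idea is simple enough to sketch. This is a toy version of dataset reconciliation, not the Lakebridge Reconciler's implementation: compare row counts, then compare per-row hashes so any drift (including the NULL coercion above) surfaces before a table is declared done.

```python
import hashlib

def row_hash(row: tuple) -> str:
    # Normalize each value to text; NULLs get a sentinel so None != "".
    parts = ["\\N" if v is None else str(v) for v in row]
    return hashlib.sha256("|".join(parts).encode()).hexdigest()

def reconcile(source_rows: list, target_rows: list) -> dict:
    """Toy reconciliation: counts plus set-of-hashes comparison."""
    src = {row_hash(r) for r in source_rows}
    tgt = {row_hash(r) for r in target_rows}
    return {
        "source_count": len(source_rows),
        "target_count": len(target_rows),
        "missing_in_target": len(src - tgt),
        "unexpected_in_target": len(tgt - src),
        "match": src == tgt and len(source_rows) == len(target_rows),
    }

source = [(1, "EMEA", 120.50), (2, "APAC", None)]
target = [(1, "EMEA", 120.50), (2, "APAC", 0.0)]  # NULL silently became 0
print(reconcile(source, target))  # match: False, one row off on each side
```

The real tool does this at warehouse scale with proper sampling and type handling, but the principle is the same: you want a report, not a feeling.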
## What it supports (and what it doesn’t)
The supported-sources matrix is worth looking at yourself before you scope anything. Coverage is broad, but uneven across the three phases.
Here’s the quick read:
**Full-path coverage (assessment + conversion + reconciliation)**
- Synapse
- Oracle
- Snowflake
- MSSQL
If you’re migrating from one of these, Lakebridge has you covered end to end.
**Assessment + conversion only (bring your own reconciliation)**
- Teradata
- Netezza
- Redshift
- PostgreSQL
**Analysis only (you can scope it, but conversion is still manual)**
- SSIS, DataStage, SAS, Alteryx
- ADF, Oozie (orchestration)
**Orchestration conversion**
- Airflow
That pattern tells you something useful: Lakebridge is honest about where it’s mature and where it’s still filling in. Treat it as a very strong starting point, not a press-the-button-and-go magic wand.
## How we would use it on a real engagement
### Discovery week (days 1 to 5)
Run the Profiler and Analyzer against the source environment. Use the output to produce:
- A real TCO model
- A complexity-weighted migration backlog
This replaces the hand-waved “it’ll take 3 to 6 months” estimate with something you can actually defend in a steering committee.
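A complexity-weighted backlog can be as simple as the arithmetic below. The buckets, counts, and hours-per-object here are invented placeholders; in practice you would feed in the Analyzer's output and your pilot-measured fix-up times.

```python
# Hypothetical effort model -- the weights and counts are illustrative,
# not from any real engagement. Replace with Analyzer output.
HOURS_PER_OBJECT = {"simple": 0.5, "medium": 2.0, "complex": 8.0}

inventory = {      # object counts per complexity bucket
    "simple": 1200,
    "medium": 340,
    "complex": 45,
}

effort_hours = sum(HOURS_PER_OBJECT[bucket] * count
                   for bucket, count in inventory.items())
print(f"{effort_hours:.0f} engineer-hours")  # 1640 engineer-hours
```

The point isn't the spreadsheet math; it's that every number in it is traceable to a measured input, which is what makes the estimate defensible in front of a steering committee.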
### Pilot conversion (weeks 2 to 4)
Pick a representative slice of SQL. Ideally one easy, one medium, one hairy. Run it through BladeBridge and Switch side by side.
You learn two things fast:
- What percentage of your codebase transpiles cleanly
- What the human effort per job looks like for the bits that don’t
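Those two numbers fall straight out of the pilot results. The figures below are hypothetical; substitute whatever your transpiler runs actually produce.

```python
# Hypothetical pilot results -- replace with your own transpiler output.
pilot = {
    "jobs": 30,
    "clean": 22,  # jobs that transpiled with no manual fixes
    "manual_hours": [1.5, 3.0, 0.5, 6.0, 2.0, 4.0, 1.0, 2.5],
}

clean_pct = 100 * pilot["clean"] / pilot["jobs"]
avg_fixup = sum(pilot["manual_hours"]) / len(pilot["manual_hours"])
print(f"{clean_pct:.0f}% clean, {avg_fixup:.1f}h average fix-up per failed job")
```

Multiply the average fix-up time by the number of remaining jobs in each complexity bucket and you have a conversion estimate grounded in evidence rather than optimism.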
### Cutover and reconciliation (ongoing)
As workloads land in Databricks, wire the Reconciler in as a gated step before anything is declared “migrated.” Nothing gets marked done until the numbers tie out.
This is the discipline that separates clean migrations from the ones that quietly rot for six months before someone notices the dashboards are wrong.
## The honest takeaway
Lakebridge isn’t going to make a migration trivial. Nothing will.
What it will do is remove the three most expensive sources of ambiguity in a migration engagement:
- Scoping guesswork
- Translation toil
- Post-cutover uncertainty
Those three things are where margin goes to die on fixed-price projects, and where client trust gets damaged on T&M ones.
If you’re a data lead staring down a Databricks migration, or a consultancy scoping one, the honest recommendation is this:
Spend a day with Lakebridge before you write the proposal.
The assessment output alone will sharpen your numbers, and the reconciler will save you a cutover weekend or two down the line.
It’s free. It’s from Databricks Labs. There’s genuinely no reason not to.
## Useful links
- Lakebridge docs: databrickslabs.github.io/lakebridge
- GitHub: github.com/databrickslabs/lakebridge
Planning a migration to Databricks and want a second set of eyes on the scope? Book a 30-minute architecture call.
