DBT: How to clean up removed models from your production schema?


DBT (Data Build Tool) is an open-source tool for transforming data inside your warehouse. As your data model evolves, you may need to remove some models that are no longer in use. DBT does not drop the tables and views that those models created, so they linger in your production schema until you remove them yourself. Cleaning them up keeps the schema organized and maintainable. In this article, we will cover the steps for cleaning up removed models from your production schema using DBT.

Step 1: Remove the model from your DBT project

The first step in cleaning up a removed model is to remove it from your DBT project. Delete the model's .sql file and any macros that only that model used. Then remove every remaining reference to the model: ref() calls in downstream models, its entry in any schema .yml file (descriptions and tests), and any model-specific configuration under the models or snapshots blocks in your dbt_project.yml file.
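Leftover references are easy to miss in a large project, so it can help to search for them before moving on. The following is a minimal sketch, not part of DBT: find_lingering_refs is a hypothetical helper that scans .sql and .yml files for ref() calls that still name the removed model, and it assumes the default target and dbt_packages directory names when deciding what to skip.

```python
import re
from pathlib import Path

# Assumption: default dbt output/package directories; adjust if your project renames them.
SKIPPED_DIRS = {"target", "dbt_packages"}


def find_lingering_refs(project_dir, model_name):
    """Return (file, line number, line) for every ref() to model_name left in the project."""
    # Matches ref('model_name') and ref("model_name"), including the
    # two-argument form ref('package', 'model_name').
    pattern = re.compile(
        r"ref\(\s*(?:['\"][^'\"]+['\"]\s*,\s*)?['\"]" + re.escape(model_name) + r"['\"]"
    )
    hits = []
    for path in sorted(Path(project_dir).rglob("*")):
        if not path.is_file() or path.suffix not in {".sql", ".yml", ".yaml"}:
            continue
        # Skip compiled output and installed packages.
        if path.relative_to(project_dir).parts[0] in SKIPPED_DIRS:
            continue
        lines = path.read_text(encoding="utf-8").splitlines()
        for number, line in enumerate(lines, start=1):
            if pattern.search(line):
                hits.append((str(path), number, line.strip()))
    return hits
```

Calling find_lingering_refs(".", "old_model") from the project root returns an empty list once every reference is gone. The match is on the exact model name, so a model called old_model_v2 is not reported when you search for old_model.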

Step 2: Drop the model from your database

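Once you have removed the model from your DBT project, you need to drop it from your database. DBT has no built-in command for this, so run the DROP statement yourself in your SQL client. Use DROP VIEW instead if the model was materialized as a view:

DROP TABLE IF EXISTS <schema_name>.<model_name>;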
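When several models are removed at once, it can be safer to generate the DROP statements and review them before executing anything. The following is a minimal sketch under stated assumptions: build_drop_statement is a hypothetical helper, not part of DBT, and the double-quoted identifiers follow Postgres and Redshift conventions, so other warehouses may need different quoting.

```python
def build_drop_statement(schema, name, relation_type):
    """Build a DROP statement for one relation; relation_type is 'table' or 'view'."""
    kind = relation_type.lower()
    if kind not in {"table", "view"}:
        raise ValueError(f"unsupported relation type: {relation_type!r}")

    def quote(identifier):
        # Standard SQL identifier quoting: wrap in double quotes and
        # double any embedded double quote.
        return '"' + identifier.replace('"', '""') + '"'

    return f"DROP {kind.upper()} IF EXISTS {quote(schema)}.{quote(name)};"


if __name__ == "__main__":
    # Hypothetical schema and model names, printed for review rather than executed.
    for schema, name, kind in [("analytics", "old_orders", "table"),
                               ("analytics", "old_orders_summary", "view")]:
        print(build_drop_statement(schema, name, kind))
```

Printing the statements instead of running them keeps a human in the loop, which matters because a DROP in production cannot be undone without a backup.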

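Step 3: Run a full rebuild

After you have dropped the model from your database, rebuild the project to confirm that nothing still depends on the removed model. This run does not drop anything: DBT only creates or replaces the relations for models that still exist in the project. If a remaining model still references the removed one, the run fails with a compilation error that names the missing model. Note that the --full-refresh flag also rebuilds incremental models from scratch, which can be slow on large tables. To do this, run the following command in your terminal: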

dbt run --full-refresh

Step 4: Validate the changes

Finally, validate the changes by checking your database to ensure that the model and all related tables and views have been dropped. You can do this by querying the database or using a database management tool to view the schema.
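One way to make this check repeatable is to compare what the database contains with what the project expects to exist. The following is a minimal sketch, not an official DBT feature: expected_relations and find_orphans are hypothetical helpers, and the field names (nodes, resource_type, schema, name, alias, config.materialized) reflect the manifest.json layout as I understand it, so verify them against the artifact your DBT version writes to the target directory.

```python
def expected_relations(manifest):
    """Collect the (schema, relation name) pairs that the project should create."""
    expected = set()
    for node in manifest.get("nodes", {}).values():
        # Only models, seeds, and snapshots are built as tables or views.
        if node.get("resource_type") not in {"model", "seed", "snapshot"}:
            continue
        # Ephemeral models are inlined into other queries and create nothing.
        if node.get("config", {}).get("materialized") == "ephemeral":
            continue
        relation_name = node.get("alias") or node["name"]
        expected.add((node["schema"].lower(), relation_name.lower()))
    return expected


def find_orphans(existing, expected):
    """Return relations present in the database but absent from the project.

    `existing` should list only the schemas that DBT manages; relations
    in any other schema would otherwise be reported as orphans.
    """
    found = {(schema.lower(), name.lower()) for schema, name in existing}
    return sorted(found - expected)
```

In practice, the manifest would be loaded with json.load from target/manifest.json after a dbt compile, and existing would come from a query against your database's catalog. An empty result from find_orphans means the production schema matches the project.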

In conclusion, cleaning up removed models from your production schema is an important part of maintaining a well-organized data model. By following these steps, you can ensure that unused models are removed from your database, which frees up storage and keeps the schema easy to navigate.

Other options

Periodically Drop and Rebuild the Entire Schema

This option involves dropping and rebuilding the entire schema on a periodic basis, which removes any unused objects along the way. DBT is designed with the assumption that everything can be rebuilt at any given time, making this a simple solution for removing objects. However, the schema is incomplete until the rebuild finishes, so expect some downtime for anything that queries it, and any object in the schema that DBT did not create is lost. Use this approach only if you are comfortable with those risks.

Query the Information Schema to Find Extra Objects in Prod

This option involves using a query to find extra objects in the prod schema. The query can be stored in the project's analysis directory, compiled with dbt compile, and then executed against the database. It identifies objects such as tables, views, and functions that exist in the prod schema but do not exist in the related dev schema. It is important to note that this approach assumes that the dev schema is routinely dropped and rebuilt, so it does not contain any extra objects of its own. This query has been tested on both Redshift and Postgres databases.
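To illustrate the idea, here is a minimal sketch of such a comparison. It is not the query referred to above and has not been tested on either database: it assumes the standard information_schema.tables view, covers tables and views only (functions would need a similar query against information_schema.routines), and render_orphan_query is a hypothetical helper.

```python
import re

# Lists relations that exist in the prod schema but have no namesake in dev.
ORPHAN_QUERY_TEMPLATE = """
select p.table_schema, p.table_name, p.table_type
from information_schema.tables as p
where p.table_schema = '{prod_schema}'
  and not exists (
      select 1
      from information_schema.tables as d
      where d.table_schema = '{dev_schema}'
        and d.table_name = p.table_name
  )
order by p.table_name
"""


def render_orphan_query(prod_schema, dev_schema):
    """Fill in the schema names, rejecting anything that is not a plain identifier."""
    for schema in (prod_schema, dev_schema):
        # The names are interpolated into SQL text, so allow only simple identifiers.
        if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", schema):
            raise ValueError(f"unexpected schema name: {schema!r}")
    return ORPHAN_QUERY_TEMPLATE.format(
        prod_schema=prod_schema, dev_schema=dev_schema
    ).strip()


if __name__ == "__main__":
    # Hypothetical schema names; replace them with your own.
    print(render_orphan_query("prod", "dev"))
```

Every row the query returns is a candidate for removal, not a confirmed orphan, so review the list before dropping anything.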

Author: user
