Airflow DAGs not refreshing or updating: how to refresh them manually


Refreshing Airflow DAGs Manually

Solution

After creating or editing a DAG (Directed Acyclic Graph) in Airflow, you expect the change to show up in the Airflow UI promptly. However, even after reloading the UI, the latest changes or newly added DAGs may not be visible right away. Airflow typically re-parses DAG files about every 30 seconds, so updates should normally appear within that interval. (In Airflow 2.x this matches the default of the min_file_process_interval setting; brand-new files are discovered on a separate, longer interval, dag_dir_list_interval, which defaults to 300 seconds.)

When a DAG still does not refresh, there are two common causes. The first is an error in the DAG's Python code, which prevents the file from being imported. The second is a duplicate DAG ID: two files define the same DAG name, which easily happens when you copy an existing DAG file to create a new one and forget to rename the DAG.
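
Both causes can be checked before Airflow ever parses the files. The following is a minimal sketch, not part of Airflow itself: the function name scan_dag_folder is hypothetical, it uses only the Python standard library, and it assumes DAGs are declared with a DAG(...) call whose ID is a literal string. It only catches syntax errors (not failing imports or runtime exceptions) and will miss IDs that are built dynamically or declared with the @dag decorator.

```python
import ast
from collections import defaultdict
from pathlib import Path


def scan_dag_folder(dag_folder):
    """Return (syntax_errors, duplicate_ids) for the .py files under dag_folder."""
    syntax_errors = {}                # file path -> short error message
    ids_to_files = defaultdict(list)  # dag_id -> files that declare it

    for path in sorted(Path(dag_folder).rglob("*.py")):
        try:
            tree = ast.parse(path.read_text(encoding="utf-8"), filename=str(path))
        except SyntaxError as exc:
            # A file that does not even parse can never be loaded as a DAG.
            syntax_errors[str(path)] = f"line {exc.lineno}: {exc.msg}"
            continue

        for node in ast.walk(tree):
            # Look for calls such as DAG("my_dag", ...) or DAG(dag_id="my_dag", ...).
            if not isinstance(node, ast.Call):
                continue
            func = node.func
            name = func.id if isinstance(func, ast.Name) else getattr(func, "attr", None)
            if name != "DAG":
                continue

            dag_id = None
            for keyword in node.keywords:
                if keyword.arg == "dag_id":
                    dag_id = keyword.value
            if dag_id is None and node.args:
                dag_id = node.args[0]  # first positional argument is the dag_id

            # Only literal string IDs can be compared statically.
            if isinstance(dag_id, ast.Constant) and isinstance(dag_id.value, str):
                ids_to_files[dag_id.value].append(str(path))

    duplicates = {d: files for d, files in ids_to_files.items() if len(files) > 1}
    return syntax_errors, duplicates
```

Calling scan_dag_folder on your DAG folder returns two dictionaries: files that fail to parse, and DAG IDs that appear in more than one place. An empty result does not prove the DAGs are healthy, since the check is purely static.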

To check and refresh the DAGs manually, log on to the server where Airflow is installed (if it runs in a virtual environment, use that environment's path) and run the following commands.

source workspace/virtualenv/bin/activate
export AIRFLOW_HOME
export AIRFLOW_HOME=~/workspace/airflow
cd $AIRFLOW_HOME
airflow dags list

The last command lists all the DAGs that Airflow finds in that environment.

If you also want to see the list of errors, append -v to the last command above:

airflow dags list -v
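
The same information can also be read from Python. The sketch below assumes Airflow 2.x, where the DagBag class in airflow.models accepts dag_folder and include_examples and exposes an import_errors dictionary mapping file paths to error text. The helper names load_import_errors and format_import_errors are hypothetical, and the Airflow import is done lazily so the formatting helper works on its own.

```python
def format_import_errors(import_errors):
    """Turn a {file path: traceback text} mapping into one summary line per file."""
    lines = []
    for path, trace in sorted(import_errors.items()):
        # Keep the last non-empty line of the traceback, usually the exception itself.
        stripped = [ln.strip() for ln in str(trace).splitlines() if ln.strip()]
        last = stripped[-1] if stripped else "(no details)"
        lines.append(f"{path}: {last}")
    return lines


def load_import_errors(dag_folder):
    """Parse dag_folder with Airflow's DagBag and return its import errors."""
    # Imported here so that format_import_errors can be used without Airflow installed.
    from airflow.models import DagBag

    bag = DagBag(dag_folder=dag_folder, include_examples=False)
    return bag.import_errors


# Example (run inside the activated virtual environment, with AIRFLOW_HOME set;
# the path below is a placeholder, not a real location):
#   for line in format_import_errors(load_import_errors("/path/to/dags")):
#       print(line)
```

This only reports files that Airflow failed to import; a DAG that imports cleanly but is hidden for another reason (for example a duplicate ID) will not show up here.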


Explanation

  1. source workspace/virtualenv/bin/activate: Activates the virtual environment located at workspace/virtualenv (the activate script lives in its bin/ directory), so that your shell uses the Airflow installation and its dependencies from this isolated environment.
  2. export AIRFLOW_HOME: Marks AIRFLOW_HOME for export to child processes without assigning it a value. It is redundant here, because the next command both sets and exports the variable, so this line can be omitted.
  3. export AIRFLOW_HOME=~/workspace/airflow: Sets AIRFLOW_HOME to ~/workspace/airflow, the root directory for Airflow's configuration. By default, Airflow looks for DAG files in the dags subfolder of this directory.
  4. cd $AIRFLOW_HOME: Changes the current directory to the one specified by AIRFLOW_HOME, i.e., ~/workspace/airflow.
  5. airflow dags list: Lists all the DAGs recognized by Airflow in the current environment, so you can see which workflows are available to be scheduled or are already being managed by Airflow.
  6. airflow dags list -v: Runs the same listing with verbose output, which also shows the errors raised while loading the DAG files.
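
If you run these steps often, they can be wrapped in a small script. The following is a rough sketch under stated assumptions: the function names build_env, build_command and list_dags are hypothetical, the default path is the one used in this post, and the airflow executable is assumed to be on PATH (for example because the virtual environment is already activated).

```python
import os
import subprocess


def build_env(airflow_home, base_env=None):
    """Copy the environment and set AIRFLOW_HOME, like `export AIRFLOW_HOME=...`."""
    env = dict(os.environ if base_env is None else base_env)
    env["AIRFLOW_HOME"] = os.path.expanduser(airflow_home)
    return env


def build_command(verbose=False):
    """Build the `airflow dags list` command, adding -v to include errors."""
    cmd = ["airflow", "dags", "list"]
    if verbose:
        cmd.append("-v")
    return cmd


def list_dags(airflow_home="~/workspace/airflow", verbose=False):
    """Run the listing from inside AIRFLOW_HOME and return the completed process."""
    env = build_env(airflow_home)
    return subprocess.run(
        build_command(verbose),
        cwd=env["AIRFLOW_HOME"],  # same effect as `cd $AIRFLOW_HOME`
        env=env,
        capture_output=True,
        text=True,
        check=False,  # inspect returncode and stderr instead of raising
    )
```

The result's stdout holds the DAG listing and stderr holds any messages Airflow printed while loading the files.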

Read more on Airflow here:

Official page

Author: user
