Airflow installation guide for running locally on M1 Mac
Env
- python version: 3.10.14
Install python 3.10.4
pyenv install 3.10.14
pyenv local 3.10.14
Create python venv
python3.10 -m venv .venv
source .venv/bin/activate
Note Check python path and version
which python
python --version
Installation
Set AIRFLOW_HOME
Create a directory for airflow home directory whose config file(e.g. airflow.cfg) and logs. After that, set the directory as AIRFLOW_HOME
.
export AIRFLOW_HOME=[path of the directory]
Example
export AIRFLOW_HOME=$(pwd)/airflow-home
Setup constraint url
AIRFLOW_VERSION=2.9.0
PYTHON_VERSION="$(python --version | cut -d " " -f 2 | cut -d "." -f 1-2)"
CONSTRAINT_URL="https://raw.githubusercontent.com/apache/airflow/constraints-${AIRFLOW_VERSION}/constraints-${PYTHON_VERSION}.txt"
Install airflow
pip install "apache-airflow[postgres,celery]==${AIRFLOW_VERSION}" --constraint "${CONSTRAINT_URL}"
Note If there is any problem then install airflow without any plugin first, then install plugins.
pip install "apache-airflow==${AIRFLOW_VERSION}" --constraint "${CONSTRAINT_URL}"
pip install "apache-airflow[postgres,celery]" --constraint "${CONSTRAINT_URL}"Please remember that
--constraint
has to be in second installation
Setup airflow database (PostgreSQL)
CREATE USER airflow WITH PASSWORD 'airflow';
GRANT ALL PRIVILEGES ON DATABASE airflow TO airflow;
ALTER DATABASE airflow OWNER TO airflow;
ALTER ROLE airflow SET client_encoding TO 'utf8';
ALTER ROLE airflow SET default_transaction_isolation TO 'read committed';
ALTER ROLE airflow SET timezone TO 'Asia/Seoul';
Init airflow
First check version of the installed airflow, and it will be 2.9.0
airflow version
Init airflow DB
airflow db init
After run db init
, should check there are files and directories in $AIRFLOW_HOME.
- airflow.cfg
- webserver_config.py (optional, if it is not, it's fine)
- logs
Create admin user
airflow users create --username admin --role Admin --firstname admin --lastname user --email admin@example.com
Update airflow config
Modify airflow.cfg
[core]
executor = LocalExecutor
[database]
sql_alchemy_conn = postgresql+psycopg2://airflow:airflow@localhost:5432/airflow
Note After updating
airflow.cfg
, should runairflow db init
again.
Run
run scheduler
airflow scheduler
run web-server
airflow webserver --port [port number]
ETC
NO sample dags
If you don't need sample dags, then update airflow.cfg like below.
[core]
load_samples = False
Uninstall
Uninstall airflow
pip uninstall apache-airflow
Reset airflow DB
airflow db reset
Uninstall packages
pip freeze | grep apache-airflow | xargs pip uninstall -y
Remove airflow home directory
rm -rf ~/airflow
Refs
Installation
Trouble shoot
WARNING: There was an error checking the latest version of pip.
certifi issue
pip install --upgrade certifi
Proxy issue
pip install --proxy=http://your.proxy:port --upgrade pip
운영 환경 설정 가이드 (추가)
1. 네트워크 환경 설정
공공 데이터 API 호출 시 M1/M2 환경에서 requests
라이브러리의 NO_PROXY
관련 이슈가 발생할 수 있습니다.
NO_PROXY
환경 변수를 반드시 설정해 주세요.
export NO_PROXY="*"
2. API Rate-Limit 대응
공공 데이터 포털 API는 보수적으로 10~20 TPS 수준으로 호출을 제한하는 것이 안전합니다.
이를 위해 Airflow Pool을 활용해 동시 실행 Task 수를 제한합니다.
예시:
# Airflow UI > Admin → Pools
# Pool Name: fetch_external_data_api
# Slots: 3 # TPS 제한에 맞춰 조정
DAG 예시 설정:
@task(
pool="fetch_external_data_api",
pool_slots=1,
retries=5,
retry_exponential_backoff=True,
retry_delay=timedelta(seconds=10)
)
3. MongoDB 연결 설정
MongoDB 연결은 장기적으로 Airflow Connections + MongoHook을 사용하는 것이 관리와 보안에 유리합니다.
- Connection ID:
mongo_default
- URI 예시:
mongodb://username:password@host:27017/dbname
설치:
pip install "apache-airflow[mongo]" --constraint "${CONSTRAINT_URL}"
예시:
from airflow.hooks.base import BaseHook
from pymongo import MongoClient
mongo_conn = BaseHook.get_connection("mongo_default")
client = MongoClient(mongo_conn.get_uri())
4. 타임존 설정
airflow.cfg
에서 타임존을 Asia/Seoul
로 변경하여 로컬 시간과 일치하도록 합니다.
[core]
default_timezone = Asia/Seoul
5. 장기 실행 / 좀비 태스크 방지
scheduler_zombie_task_threshold
: 실행 환경과 API 응답 속도에 맞춰 조정 (기본값 300초 → 필요시 600~900초)execution_timeout
: 태스크 레벨에서 설정해 무한 대기 방지
예시:
@task(
execution_timeout=timedelta(minutes=1)
)