Build Apache DolphinScheduler AMI with Packer
packer_tmpl | DolphinScheduler |
---|---|
0.0.1 | 3.0.x |
0.0.2 | 3.1.x |
0.0.3 | 3.0.x and 3.1.x, with AMI required password policies |
latest | 3.1.x and above |
You can see more detail about how to build AMI in AMI build
This AMI contains all server need to setup DolphinScheduler, which including
- DolphinScheduler: Main package of server, in path
/srv/dolphinscheduler
, will auto start standalone server when EC2 instance launch.- Standalone server can be stop by command
sudo systemctl stop dolphinscheduler-standalone
- Each server can be started by command
sudo systemctl disable [dolphinscheduler-master|dolphinscheduler-worker|dolphinscheduler-api|dolphinscheduler-alter]
- Standalone server can be stop by command
- Zookeeper: Registration center, in path
/srv/zookeeper
, can be started by commandsudo systemctl start zookeeper
- PostgreSQL: Database, can be started by command
sudo systemctl start postgresql
NOTE: The section instance type requests show the minimum supported instance type of this AMI. And the t2.micro it is qualify under the AWS free-tier, but if your account doesn't qualify under it, we're not responsible for any charges that you may incur
Go to EC2 -> Network & Security -> Security Groups -> Create Security Group
to create new security group, you should add then following points to Inbound rules for this security group:
- 22: default ssh point
- 12345: DolphinScheduler's web UI point
Currently, you have to build this AMI by yourself and then launch new EC2 instance from EC2 -> Images -> AMIs
sidebar path, select the AMI you builded and then click Launch instance from AMI
bottom. In EC2 -> Instances -> Launch an instance
page,
you should choose the exist which you created in New Security Group for Standalone section, it can be find in Network settings -> Select existing security group -> Select security group
. At last launch a single instance.
After launch, you can login DolphinScheduler service with user/<EC2_DYNAMIC_INSTANCE_ID> as default username/password via instance's [Public-IPv4-address]:12345/dolphinscheduler/ui or [Public-IPv4-DNS]:12345/dolphinscheduler/ui. For about how to use DolphinScheduler you view detail in functions of DolphinScheduler.
Note: The password of dolphinscheduler is dynamic, and it will be automatic change after EC2 instance launch to keep your service safe. And you can find it in your EC2 console home. See AMI container product for more detail about AWS requests for AMI providers.
In the next step we will use ssh connect to existing EC2 instances, and currently our cluster.sh
script only support one single key pair. So we need to create a new one and then use it when launch instances.
Go to EC2 -> Network & Security -> Key Pairs -> Create Key Pair
. Keep it carefully, otherwise you can not login you instance
Go to EC2 -> Network & Security -> Security Groups -> Create Security Group
to create new security group, you should add then following points to Inbound rules for this security group:
- 22: default ssh point
- 2181: Zookeeper connect point
- 5432: Postgresql connect point
- 1234: DolphinScheduler's MasterServer point
- 5678: DolphinScheduler's WorkerServer point
- 12345: DolphinScheduler's web UI point
Currently, you have to build this AMI by yourself and then launch new EC2 instance from EC2 -> Images -> AMIs
sidebar path, select the AMI you builded and then click Launch instance from AMI
bottom, In EC2 -> Instances -> Launch an instance
page,
you should choose the exist key pair which you created in new key pair for cluster section, it can be find in Key pair -> Select key pair
. Also you should choose the exist security group which you created in
new security group for cluster section, it can be find in Network settings -> Select existing security group -> Select security group
. At last, launch multiple instances base on your cluster number(in this tutorial
we using 8 instance to create cluster), from Number of instances
input box in Summary
plane.
If you already clone this project, you should go to directory packer_tmpl/aws/ami/dolphinscheduler/bin
, and you can see two script named cluster.sh
and cluster_env.sh
. And if you not clone this repository from GitHub, you can get those two script by
following command
wget https://raw.githubusercontent.com/WhaleOps/packer_tmpl/main/aws/ami/dolphinscheduler/bin/cluster.sh
wget https://raw.githubusercontent.com/WhaleOps/packer_tmpl/main/aws/ami/dolphinscheduler/bin/cluster_env.sh
NOTE: If your network can not connect to GitHub, above command will failed with error log like
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|0.0.0.0|:443... failed: Connection refused.
. You should find out a method to make your network can connect to hostraw.githubusercontent.com
or download these two script from GitHub website.
You have to modify your cluster_env.sh
script, which including your key pair and EC2 instances IPv4 address or IPv4 DNS. For example, we launch 8 EC2 instances, and we want to deploy two master-server, 3 worker-server, a api-server, a alert-server, one database and zookeeper server,
and each instance IPv4 address as below:
- 192.168.1.1: master-server
- 192.168.1.2: master-server
- 192.168.1.3: worker-server
- 192.168.1.4: worker-server
- 192.168.1.5: worker-server
- 192.168.1.6: api-server
- 192.168.1.7: alert-server
- 192.168.1.8: metadata database(postgresql), zookeeper
We should told our deploy plan to cluster_env.sh
otherwise it will never know how to deploy it(here we only show some necessary changed content without comment)
export ips="192.168.1.1,192.168.1.2,192.168.1.3,192.168.1.4,192.168.1.5,192.168.1.6,192.168.1.7,192.168.1.8"
export masters="192.168.1.1,192.168.1.2"
export workers="192.168.1.3:default,192.168.1.4:default,192.168.1.5:default"
export alertServer="192.168.1.6"
export apiServers="192.168.1.7"
export DATABASE_SERVER="192.168.1.8"
export REGISTRY_SERVER="192.168.1.8"
Should also add you key pair location which you create in new key pair for cluster, an absolute path is be encouraged to use(here we only show some necessary changed content without comment)
# Do not change this if you use this AMI to launch your instance
export INSTANCE_USER=${INSTANCE_USER:-"ubuntu"}
# You have to change to your own key pair path
export INSTANCE_KEY_PAIR="/change/to/your/personal/to/key/pair"
After modified cluster_env.sh
you can execute the script by command
./cluster.sh start
It is will task some some time dependent on you network speed, and after it finish, your EC2 instances will be combine to dolphinscheduler cluster.
After that, you can login DolphinScheduler service with user/EC2_DYNAMIC_INSTANCE_ID as default username/password via instance's [API-SERVER-Public-IPv4-address]:12345/dolphinscheduler/ui or [API-SERVER-Public-IPv4-DNS]:12345/dolphinscheduler/ui. For about how to use DolphinScheduler you view detail in functions of DolphinScheduler.
Note: The password of dolphinscheduler is dynamic, and it will be automatic change after EC2 instance launch to keep your service safe. And you can find it in your EC2 console home. See AMI container product for more detail about AWS requests for AMI providers.
The minimum required instance type to launch this AMI is t2.micro, and with this type you can start DolphinScheduler service and run some easy workflows, but the web UI may be a little stuck. The t2.small or instance type large than is be recommended if you want a better experience, an with that type you can run some middle scale workflow and have a smooth web UI experience. ref AWS EC2 Instance Type
Turn on debug mode by changing variable pkr_debug
value default = "true"
in dolphinscheduler.pkr.hcl
, debug mode will show more detail during the AMI build.
AMI name must unique in each region, it could raise duplicate name error when you try build AMI with the name already exists. In this case, you and suffix for current build, change variable suffix
when you want to add suffix to AMI. It is useful when you try to fix/add functional to exists packet template.