This repository has been archived by the owner on Oct 23, 2024. It is now read-only.
Releases: d2iq-archive/dcos-commons
SDK 0.52.0: Mono->Multi conversion support, Plan metrics
Scheduler:
- New feature: Support for one-way conversion of mono-service schedulers to multi-service schedulers. The previous service must continue to have the same name as the framework. For example, if the original mono-scheduler was named "foo", the new multi-scheduler could have "foo", "bar", and "baz", where any tasks originally belonging to the previous "foo" will be routed to the new "foo". See example usage in hello-world.
- New feature: The Scheduler will now emit metrics on plan state, e.g. reporting when the `recovery` plan is doing work, or when any custom `backup` plan is triggered.
- New feature: When a service is being uninstalled, its `deploy` plan (for doing the uninstall) will group resources into phases according to the host system for those resources. This should simplify diagnosis of e.g. resources that are stuck on unreserve due to a host system currently being down.
- New feature: Added support for operator customization of the Marathon TLD and VIP. This should allow users to run SDK services in MoM systems in the future, but at the moment this is a manual process.
- New feature: Remove hibernate dependency for ServiceSpec validation. The hibernate library was causing runtime compatibility issues for users that wanted to link the SDK into their existing code. We weren't using hibernate for much, so it was easier to just remove it in favor of custom validation. As a result, validation errors should now be more understandable to developers and operators, compared to what Hibernate was producing.
- Bug fix: Fix potential deadlock when uninstall is triggered in a multi-service scheduler. Also improves deadlock detection in general: The scheduler should automatically restart and recover if a deadlock scenario occurs in the future.
- Bug fix: Fix offer suppression when a custom `update` plan is provided. Previously, the new suppress behavior wouldn't occur when the service developer had defined custom update behavior.
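The host-based grouping used by the uninstall `deploy` plan can be pictured with a minimal sketch. This is not the SDK's implementation (the Scheduler itself is written in Java); the `Resource` tuple, its field names, and the sample values are hypothetical and exist only to illustrate "one phase per host".

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple


class Resource(NamedTuple):
    """Hypothetical stand-in for a reserved resource awaiting cleanup."""
    resource_id: str
    hostname: str


def group_into_phases(resources: List[Resource]) -> Dict[str, List[str]]:
    """Group resource IDs into one phase per host, preserving input order."""
    phases: Dict[str, List[str]] = defaultdict(list)
    for res in resources:
        phases[res.hostname].append(res.resource_id)
    return dict(phases)


# Hypothetical sample data: two hosts, three resources.
resources = [
    Resource("cpu-1", "10.0.0.1"),
    Resource("disk-1", "10.0.0.2"),
    Resource("mem-1", "10.0.0.1"),
]
phases = group_into_phases(resources)
for host, ids in phases.items():
    print(f"phase for {host}: {ids}")
```

With this layout, a host that is down shows up as a single stalled phase rather than as unrelated steps scattered through the plan.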
Tools/testing:
- Automatically run `maws` if credentials are stale.
- Support for automatically running python lint utils like `flake8` on test/build tooling.
- Various minor fixes to existing test utils.
SDK 0.42.2
Summary
This is a patch update to the DC/OS Commons SDK.
Bug Fixes
- Ensure Mounts for TLS secrets are not duplicate (#2577)
Tools
- `test.sh` (#2588) (#2594)
  - Add `-e` and `--envfile` flags to pass additional environment variables to the docker container
  - Add the ability to set the docker image used to launch a container
  - Add a timestamp to the generated environment and credential files to allow multiple simultaneous containers
- Add tooling for running flake8 in CI or using pre-commit (#2532) (#2561)
- Add strict checking to `svc.yml` rendering tests (#2527) (#2529)
  - Note, this may require that environment variables are added to the `ServiceTest` runner instance using the `setSchedulerEnv()` methods.
- Force rebuilding dependencies if a CLI binary has changed (#2559)
- Allow custom linker flags to be specified when building Go executables (#2559)
- Skip binary files in the airgap linter (#2559)
HDFS 2.3.0-2.6.0-cdh5.11.0
New Features
- All frameworks (HDFS included) now isolate their `/tmp` task directories by making them Mesos `SANDBOX_PATH` volume sources. (#2467 and #2486)
Bug Fixes
- Fix duplicate mounts being generated for TLS secrets, causing pod maintenance operations to fail (#2577)
Improvements
- The SDK tests now validate missing values for `svc.yml` Mustache variables. (#2527)
Cassandra 2.3.0-3.0.16
New Features
- Ability to configure Cassandra's `disk_failure_policy` through the `cassandra.disk_failure_policy` service configuration. In previous versions, `disk_failure_policy` was hard-coded to `stop`. (#2515)
- Ability to isolate framework `/tmp` task directories by making them Mesos `SANDBOX_PATH` volume sources. (#2467 and #2486)
Bug Fixes
- Upgrading Cassandra with non-default `service.rack` values has been fixed. (#2553)
Improvements
- The SDK tests now validate missing values for `svc.yml` Mustache variables. (#2527)
Kafka 2.3.0-1.1.0
Elastic 2.4.0-5.6.9
SDK 0.51.0: Suppressed offers, update plan fixes
Scheduler:
- New feature: The Scheduler will now automatically suppress offers when there isn't any work to do. This optimization reduces load on Mesos in clusters with a large number of idle Schedulers.
- Bug fix for `autoip`-based config distribution, introduced in 0.50.0: The Scheduler will now wait for its autoip hostname to be resolvable before launching tasks, because the tasks depend on that hostname to fetch config templates, if any are defined for the service. Without this fix, tasks could fail a couple of times during initial deployment before launching successfully once the hostname became resolvable.
- Bug fix for `update` plans: The scheduler will now immediately store the 'deployment-completed' bit, rather than waiting for the following restart. This came up in cases where the service's `deploy` plan had new steps added between different releases of the same package, and the user never restarted the scheduler when using the prior version of the package. In that situation, the Scheduler could have mistakenly used the `deploy` plan rather than the custom `update` plan specified by the developer (if any).
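The "wait until the hostname is resolvable" behavior described for the `autoip` fix follows a common retry pattern, sketched below. This is an illustration only, not the Scheduler's code: the function name, the retry count, and the delay are assumptions, and the resolver is injectable so the logic can be exercised without real DNS.

```python
import socket
import time
from typing import Callable


def wait_until_resolvable(
    hostname: str,
    resolve: Callable[[str], str] = socket.gethostbyname,
    attempts: int = 10,
    delay_seconds: float = 1.0,
) -> bool:
    """Return True once `hostname` resolves, or False after `attempts` tries."""
    for attempt in range(attempts):
        try:
            resolve(hostname)
            return True
        except OSError:
            # socket.gaierror is a subclass of OSError: not resolvable yet.
            if attempt < attempts - 1:
                time.sleep(delay_seconds)
    return False
```

A caller would gate task launches on the return value, for example `if wait_until_resolvable(scheduler_hostname): launch_tasks()`, where `scheduler_hostname` and `launch_tasks` are placeholders rather than SDK names.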
Testing:
- New feature: The SDK's `ServiceTest` library is now more strict when validating the service's YAML template. Any variables expected by service developers to be missing should be explicitly specified in the test using `ServiceTestRunner.setSchedulerEnv()`.
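To show what strict template validation means in practice, here is a small sketch of a renderer that refuses to render when a variable has no value. It is not the `ServiceTest` implementation (which is Java); the function name and the variable names `FRAMEWORK_NAME` and `NODE_CPUS` are hypothetical, and only plain `{{NAME}}` variables are handled.

```python
import re
from typing import Mapping

# Matches simple Mustache variables such as {{FRAMEWORK_NAME}}; sections and
# other Mustache features are deliberately ignored in this sketch.
_VARIABLE = re.compile(r"\{\{\s*([A-Za-z_][A-Za-z0-9_.]*)\s*\}\}")


def render_strict(template: str, env: Mapping[str, str]) -> str:
    """Substitute variables, raising if any variable has no value in `env`."""
    missing = sorted(
        {name for name in _VARIABLE.findall(template) if name not in env}
    )
    if missing:
        raise KeyError(f"missing template variables: {', '.join(missing)}")
    return _VARIABLE.sub(lambda match: env[match.group(1)], template)


print(render_strict("name: {{FRAMEWORK_NAME}}", {"FRAMEWORK_NAME": "hello-world"}))
```

Under a check like this, a test that intentionally leaves a variable unset has to say so explicitly, which is the role `ServiceTestRunner.setSchedulerEnv()` plays in the SDK tests.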
Tools:
- New feature: When fetching upgrade/downgrade paths, the package builder will only fetch information about the package being built, rather than all packages.
- New feature: When publishing packages to S3, the following environment variables may be used to specify the destination: `s3://<S3_BUCKET>/<S3_DIR_PATH>/packagename/<S3_DIR_NAME>`, or these may continue to be completely overridden via `S3_URL` + `ARTIFACT_DIR`. See this patch.
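The S3 destination layout above can be assembled as in the following sketch. It assumes all three variables are set and deliberately leaves out the `S3_URL` + `ARTIFACT_DIR` override, whose exact semantics are not spelled out here; the function name and the sample values are hypothetical, and the real publish tooling may apply defaults that this sketch does not.

```python
from typing import Mapping


def s3_destination(env: Mapping[str, str], package_name: str) -> str:
    """Build s3://<S3_BUCKET>/<S3_DIR_PATH>/<package name>/<S3_DIR_NAME>."""
    required = ("S3_BUCKET", "S3_DIR_PATH", "S3_DIR_NAME")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise ValueError(f"unset variables: {', '.join(missing)}")
    parts = [
        env["S3_BUCKET"],
        env["S3_DIR_PATH"].strip("/"),  # tolerate leading/trailing slashes
        package_name,
        env["S3_DIR_NAME"],
    ]
    return "s3://" + "/".join(parts)


# Hypothetical values, for illustration only.
example_env = {
    "S3_BUCKET": "my-bucket",
    "S3_DIR_PATH": "packages",
    "S3_DIR_NAME": "build-123",
}
print(s3_destination(example_env, "hello-world"))
```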
SDK 0.50.0: Multi-service support
- Adds multi-service mode, where a single Scheduler process can run multiple ServiceSpecs in parallel. These ServiceSpecs may be added and removed on the fly, or they can be statically defined when the Scheduler starts. See hello-world for example usage.
- Many other changes were made to support multi-service mode:
- Default HTTP handlers were broken into two halves: RPC handling itself, and underlying query logic
- The CLI's command handling has been broken up as well: Commandline parsing, and underlying command logic
- Scheduler-level vs Framework-level configuration handling is now better defined, with separate organization for each.
- Support for region and zone awareness has been added. Services which are region aware can now take advantage of environment variables which tell them what region/zone the operator has placed them in. New placement constraints can also be used to define how tasks should be distributed across them.
- Services running in custom configurations can now override the default TLD used to announce and communicate between tasks.
- Logging has been cleaned up significantly, and the Scheduler should now be much quieter when in an idle state. Several existing log messages have also been reworded to make them clearer.
- Task IDs will be prefaced with the service name. This can make it easier to keep track at a glance of which service is managing which task. Note: This also means that, depending on the DC/OS CLI version, `{{service.name}}__` may need to be prefixed to the task ID when running `dcos task exec` commands.
- DC/OS 1.9 is no longer supported as of the 0.50.x branch of the SDK. Services using SDK 0.50.0 are only compatible with DC/OS 1.10 and newer. As such, any services using `executor.zip` in their `resource.json` should remove it when updating to 0.50.x.
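For scripts that build `dcos task exec` invocations, the `{{service.name}}__` prefix mentioned above can be applied with a small helper like this one. It is a sketch under assumptions: the helper name and the sample task ID are hypothetical, and it assumes the service name appears in the task ID unmodified.

```python
def prefixed_task_id(service_name: str, task_id: str) -> str:
    """Prepend '<service_name>__' to a task ID unless it is already present."""
    prefix = f"{service_name}__"
    return task_id if task_id.startswith(prefix) else prefix + task_id


# Hypothetical service and task names.
print(prefixed_task_id("hello-world", "hello-0-server"))
```

The helper is idempotent, so it can be applied safely to IDs that may or may not already carry the prefix.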
Changes which may affect developers' existing services:
- The distribution of config templates is now done using autoip+port endpoints, rather than via VIPs. This allows better compatibility in clusters which have defined custom firewall rules which may affect VIPs. As such, any services which use the `SCHEDULER_API_HOSTNAME` environment variable should now also use `SCHEDULER_API_PORT` to get the correct port.
- The default JRE used has been switched to a Server JRE version, which is significantly smaller than the prior default JRE, but which has a different directory structure. Services which invoke the JRE should use something like this: `export JAVA_HOME=$(ls -d $MESOS_SANDBOX/jdk*/jre/); ...`
- Many unused Java dependencies have been cleaned up from the SDK build. Downstream services which had used these dependencies will need to explicitly add them to their local build.
- Tooling and testing have been greatly improved, including automatic retrieval of logs and diagnostic data from the cluster when an integration test fails, and significant expansion of the ServiceTest functionality which allows developers to test cluster interaction in a reproducible way within a JUnit test.
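For the `SCHEDULER_API_HOSTNAME` and `SCHEDULER_API_PORT` change listed above, a task-side helper might combine the two variables as sketched here. The function name and the `http` scheme are assumptions made for illustration; only the two environment variable names come from the release notes.

```python
from typing import Mapping


def scheduler_api_base(env: Mapping[str, str]) -> str:
    """Combine the scheduler hostname and port into a base URL."""
    host = env["SCHEDULER_API_HOSTNAME"]
    port = int(env["SCHEDULER_API_PORT"])  # fail early on a non-numeric port
    return f"http://{host}:{port}"  # scheme is an assumption of this sketch


# Hypothetical values; inside a task, os.environ would be passed instead.
print(scheduler_api_base({
    "SCHEDULER_API_HOSTNAME": "scheduler.example.invalid",
    "SCHEDULER_API_PORT": "8080",
}))
```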
dcos-commons-0.42.1
Summary
This is a patch update to the DC/OS Commons SDK.
Bug Fixes
- Create a sandbox volume for /tmp on all SDK tasks so it's isolated. #2489
Kafka 2.3.0-1.0.0
Version 2.3.0-1.0.0
New Features
- Support for configuring Kafka transport encryption ciphers with secure defaults. (#2483)