Reduce azure gem #737

Closed
kbrock opened this issue Aug 16, 2021 · 12 comments · Fixed by ManageIQ/manageiq-providers-azure_stack#112

@kbrock
Member

kbrock commented Aug 16, 2021

Spin off from #736

Reducing the size of the Azure API gem

Azure is a big gem (145mb for azure_mgmt_network) and our single biggest asset on the appliance/pod, so we naturally want to find a way to reduce it.

In 2018 an issue was raised that the gem was getting unwieldy, but nothing came of it. The main response is that the gem was retired as of 2/2021 and we should stop using it. ref. While that is a good long-term solution, I want a smaller appliance before we are able to port the code.

The gem code has been changing 3-4 times a year, but I'm unsure whether they will maintain that cadence. It was pushed in March, one month after the gem was retired.

My suggested course of action

Given:

  • There are 30+ api versions at 2-7mb each.
  • There are 4 profiles that take code from multiple api versions.

Assumption:

  • The user will use a profile and not directly code to a version of the api.
  • We can dictate which profiles we support and delete the other profiles.
  • We can dictate which version of the api we support and delete the other versions.

Business decisions:

  • Which versions of the profile do we want to support? Our code lists V2017_03_09 and V2018_03_01, but I wonder if we want to support the latest profile. ref

Plan:

  • Get the versions of the API that are used by the profiles
  • Delete all other APIs
  • Release this gem to our own gem server.
  • On future releases of the code, run this script again on master and release another version of the gem.

Determining APIs to keep

Looks like we can remove 29 (34 - 5) directories of APIs, removing 127mb of the 146mb.

azure_mgmt_network-0.26.1 $ pwd
/Users/kbrock/.gem/ruby/2.7.1/gems/azure_mgmt_network-0.26.1

azure_mgmt_network-0.26.1 $ ls lib/profiles/
latest             v2017_03_09        v2018_03_01        v2019_03_01_hybrid

# APIs used
azure_mgmt_network-0.26.1 $ cat lib/profiles/*/modules/network_profile_module.rb | sed -n 's/.*Mgmt::V\([^:]*\):.*/\1/p' | sort -u | sed 's/_/-/g'
2015-06-15
2017-03-30
2017-10-01
2019-06-01
2020-08-01

# APIs defined

azure_mgmt_network-0.26.1 $ ls -d lib/2*
2015-05-01-preview 2017-03-01         2018-02-01         2018-11-01         2019-08-01         2020-05-01
2015-06-15         2017-03-30         2018-04-01         2018-12-01         2019-09-01         2020-06-01
2016-03-30         2017-09-01         2018-06-01         2019-02-01         2019-11-01         2020-07-01
2016-06-01         2017-10-01         2018-07-01         2019-04-01         2019-12-01         2020-08-01
2016-09-01         2017-11-01         2018-08-01         2019-06-01         2020-03-01
2016-12-01         2018-01-01         2018-10-01         2019-07-01         2020-04-01
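
The comparison above (APIs used vs. APIs defined) can also be sketched in Ruby. This is a minimal illustration, not the script discussed in this thread: the method names `used_api_versions` and `prunable_api_versions` are hypothetical, and the sample profile lines are illustrative stand-ins for the contents of `network_profile_module.rb`.

```ruby
# Hypothetical helpers mirroring the sed pipeline and the `ls -d lib/2*` listing.

# Extract the API versions referenced in profile module source text.
# A line containing "Mgmt::V2017_10_01::Subnets" yields "2017-10-01".
def used_api_versions(profile_source)
  profile_source.each_line
                .map { |line| line[/.*Mgmt::V([^:]*):/, 1] }
                .compact
                .map { |version| version.tr("_", "-") }
                .uniq
                .sort
end

# API version directories that no profile references.
def prunable_api_versions(defined, used)
  defined - used
end

# Illustrative input only; real profile files are much longer.
sample = <<~PROFILE
  Subnets = Azure::Network::Mgmt::V2017_10_01::Subnets
  Usages = Azure::Network::Mgmt::V2015_06_15::Usages
  module Models
PROFILE

used = used_api_versions(sample)
p used                                                              # => ["2015-06-15", "2017-10-01"]
p prunable_api_versions(%w[2015-06-15 2016-03-30 2017-10-01], used) # => ["2016-03-30"]
```

Feeding it the concatenated profile files and the 34 directory names listed above should reproduce the 5 used / 29 prunable split.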

Next Steps

We could do the same thing with azure compute API, though that would probably be closer to 10mb.

/cc @Fryguy @agrare @bdunne you all were on the previous thread

@kbrock kbrock changed the title from "fewer gem files" to "Reduce azure gem" Aug 16, 2021
@Fryguy
Member

Fryguy commented Aug 16, 2021

We know the exact versions we use on azure: https://github.com/ManageIQ/manageiq-providers-azure/blob/master/config/settings.yml#L4-L19, so maybe we could do the same for azure_stack and find a way to delete the rest?

We also don't need to create our own gem - if it's just straight deletes and the license allows us we can kill it directly in the rpm_build code like we do other things: https://github.com/ManageIQ/manageiq-rpm_build/blob/1d172a51ed665cb4f6ab71a6266dcf809a82911e/lib/manageiq/rpm_build/generate_gemset.rb#L154

(though I agree a custom gem would keep dev and prod identical for debugging purposes, and for file-size download and require purposes)

@bdunne
Member

bdunne commented Aug 16, 2021

I think I prefer releasing our own forked gem over modifying it in manageiq-rpm_build to keep development environments similar to production.

@kbrock
Member Author

kbrock commented Aug 19, 2021

threw together a script: https://github.com/kbrock/azure-sdk-for-ruby/blob/smaller/prune-versions.rb

>> 15mb pruned from azure_mgmt_compute
>> 1mb pruned from azure_mgmt_monitor
>> 122mb pruned from azure_mgmt_network
>> 5mb pruned from azure_mgmt_resources

>> 145mb pruned from all gems

If we limit it to only our supported profiles, that reduces it by 2mb, which seems a little suspect. I can look into it more later, but supporting all profiles seems good enough.

This is a naive approach: we keep every API binding in the retained versions, not just the bindings actually used by a profile. (A profile uses some bindings from one version of the API and some from another.)

But even when being conservative, keeping all profiles and only pruning azure_mgmt_network, 122mb is a good win.
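
A dry run of this kind of pruning can be sketched as follows. This is not the linked prune-versions.rb; the method names `dir_size` and `prune_candidates` are hypothetical, and the sketch assumes API versions live in directories matching `lib/2*`, as in the listing earlier in this thread. It only reports sizes and deletes nothing.

```ruby
require "find"

# Total size in bytes of all regular files under +dir+.
def dir_size(dir)
  total = 0
  Find.find(dir) { |path| total += File.size(path) if File.file?(path) }
  total
end

# Dry run: map each API version directory under <gem_root>/lib that is not
# in +keep+ to its size in bytes. Nothing is removed.
def prune_candidates(gem_root, keep)
  lib = File.join(gem_root, "lib")
  Dir.glob("2*", base: lib)
     .sort
     .select { |name| File.directory?(File.join(lib, name)) }
     .reject { |name| keep.include?(name) }
     .to_h { |name| [name, dir_size(File.join(lib, name))] }
end
```

Calling `prune_candidates(gem_root, used_versions).values.sum` gives the bytes that would be pruned for one gem. Like the approach described above, it keeps or drops whole version directories rather than individual bindings.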

@kbrock
Member Author

kbrock commented Sep 20, 2021

status: so I wrote a script to prune them down, but I'm getting strange issues with unrelated gems. Looks like the dependencies around faraday get botched up. I'm assuming that I'm either deleting some objects that are still referenced, or not pruning out some include references that pull in files for loading but are no longer used.
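
One of the two guesses above (references left behind that point at deleted files) can be checked mechanically. The sketch below is hypothetical and does not diagnose the faraday problem itself. It assumes the gem's entry file pulls in each API version with a literal `require '<version>/generated/...'` line, which is an assumption about the gem layout, and `dangling_requires` is an illustrative name.

```ruby
# Return the version-style require targets in +entry_file+ that no longer
# resolve to a file under +lib_dir+. Only literal `require "x"` lines whose
# target starts with a date-like version (e.g. 2016-03-30/...) are checked,
# so requires of other gems are ignored.
def dangling_requires(entry_file, lib_dir)
  File.readlines(entry_file)
      .map { |line| line[/^\s*require\s+['"]([^'"]+)['"]/, 1] }
      .compact
      .select { |target| target.match?(/\A\d{4}-\d{2}-\d{2}/) }
      .reject { |target| File.file?(File.join(lib_dir, "#{target}.rb")) }
end
```

An empty result after pruning would rule out this particular class of leftover reference; it says nothing about constants referenced in other ways.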

@miq-bot miq-bot added the stale label Feb 27, 2023
@miq-bot
Member

miq-bot commented Feb 27, 2023

This issue has been automatically marked as stale because it has not been updated for at least 3 months.

If you can still reproduce this issue on the current release or on master, please reply with all of the information you have about it in order to keep the issue open.

Thank you for all your contributions! More information about the ManageIQ triage process can be found in the triage process documentation.

@kbrock
Member Author

kbrock commented Mar 12, 2023

@Fryguy do we want to spend the money/time to fix this?

I'm not even sure where my scripts live TBH.

I had posted some concerns in the original repo a while back, but the gem authors have abandoned this.

@agrare do you feel this is a priority?

@Fryguy
Member

Fryguy commented Mar 12, 2023

It's always about level of effort. If it's a small change for big savings, then yes.

I'm actually, surprisingly, not a fan of creating our own gem for azure because maintaining it forever (security and bug fixes) will be a hassle, but I do agree with the dev vs prod argument.

@miq-bot miq-bot closed this as completed Jun 19, 2023
@miq-bot
Member

miq-bot commented Jun 19, 2023

This issue has been automatically closed because it has not been updated for at least 3 months.

Feel free to reopen this issue if this issue is still valid.

Thank you for all your contributions! More information about the ManageIQ triage process can be found in the triage process documentation.

@Fryguy Fryguy added help wanted and removed stale labels Jun 19, 2023
@agrare agrare reopened this Sep 11, 2023
@Fryguy
Member

Fryguy commented Oct 3, 2024

Related, the python client also has this same size issue - See Azure/azure-sdk-for-python#17801

Note that now that we install the python client in the venv, it is taking up almost 385MB on disk (in addition to the Ruby size)

@Fryguy
Member

Fryguy commented Oct 3, 2024

I think if we upgrade to a newer azure python library, then it will reduce in size by quite a bit.

@Fryguy
Member

Fryguy commented Oct 8, 2024

I took another crack at this and opened ManageIQ/manageiq-providers-azure_stack#112. This builds on the https://github.com/Fryguy/azure-sdk-for-ruby/tree/drop_unused branch, whose first commit adds a script for reducing gem sizes; the next 8 commits are the result of running that script. I've tested this locally with the azure stack PR and all of the tests pass. Additionally, I tried to autoload every autoloadable constant, and that also works.

I only measured the time to load everything, and these are my results:

azure_mgmt_network
  4.195398   1.299338   5.494736 (  5.541836)   # Initial autoload everything
  0.662541   0.218067   0.880608 (  0.889964)   # Autoload after purge unused APIs
  0.313432   0.132122   0.445554 (  0.462570)   # Autoload after purge profiles
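
A measurement like the one above could be reproduced along these lines. This is a sketch under assumptions, not the code used for the numbers in this comment: `force_autoloads` and `time_autoloads` are hypothetical names, and the approach simply walks a namespace and resolves each constant so that pending autoloads fire.

```ruby
require "benchmark"
require "set"

# Resolve every constant currently listed under +mod+, forcing pending
# autoloads to fire. Returns the names of constants that were still waiting
# to be autoloaded. A missing file surfaces as a LoadError.
def force_autoloads(mod, root = mod.name.to_s, seen = Set.new, loaded = [])
  return loaded unless seen.add?(mod)

  mod.constants(false).each do |name|
    pending = mod.autoload?(name)      # file path while the autoload is pending, else nil
    value = mod.const_get(name, false) # triggers the autoload
    loaded << "#{mod.name}::#{name}" if pending
    # Recurse only into modules that live under the starting namespace.
    next unless value.is_a?(Module) && value.name.to_s.start_with?("#{root}::")

    force_autoloads(value, root, seen, loaded)
  end
  loaded
end

# Time a full autoload pass. The returned Benchmark::Tms prints in the same
# "user system total (real)" layout as the figures above.
def time_autoloads(mod)
  loaded = nil
  tms = Benchmark.measure { loaded = force_autoloads(mod) }
  [tms, loaded]
end
```

Pass whichever top-level module the gem defines, then `puts` the first element of the result. Because a pruned file that is still referenced raises a LoadError, the same pass doubles as a check that nothing needed was deleted.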

@Fryguy Fryguy closed this as completed Jan 13, 2025
@Fryguy Fryguy reopened this Jan 13, 2025
@Fryguy
Member

Fryguy commented Jan 13, 2025

ok, I've cut a new version of the 4 azure gems we use using the technique I mentioned earlier - here's the on-disk savings:

Before:

 26M	azure_mgmt_compute-0.22.0
3.5M	azure_mgmt_monitor-0.19.0
140M	azure_mgmt_network-0.26.1
8.4M	azure_mgmt_resources-0.18.2
177.9M	total

After:

2.5M	azure_mgmt_compute-0.22.0.1
2.0M	azure_mgmt_monitor-0.19.0.1
4.5M	azure_mgmt_network-0.26.1.1
1.0M	azure_mgmt_resources-0.18.2.1
10.0M	total

For a total savings of 167.9MB (94% smaller)
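
The totals and the percentage can be rechecked with a few lines of Ruby. The sizes are copied from the before/after listings above; the constant and method names are illustrative only.

```ruby
# On-disk sizes in MB, copied from the listings above.
BEFORE_MB = { "azure_mgmt_compute" => 26.0, "azure_mgmt_monitor" => 3.5,
              "azure_mgmt_network" => 140.0, "azure_mgmt_resources" => 8.4 }.freeze
AFTER_MB  = { "azure_mgmt_compute" => 2.5, "azure_mgmt_monitor" => 2.0,
              "azure_mgmt_network" => 4.5, "azure_mgmt_resources" => 1.0 }.freeze

# Totals, absolute savings, and percentage reduction (rounded to a whole percent).
def savings(before, after)
  before_total = before.values.sum.round(1)
  after_total  = after.values.sum.round(1)
  saved        = (before_total - after_total).round(1)
  { before: before_total, after: after_total, saved: saved,
    percent: (saved / before_total * 100).round }
end

result = savings(BEFORE_MB, AFTER_MB)
puts format("%.1fMB saved (%d%% smaller)", result[:saved], result[:percent])
# prints "167.9MB saved (94% smaller)"
```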


I forked the repo at https://github.com/ManageIQ/azure-sdk-for-ruby/ and the purge script can be found here: https://github.com/ManageIQ/azure-sdk-for-ruby/blob/azure-sdk-for-ruby_production/scripts/purge_unused.rb. The gems have been published to https://rubygems.manageiq.org

See also ManageIQ/manageiq-providers-azure_stack#112

@Fryguy Fryguy moved this to In progress in Roadmap Jan 14, 2025
@Fryguy Fryguy moved this from In progress to Spassky in Roadmap Jan 14, 2025
@Fryguy Fryguy added this to the Spassky milestone Jan 14, 2025
Projects
Status: Spassky
5 participants