High RAM/CPU utilization #442
I'll dig deeper but it's possible that this is happening only on the first execution since the DFDL schema has to be compiled before the EDIFACT document can be ingested. Does resource utilisation remain that high while processing a second document? |
Hello, thank you for your attention, @cjmamo! I tested this by changing the example to process the same message a second time within the main method. I also tested with adjusted RAM usage and recorded the completion time:
It seems that for the first message the used RAM is around 750 MB, and the second message uses an additional 50-100 MB. Is there any way to have a precompiled or cached schema so that the first message would benefit? In my particular case, I would use Smooks by passing a single message at a time to stdin and reading the output from stdout. While this is not the most elegant approach, it would work. As such, there are no subsequent messages that would benefit from the initial compilation, caching, or any other optimization that becomes available after the first message is processed. If you have any other hints on what and how to test further, I would be very happy to try them. Thanks |
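For comparing the first run against later runs inside one JVM, a minimal JDK-only sketch could look like the following. The `RunTimer` class and the placeholder task are hypothetical and not part of Smooks or the example project; the task would need to be replaced with the example's actual filtering call. The heap figure is what the JVM reports as in use, which is not the same as the process RSS seen in `top`.

```java
// Hypothetical harness (not part of Smooks): times repeated runs of one task
// in a single JVM so the first (cold) run can be compared with later runs.
final class RunTimer {

    /**
     * Runs the task the given number of times.
     * Each row of the result is {elapsedMillis, usedHeapBytes}.
     */
    static long[][] measure(int runs, Runnable task) {
        Runtime rt = Runtime.getRuntime();
        long[][] samples = new long[runs][2];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            task.run();
            samples[i][0] = (System.nanoTime() - start) / 1_000_000;
            // Heap currently in use by the JVM; this is not the process RSS.
            samples[i][1] = rt.totalMemory() - rt.freeMemory();
        }
        return samples;
    }

    public static void main(String[] args) {
        // Placeholder workload (assumption): swap in the code that filters
        // one EDIFACT message with the already constructed Smooks instance.
        Runnable task = () -> {
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < 100_000; i++) {
                sb.append(i);
            }
        };

        long[][] samples = measure(3, task);
        for (int i = 0; i < samples.length; i++) {
            System.out.println("run " + (i + 1)
                    + ": " + samples[i][0] + " ms, used heap "
                    + samples[i][1] / (1024 * 1024) + " MB");
        }
    }
}
```

If the first row is much larger than the others, the cost is mostly one-time setup, which matches the schema compilation hypothesis raised earlier in the thread.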
Which version of the EDIFACT cartridge are you running? I'm getting very different numbers over here. |
2.0.0-RC1, as specified in the pom.xml. Switching to 2.0.0-RC4 causes
|
I'm getting a 404 from following that link. There were performance improvements since RC1 so you should run the example from the v2 tag: https://github.com/smooks/smooks-examples/tree/v2 :
|
Was the issue resolved? Can this be closed? |
I experience the issue with v2, as well. |
[app.jfr.zip](https://github.com/user-attachments/files/17284325/app.jfr.zip)
@cjmamo |
I noticed that the reader pool size by default is 0 which means that Smooks constructs a new reader every time it filters the input. Can you set this to 1 and observe how it performs when filtering multiple inputs?
<smooks-resource-list xmlns="https://www.smooks.org/xsd/smooks-2.0.xsd"
xmlns:core="https://www.smooks.org/xsd/smooks/smooks-core-1.6.xsd"
xmlns:edifact="https://www.smooks.org/xsd/smooks/edifact-2.0.xsd">
<core:filterSettings readerPoolSize="1"/>
...
...
</smooks-resource-list> |
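To illustrate why a pool size of 0 versus 1 matters when filtering several inputs, here is a small conceptual sketch. It is not Smooks's implementation; `SimplePool` and its methods are hypothetical names, and `StringBuilder` merely stands in for an expensive-to-construct reader. The sketch is also not thread-safe.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Supplier;

// Conceptual illustration only (not Smooks code): a pool that keeps at most
// maxIdle instances around for reuse and counts how many it had to construct.
final class SimplePool<T> {
    private final int maxIdle;
    private final Supplier<T> factory;
    private final Deque<T> idle = new ArrayDeque<>();
    private int created;

    SimplePool(int maxIdle, Supplier<T> factory) {
        this.maxIdle = maxIdle;
        this.factory = factory;
    }

    /** Returns an idle instance if one is available, otherwise constructs one. */
    T borrow() {
        T pooled = idle.poll();
        if (pooled != null) {
            return pooled;
        }
        created++;
        return factory.get();
    }

    /** Keeps the instance for reuse only while there is room in the pool. */
    void release(T instance) {
        if (idle.size() < maxIdle) {
            idle.push(instance);
        }
    }

    int createdCount() {
        return created;
    }

    public static void main(String[] args) {
        for (int size : new int[] {0, 1}) {
            // StringBuilder is a stand-in for an expensive reader (assumption).
            SimplePool<StringBuilder> pool = new SimplePool<>(size, StringBuilder::new);
            for (int i = 0; i < 3; i++) {
                StringBuilder reader = pool.borrow();
                pool.release(reader);
            }
            System.out.println("pool size " + size
                    + " -> instances created: " + pool.createdCount());
        }
    }
}
```

With three sequential inputs, the size-0 pool constructs three instances and the size-1 pool constructs one. Any saving only appears from the second input onward, so a process that handles a single message would not benefit.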
I am running the edifact-to-xml example and I see what seems to be very high RAM and CPU utilization. I am running it on a Mac with an M2 and 24 GB of RAM under the latest macOS, as well as on a Basic DigitalOcean droplet (Ubuntu 22.04.5, 2 GB RAM, Regular Intel, 1 vCPU), with Java 17.
On both machines the java process goes up to 1.2 - 1.5 GB RAM, with full CPU utilization.
On the droplet the example runs in around 80 seconds, which seems quite a long time for the transformation, even for the droplet specs.
Is this the expected behavior? Is there a way to optimize this?
Thanks