Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce a limit of collected items #2051

Merged
merged 8 commits into from
Dec 5, 2023

Conversation

jan-cerny
Copy link
Member

@jan-cerny jan-cerny commented Nov 16, 2023

This PR introduces possibility to set a limit of items collected by probes.
If a single OVAL object matches more than given limit, we will stop collecting these items and mark the object as incomplete.

Users will set the limit by defining the OSCAP_PROBE_MAX_COLLECTED_ITEMS environment variable.
By default, the limit isn't set which means the probes will collect all the items matching the evaluated OVAL object.

Setting this limit can be useful in some situations in which the produced results are too large and cause memory problems or inability to generate the HTML report.

Related to: https://issues.redhat.com/browse/RHEL-11925, https://issues.redhat.com/browse/RHEL-4141

This change isn't a silver solution of these tickets. It's rather one of the measures that could improve our situation in combination with other measures both on the scanner and on the content side.

This commit extracts the code that marks the item produced by
probe as incomplete to a static function _mark_collected_object_as_incomplete
so that it can be reused in other code later
If a single OVAL object matches more than 10000 items, we will
stop collecting these items and mark the object as incomplete.
This should prevent some situations in which the produced results
are too large and cause memory problems or inability to generate
the HTML report.
This commit adds a simple test that covers a situation when
the probe collects more items than the limit for a single OVAL object.
The `probe_item_collect` function can fail and it can return various
return codes. But, the return codes aren't handled in the caller
function `process_file`. That means the `process_file` continues to
process items even if no other items can be collected due to memory
constraints or other problems. With this commit, we will read the return
code of `probe_item_collect` and react accordingly.
@evgenyz
Copy link
Contributor

evgenyz commented Nov 17, 2023

This won't fix the original problem (XSLT will still have problems if OVAL author concatenates multiple results), and on top of that it will introduce failing tests where HTML is not rendered at all (OCP4 Compliance Operator).

In my opinion a correct fix would modify XSLT template or, at least, limit amount of rendered objects.

@evgenyz
Copy link
Contributor

evgenyz commented Nov 17, 2023

Even if this functionality could be useful for someone in some cases, I'm strongly against putting it as a default behavior: OVAL does not define any kind of limit, which makes this an extremely unexpected feature without a control knob.


#include "probe.h"
#include "icache.h"
#include "_sexp-ID.h"

#define PROBE_ITEM_COLLECT_MAX 1000
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here was used 1.000 but in the commit description is mentioned 10.000. Also, do you have information about how much memory one item consumes? I would like to understand how this proposed limit was calculated.

@marcusburghardt
Copy link
Member

Even if this functionality could be useful for someone in some cases, I'm strongly against putting it as a default behavior: OVAL does not define any kind of limit, which makes this an extremely unexpected feature without a control knob.

I see value in this feature since there are valid cases it can be used. But I also agree that introducing it as a default behavior might not be ideal. Can we introduce this feature as a command option, with no limit by default? I believe it would be easier for users to use this option when necessary.

@jan-cerny
Copy link
Member Author

This won't fix the original problem (XSLT will still have problems if OVAL author concatenates multiple results), and on top of that it will introduce failing tests where HTML is not rendered at all (OCP4 Compliance Operator).

My understanding of the original problem in https://issues.redhat.com/browse/RHEL-11925 is that the ARF tree produced by oscap which is the input of the XSLT template is so huge that libxslt has problems processing it.

I think that introducing a limit for the collected items will directly reduce the size of the produced ARF, so it will lower the chance that libxslt won't be able to process the tree.

You're right that it won't exactly fix the problem, It also doesn't aim to fix it by this PR, but the aim is to mitigate it or make it happening less often. We're running multiple efforts that could improve the situation when applied together, eg. #2052, changes of well-known rules in ComplianceAsCode/content.

I agree with you that XSLT will still have problems if OVAL definition uses multiple objects that lead to a large item collection. In that situation, I think having the limit still could help mitigate the issue. For example, if a definition contains 3 objects and each of them corresponds to 100000 items, then we would have 300000 items if no limit is in place but only 3000 items if there is a limit 1000 per object. Another aspect is the evaluated XCCDF profile and the amount of XCCDF rules in the profile and the nature of their individual OVAL checks. If we have a "bad chance" and we have many rules with these OVAL objects, the output will be reduced but it won't be sufficient. A different solution would be a global limit, ie. limit that doesn't apply per object but on the whole evaluation run.

Regarding the introduction of failing tests, I believe it depends on how specific OVAL definition is constructed. Can you elaborate on what you have in mind?

In my opinion a correct fix would modify XSLT template or, at least, limit amount of rendered objects.

The amount of rendered items in HTML report is limited to 100 items per OVAL object. See

<!-- table body (possibly item-type-specific) -->
<!-- limited to 100 lines -->
<tbody>
<xsl:for-each select='$items'>
<xsl:if test="not(position() > 100)">
<xsl:for-each select='key("oval-items", @item_id)'>
<xsl:apply-templates select='.' mode='item-body'/>
</xsl:for-each>
</xsl:if>
</xsl:for-each>
</tbody>
</table>
<xsl:if test="count($items) > 100">
... and <xsl:value-of select="count($items)-100"/> more items.
</xsl:if>

Even if this functionality could be useful for someone in some cases, I'm strongly against putting it as a default behavior: OVAL does not define any kind of limit, which makes this an extremely unexpected feature without a control knob.

I really like the idea to make the limit configurable in run time, eg. by command line option.

@evgenyz
Copy link
Contributor

evgenyz commented Nov 20, 2023

Regarding the introduction of failing tests, I believe it depends on how specific OVAL definition is constructed. Can you elaborate on what you have in mind?

OCP Compliance Operator (for example) does not render HTML and does not care about XSLT problems. If this feature will be enabled by default OCP evaluation results could become incomplete instead of passing.

In my opinion a correct fix would modify XSLT template or, at least, limit amount of rendered objects.

What I mean here is to try and throw out elements from the tree before trying to render them with XSLT instead of not collecting them at all. With this approach the ARF and XCCDF results would stay correct.

Even if this functionality could be useful for someone in some cases, I'm strongly against putting it as a default behavior: OVAL does not define any kind of limit, which makes this an extremely unexpected feature without a control knob.

I really like the idea to make the limit configurable in run time, eg. by command line option.

Could it be better to actually use OSCAP_PROBE_MAX_COLLECTED_OBJECTS env. variable (or smth like that)? This is the way we usually influence the behaviour of OVAL probes.

Users can limit the amount of items collected by setting the
`OSCAP_PROBE_MAX_COLLECTED_ITEMS` environment variable.
By default, the amount of items is unlimited.
@evgenyz
Copy link
Contributor

evgenyz commented Nov 22, 2023

@jan-cerny We do have quite a lot of env. variables that could affect oscap behavior. What do you think about having a special section somewhere in the main module, that would print all of them on some debug level? It might be useful to know all the input just from the log, without asking extra questions. The same, actually, goes for the command line arguments.

@jan-cerny
Copy link
Member Author

@evgenyz Yes, great idea! I believe it would be useful especially if the log output is attached to bug reports. Can you create a ticket for that?

@jan-cerny
Copy link
Member Author

I think the fail on Rawhide is caused by removal of xmlParseMemory from libxml2, I have found this thread: https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org/thread/AI7QXMRXYQ3KAZTLIGICEX6YHFTM5RBR/

@jan-cerny jan-cerny marked this pull request as draft November 22, 2023 10:33
@jan-cerny jan-cerny added this to the 1.3.10 milestone Nov 22, 2023
@jan-cerny
Copy link
Member Author

What I mean here is to try and throw out elements from the tree before trying to render them with XSLT instead of not collecting them at all. With this approach the ARF and XCCDF results would stay correct.

I think removing elements from ARF before passing the ARF to XSLT would be in practice implemented by XSLT or a combination of XPath and DOM operations. I can imagine that this approach could lead a similar problem with memory issues.

Currently, OpenSCAP internally creates an ARF XML tree, applies XSLT on that to create HTML report, saves the HTML to a file and then saves the ARF file.

I think we can easily limit the amount of the items in the ARF during the serialization into OVAL results DOM. I look into that and it could be done by modifying oval_syschar_model_to_dom and oval_syschar_to_dom. See /src/OVAL/oval_sysModel.c around line 360. Due to the way how it works now it would cause a problem that also ARF file would be incomplete, which you probably don't want. But we can avoid it by creating a dedicated ARF tree just for the purpose of HTML creation, ie. first a full ARF tree would be created and saved to file and then another tree with reduced items would be created passed to XSLT to create a HTML output.

@jan-cerny
Copy link
Member Author

I have added a option to configure this using an environment variable OSCAP_PROBE_MAX_COLLECTED_ITEMS.

I also have extracted the proposal by @evgenyz from here to #2058.

@jan-cerny jan-cerny marked this pull request as ready for review November 24, 2023 10:00
@jan-cerny
Copy link
Member Author

The CI fail on Rawhide is tracked in #2059 and isn't caused by the contents of PR.

Copy link
Contributor

@evgenyz evgenyz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The implementation is sound from my PoV. Just a couple of notes regarding style and names.

@@ -528,6 +529,31 @@ static int probe_cobj_memcheck(size_t item_cnt, double max_ratio)
return (0);
}

static int _mark_collected_object_as_incomplete(struct probe_ctx *ctx, const char *message)
{

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Redundant \n

@@ -30,6 +30,7 @@ add_subdirectory("curl")
add_subdirectory("CPE")
add_subdirectory("DS")
add_subdirectory("mitre")
add_subdirectory("memory")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The behavior alteration we are testing here is not exactly connected to memory. I'd rather name it like probe_behavior or oval_behavior or something more generic to make the folder a collection of tests related to these env. vars.

@evgenyz
Copy link
Contributor

evgenyz commented Nov 30, 2023

@evgenyz Yes, great idea! I believe it would be useful especially if the log output is attached to bug reports. Can you create a ticket for that?

#2063

The behavior alteration we are testing here is not exactly connected to
memory. I'd rather name it like probe_behavior to make the folder a
collection of tests related to these env. vars.
@jan-cerny
Copy link
Member Author

I have removed empty line and renamed "memory" directory to "probe_behavior".

@evgenyz evgenyz merged commit 826eeb7 into OpenSCAP:maint-1.3 Dec 5, 2023
18 of 20 checks passed
@evgenyz evgenyz mentioned this pull request Apr 2, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants