-
Notifications
You must be signed in to change notification settings - Fork 80
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: Add cum_count_*()
operations to update_by
and Numeric
library
#6270
base: main
Are you sure you want to change the base?
Conversation
engine/table/src/main/java/io/deephaven/engine/table/impl/updateby/UpdateByOperatorFactory.java
Outdated
Show resolved
Hide resolved
…zero, nan, inf, finite cases.
cum_count()
operation to update_by
cum_count_*()
operations to update_by
cum_count_*()
operations to update_by
cum_count_*()
operations to update_by
cum_count_*()
operations to update_by
cum_count_*()
operations to update_by
and Numeric
library
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did not take a look at anything except TableSpec stuff.
import org.immutables.value.Value.Immutable; | ||
|
||
/** | ||
* A {@link UpdateBySpec} for performing a Cumulative Count of non-null values in the specified columns. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"non-null", seems incorrect.
public enum CumCountType { | ||
NON_NULL, NULL, NEGATIVE, POSITIVE, ZERO, NAN, INFINITE, FINITE | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just pointing out (I think we do this in a couple other places), having values that are only applicable for some numeric types is a bit of a "code smell" - but something I guess we are "okay" with.
I'm worried about double logic; does ZERO include -0.0? Is -0.0 counted for negative?
Instead of an enum, I think we might want an EnumSet here where the user could say "I want to count NULL | NAN
, or NULL | NAN | INFINITE
, etc (and then, NON_NULL
doesn't exist, it would just be represented by the lack of NULL
).
If we really wanted to be more generally, we could even consider using a Filter/Expression
- of course, we could limit it the scope of what would be allowable in this context, but it could give us room in the future to support more advance counts like "count number of odd longs", etc.
* @param pairs The input/output column name pairs | ||
* @return The aggregation | ||
*/ | ||
static UpdateByOperation CumCount(String... pairs) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think having a method for per enum is not desirable... just expose enum (or, however it is configured) to the caller?
Adds new operation to
update_by
to output the cumulative count of values encountered.cum_count_all
- cumulative count of all values (including null)cum_count_non_null
- cumulative count of non-null valuescum_count_null
- cumulative count of null valuescum_count_neg
- cumulative count of non-null negative valuescum_count_pos
- cumulative count of non-null positive valuescum_count_zero
- cumulative count of zero valuescum_count_nan
- cumulative count of NaN valuescum_count_inf
- cumulative count of +Inf/-Inf valuescum_count_finite
- cumulative count of non-null finite valuesExample Code
Groovy
Python