-
Notifications
You must be signed in to change notification settings - Fork 455
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[FR][DAC] Add Support for Custom Schemas #3679
base: DAC-feature
Are you sure you want to change the base?
[FR][DAC] Add Support for Custom Schemas #3679
Conversation
We should assess / consider a few things:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I have a good solution for the data view consideration (which only now becomes relevant since built in rules will never use them, only indexes).
Within QueryRuleData
, we can add a new property index_or_dataview -> list[str]
which returns whichever is set. The reason for making it a list even though dataview is a string is because we always iterate, so it saves us from doing a type check every time.
Then, we will need to search for every use of data.index
(or rule.contents.data.index
and all variations) and replace them with this new property where it makes sense.
Additionally, in #3510, it should have also included a validation check to ensure only index or dataview are set and not both
@validates_schema
def validates_query_data(self, data, **kwargs):
...
# add check here
This should probably be done in main
and backported to this branch.
If needed, you can convert this comment to an issue to track.
detection_rules/ecs.py
Outdated
if schema_path.is_file(): | ||
custom_schema_dump.update(eql.utils.load_dump(str(schema_path))) | ||
elif schema_path.is_dir(): | ||
custom_schema_dump.update(load_schemas_from_dir(schema_path)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we enforce only schema files vs a dir of schemas (requiring explicit paths for all). Unless you have an example where this would be used
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Personally, I like the idea of being able to specify a directory of schemas as I think it could be more convenient for CI.
For example, if a customer has schema files that have dynamic names, e.g. my_schmea_v1.1, that could change to my_schema_1.2, they would not have to update their config file each time the name changed. Instead they could leave the config file the same and just specify a directory where my_schema_* would be located instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
but with the way it is written, won't it merge all the schemas together?
def load_schemas_from_dir(schema_dir: Path) -> dict:
"""Load all schemas from a directory."""
schemas_dump = {}
for file_path in schema_dir.iterdir():
if file_path.is_file() and file_path.suffix in [".json", ".yaml", ".yml"]:
schemas_dump.update(eql.utils.load_dump(str(file_path)))
return schemas_dump
so if I had a dir with random schemas, it would merge them all into one?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes it would have that behavior, merging them all into one. Maybe I am misunderstanding. Originally, I thought the question you were proposing was comparing someone being able to write:
8.14.0:
beats: 8.12.2
ecs: 8.11.0
endgame: 8.4.0
custom_one: schemas/custom/forte-schema_1.json
custom_two: schemas/custom/forte-schema_2.json
...
vs
8.14.0:
beats: 8.12.2
ecs: 8.11.0
endgame: 8.4.0
custom: schemas/custom
In which case I would think we would want to support either? Otherwise I am not sure I follow what you are asking?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For flexibility, I like the ability to specify independent schemas or a folder of schemas. Definitely helps limit modifications to the config.
Made an issue to track this independently 👍 #3697 |
detection_rules/custom_schemas.py
Outdated
stack_schema_map = RULES_CONFIG.stack_schema_map[stack_version] | ||
|
||
for schema, value in stack_schema_map.items(): | ||
if schema not in ["beats", "ecs", "endgame"]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what happens if it is one of these three?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it should ignore the schema if it is one of those, as those are reserved "schema words" that are by definition not custom. Do you agree?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes I think that is the smart approach, but wont this return an empty dict in this case? It would be better to raise an error, forbidding using reserved words. Thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that is a good idea, we will need to have it function such that it will allow the words once, as we would want them to be able to specify beats, etc. or to not use those if desired. But we would not want them to use multiple as the result would be confusing 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Upon further testing and reflection, I think the test for additional beats, ecs, or endgame schemas should not be in the custom_schemas.py
as it might be confusing to have the custom schema loader be validating non-custom schemas. I think this check would be considered increased validation for the schema map in general. As such, I think this could target main, as the check is not DAC specific.
Will be adding documentation updates to show BYOS features. |
Co-authored-by: Mika Ayenson <Mikaayenson@users.noreply.github.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
Issues
#3618
Summary
This PR adds the ability to specify custom schema files (or directories) via
stack-schema-map.yaml
e.g.
Details
Example Rule using eric-index
Basic Testing
At a minimum one should test being able to run
view-rule
against both an EQL and a KQL rule using custom index. Additionally, different min stack versions should be used to test that the custom schema is being applied to the correct stack version.Example
Additional Testing
With your custom schema and rules perform the following tests.
Note, these may not all pass, as I am not sure if we have tried them all on the DAC-feature branch to catch any other bugs, but before merging we should either remove extraneous tests for this list or address why they are failing.
view-rule
on several different integrations and try with multiplemin_stack
versions.python -m detection_rules view-rule rules/integrations/google_workspace/collection_google_drive_ownership_transferred_via_google_workspace.toml
python -m detection_rules view-rule rules/integrations/aws/collection_cloudtrail_logging_created.toml
python -m detection_rules view-rule rules/integrations/azure/initial_access_consent_grant_attack_via_azure_registered_application.toml
python -m detection_rules view-rule rules/apm/apm_403_response_to_a_post.toml
show-latest-compatible
to verify command shows latest compatible version.python -m detection_rules dev integrations show-latest-compatible -p apm -s 8.2.0
build-schemas
andbuild-manifest
commands to test rebuilding the schemas and manifestpython -m detection_rules dev integrations build-schemas -o
python -m detection_rules dev integrations build-manifests -o
make test
to ensure all unit tests pass.