[Bug] KQL fails to parse brackets and wildcards correctly #3582

Open
saiiman opened this issue Apr 7, 2024 · 0 comments · May be fixed by #3605
saiiman commented Apr 7, 2024

Hello,

Describe the bug
When converting a KQL query to a DSL query with the converter in lib/kql, I noticed two conversion errors.

  1. A wildcard is converted to a query_string clause instead of a wildcard clause. (I'm not sure whether this is intentional, but otherwise the query did not work for me.)
  2. Brackets in the KQL query, which group certain conditions, are converted incorrectly.

To Reproduce

query = """
    host.name: \"foo\" 
    and 
    source.ip: \"10.10.0.10.\" 
    and not 
    user.name : bar* 
    and not (
        destination.name : \"some_name\" 
        and 
        destination.ip : \"20.20.0.20\"
    ) 
    and not 
    another.value : \"true\"
"""

print(KqlParser.to_dsl(query))

The current output is

{
  "bool": {
    "filter": [
      {
        "match": {
          "host.name": "foo"
        }
      },
      {
        "match": {
          "source.ip": "10.10.0.10."
        }
      }
    ],
    "must_not": [
      {
        "query_string": {
          "fields": [
            "user.name"
          ],
          "query": "bar*"
        }
      },
      {
        "match": {
          "destination.name": "some_name"
        }
      },
      {
        "match": {
          "destination.ip": "20.20.0.20"
        }
      },
      {
        "match": {
          "another.value": "true"
        }
      }
    ]
  }
}

Expected behavior
The expected output is

{
  "bool": {
    "filter": [
      {
        "match": {
          "host.name": "foo"
        }
      },
      {
        "match": {
          "source.ip": "10.10.0.10."
        }
      }
    ],
    "must_not": [
      {
        "wildcard": {             # use wildcard keyword
          "user.name": {
            "value": "bar*"
          }
        }
      },
      {
        "bool": {                 # use nested bool-filter keywords
          "filter": [
            {
              "match": {
                "destination.name": "some_name"
              }
            },
            {
              "match": {
                "destination.ip": "20.20.0.20"
              }
            }
          ]
        }
      },
      {
        "match": {
          "another.value": "true"
        }
      }
    ]
  }
}
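
To illustrate why the flattened must_not changes the meaning of the query, here is a minimal sketch. The `matches` function is a hypothetical toy evaluator written only for this example (it is not part of lib/kql or Elasticsearch) and assumes that `match` means exact equality and that only `filter` and `must_not` occur.

```python
def matches(query, doc):
    """Toy evaluator: supports only bool/filter/must_not and exact-equality match."""
    if "match" in query:
        (field, value), = query["match"].items()
        return doc.get(field) == value
    if "bool" in query:
        clauses = query["bool"]
        return (all(matches(q, doc) for q in clauses.get("filter", []))
                and not any(matches(q, doc) for q in clauses.get("must_not", [])))
    raise ValueError(f"unsupported query: {query}")


name = {"match": {"destination.name": "some_name"}}
ip = {"match": {"destination.ip": "20.20.0.20"}}

# Current output: the group is flattened, i.e. NOT name AND NOT ip.
flattened = {"bool": {"must_not": [name, ip]}}
# Expected output: the group stays nested, i.e. NOT (name AND ip).
nested = {"bool": {"must_not": [{"bool": {"filter": [name, ip]}}]}}

# A document that matches only one of the two bracketed conditions.
doc = {"destination.name": "some_name", "destination.ip": "1.2.3.4"}

print(matches(flattened, doc))  # False: the document is wrongly excluded
print(matches(nested, doc))     # True: the document is kept, as the KQL intends
```

A document that satisfies only one of the two bracketed conditions is excluded by the flattened form but kept by the nested form, so the two outputs are not equivalent.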

Suggested solution

  1. For the wildcard, the following line of code
    return lambda field: {"query_string": {"fields": [field], "query": tree.value}}
  should be changed to
    return lambda field: {"wildcard": {field: {"value": tree.value}}}
  2. For the missing brackets, I have traced the transformation back to the following line of code:
    if list(child) == ["bool"] and list(child["bool"]) in (["filter"], ["must"]):
  As a suggestion, I would propose an additional check on the number of elements in the clause, so that only a single-element filter or must is flattened:
    if list(child) == ["bool"] and list(child["bool"]) in (["filter"], ["must"]) and \
            len(next(iter(child["bool"].values()))) == 1:
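
The two suggestions can be sketched as standalone functions. This is only an illustration under assumptions: `wildcard_clause` and `negate` are hypothetical names and do not mirror the actual structure of the converter in lib/kql. `negate` takes an already converted DSL node and wraps it in must_not.

```python
def wildcard_clause(field, value):
    """Build a wildcard clause in the shape shown in the expected output."""
    return {"wildcard": {field: {"value": value}}}


def negate(child):
    """Wrap a converted DSL node in must_not.

    A filter/must holding exactly one clause is unwrapped. Larger groups stay
    nested so that NOT (a AND b) is not turned into NOT a AND NOT b.
    """
    if list(child) == ["bool"] and list(child["bool"]) in (["filter"], ["must"]):
        (clauses,) = child["bool"].values()
        if isinstance(clauses, list) and len(clauses) == 1:
            return {"bool": {"must_not": clauses}}
    return {"bool": {"must_not": [child]}}


a = {"match": {"destination.name": "some_name"}}
b = {"match": {"destination.ip": "20.20.0.20"}}

single = negate({"bool": {"filter": [a]}})    # unwrapped: must_not holds [a]
group = negate({"bool": {"filter": [a, b]}})  # kept nested inside must_not
print(single)
print(group)
print(wildcard_clause("user.name", "bar*"))
```

Checking the length of whichever clause is actually present avoids calling len() on a default value for the clause that is missing.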

Thank you.
