Parsers & tagging for M365 Defender portal events #4794

Open
wants to merge 6 commits into
base: main
from

Conversation

dafneb

@dafneb dafneb commented Jan 27, 2024

One line description of pull request

Parser for events and activities exported from Microsoft 365 Defender portal.

Description:

  • Created parser for events from Activity log.
  • Created parser for events exported from Advanced Hunting (DeviceEvents, DeviceProcessEvents, DeviceNetworkEvents, DeviceImageLoadEvents, DeviceFileEvents, DeviceLogonEvents, DeviceRegistryEvents, UrlClickEvents).
  • Created tags for defender events.
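As a rough illustration of the kind of input these parsers deal with, the sketch below reads an Advanced Hunting CSV export with Python's csv module and groups the rows by action type. It is not the code from this pull request: the helper name GroupRowsByActionType and the sample rows are hypothetical, and it assumes the export has a header row containing an ActionType column, as the DeviceFileEvents schema describes.

```python
import collections
import csv
import io


def GroupRowsByActionType(file_object):
  """Groups the rows of an Advanced Hunting CSV export by ActionType.

  Hypothetical helper for illustration only, not part of Plaso.

  Args:
    file_object (file): text file-like object containing the CSV export.

  Returns:
    dict[str, list[dict[str, str]]]: rows per action type.
  """
  rows_per_action = collections.defaultdict(list)
  for row in csv.DictReader(file_object):
    # A missing ActionType column yields None, which is mapped to ''.
    action_type = (row.get('ActionType') or '').strip()
    rows_per_action[action_type].append(row)
  return dict(rows_per_action)


# Made-up sample data that only uses a few of the DeviceFileEvents columns.
SAMPLE_EXPORT = (
    'Timestamp,DeviceName,ActionType,FileName\n'
    '2024-01-27T10:00:00.000Z,host1,FileCreated,a.txt\n'
    '2024-01-27T10:00:05.000Z,host1,FileDeleted,a.txt\n'
    '2024-01-27T10:01:00.000Z,host2,FileCreated,b.txt\n')

grouped_rows = GroupRowsByActionType(io.StringIO(SAMPLE_EXPORT))
for action_type, rows in sorted(grouped_rows.items()):
  print(action_type, len(rows))
```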

Related issue (if applicable):

Notes:

All contributions to Plaso undergo code review.
This makes sure that the code has appropriate test coverage and conforms to the
Plaso style guide.

One of the maintainers will examine your code, and may request changes. Check off the items below in
order, and then a maintainer will review your code.

Checklist:

  • Automated checks (GitHub Actions, AppVeyor) pass
  • No new dependencies are required or l2tdevtools has been updated
  • Reviewer assigned

@joachimmetz joachimmetz self-assigned this Jan 28, 2024

codecov bot commented Jan 28, 2024

Codecov Report

Attention: Patch coverage is 87.06294%, with 37 lines in your changes missing coverage. Please review.

Project coverage is 85.25%. Comparing base (ed8a139) to head (59fb297).
Report is 1 commit behind head on main.

Files                               Patch %   Lines
plaso/parsers/defender_hunting.py   88.31%    27 Missing ⚠️
plaso/parsers/m365_activitylog.py   81.81%    10 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #4794      +/-   ##
==========================================
+ Coverage   85.24%   85.25%   +0.01%     
==========================================
  Files         426      428       +2     
  Lines       38532    38818     +286     
==========================================
+ Hits        32847    33096     +249     
- Misses       5685     5722      +37     


@joachimmetz
Member

@dafneb I'll make some changes to make sure the code meets the style guide. I'll leave comments without tagging you in, consider them informational/educational.

from plaso.containers import events
from plaso.parsers import dsv_parser
from plaso.parsers import manager

Member

Per style guide use 2 empty lines.

"""M365 Activity log event data

Attributes:
timestamp (dfdatetime.DateTimeValues): Date and time when
Member

the additional white space does not match with rest of the code base.

DATA_FORMAT = 'M365 Activity log'

COLUMNS = (
'Event ID',
Member

indentation does not match style guide.

if timestamp == 'Date':
  return

activity = row.get('Category', None)
Member

@dafneb why only allow these categories?

if len(row) != self._MINIMUM_NUMBER_OF_COLUMNS:
  return False

# Check the date format
Member

this comment can be removed given the date value check is clear from the code and the reason for the check is in the function docstring.


def __init__(self, actiontype='event-action'):
  """Initializes event data."""
  self.DATA_TYPE = f'm365:defenderah:{actiontype}'  # pylint: disable=invalid-name
Member

this approach will make it hard to maintain, given it is now unclear which attribute is expected in which event data object/type.


def __init__(self, actiontype='event-action'):
  """Initializes event data."""
  self.DATA_TYPE = f'm365:defenderah:{actiontype}'  # pylint: disable=invalid-name
Member

Looks like the closest to the original data is to make this a single event data type https://learn.microsoft.com/en-us/microsoft-365/security/defender/advanced-hunting-devicefileevents-table?view=o365-worldwide
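
One possible reading of the two comments above, sketched under assumptions: instead of building DATA_TYPE from the action type at run time, each event data class declares a fixed DATA_TYPE and lists its attributes explicitly, so it stays clear which attribute belongs to which data type. The EventData base class below is a minimal stand-in written for this sketch rather than the Plaso class, and the class name, data type string and attribute names are hypothetical.

```python
class EventData(object):
  """Minimal stand-in for an event data base class (assumption for this sketch)."""

  DATA_TYPE = None

  def __init__(self, data_type=None):
    """Initializes event data with a data type."""
    self.data_type = data_type or self.DATA_TYPE


class DefenderDeviceFileEventData(EventData):
  """Hypothetical event data for rows of the DeviceFileEvents table.

  Attributes:
    action_type (str): value of the ActionType column.
    device_name (str): value of the DeviceName column.
    file_name (str): value of the FileName column.
    timestamp (str): value of the Timestamp column.
  """

  # Fixed at class level, so formatters and tests can rely on a single value.
  DATA_TYPE = 'm365:defenderah:devicefileevents'

  def __init__(self):
    """Initializes event data."""
    super().__init__(data_type=self.DATA_TYPE)
    self.action_type = None
    self.device_name = None
    self.file_name = None
    self.timestamp = None


event_data = DefenderDeviceFileEventData()
event_data.action_type = 'FileCreated'
print(event_data.data_type, event_data.action_type)
```

With this layout the action type becomes a regular attribute of the event data and no longer determines the data type.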

if not tmp_action in self._ACTIVITIES:
  return

# pylint: disable=line-too-long
Member

don't

@joachimmetz
Member



class DefenderDeviceEventData(events.EventData):
"""Defender DeviceFileEvents event data.
Member

'InitiatingProcessParentFileName',
'InitiatingProcessParentCreationTime')

_ACTIVITIES = {
Member

the Microsoft documentation appears to refer to this as action type

"""
try:
  tmp_row = dict((k.lower().strip(), v) for k,v in row.items())
  tmp_action = tmp_row['actiontype'].lower().strip()
Member

why lower and strip this a second time?

tmp_row = dict((k.lower().strip(), v) for k,v in row.items())
tmp_action = tmp_row['actiontype'].lower().strip()

if not tmp_action in self._ACTIVITIES:
Member

this can be solved with a single get, given later the same look up is performed
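
A minimal sketch of what the single get could look like, assuming _ACTIVITIES is a dictionary. The mapping values and both function names are hypothetical, since the contents of _ACTIVITIES are not shown in this review.

```python
# Hypothetical mapping, the real contents of _ACTIVITIES are not shown here.
_ACTIVITIES = {
    'filecreated': 'File created',
    'filedeleted': 'File deleted'}


def GetActivityWithTwoLookups(action_type):
  """Membership test followed by a second look up, as in the reviewed code."""
  if not action_type in _ACTIVITIES:
    return None
  return _ACTIVITIES[action_type]


def GetActivityWithSingleGet(action_type):
  """Single look up, None signals an unsupported action type."""
  return _ACTIVITIES.get(action_type, None)


for action_type in ('filecreated', 'filedeleted', 'unknownaction'):
  print(action_type, GetActivityWithSingleGet(action_type))
```

Both functions return the same result for every key. The second one searches the dictionary once, and the caller only has to test the result against None.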

row (dict[str, str]): fields of a single row, as specified in COLUMNS.
"""
try:
  tmp_row = dict((k.lower().strip(), v) for k,v in row.items())
Member

@dafneb why lower case the values? have you seen cases where the csv file is not in the casing defined by the AH schema?

@joachimmetz
Member

Might be useful to keep notes about the format and queries somewhere. Started https://github.com/forensicswiki/wiki/pull/223/files

# Microsoft Defender Activity Log
data_type: 'm365:activitylog:event'
attribute_mappings:
- name: 'recorded_time'
Member

This attribute is defined as timestamp in corresponding EventData object
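
A mismatch like this could be caught with a small consistency check. The sketch below is hypothetical and not part of Plaso: the YAML definition is represented as an already loaded dictionary, and M365ActivityLogEventData is a stand-in class that only defines the timestamp attribute mentioned in the comment above.

```python
class M365ActivityLogEventData(object):
  """Stand-in event data class, the name is an assumption for this sketch."""

  DATA_TYPE = 'm365:activitylog:event'

  def __init__(self):
    """Initializes event data."""
    self.timestamp = None


def FindUnknownAttributes(event_data, attribute_mappings):
  """Returns the mapped attribute names the event data does not define."""
  return [
      mapping['name'] for mapping in attribute_mappings
      if not hasattr(event_data, mapping['name'])]


# The definition from the review, as it would look after loading the YAML.
DEFINITION = {
    'data_type': 'm365:activitylog:event',
    'attribute_mappings': [{'name': 'recorded_time'}]}

unknown_attributes = FindUnknownAttributes(
    M365ActivityLogEventData(), DEFINITION['attribute_mappings'])
print(unknown_attributes)
```

Here the check reports recorded_time, because the stand-in event data only defines timestamp.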

2 participants