Allow ActivityWatch to Ignoring / Filtering #302

NathanaelA · 2022-09-18T04:05:42Z

@ErikBjare @johan-bjareholt - Actually when it came to implementing it -- several things changed from my initial idea that I proposed in ActivityWatch/activitywatch#1

I ended up using the brand new KeyValue table in 0.12.0, So basically in the keyvalue table you can add a new key called IGNORE_FILTERS with the json value of: {"APPS": ["app1", "app2", ...], "TITLES": ["title1", "title2", ...]}
The DataStore module on startup loads this JSON key, parses it into two vectors that it then uses during inserts and replace functions on the event table. If any of the values you put into the TITLES key sub-string matches the Event.title it skips adding it to the database. Ditto with the Event.app against the APPS vector. This is a substring match so if you put "BRAV" in the APPS then it will filter out any app that contains BRAV anywhere in the app name. (i.e. BRAVE, BRAVO, ABRAVING, etc...

At this point I DO NOT have any GUI to add this value, you just use SQLite (or perhaps a POST to / which acts like a KeyValue "set") but at this point the process works.

raudabaugh · 2022-11-05T21:31:36Z

Thanks for this! It's working really well for me. :)

Some things to consider:

aw-watcher-window has additional filtering logic specific to macOS/Chrome here. I think it makes sense to drop such platform-specific cases and keep filtering responsibility solely within aw-server-rust.
For my use case, I'd prefer that AW still adds the events to the database while redacting the title. I implemented it this way here.

johan-bjareholt

I like the idea of using the keystore to save the ignores for titles and apps. But there are some other rough edges that needs to be addressed.

johan-bjareholt · 2022-11-06T21:30:46Z

aw-datastore/src/lib.rs

@@ -1,3 +1,4 @@
+#![feature(option_result_contains)]


Please don't add any more nightly features, our aim is for aw-server-rust to be able to run on stable.

I'm not 100% sure what you are referring too, using XXX.contains afaik, is what caused the lib.rs to be updated to add #![feature(option_result_contains)]. And not sure how that stops this from being run on "Stable" -- I've been running this version of the server since I pushed my PR...

Is there a better way to see if a string contains another string that won't trigger rust to auto-add that [feature] flag to the lib.rs file if this is an actual issue that I'm not understanding?

With stable I mean "rust stable", the toolchain version. We currently require rust nightly because we have one dependency which requires it (which we have plans to get rid of).

I'm not sure what you mean that rust "auto-adds" this to lib.rs, rust itself can't change the source so either you have added it youself or you have a really invasive IDE or something.

I have not used contains before, but it seems to just be syntatic sugar. You just need to add another "match" to unpack the option/result, and if it was some/ok you can just do "x == y" as usual.

Is there a better way to see if a string contains another string

That's what String::contains does, this feature is in regards to Result::contains or Option::contains. So it's possible you have a bug here in case you are not using String::contains, since the other two checks for full equality rather than for a "substring".

I'm guessing this was added by my IDE then. When I have time, I'll try and remove the feature flag and see what happens, maybe the IDE guessed wrong... 😀

I'm not following you completely on the potential .contains replacement. This isn't a == situation.

The .contains() is the code here:
if app.contains(&key) { and here if title.contains(&key) {
Basically it is equivalent to Javascript title.indexOf(key) !== -1 where we are seeing if this specific key value is part of the app name or title. This way if the key says "BROWSER" then if you are running an app called "INTERNET BROWSER" or "BROWSER OPERA" or "THE BROWSER IS AWESOME" all three would be ignored because it found that the string had BROWSER in it somewhere... It is does string contain another string. I'm not a rust expert by any means, but if you have a better way to see if string A contains string B then I'm willing to switch to it... 😀

I think it should be possible to continue using contains as-is in the code here and just remove the nightly flag.

johan-bjareholt · 2022-11-06T21:33:26Z

aw-datastore/src/datastore.rs

            first_init,
            db_version,
        };
        ds.get_stored_buckets(&conn)?;
+        ds.get_stored_ignores(&conn)?;


So this is only done on startup? So if you ever change the list of apps/titles to ignore, you have to restart aw-server? Does not seem optimal.

Yes, this is a prototype to see how well it works. I'm using it great and my list never changes so far. I suspect just like in @raudabaugh code modification the list is generated normally once and your good to go. This list should rarely change... However, I agree that if we add some UI in front, that somehow we need to trigger the reload of the values on its change...

In addition I do believe Sam has a valid use case of rather than Ignoring the entries to use Redacted so this code should be enhanced to have Redacted_apps and Redacted_titles in addition to ignored as both are valid use cases.

johan-bjareholt · 2022-11-06T21:39:26Z

aw-datastore/src/datastore.rs

@@ -459,7 +490,32 @@ impl DatastoreInstance {
                )))
            }
        };
-        for event in &mut events {
+
+        'event: for event in &mut events {


It should not be the datastores responsibility to ignore apps/titles imo. It should not manipulate any data. I think it's better to have this logic in the rest endpoint for insert_event and heartbeat.

I'll have to re-look but I could have sworn there was more than just insert_event and heartbeat that could trigger the insert, which is why I was trying to consolidate the code only in two paths rather than have multiple places for the code... But maybe I miss read the code and only insert_event and heardbeat call these two datastore code paths...

johan-bjareholt · 2022-11-06T21:41:48Z

aw-datastore/src/datastore.rs

@@ -573,6 +629,26 @@ impl DatastoreInstance {
        bucket_id: &str,
        event: &Event,
    ) -> Result<(), DatastoreError> {
+
+        let app_exists = event.data.get("app");


Copy-paste of the same code above. Better to put it in a generic function. Unittests also wouldn't hurt.

Agreed, this should be made into a generic gunction -- the second path was because in my original implementation I found there was the second path that also created data entries (heartbeat)... So I duplicated the code to verify it would cover all the cases... Revamping the code is fine and I will check into moving this up to a higher layer, I just want to make sure I don't miss any code paths that plugins or other entities can call that will generate this data....

real-or-random · 2023-02-21T08:39:40Z

@NathanaelA Are you planning to continue working on this?

NathanaelA · 2023-02-21T18:29:47Z

@real-or-random when I have some of that spare time that people talk about... It is an low priority as this PR works great for my use-case.

2e3s · 2023-10-01T23:15:28Z

I made a similar feature in Linux Idle&Window watcher, but with regexp replacements. Given so many various watchers for ActivityWatch, it seems that the general rule may better be to keep watchers choosing what to send rather than implement the filtering and processing to make the server decide on what to store in DB.
+ This would also give extra security, because less unwarranted data travel through the network.
+ Also it allows to preview easily the private data without storing it on the server (if not sure of what app ID and title will be used).

johan-bjareholt · 2023-10-02T06:35:15Z

Given so many various watchers for ActivityWatch, it seems that the general rule may better be to keep watchers choosing what to send rather than implement the filtering and processing to make the server decide on what to store in DB.

That's the big reason why implementing it on the server would make sense, all watchers could get support for filtering without having to modify all of them to support this feature. If you would want to support filtering in all watchers separately, we would have to implement this feature at least 3 times (once for aw-client-python, once for aw-client-rust, once for aw-client-js) as well as maintaining it.

I'd also argue that the previewing would be more user friendly if done on the server-side, as then you could easily support that in the web-ui.

real-or-random · 2023-10-02T10:01:43Z

If you would want to support filtering in all watchers separately, we would have to implement this feature at least 3 times (once for aw-client-python, once for aw-client-rust, once for aw-client-js) as well as maintaining it.

I agree, and let me add having separate filters could be very surprising to user. And surprises are particularly bad when it comes to privacy features. For example, suppose the user has configured some filters, but then adds a new watcher, and suddenly the filters are not effective anymore.

edit: I still can imagine that some watchers would filter certain entries by default (e.g., in a browser watcher, one could ignore websites visited in private mode). But applying manual filters should be done on the server side.

2e3s · 2023-10-02T13:29:43Z

all watchers could get support for filtering without having to modify all of them to support this feature

Don't all watchers use their own keys? I clicked a couple random ones from https://docs.activitywatch.net/en/latest/watchers.html and looks like it. This code in PR looks specifically for app and title which is a realm of the window watcher, which is pretty much the only one.

I'd also argue that the previewing would be more user friendly if done on the server-side

It would be more GUI oriented indeed, but the event would be stored on the server then, while the whole point of the feature is to prevent that. For instance, I don't know how bitcoin wallet is reported in terms of app ID and title, and I don't want to keep in the history the usage of it. Client-based preview is trivial, but server-one would be not.

johan-bjareholt · 2023-10-03T12:48:11Z

but the event would be stored on the server then, while the whole point of the feature is to prevent that. For instance, I don't know how bitcoin wallet is reported in terms of app ID and title, and I don't want to keep in the history the usage of it.

It would be stored the first times when you test your filter yes, but since you then have a filter that works you could easily use that same filter to remove those events from the server.

I think it's important that we try to keep as many features as possible accessible through a GUI. We want to avoid excluding users just because they are not tech savvy when possible.

Don't all watchers use their own keys? I clicked a couple random ones from https://docs.activitywatch.net/en/latest/watchers.html and looks like it. This code in PR looks specifically for app and title which is a realm of the window watcher, which is pretty much the only one.

I agree that this is not desirable how it works in this PR and should be made more generic.

Allow ActivityWatch Ignoring

921791b

johan-bjareholt reviewed Nov 6, 2022

View reviewed changes

ErikBjare force-pushed the master branch from 98adff1 to 1219744 Compare November 22, 2023 11:08

dennisorlando mentioned this pull request Feb 4, 2024

Add functionality to redact/filter sensitive data ActivityWatch/activitywatch#1

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Allow ActivityWatch to Ignoring / Filtering #302

Allow ActivityWatch to Ignoring / Filtering #302

NathanaelA commented Sep 18, 2022

raudabaugh commented Nov 5, 2022 •

edited

johan-bjareholt left a comment

johan-bjareholt Nov 6, 2022

NathanaelA Nov 6, 2022

johan-bjareholt Nov 7, 2022

NathanaelA Nov 7, 2022

johan-bjareholt Nov 8, 2022

johan-bjareholt Nov 6, 2022

NathanaelA Nov 6, 2022

johan-bjareholt Nov 6, 2022

NathanaelA Nov 6, 2022

johan-bjareholt Nov 6, 2022

NathanaelA Nov 6, 2022

real-or-random commented Feb 21, 2023

NathanaelA commented Feb 21, 2023

2e3s commented Oct 1, 2023 •

edited

johan-bjareholt commented Oct 2, 2023

real-or-random commented Oct 2, 2023 •

edited

2e3s commented Oct 2, 2023

johan-bjareholt commented Oct 3, 2023

Allow ActivityWatch to Ignoring / Filtering #302

Are you sure you want to change the base?

Allow ActivityWatch to Ignoring / Filtering #302

Conversation

NathanaelA commented Sep 18, 2022

raudabaugh commented Nov 5, 2022 • edited

johan-bjareholt left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

real-or-random commented Feb 21, 2023

NathanaelA commented Feb 21, 2023

2e3s commented Oct 1, 2023 • edited

johan-bjareholt commented Oct 2, 2023

real-or-random commented Oct 2, 2023 • edited

2e3s commented Oct 2, 2023

johan-bjareholt commented Oct 3, 2023

raudabaugh commented Nov 5, 2022 •

edited

2e3s commented Oct 1, 2023 •

edited

real-or-random commented Oct 2, 2023 •

edited