process_iter()'s filter argument #1401

giampaolo · 2019-02-01T16:41:56Z

Example:

>>> psutil.process_iter(filter=lambda p: p.name() == 'python')

This will make it easier to write more compact code and one-liners (e.g. see doc).
It will also help with writing more efficient code. E.g. if one is interested in processes with a certain name() and username(), chaining the conditions with "and" short-circuits, so username() is never called when the name() condition is not satisfied:

>>> psutil.process_iter(filter=lambda p: p.name() == 'python' and p.username() == 'jeff')

Right now this is not possible because the API only allows you to collect all process info in one shot and only after that we can apply the filtering logic:

for p in psutil.process_iter(attrs=['name', 'username']):
    if p.info['name'] == 'python' and p.info['username'] == 'jeff':
        ...

This will be particularly useful for resource intensive methods such as open_files(). The use case was suggested by @btimby.
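Until such an argument exists, the same lazy evaluation can be approximated with a small generator wrapped around the iterator. The following is only a sketch: `iter_filtered`, its `skip_errors` parameter and the `FakeProc` stand-in are hypothetical names made up for illustration and are not part of psutil.

```python
def iter_filtered(procs, predicate, skip_errors=()):
    """Yield every process for which predicate(proc) is true.

    procs:       any iterable of process-like objects
    skip_errors: exception types meaning "skip this process"
    (hypothetical helper, not a psutil API)
    """
    for proc in procs:
        try:
            if predicate(proc):
                yield proc
        except skip_errors:
            continue


class FakeProc:
    """Stand-in for psutil.Process that counts username() calls."""
    username_calls = 0

    def __init__(self, name, username):
        self._name = name
        self._username = username

    def name(self):
        return self._name

    def username(self):
        FakeProc.username_calls += 1
        return self._username


procs = [
    FakeProc('python', 'jeff'),
    FakeProc('bash', 'jeff'),
    FakeProc('python', 'root'),
]

# "and" short-circuits: username() is only called when name() matched.
matches = list(iter_filtered(
    procs, lambda p: p.name() == 'python' and p.username() == 'jeff'))
```

With real processes the call would presumably look like `iter_filtered(psutil.process_iter(), pred, skip_errors=(psutil.NoSuchProcess, psutil.AccessDenied))`, so that processes which disappear or cannot be inspected are skipped rather than aborting the loop.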


giampaolo · 2019-02-18T20:35:24Z

Note to self - while I was thinking about real-world use cases I sort of enjoyed coming up with some "utils-module-like" examples:

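import getpass
import psutil


def filter_by_name(proc):
    # Match the executable name or an exact 'python' command line argument.
    return proc.name() == 'python' or 'python' in proc.cmdline()


def filter_by_current_user(proc):
    # Compare against the login name of the user running this script.
    return proc.username() == getpass.getuser()


def filter_by_servers(proc):
    # True if the process has at least one listening socket.
    for conn in proc.connections():
        if conn.status == psutil.CONN_LISTEN:
            return True
    return False


def filter_by_logfiles(proc):
    # True if the process has at least one '.log' file open.
    for file in proc.open_files():
        if file.path.endswith('.log'):
            return True
    return False


# 'filter' is the argument proposed in this issue.
for p in psutil.process_iter(filter=filter_by_logfiles):
    print(p)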

That looks nice on the surface, but it sort of encourages a paradigm that poses thorny questions regarding caching. The filtering function calls methods which are likely going to be called again in the "for" block as well (and their results are not cached). An easy way to solve that would be to pass the result of Process.as_dict() to the filter directly instead, which seems reasonable. It needs more thinking though, as I'm not sure this kind of paradigm should be encouraged (logic split vs. logic in one place, etc). In the meantime, here's the patch:

--- a/psutil/__init__.py
+++ b/psutil/__init__.py
@@ -1494,7 +1494,7 @@ def pid_exists(pid):
 _pmap = {}
 
 
-def process_iter(attrs=None, ad_value=None):
+def process_iter(attrs=None, ad_value=None, filter=None):
     """Return a generator yielding a Process instance for all
     running processes.
 
@@ -1517,10 +1517,17 @@ def process_iter(attrs=None, ad_value=None):
     """
     def add(pid):
         proc = Process(pid)
-        if attrs is not None:
-            proc.info = proc.as_dict(attrs=attrs, ad_value=ad_value)
-        _pmap[proc.pid] = proc
-        return proc
+        with proc.oneshot():
+            if filter is not None:
+                try:
+                    if not filter(proc):
+                        raise NoSuchProcess(proc.pid)
+                except AccessDenied:
+                    raise NoSuchProcess(proc.pid)
+            if attrs is not None:
+                proc.info = proc.as_dict(attrs=attrs, ad_value=ad_value)
+            _pmap[proc.pid] = proc
+            return proc
 
     def remove(pid):
         _pmap.pop(pid, None)
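For comparison, the alternative mentioned before the patch (handing the result of Process.as_dict() to the filter) might look roughly like the sketch below. `iter_filtered_info` and `FakeProc` are hypothetical names used only for illustration; the one thing assumed from psutil is that `as_dict()` accepts `attrs` and `ad_value`. The fetched dict is stored on `proc.info`, so the loop body can reuse it instead of calling the methods again.

```python
def iter_filtered_info(procs, attrs, predicate, ad_value=None, skip_errors=()):
    """Yield processes whose info dict satisfies predicate(info).

    Hypothetical helper, not a psutil API. Each process is queried
    once via as_dict() and the result is cached on proc.info.
    """
    for proc in procs:
        try:
            info = proc.as_dict(attrs=attrs, ad_value=ad_value)
        except skip_errors:
            continue
        if predicate(info):
            proc.info = info
            yield proc


class FakeProc:
    """Stand-in for psutil.Process that counts as_dict() calls."""

    def __init__(self, **fields):
        self._fields = fields
        self.as_dict_calls = 0

    def as_dict(self, attrs=None, ad_value=None):
        self.as_dict_calls += 1
        return {a: self._fields.get(a, ad_value) for a in attrs}


procs = [
    FakeProc(name='python', username='jeff'),
    FakeProc(name='bash', username='jeff'),
]

matches = list(iter_filtered_info(
    procs,
    attrs=['name', 'username'],
    predicate=lambda info: info['name'] == 'python'
    and info['username'] == 'jeff',
))
```

The trade-off is that every requested attribute is collected before the predicate runs, so the short-circuit saving of the callable-on-Process form is lost; what is gained is that nothing has to be fetched twice.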

dpinol · 2024-05-05T06:23:52Z

any feedback on this?

giampaolo added enhancement new-api labels Dec 29, 2020
