process_iter()'s filter argument #1401

Open
giampaolo opened this issue Feb 1, 2019 · 2 comments


giampaolo commented Feb 1, 2019

Example:

>>> psutil.process_iter(filter=lambda p: p.name() == 'python')

This will make it easier to write more compact code and one-liners (e.g. see doc).
It will also help with writing more efficient code. E.g. if one is interested in processes with a certain name() and username(), chaining the conditions with "and" short-circuits, so username() is never called when the name() condition is not satisfied:

>>> psutil.process_iter(filter=lambda p: p.name() == 'python' and p.username() == 'jeff')

Right now process_iter() itself does not allow this: its attrs argument collects all the requested process info in one shot, and the filtering logic can only be applied after that:

for p in psutil.process_iter(attrs=['name', 'username']):
    if p.info['name'] == 'python' and p.info['username'] == 'jeff':
        ...

This will be particularly useful for resource-intensive methods such as open_files(). The use case was suggested by @btimby.
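
Until something like filter= exists, the short-circuit behaviour described above can be approximated in user code with a small generator. What follows is only a sketch, not part of psutil: iter_filtered, its ignored parameter and the FakeProc stand-in are hypothetical names introduced for illustration. With psutil itself one would presumably pass psutil.process_iter() as procs and (psutil.NoSuchProcess, psutil.AccessDenied) as ignored.

```python
def iter_filtered(procs, predicate, ignored=()):
    """Lazily yield the items of `procs` for which `predicate(proc)` is true.

    `ignored` is a tuple of exception types; a process whose predicate
    raises one of them is silently skipped.
    """
    for proc in procs:
        try:
            keep = predicate(proc)
        except ignored:
            continue
        if keep:
            yield proc


class FakeProc:
    """Hypothetical stand-in for psutil.Process, used only for this demo."""

    def __init__(self, name, username):
        self._name = name
        self._username = username
        self.username_calls = 0

    def name(self):
        return self._name

    def username(self):
        self.username_calls += 1  # counted to make the short circuit visible
        return self._username


procs = [
    FakeProc('python', 'jeff'),
    FakeProc('bash', 'jeff'),
    FakeProc('python', 'root'),
]
matches = list(iter_filtered(
    procs, lambda p: p.name() == 'python' and p.username() == 'jeff'))
```

Here username() is never called on the 'bash' entry, because the name() check already failed. The generator is lazy, so an expensive predicate only runs for processes the consumer actually reaches.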

@giampaolo

Note to self: while I was thinking about real-world use cases I sort of enjoyed coming up with some "utils-module-like" examples:

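import getpass

import psutil


def filter_by_name(proc):
    # Match on the process name or on a command line element equal to 'python'.
    return proc.name() == 'python' or 'python' in proc.cmdline()


def filter_by_current_user(proc):
    # getpass.getuser() returns the login name of the current user.
    return proc.username() == getpass.getuser()


def filter_by_servers(proc):
    # True if the process has at least one listening socket.
    for conn in proc.connections():
        if conn.status == psutil.CONN_LISTEN:
            return True
    return False


def filter_by_logfiles(proc):
    # True if the process has at least one '.log' file open.
    for file in proc.open_files():
        if file.path.endswith('.log'):
            return True
    return False


# Proposed API: process_iter() only accepts filter= with the patch applied.
for p in psutil.process_iter(filter=filter_by_logfiles):
    print(p)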

That looks nice on the surface, but it sort of encourages a paradigm that poses thorny questions regarding caching. The filtering function calls methods which are likely going to be called again in the "for" block (and their results are not cached). An easy way to solve that would be to pass the result of Process.as_dict() to the filter instead of the Process instance, which seems reasonable. It needs more thinking though, as I'm not sure this kind of paradigm should be encouraged (logic split vs. logic in one place, etc). In the meantime, here's the patch:

--- a/psutil/__init__.py
+++ b/psutil/__init__.py
@@ -1494,7 +1494,7 @@ def pid_exists(pid):
 _pmap = {}
 
 
-def process_iter(attrs=None, ad_value=None):
+def process_iter(attrs=None, ad_value=None, filter=None):
     """Return a generator yielding a Process instance for all
     running processes.
 
@@ -1517,10 +1517,17 @@ def process_iter(attrs=None, ad_value=None):
     """
     def add(pid):
         proc = Process(pid)
-        if attrs is not None:
-            proc.info = proc.as_dict(attrs=attrs, ad_value=ad_value)
-        _pmap[proc.pid] = proc
-        return proc
+        with proc.oneshot():
+            if filter is not None:
+                try:
+                    if not filter(proc):
+                        raise NoSuchProcess(proc.pid)
+                except AccessDenied:
+                    raise NoSuchProcess(proc.pid)
+            if attrs is not None:
+                proc.info = proc.as_dict(attrs=attrs, ad_value=ad_value)
+            _pmap[proc.pid] = proc
+            return proc
 
     def remove(pid):
         _pmap.pop(pid, None)
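
The alternative mentioned before the patch, handing the filter the result of Process.as_dict() rather than the Process itself, could look roughly like the sketch below. This is an illustration in user code, not what the patch does: iter_filtered_by_info and FakeInfoProc are hypothetical names, and the only psutil API assumed is Process.as_dict(attrs=..., ad_value=...). The info dict is collected once, passed to the predicate, and stored on proc.info so the "for" block can reuse it without calling the methods again.

```python
def iter_filtered_by_info(procs, attrs, predicate, ad_value=None):
    """Yield processes whose pre-collected info dict satisfies `predicate`.

    The dict is built once per process and attached as `proc.info`,
    so the caller's loop body does not need to re-query the process.
    """
    for proc in procs:
        info = proc.as_dict(attrs=attrs, ad_value=ad_value)
        if predicate(info):
            proc.info = info
            yield proc


class FakeInfoProc:
    """Hypothetical stand-in for psutil.Process, used only for this demo."""

    def __init__(self, **fields):
        self._fields = fields
        self.as_dict_calls = 0

    def as_dict(self, attrs=None, ad_value=None):
        self.as_dict_calls += 1  # counted to show the info is fetched once
        return {attr: self._fields.get(attr, ad_value) for attr in attrs}


procs = [
    FakeInfoProc(name='python', username='jeff'),
    FakeInfoProc(name='bash', username='jeff'),
]
selected = list(iter_filtered_by_info(
    procs,
    ['name', 'username'],
    lambda info: info['name'] == 'python' and info['username'] == 'jeff',
))
```

The trade-off is that every attribute in attrs is collected before the predicate runs, so the short-circuit saving from the original proposal is lost; this variant only addresses the "called twice" concern.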


dpinol commented May 5, 2024

any feedback on this?
