Add WindowEvent.xml/json #75

Closed
abrichr opened this issue May 3, 2023 · 2 comments
Labels: help wanted (extra attention is needed)

abrichr commented May 3, 2023

This task involves adding an xml and/or json attribute to WindowEvent, and storing a text representation of the user's current active window.

This is intended to augment ScreenShots where xml/json representations are available, e.g. for non-virtualized desktop applications and web applications (see #51).
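
To make the idea of a text representation concrete, here is a minimal sketch that serializes a nested window-state dict to both JSON and XML using only the standard library. The function names and the dict fields (`title`, `role`, `children`) are hypothetical and not part of the project; the real shape would depend on what the platform APIs return.

```python
import json
import xml.etree.ElementTree as ET


def state_to_json(state: dict) -> str:
    """Serialize a nested window-state dict to a JSON string."""
    return json.dumps(state, sort_keys=True)


def state_to_xml(state: dict) -> str:
    """Serialize the same dict to XML.

    Scalar fields become attributes; entries under "children" become
    nested <element> nodes.
    """

    def build(node: dict) -> ET.Element:
        element = ET.Element("element")
        for key, value in node.items():
            if key == "children":
                for child in value:
                    element.append(build(child))
            else:
                element.set(key, str(value))
        return element

    return ET.tostring(build(state), encoding="unicode")


# Hypothetical window state; the field names are illustrative only.
example_state = {
    "title": "Untitled - Notepad",
    "role": "Window",
    "children": [{"title": "Text Editor", "role": "Edit", "children": []}],
}
```

Either string could then be stored in a text column alongside the rest of the event.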

For implementing on Windows, start here:

https://github.com/pywinauto/pywinauto/blob/bf7f789d01b7c66ccd0c213db0a029da7e588c9e/pywinauto/base_application.py#L661

https://github.com/pywinauto/pywinauto/blob/bf7f789d01b7c66ccd0c213db0a029da7e588c9e/pywinauto/base_application.py#L264

From ChatGPT:

What Python libraries and/or Windows/MacOS system APIs can I use to get the full UI state (e.g. as XML or json) of a third party desktop application while it is running?

To get the UI state of a third-party desktop application, you can use various libraries and system APIs depending on the platform. Here's a list of options for both Windows and macOS:

Windows:

  1. UI Automation API: Windows provides a native API called UI Automation, which allows you to access and manipulate the UI elements of third-party applications. You can use it with Python by leveraging the pywinauto library. Documentation for UI Automation API can be found here: https://docs.microsoft.com/en-us/windows/win32/winauto/entry-uiauto-win32
  2. Pywinauto: A popular Python library that enables you to automate GUI tasks on Windows. It provides an easy-to-use interface to access UI elements and supports both Win32 and UIA API. You can find more information and documentation on GitHub: https://github.com/pywinauto/pywinauto
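
As a rough sketch of the pywinauto route, the walk below converts a wrapper and its descendants into plain dicts that can be dumped as JSON. `element_to_dict` and `dump_active_window_json` are hypothetical names. The walk relies only on the wrapper methods `window_text()`, `friendly_class_name()`, `rectangle()` and `children()`; looking up the foreground window with `active_only=True` is an assumption that should be verified on a Windows machine.

```python
import json


def element_to_dict(element, max_depth=10):
    """Recursively convert a pywinauto-style wrapper into plain dicts.

    Only duck-typed methods are used, so any object that provides
    window_text, friendly_class_name, rectangle and children works.
    """
    rect = element.rectangle()
    node = {
        "text": element.window_text(),
        "control_type": element.friendly_class_name(),
        "rect": [rect.left, rect.top, rect.right, rect.bottom],
        "children": [],
    }
    if max_depth > 0:
        node["children"] = [
            element_to_dict(child, max_depth - 1) for child in element.children()
        ]
    return node


def dump_active_window_json():
    """Windows only. Assumes pywinauto is installed and that
    window(active_only=True) resolves to the foreground window."""
    from pywinauto import Desktop  # imported lazily so the module loads elsewhere

    window = Desktop(backend="uia").window(active_only=True).wrapper_object()
    return json.dumps(element_to_dict(window))
```

The `max_depth` cap is there because deep UI trees can be slow to traverse; the default of 10 is arbitrary.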

macOS:

  1. AppleScript: AppleScript is a scripting language provided by Apple, which allows you to control applications and their UI elements. You can execute AppleScript scripts from Python using the osascript command or the appscript library. For more information, check out the official AppleScript documentation: https://developer.apple.com/library/archive/documentation/AppleScript/Conceptual/AppleScriptLangGuide/introduction/ASLR_intro.html
  2. PyObjC: PyObjC is a Python-to-Objective-C bridge that enables you to use macOS frameworks like Cocoa and Quartz in Python. You can use it to access and manipulate UI elements of third-party applications. Documentation and examples can be found here: https://pythonhosted.org/pyobjc/
  3. AXUIElement: macOS has a native Accessibility API, built around the AXUIElement type, which provides access to UI elements and their properties. You can use this API with Python by using the pyobjc library. Here's a helpful blog post on how to get started with AXUIElement: https://www.raywenderlich.com/54876/accessibility-os-x-tutorial
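
For the AppleScript option, a minimal sketch of calling `osascript` from Python might look like the following. It only fetches the name of the frontmost application, not the full UI tree, and the helper names are hypothetical. On a real machine the call may also need Automation or Accessibility permission granted to the calling process.

```python
import subprocess

# AppleScript that asks System Events for the frontmost application's name.
FRONTMOST_APP_SCRIPT = (
    'tell application "System Events" to get name of '
    "first application process whose frontmost is true"
)


def osascript_command(script: str) -> list:
    """Build the argv list for running an inline AppleScript."""
    return ["osascript", "-e", script]


def frontmost_app_name(run=subprocess.run) -> str:
    """Return the frontmost application's name (macOS only).

    `run` is injectable so the function can be exercised without macOS.
    """
    result = run(
        osascript_command(FRONTMOST_APP_SCRIPT),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()
```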

From phind.com:

How can I programmatically get the state of a running desktop application?

Getting the state of a running desktop application programmatically can be approached in different ways depending on the operating system and the programming language being used. In this answer, we will explore some of the options available for Java developers on Windows and Linux.

Starting with Windows, one way to get the state of a running desktop application is by using the tasklist command. This command provides a list of all running processes on the system, including the name and the process ID (PID) of each process. To run this command from Java, we can use the Runtime.exec method and pass the command as a string. Here is an example:

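// Requires java.io.BufferedReader and java.io.InputStreamReader.
Process p = Runtime.getRuntime().exec("tasklist.exe /fo csv /nh");
// try-with-resources closes the reader even if readLine() throws.
try (BufferedReader input = new BufferedReader(new InputStreamReader(p.getInputStream()))) {
    String line;
    while ((line = input.readLine()) != null) {
        // Parse data here.
    }
}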

This code runs the tasklist command with the /fo csv /nh options, which produce a CSV format output without the column headers. The output can be parsed to extract the information about the running processes. Note that the Runtime.exec method throws an IOException if the command fails to execute, so it's important to handle this exception.
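
Since this project is in Python, here is a sketch of the parsing step that the snippet above leaves open. It assumes the usual `tasklist /fo csv /nh` column order (image name, PID, session name, session number, memory usage); `parse_tasklist_csv` is a hypothetical helper name and the sample rows are made up.

```python
import csv
import io


def parse_tasklist_csv(output: str) -> list:
    """Parse `tasklist /fo csv /nh` output into a list of dicts."""
    processes = []
    for row in csv.reader(io.StringIO(output)):
        if len(row) < 2:
            continue  # skip blank or malformed lines
        processes.append({"image_name": row[0], "pid": int(row[1])})
    return processes


# Made-up sample in the assumed column order.
sample = (
    '"System Idle Process","0","Services","0","8 K"\n'
    '"notepad.exe","4242","Console","1","14,320 K"\n'
)
```

On Windows the input would come from something like `subprocess.run(["tasklist", "/fo", "csv", "/nh"], capture_output=True, text=True).stdout`. Using the `csv` module matters here because the memory column contains commas inside quotes.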

Another approach to getting the state of a running desktop application on Windows is by using the ProcessHandle class, which was introduced in Java 9. This class provides a way to access information about the processes currently running on the system, including their PIDs, parent processes, user, start time, and command line. Here is an example:

ProcessHandle.allProcesses()
        .forEach(process -> System.out.println(processDetails(process)));

private static String processDetails(ProcessHandle process) {
    return String.format("%8d %8s %10s %26s %-40s",
            process.pid(),
            text(process.parent().map(ProcessHandle::pid)),
            text(process.info().user()),
            text(process.info().startInstant()),
            text(process.info().commandLine()));
}

private static String text(Optional<?> optional) {
    return optional.map(Object::toString).orElse("-");
}

This code uses the ProcessHandle.allProcesses() method to get a stream of all the processes currently running on the system. The forEach method then visits each process in the stream and prints its details. The processDetails method formats the details of each process into a string that can be printed. The text method converts each Optional to a string and substitutes "-" when the value is absent.

Moving on to Linux, one way to get the state of a running desktop application is by using the wmctrl command, which provides a list of all the windows currently open on the desktop, including their names and IDs. To run this command from Java, we can use the Runtime.exec method and pass the command as a string. Here is an example:

Process p = Runtime.getRuntime().exec("wmctrl -l");
// try-with-resources closes the reader even if readLine() throws.
try (BufferedReader input = new BufferedReader(new InputStreamReader(p.getInputStream()))) {
    String line;
    while ((line = input.readLine()) != null) {
        // Parse data here.
    }
}

This code runs the wmctrl command with the -l option, which lists all the windows currently open on the desktop. The output can be parsed to extract the information about the running applications. Note that the Runtime.exec method throws an IOException if the command fails to execute, so it's important to handle this exception.
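
A Python sketch of the parsing step follows, assuming each `wmctrl -l` line has the form window ID, desktop number, client host name, then the title (which may contain spaces). `parse_wmctrl_list` is a hypothetical helper and the sample lines are made up.

```python
def parse_wmctrl_list(output: str) -> list:
    """Parse `wmctrl -l` output into a list of dicts, one per window."""
    windows = []
    for line in output.splitlines():
        # Split on whitespace at most 3 times so the title stays intact.
        parts = line.split(None, 3)
        if len(parts) < 3:
            continue  # skip blank or malformed lines
        window_id, desktop, host = parts[:3]
        title = parts[3] if len(parts) == 4 else ""
        windows.append(
            {"id": window_id, "desktop": int(desktop), "host": host, "title": title}
        )
    return windows


# Made-up sample; a desktop number of -1 marks a sticky window.
sample = (
    "0x01e00003  0 myhost Mozilla Firefox\n"
    "0x02000006 -1 myhost Desktop\n"
)
```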

Another approach to getting the state of a running desktop application on Linux is by using the xprop command, which provides information about the properties of a window, including its class and title. To use this command, we need to first get the ID of the window we are interested in, and then pass this ID to the xprop command. Here is an example:

Process p = Runtime.getRuntime().exec("wmctrl -l");
try (BufferedReader input = new BufferedReader(new InputStreamReader(p.getInputStream()))) {
    String line;
    while ((line = input.readLine()) != null) {
        // Columns: window ID, desktop number, host name, window title.
        String[] parts = line.split("\\s+", 4);
        String winId = parts[0];
        Process p2 = Runtime.getRuntime().exec("xprop -id " + winId + " WM_CLASS");
        try (BufferedReader input2 = new BufferedReader(new InputStreamReader(p2.getInputStream()))) {
            String line2 = input2.readLine();
            if (line2 == null) {
                continue; // xprop printed nothing for this window.
            }
            // Parse data here.
        }
    }
}

This code first runs the wmctrl command to get a list of all the windows currently open on the desktop. It then loops through the output of this command, extracts the ID of each window, and passes this ID to the xprop command. The output of the xprop command can be parsed to extract the information about the running applications. Note that we need to handle the case where a window has no class information.
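
The missing-class case mentioned above can be sketched in Python as follows. This assumes xprop prints a line like `WM_CLASS(STRING) = "instance", "class"` when the property exists and something else (for example a "not found" message) when it does not; `parse_wm_class` is a hypothetical helper name.

```python
import re

# Matches: WM_CLASS(STRING) = "instance", "class"
_WM_CLASS_RE = re.compile(r'^WM_CLASS\(STRING\) = "([^"]*)", "([^"]*)"')


def parse_wm_class(xprop_line):
    """Return (instance, class), or None when no WM_CLASS is reported."""
    if not xprop_line:
        return None  # xprop printed nothing
    match = _WM_CLASS_RE.match(xprop_line.strip())
    if match is None:
        return None  # e.g. a "not found" message
    return match.group(1), match.group(2)
```

Returning `None` rather than raising lets the caller skip windows without class information and keep iterating.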

abrichr added the help wanted and tribe-hackathon labels May 3, 2023
abrichr commented May 5, 2023

  1. Look at pywinauto and figure out what data is available, and when/how:

dump_tree: https://github.com/pywinauto/pywinauto/blob/bf7f789d01b7c66ccd0c213db0a029da7e588c9e/pywinauto/base_application.py#L661

https://github.com/pywinauto/pywinauto/blob/bf7f789d01b7c66ccd0c213db0a029da7e588c9e/pywinauto/base_application.py#L264

ElementInfo: https://github.com/pywinauto/pywinauto/blob/bf7f789d01b7c66ccd0c213db0a029da7e588c9e/pywinauto/element_info.py

window_text: https://github.com/pywinauto/pywinauto/blob/bf7f789d01b7c66ccd0c213db0a029da7e588c9e/pywinauto/base_wrapper.py#L306

from_point: https://github.com/pywinauto/pywinauto/blob/bf7f789d01b7c66ccd0c213db0a029da7e588c9e/pywinauto/base_wrapper.py#L266

texts: https://github.com/pywinauto/pywinauto/blob/bf7f789d01b7c66ccd0c213db0a029da7e588c9e/pywinauto/base_wrapper.py#L468

get_properties: https://github.com/pywinauto/pywinauto/blob/bf7f789d01b7c66ccd0c213db0a029da7e588c9e/pywinauto/base_wrapper.py#LL569C9-L569C23

  2. Modify models.py WindowEvent and/or Screenshots and/or create a new table altogether:
    xml = sa.Column(sa.String)
    (or json, or other, depending on what is available from step 1)

  3. In read_screen_events https://github.com/MLDSAI/puterbot/blob/main/puterbot/record.py#LL344C4-L344C23 and/or read_window_events https://github.com/MLDSAI/puterbot/blob/main/puterbot/record.py#L344 and/or a new window_state_event_reader:

Read new data via https://github.com/pywinauto/pywinauto

  4. If necessary, add logic to discard unnecessary events in process_events:

https://github.com/MLDSAI/puterbot/blob/main/puterbot/record.py#L67

e.g.:

if prev_window_state_event is None:
    logger.warning("discarding input that came before window_state")

  5. If necessary, add logic to write window_state_events:

e.g. after window_event_writer in write_events: https://github.com/MLDSAI/puterbot/blob/main/puterbot/record.py#L670

  6. Add MacOS support to pywinauto; see:

abrichr commented May 23, 2023

Addressed in #151
