Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor the TimeProfiler #94

Open
gigiblender opened this issue Apr 26, 2021 · 1 comment
Open

Refactor the TimeProfiler #94

gigiblender opened this issue Apr 26, 2021 · 1 comment
Assignees
Labels
bug Something isn't working documentation Documentation

Comments

@gigiblender
Copy link
Member

gigiblender commented Apr 26, 2021

Describe the bug
Currently, this method (addValueToMetric) expects a taskName to be passed.
The addValueToMetric method is called for the TASK_COPY_IN_SIZE_BYES and TASK_COPY_OUT_SIZE_BYTES profile type, which is not correct. Objects passed as parameters to tasks belong to the whole task schedule context (i.e multiple tasks that belong to the same task schedule and use the same object parameter will result in a single COPY_IN/STREAM_IN).

Therefore, I think the TASK_COPY_IN/OUT_SIZE_BYTES should be profiled per task schedule, and not individual tasks.

How To Reproduce
To reproduce, run the test below with the -Dtornado.profiler=True flag

    public static void add(int[] a, int[] b) {
        for (@Parallel int i = 0; i < a.length; i++) {
            b[i] = a[i] + a[i];
        }
    }

    public static void mult(int[] a, int[] b) {
        for (@Parallel int i = 0; i < b.length; i++) {
            b[i] = b[i] + a[i] * 3;
        }
    }

    public static void main(String[] args) {
        int n = 32;
        int[] a = new int[n];
        int[] b = new int[n];

        TaskSchedule ts = new TaskSchedule("s0")
                .task("t0", Main::add, a, b)
                .task("t1", Main::mult, a, b)
                .streamOut(b);

        ts.execute();
    }

The output produced by the profiler is:

{
    "s0": {
        "TOTAL_DISPATCH_DATA_TRANSFERS_TIME": "51936",
        "TOTAL_TASK_SCHEDULE_TIME": "298600160",
        "TOTAL_DRIVER_COMPILE_TIME": "169628101",
        "TOTAL_GRAAL_COMPILE_TIME": "51983174",
        "TOTAL_KERNEL_TIME": "18176",
        "TOTAL_DISPATCH_KERNEL_TIME": "15200",
        "TOTAL_BYTE_CODE_GENERATION": "5949780",
        "COPY_IN_TIME": "4512",
        "COPY_OUT_TIME": "1888",
        "s0.t0": {
            "METHOD": "Main.add",
            "DEVICE_ID": "0:0",
            "DEVICE": "GeForce GTX 1650",
            "TASK_COPY_OUT_SIZE_BYTES": "152",
            "TASK_COPY_IN_SIZE_BYTES": "344",
            "TASK_COMPILE_GRAAL_TIME": "35695419",
            "TASK_KERNEL_TIME": "9984",
            "TASK_COMPILE_DRIVER_TIME": "88543309"
        }, 
        "s0.t1": {
            "METHOD": "Main.mult",
            "DEVICE_ID": "0:0",
            "DEVICE": "GeForce GTX 1650",
            "TASK_COPY_IN_SIZE_BYTES": "40",
            "TASK_COMPILE_GRAAL_TIME": "16287755",
            "TASK_KERNEL_TIME": "8192",
            "TASK_COMPILE_DRIVER_TIME": "81084792"
        }
    }
}

Even though objects a and b are used by both t0 and t1, the TASK_COPY_IN_SIZE_BYTES and TASK_COPY_OUT_SIZE_BYTES are only reported for t0.

Expected behavior
The expected output for the test above would be:


{
    "s0": {
        "TOTAL_DISPATCH_DATA_TRANSFERS_TIME": "51936",
        "TOTAL_TASK_SCHEDULE_TIME": "298600160",
        "TOTAL_DRIVER_COMPILE_TIME": "169628101",
        "TOTAL_GRAAL_COMPILE_TIME": "51983174",
        "TOTAL_KERNEL_TIME": "18176",
        "TOTAL_DISPATCH_KERNEL_TIME": "15200",
        "TOTAL_BYTE_CODE_GENERATION": "5949780",
        "COPY_IN_TIME": "4512",
        "COPY_OUT_TIME": "1888",
        "COPY_OUT_SIZE_BYTES": "XXXX",
        "COPY_IN_SIZE_BYTES": "XXXX",
        "s0.t0": {
            "METHOD": "Main.add",
            "DEVICE_ID": "0:0",
            "DEVICE": "GeForce GTX 1650",
            "TASK_COMPILE_GRAAL_TIME": "35695419",
            "TASK_KERNEL_TIME": "9984",
            "TASK_COMPILE_DRIVER_TIME": "88543309"
        }, 
        "s0.t1": {
            "METHOD": "Main.mult",
            "DEVICE_ID": "0:0",
            "DEVICE": "GeForce GTX 1650",
            "TASK_COMPILE_GRAAL_TIME": "16287755",
            "TASK_KERNEL_TIME": "8192",
            "TASK_COMPILE_DRIVER_TIME": "81084792"
        }
    }
}

Additional context
Also, Javadoc for the datastructures of the profiler should be added.

@gigiblender gigiblender added bug Something isn't working documentation Documentation labels Apr 26, 2021
@jjfumero
Copy link
Member

Thank you @gigiblender for the report. This is by design. We want each task to have its own profiler. This is especially beneficial in a multi-task , multi-device environment. But I also agree that for some cases (e.g., using the task.sync(objects) ) the profiler should add the metrics at the task-schedule level, rather than the task-level.

I suggest refactoring this part to add both options.

  • Add metrics for copies at the task-schedule level
  • When possible, keep the copy metrics for each individual task.

@jjfumero jjfumero self-assigned this Jun 17, 2022
@jjfumero jjfumero added this to To do in TornadoVM v0.14.1 via automation Jun 17, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working documentation Documentation
Projects
No open projects
Development

No branches or pull requests

2 participants