Today I'd like to quickly share some results we got from AutoProfiler concerning pattern extraction from runtime profiles. As benchmarks we used the Parallel Programming Samples offered by Microsoft and, as a real-world example, a desktop search application we developed and parallelized manually.
For this first approach, we analyzed the control flow of the programs by automatically instrumenting the binaries and collecting the following indicators (a small sketch of how they can be aggregated follows the list):
- Number of times a method is called
- Inclusive time share of the method (including time spent in callees)
- Exclusive time share of the method (excluding time spent in callees)
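To make concrete what these indicators mean, here is a minimal, single-threaded sketch of how call counts and inclusive/exclusive times could be aggregated from enter/exit probes injected at method boundaries. It is written in Java purely for illustration; the class, the hooks, and their names are my own and not AutoProfiler's actual instrumentation.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

/** Toy aggregator for the three indicators, fed by enter/exit probes
 *  that an instrumenter would inject at method boundaries.
 *  Single-threaded, non-recursive sketch for illustration only. */
public class Indicators {
    static class Stats { long calls; long inclusiveNs; long exclusiveNs; }

    private final Map<String, Stats> stats = new HashMap<>();
    private final Deque<long[]> frames = new ArrayDeque<>(); // per call: [startNs, timeInCalleesNs]
    private final Deque<String> names = new ArrayDeque<>();

    public void enter(String method) {
        names.push(method);
        frames.push(new long[] { System.nanoTime(), 0L });
    }

    public void exit() {
        long[] frame = frames.pop();
        String method = names.pop();
        long inclusive = System.nanoTime() - frame[0];
        long exclusive = inclusive - frame[1];                 // subtract time spent in callees
        if (!frames.isEmpty()) frames.peek()[1] += inclusive;  // credit this time to the caller's callee time
        Stats s = stats.computeIfAbsent(method, k -> new Stats());
        s.calls++;
        s.inclusiveNs += inclusive;
        s.exclusiveNs += exclusive;
    }

    /** Time shares in percent, relative to the total program runtime. */
    public void report(long totalNs) {
        stats.forEach((m, s) -> System.out.printf(
            "%s | %d calls | %.2f%% incl. | %.2f%% excl.%n",
            m, s.calls, 100.0 * s.inclusiveNs / totalNs, 100.0 * s.exclusiveNs / totalNs));
    }
}
```

Dividing the accumulated times by the total runtime gives the time shares shown in the table below.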
The following table shows the results for one benchmark and compares the manually identified patterns to the suggestions we got from AutoProfiler.
Method | #calls | incl. time (%) | excl. time (%) | Manual pattern | AutoProfiler suggestion |
---|---|---|---|---|---|
PerfSimStep() | 120 | 91.82 | 0.10 | Worker | Worker |
Sim1Step() | 7260 | 91.40 | 57.05 | Worker | Master |
I think the next step is to take a deeper look at data dependencies in order to better distinguish the patterns, for example Master/Worker from Pipeline. We also want to handle a special form of call from a master to a worker thread: if a detected worker method is called very often but with low CPU load per call, it might be faster to inline this worker into the calling thread instead of explicitly spawning it off into a separate thread.
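To illustrate that inlining idea, here is a rough sketch (again in Java, just for illustration) of the decision between running a cheap, frequently called worker inline in the calling thread and explicitly spawning it off to a worker thread. The threshold, the work() method, and the fixed average call cost are placeholders; in a real setting the average cost per call would come from the profile (exclusive time divided by the number of calls).

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Illustration of the inline-vs-spawn trade-off; all numbers are made up. */
public class InlineOrSpawn {
    // If one call costs less than this, the overhead of handing it to a
    // worker thread likely dominates and inlining pays off.
    static final long INLINE_THRESHOLD_NS = 10_000;

    static int work(int x) { return x * x; }    // placeholder "worker" method

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        long avgCallNs = 2_000;                 // would come from the profile (excl. time / #calls)

        int result;
        if (avgCallNs < INLINE_THRESHOLD_NS) {
            result = work(42);                  // inline: run in the calling (master) thread
        } else {
            Future<Integer> f = pool.submit(() -> work(42));
            result = f.get();                   // spawn-off: run in a separate worker thread
        }
        System.out.println(result);
        pool.shutdown();
    }
}
```

If the per-call cost is below the cost of scheduling the task on the pool, inlining avoids that overhead entirely.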
K!