Krita/Optimization
Revision as of 13:40, 1 May 2011
Hot Spots
- thumbnails are recalculated a lot
- the histogram docker calculates even when hidden
- brush outline seems slow
- the calculation of the mask for the autobrush is very slow and doesn't cache anything
- caching a whole row or column of tiles in the h/v line iterators should speed up things a lot
- tile engine 1 has the BKL; tile engine 2 cannot swap yet and isn't optimized yet
- projection recomposition doesn't take the visible area into account
- pigment preloads all profiles (startup hit)
- gradients are calculated on load, instead of being associated with a png preview image that is cheap to load
Tools
Valgrind
Tips
- Only turn on instrumentation when you need it, i.e. just before the function you want to optimize. You can use callgrind_control to control valgrind. For instance, to stop instrumentation:
callgrind_control -i off
And then to activate it:
callgrind_control -i on
Unless you want to optimize startup, use the following command line, which switches off instrumentation until you call "callgrind_control -i on":
valgrind --tool=callgrind --instr-atstart=no krita
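The same "instrument only what you need" idea can also be expressed from inside the program with Callgrind's client-request macros from valgrind/callgrind.h. The sketch below is only an illustration: sumOfSquares and profiledSumOfSquares are hypothetical stand-ins for the function you want to profile, and the no-op fallback is an assumption made so the code still compiles on machines without the Valgrind headers.

```cpp
#include <cstdint>
#include <vector>

// Use the real Callgrind client-request macros when the Valgrind headers are
// installed; otherwise define no-op fallbacks so this sketch still compiles.
#if defined(__has_include)
#  if __has_include(<valgrind/callgrind.h>)
#    include <valgrind/callgrind.h>
#    define HAVE_CALLGRIND_HEADER 1
#  endif
#endif
#ifndef HAVE_CALLGRIND_HEADER
#  define CALLGRIND_START_INSTRUMENTATION do {} while (0)
#  define CALLGRIND_STOP_INSTRUMENTATION do {} while (0)
#endif

// Hypothetical stand-in for the function being optimized.
std::uint64_t sumOfSquares(const std::vector<std::uint32_t>& values)
{
    std::uint64_t total = 0;
    for (std::uint32_t v : values) {
        total += static_cast<std::uint64_t>(v) * v;
    }
    return total;
}

// Instrument only the call of interest. Outside of Valgrind the macros do
// nothing, so the wrapper behaves exactly like the plain function.
std::uint64_t profiledSumOfSquares(const std::vector<std::uint32_t>& values)
{
    CALLGRIND_START_INSTRUMENTATION;
    const std::uint64_t result = sumOfSquares(values);
    CALLGRIND_STOP_INSTRUMENTATION;
    return result;
}
```

Run the program with --instr-atstart=no as shown above, so that only the code between the two macros is instrumented.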
Sysprof
mutrace
mutrace is a tool that counts how much time is spent waiting for a mutex to unlock.
Easy optimization
As soon as you see slow code, check whether it creates a lot of unnecessary objects. 90% of the time this is what makes the code slow (the remaining 10% is often caused by a lot of accesses to the tile manager, for example through a random accessor).
For instance:
- Avoid:
for(whatever) { QColor c; ... }
- Do:
QColor c; for(whatever) { ... }
It might seem insignificant, but it really is not: in a loop with a million iterations, this becomes very expensive.
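A minimal sketch of the difference, assuming a hypothetical Color struct in place of QColor. The global counter exists only to make the number of constructions visible. Whether a real compiler can optimize the repeated construction away depends on the class, so measure before and after.

```cpp
#include <cstddef>

// Counts how many Color objects were constructed (illustration only).
std::size_t g_colorConstructions = 0;

// Hypothetical stand-in for a class with a non-trivial constructor.
struct Color {
    int r = 0, g = 0, b = 0;
    Color() { ++g_colorConstructions; }
    void setRgb(int red, int green, int blue) { r = red; g = green; b = blue; }
};

// "Avoid" variant: one object is constructed per iteration.
std::size_t constructionsInsideLoop(std::size_t iterations)
{
    g_colorConstructions = 0;
    long checksum = 0;
    for (std::size_t i = 0; i < iterations; ++i) {
        Color c;
        c.setRgb(static_cast<int>(i % 256), 0, 0);
        checksum += c.r;
    }
    (void)checksum; // only used to give the loop some work
    return g_colorConstructions;
}

// "Do" variant: the object is constructed once and reused.
std::size_t constructionsHoisted(std::size_t iterations)
{
    g_colorConstructions = 0;
    long checksum = 0;
    Color c;
    for (std::size_t i = 0; i < iterations; ++i) {
        c.setRgb(static_cast<int>(i % 256), 0, 0);
        checksum += c.r;
    }
    (void)checksum;
    return g_colorConstructions;
}
```

For 1000 iterations, the first function reports 1000 constructions and the second reports 1.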
Another example:
- Avoid:
for(y = 0 to height) {
    KisHLineIterator it = dev->createHLineIterator(0, y, width);
    for(whatever) { ... }
}
- Do:
KisHLineIterator it = dev->createHLineIterator(0, 0, width); // create once, starting at row 0
for(y = 0 to height) {
    for(whatever) { ... }
    it.nextRow(); // or nextCol() if you are using a VLine iterator
}
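The iterator pattern above can be sketched in a self-contained form. Device and HLineIterator here are hypothetical, heavily simplified stand-ins and not Krita's real classes. The creation counter is an assumption added only to show how many iterators each variant builds.

```cpp
#include <cstddef>
#include <vector>

class Device;

// Hypothetical horizontal line iterator over a Device (not Krita's class).
class HLineIterator {
public:
    HLineIterator(const Device& dev, int x, int y, int width)
        : m_dev(&dev), m_x0(x), m_x(x), m_y(y), m_width(width) {}
    int value() const;
    // Advance within the row; returns false when the end of the row is reached.
    bool nextPixel()
    {
        if (m_x + 1 >= m_x0 + m_width) return false;
        ++m_x;
        return true;
    }
    // Jump to the start of the next row.
    void nextRow() { ++m_y; m_x = m_x0; }
private:
    const Device* m_dev;
    int m_x0, m_x, m_y, m_width;
};

// Hypothetical paint device: pixel i holds the value i, in row-major order.
class Device {
public:
    Device(int width, int height)
        : m_width(width), m_height(height),
          m_pixels(static_cast<std::size_t>(width) * height)
    {
        for (std::size_t i = 0; i < m_pixels.size(); ++i) m_pixels[i] = static_cast<int>(i);
    }
    int width() const { return m_width; }
    int height() const { return m_height; }
    int pixel(int x, int y) const { return m_pixels[static_cast<std::size_t>(y) * m_width + x]; }
    HLineIterator createHLineIterator(int x, int y, int width) const
    {
        ++m_iteratorsCreated; // bookkeeping for the illustration only
        return HLineIterator(*this, x, y, width);
    }
    int iteratorsCreated() const { return m_iteratorsCreated; }
private:
    int m_width, m_height;
    std::vector<int> m_pixels;
    mutable int m_iteratorsCreated = 0;
};

inline int HLineIterator::value() const { return m_dev->pixel(m_x, m_y); }

// "Avoid" variant: a new iterator is created for every row.
long sumRecreatingIterator(const Device& dev)
{
    long sum = 0;
    if (dev.width() <= 0) return sum;
    for (int y = 0; y < dev.height(); ++y) {
        HLineIterator it = dev.createHLineIterator(0, y, dev.width());
        do { sum += it.value(); } while (it.nextPixel());
    }
    return sum;
}

// "Do" variant: one iterator is created and moved down with nextRow().
long sumReusingIterator(const Device& dev)
{
    long sum = 0;
    if (dev.width() <= 0 || dev.height() <= 0) return sum;
    HLineIterator it = dev.createHLineIterator(0, 0, dev.width());
    for (int y = 0; y < dev.height(); ++y) {
        do { sum += it.value(); } while (it.nextPixel());
        it.nextRow();
    }
    return sum;
}
```

Both functions return the same sum, but on a device with three rows the first creates three iterators and the second creates one.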
Vector instructions
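- Reference about MMX on Intel's website
- Fundamentals of Media Processor Designs: introduction to the use of MMX/SSE instructions
- Software Optimization Guide for AMD64 (http://www.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/25112.PDF)
- STL-like programming but using MMX/SSE{1,2,3} when available (http://www.pixelglow.com/macstl/)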
Profile guided optimization
Profile guided optimization is a special way of compiling and linking in which the compiler and linker use profiling information to decide how best to optimize the code. Roughly speaking, code that is used a lot is compiled with -O3 (the most optimizations), while code that is rarely used gets -Os (to take less space), and so forth. According to the source, this very useful technique was not available on Linux until last year, and Firefox now builds properly with it, giving a nice, noticeable speed improvement for Linux users.
Source: http://linux.slashdot.org/comments.pl?sid=2117150&cid=35987784
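As a rough illustration of the workflow, the sketch below assumes GCC and its -fprofile-generate and -fprofile-use flags. The file name pgo_demo.cpp, the PGO_DEMO_MAIN macro and both functions are hypothetical and chosen only for this example. Other compilers use different flags, and the actual gain depends on the workload used for the training run.

```cpp
// Build steps, assuming GCC and a file named pgo_demo.cpp (both assumptions):
//   1. g++ -O2 -fprofile-generate -DPGO_DEMO_MAIN pgo_demo.cpp -o pgo_demo
//   2. ./pgo_demo     (training run; writes .gcda profile data files)
//   3. g++ -O2 -fprofile-use -DPGO_DEMO_MAIN pgo_demo.cpp -o pgo_demo
#include <cstdint>
#include <cstdio>

// Rarely executed path.
std::uint32_t handleRareCase(std::uint32_t v)
{
    return v ^ 0xFFFFFFFFu;
}

// Hot loop with a strongly biased branch; the training run records how often
// each side is taken, and the second build can use that information.
std::uint64_t process(std::uint32_t count)
{
    std::uint64_t total = 0;
    for (std::uint32_t i = 0; i < count; ++i) {
        if (i % 1000 == 999) {
            total += handleRareCase(i);
        } else {
            total += i;
        }
    }
    return total;
}

#ifdef PGO_DEMO_MAIN
// Training workload; only compiled when PGO_DEMO_MAIN is defined.
int main()
{
    std::printf("%llu\n", static_cast<unsigned long long>(process(10000000u)));
    return 0;
}
#endif
```

The training run should resemble real usage as closely as possible, because the compiler optimizes for the paths it saw during that run.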
Links
- Design for Performance: a great read about performance optimization (aimed at game developers, but many tricks apply to Krita)
- TCMalloc: a malloc replacement which makes object allocation faster by caching a reserved part of the memory
- Optimizing CPP: an extensive manual on writing optimized code.