However, the prototype for the pow function is double pow(double x, double y); so I need x and y to be double.

March 18th, #5
Re: Math "pow" function slow
Well

March 18th, #6
I have now verified this assembly code and it seems to work. Excellent!

March 18th, #7
You have to be careful with negative exponents.
The author states that this will not work if the exponent is negative. Of course, the solution is simply to check for a negative exponent, take the reciprocal of the base, and make the exponent positive before calling the assembly language function.

For the third power, std::pow(x, 3) is now much faster than before.
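The negative-exponent workaround described in the thread (take the reciprocal of the base, then call the routine with a positive exponent) can be sketched roughly as below. The assembly routine itself is not shown in the thread, so fast_pow_nonneg is a hypothetical stand-in that simply forwards to std::pow, and pow_any_sign is an illustrative wrapper name, not something from the original posts.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-in for the thread's assembly routine, which is
// assumed to be valid only for non-negative exponents.
double fast_pow_nonneg(double x, double y) {
    assert(y >= 0.0);       // the routine's stated limitation
    return std::pow(x, y);  // placeholder body, not the real assembly
}

// Wrapper that accepts any sign of exponent by using x^(-y) == (1/x)^y.
double pow_any_sign(double x, double y) {
    if (y < 0.0) {
        return fast_pow_nonneg(1.0 / x, -y);
    }
    return fast_pow_nonneg(x, y);
}
```

Note that taking the reciprocal first adds one rounding step, so the result may differ in the last bits from 1.0 / pow(x, -y).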
Let's see if there are differences between clang

Moreover, there is no difference between the two libraries. Let's see if this makes a difference for the third power.

I was expecting more of a difference between the two, but it seems they are using similar implementations, if not the same one. As said earlier, all the tests were run in single precision (float).
Let's see now if it's any different with double precision (double).

This is very interesting! It seems that most of the overhead of this function in single precision was in fact in the conversion to double, since the algorithm itself appears to be implemented only for double precision.
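One language-level detail that is consistent with this observation: since C++11, calling std::pow with a float base and an integer exponent promotes the arguments to double and returns a double. The compile-time check below is only a sketch of that rule; the alias names are made up for illustration, and it says nothing about how a particular library implements the computation internally.

```cpp
#include <cmath>
#include <type_traits>

// float base, int exponent: the integer argument forces promotion to
// double (C++11 and later), so the call returns double.
using PowFloatInt = decltype(std::pow(1.0f, 3));

// float base, float exponent: the float overload is selected.
using PowFloatFloat = decltype(std::pow(1.0f, 3.0f));

static_assert(std::is_same<PowFloatInt, double>::value,
              "std::pow(float, int) returns double");
static_assert(std::is_same<PowFloatFloat, float>::value,
              "std::pow(float, float) returns float");
```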
Let's see about the third power now.

As seen before, with the third power, the overhead is actually huge. Let's see what happens with -ffast-math.

With -ffast-math, there is absolutely no overhead anymore for std::pow(x, n), even for the third power. The results are the same for clang.
I've checked higher values of the exponent and the result is also the same.

Now, let's test for which n std::pow(x, n) becomes faster than multiplying in a loop. Since std::pow uses a special algorithm to perform the computation rather than simple loop-based multiplication, there may be a point after which it is more worthwhile to use the algorithm rather than a loop.
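The two candidates being compared can be sketched as follows. This is a minimal illustration, not the benchmark code from the article: power_loop and power_std are hypothetical names, and the loop version assumes a non-negative integer exponent.

```cpp
#include <cassert>
#include <cmath>

// Loop-based power: multiplies by x once per iteration, so the cost
// grows linearly with n.
double power_loop(double x, int n) {
    assert(n >= 0);  // this sketch does not handle negative exponents
    double result = 1.0;
    for (int i = 0; i < n; ++i) {
        result *= x;
    }
    return result;
}

// Library version: a single call whose cost does not come from a
// per-exponent loop in user code.
double power_std(double x, int n) {
    return std::pow(x, n);
}
```

Because the loop costs one multiplication per step, a crossover point can exist in principle; where it falls depends on the compiler, the flags, and the library, which is what the measurements are meant to find out.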
And now, let's see the performance. I've compiled my benchmark with GCC 4. Here are the results for calls to each function:

I highlighted the lines in question. The calculations with test are in place so that the compiler does not optimize away the code, which it would do otherwise. When I did this on , pow was significantly faster. I have revisited this on , where pow was just a tiny bit slower than the multiplication. To be fair, I ran each one a couple of times, since the overall time varies.
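The idea of accumulating into a test variable so the compiler cannot discard the work can be sketched like this. It is not the original benchmark, which is not reproduced here: Timing and time_calls are hypothetical names, and the input values fed to the function are arbitrary.

```cpp
#include <cassert>
#include <chrono>
#include <cmath>

// Elapsed time plus the accumulated result of every call.
struct Timing {
    double milliseconds;
    double checksum;
};

// Calls f repeatedly and sums its results into `test`. Returning the sum
// makes the results observable, so the loop cannot simply be removed.
template <typename F>
Timing time_calls(F f, int calls) {
    assert(calls > 0);
    double test = 0.0;
    auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < calls; ++i) {
        test += f(1.0 + i * 1e-9);  // vary the input slightly on each call
    }
    auto stop = std::chrono::steady_clock::now();
    std::chrono::duration<double, std::milli> elapsed = stop - start;
    return Timing{elapsed.count(), test};
}
```

For example, time_calls([](double x) { return std::pow(x, 2); }, 1000000) could be compared against time_calls([](double x) { return x * x; }, 1000000); the call count here is only an illustration. Printing the checksum alongside the time keeps the result in use, and repeating each measurement a few times helps with run-to-run variation.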