Performance issue in specific case


I found a possible bug using chaiscript. It is linked to save_function_params function, here is the configuration that cause problem :

    class Test
        Test() { }
        ~Test() { }

        int GetCount()
            return 30000; 

Test* testInst = new Test();

chai.add(chaiscript::user_type<Test>(), "Test");
chai.add(chaiscript::fun(&Test::GetCount), "GetCount");

chai.add(chaiscript::var(testInst), "test");

    return 300000;
}), "GetInteger");

std::string line;

            def func(inst)
                var nb = inst.GetCount()
                for (var j = 0; j <  inst.GetCount(); ++j)
                    var z = 0
                    if (z < 0)

catch (std::exception e)
    std::cout << "ERROR : " << e.what() << std::endl;

each loop time parameters are saved and it causes performance issue, if I replace inst.GetCount() in loop condition by nb or GetInteger function the parameters are not saved. So I wonder if this save is really necessary ?



The save_function_params comes down to use cases like this:

var r = range([1,2,3,4]);

save_function_params is used to make sure that the vector ([1,2,3,4]) which is created on the stack does not get popped from the stack and cause a memory error at runtime when the user attempts to use the variable r.

I have made some headway on my current optimization work to identify functions where save_function_params does not need to be called - the most obvious case being if the return value from the function is unused. The performance difference is not as big as I would have hoped, however.

Would you mind running an experiment for me:

  • Run some timings with the stock code.
  • simply comment out the internal call to save_function_params in your check-out of ChaiScript
  • Run timings again.
  • Repeat both timings with -DCHAISCRIPT_NO_THREADS

That should give you 4 sets of timings (I’m assuming optimized builds for all tests):

  • save_function_params / no threads
  • save_function_params / threads
  • no save_function_params / no threads
  • no save_function_params / threads

Then report back with the timings and your hardware configuration and compiler, that could really help me a lot, to know how much effort I should spend on this.

FYI: I did just run this test myself and am seeing a pretty significant performance difference, let me know how much you see.



Hi Jason,

I have done the tests you asked me, and here are the results :

  • save_function_params / no threads : 27247 ms
  • save_function_params / threads : 27847 ms
  • no save_function_params / no threads : 296 ms
  • no save_function_params / threads : 1028 ms

My computer configuration is :

  • OS : Win7 x64
  • Compiler : MSVC13 (x86 for this test)
  • CPU : Inter i3 (2130 3.4GHz)
  • RAM : 8Gb

One thing is that parameters are not saved once, the vector that contains previous stack is not popped, so during the for loop “save parameters” vector is size increasing (with reallocation permanence issue). What I mean is I expect at the end of the for loop step the current stack to be popped from “save parameter”, the problem is not here ?

Thank you for taking time to read and to respond :slight_smile:


Ok, I think you’re right that there is a bug because of an optimization that I applied a long time ago that avoids the function push/pop when it thinks it can. This is definitely going to take some work to find the best solution.


Hi, it is me again !

This performance issue in this case is incovenient for me, so I took time to look further.
I think the problem is inside get_bool_condition in While_AST_Node and For_AST_Node that evaluates chai to know if loop is ended. As I said upper, the problem is that if there is some save_function_params in the boolean condition (dot access, fun call, etc.) they are save on while/for scope and are popped when loop is ended (not at the end of the loop step), so the save_function_params vector is increasing again and again… (reallocation issue)
To avoid this bad behavior I just protect the get_boolean_condition evaluation inside a new scope (chaiscript::eval::detail::Scope_Push_Pop).
Basically I replace get_boolean_condition by get_scoped_boolean_condition :

        bool get_scoped_bool_condition(const AST_NodePtr child, const chaiscript::detail::Dispatch_State &t_ss) const {
            chaiscript::eval::detail::Scope_Push_Pop spp(t_ss);
            return get_bool_condition(child->eval(t_ss), t_ss);

I changed my computer since the last time so performance are now :

  • get_bool_condition : 21256 ms
  • get_scoped_bool_condition : 744 ms

On little tests I have done it works well, do you see any issue with this fix ?

Thank you


Oh that’s very interesting, and sorry for dropping the ball on it on my side.

I’d like to do some tests to make sure it doesn’t cause other unexpected impact, but it looks good initially.



Ok, I’ve looked into this, and the more I think about it the more I realize that I have a general problem with the lifetime of unused return values from functions. Or more to the point - they live too long, for the entire scope of the surrounding block. I’m going to do a little bit more research to see if I can completely avoid the scope and the saving of the return value.


Actually, sorry, it really is just the parameters being passed into the function call that might live longer than necessary, not the return value. Your method works, but I’m still not sure if it’s the most correct solution.


Ok, I’ve applied your patch and pushed it up to the ‘develop’ branch