Fast method call scheduling in Python
For some parts of my project, I need a process-local scheduling system that lets me delay method execution by a few seconds. I have thousands of "clients" of this system, so using a threading.Timer for each delay is a bad idea: I would soon hit the operating-system thread limit. Instead, I have implemented a system that uses only a single thread for timing control.
The main idea is to keep a queue of tasks sorted by time (time + function + args + kwargs) and use a single threading.Timer to schedule/cancel the execution of the queue head. This scheme works, but I'm not happy with the performance. Scheduling a dummy task every 10 seconds for about 2000 clients makes the process consume 40% CPU. According to the profiler output, all the time is spent constructing and starting new threading.Timer objects, and above all creating the underlying threads.
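To make the bottleneck concrete, here is a rough sketch of that design (the names are illustrative, not the actual LightTimer code): every schedule() call cancels the pending Timer and constructs and starts a new one, which spawns a new OS thread each time.

```python
import threading
import time

class TimerScheduler:
    """Keeps a sorted task list and re-arms one threading.Timer for the head.
    Each (re)scheduling constructs and starts a fresh Timer, i.e. a fresh
    OS thread -- the overhead the profiler points at."""

    def __init__(self):
        self._lock = threading.Lock()
        self._tasks = []      # sorted list of (run_at, func, args, kwargs)
        self._timer = None

    def schedule(self, delay, func, *args, **kwargs):
        with self._lock:
            self._tasks.append((time.time() + delay, func, args, kwargs))
            self._tasks.sort(key=lambda t: t[0])
            self._rearm()

    def _rearm(self):
        # Cancel the pending Timer and start a new one for the new head.
        if self._timer is not None:
            self._timer.cancel()
        run_at = self._tasks[0][0]
        self._timer = threading.Timer(max(0, run_at - time.time()), self._fire)
        self._timer.start()

    def _fire(self):
        with self._lock:
            _, func, args, kwargs = self._tasks.pop(0)
            if self._tasks:
                self._rearm()
        func(*args, **kwargs)  # run the callback outside the lock
```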
I believe there is a better way. I'm now thinking about rewriting LightTimer so that a single execution thread waits on a threading.Event, while short-lived timing threads sleep and then set() the event. For example:
- I schedule a task to be called in 10 seconds. The task is added to the queue, and timing thread #1 starts time.sleep(10) before event.set().
- Then I schedule a task to be called in 11 seconds. The task is added to the queue, but no new timing thread starts; the existing timing thread will notice the new task when it wakes up.
- Then I schedule a task to be invoked in 5 seconds. The task is added to the queue, and timing thread #2 starts time.sleep(5), because thread #1 is already sleeping for a longer interval.
I hope the idea is clear. What do you think of this approach? Is there a better way? Maybe I can take advantage of some Linux-specific feature to get an optimal solution?
An alternative implementation is to use time.time() to compute the absolute time at which each queued function should execute. Put this time and the function to call into a wrapper object that overrides the comparison operators so that ordering is determined by execution time. Then use the heapq module to maintain a min-heap of these wrappers. This gives you an efficient data structure where element 0 of the heap is always your next event.
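A minimal sketch of such a wrapper (the class name is illustrative): defining __lt__ on the absolute execution time is enough for heapq to keep the earliest task at index 0.

```python
import heapq
import time

class ScheduledTask:
    """Wraps a callback with its absolute execution time; orders by that time."""

    def __init__(self, run_at, func, *args, **kwargs):
        self.run_at = run_at   # absolute timestamp from time.time()
        self.func = func
        self.args = args
        self.kwargs = kwargs

    def __lt__(self, other):
        # heapq only needs "less than" to maintain the min-heap invariant
        return self.run_at < other.run_at

heap = []
heapq.heappush(heap, ScheduledTask(time.time() + 10, print, "later"))
heapq.heappush(heap, ScheduledTask(time.time() + 5, print, "sooner"))
# heap[0] is always the next event to fire
```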
One way to implement the actual callbacks is to execute them from a separate thread. The heap needs to be protected by a mutex, and you can use a condition variable to implement the scheduling. In an infinite loop, the thread simply looks up the next function to execute (element 0 of the heap) and calls the condition variable's wait() method with the timeout set to the delay until that execution time. If a newly inserted function should run before the earliest one already in the heap, the insertion method can wake the scheduling thread early via the condition variable's notify() method.
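Putting the pieces together, a sketch of that loop might look like this (class and method names are my own; the counter is a tie-breaker so tasks with equal times never compare their non-comparable payloads):

```python
import heapq
import itertools
import threading
import time

class Scheduler:
    """One worker thread waits on a condition variable with a timeout equal
    to the delay until the earliest scheduled task."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()   # tie-breaker for equal run times
        self._cond = threading.Condition()  # owns the mutex protecting the heap
        threading.Thread(target=self._run, daemon=True).start()

    def schedule(self, delay, func, *args, **kwargs):
        with self._cond:
            heapq.heappush(self._heap,
                           (time.time() + delay, next(self._counter),
                            func, args, kwargs))
            # Wake the worker in case the new task beats the current head.
            self._cond.notify()

    def _run(self):
        while True:
            with self._cond:
                while not self._heap:
                    self._cond.wait()       # nothing scheduled yet
                run_at = self._heap[0][0]
                now = time.time()
                if run_at > now:
                    # Sleep until the deadline or an earlier notify().
                    self._cond.wait(timeout=run_at - now)
                    continue
                _, _, func, args, kwargs = heapq.heappop(self._heap)
            func(*args, **kwargs)           # run the callback outside the lock
```

Note that the callback runs outside the lock, so a slow callback delays later tasks but never blocks schedule() callers.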