condor_utils has classes designed to make it easy to measure and publish statistics about the performance of code. The classes are defined in generic_stats.h.
They are a set of simple templatized classes that you add as member variables to your code, or to a helper class or structure. The most commonly used of these is the stats_entry_recent class, which keeps track of a total value and also uses a ring buffer to keep track of recent values.
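To make the total-plus-ring-buffer idea concrete, here is a minimal sketch of a counter in the same spirit. It is not the code from generic_stats.h: the class name RecentCounter and its methods are hypothetical stand-ins chosen for illustration, and the real stats_entry_recent may differ in its details.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for illustration only; not the HTCondor class.
// Keeps an all-time total plus a ring buffer of per-interval sums.
template <typename T>
class RecentCounter {
public:
    explicit RecentCounter(std::size_t slots)
        : slots_(slots > 0 ? slots : 1, T()), head_(0), total_(T()) {}

    // Add a sample to the total and to the current (newest) interval.
    void Add(T value) {
        total_ += value;
        slots_[head_] += value;
    }

    // Start a new interval, overwriting (and so discarding) the oldest one.
    void Advance() {
        head_ = (head_ + 1) % slots_.size();
        slots_[head_] = T();
    }

    // Everything ever added.
    T Total() const { return total_; }

    // Sum of everything still inside the recent window.
    T Recent() const {
        T sum = T();
        for (const T& v : slots_) sum += v;
        return sum;
    }

private:
    std::vector<T> slots_;  // one accumulated value per interval
    std::size_t head_;      // index of the newest interval
    T total_;
};
```

With two slots, a value added three intervals ago still counts toward Total() but has already dropped out of Recent().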
Adding statistics values to an existing collection
To add a statistics value where there is already an existing collection of values, do the following:
- Declare a member of type stats_entry_xxx and add it to the stats collection class. The most common type is stats_entry_recent. The collection is a member of type StatisticsPool and is usually named Pool.
- Set or increment the variable in the code to be measured.
Adding a statistics value when there isn't a collection
At the most basic level, use generic statistics as follows:
- Add member variables of type stats_entry_xxx to your class, or declare them as statics or globals.
- Set the size of the recent buffer if the type is stats_entry_recent (you can do this in the constructor).
- Set or increment the variable in the code to be measured.
- Periodically call the Advance or AdvanceBy method to rotate the recent buffer.
- Call the Publish method to write the value as a ClassAd attribute.
Below is code suitable for creating an object that keeps track of times.
    class MyObject {
        ...
        // declare a statistics counter
        stats_entry_recent<time_t> myObjectRuntime;
        ...
    };

    // set statistics recent buffer to 4 items in the constructor
    // (a constructor has no return type)
    MyObject::MyObject() : myObjectRuntime(4)
    {
    }

    // update the counter value in one of my worker methods
    void MyObject::MyWorkerMethod()
    {
        time_t begintime = time(NULL);

        // advance the recent buffer if enough time has passed.
        time_t delta = (begintime - last_rotate_time);
        if (delta > time_quantum) {
            myObjectRuntime.AdvanceBy(delta/time_quantum);
            last_rotate_time = begintime - (delta % time_quantum);
        }

        // ... do some work

        // accumulate statistics data
        myObjectRuntime += time(NULL) - begintime;
    }

    // publish the counter value(s) in another worker method
    void MyObject::MyPublishStatisticsMethod(ClassAd & ad)
    {
        if (publish_all) {
            myObjectRuntime.Publish(ad, "ObjectRuntime");
        } else {
            // publish the overall value
            myObjectRuntime.Publish(ad, "ObjectRuntime", myObjectRuntime.PubValue);
            // publish the recent value
            myObjectRuntime.Publish(ad, "RecentObjectRuntime", myObjectRuntime.PubRecent);
        }
    }
The myObjectRuntime member will accumulate overall runtime and runtime within the recent window. In the above example, the recent window consists of 4 intervals, with the interval advancing whenever you call the Advance or AdvanceBy methods on myObjectRuntime. You would commonly advance based on elapsed time, and there is a helper function in generic_stats.cpp, generic_stats_Tick, that will advance a collection of counters on that basis.
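The elapsed-time arithmetic used in MyWorkerMethod can be pulled out into a small helper. The sketch below is only an illustration of that arithmetic: slots_to_advance is a hypothetical name, not part of generic_stats, and generic_stats_Tick takes more parameters and may behave differently. One deliberate difference from the example above is that this version advances as soon as a full quantum has elapsed, rather than only after more than one quantum.

```cpp
#include <ctime>

// Hypothetical helper for illustration; not part of generic_stats.
// Returns how many whole quanta have elapsed since last_rotate, and moves
// last_rotate forward by exactly that many quanta so that the leftover
// fraction of a quantum is carried over to the next call.
inline int slots_to_advance(std::time_t now, std::time_t& last_rotate,
                            int quantum_seconds) {
    if (quantum_seconds <= 0 || now <= last_rotate) {
        return 0;  // nothing to do for a bad quantum or a clock that went backwards
    }
    std::time_t delta = now - last_rotate;
    int slots = static_cast<int>(delta / quantum_seconds);
    last_rotate = now - (delta % quantum_seconds);
    return slots;
}
```

A caller would pass a non-zero result to AdvanceBy on each counter, or to the Advance method of a pool of counters.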
The published ClassAd attribute data types will be the same as the templatized data type. int and double are supported in ClassAds; time_t values will be published as int.
More sophisticated counters can be built by using a class as the template parameter to stats_entry_recent. For instance, stats_entry_recent<Probe> will store and publish count, min, max, average, and standard deviation for a sample value and for a recent window of the values. The Probe class does count, min, max, etc., and the stats_entry_recent class will manage a total and a recent buffer of Probe objects.
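As a rough picture of what such a template parameter has to provide, here is a minimal sketch of a probe-like accumulator. SampleProbe and its members are hypothetical names, not HTCondor's Probe class. It assumes the standard deviation is computed from running sums in the population form; the real Probe may use a different formula.

```cpp
#include <algorithm>
#include <cmath>
#include <limits>

// Hypothetical stand-in for illustration only; not HTCondor's Probe class.
struct SampleProbe {
    int count = 0;
    double min_value = std::numeric_limits<double>::max();
    double max_value = std::numeric_limits<double>::lowest();
    double sum = 0.0;
    double sum_sq = 0.0;

    // Record one sample.
    void Add(double v) {
        ++count;
        min_value = std::min(min_value, v);
        max_value = std::max(max_value, v);
        sum += v;
        sum_sq += v * v;
    }

    // Merge another probe into this one. A ring buffer of probes needs
    // this to combine its slots into a single recent value.
    SampleProbe& operator+=(const SampleProbe& other) {
        count += other.count;
        min_value = std::min(min_value, other.min_value);
        max_value = std::max(max_value, other.max_value);
        sum += other.sum;
        sum_sq += other.sum_sq;
        return *this;
    }

    double Avg() const { return count ? sum / count : 0.0; }

    // Population standard deviation from the running sums (an assumption
    // of this sketch).
    double Std() const {
        if (count < 2) return 0.0;
        double mean = sum / count;
        double var = sum_sq / count - mean * mean;
        return var > 0.0 ? std::sqrt(var) : 0.0;
    }
};
```

Because the probe only stores running sums, merging two probes with += gives the same statistics as adding all samples to one probe, which is what lets a recent buffer of probes be summarized cheaply.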
Creating a new statistics collection
It is common to have multiple statistics values that measure a subsystem and are intended to be published together. When this is the case, it is most convenient to gather them up into a container class/struct and then add a StatisticsPool member to the class so that you can use its methods to Clear, Publish and Advance the collection of counters as a unit.
Counters of type stats_entry_recent<> contain a ring buffer, so that values from the past can be discarded. The usual way to do this is to decide how long you want to keep values (the window), and how precisely you want to keep track of the time a value was added (the time quantum). The number of entries in the ring buffer becomes window/quantum, so there is a trade-off between memory used by a statistics counter and how accurately it keeps track of time.
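The window/quantum trade-off is simple arithmetic, sketched below. The helper names ring_slots and ring_bytes are hypothetical, and the use of plain integer division is an assumption of this sketch; the byte count covers only the slots themselves, not any bookkeeping a real counter carries.

```cpp
#include <cstddef>

// Hypothetical helpers for illustration only; not part of generic_stats.

// Number of ring buffer slots for a window and quantum given in seconds.
constexpr int ring_slots(int window_seconds, int quantum_seconds) {
    return quantum_seconds > 0 ? window_seconds / quantum_seconds : 0;
}

// Rough memory used by the slots alone for a counter holding values of type T.
template <typename T>
constexpr std::size_t ring_bytes(int window_seconds, int quantum_seconds) {
    return static_cast<std::size_t>(ring_slots(window_seconds, quantum_seconds))
           * sizeof(T);
}
```

For a 20 minute window, a 4 minute quantum gives ring_slots(20*60, 4*60) == 5 slots, while a 1 minute quantum gives 20 slots: four times the memory per counter in exchange for finer time resolution.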
As a general rule, a statistics collection is defined by where and when the data will be gathered rather than by how and where the data is published.
If there is a need to publish the same value in more than one ClassAd with a different name in each ad, this can be accomplished by calling the Publish methods of the counters explicitly or by adding the counters to the StatisticsPool multiple times with different names.
- Create a structure or class to hold your probes.
  - Use stats_entry_recent<type> for probes that should publish recent values (e.g. number of jobs that have finished).
  - Use stats_entry_abs<type> for probes that are instantaneous values for which you may want to publish a max value (e.g. number of shadows currently alive).
  - Use stats_recent_counter_timer for probes that accumulate runtime.
  - Use stats_entry_probe<type> for probes that publish min, max, avg, stdev.
  - Use stats_entry_recent<Probe> for probes that publish min, max, avg, stdev and also keep a recent buffer of the same.
- Give your stats class methods for Init, Clear, Tick, and Publish, and have those methods call the corresponding methods on the member variable of type StatisticsPool.
    class MyStats {
    public:
        time_t InitTime;
        time_t Lifetime;
        time_t LastUpdateTime;
        time_t RecentLifetime;
        time_t RecentTickTime;

        stats_entry_recent<int> JobsSubmitted;
        stats_entry_recent<time_t> JobsTimeToStart;
        stats_entry_recent<double> JobsRunningTime;
        // ... etc

        StatisticsPool Pool;

        void Init();
        void Clear();
        void Tick();
        void Publish(ClassAd & ad, int flags) const;
        void Unpublish(ClassAd & ad) const;
    } MyStats;
Then, in your C++ source file for the MyStats class:
    void MyStats::Init()
    {
        const int recent_window = 60*20;  // seconds
        const int window_quantum = 60*4;  // seconds

        Pool.AddProbe("JobsSubmitted", &JobsSubmitted);
        Pool.AddProbe("JobsTimeToStart", &JobsTimeToStart);
        Pool.AddProbe("JobsRunningTime", &JobsRunningTime);

        Pool.SetRecentMax(recent_window, window_quantum);
    }

    void MyStats::Clear()
    {
        Pool.Clear();
        // reset the bookkeeping times declared in the class
        this->InitTime = time(NULL);
        this->Lifetime = 0;
        this->LastUpdateTime = 0;
        this->RecentTickTime = 0;
        this->RecentLifetime = 0;
    }

    void MyStats::Publish(ClassAd & ad, int flags) const
    {
        Pool.Publish(ad, flags);
    }

    void MyStats::Unpublish(ClassAd & ad) const
    {
        Pool.Unpublish(ad);
    }

    void MyStats::Tick()
    {
        const int my_window = 20*60;  // seconds
        const int my_quantum = 4*60;  // seconds

        int cAdvance = generic_stats_Tick(
            my_window,
            my_quantum,
            this->InitTime,
            this->LastUpdateTime,
            this->RecentTickTime,
            this->Lifetime,
            this->RecentLifetime);

        if (cAdvance)
            Pool.Advance(cAdvance);
    }
Adding statistics values at runtime
Probes can be added to the StatisticsPool at runtime. They can also be queried out of the pool, although this is more expensive and error-prone than merely referencing a member variable.
    class MyStats {
        ...
        StatisticsPool Pool;
        ...
        void NewProbe(const char * category, const char * ident, int as_type);
        void Sample(const char * ident, int val);
        void Sample(const char * ident, double val);
    } MyStats;

    // create a new probe that can later be looked up by ident
    //
    void MyStats::NewProbe(const char * category, const char * ident, int as_type)
    {
        MyString attr;
        attr.sprintf("MYStat_%s_%s", category, ident);
        cleanStringForUseAsAttr(attr);

        int flags = IF_BASICPUB;
        int cRecent = my_window / my_quantum;
        switch (as_type)
        {
        case STATS_ENTRY_TYPE_INT32 | AS_RECENT:
            {
                // NewProbe is used through a pointer here
                stats_entry_recent<int>* probe =
                    Pool.NewProbe< stats_entry_recent<int> >(ident, attr.Value(), flags);
                probe->SetRecentMax(cRecent);
            }
            break;

        case STATS_ENTRY_TYPE_DOUBLE | AS_RECENT:
            {
                stats_entry_recent<double>* probe =
                    Pool.NewProbe< stats_entry_recent<double> >(ident, attr.Value(), flags);
                probe->SetRecentMax(cRecent);
            }
            break;
        }
    }

    void MyStats::Sample(const char * ident, int val)
    {
        stats_entry_recent<int>* probe = Pool.Get< stats_entry_recent<int> >(ident);
        if (probe)
            probe->Add(val);
    }

    void MyStats::Sample(const char * ident, double val)
    {
        stats_entry_recent<double>* probe = Pool.Get< stats_entry_recent<double> >(ident);
        if (probe)
            probe->Add(val);
    }