Ticket #2541: Memory utilization of classads (revised)
Problem DescriptionWhen running heap profiles experiments on the schedd, we've found that doubling the queue count == doubling of the classad count in the schedd even though most of the data is the same across ads.
Also when doing performance calculations on the collector it was determined that most of the time inside of the collector was spent churning on redundant data.
Proposed solutionKeep a cache of the current in memory classads.
Approach key [str_name] = key[str_value] = weak_ptr<~ExprTree>After several iterations on possible implementations I've settled on this implementation because it has optimal speed performance without sacrificing any space savings.
One thing that becomes apart when analyzing the data is that there is a large % or redundant data and a small % of data which churns. We could easily save more memory by having meta information in classads (possibly through nested ads).m_Cache[name][value] = weak_ptr<Cache_Element>
All caching code backs against the Insert function(s) in classads. It was optimized so the main path by which messages come across the wire will do a look-up using a single string based function in the classads layer:bool Insert(const std::string& serialized_nvp);
Initial ResultsSpace savings depends on a couple of variables:
- Very simple tests yield space savings ~10-15% with iLoc savings ~ 10%. This is very difficult to quantify until it's tested at scale what the actual difference will yield.
- The complexity of the ExprTree's in the Cache, more complex >> savings in both speed and perf. So it is "fluid"
DebuggingThere are a couple of internal features which have been added to aid in the diagnostics of the behavior of caching. There is an internal only knob which is hooked through DAEMON_CORE which enables a daemon to dump it's cache to a <subsystem>_classad_cache file under LOG when DEBUG_CLASSAD_CACHE=TRUE . The file dumping is hooked through SIGUSR2kill -s 12 <collector_pid> -> drops $LOG/COLLECTOR_classad_cache
- This works for all DAEMON_CORE based daemons.
2012-May-15 16:24:04 by tstclair:
After taking it off the shelf, dusting off, and giving this ticket much needed love things are looking good. Initial tests yield hit rates ~80% on average with a fair amount of variance depending on the workload.
I will update this ticket to include the design. Once my tests come back green I will push a final squash commit for review.
2012-May-16 15:11:39 by tstclair:
- pushed to tracking branch: master-classad-cache
- local tests all passed, will run through NMI (previously built cleanly through NMI)
2012-May-16 15:14:03 by tstclair:
Any extra lexer optimization would still be handy, because as entries get flushed from the cache they still need to be lex'd prior to re-entry.
2012-May-22 16:25:15 by tstclair:
removing lexer mods and cleaning up + running through local tests post mods to revised approach. Will then commit and create new tickets which outline further potential gains.
2012-May-30 15:34:11 by jfrey:
CODE REVIEW of first iteration (, git hash 10ae0cd4):
- The new functions nvp() and split_nvp() are unused. I'm assuming they're meant to be part of further optimizations.
These have been removed and cleaned
- ClassAdCache and CachedExprEnvelope make frequent use of shared_ptr::use_count() and weak_ptr::use_count(). The Boost docs and the STL spec say that this method isn't necessarily efficient and shouldn't be used for production code.
Non starter, does not affect perf
- For ClassAdCache::dump_keys() and CachedExprEnvelope::_debug_dump_keys(), I'd prefer if they took a FILE* to any already-opened file.
I actually don't like this recommendation
- In MatchClassAd::ReplaceLeftAd(), passing in a NULL changes
lad, which is different behavior than before. ReplaceRightAd() has the same problem.
I've verified behavior and went through that code many times so I think I will need to see exactly what you are talking about
- In compat_classad::ClassAd::Insert(), the modified ExprTree* isn't propagated to the caller. If the expression is cached, then the caller's ExprTree* is no longer valid, which will cause problems if they try to use it.
- A caller to ClassAd::Insert() may have their ExprTree* modified to point to a cached equivalent ExprTree*. If they kept another copy of the old pointer and try to use it, they can crash. If they Evaluate() on the modified ExprTree*, they will be in the wrong scope. These will hopefully be unusual patterns, but could be lurking surprises.
- A caller to ClassAd::Lookup() or ClassAd::LookupInScope() will get back the CachedExprEnvelope for the ExprTree* if caching was enabled on insertion (the default). If they want to inspect the ExprTree*, they will have to know to call ExprTree::self().
All virtual function go through self so it should be clean, which is why the Env->Letter idiom was chosen
- If caching goes badly, every attribute/value pair gets an additional overhead of 48 bytes for the CachedExprEnvelope, plus the size of the attribute name (in CachedExprEnvelope::m_szKey).
Non-starter with redesign, and has been tested at length
- Caching values that are integer, real, or boolean literals doesn't save much space and can use a lot of extra space if there are more than a few unique values for a given attribute. Consider not caching these Literals.
Will let other test justify
- It doesn't look like nested ClassAds are handled. If an attribute in a cached nested ad is part of an evaluation and it references anything that's not local, the wrong parent ad(s) will be searched.
Will need example, as all tests pass
- Possible alternative: Move parentScope from ExprTree to ClassAd. In most places where scoping information is required, it's stored in an EvalState or is queried from a ClassAd. Some details on this proposal:
- ExprTree would be smaller, saving memory.
- ExprListIterator::Initialize() would have to be modified to take a scope parameter.
- ExprTree::Evaluate() and ExprTree::Flatten() variants that don't have an EvalState parameter would have to be modified to take a scope parameter or be removed.
- ClassAd::AttrList would be a map of std::string to shared_ptr<ExprTree>.
- CachedExprEnvelope would not be necessary, but the rest of the caching code (using shared_ptr and weak_ptr) would be largely the same.
- This idea has the same problem with nested ads. The best solution I have is to not cache them.
Outside of scope of this ticket
2012-May-30 15:36:53 by jfrey:
While reviewing this code, I had a few ideas for additional memory optimizations. The sizes mentioned are on x86_64 linux:
strValueinto pointers and move
strValueinto the enum. This would save 16 bytes for all literals that aren't strings or absolute times. String literals would save 8 bytes.
absoluteinto a special value of
expr. This would save 8 bytes per attribute reference.
2012-Jun-26 21:54:26 by jfrey:
Followup on code review:
- Why don't you like the recommendation of ClassAdCache::dump_keys() and CachedExprEnvelope::_debug_dump_keys() taking a FILE* instead of a filename? It matches how the parsing behaves and gives the caller more control over how files are opened. In Condor, it allows the use of the safefile functions.
Because I want to ensure open semantics and it's intended to be a debug only utility to be used only by developers so safefile semantics do not matter
- Caching and nested ads are even more broken than I thought. I will attach a example. In the example, attribute C doesn't evaluate properly when caching is enabled. When caching is disabled (done by giving an argument of '0'), it evaluates properly.
2012-Jun-27 13:46:40 by jfrey:
I found some additional problems with nested ads. Your most recent fix addresses when a nested ad is the value of an attribute in the parent ad. But it doesn't address when the nested ad is a subexpression. I'll attach a demonstration program similar to the previous one.
2012-Jun-29 14:12:37 by johnkn:
Wranger sez - This item needs version history. please compose a couple of sentences and send to me and or Karen.
Type: enhance Last Change: 2013-Jan-14 14:18 Status: resolved Created: 2011-Oct-11 15:39 Fixed Version: v070900 Broken Version: v070600 Priority: 3 Subsystem: Daemons Assigned To: tstclair Derived From: #2961 Creator: tstclair Rust: Customer Group: other Visibility: public Notify: firstname.lastname@example.org email@example.com, firstname.lastname@example.org email@example.com firstname.lastname@example.org Due Date:
|#2899||classad lex/parse uses up to 50% of the compute time for daemons|
|#3075||Proper build fails on Centos 5|
|#3089||Reduce size of classad::Value|
|#3092||MatchClassAd::ReplaceLeftAd() and ReplaceRightAd() need fixing|
|#3099||Bug in unparsing quoted ClassAd attribute names|
|#3100||New ClassAd::Insert() doesn't handle attribute names properly|
|#3127||Caching code makes ClassAds much less thread-safe|
|#3175||ClassAd caching leaks memory in Condor|
|#3441||Fixes for classad caching.|
|#3448||Improving classads hash function|
|#3457||reduce the size of classad exprtree nodes|
|2013-Mar-22 09:58||Check-in : Remove data members from classad Operation class, and create subclasses by arity #3457 parent ticket #2541 each subclass of Operation class will have only the necessary data members, so we don't waste space for 2nd & 3rd ExprTree pointers in nodes that have no need of them. ===VersionHistory:Pending=== [...] (By John (TJ) Knoeller )|
|2013-Mar-22 09:58||Check-in : Remove nodeKind member from ExprTree, and replace it with a pure virtual method #3457 that all of the derived classes must implement. parent ticket #2541 (By John (TJ) Knoeller )|
|2012-Dec-04 14:41||Check-in : Ticket #2541 Fix Cache - Unparsing issue in xml, and cleanup SameAs f(n)'s. (By Timothy St. Clair )|
|2012-Jul-17 14:13||Check-in : edit of version history item and knob defn related to new caching of ClassAds ===GT=== #3127 ===GT=== #2541 (By Karen Miller )|
|2012-Jul-11 10:44||Check-in : improve wording of version history item about caching ClassAds ===GT=== #2541 (By Karen Miller )|
|2012-Jun-29 14:38||Check-in : ===GT=== #2541 ===VersionHistory:Complete=== (By Timothy St. Clair )|
|2012-Jun-29 12:52||Check-in : Ticket #2541 Update string::npos check for invalid input strings (By Timothy St. Clair )|
|2012-Jun-27 14:36||Check-in : Ticket #2541 Update for ExprList (By Timothy St. Clair )|
|2012-Jun-27 12:10||Check-in : Ticket #2541 Update caching for nested classads use case. (By Timothy St. Clair )|
|2012-Jun-26 18:35||Check-in : Fix building of cream and blahp externals. #2541 classad_stl.h wasn't being installed into the externals directory. The blahp configure line was getting two include directories for boost. (By Jaime Frey )|
|2012-Jun-25 13:17||Check-in : Fix building of cream_gahp for RHEL 6 #2541 (watched by tstclair) (By Nathan W. Panike )|
|2012-Jun-22 16:11||Check-in : Ticket #2541 Windows requires explicit inclusion of weak_ptr.hpp (By Timothy St. Clair )|
|2012-Jun-22 15:28||Check-in : Ticket #3075 and #2541 remove namespace collision with prefix classad_ (By Timothy St. Clair )|
|2012-Jun-22 15:01||Check-in : Ticket #3075 and #2541 Add in order checks for standard features C++11 -> tr1 -> boost. Sadly our code base needs tender love for C++11 support. Also hides boost from classads exposure where possible. (By Timothy St. Clair )|
|2012-Jun-22 11:08||Check-in : Ticket #3075 and #2541 Update build chain to prefer local system boost when possible even when non-proper. This ensures distributed libraries match system where possible. (By Timothy St. Clair )|
|2012-Jun-21 17:22||Check-in : Ticket #2541 & Ticket #3075 Update cream to use prefixed boost with prefixed classads (By Timothy St. Clair )|
|2012-Jun-21 15:26||Check-in : Ticket #3075 and #2541 Where boost doesn't exist prefer tr1. (By Timothy St. Clair )|
|2012-Jun-21 12:46||Check-in : Ticket #2541 NMI build and test updates Fixes Fedora 16 -Werror build Fixes lib_job_router_local failure (By Timothy St. Clair )|
|2012-Jun-21 08:06||Check-in : Ticket #2541 - Squashed feature commit for classad caching. (By Timothy St. Clair )|
- nested_ad.cpp 1010 bytes added by jfrey on 2012-Jun-27 02:55:47 UTC.
Example showing caching broken for nested ads. Attr C doesn't evaluate properly with caching enabled, but does when caching is disabled. Pass a '0' on the command line to disable caching.
- nested_ad2.cpp 1017 bytes added by jfrey on 2012-Jun-27 18:47:36 UTC.
Program demonstrating that caching doesn't handle nest ads that are a subexpression in an attribute value. In this case, it's part of a list.