Beijing-based hyperscale giant ByteDance, owner of TikTok among other popular applications, has found that general-purpose, open source recommendation engines fall far short of what is required for on-point, real-time suggestions.
Across its platforms, the company hosts close to two billion users and pulls in just under $35 billion in annual revenue, with TikTok the most globally notable of its channels. Keeping users engaged with content and advertising is key to that bottom line, which has pushed the company to build a tailor-made recommendation engine called Monolith, designed for high-volume, high-performance online training on the fly, an advantage over some of the more static parameters used in TensorFlow or PyTorch, for example.
The system diverges from other recommenders we’ve seen to date in that it is truly tuned toward real-time, online decision-making via what ByteDance calls a “collisionless embedding table with optimizations such as expirable embeddings and frequency filtering to reduce memory footprint” and an “online training architecture with high fault-tolerance”.
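To make those three ideas concrete, here is a minimal sketch of our own, not ByteDance's implementation. A plain Python dict stands in for whatever hash table Monolith actually uses, so every admitted ID gets its own vector rather than sharing a hashed slot. The class name and the `min_count` and `ttl` parameters are hypothetical, chosen only for illustration.

```python
import random


class EmbeddingTable:
    """Toy embedding table: one private vector per admitted ID (no shared slots)."""

    def __init__(self, dim=4, min_count=2, ttl=100, seed=0):
        self.dim = dim
        self.min_count = min_count  # hypothetical: sightings needed before admission
        self.ttl = ttl              # hypothetical: idle steps allowed before expiry
        self.rng = random.Random(seed)
        self.counts = {}     # ID -> sightings so far (not yet admitted)
        self.vectors = {}    # ID -> embedding vector (admitted)
        self.last_seen = {}  # ID -> step of the most recent lookup

    def lookup(self, key, step):
        """Return the vector for key, or None while the ID is still filtered out."""
        if key in self.vectors:
            self.last_seen[key] = step
            return self.vectors[key]
        # Frequency filtering: rarely seen IDs never get a vector allocated.
        self.counts[key] = self.counts.get(key, 0) + 1
        if self.counts[key] < self.min_count:
            return None
        del self.counts[key]
        self.vectors[key] = [self.rng.uniform(-0.1, 0.1) for _ in range(self.dim)]
        self.last_seen[key] = step
        return self.vectors[key]

    def expire(self, step):
        """Expirable embeddings: drop vectors idle for more than ttl steps."""
        stale = [k for k, seen in self.last_seen.items() if step - seen > self.ttl]
        for k in stale:
            del self.vectors[k]
            del self.last_seen[k]
        return len(stale)
```

In this sketch an ID seen only once costs a counter rather than a full vector, and an ID that goes quiet is eventually evicted, which is the memory-saving intent the quote describes.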
The code is set to be released soon within one of ByteDance’s secondary revenue lines, its SaaS platform called BytePlus Recommend, which lets enterprises tap into the Chinese hyperscaler’s R&D and deploy recommendations as they go.
In essence, instead of pulling a slew of mini-batches out of direct storage for training (still handled via TensorFlow), the system takes in real-time data on the fly and immediately updates the model, meshing those results with existing parameters. “This enables the model to interactively adapt itself according to a user’s feedback in real time,” write the authors of a thorough walk-through of Monolith.
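As a rough illustration of what updating on the fly means, the sketch below applies one gradient step per incoming event to a tiny sparse logistic model. This is a generic online learning example under our own assumptions, not Monolith's model or its TensorFlow code; the function names and the feature strings are hypothetical.

```python
import math


def predict(weights, features):
    """Probability of engagement from the sum of weights of active sparse features."""
    z = sum(weights.get(f, 0.0) for f in features)
    return 1.0 / (1.0 + math.exp(-z))


def online_update(weights, features, label, lr=0.1):
    """Fold a single feedback event into the existing weights (one SGD step)."""
    p = predict(weights, features)
    grad = p - label  # gradient of log loss with respect to z
    for f in features:
        weights[f] = weights.get(f, 0.0) - lr * grad
    return p  # the prediction made before the update


# Hypothetical event: a user keeps clicking the same video.
weights = {}
event = ["user:1", "video:9"]
for _ in range(20):
    online_update(weights, event, label=1)
```

Each event nudges only the weights of the features it touches, so the model shifts with user feedback without waiting for the next batch job.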
The system ByteDance describes is interesting as well in that it uses Kafka to log user actions (upvoting, clicking an ad, etc.) and then shoots all that data to trusty old HDFS to be gathered and doled out into batches for the on-the-fly training.
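The shape of that pipeline can be sketched without any real infrastructure. In the toy version below, a Python list stands in for the Kafka topic and another for the HDFS archive; no Kafka or HDFS client APIs are used, and all function names are hypothetical.

```python
from itertools import islice


def log_action(topic, user, action, item):
    """Append one user action to the in-memory stand-in for a Kafka topic."""
    topic.append({"user": user, "action": action, "item": item})


def dump_to_archive(topic, archive, offset):
    """Copy events logged since offset into the HDFS stand-in; return the new offset."""
    new_events = topic[offset:]
    archive.extend(new_events)
    return offset + len(new_events)


def batches(archive, batch_size):
    """Dole the archived events out in fixed-size batches for training."""
    it = iter(archive)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch


topic, archive = [], []
for i in range(5):
    log_action(topic, f"user_{i}", "click", f"ad_{i}")
offset = dump_to_archive(topic, archive, 0)
batch_sizes = [len(b) for b in batches(archive, 2)]
```

Tracking an offset means a repeated dump copies only events that arrived since the last one, which is the same bookkeeping a real log consumer has to do.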
Monolith is also noteworthy for the tradeoffs it makes for fault tolerance, which boil down to sacrificing some accuracy when parameters are synchronized in exchange for lower network bandwidth use and better overall reliability in delivering real-time results.
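One common way to make that kind of trade, shown here purely as an assumption on our part rather than a description of Monolith's internals, is to push only the parameters that changed since the last synchronization. The serving copy is slightly stale between syncs, but each sync moves far less data. The `Trainer` class and its method names are hypothetical.

```python
class Trainer:
    """Toy training server that remembers which parameters changed since the last sync."""

    def __init__(self):
        self.params = {}
        self.touched = set()

    def update(self, key, delta):
        """Apply a training update and mark the key as needing synchronization."""
        self.params[key] = self.params.get(key, 0.0) + delta
        self.touched.add(key)

    def sync_to(self, serving_params):
        """Copy only the touched parameters to the serving side; return how many were sent."""
        sent = len(self.touched)
        for key in self.touched:
            serving_params[key] = self.params[key]
        self.touched.clear()
        return sent
```

Calling `sync_to` on a short interval keeps the serving model close to the trained one, while the size of each transfer scales with recent activity rather than with the full model.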
“Despite the ubiquitous adoption of production-scale deep learning frameworks like TensorFlow or PyTorch, these general-purpose frameworks fall short of business demands in recommendation scenarios for various reasons: on one hand, tweaking systems based on static parameters and dense computations for recommendation with dynamic and sparse features is detrimental to model quality; on the other hand, such frameworks are designed with batch-training stage and serving stage completely separated, preventing the model from interacting with customer feedback in real-time,” the authors write.
“We also proved that realtime serving is crucial in recommendation systems, and that parameter synchronization interval should be as short as possible for an ultimate model performance,” the creators say.
Although it is not confirmed, one can imagine this development being a significant boon to the company’s video-centric TikTok platform in production. “It has been deployed on some of the most important products in ByteDance and is benefitting our enterprise customers,” the company explains.