What Makes a "Good" Prophet Model? (2) - Reduce Redundancy
In my last post, we discussed the importance of avoiding “black box” Prophet models—models so complex and opaque that users struggle to understand the underlying calculations and dependencies. Without clarity, these models can lead to errors, inefficiencies, and unwelcome surprises when something goes wrong.
Today, let’s shift our focus to another characteristic of a "good" Prophet model: keeping redundancy low. Put simply, redundancy in a Prophet model refers to variables in the model’s library that aren’t actually used by any product in the workspace. A Prophet model with high redundancy has thousands of variables available, but only a fraction of them actively contribute to the calculations.
How to Measure the Level of Redundancy?
To measure redundancy, we can in theory look at the ratio of variables used by products to the total number of variables in the library. A lower percentage suggests higher redundancy. Fortunately, getting the number of variables used by products in the workspace isn’t guesswork; we can use Prophet’s diagnostic files to find the exact numbers. Of course, if we also count the indicators and variable definitions that aren’t used by any product, the metric gives a fuller picture.
Although the diagnostic files give us the number of variables used by all products, some of those variables may be set to zero or may simply pass through results from other variables, which obscures whether they’re truly in use. This makes it harder to understand the "REAL" level of redundancy in the Prophet model.
From my experience building Prophet models from scratch, particularly before the implementation of IFRS 17, most models needed only around 1,200 to 1,500 variables to cover the full range of ordinary and investment-linked products (the same goes for family takaful products, which are more complex than conventional life insurance). This benchmark gives us a rough idea of how many variables are genuinely necessary, and it highlights just how much room for optimization there may be in models that exceed this range.
For example, I’ve seen many Prophet models with variable counts as high as 4,000 to 5,000 or more, but to keep things simple, let’s use a 4,500-variable model for illustration. If we find that only 1,500 of those variables are active in calculations, then just 33.33% are being used, while 66.67% of variables serve no functional purpose in the model. While this is an estimate, it shows how widespread redundancy can be in many models.
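The arithmetic behind that illustration can be sketched in a few lines of Python. This is only a rough sketch: the function name and the returned keys are my own, and the two counts are typed in by hand rather than read from any Prophet file.

```python
def redundancy_summary(total_variables: int, used_variables: int) -> dict:
    """Summarise how much of a library is actually used.

    Both counts are assumed to be supplied by the user (for example,
    copied by hand from the model's diagnostics); nothing is read from
    Prophet here.
    """
    if total_variables <= 0:
        raise ValueError("total_variables must be positive")
    if not 0 <= used_variables <= total_variables:
        raise ValueError("used_variables must be between 0 and total_variables")

    used_pct = 100.0 * used_variables / total_variables
    return {
        "used_pct": round(used_pct, 2),
        "redundant_pct": round(100.0 - used_pct, 2),
        "redundant_count": total_variables - used_variables,
    }


# The 4,500-variable illustration from the text.
summary = redundancy_summary(total_variables=4500, used_variables=1500)
print(summary)
# {'used_pct': 33.33, 'redundant_pct': 66.67, 'redundant_count': 3000}
```

The same calculation could be repeated for indicators and variable definitions if those counts are available, to get the fuller picture mentioned earlier.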
Does It Really Matter? Yes, It Does.
At this point, you might be wondering, “What’s the problem with redundancy? I'm not using those variables anyway.” A fair question! If it’s harmless, why should we care?
Unfortunately, redundancy DOES bring issues to our Prophet models. Here’s something to consider: in the example above, you’re utilizing only about 33% of what’s in the model. This raises a few important questions:
- Do you understand what the other 3,000 variables are for?
- If you make a modification to the library, how confident are you that it won’t impact any of those 3,000 unused variables?
- Are you certain those 3,000 variables don’t have hidden dependencies on the 1,500 active ones?
One of the challenges with redundancy is that unused variables today won’t necessarily stay unused forever. When we set up a new product, we may need to include an indicator or a variable that hasn’t been used previously, pulling new variables and variable definitions into the active calculation mix. This often happens when a new product design calls for something unique or tailored that wasn’t part of our prior product setups.
Here’s where the real risk kicks in: those newly introduced variables might interact with existing ones that are already in use by other products. This can lead to unforeseen interactions or even conflicts in the calculations, especially if those pre-existing variables have underlying definitions that shift as new elements come into play.
This is why we can’t simply ignore variables that are currently inactive in our model. As our product lineup evolves, those seemingly dormant variables could enter the calculation stream.
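To make the idea concrete, here is a minimal sketch of how one might trace what a newly switched-on variable pulls into the calculation. It assumes we already have a simple map of "variable -> variables it refers to", built by whatever means are available; the map, the variable names, and the function are all hypothetical and are not part of Prophet.

```python
from collections import deque


def pulled_in(dependencies: dict, start: str) -> set:
    """Return `start` plus every variable reachable from it by following references."""
    seen = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for ref in dependencies.get(current, ()):
            if ref not in seen:
                seen.add(ref)
                queue.append(ref)
    return seen


# Hypothetical dependency map: each variable lists the variables it refers to.
dependencies = {
    "NEW_RIDER_BEN": {"DORMANT_LOADING", "PREM_INC"},
    "DORMANT_LOADING": {"EXP_INFL", "DORMANT_FACTOR"},
    "PREM_INC": {"ANN_PREM"},
}
# Hypothetical set of variables already used by existing products.
active = {"PREM_INC", "ANN_PREM", "EXP_INFL"}

reached = pulled_in(dependencies, "NEW_RIDER_BEN")
newly_activated = sorted(reached - active)  # dormant variables that wake up
touches_active = sorted(reached & active)   # active variables they interact with

print(newly_activated)  # ['DORMANT_FACTOR', 'DORMANT_LOADING', 'NEW_RIDER_BEN']
print(touches_active)   # ['ANN_PREM', 'EXP_INFL', 'PREM_INC']
```

In this toy example, adding one new variable wakes up two dormant ones and touches three variables that other products already rely on. In a library with thousands of unused variables, that second list is the one to worry about.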
Does Your Prophet Model Have Excessive Patching?
For Prophet models with a high level of redundancy, excessive patching has become common practice. This often leads to layers of replicated variable definitions and added indicators used solely to bypass existing settings. For example, an expression might end with an “AND BYPASS_INDICATOR” to sidestep a variable definition rather than addressing the core setup. What about the existing variable definitions without the bypassing Indicators? Well, I think most Prophet modelers will tend to leave them in the library. There is no choice: we DO NOT KNOW what will happen if we delete them.
These temporary fixes may seem harmless, but they may create serious risks in the future. When Indicators that weren’t previously used suddenly come into play for a new product or new calculation requirements, they can trigger unexpected outcomes and calculation issues.
When we use a bypassing Indicator in our Prophet models, we don't really solve the problem as a whole. We are addressing only the symptoms, just as a common Chinese idiom puts it: "treating a headache by curing the head, treating foot pain by curing the foot". Worse, when we incorporate such a bypassing Indicator, we are often unable to add it to every variable that needs it, because a library with a high level of redundancy has too many variables to go through.
To make matters worse, for Prophet models that cannot fully pass technical validation using Prophet’s “Validate” function, these issues are hard to detect in advance, as patches may mask underlying errors that only surface later.
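As a rough illustration of that coverage problem, the sketch below checks which variables on a "should be patched" list do not mention the bypass indicator in their definition text. Everything here is an assumption for illustration: the definitions are plain placeholder strings held in a Python dictionary (not real Prophet syntax or a real export), and the variable and condition names are made up.

```python
import re


def missing_bypass(definitions: dict, required: list,
                   indicator: str = "BYPASS_INDICATOR") -> list:
    """List the required variables whose definition text never mentions the indicator.

    A variable with no definition at all is also reported as missing.
    """
    # Whole-word match, so BYPASS_INDICATOR_OLD does not count as BYPASS_INDICATOR.
    pattern = re.compile(rf"\b{re.escape(indicator)}\b")
    return sorted(
        name for name in required
        if not pattern.search(definitions.get(name, ""))
    )


# Hypothetical definition text keyed by variable name.
definitions = {
    "VAR_A": "COND_1 AND BYPASS_INDICATOR",
    "VAR_B": "COND_2",
    "VAR_C": "COND_3 AND BYPASS_INDICATOR_OLD",
}
# Hypothetical list of variables the patch was supposed to cover.
required = ["VAR_A", "VAR_B", "VAR_C", "VAR_D"]

missing = missing_bypass(definitions, required)
print(missing)  # ['VAR_B', 'VAR_C', 'VAR_D']
```

A check like this only finds the gaps in a patch; it does nothing about the underlying setup that made the patch necessary in the first place.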
Conclusion
Redundancy in Prophet models may seem benign at first glance, but as we've explored, it brings risks and complexities that can hinder both model accuracy and maintainability. With too many unused variables, patching shortcuts, and bypass indicators, a model quickly becomes unwieldy and prone to errors—especially when new products enter the mix or new calculation requirements arise. By reducing redundancy, we’re not just keeping our Prophet models lean; we’re ensuring they remain effective, adaptable, and much easier to validate.
That's one of the reasons I recommend creating Prophet models from scratch, as well as avoiding centralization of Prophet models across different business units.
Related Posts:
- What Makes a “Good” Prophet Model? (1) - Avoid Black Box