Custom status thresholds allow you to fine-tune telemetry monitoring. This article provides practical guidance for setting thresholds that balance responsiveness with noise reduction.
Why it matters
- Thresholds set too low can cause status ‘flapping’, so the reported status no longer reflects the integration's actual state.
- Thresholds set too high can delay detection of telemetry ingestion or collection issues.
The goal is to align thresholds with integration behaviour, business impact, and maintenance patterns.
Best Practices
1. Start Conservative, Then Optimize
- Begin with the default thresholds, and customize them for your operational needs only if they prove unsatisfactory.
- Monitor notification frequency and adjust to reduce false positives while maintaining reliability.
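To compare how noisy two candidate thresholds would be, one option is to replay recent ingest delays against each candidate and count how often the status would have changed. The following is a minimal sketch under stated assumptions: the function name `count_status_changes` and the sample delays are hypothetical, and the delays are assumed to be seconds since the last received event, sampled at regular checks.

```python
def count_status_changes(delays_seconds, threshold_seconds):
    """Count how many times the status flips between healthy and breached."""
    changes = 0
    previous = None
    for delay in delays_seconds:
        breached = delay >= threshold_seconds
        if previous is not None and breached != previous:
            changes += 1
        previous = breached
    return changes


# Hypothetical sample: delays hover around 300 s with one real outage.
delays = [280, 310, 290, 320, 295, 900, 300]

for candidate in (300, 600):
    print(candidate, count_status_changes(delays, candidate))
```

With this sample, the 300-second candidate changes status far more often than the 600-second one, which still catches the single large delay. A high change count relative to the number of real incidents suggests the threshold is too tight.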
2. Understand the Integration’s Normal Behaviour
- Review historical data through the events graph or via Advanced Query to determine typical ingest times or intervals.
- Avoid setting thresholds too aggressively, so that minor, non-critical fluctuations in system behaviour do not trigger notifications.
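One way to turn historical data into a starting point is to measure the gaps between consecutive events and look at the upper percentiles. The sketch below is illustrative only. It assumes event timestamps have been exported (for example from Advanced Query) into a list of `datetime` values; the function names, the 95th-percentile choice, and the 2x multiplier are all hypothetical and not product defaults.

```python
from datetime import datetime, timedelta
from statistics import quantiles


def ingest_gaps_seconds(timestamps):
    """Return the gaps, in seconds, between consecutive events."""
    ordered = sorted(timestamps)
    return [(b - a).total_seconds() for a, b in zip(ordered, ordered[1:])]


def suggest_thresholds(timestamps, warning_pct=95, critical_factor=2.0):
    """Suggest (warning, critical) thresholds from observed ingest gaps.

    Assumption: warning sits at a high percentile of normal gaps and
    critical is a fixed multiple of warning. Tune both to your needs.
    """
    gaps = ingest_gaps_seconds(timestamps)
    if len(gaps) < 2:
        raise ValueError("need at least three events to estimate intervals")
    # quantiles(n=100) returns 99 cut points; index p - 1 is the p-th percentile.
    cuts = quantiles(gaps, n=100, method="inclusive")
    warning = cuts[warning_pct - 1]
    critical = warning * critical_factor
    return timedelta(seconds=warning), timedelta(seconds=critical)


# Hypothetical history: one event every five minutes.
start = datetime(2024, 1, 1)
history = [start + timedelta(minutes=5 * i) for i in range(10)]
warning, critical = suggest_thresholds(history)
print(warning, critical)
```

Treat the result as a first estimate to review against the events graph, not as a value to apply blindly.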
3. Balance Sensitivity and Noise
- Warning Threshold: Set this to catch early signs of delay without triggering too often.
- Critical Threshold: Make this significantly higher than the warning threshold to indicate a real issue.
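The relationship between the two thresholds can be sketched as a small classification function. This is an illustration of the idea rather than how the product evaluates status: the function name `evaluate_status` and the status labels `healthy`, `warning`, and `critical` are assumptions made for the example.

```python
from datetime import timedelta


def evaluate_status(time_since_last_event, warning, critical):
    """Classify an integration by how long ago its last event arrived."""
    if critical <= warning:
        raise ValueError("critical threshold must be greater than warning threshold")
    if time_since_last_event >= critical:
        return "critical"
    if time_since_last_event >= warning:
        return "warning"
    return "healthy"


# Hypothetical thresholds: warn after 15 minutes, critical after 1 hour.
warning = timedelta(minutes=15)
critical = timedelta(hours=1)

for elapsed in (timedelta(minutes=5), timedelta(minutes=20), timedelta(hours=2)):
    print(elapsed, evaluate_status(elapsed, warning, critical))
```

Keeping a wide gap between the two values leaves time to investigate a warning before it escalates.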
4. Consider Business Impact
- For mission-critical integrations, use tighter thresholds.
- For non-critical integrations, allow more flexibility to avoid unnecessary notifications.
5. Account for Maintenance Windows
- If integrations have scheduled downtime or maintenance, adjust thresholds or disable notifications during those periods.
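If notifications are routed through your own tooling, suppression during scheduled maintenance can be sketched as a simple time-window check. The example below is a hypothetical illustration: the window format (weekday, start time, end time), the function names, and the status labels are assumptions, and windows are assumed not to cross midnight.

```python
from datetime import datetime, time


def in_maintenance_window(now, windows):
    """Return True if `now` falls inside any (weekday, start, end) window.

    Weekday uses datetime.weekday() numbering: 0 is Monday, 6 is Sunday.
    """
    for weekday, start, end in windows:
        if now.weekday() == weekday and start <= now.time() < end:
            return True
    return False


def should_notify(status, now, windows):
    """Notify only for non-healthy statuses outside maintenance windows."""
    return status != "healthy" and not in_maintenance_window(now, windows)


# Hypothetical window: Sundays from 02:00 to 04:00.
windows = [(6, time(2, 0), time(4, 0))]

print(should_notify("critical", datetime(2024, 1, 7, 2, 30), windows))
print(should_notify("critical", datetime(2024, 1, 7, 5, 0), windows))
```

The first call falls inside the Sunday window and is suppressed; the second falls after it and would notify.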
Common Pitfalls
- Ignoring seasonal patterns: Traffic spikes or quiet periods can skew thresholds.
- One-size-fits-all settings: Different integrations need different thresholds.
- Not revisiting thresholds: Review quarterly or after major changes.
For more information on Status Thresholds, including the steps to customize them, see Status Thresholds.