Regular Inspection Instructions

I. Significance of System Inspection

  • Monitoring and alerting systems raise an alarm only when a monitored metric reaches its configured threshold, so they give little visibility into resource usage trends. System stability also depends heavily on whether the alerts are effective and how promptly they are handled;

  • Regular inspections are an effective supplement to real-time monitoring. Collecting system operation trend data manually or automatically, then analyzing it and acting on the risks it reveals, improves operational reliability and reduces operational risk;

  • Finding problems is a skill, solving them is an ability, and preventing them is a mark of quality. Early detection and early intervention contain and eliminate risks at the initial stage, minimizing losses caused by system unavailability.

II. Basic Inspection Operation Steps

  1. Click the nine-grid icon in the upper right corner, select Administrator Settings, then Operation & Maintenance Management, then Resource Monitoring;

  2. In the upper right corner of the Resource Monitoring page, choose to view data from the last 7 days;

  3. Memory resource health range: average usage rate below 80%, peak usage rate below 90%;

  4. CPU resource health range: average usage rate below 60%, peak usage rate below 85%;

  5. Storage resource health range: peak usage rate below 85%.

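The health ranges in the steps above can also be checked with a short script once the 7-day usage figures have been noted down. The following is only a minimal sketch: the names HealthRange, HEALTH_RANGES, and check_resource are hypothetical, the usage percentages are assumed to have been read from the Resource Monitoring page by hand, and no Guandata BI interface is called.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class HealthRange:
    """Upper limits (in percent) that usage must stay below to be healthy."""
    max_average: Optional[float]  # None: no average limit is stated for this resource
    max_peak: float


# Thresholds copied from the inspection steps; the structure itself is hypothetical.
HEALTH_RANGES: Dict[str, HealthRange] = {
    "memory": HealthRange(max_average=80.0, max_peak=90.0),
    "cpu": HealthRange(max_average=60.0, max_peak=85.0),
    "storage": HealthRange(max_average=None, max_peak=85.0),
}


def check_resource(resource: str, samples: List[float]) -> List[str]:
    """Return a list of findings; an empty list means the resource is in range.

    `samples` are usage percentages (0-100) noted for the last 7 days.
    """
    if not samples:
        raise ValueError("at least one usage sample is required")
    limits = HEALTH_RANGES[resource]
    average = sum(samples) / len(samples)
    peak = max(samples)

    findings = []
    # "Below X%" means a value equal to X% is already outside the healthy range.
    if limits.max_average is not None and average >= limits.max_average:
        findings.append(
            f"{resource}: average {average:.1f}% is not below {limits.max_average:.0f}%"
        )
    if peak >= limits.max_peak:
        findings.append(
            f"{resource}: peak {peak:.1f}% is not below {limits.max_peak:.0f}%"
        )
    return findings


if __name__ == "__main__":
    # Illustrative figures only, not real measurements.
    for name, usage in {
        "memory": [70.0, 75.0, 92.0],
        "cpu": [50.0, 55.0, 60.0],
        "storage": [62.0, 63.5, 64.0],
    }.items():
        issues = check_resource(name, usage)
        print(name, "OK" if not issues else issues)
```

Keeping the thresholds in one table makes it easy to adjust them if a deployment uses different health ranges.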

III. Deep Inspection Function

Cloud Inspection Atlas Patrol is a value-added module built into Guandata BI. It focuses on BI cluster resources and system operation, and automatically generates visual diagnostic reports that make operation and maintenance inspections more efficient. With these reports, enterprises can keep track of current system load, find and resolve operation and maintenance problems early, and plan capacity in advance so that the system keeps running continuously, stably, and efficiently.