Cloud Inspection Introduction
1. Overview of Cloud Inspection
Cloud Inspection (also known as Cloud Diagnosis or Intelligent O&M) is an intelligent operations and maintenance (O&M) service provided by Guandata. It packages Guandata's years of experience in digital management technology as a product, offering a one-stop, end-to-end service that makes IT operations more intelligent.
Cloud Inspection focuses on the cluster resources and running status of the BI system. It requires no manual data extraction or analysis: it automatically generates visual analysis reports that quickly identify O&M issues, help eliminate faults proactively, and provide optimization suggestions and solutions. This reduces the cost of routine O&M work and enables proactive capacity planning.
Tip: If you want to try this module, please contact your Guandata Customer Success Manager (usually your company's current service contact).
2. Application Scenarios of Cloud Inspection
2.1 System Health Assessment
Comprehensive evaluation, in-depth analysis, and proactive risk or issue prevention.
- Collects and analyzes system health statistics across multiple dimensions, including system performance/capacity, dashboards/cards, datasets, ETL, and users, to provide in-depth inspection.
- Builds a system health scoring system based on expert experience, giving a clear overview of the system's health level.
- Automatically generates O&M reports in the cloud; scheduled push by subscription will be supported in the future.
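As a rough illustration of how a scoring system of this kind could roll several dimensions into one number, the sketch below computes a weighted average over the five dimensions listed above. The weights, the 0-100 scale, and the function name `health_score` are assumptions made for this example; the source does not describe Guandata's actual scoring rules.

```python
# Hypothetical relative weights per inspection dimension (not Guandata's real values).
WEIGHTS = {
    "system": 3,
    "dashboards_cards": 2,
    "datasets": 2,
    "etl": 2,
    "users": 1,
}


def health_score(dimension_scores: dict[str, float]) -> float:
    """Return the weighted average of per-dimension scores (each assumed to be 0-100)."""
    missing = WEIGHTS.keys() - dimension_scores.keys()
    if missing:
        raise ValueError(f"missing dimension scores: {sorted(missing)}")
    total_weight = sum(WEIGHTS.values())
    weighted = sum(WEIGHTS[name] * dimension_scores[name] for name in WEIGHTS)
    return round(weighted / total_weight, 1)
```

With these example weights, a low score in the `system` dimension pulls the overall score down more than a low score in `users`.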
2.2 System Risk Identification
Comparison and reasoning based on expert experience and industry best practices to predict risks in advance and eliminate problems at the source.
- Analyze reliability characteristic data with expert experience to enhance O&M professionalism.
- Detect performance anomalies in time, resolve risks while the system is still in a sub-healthy state, and identify performance bottlenecks.
- Proactively identify risks, turning passive handling into proactive prevention and improving reliability.
2.3 Resource Capacity Planning (Coming Soon)
Capacity monitoring, accurately identifying overloaded or idle resources; providing optimal expansion solutions based on industry best practices.
- Predict future capacity trends and provide optimal expansion solutions from the perspective of cost and performance.
- Identify overloaded resources, provide early capacity warnings, and reduce potential risks.
- Identify idle resources, provide optimal recycling suggestions, and improve resource utilization.
3. How to Use Cloud Inspection
Cloud Inspection is available in version 4.4.0 and above. The usage process is as follows:
3.1 Open Cloud Inspection
Click the grid icon in the upper right corner of the platform and select Cloud Inspection to enter the Cloud Inspection interface.


3.2 Obtain Cloud Inspection Report
Cloud Inspection reports are available in online and offline versions, depending on whether your environment can access the internet.
3.2.1 Online Cloud Inspection
Click once to get the latest inspection report.

3.2.2 Offline Cloud Inspection
If your environment cannot directly access external services (such as the Cloud Inspection service), you can perform Cloud Inspection as follows:
Step 1: Manually export the data and send it to Guandata service personnel
In the "Data Selection" section, click "Export Data" to export the last 30 days of data.
After exporting, send the data to Guandata service personnel.

Step 2: Guandata service personnel generate the inspection report and send you the report link
After receiving your data, Guandata personnel will manually generate the corresponding report in the Cloud Inspection service backend and send you the report link.
You can view the Cloud Inspection results through this link.
3.3 Switching Cloud Inspection Reports
3.3.1 Date Switch
Click the "Date" selection box in the upper left corner of the Cloud Inspection interface to view historical reports. Select a date in the dropdown (each date also serves as the report name), and the interface displays the details of that report.

3.3.2 Domain Switch
Click the "Domain" selection box in the upper left corner of the interface. You can select a domain in the dropdown and switch to view data from different domains.

3.4 Cloud Inspection Report Content
The Cloud Inspection report area includes three different interpretation modes: Inspection Report Overview, System O&M Interpretation, and Business Governance Interpretation.

3.4.1 Inspection Report Overview
The overview mainly includes five categories: system performance/capacity, dashboards/cards, datasets, ETL, and users. You can quickly view them by clicking the navigation bar on the right.

The inspection report will display potential anomaly analysis, alert users to risks, and provide diagnostic suggestions for reference.
For example, take the metric "Top 20 Datasets with the Most Update Failures in the Last 31 Days". When datasets in the user's environment fail to update repeatedly, or even continuously, the Cloud Inspection report flags the 20 datasets with the most such failures (as shown in the figure).
In this case, it is recommended to first change the update method of these datasets to "manual update" so that repeated failed updates stop wasting system resources. Then, as appropriate, move the datasets to a single folder and troubleshoot them one by one. The steps are as follows:
Step 1: Click the "Batch Operation" button.

Step 2: Select "Modify Update Settings" to change the problematic datasets to "manual update".


Step 3: Select "Batch Move" to move these datasets to a unified folder for subsequent tracking and troubleshooting.
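To make the example metric concrete, a "top 20 by update failures" ranking can be pictured as a simple count over update records. This is a minimal sketch, not how Cloud Inspection computes the metric (the source does not say). The record shape (`dataset`, `succeeded`) and the function name `top_failing_datasets` are hypothetical, and the records are assumed to be already limited to the last 31 days.

```python
from collections import Counter
from typing import Iterable, NamedTuple


class UpdateRecord(NamedTuple):
    dataset: str     # dataset name (hypothetical field)
    succeeded: bool  # whether this update run succeeded (hypothetical field)


def top_failing_datasets(
    records: Iterable[UpdateRecord], limit: int = 20
) -> list[tuple[str, int]]:
    """Return up to `limit` (dataset, failure_count) pairs, most failures first."""
    failures = Counter(r.dataset for r in records if not r.succeeded)
    # Sort by failure count descending, then by name so ties are deterministic.
    return sorted(failures.items(), key=lambda item: (-item[1], item[0]))[:limit]
```

The resulting list corresponds to the datasets you would select in the batch operation steps above.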

Other supported operations for metrics are as follows:
- Batch moving datasets is supported for these metrics: datasets with no consumption, datasets generating invalid consumption, top 20 extracted datasets by runtime in the last 31 days (runtime ≥10s), top 20 datasets with the most update failures in the last 31 days, datasets not updated in the last 31 days, empty datasets;
- Batch modifying the update method of datasets is supported for these metrics: datasets with no consumption, top 20 datasets with the most update failures in the last 31 days;
- Batch moving ETLs is supported for these metrics: top 20 ETLs by CPU usage time in the last 31 days (CPU usage time ≥10s), top 20 ETLs by update runs in the last 31 days (runs ≥5), top 20 ETLs with the most update failures in the last 31 days (runs ≥5), ETLs not run in the last 31 days, ETLs created more than 31 days ago but not yet run;
- Batch modifying the update method of ETLs is supported for these metrics: top 20 ETLs with the most update failures in the last 31 days (runs ≥5), top 20 ETLs by CPU usage time in the last 31 days (CPU usage time ≥10s), top 20 ETLs by update runs in the last 31 days (runs ≥5);
- Batch modifying the publish status of dashboards is supported for this metric: dashboards with 0 visitors in the last 31 days.
3.4.2 System O&M Interpretation
The System O&M Interpretation mode provides cause analysis, troubleshooting approaches, and optimization suggestions for common problem scenarios. Users can view the metric information and, following the guidance, jump directly to the related configuration to modify it.
- Experience scenarios: Card loading, ETL operation, dataset;
- Performance scenarios: Disk operation and maintenance, memory load, server resource configuration.

3.4.3 Business Governance Interpretation
Function Background
- System administrators need to regularly review the data assets in the system, such as datasets, ETLs, dashboards, and cards, and check the related assets:
  - Unused or invalid data assets require governance operations (gradual decommissioning or deletion);
  - Data assets that place a heavy load on performance need to be assessed to decide whether governance operations are required.
- System administrators also monitor the performance of important, heavily used data assets to ensure a good user experience.
Function Introduction
The Business Governance Interpretation mode is divided into a machine resource usage inventory and a data asset management inventory. It provides review approaches and optimization suggestions based on the system resources consumed and the business value generated by datasets, ETLs, dashboards, and cards.
Following these approaches, system administrators can view the metric information, understand how data assets occupy and use platform resources, focus on the relevant assets and optimize them as needed, and manage the BI platform more effectively.

- Machine resource usage inventory

- Data asset management inventory

Inventory Scenario 3: Identifying Zombie ETLs in the System
In deployments with many users, some typical BI usage patterns can be optimized. In this module, users can see which ETLs have been running continuously since they were created even though the datasets and cards they generate have not produced any value.
By tracing the entire governance chain of ETL-dataset-card, users can use the batch operation function provided by Cloud Inspection to quickly and conveniently govern zombie ETLs, thereby saving computing resources and improving performance experience.
Note: If you need this function, please contact Guandata personnel.
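The ETL-dataset-card chain described above can be pictured as a walk from each ETL to its output datasets and on to the cards built on them. The sketch below is only an illustration under stated assumptions: the three input mappings, the view counts, and the rule "no card on any output dataset was viewed" are hypothetical stand-ins for whatever criteria Cloud Inspection actually applies, and only one level of the chain is traced.

```python
def find_zombie_etls(
    etl_outputs: dict[str, list[str]],    # ETL name -> datasets it produces
    dataset_cards: dict[str, list[str]],  # dataset name -> cards built on it
    card_views: dict[str, int],           # card name -> views in the period
) -> list[str]:
    """Return ETLs whose downstream cards had no views (hypothetical rule)."""
    zombies = []
    for etl, datasets in etl_outputs.items():
        # Collect every card built on any dataset this ETL produces.
        cards = [c for d in datasets for c in dataset_cards.get(d, [])]
        # An ETL with no downstream cards at all also counts as a zombie here,
        # because all() over an empty list is True.
        if all(card_views.get(c, 0) == 0 for c in cards):
            zombies.append(etl)
    return sorted(zombies)
```

A real chain may be deeper, for example a dataset that feeds another ETL, which this single-level sketch does not follow.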

3.5 Report Update Methods
3.5.1 Obtain Cloud Inspection Report
Click the "Get Latest Report" button in the lower right operation bar of the Cloud Inspection interface. Usually, after a short wait, you can see the latest report content on the interface.

In the report interface, the system automatically inspects the operation data of the last 30 days, counted back from the time at which the report was manually obtained (the data range is the previous day and the 30 days before it). After the inspection is complete, scroll down the page to view the details.
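The date range can be worked out with simple date arithmetic. The sketch below assumes an inclusive window that ends on the day before the report is obtained and spans `days` calendar days. The function name `inspection_window` and the default of 30 are assumptions for illustration; the wording above can also be read as a 31-day span, in which case `days=31` applies.

```python
from datetime import date, timedelta


def inspection_window(report_day: date, days: int = 30) -> tuple[date, date]:
    """Return (start, end) of an inclusive `days`-day window ending the day before `report_day`."""
    if days < 1:
        raise ValueError("days must be at least 1")
    end = report_day - timedelta(days=1)    # the previous day
    start = end - timedelta(days=days - 1)  # inclusive window start
    return start, end
```

For example, a report obtained on 2024-03-01 covers 2024-01-31 through 2024-02-29 with the default 30-day window.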
3.5.2 Update Method Settings
Click the "Settings" button in the lower right operation bar of the Cloud Inspection interface to enter the settings interface.

You can choose "Manual Online Update" or "Automatic Online Update". If you choose "Automatic Online Update", you can set the update time down to the minute.

3.6 View Cloud Inspection Report History
Click the "Update Records" button in the lower right operation bar of the Cloud Inspection interface to enter the update records interface and see the update history list.

In the update records list, click the specific report name in the first column "Report Name" to enter the detailed interface of the report.

3.7 Sharing and Interpretation of Cloud Inspection Reports
Click the "Professional Report Interpretation" button in the upper right corner of the Cloud Inspection interface to copy the report link and share the Cloud Inspection report. Others can open the Cloud Inspection report and export report data through this link. You can send the report to Guandata staff for professional interpretation.

3.8 Feedback on Cloud Inspection Reports
If you have feedback on the Cloud Inspection analysis and diagnosis report, click the "Feedback" button in the upper right corner of the Cloud Inspection interface and describe it in detail on the page, including the problem scenario, problem details, and expected solution. After you submit the form, Guandata staff will contact you about your feedback to resolve the problem.


4. Advantages of Cloud Inspection
4.1 Visual Inspection Report — Clear at a Glance
A single click is all it takes to view the visual diagnostic report. Cloud Inspection automatically collects statistics on system cluster resources and application usage, covering more than 100 inspection metrics, with no manual data collection. The analysis is comprehensive and the report is clearly laid out, so the overall situation can be seen at a glance.
4.2 Rich O&M Experience — Intelligent Interpretation
Drawing on Guandata's rich O&M experience and years of serving many customers, Cloud Inspection turns expert experience into tooling that interprets inspection reports intelligently and further diagnoses system status.
4.3 Efficient Action Guide — Practical and Executable
Cloud Inspection combines the specifics of the enterprise's system with Guandata's rich O&M experience and strategy rules to provide more accurate, intelligent, and comprehensive O&M suggestions. This gives enterprise users an actionable system optimization guide that keeps the system running continuously, stably, and efficiently.
4.4 Cloud Service Tooling — Zero Cost and Low Threshold
- Zero cost: no local computing resources are consumed; all calculations are done in the cloud.
- Low threshold: part of Guandata's one-stop service, with zero-code operation and a simple process.
- High growth: the Cloud Inspection platform is continuously updated and expanded, and feature updates require no extra work from users.