Resource Lineage
1. Overview
Resource lineage is a key concept in the data lifecycle. It describes where data originates and the complete path it takes to its current location, which makes data easier to manage. Specifically, resource lineage refers to the lineage analysis and impact analysis performed on a resource as a whole. Typical resources in BI include datasets, dashboard pages, applications, and cards. Field lineage refers to the lineage analysis and impact analysis performed on specific fields within a resource; it traces how a data field flows between different data resources.
Users can open the "Resource Lineage" entry on each type of resource to see that resource's global lineage, or switch to the finer-grained "Field Lineage" view to see the impact scope of a field change at a glance.
- Looking upstream ("Lineage Analysis"): who produced "me"? Lineage analysis traces and records key information about the sources of the analysis object.
- Looking downstream ("Impact Analysis"): what does "me" feed into? Impact analysis shows the downstream data that depends on the analysis object, so users can quickly gauge the effect of a metadata change and assess the risk.
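The two directions can be pictured as traversals of a directed graph. The following is a minimal sketch, not Guandata's implementation: the resource names, the `EDGES` list, and both function names are hypothetical and chosen only for illustration.

```python
from collections import deque

# Hypothetical lineage edges: (upstream resource, downstream resource).
EDGES = [
    ("account:mysql", "dataset:orders"),
    ("dataset:orders", "etl:daily_sales"),
    ("etl:daily_sales", "dataset:sales_summary"),
    ("dataset:sales_summary", "card:revenue"),
    ("card:revenue", "page:overview"),
]


def _reachable(start, edges, downstream):
    """Return every resource reachable from start, walking edges in one direction."""
    neighbors = {}
    for up, down in edges:
        src, dst = (up, down) if downstream else (down, up)
        neighbors.setdefault(src, []).append(dst)
    seen, queue = set(), deque([start])
    while queue:
        node = queue.popleft()
        for nxt in neighbors.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def lineage_analysis(node, edges=EDGES):
    """Looking upstream: everything that produced this resource."""
    return _reachable(node, edges, downstream=False)


def impact_analysis(node, edges=EDGES):
    """Looking downstream: everything that depends on this resource."""
    return _reachable(node, edges, downstream=True)


print(sorted(lineage_analysis("dataset:sales_summary")))
print(sorted(impact_analysis("dataset:sales_summary")))
```

In this toy graph, lineage analysis of the summary dataset returns the ETL, the source dataset, and the data account, while impact analysis returns the card and the page built on it.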
2. Usage Guide
2.1. Resource Lineage
"Resource Lineage" shows how data applications, analysis dashboards, ETL, datasets, and other resources are associated with each other, so the direction in which data flows through an analysis is always visible. This makes it quick to understand a resource's upstream and downstream dependencies and to assess the risk of deleting or modifying a resource. It also helps locate problems during troubleshooting: trace upstream to find the key node that introduced an indicator calculation problem, or assess downstream how far a change to an indicator's calculation definition will reach.
View Resource Lineage
Resource lineage can be viewed from the following entries: the data account list, dataset list/details, visualization analysis cards, pages, applications, data screens, and ETL list/details.
Analyze the resource lineage of a dataset
- Go to the Data Center - Dataset interface and, in the operation bar on the right, click the "View Resource Lineage" button as shown in the figure below to open the resource lineage details page and view the relationships between resources.
- Resource types include data accounts, datasets, ETL, dashboards, data screens, applications, etc. On the resource lineage page, you can also check "Lineage Analysis" and "Impact Analysis" to view the complete lineage information.

- In the lineage canvas, two levels of upstream nodes and two levels of downstream nodes are expanded by default. Click a node's expand control to continue tracing the lineage.

- If the resource lineage chain is complex, you can use the canvas's auxiliary positioning tools to find your way around.

- For database-type datasets, tracing upstream shows which tables in the database the dataset is associated with.
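The default two-level expansion can be thought of as a depth-limited walk in both directions from the analysis object. The sketch below assumes "two levels" means two hops upstream and two hops downstream; the `visible_nodes` function and the sample chain are hypothetical and do not reflect how the canvas is actually implemented.

```python
def visible_nodes(center, edges, max_depth=2):
    """Map each initially visible resource to its signed level.

    Negative levels are upstream of the analysis object, positive levels
    are downstream, and 0 is the analysis object itself.
    """
    levels = {center: 0}
    for sign in (-1, 1):  # -1 walks upstream, +1 walks downstream
        frontier = [center]
        for depth in range(1, max_depth + 1):
            next_frontier = []
            for up, down in edges:
                # Orient the edge so that src is the side we are walking from.
                src, dst = (down, up) if sign == -1 else (up, down)
                if src in frontier and dst not in levels:
                    levels[dst] = sign * depth
                    next_frontier.append(dst)
            frontier = next_frontier
    return levels


# Hypothetical chain of seven resources: a -> b -> c -> d -> e -> f -> g.
CHAIN = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "e"), ("e", "f"), ("f", "g")]

print(visible_nodes("d", CHAIN))
```

With `d` as the analysis object, `a` and `g` sit three hops away and stay collapsed until a node is expanded manually, which corresponds to calling the function with a larger `max_depth`.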

Analyze the resource lineage of a card
- Go to the data analysis interface and, on the card you want to analyze, click the "View Resource Lineage" button as shown in the figure below to open the resource lineage details page.

- On the resource lineage page, both cards and dashboards can be used as the analysis object to display upstream and downstream lineage.

View Information and Node Switching
In the resource lineage interface, Guandata BI lets users view the information of any resource node and switch the analysis node.
- Click a node to highlight its border. The right side of the page then displays the node's details, such as its creation time, location path, and status. You can also click the view button at the bottom of the panel to jump directly to the node's resource details page.

- For ETL and dataset nodes, the details also show the last update time and status.

- To switch the analysis object, click the "..." button of any node and select "Switch Analysis Object" from the dropdown menu. The selected node becomes the analysis object, and the canvas displays its upstream and downstream resource lineage chain.
Note: Currently, the node switching function is visible only to administrators. The permission scope of this function will be refined in a future release.

Batch Operations
Resource nodes support batch operations, mainly batch deletion and application unbinding.
- Batch Deletion
Note: Currently, the batch deletion function is visible only to administrators. The permission scope of this function will be refined in a future release.
- Click "Batch Operations" to show a checkbox on each node, then check the resource nodes you want to operate on.

- Click "Batch Delete" to delete all selected resource nodes. Deleted dashboards are moved to the recycle bin; other resources cannot be recovered.

- A node that still has downstream dependencies cannot be deleted. Nodes that failed to delete are listed in the "Delete Failure Prompt" popup.

- Click "Show Downstream Lineage" to see exactly which downstream resources are associated with the node. Once you have assessed their importance and still want to delete, click "Delete Lineage Tree" in the lower right corner to delete all deletable resource nodes in the operation area in one step.

- After a successful deletion, the node is shown faded with a dotted outline and disappears once the page is refreshed. Deletion is irreversible, so proceed with caution.
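The dependency rule behind batch deletion can be sketched as follows. This is an illustrative model only: `batch_delete`, `deletion_order`, and the sample edges are hypothetical, and the sketch assumes that any downstream resource blocks deletion and that a lineage tree is removed from its leaves upward, neither of which is confirmed by the product documentation.

```python
def batch_delete(selected, edges):
    """Split the selected nodes into deletable ones and blocked ones.

    A node is blocked when at least one downstream resource still depends
    on it; the blockers are returned so they can be shown to the user.
    """
    downstream = {}
    for up, down in edges:
        downstream.setdefault(up, set()).add(down)
    deleted, failed = set(), {}
    for node in selected:
        blockers = downstream.get(node, set())
        if blockers:
            failed[node] = sorted(blockers)
        else:
            deleted.add(node)
    return deleted, failed


def deletion_order(root, edges):
    """List root and its downstream tree so that dependents come first."""
    downstream = {}
    for up, down in edges:
        downstream.setdefault(up, []).append(down)
    order, seen = [], set()

    def visit(node):
        if node in seen:
            return
        seen.add(node)
        for child in downstream.get(node, []):
            visit(child)
        order.append(node)  # appended only after all of its dependents

    visit(root)
    return order


# Hypothetical edges: (upstream resource, downstream resource).
SAMPLE_EDGES = [
    ("dataset:sales", "card:revenue"),
    ("card:revenue", "page:overview"),
    ("dataset:sales", "card:orders"),
]

deleted, failed = batch_delete(["dataset:sales", "page:overview"], SAMPLE_EDGES)
print(deleted, failed)
print(deletion_order("dataset:sales", SAMPLE_EDGES))
```

Here the page has no dependents and can be deleted, while the dataset is reported as failed together with the two cards that block it. Walking the tree in post-order gives one safe order in which the whole chain could be removed.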

- Application Unbinding
To delete a dashboard while keeping the applications that use it, first unbind the dashboard node from those applications.
Click the "..." button of the dashboard resource node and select "Unbind All Related Applications" from the dropdown menu. All applications related to that dashboard are then unbound.

After unbinding, the dashboard no longer appears in those applications (to restore it, re-add it in the application details). In other words, the dashboard's downstream lineage tree no longer contains applications, so a subsequent deletion removes only the dashboard.

2.2. Field Lineage
In the past, when data consumers noticed that an indicator on a dashboard did not match their historical experience and suspected a data problem, data developers had to find the affected card on the dashboard, trace the datasets it depends on, and then check the upstream ETL and data sources of each suspect dataset one by one in the dataset lineage to determine which step introduced the problem. If the problem was not introduced within the BI platform, they also had to trace back to the upstream database tables. The whole process was cumbersome and inefficient.
To address this, Guandata BI introduced the "Field Lineage" function. Based on data lineage, users can find the upstream and downstream datasets, ETL, and cards associated with an indicator, trace upstream to find the key node that introduced an indicator calculation problem, and assess downstream how far a change to the indicator's calculation definition will reach. This makes it faster for data developers to troubleshoot data problems.
Prerequisites
1. The field lineage function is controlled by a system backend switch. To use it, enable the switch in the administrator backend.
2. If ETL nodes are involved in the upstream or downstream relationships, the ETL must run at least once to generate field lineage information.
View Field Lineage
- During troubleshooting, to see the lineage of a particular field, switch to the field lineage tab.
Note: Only datasets and cards support viewing field lineage.

- In the field list on the left, check the fields you want to analyze. The canvas area on the right displays the corresponding lineage and impact analysis results.

- Switch to the resource list tab to view all resources involved in the currently checked fields.

- Open the editing page of a Smart ETL and click one of its operator nodes to run field lineage analysis on that node. After you select fields, the canvas highlights the complete lineage chain of each selected field from start to end, and the nodes on the chain can display the names of the related lineage fields.

Note:
1. Currently, only datasets and cards support viewing field lineage;
2. The field lineage of an ETL output dataset is updated after each ETL run. If the field lineage of an ETL output dataset is missing or incorrect, for example right after the switch is enabled, re-run the ETL and then check the field lineage again.
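Conceptually, field lineage applies the same tracing at a finer grain: each node is a (resource, field) pair rather than a whole resource. The sketch below is a simplified model under that assumption; `FIELD_SOURCES`, `upstream_fields`, and `involved_resources` are hypothetical names, and the mapping is invented sample data.

```python
# Hypothetical mapping: downstream field -> the upstream fields it is derived from.
# Each field is identified by a (resource, field name) pair.
FIELD_SOURCES = {
    ("card:revenue", "revenue"): [("dataset:sales_summary", "revenue")],
    ("dataset:sales_summary", "revenue"): [
        ("dataset:orders", "price"),
        ("dataset:orders", "quantity"),
    ],
}


def upstream_fields(field, sources=FIELD_SOURCES):
    """Return every field that the given field is directly or indirectly derived from."""
    seen = set()
    stack = list(sources.get(field, []))
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        stack.extend(sources.get(current, []))
    return seen


def involved_resources(fields):
    """Collapse a set of fields to the resources that contain them."""
    return {resource for resource, _ in fields}


lineage = upstream_fields(("card:revenue", "revenue"))
print(sorted(lineage))
print(sorted(involved_resources(lineage)))
```

The second function mirrors the idea of the resource list tab: given the fields on a lineage chain, it reports which resources they belong to.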
2.3. Export Model Documentation
Guandata can automatically export basic documentation at the dataset and field level based on the platform's data lineage. The "Export Resource Lineage" entry is in the upper right corner of the resource lineage canvas, and the "Export Model Documentation" entry is on the field lineage canvas.
For resource lineage, the export is organized around ETL output datasets: it traverses every ETL output dataset in the canvas, finds the ETL that produces it and that ETL's input datasets, and lists the inputs one by one.
For field lineage, the export is organized around the current analysis object: it traverses all fields one layer upstream and one layer downstream of the analysis object, finds the source fields and source inputs of those fields, and lists the inputs field by field.
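The resource lineage export described above amounts to one row per (output dataset, ETL, input dataset) combination. The following sketch shows that traversal on invented data; the `ETLS` structure, its keys, and `export_resource_lineage` are hypothetical, and the real export format is not specified here.

```python
# Hypothetical ETL definitions with their input and output datasets.
ETLS = [
    {
        "name": "etl:daily_sales",
        "inputs": ["dataset:orders", "dataset:products"],
        "outputs": ["dataset:sales_summary"],
    },
    {
        "name": "etl:monthly_rollup",
        "inputs": ["dataset:sales_summary"],
        "outputs": ["dataset:monthly_sales"],
    },
]


def export_resource_lineage(canvas_datasets, etls):
    """Build one row per input of each ETL output dataset shown in the canvas."""
    rows = []
    for dataset in canvas_datasets:
        for etl in etls:
            if dataset not in etl["outputs"]:
                continue  # this ETL does not produce the dataset
            for source in etl["inputs"]:
                rows.append((dataset, etl["name"], source))
    return rows


for row in export_resource_lineage(
    ["dataset:orders", "dataset:sales_summary", "dataset:monthly_sales"], ETLS
):
    print(row)
```

Datasets that no ETL produces, such as `dataset:orders` in this sample, contribute no rows because the traversal only starts from ETL output datasets.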

2.4. Audit Log Supports Lineage Access Records
The audit log records user access to the [Resource Lineage] and [Field Lineage] modules of data lineage, mainly the access time, the accessing user, and related information.
