Skip to main content

Using View Dataset to Count Foot Traffic

Requirement Scenario

The data source is a slowly changing dimension table that records daily user check-in time information, including two time points: entry time and exit time. Now we need to filter by date range and count the foot traffic changes per minute during this period. Similar scenarios include: foot traffic changes in shopping malls, tourist attractions, laboratories, etc., online live streaming room viewer monitoring, ratings statistics, etc. The effect diagram is as follows.

image.png

Implementation Challenges

  1. The dashboard requires minute-level granularity display, but the two time points in the dataset represent a time range. If ETL is used to expand the date range to minute level, the data volume will expand dramatically, which is unfriendly to both the system and report loading; moreover, ETL does not support dynamic calculation with linked global parameters, so ETL is not recommended.

  2. In the card, the minute dimension depends on the dataset and will be affected by filter conditions, making it impossible to always be fixed at 0~59. To achieve a dimension that is not affected by the dataset and filter conditions, global parameters and view dataset need to be used here.

  3. Due to BI functionality limitations, date filters with hours, minutes, and seconds cannot link to global parameters, so we need to separate date and hour into 2 filters, temporarily not considering minutes and seconds. For example, selecting hour 8 means filtering the period from 8:00 to 8:59 in the morning.

Implementation Method

1. Create "View Dataset"

Select the slowly changing dimension table dataset to use, and write the SQL statement as follows. From the middle parameter bar, find a date-type global parameter for date filtering (e.g., [DYNAMIC_PARAMS.Query Date]), and prepare a text-type multi-select parameter for hour numbers (e.g., [DYNAMIC_PARAMS.Multi-select], so the filter can multi-select; for single selection, you can also use single-select text parameters or numeric parameters).

select h.`Hour`,m.`Minute`, count(distinct input1.`User`) as `User Count` 
from
(select * from
(select distinct date(input1.`Entry Time`) as `Date` , explode(sequence(0,23)) as `Hour`,1 as `Constant`
from input1 )
where `Date`=date([DYNAMIC_PARAMS.Query Date]) and `Hour` in ([DYNAMIC_PARAMS.Multi-select])) h
left join (select explode(sequence(0,59)) as `Minute`,1 as `Constant`) m on h.`Constant`=m.`Constant`
left join input1
on (h.`Hour`> hour(input1.`Entry Time`) and h.`Hour`=minute(input1.`Entry Time`)) or (h.`Hour`= hour(input1.`Exit Time`) and m.`Minute`<=minute(input1.`Exit Time`))
group by h.`Hour`,m.`Minute`
order by h.`Hour`,m.`Minute`

image.png

Comments

Part1: Get the date and hour numbers filtered by the page, and implement cross join to get temporary table h. Filtering 2022-11-25 and hours 8, 9, the actual temporary table data is as follows.

image.png

Part2: Create temporary table m, expand from 0 to 59, use constant field to join with temporary table h (at this time the effect is equivalent to cross join), to get a time dimension table containing query date, query hour, and complete minute numbers, totaling 120 rows.

image.png

Part3: Use the time dimension table obtained above to continue left join with the slowly changing dimension table dataset, and perform deduplication count on user ID. Pay attention to the join logic: the hour of "Entry Time" needs to match minute numbers greater than or equal to the "Entry Time"; the hour of "Exit Time" needs to match minute numbers less than or equal to the "Exit Time"; hours in between all need to match.

2. Create Dashboard

  1. Use the created view dataset to create a new "Line Chart" card, drag the fields to be displayed into the dimension bar and value bar, set sorting and chart styles, save and return to the page (due to parameter default value settings, preview may not be available, don't worry for now; you can also modify default values from the right side to preview the effect).

image.png

  1. Create a new date filter, select "Equal" as the comparison operator, set default values, then link to the date parameter in the card.

Note: Do not check "Provide hour, minute, second selection" because adding hour, minute, second will prevent linking to any parameters; also do not set "Range", multiple dates will seriously affect the visual effect of line charts with minutes as dimensions; default values must be set, when there are no filter items, the parameter's own default value takes effect, and the chart may have no data display.

image.png

  1. Create a manual dimension table containing hour numbers (0~23) locally in Excel, upload it to GuanBI, use this dataset to create a "Selection" type filter, set single or multi-select as needed, and set default values (required); then link to the text parameter in the card. The effect when hour numbers are multi-select is shown in the figure below.

image.png