I'm working on a way to replicate Google Analytics sequential segments in BigQuery. I've come up with something similar, but I'm not convinced by it.
Let's suppose I want to know which users trigger event A and then event B. The best I can think of is to JOIN two subqueries, one per condition. Here is an extremely simplified example, which leaves out the UNNESTs and the real dot notation to make it easier to read:
SELECT
  COUNT(DISTINCT a.userId) AS users
FROM
  -- users who triggered Action 1
  (SELECT
     userId
   FROM
     ga_table
   WHERE
     event_category = "Event Category"
     AND event_action = "Action 1") a
INNER JOIN
  -- users who triggered Action 2
  (SELECT
     userId
   FROM
     ga_table
   WHERE
     event_category = "Event Category"
     AND event_action = "Action 2") b
ON
  a.userId = b.userId
My problems with this approach are:
- I could get the same result with a WHERE clause with multiple conditions. And the real sequences are going to be much longer than two events.
- I have no control over the order of the hits: I can't establish that A happens first and then B. I thought of using something like the hit number, but I don't know how to incorporate it into the query if I split it into JOINs (which I assume is not the way to do it).
- I can't have more than two conditions without generating more and more tables to join to one another, which is clearly not the right way to do it.
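One direction that might address both the ordering and the length problems, given here only as a rough sketch: build one ordered "path" string per user and match it with a regular expression, instead of adding a JOIN per step. Everything beyond the query above is an assumption: `hit_time` is a hypothetical ordering column (for example, something derived from the hit number), the inline sample rows are made up so the query is self-contained, and the step letters `A` and `B` are arbitrary labels.

```sql
-- Sketch only: sample rows and the hit_time column are hypothetical.
WITH ga_table AS (
  SELECT * FROM UNNEST([
    STRUCT('u1' AS userId, 1 AS hit_time,
           'Event Category' AS event_category, 'Action 1' AS event_action),
    ('u1', 2, 'Event Category', 'Action 2'),  -- u1: A then B
    ('u2', 1, 'Event Category', 'Action 2'),
    ('u2', 2, 'Event Category', 'Action 1'),  -- u2: B then A
    ('u3', 1, 'Event Category', 'Action 1')   -- u3: A only
  ])
),
steps AS (
  -- Keep only the hits that belong to the sequence and give each a one-letter label.
  SELECT
    userId,
    hit_time,
    CASE event_action
      WHEN 'Action 1' THEN 'A'
      WHEN 'Action 2' THEN 'B'
    END AS step
  FROM ga_table
  WHERE event_category = 'Event Category'
    AND event_action IN ('Action 1', 'Action 2')
),
paths AS (
  -- One string per user, with the labels in the order the hits occurred.
  SELECT
    userId,
    STRING_AGG(step, '' ORDER BY hit_time) AS path
  FROM steps
  GROUP BY userId
)
SELECT
  COUNT(*) AS users
FROM paths
-- 'A.*B' reads as "A, followed at some later point by B".
WHERE REGEXP_CONTAINS(path, r'A.*B');
```

With these sample rows only `u1` should qualify. A longer sequence would mean adding more labels to the `CASE` and extending the pattern (for example `A.*B.*C`), and a pattern such as `AB` would require B to come directly after A among the tracked steps. Whether this performs acceptably on a full GA export is something I have not verified.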
Is there a way to build something similar to GA sequential segments? Or, at least, what logic should I follow?
Thank you very much
question from:
https://stackoverflow.com/questions/65908420/how-to-replicate-ga-sequential-segments-in-big-query