You need to join two PCollections:
- A PCollection that contains data from Pub/Sub. This can be created by using the PubsubIO.Read PTransform.
- A PCollection that contains data from BigQuery. If the data is static, the BigQueryIO.Read transform can be used. If the data can change, though, the BigQuery transforms currently available in Beam probably will not work. One option might be to use the `PeriodicImpulse` transform and your own `ParDo` to create a periodically changing input. See here for an example (please note that the `PeriodicImpulse` transform was added recently).
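
A minimal sketch of the two inputs, written against the Python SDK, where the counterparts of the transforms named above are `ReadFromPubSub` and `PeriodicImpulse`. The subscription path, the query, the key column `id`, the JSON message format, and the 300-second refresh interval are all placeholders. Re-running the query with the `google-cloud-bigquery` client is only one possible way to fill in "your own ParDo", and both inputs are put into fixed windows of the same size on the assumption that each window of events should see the lookup rows fetched in that window.

```python
import json


def parse_event(payload: bytes) -> dict:
    """Decode one Pub/Sub message body, assumed here to be UTF-8 JSON."""
    return json.loads(payload.decode("utf-8"))


def rows_to_pairs(rows, key_field):
    """Turn row mappings into (key, row) pairs, ready for a dict side input."""
    for row in rows:
        record = dict(row)
        yield record[key_field], record


def fetch_lookup(_impulse, query, key_field):
    """Re-run the BigQuery query each time PeriodicImpulse fires."""
    # Assumption: the google-cloud-bigquery package is installed on the workers.
    from google.cloud import bigquery

    rows = bigquery.Client().query(query).result()
    yield from rows_to_pairs(rows, key_field)


def build_inputs(pipeline, subscription, query, key_field="id", refresh_secs=300):
    """Return (events, lookup): the streaming main input and the refreshing side input."""
    # Imported here so the helpers above can be used without Beam installed.
    import apache_beam as beam
    from apache_beam.transforms.periodicsequence import PeriodicImpulse

    events = (
        pipeline
        | "ReadPubSub" >> beam.io.ReadFromPubSub(subscription=subscription)
        | "ParseJson" >> beam.Map(parse_event)
        | "WindowEvents" >> beam.WindowInto(beam.window.FixedWindows(refresh_secs))
    )
    lookup = (
        pipeline
        | "Tick" >> PeriodicImpulse(fire_interval=refresh_secs, apply_windowing=True)
        | "QueryBigQuery" >> beam.FlatMap(fetch_lookup, query, key_field)
    )
    return events, lookup
```

Creating a BigQuery client on every tick keeps the sketch short; a longer-lived pipeline would more likely create it once per worker.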
You can combine the data in a `ParDo` where PCollection (1) is the main input and PCollection (2) is a side input (similar to the example above).
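
As a sketch of that join in the Python SDK: the side input can be handed to the processing function as a dict via `beam.pvalue.AsDict`, and each Pub/Sub element then looks up its matching row. The key field `id` and the function names are hypothetical, the side input is assumed to be a PCollection of `(key, row)` pairs, and both collections are assumed to be windowed compatibly. `beam.Map` is used here as a one-output-per-input shorthand for a `ParDo`.

```python
def enrich(event, lookup, key_field="id"):
    """Merge the matching lookup row (if any) into a copy of the event."""
    match = lookup.get(event.get(key_field))
    if match is None:
        return dict(event)  # no match: pass the event through unchanged
    merged = dict(match)
    merged.update(event)  # on a column clash the event's value wins
    return merged


def join_with_side_input(events, lookup_pairs, key_field="id"):
    """events: PCollection of dicts; lookup_pairs: PCollection of (key, row) pairs."""
    # Imported here so enrich() can be used without Beam installed.
    import apache_beam as beam

    return events | "Enrich" >> beam.Map(
        enrich, lookup=beam.pvalue.AsDict(lookup_pairs), key_field=key_field
    )
```

Passing unmatched events through unchanged is a design choice for this sketch; dropping them or routing them to a separate output would work equally well.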
Finally, you can stream the output to BigQuery using the BigQueryIO.Write transform.
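
For this last step, a hedged sketch using the Python SDK's `WriteToBigQuery`. The project, dataset, table, and schema values are placeholders, and streaming inserts are selected explicitly as one possible write method rather than the required one.

```python
def table_spec(project, dataset, table):
    """Build the 'PROJECT:DATASET.TABLE' string that WriteToBigQuery accepts."""
    return f"{project}:{dataset}.{table}"


def write_joined(joined, project, dataset, table, schema):
    """joined: PCollection of dicts whose keys match the column names in schema."""
    # Imported here so table_spec() can be used without Beam installed.
    import apache_beam as beam

    return joined | "WriteBigQuery" >> beam.io.WriteToBigQuery(
        table_spec(project, dataset, table),
        schema=schema,  # placeholder, e.g. "id:INTEGER,name:STRING,v:INTEGER"
        method=beam.io.WriteToBigQuery.Method.STREAMING_INSERTS,
        create_disposition=beam.io.BigQueryDisposition.CREATE_IF_NEEDED,
        write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
    )
```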