I'm trying to locally debug a Dataflow job that batch-uploads to BigQuery, but when I run it locally with the DirectRunner, BigQueryBatchFileLoads fails with:
"code": 400,
"message": "Project id is missing",
"errors": [
{
"message": "Project id is missing",
"domain": "global",
"reason": "invalid"
}
],
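For context, the write is set up roughly like this (a simplified sketch of my pipeline; the table, schema, and bucket names here are placeholders, not my real values):

import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

options = PipelineOptions()  # built from sys.argv in the real job

with beam.Pipeline(options=options) as p:
    (
        p
        | "Create" >> beam.Create([{"id": 1, "name": "example"}])
        | "WriteToBQ" >> beam.io.WriteToBigQuery(
            table="mydataset.mytable",  # dataset.table, no project prefix
            schema="id:INTEGER,name:STRING",
            # FILE_LOADS routes the write through BigQueryBatchFileLoads
            method=beam.io.WriteToBigQuery.Method.FILE_LOADS,
            custom_gcs_temp_location="gs://my-bucket/tmp",
        )
    )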
I've tried hard-coding the project ID with options.view_as(beam.options.pipeline_options.GoogleCloudOptions).project = PROJECT_ID, and I've tried passing it via CLI flags with python mycode.py --dataset mydataset --project myprojectid, but no luck. Roughly, the attempts looked like this:
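(Simplified; PROJECT_ID is my real project ID in the actual code.)

from apache_beam.options.pipeline_options import GoogleCloudOptions, PipelineOptions

PROJECT_ID = "myprojectid"

options = PipelineOptions()
# Force the project onto the Google Cloud options before building the pipeline
options.view_as(GoogleCloudOptions).project = PROJECT_ID

# ...and separately, without the hard-coding, invoking the script as:
#   python mycode.py --dataset mydataset --project myprojectid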
From what I can gather, BigQueryBatchFileLoads gets the project ID from a runtime value provider. I've attempted to inspect the value provider from inside a DoFn, but I could not resolve any values.
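My debugging attempt looked roughly like this (a sketch; DebugProjectFn and the way I pass in the value provider are my own additions for debugging, not part of the Beam API):

import apache_beam as beam

class DebugProjectFn(beam.DoFn):
    def __init__(self, project_vp):
        # project_vp is a ValueProvider taken from the pipeline options
        self._project_vp = project_vp

    def process(self, element):
        # .get() should resolve the value at runtime, but it never returned a project for me
        print("resolved project:", self._project_vp.get())
        yield element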
I'm new to Dataflow / Apache Beam, so I'm hoping the answer is something simple, as this must be a very common use case.
Any advice would be appreciated.
question from:
https://stackoverflow.com/questions/65642972/dataflow-bigquerybatchfileloads-cant-find-project-id-with-directrunner