Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share

python - Dataflow BigQueryBatchFileLoads can't find project ID with DirectRunner

I'm trying to debug a Dataflow job locally. The job batch-loads data into BigQuery, but when I run it with the DirectRunner, BigQueryBatchFileLoads fails with:

    "code": 400,
    "message": "Project id is missing",
    "errors": [
      {
        "message": "Project id is missing",
        "domain": "global",
        "reason": "invalid"
      }
    ],

I've tried hard-coding the project ID:

    options.view_as(beam.options.pipeline_options.GoogleCloudOptions).project = PROJECT_ID

and passing it via CLI flags:

    python mycode.py --dataset mydataset --project myprojectid

but no luck.
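For reference, here is a minimal sketch of how flags like the ones in the command above are commonly split between the script and the pipeline. It assumes the script parses its arguments with argparse, which the question does not state, and the helper name split_args is hypothetical. The point it illustrates is that a flag the script does not declare itself (here --project) stays in the leftover list, which is what would normally be handed on to Beam's PipelineOptions.

```python
import argparse


def split_args(argv):
    """Separate script-specific flags from the ones meant for the pipeline."""
    parser = argparse.ArgumentParser()
    # --dataset is a script-level flag (taken from the command line above).
    parser.add_argument("--dataset", required=True)
    # --project is deliberately NOT declared here, so argparse leaves it in
    # the leftover list instead of consuming it.
    known_args, pipeline_args = parser.parse_known_args(argv)
    return known_args, pipeline_args


if __name__ == "__main__":
    known, rest = split_args(
        ["--dataset", "mydataset", "--project", "myprojectid"]
    )
    print(known.dataset)
    print(rest)
    # Assumption: in a Beam script, `rest` would then be passed to
    # PipelineOptions(rest) so that GoogleCloudOptions can read --project.
```

If the script's own parser also declared --project, argparse would consume it and the flag would no longer appear in the leftover list, so this may be worth checking when the project ID seems to go missing.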

From what I can gather, BigQueryBatchFileLoads is getting the project ID from a runtime value provider and I've attempted to debug the values in the value provider in a DoFn but could not resolve any values.

I'm new to Dataflow / Apache Beam, so I'm hoping the answer is something simple, as this must be a very common use case.

Any advice would be appreciated.

question from:https://stackoverflow.com/questions/65642972/dataflow-bigquerybatchfileloads-cant-find-project-id-with-directrunner
