sync with open source how #118

Draft
wants to merge 4,178 commits into
base: li_trunk

lesterhaynes

Please add a meaningful description for your change here


Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily:

  • Mention the appropriate issue in your description (for example: addresses #123), if applicable. This will automatically add a link to the pull request in the issue. If you would like the issue to automatically close on merging the pull request, comment fixes #<ISSUE NUMBER> instead.
  • Update CHANGES.md with noteworthy changes.
  • If this contribution is large, please file an Apache Individual Contributor License Agreement.

See the Contributor Guide for more tips on how to make review process smoother.

To check the build health, please visit https://github.com/apache/beam/blob/master/.test-infra/BUILD_STATUS.md

GitHub Actions Tests Status (on master branch)

Build python source distribution and wheels
Python tests
Java tests
Go tests

See CI.md for more information about GitHub Actions CI.

rajkgupt and others added 21 commits April 22, 2024 11:41
* add DataflowRunner in wordCount run command with Gradle

* Update dataflow.md to update mvn package and run commands

* replace Periodic to Unbounded for BigTableIOST

* replace PeriodicImpulse for Kafka stress test

* refactor

* correct a number of records and records per second

* specify correct resources for kafka servers

* refactor
* Stabilize additional teststream cases.

* Update sdks/go/test/integration/primitives/teststream_test.go

Co-authored-by: Ritesh Ghorse <riteshghorse@gmail.com>

* Update sdks/go/test/integration/primitives/teststream.go

Co-authored-by: Ritesh Ghorse <riteshghorse@gmail.com>

* Update sdks/go/test/integration/primitives/teststream_test.go

Co-authored-by: Ritesh Ghorse <riteshghorse@gmail.com>

---------

Co-authored-by: lostluck <13907733+lostluck@users.noreply.github.com>
Co-authored-by: Ritesh Ghorse <riteshghorse@gmail.com>
…1050)

Bumps [golang.org/x/net](https://github.com/golang/net) from 0.17.0 to 0.23.0.
- [Commits](golang/net@v0.17.0...v0.23.0)

---
updated-dependencies:
- dependency-name: golang.org/x/net
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…#31051)

Bumps [golang.org/x/net](https://github.com/golang/net) from 0.17.0 to 0.23.0.
- [Commits](golang/net@v0.17.0...v0.23.0)

---
updated-dependencies:
- dependency-name: golang.org/x/net
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [golang.org/x/net](https://github.com/golang/net) from 0.22.0 to 0.24.0.
- [Commits](golang/net@v0.22.0...v0.24.0)

---
updated-dependencies:
- dependency-name: golang.org/x/net
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…anslation (#30910)

* iceberg write schematransform and test

* IcebergIO translation and tests

* unify iceberg urns and identifiers; update some comments

* replace icebergIO translation with iceberg schematransform translation; fix Schema::sorted to do recursive sorting

* removed new proto file and moved Managed URNs to beam_runner_api.proto; we now use SchemaTransformPayload for all schematransforms, including Managed; adding a version number to FileWriteResult encoding so that we can use it to fork in the future when needed

* Row and Schema snake_case <-> camelCase conversion logic

* Row sorted() util

* use Row::sorted to fetch Managed & Iceberg row configs

* use snake_case convention when translating transforms to spec; remove Managed and Iceberg urns from proto and use SCHEMA_TRANSFORM URN

* perform snake_case <-> camelCase conversions directly in TypedSchemaTransformProvider

* add SchemaTransformTranslation abstraction. when encountering a SCHEMA_TRANSFORM urn, fetch underlying identifier

* prioritize registered providers; remove snake_case <-> camelCase conversions from python side
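
The snake_case <-> camelCase conversion mentioned in the commits above can be illustrated with a minimal sketch. This is not the code from the PR: the function names (`snake_to_camel`, `camel_to_snake`, `convert_keys`) are hypothetical, and a nested Python dict stands in for a Beam Row configuration.

```python
import re


def snake_to_camel(name: str) -> str:
    """Convert snake_case to lowerCamelCase (hypothetical helper)."""
    head, *rest = name.split("_")
    return head + "".join(part.capitalize() for part in rest)


def camel_to_snake(name: str) -> str:
    """Convert lowerCamelCase to snake_case (hypothetical helper)."""
    # Insert "_" before every uppercase letter except at the start.
    return re.sub(r"(?<!^)(?=[A-Z])", "_", name).lower()


def convert_keys(config, convert):
    """Recursively rename keys in nested dicts and lists.

    Assumption: a nested dict models a Row with nested Row fields.
    """
    if isinstance(config, dict):
        return {convert(k): convert_keys(v, convert) for k, v in config.items()}
    if isinstance(config, list):
        return [convert_keys(v, convert) for v in config]
    return config


if __name__ == "__main__":
    cfg = {"catalog_config": {"warehouse_location": "gs://bucket"}}
    print(convert_keys(cfg, snake_to_camel))
```

Applying the conversion recursively matters because a configuration may nest one structured field inside another; converting only top-level names would leave inner fields in the wrong convention.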
* [YAML] - Normalize YAML PubSub format

* [YAML] - Fix format

* [YAML] - Fix import

* [YAML] - Update tests and web
Using `View#asList()` can be *significantly* more expensive than
`View#asIterable()` as the former creates a distributed mapping
of range(0, N) to the elements, and iterating over it
(while respecting this order) requires O(N) non-contiguous point
lookups.

These counters should be useful in diagnosing performance issues.
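
The cost difference described above can be sketched with a toy model. This is not Beam's implementation: `CountingStore` and both helper functions are hypothetical, and the store only counts access patterns rather than measuring real latency.

```python
class CountingStore:
    """Hypothetical stand-in for a backend that records how it is accessed."""

    def __init__(self, elements):
        # Index -> element mapping, analogous to range(0, N) -> elements.
        self._by_index = dict(enumerate(elements))
        self.point_lookups = 0
        self.sequential_scans = 0

    def get(self, index):
        """One independent point lookup."""
        self.point_lookups += 1
        return self._by_index[index]

    def scan(self):
        """One sequential pass over all elements."""
        self.sequential_scans += 1
        yield from self._by_index.values()


def iterate_as_list(store, n):
    # Ordered iteration over the index mapping: one point lookup per element.
    return [store.get(i) for i in range(n)]


def iterate_as_iterable(store):
    # No ordering contract to honor: a single sequential scan suffices.
    return list(store.scan())


if __name__ == "__main__":
    store = CountingStore(["a", "b", "c", "d"])
    iterate_as_list(store, 4)
    iterate_as_iterable(store)
    print(store.point_lookups, store.sequential_scans)
```

In this model, list-style iteration issues N separate lookups while iterable-style iteration issues a single scan, which is the kind of difference a lookup counter would surface.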
* Fix building release candidate

* String change
Co-authored-by: lostluck <13907733+lostluck@users.noreply.github.com>
#30770)

* fix several bugs regarding avro <-> beam schema conversion

* fix test with my bizarre workaround

* fix import in test

* fix linting issue

* comments -> docstrings

* formatting

* review comments

* add skip test config to sqltransform test

* fix docstrings

* formatting

* review comment

* uncomment test skip

* always return map in beam_type_to_avro_type

* unnest primitive types for arrays and maps too

* remove print stmt
lostluck and others added 30 commits May 21, 2024 10:11
…ly in element batches. (#31319)

* Don't swallow data errors on timer errors.

* reset error in case of timer handling

* Don't drop timers on splits.

---------

Co-authored-by: lostluck <13907733+lostluck@users.noreply.github.com>
* BigQueryIO write throttling detection

* double retry interval
* Logging loading filesystems failures.

---------

Co-authored-by: tvalentyn <tvalentyn@users.noreply.github.com>
Bumps com.gradle.enterprise from 3.17.2 to 3.17.4.

---
updated-dependencies:
- dependency-name: com.gradle.enterprise
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…#31344)

* Disable soft delete policy when creating new default bucket.

* Fix lints

* Add a check on returned bucket name.
… enabled (#31358)

* Add warning to gcpTempLocation when its bucket has soft delete policy enabled.

* Fix an issue reported by spotbugs.
* Don't set SETUPTOOLS_USE_DISTUTILS=stdlib since it doesn't work on Py3.12

* Unblock tests.
…hentication (#31352)

* Updates the Java Expansion Service container to support Google ALTS authentication

* Fix spotless

* Updates the description