Comparing changes

base repository: googleapis/python-bigquery-dataframes
base: v0.8.0
head repository: googleapis/python-bigquery-dataframes
compare: v0.9.0
  • 15 commits
  • 59 files changed
  • 8 contributors

Commits on Oct 12, 2023

  1. 3b51a36 View commit details
  2. feat: send BigQuery cancel request when canceling bigframes process (#…

    …103)
    
    Co-authored-by: Henry J Solberg <henryjsolberg@google.com>
    milkshakeiii and Henry J Solberg authored Oct 12, 2023

    e325fbb View commit details
  3. 36693bf View commit details
  4. refactor: all ArrayValue ops return only ArrayValue (#92)

    * refactor: all ArrayValue ops return only ArrayValue
    
    * copyright notice
    
    ---------
    
    Co-authored-by: Tim Swast <swast@google.com>
    TrevorBergeron and tswast authored Oct 12, 2023

    855616a View commit details

Commits on Oct 13, 2023

  1. docs: add open-source link in API doc (#106)

    ashleyxuu authored Oct 13, 2023

    db51fe3 View commit details
  2. 1b3f3a5 View commit details
  3. style: improve cancellation string (#111)

    milkshakeiii authored Oct 13, 2023
    752a1d6 View commit details

Commits on Oct 16, 2023

  1. perf: if primary keys are defined, read_gbq avoids copying table da…

    …ta (#112)
    
    We make the same uniqueness assumption as the query engine and use these columns as the total ordering.
    
    Fixes internal issue b/305260214 🦕
    tswast authored Oct 16, 2023
    e6c0cd1 View commit details

Commits on Oct 17, 2023

  1. feat: add AtIndexer getitems (#107)

    * feat: add AtIndexer getitems
    
    * fix third party docstrings
    
    * use loc from at
    
    ---------
    
    Co-authored-by: Henry J Solberg <henryjsolberg@google.com>
    milkshakeiii and Henry J Solberg authored Oct 17, 2023
    752b01f View commit details
  2. feat: Support external packages in remote_function (#98)

    * feat: Support external packages in `remote_function`
    
    * Update code sample demonstrating external packages for `remote_function`
    
    * GCF customization for hackathon
    shobsi authored Oct 17, 2023
    ec10c4a View commit details
  3. a6dab9c View commit details
  4. feat: add bigframes.options.bigquery.application_name for partner a…

    …ttribution (#117)
    
    Because `session.py` was getting long, this also refactors `session.py` to separate client construction in a separate module.
    
    Fixes internal issue 305950924
    🦕
    tswast authored Oct 17, 2023
    52d64ff View commit details
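
The option added by this commit can no longer be changed once a session has started. The following is a minimal sketch of that guard, modeled on the setter shown for `bigframes/_config/bigquery_options.py` in this comparison. `OptionsSketch` is a hypothetical stand-in rather than the library's `BigQueryOptions` class, and the application names are invented for illustration.

```python
from typing import Optional

SESSION_STARTED_MESSAGE = (
    "Cannot change '{attribute}' once a session has started. "
    "Call bigframes.pandas.close_session() first, if you are using the bigframes.pandas API."
)


class OptionsSketch:
    """Simplified, hypothetical stand-in for BigQueryOptions (illustration only)."""

    def __init__(self, application_name: Optional[str] = None):
        self._application_name = application_name
        self._session_started = False

    @property
    def application_name(self) -> Optional[str]:
        return self._application_name

    @application_name.setter
    def application_name(self, value: Optional[str]) -> None:
        # Re-assigning the current value is allowed; only a real change is rejected.
        if self._session_started and self._application_name != value:
            raise ValueError(
                SESSION_STARTED_MESSAGE.format(attribute="application_name")
            )
        self._application_name = value


options = OptionsSketch()
options.application_name = "my-app/1.0.0"  # allowed: no session yet
options._session_started = True            # simulate a started session
options.application_name = "my-app/1.0.0"  # allowed: value is unchanged

try:
    options.application_name = "other-app/2.0.0"
    change_accepted = True
except ValueError:
    change_accepted = False
```

In this sketch the option keeps its original value after the rejected assignment, so the setting should be applied before the first operation that starts a session.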

Commits on Oct 18, 2023

  1. fix: fix TODOs for loc multiindex (#113)

    Co-authored-by: Henry J Solberg <henryjsolberg@google.com>
    milkshakeiii and Henry J Solberg authored Oct 18, 2023
    68e3cd3 View commit details
  2. feat: use ArrowDtype for STRUCT columns in to_pandas (#85)

    tswast authored Oct 18, 2023
    9238fad View commit details
  3. chore(main): release 0.9.0 (#108)

    🤖 I have created a release *beep* *boop*
    ---
    
    
    ## [0.9.0](https://togithub.com/googleapis/python-bigquery-dataframes/compare/v0.8.0...v0.9.0) (2023-10-18)
    
    
    ### ⚠ BREAKING CHANGES
    
    * rename `bigframes.pandas.reset_session` to `close_session` ([#101](https://togithub.com/googleapis/python-bigquery-dataframes/issues/101))
    
    ### Features
    
    * Add `bigframes.options.bigquery.application_name` for partner attribution ([#117](https://togithub.com/googleapis/python-bigquery-dataframes/issues/117)) ([52d64ff](https://togithub.com/googleapis/python-bigquery-dataframes/commit/52d64ffdbbab16b1d94974b543ce9080be1ec0d1))
    * Add AtIndexer getitems ([#107](https://togithub.com/googleapis/python-bigquery-dataframes/issues/107)) ([752b01f](https://togithub.com/googleapis/python-bigquery-dataframes/commit/752b01ff9df114c54ed58eb96956e9ce34a8ed47))
    * Rename `bigframes.pandas.reset_session` to `close_session` ([#101](https://togithub.com/googleapis/python-bigquery-dataframes/issues/101)) ([36693bf](https://togithub.com/googleapis/python-bigquery-dataframes/commit/36693bff398c23e179d9bde95d52cbaddaf85c45))
    * Send BigQuery cancel request when canceling bigframes process ([#103](https://togithub.com/googleapis/python-bigquery-dataframes/issues/103)) ([e325fbb](https://togithub.com/googleapis/python-bigquery-dataframes/commit/e325fbb1c91e040d87df10f7d4d5ce53f7c052cb))
    * Support external packages in `remote_function` ([#98](https://togithub.com/googleapis/python-bigquery-dataframes/issues/98)) ([ec10c4a](https://togithub.com/googleapis/python-bigquery-dataframes/commit/ec10c4a5a7833c42e28fe9e7b734bc0c4fb84b6e))
    * Use ArrowDtype for STRUCT columns in `to_pandas` ([#85](https://togithub.com/googleapis/python-bigquery-dataframes/issues/85)) ([9238fad](https://togithub.com/googleapis/python-bigquery-dataframes/commit/9238fadcfa7e843be6564813ff3131893b79f8b0))
    
    
    ### Bug Fixes
    
    * Support multiindex for three loc getitem overloads ([#113](https://togithub.com/googleapis/python-bigquery-dataframes/issues/113)) ([68e3cd3](https://togithub.com/googleapis/python-bigquery-dataframes/commit/68e3cd37258084d045ea1075e5e61df12c28faac))
    
    
    ### Performance Improvements
    
    * If primary keys are defined, `read_gbq` avoids copying table data ([#112](https://togithub.com/googleapis/python-bigquery-dataframes/issues/112)) ([e6c0cd1](https://togithub.com/googleapis/python-bigquery-dataframes/commit/e6c0cd1777736e0fa7285da59625fbac487573bd))
    
    
    ### Documentation
    
    * Add documentation for `Series.struct.field` and `Series.struct.explode` ([#114](https://togithub.com/googleapis/python-bigquery-dataframes/issues/114)) ([a6dab9c](https://togithub.com/googleapis/python-bigquery-dataframes/commit/a6dab9cdb7dd0e56c93ca96b665ab1be1baac5e5))
    * Add open-source link in API doc ([#106](https://togithub.com/googleapis/python-bigquery-dataframes/issues/106)) ([db51fe3](https://togithub.com/googleapis/python-bigquery-dataframes/commit/db51fe340f644a0d7c911c11d92c8299a4be3446))
    * Update ML overview API doc ([#105](https://togithub.com/googleapis/python-bigquery-dataframes/issues/105)) ([1b3f3a5](https://togithub.com/googleapis/python-bigquery-dataframes/commit/1b3f3a5374915b2833c6c1ac05670e9708f07bff))
    
    ---
    This PR was generated with [Release Please](https://togithub.com/googleapis/release-please). See [documentation](https://togithub.com/googleapis/release-please#release-please).
    release-please[bot] authored Oct 18, 2023
    e2788a8 View commit details
Showing with 2,092 additions and 581 deletions.
  1. +33 −0 CHANGELOG.md
  2. +13 −4 README.rst
  3. +2 −2 bigframes/__init__.py
  4. +20 −1 bigframes/_config/bigquery_options.py
  5. +17 −24 bigframes/clients.py
  6. +7 −8 bigframes/core/__init__.py
  7. +4 −4 bigframes/core/block_transforms.py
  8. +73 −54 bigframes/core/blocks.py
  9. +1 −1 bigframes/core/global_session.py
  10. +65 −21 bigframes/core/indexers.py
  11. +72 −29 bigframes/core/indexes/index.py
  12. +46 −0 bigframes/core/joins/name_resolution.py
  13. +22 −22 bigframes/core/joins/row_identity.py
  14. +37 −145 bigframes/core/joins/single_column.py
  15. +20 −15 bigframes/dataframe.py
  16. +17 −0 bigframes/dtypes.py
  17. +16 −0 bigframes/formatting_helpers.py
  18. +8 −8 bigframes/ml/llm.py
  19. +10 −0 bigframes/operations/__init__.py
  20. +2 −2 bigframes/operations/base.py
  21. +2 −2 bigframes/operations/structs.py
  22. +7 −3 bigframes/pandas/__init__.py
  23. +38 −13 bigframes/remote_function.py
  24. +7 −3 bigframes/series.py
  25. +112 −178 bigframes/{session.py → session/__init__.py}
  26. +196 −0 bigframes/session/clients.py
  27. +1 −1 bigframes/version.py
  28. +8 −0 docs/reference/bigframes.pandas/series.rst
  29. +2 −0 docs/templates/toc.yml
  30. +1 −1 notebooks/generative_ai/bq_dataframes_llm_code_generation.ipynb
  31. +1 −1 notebooks/getting_started/getting_started_bq_dataframes.ipynb
  32. +1 −1 notebooks/regression/bq_dataframes_ml_linear_regression.ipynb
  33. +2 −1 noxfile.py
  34. +1 −1 samples/snippets/quickstart_test.py
  35. +12 −6 samples/snippets/remote_function.py
  36. +1 −1 samples/snippets/remote_function_test.py
  37. +31 −0 tests/system/conftest.py
  38. +45 −0 tests/system/large/test_remote_function.py
  39. +3 −3 tests/system/small/ml/test_llm.py
  40. +57 −0 tests/system/small/test_dataframe.py
  41. +20 −3 tests/system/small/test_dataframe_io.py
  42. +6 −6 tests/system/small/test_pandas_options.py
  43. +44 −0 tests/system/small/test_series.py
  44. +25 −0 tests/system/small/test_session.py
  45. +4 −0 tests/unit/_config/test_bigquery_options.py
  46. +2 −1 tests/unit/resources.py
  47. +13 −0 tests/unit/session/__init__.py
  48. +114 −0 tests/unit/session/test_clients.py
  49. +1 −1 tests/unit/{ → session}/test_session.py
  50. +4 −12 tests/unit/test_clients.py
  51. +3 −3 tests/unit/test_pandas.py
  52. +202 −0 third_party/bigframes_vendored/google_cloud_bigquery/LICENSE
  53. +13 −0 third_party/bigframes_vendored/google_cloud_bigquery/__init__.py
  54. +158 −0 third_party/bigframes_vendored/google_cloud_bigquery/_pandas_helpers.py
  55. +13 −0 third_party/bigframes_vendored/google_cloud_bigquery/tests/__init__.py
  56. +13 −0 third_party/bigframes_vendored/google_cloud_bigquery/tests/unit/__init__.py
  57. +413 −0 third_party/bigframes_vendored/google_cloud_bigquery/tests/unit/test_pandas_helpers.py
  58. +5 −0 third_party/bigframes_vendored/pandas/core/frame.py
  59. +26 −0 third_party/bigframes_vendored/pandas/core/series.py
33 changes: 33 additions & 0 deletions CHANGELOG.md
@@ -4,6 +4,39 @@

 [1]: https://pypi.org/project/bigframes/#history

+## [0.9.0](https://github.com/googleapis/python-bigquery-dataframes/compare/v0.8.0...v0.9.0) (2023-10-18)
+
+
+### ⚠ BREAKING CHANGES
+
+* rename `bigframes.pandas.reset_session` to `close_session` ([#101](https://github.com/googleapis/python-bigquery-dataframes/issues/101))
+
+### Features
+
+* Add `bigframes.options.bigquery.application_name` for partner attribution ([#117](https://github.com/googleapis/python-bigquery-dataframes/issues/117)) ([52d64ff](https://github.com/googleapis/python-bigquery-dataframes/commit/52d64ffdbbab16b1d94974b543ce9080be1ec0d1))
+* Add AtIndexer getitems ([#107](https://github.com/googleapis/python-bigquery-dataframes/issues/107)) ([752b01f](https://github.com/googleapis/python-bigquery-dataframes/commit/752b01ff9df114c54ed58eb96956e9ce34a8ed47))
+* Rename `bigframes.pandas.reset_session` to `close_session` ([#101](https://github.com/googleapis/python-bigquery-dataframes/issues/101)) ([36693bf](https://github.com/googleapis/python-bigquery-dataframes/commit/36693bff398c23e179d9bde95d52cbaddaf85c45))
+* Send BigQuery cancel request when canceling bigframes process ([#103](https://github.com/googleapis/python-bigquery-dataframes/issues/103)) ([e325fbb](https://github.com/googleapis/python-bigquery-dataframes/commit/e325fbb1c91e040d87df10f7d4d5ce53f7c052cb))
+* Support external packages in `remote_function` ([#98](https://github.com/googleapis/python-bigquery-dataframes/issues/98)) ([ec10c4a](https://github.com/googleapis/python-bigquery-dataframes/commit/ec10c4a5a7833c42e28fe9e7b734bc0c4fb84b6e))
+* Use ArrowDtype for STRUCT columns in `to_pandas` ([#85](https://github.com/googleapis/python-bigquery-dataframes/issues/85)) ([9238fad](https://github.com/googleapis/python-bigquery-dataframes/commit/9238fadcfa7e843be6564813ff3131893b79f8b0))
+
+
+### Bug Fixes
+
+* Support multiindex for three loc getitem overloads ([#113](https://github.com/googleapis/python-bigquery-dataframes/issues/113)) ([68e3cd3](https://github.com/googleapis/python-bigquery-dataframes/commit/68e3cd37258084d045ea1075e5e61df12c28faac))
+
+
+### Performance Improvements
+
+* If primary keys are defined, `read_gbq` avoids copying table data ([#112](https://github.com/googleapis/python-bigquery-dataframes/issues/112)) ([e6c0cd1](https://github.com/googleapis/python-bigquery-dataframes/commit/e6c0cd1777736e0fa7285da59625fbac487573bd))
+
+
+### Documentation
+
+* Add documentation for `Series.struct.field` and `Series.struct.explode` ([#114](https://github.com/googleapis/python-bigquery-dataframes/issues/114)) ([a6dab9c](https://github.com/googleapis/python-bigquery-dataframes/commit/a6dab9cdb7dd0e56c93ca96b665ab1be1baac5e5))
+* Add open-source link in API doc ([#106](https://github.com/googleapis/python-bigquery-dataframes/issues/106)) ([db51fe3](https://github.com/googleapis/python-bigquery-dataframes/commit/db51fe340f644a0d7c911c11d92c8299a4be3446))
+* Update ML overview API doc ([#105](https://github.com/googleapis/python-bigquery-dataframes/issues/105)) ([1b3f3a5](https://github.com/googleapis/python-bigquery-dataframes/commit/1b3f3a5374915b2833c6c1ac05670e9708f07bff))
+
 ## [0.8.0](https://github.com/googleapis/python-bigquery-dataframes/compare/v0.7.0...v0.8.0) (2023-10-12)


17 changes: 13 additions & 4 deletions README.rst
@@ -13,6 +13,7 @@ BigQuery DataFrames is an open-source package. You can run
 Documentation
 -------------

+* `BigQuery DataFrames source code (GitHub) <https://github.com/googleapis/python-bigquery-dataframes>`_
 * `BigQuery DataFrames sample notebooks <https://github.com/googleapis/python-bigquery-dataframes/tree/main/notebooks>`_
 * `BigQuery DataFrames API reference <https://cloud.google.com/python/docs/reference/bigframes/latest>`_
 * `BigQuery documentation <https://cloud.google.com/bigquery/docs/>`_
@@ -63,7 +64,7 @@ auto-populates ``bf.options.bigquery.location`` if the user starts with
 directly or in a SQL statement.

 If you want to reset the location of the created DataFrame or Series objects,
-you can reset the session by executing ``bigframes.pandas.reset_session()``.
+you can close the session by executing ``bigframes.pandas.close_session()``.
 After that, you can reuse ``bigframes.pandas.options.bigquery.location`` to
 specify another location.

@@ -94,10 +95,18 @@ using the
 and the `bigframes.ml.compose module <https://cloud.google.com/python/docs/reference/bigframes/latest/bigframes.ml.compose>`_.
 BigQuery DataFrames offers the following transformations:

-* Use the `OneHotEncoder class <https://cloud.google.com/python/docs/reference/bigframes/latest/bigframes.ml.preprocessing.OneHotEncoder>`_
-  in the ``bigframes.ml.preprocessing`` module to transform categorical values into numeric format.
+* Use the `KBinsDiscretizer class <https://cloud.google.com/python/docs/reference/bigframes/latest/bigframes.ml.preprocessing.KBinsDiscretizer>`_
+  in the ``bigframes.ml.preprocessing`` module to bin continuous data into intervals.
+* Use the `LabelEncoder class <https://cloud.google.com/python/docs/reference/bigframes/latest/bigframes.ml.preprocessing.LabelEncoder>`_
+  in the ``bigframes.ml.preprocessing`` module to normalize the target labels as integer values.
+* Use the `MaxAbsScaler class <https://cloud.google.com/python/docs/reference/bigframes/latest/bigframes.ml.preprocessing.MaxAbsScaler>`_
+  in the ``bigframes.ml.preprocessing`` module to scale each feature to the range ``[-1, 1]`` by its maximum absolute value.
+* Use the `MinMaxScaler class <https://cloud.google.com/python/docs/reference/bigframes/latest/bigframes.ml.preprocessing.MinMaxScaler>`_
+  in the ``bigframes.ml.preprocessing`` module to standardize features by scaling each feature to the range ``[0, 1]``.
 * Use the `StandardScaler class <https://cloud.google.com/python/docs/reference/bigframes/latest/bigframes.ml.preprocessing.StandardScaler>`_
   in the ``bigframes.ml.preprocessing`` module to standardize features by removing the mean and scaling to unit variance.
+* Use the `OneHotEncoder class <https://cloud.google.com/python/docs/reference/bigframes/latest/bigframes.ml.preprocessing.OneHotEncoder>`_
+  in the ``bigframes.ml.preprocessing`` module to transform categorical values into numeric format.
 * Use the `ColumnTransformer class <https://cloud.google.com/python/docs/reference/bigframes/latest/bigframes.ml.compose.ColumnTransformer>`_
   in the ``bigframes.ml.compose`` module to apply transformers to DataFrames columns.

@@ -335,7 +344,7 @@ sessions
 ; when this happens, you can’t use previously
 created DataFrame or Series objects and must re-create them using a new
 BigQuery DataFrames session. You can do this by running
-``bigframes.pandas.reset_session()`` and then re-running the BigQuery
+``bigframes.pandas.close_session()`` and then re-running the BigQuery
 DataFrames expressions.
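
Code that has to run against both 0.8.0 and 0.9.0 can absorb the rename with a small lookup. The following is a minimal sketch and not part of this comparison: `resolve_close_session` is a hypothetical helper, and the `SimpleNamespace` objects stand in for the `bigframes.pandas` module so that the snippet runs without BigQuery access.

```python
from types import SimpleNamespace
from typing import Any, Callable


def resolve_close_session(module: Any) -> Callable[[], Any]:
    """Return the session-closing function under its new or old name.

    Hypothetical helper: prefers ``close_session`` (0.9.0 and later) and
    falls back to ``reset_session`` (0.8.0 and earlier).
    """
    for name in ("close_session", "reset_session"):
        func = getattr(module, name, None)
        if callable(func):
            return func
    raise AttributeError("module exposes neither close_session nor reset_session")


# Stand-ins for the module before and after the rename (assumptions for the demo).
old_api = SimpleNamespace(reset_session=lambda: "reset")
new_api = SimpleNamespace(close_session=lambda: "closed")

close_old = resolve_close_session(old_api)
close_new = resolve_close_session(new_api)
```

With the real package installed, the same helper would presumably be called as `resolve_close_session(bigframes.pandas)()`.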


4 changes: 2 additions & 2 deletions bigframes/__init__.py
@@ -16,15 +16,15 @@

 from bigframes._config import options
 from bigframes._config.bigquery_options import BigQueryOptions
-from bigframes.core.global_session import get_global_session, reset_session
+from bigframes.core.global_session import close_session, get_global_session
 from bigframes.session import connect, Session
 from bigframes.version import __version__

 __all__ = [
     "options",
     "BigQueryOptions",
     "get_global_session",
-    "reset_session",
+    "close_session",
     "connect",
     "Session",
     "__version__",
21 changes: 20 additions & 1 deletion bigframes/_config/bigquery_options.py
@@ -23,7 +23,7 @@

 SESSION_STARTED_MESSAGE = (
     "Cannot change '{attribute}' once a session has started. "
-    "Call bigframes.pandas.reset_session() first, if you are using the bigframes.pandas API."
+    "Call bigframes.pandas.close_session() first, if you are using the bigframes.pandas API."
 )


@@ -37,14 +37,33 @@ def __init__(
         location: Optional[str] = None,
         bq_connection: Optional[str] = None,
         use_regional_endpoints: bool = False,
+        application_name: Optional[str] = None,
     ):
         self._credentials = credentials
         self._project = project
         self._location = location
         self._bq_connection = bq_connection
         self._use_regional_endpoints = use_regional_endpoints
+        self._application_name = application_name
         self._session_started = False

+    @property
+    def application_name(self) -> Optional[str]:
+        """The application name to amend to the user-agent sent to Google APIs.
+        Recommended format is ``"application-name/major.minor.patch_version"``
+        or ``"(gpn:PartnerName;)"`` for official Google partners.
+        """
+        return self._application_name
+
+    @application_name.setter
+    def application_name(self, value: Optional[str]):
+        if self._session_started and self._application_name != value:
+            raise ValueError(
+                SESSION_STARTED_MESSAGE.format(attribute="application_name")
+            )
+        self._application_name = value
+
     @property
     def credentials(self) -> Optional[google.auth.credentials.Credentials]:
         """The OAuth2 Credentials to use for this client."""
41 changes: 17 additions & 24 deletions bigframes/clients.py
@@ -29,8 +29,6 @@
 )
 logger = logging.getLogger(__name__)

-_BIGFRAMES_DEFAULT_CONNECTION_ID = "bigframes-default-connection"
-

 class BqConnectionManager:
     """Manager to handle operations with BQ connections."""
@@ -46,6 +44,23 @@ def __init__(
         self._bq_connection_client = bq_connection_client
         self._cloud_resource_manager_client = cloud_resource_manager_client

+    @classmethod
+    def resolve_full_connection_name(
+        cls, connection_name: str, default_project: str, default_location: str
+    ) -> str:
+        """Retrieve the full connection name of the form <PROJECT_NUMBER/PROJECT_ID>.<LOCATION>.<CONNECTION_ID>.
+        Use default project, location or connection_id when any of them are missing."""
+        if connection_name.count(".") == 2:
+            return connection_name
+
+        if connection_name.count(".") == 1:
+            return f"{default_project}.{connection_name}"
+
+        if connection_name.count(".") == 0:
+            return f"{default_project}.{default_location}.{connection_name}"
+
+        raise ValueError(f"Invalid connection name format: {connection_name}.")
+
     def create_bq_connection(
         self, project_id: str, location: str, connection_id: str, iam_role: str
     ):
@@ -164,25 +179,3 @@ def _get_service_account_if_connection_exists(
             pass

         return service_account
-
-
-def get_connection_name_full(
-    connection_name: Optional[str], default_project: str, default_location: str
-) -> str:
-    """Retrieve the full connection name of the form <PROJECT_NUMBER/PROJECT_ID>.<LOCATION>.<CONNECTION_ID>.
-    Use default project, location or connection_id when any of them are missing."""
-    if connection_name is None:
-        return (
-            f"{default_project}.{default_location}.{_BIGFRAMES_DEFAULT_CONNECTION_ID}"
-        )
-
-    if connection_name.count(".") == 2:
-        return connection_name
-
-    if connection_name.count(".") == 1:
-        return f"{default_project}.{connection_name}"
-
-    if connection_name.count(".") == 0:
-        return f"{default_project}.{default_location}.{connection_name}"
-
-    raise ValueError(f"Invalid connection name format: {connection_name}.")
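
The resolution rule that moves from the module-level function into `BqConnectionManager.resolve_full_connection_name` depends only on the number of dots in the name. Below is a standalone sketch of that rule, copied from the added classmethod so it can be run without the package; the project, location, and connection names are placeholders. Note that, as shown in the diff, the new signature takes a required `str`, and the `None` branch that used `_BIGFRAMES_DEFAULT_CONNECTION_ID` is no longer in this module.

```python
def resolve_full_connection_name(
    connection_name: str, default_project: str, default_location: str
) -> str:
    """Expand a connection name to <PROJECT>.<LOCATION>.<CONNECTION_ID>."""
    dots = connection_name.count(".")
    if dots == 2:
        # Already fully qualified.
        return connection_name
    if dots == 1:
        # <LOCATION>.<CONNECTION_ID>: only the project is missing.
        return f"{default_project}.{connection_name}"
    if dots == 0:
        # Bare <CONNECTION_ID>: fill in project and location.
        return f"{default_project}.{default_location}.{connection_name}"
    raise ValueError(f"Invalid connection name format: {connection_name}.")


# Placeholder names, chosen only for the demonstration.
resolved = {
    name: resolve_full_connection_name(name, "my-project", "us")
    for name in ("my-conn", "eu.my-conn", "other-project.eu.my-conn")
}
```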
15 changes: 7 additions & 8 deletions bigframes/core/__init__.py
@@ -211,8 +211,8 @@ def column_ids(self) -> typing.Sequence[str]:
         return tuple(self._column_names.keys())

     @property
-    def hidden_ordering_columns(self) -> typing.Tuple[ibis_types.Value, ...]:
-        return self._hidden_ordering_columns
+    def _hidden_column_ids(self) -> typing.Sequence[str]:
+        return tuple(self._hidden_ordering_column_names.keys())

     @property
     def _reduced_predicate(self) -> typing.Optional[ibis_types.BooleanValue]:
@@ -400,24 +400,23 @@ def _hide_column(self, column_id) -> ArrayValue:
         expr_builder.ordering = self._ordering.with_column_remap({column_id: new_name})
         return expr_builder.build()

-    def promote_offsets(self) -> typing.Tuple[ArrayValue, str]:
+    def promote_offsets(self, col_id: str) -> ArrayValue:
         """
         Convenience function to promote copy of column offsets to a value column. Can be used to reset index.
         """
         # Special case: offsets already exist
         ordering = self._ordering

         if (not ordering.is_sequential) or (not ordering.total_order_col):
-            return self._project_offsets().promote_offsets()
-        col_id = bigframes.core.guid.generate_guid()
+            return self._project_offsets().promote_offsets(col_id)
         expr_builder = self.builder()
         expr_builder.columns = [
             self._get_any_column(ordering.total_order_col.column_id).name(col_id),
             *self.columns,
         ]
-        return expr_builder.build(), col_id
+        return expr_builder.build()

-    def select_columns(self, column_ids: typing.Sequence[str]):
+    def select_columns(self, column_ids: typing.Sequence[str]) -> ArrayValue:
         return self._projection(
             [self._get_ibis_column(col_id) for col_id in column_ids]
         )
@@ -807,7 +806,7 @@ def _create_order_columns(
         elif ordering_mode == "string_encoded":
             return (self._create_string_ordering_column().name(order_col_name),)
         elif expose_hidden_cols:
-            return self.hidden_ordering_columns
+            return self._hidden_ordering_columns
         return ()

     def _create_offset_column(self) -> ibis_types.IntegerColumn:
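
The `promote_offsets` change in this file moves column naming to the caller: the method used to generate an id and return `(ArrayValue, str)`, and now takes `col_id` and returns only the new value. A minimal sketch of the new calling convention follows, using a toy class; `ToyArrayValue`, the local `generate_guid`, and its `bfcol_` prefix are hypothetical stand-ins and not the library's implementation.

```python
import itertools
from dataclasses import dataclass
from typing import Tuple

_counter = itertools.count()


def generate_guid(prefix: str = "bfcol_") -> str:
    """Hypothetical stand-in for bigframes.core.guid.generate_guid."""
    return f"{prefix}{next(_counter)}"


@dataclass(frozen=True)
class ToyArrayValue:
    """Toy immutable value that tracks only column ids (illustration only)."""

    columns: Tuple[str, ...]

    def promote_offsets(self, col_id: str) -> "ToyArrayValue":
        # New convention: the caller names the column and gets back only
        # the new value, so the call can be chained like any other op.
        return ToyArrayValue((col_id, *self.columns))


value = ToyArrayValue(("a", "b"))
offsets_id = generate_guid()                  # caller picks the id up front
promoted = value.promote_offsets(offsets_id)  # returns only the new value
```

Because every operation returns the same type, the caller keeps the id it generated instead of unpacking it from a tuple.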
8 changes: 4 additions & 4 deletions bigframes/core/block_transforms.py
@@ -40,8 +40,8 @@ def equals(block1: blocks.Block, block2: blocks.Block) -> bool:

     equality_ids = []
     for lcol, rcol in zip(block1.value_columns, block2.value_columns):
-        lcolmapped = lmap(lcol)
-        rcolmapped = rmap(rcol)
+        lcolmapped = lmap[lcol]
+        rcolmapped = rmap[rcol]
         joined_block, result_id = joined_block.apply_binary_op(
             lcolmapped, rcolmapped, ops.eq_nulls_match_op
         )
@@ -563,8 +563,8 @@ def align_rows(
     joined_index, (get_column_left, get_column_right) = left_block.index.join(
         right_block.index, how=join
     )
-    left_columns = [get_column_left(col) for col in left_block.value_columns]
-    right_columns = [get_column_right(col) for col in right_block.value_columns]
+    left_columns = [get_column_left[col] for col in left_block.value_columns]
+    right_columns = [get_column_right[col] for col in right_block.value_columns]

     left_block = joined_index._block.select_columns(left_columns)
     right_block = joined_index._block.select_columns(right_columns)
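
The switch from call syntax to subscript syntax in both hunks suggests that the join now hands back mapping objects rather than lookup functions; that reading is inferred from the diff and not confirmed elsewhere on this page. A toy sketch of the pattern, where `toy_join` and the `_x`/`_y` suffixes are hypothetical:

```python
from typing import Dict, List, Sequence, Tuple


def toy_join(
    left_cols: Sequence[str], right_cols: Sequence[str]
) -> Tuple[List[str], Tuple[Dict[str, str], Dict[str, str]]]:
    """Return joined column ids plus old-id -> new-id mappings for each side."""
    left_map = {col: f"{col}_x" for col in left_cols}
    right_map = {col: f"{col}_y" for col in right_cols}
    joined = [*left_map.values(), *right_map.values()]
    return joined, (left_map, right_map)


left_value_columns = ["a", "b"]
right_value_columns = ["a", "c"]

joined_cols, (get_column_left, get_column_right) = toy_join(
    left_value_columns, right_value_columns
)
# Subscript lookup, matching the updated call sites in the diff.
left_columns = [get_column_left[col] for col in left_value_columns]
right_columns = [get_column_right[col] for col in right_value_columns]
```

A plain mapping can be inspected, iterated, and composed with other mappings, which a closure does not allow.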