Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow flake8-type-checking rules to automatically quote runtime-evaluated references #6001

Merged
merged 6 commits into from
Dec 13, 2023

Conversation

charliermarsh
Copy link
Member

@charliermarsh charliermarsh commented Jul 23, 2023

Summary

This allows us to fix usages like:

from pandas import DataFrame

def baz() -> DataFrame:
    ...

By quoting the DataFrame in -> DataFrame. Without quotes, moving from pandas import DataFrame into an if TYPE_CHECKING: block will fail at runtime, since Python tries to evaluate the annotation to add it to the function's __annotations__.

Unfortunately, this does require us to split our "annotation kind" flags into three categories, rather than two:

  • typing-only: The annotation is only evaluated at type-checking-time.
  • runtime-evaluated: Python will evaluate the annotation at runtime (like above) -- but we're willing to quote it.
  • runtime-required: Python will evaluate the annotation at runtime (like above), and some library (like Pydantic) needs it to be available at runtime, so we can't quote it.

This functionality is gated behind a setting (flake8-type-checking.quote-annotations).

Closes #5559.

@github-actions
Copy link
Contributor

github-actions bot commented Jul 23, 2023

PR Check Results

Ecosystem

✅ ecosystem check detected no changes.

@charliermarsh charliermarsh force-pushed the charlie/quote branch 5 times, most recently from 41a82c7 to 33ee9b7 Compare July 23, 2023 04:34
@charliermarsh
Copy link
Member Author

charliermarsh commented Jul 23, 2023

This "works" on Dagster in that it generates 1834 fixes (and no syntax errors) that generally look right:

--- a/examples/assets_smoke_test/assets_smoke_test/pure_python_assets.py
+++ b/examples/assets_smoke_test/assets_smoke_test/pure_python_assets.py
@@ -1,5 +1,9 @@
+from typing import TYPE_CHECKING
+
 from dagster import SourceAsset, TableSchema, asset
-from pandas import DataFrame
+
+if TYPE_CHECKING:
+    from pandas import DataFrame

 raw_country_populations = SourceAsset(
     "raw_country_populations",
@@ -19,7 +23,7 @@


 @asset
-def country_populations(raw_country_populations) -> DataFrame:
+def country_populations(raw_country_populations) -> "DataFrame":
     country_populations = raw_country_populations.copy()
     country_populations["change"] = (
         country_populations["change"]
@@ -32,13 +36,13 @@ def country_populations(raw_country_populations) -> DataFrame:


 @asset
-def continent_stats(country_populations: DataFrame) -> DataFrame:
+def continent_stats(country_populations: "DataFrame") -> "DataFrame":
     result = country_populations.groupby("continent").agg({"pop2019": "sum", "change": "mean"})
     return result


 @asset
-def country_stats(country_populations: DataFrame, continent_stats: DataFrame) -> DataFrame:
+def country_stats(country_populations: "DataFrame", continent_stats: "DataFrame") -> "DataFrame":
     result = country_populations.join(continent_stats, on="continent", lsuffix="_continent")
     result["continent_pop_fraction"] = result["pop2019"] / result["pop2019_continent"]
     return result

@tylerlaprade
Copy link
Contributor

Yes! I was wondering yesterday if I misunderstood the purpose of the rule, because it was barely applying. When I realized I needed to add quotes, I briefly started manually adding them, but then realized it was unfeasible to do manually on even a medium-sized codebase.

@charliermarsh
Copy link
Member Author

(If you use from __future__ import annotations, you'll also get more of the expected changes, but that's not always feasible to add either depending on your use-case.)

@charliermarsh charliermarsh force-pushed the charlie/quote branch 2 times, most recently from 1976038 to 0c338cf Compare July 26, 2023 18:24
@charliermarsh
Copy link
Member Author

This was blocked as it became too difficult to quote an annotation given just the reference range. It should be unblocked now that we store all expressions in a vector, since we can store an ExpressionId on all references to go from reference to Expr.

Copy link
Contributor

github-actions bot commented Dec 10, 2023

ruff-ecosystem results

Linter (stable)

ℹ️ ecosystem check detected linter changes. (+2 -1 violations, +0 -0 fixes in 41 projects)

ibis-project/ibis (+2 -0 violations, +0 -0 fixes)

+ ibis/common/tests/test_graph_benchmarks.py:6:37: RUF100 Unused `noqa` directive (unused: `TCH002`)
+ ibis/common/typing.py:33:46: RUF100 Unused `noqa` directive (unused: `TCH002`)

zulip/zulip (+0 -1 violations, +0 -0 fixes)

ruff check --no-cache --exit-zero --no-preview --select ALL

- analytics/views/stats.py:13:31: TCH002 Move third-party import `typing_extensions.TypeAlias` into a type-checking block

Changes by rule (2 rules affected)

code total + violation - violation + fix - fix
RUF100 2 2 0 0 0
TCH002 1 0 1 0 0

Linter (preview)

ℹ️ ecosystem check detected linter changes. (+2 -1 violations, +0 -0 fixes in 41 projects)

ibis-project/ibis (+2 -0 violations, +0 -0 fixes)

ruff check --no-cache --exit-zero --preview

+ ibis/common/tests/test_graph_benchmarks.py:6:37: RUF100 Unused `noqa` directive (unused: `TCH002`)
+ ibis/common/typing.py:33:46: RUF100 Unused `noqa` directive (unused: `TCH002`)

zulip/zulip (+0 -1 violations, +0 -0 fixes)

ruff check --no-cache --exit-zero --preview --select ALL

- analytics/views/stats.py:13:31: TCH002 Move third-party import `typing_extensions.TypeAlias` into a type-checking block

Changes by rule (2 rules affected)

code total + violation - violation + fix - fix
RUF100 2 2 0 0 0
TCH002 1 0 1 0 0

@charliermarsh charliermarsh force-pushed the charlie/quote branch 2 times, most recently from 57915b4 to 77301d4 Compare December 10, 2023 03:31
@charliermarsh
Copy link
Member Author

The diff of this on Dagster actually looks really good.

@charliermarsh
Copy link
Member Author

For example:

@@ -25,7 +25,6 @@
 from dagster._core.events import DagsterEvent
 from dagster._core.execution.api import create_execution_plan, scoped_job_context
 from dagster._core.execution.plan.outputs import StepOutputHandle
-from dagster._core.execution.plan.plan import ExecutionPlan
 from dagster._core.execution.plan.state import KnownExecutionState
 from dagster._core.execution.plan.step import ExecutionStep
 from dagster._core.execution.resources_init import (
@@ -34,7 +33,6 @@
 )
 from dagster._core.instance import DagsterInstance
 from dagster._core.instance.ref import InstanceRef
-from dagster._core.log_manager import DagsterLogManager
 from dagster._core.storage.dagster_run import DagsterRun, DagsterRunStatus
 from dagster._core.system_config.objects import ResolvedRunConfig, ResourceConfig
 from dagster._core.utils import make_new_run_id
@@ -47,6 +45,8 @@
 from .serialize import PICKLE_PROTOCOL

 if TYPE_CHECKING:
+    from dagster._core.execution.plan.plan import ExecutionPlan
+    from dagster._core.log_manager import DagsterLogManager
     from dagster._core.definitions.node_definition import NodeDefinition


@@ -81,8 +81,8 @@ def _setup_resources(
         self,
         resource_defs: Mapping[str, ResourceDefinition],
         resource_configs: Mapping[str, ResourceConfig],
-        log_manager: DagsterLogManager,
-        execution_plan: Optional[ExecutionPlan],
+        log_manager: "DagsterLogManager",
+        execution_plan: Optional["ExecutionPlan"],
         dagster_run: Optional[DagsterRun],
         resource_keys_to_init: Optional[AbstractSet[str]],
         instance: Optional[DagsterInstance],

@charliermarsh charliermarsh marked this pull request as ready for review December 10, 2023 03:37
@charliermarsh
Copy link
Member Author

We may want to add a setting to allow users to either (1) insert __future__ or (2) quote annotations or (3) do neither.

@charliermarsh
Copy link
Member Author

@smackesey -- You may be interested in this (finally resurrected after a long break).

@dhruvmanila
Copy link
Member

I've gone through the changes and it looks good although I'd like to give a second look at it in the evening.

@smackesey
Copy link

You may be interested in this

Definitely-- thanks for bringing it back! To spare churn in imports for other engs I like to apply changes in one PR-- is this anywhere on your agenda: #1107? Would be great to apply them both at once.

@charliermarsh
Copy link
Member Author

@smackesey - I'm still interested in supporting that but it won't be merged on the same timeline, since it requires us to build up an import map that we share across modules.

Copy link
Member

@MichaReiser MichaReiser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me, but I only reviewed if I roughly understand what's happening, not if the implemented semantics are correct (because I don't understand thme).

crates/ruff_linter/src/checkers/ast/mod.rs Outdated Show resolved Hide resolved
crates/ruff_linter/src/checkers/ast/mod.rs Outdated Show resolved Hide resolved
crates/ruff_linter/src/checkers/ast/mod.rs Outdated Show resolved Hide resolved
crates/ruff_linter/src/checkers/ast/mod.rs Show resolved Hide resolved
/// The trimmed range of the import (e.g., `List` in `from typing import List`).
pub(crate) range: TextRange,
/// The range of the import's parent statement.
pub(crate) parent_range: Option<TextRange>,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In which situations can the parent_range be None? Shouldn't an import always come from a statement. This might be worth documenting.

crates/ruff_python_semantic/src/model.rs Outdated Show resolved Hide resolved
crates/ruff_python_semantic/src/reference.rs Show resolved Hide resolved
Comment on lines 1657 to 1660
/// In other words, moving `from collections.abc import Sequence` into an
/// `if TYPE_CHECKING:` block above would cause a runtime error.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This part confused me. I admint, mainly because I didn't read carefully but I first thought that what ruff does will create a runtime error.

maybe explain why it causes a runtime error.

Or it might be that I got confused because the documentation explains what it does and the example shows when it doesn't work (rather than when it does).

would cause a runtime error because the type is no longer available at runtime.

Copy link
Member

@dhruvmanila dhruvmanila left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! This is a great improvement to the rule. Overall the implementation looks really solid apart from some minor nits.

crates/ruff_linter/src/checkers/ast/mod.rs Outdated Show resolved Hide resolved
Comment on lines 688 to 699
flake8_type_checking::helpers::runtime_evaluated_class(
flake8_type_checking::helpers::runtime_required_class(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's unfortunate that we now have runtime required internally but runtime evaluated in the user facing option. Maybe we could use either v0.2.0 or v0.3.0 as a way to rename these options.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes great observation! It's really unfortunate.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, I'm gonna change the internal names.

Comment on lines +257 to +268
if annotation.contains('\'') || annotation.contains('"') {
return Err(anyhow::anyhow!("Annotation already contains a quote"));
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the function example, we've mentioned to update the internal quote:

  • When quoting Series in Series[Literal["pd.Timestamp"]], we want "Series[Literal['pd.Timestamp']]".

With this code, I think that doesn't work, right? Can we use the Stylist to only check for the opposite quote?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah sorry, it doesn't do that yet.

@charliermarsh charliermarsh force-pushed the charlie/quote branch 3 times, most recently from daa49ed to f3eaec7 Compare December 13, 2023 03:07
@charliermarsh charliermarsh enabled auto-merge (squash) December 13, 2023 03:07
@charliermarsh charliermarsh merged commit 1a65e54 into main Dec 13, 2023
16 checks passed
@charliermarsh charliermarsh deleted the charlie/quote branch December 13, 2023 03:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

TCH setting to autofix with conversion to string
5 participants