3.7: full sys.path on flow PYTHONPATH breaks subprocess CLIs with bundled Python (SRE module mismatch) #22147

@yukiyan

Bug summary

On 3.7, prefect flow-run execute puts the runner's full sys.path (standard library included) onto the flow subprocess's PYTHONPATH. A pre-built-image deployment that shells out to a CLI bundling its own Python, such as bq, then crashes at that CLI's startup with AssertionError: SRE module mismatch. 3.6.27 and earlier are unaffected.

This is separate from #22080, which covers the auto-uv run path re-creating a .venv and OOM-killing pre-built images (made opt-in by #22116). The trigger and the responsible lines differ, but both come from the same place: the 3.7 workspace-resolver launch path assumes the runtime environment still needs setting up, which does not hold for a pre-built image that already has everything installed.

What the code does

prefect flow-run execute runs through the new WorkspaceResolvingEngineCommandStarter (#21699, #21796). While preparing the flow-run workspace:

  1. The workspace resolver captures the full sys.path with no filtering.
  2. The starter joins those entries into PYTHONPATH.
  3. That environment is passed to the flow-execution subprocess.
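The three steps above can be pictured with a minimal sketch. This is not Prefect's source: build_child_env and the variable names are hypothetical, and the only thing taken from this report is the behavior (capture sys.path unfiltered, join it into PYTHONPATH, hand that environment to the child).

```python
import os
import subprocess
import sys


def build_child_env(captured_sys_path, base_env):
    """Hypothetical stand-in for the launch path described above.

    Joins every captured sys.path entry (standard library included)
    into PYTHONPATH and returns the environment for the child process.
    """
    env = dict(base_env)
    # Empty entries (the '' that means "current directory") are skipped.
    env["PYTHONPATH"] = os.pathsep.join(p for p in captured_sys_path if p)
    return env


if __name__ == "__main__":
    # Step 1: capture the runner's sys.path with no filtering.
    captured = list(sys.path)
    # Step 2: join it into PYTHONPATH.
    child_env = build_child_env(captured, os.environ)
    # Step 3: the child, and anything it spawns, inherits that value.
    subprocess.run(
        [sys.executable, "-c", "import os; print(os.environ['PYTHONPATH'])"],
        env=child_env,
        check=True,
    )
```

Running it prints a PYTHONPATH that contains the interpreter's own standard-library directories, which is the shape of the value shown further down in this report.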

The same code is present on main (workspace_environment still joins the full captured sys.path), so the path is unchanged there. I confirmed this by code inspection rather than by reproducing on a main build.

A deployment that set only PYTHONPATH=/app ends up running its flow with:

PYTHONPATH=/app:/usr/local/lib/python312.zip:/usr/local/lib/python3.12:/usr/local/lib/python3.12/lib-dynload:/usr/local/lib/python3.12/site-packages:...

The standard-library directories are now on PYTHONPATH, and every subprocess the flow spawns inherits them.

Observed behavior

A flow task shells out to bq (Google Cloud SDK). bq starts its own Python, inherits this PYTHONPATH, and fails at interpreter startup:

Traceback (most recent call last):
  File "/opt/google-cloud-sdk/bin/bootstrapping/bq.py", line 11, in <module>
    import bootstrapping
  File "/opt/google-cloud-sdk/bin/bootstrapping/bootstrapping.py", line 32, in <module>
    import setup
  File "/opt/google-cloud-sdk/bin/bootstrapping/setup.py", line 57, in <module>
    from googlecloudsdk.core.util import platforms
  File "/opt/google-cloud-sdk/lib/googlecloudsdk/__init__.py", line 26, in <module>
    from googlecloudsdk.core.util import lazy_regex
  File "/opt/google-cloud-sdk/lib/googlecloudsdk/core/util/lazy_regex.py", line 23, in <module>
    import re
  File "/usr/local/lib/python3.12/re/__init__.py", line 125, in <module>
    ...
  File "/usr/local/lib/python3.12/re/_compiler.py", line 18, in <module>
    assert _sre.MAGIC == MAGIC, "SRE module mismatch"
AssertionError: SRE module mismatch

Every subflow that calls bq fails the same way. Tasks that reach BigQuery through the Python SDK in-process are unaffected; only the ones that spawn a separate Python break.

Why it happens

_sre.MAGIC is a constant compiled into the interpreter. re._compiler.MAGIC is read from whichever re/_compiler.py comes first on the path. With another version's standard-library directory on PYTHONPATH, a subprocess running a different Python imports that out-of-tree re; the two MAGIC values disagree and startup aborts.

_sre.MAGIC differs between CPython versions (3.12 is 20221023, 3.13 is 20230612), so the crash appears when the spawned tool's interpreter differs from the one whose standard library landed on PYTHONPATH. SDK-bundled CLIs like bq select their own Python, so they hit it. The deployment's own PYTHONPATH=/app did not trigger this on 3.6; the standard-library entries added on 3.7 do.
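To see which two values are being compared, the following diagnostic sketch can be run under any interpreter in the image. The function name sre_magic_report is hypothetical. It relies on re._constants, a private module that exists on CPython 3.11 and later, and falls back to sre_constants on older versions. In a healthy interpreter the two numbers match and re is loaded from that interpreter's own tree.

```python
import _sre
import re
import sys


def sre_magic_report():
    """Report the compiled-in MAGIC and the one the imported stdlib declares."""
    try:
        # CPython 3.11+: re is a package with a private _constants module.
        from re import _constants as constants
    except ImportError:
        # Older CPython keeps the constants in a top-level module.
        import sre_constants as constants
    return {
        "interpreter": sys.version.split()[0],
        "compiled_magic": _sre.MAGIC,      # baked into the C extension
        "stdlib_magic": constants.MAGIC,   # read from whichever re came first
        "re_loaded_from": re.__file__,
    }


if __name__ == "__main__":
    for key, value in sre_magic_report().items():
        print(f"{key}: {value}")
```

Under the failing configuration the script itself would abort on import re, which is the same assertion as in the traceback; running it with PYTHONPATH cleared shows the values the bundled interpreter expects.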

Reproduction

Same pre-built image, three PYTHONPATH values, running bq version:

| PYTHONPATH passed to the flow process | bq startup |
| --- | --- |
| /app only (the deployment's own value, 3.6 behavior) | works |
| runner's full sys.path (what 3.7 injects) | SRE module mismatch |
| empty | works |

The standard-library entries on PYTHONPATH are the trigger.

The mechanism reproduces with any two interpreters whose _sre.MAGIC differs (e.g. 3.12 and 3.13). Launch one while the other version's standard library is on PYTHONPATH:

# pythonA = 3.12, pythonB = 3.13 (or vice versa)
PYTHONPATH="$(pythonB -c 'import sysconfig; print(sysconfig.get_path("stdlib"))')" \
  pythonA -c "import re"
# pythonA imports pythonB's re/_compiler.py off PYTHONPATH; the MAGIC values
# disagree and startup aborts with: AssertionError: SRE module mismatch

Expected behavior

A pre-built-image deployment with its dependencies already installed should not get the runner's standard-library paths added to its PYTHONPATH. #19321, the design this path came from, states that existing deployments continue to work unchanged. That does not hold for image-based deployments which previously set only their own PYTHONPATH.

Version info

Version:              3.7.2
API version:          0.8.4
Python version:       3.12.13
Git commit:           5836855e
Built:                Sat, May 23, 2026 12:23 AM
OS/Arch:              linux/x86_64
Profile:              ephemeral
Server type:          ephemeral
Pydantic version:     2.13.1
Server:
  Database:           sqlite
  SQLite version:     3.40.1
Integrations:
  prefect-shell:      0.3.5
  prefect-slack:      0.3.1
  prefect-gcp:        0.6.19
  prefect-dbt:        0.7.24

Output of prefect version from inside the affected pre-built image (prefect 3.7.2). At flow-run time this image runs under a Cloud Run V2 worker (server type server), not the ephemeral server shown here.

Additional context

Environment:

  • Cloud Run V2 worker (prefect-gcp), pre-built Docker image, dependencies installed into the system Python at build time
  • Deployment job_variables.env includes PYTHONPATH=/app
  • Flow shells out to bq (Google Cloud SDK) via subprocess

Possible directions (for discussion):

  1. Drop standard-library locations (stdlib, platstdlib, lib-dynload, zip stdlib, all discoverable via sysconfig) from the captured sys.path before building PYTHONPATH. The interpreter resolves those itself.
  2. Add an opt-out setting for the PYTHONPATH injection, the way runner.auto_install_dependencies / PREFECT_RUNNER_AUTO_INSTALL_DEPENDENCIES was added in #22116 (Make runner dependency installation opt-in) for the auto-uv run behavior.
  3. Put on PYTHONPATH only the entries the resolver added during workspace preparation (pulled code, project root), not the runner's pre-existing sys.path.
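Direction 1 could look roughly like the sketch below. It is an illustration for discussion, not a proposed patch: runner_stdlib_dirs and filter_stdlib are hypothetical names, and the lib-dynload and zip locations assume a POSIX CPython layout (DESTSHARED is not set on every platform).

```python
import os
import sys
import sysconfig


def runner_stdlib_dirs():
    """Collect the running interpreter's standard-library locations."""
    dirs = set()
    for key in ("stdlib", "platstdlib"):
        path = sysconfig.get_path(key)
        if path:
            dirs.add(path)
            # Assumed POSIX layout: extension modules live in lib-dynload.
            dirs.add(os.path.join(path, "lib-dynload"))
    destshared = sysconfig.get_config_var("DESTSHARED")
    if destshared:
        dirs.add(destshared)
    stdlib = sysconfig.get_path("stdlib")
    if stdlib:
        # Assumed layout: pythonXY.zip sits next to the stdlib directory.
        zip_name = "python{}{}.zip".format(*sys.version_info[:2])
        dirs.add(os.path.join(os.path.dirname(stdlib), zip_name))
    return dirs


def filter_stdlib(entries, stdlib_dirs):
    """Drop exact stdlib locations; keep site-packages and user paths."""
    blocked = {os.path.normpath(p) for p in stdlib_dirs}
    return [e for e in entries if e and os.path.normpath(e) not in blocked]


if __name__ == "__main__":
    kept = filter_stdlib(sys.path, runner_stdlib_dirs())
    print(os.pathsep.join(kept))
```

Matching is by exact path rather than by prefix, because site-packages normally lives underneath the stdlib directory and a prefix match would drop it too.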

Workaround in use: clear PYTHONPATH for the offending command only, leaving it intact for in-process imports:

PYTHONPATH="" bq <args>

A blanket unset PYTHONPATH at the top of a wrapper script also drops /app, which breaks any python /app/... call later in the same script.
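For flows that call the CLI from Python rather than from a wrapper script, the same per-command scoping can be sketched with subprocess. The helper names env_without_pythonpath and run_isolated are hypothetical. The sketch removes the variable instead of setting it to an empty string; both leave the spawned interpreter with nothing extra on its path.

```python
import os
import subprocess


def env_without_pythonpath(base=None):
    """Return a copy of the environment with PYTHONPATH removed."""
    env = dict(os.environ if base is None else base)
    env.pop("PYTHONPATH", None)
    return env


def run_isolated(cmd, **kwargs):
    """Run one command without PYTHONPATH; the parent process keeps its own."""
    return subprocess.run(cmd, env=env_without_pythonpath(), check=True, **kwargs)


if __name__ == "__main__":
    # Example call; assumes bq is installed and on PATH in the image.
    run_isolated(["bq", "version"])
```

Because only the child's environment is changed, in-process imports from /app and later python /app/... calls in the same flow are unaffected.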

Metadata

Labels: bug