Standardizing Python task management

Abstract

This document is not a PEP but an aggregation of my notes, feedback, findings and experiments on this topic. However, it tries to stay as close as possible to the format of recent PEPs, for two reasons:

for me, to ensure that I provide as much as possible of the information a PEP would require
for readers, to make this document as easy as possible to read

The specification describes what I believe to be an efficient, flexible and scalable task management DSL. The appendices compare some (not all) existing tools and provide additional reasoning.

Feel free to comment.

Motivation

The key motivations for standardizing Python task management are:

Fragmentation in the ecosystem: Currently, every Python project management tool (PDM, Poetry, Hatch, etc.) implements its own task running system, leading to incompatible configurations and lock-in.
Lack of standardization: Unlike package metadata (PEP 621) or dependencies (PEP 508), there's no standard way to define project tasks, forcing users to learn multiple systems.
Reproducibility challenges: Different task definitions across tools make it harder to maintain consistent development workflows across projects and teams.
Tool migration overhead: Moving between different Python tools requires rewriting task configurations, even when the underlying commands remain the same.
Limited interoperability: Tools can't easily share or reuse task definitions, leading to duplication and inconsistency.

Rationale

The proposal for a standardized task specification system is based on several key principles:

Declarative Over Imperative

Tasks should be primarily defined declaratively in pyproject.toml, making them:

Easy to read and understand
Tool-agnostic and portable
Simple to parse and validate
Compatible with existing TOML parsers

Progressive Complexity

The specification supports both simple and complex use cases:

Basic tasks can be defined as simple strings
Advanced features available through structured configuration
Optional features for more complex workflows

Documentation

A task definition should allow documenting:

the task itself: its intent or expected action
supported parameters if any
supported options if any

Use Cases

Task management covers several key development workflows:

Local Development
- Running tests and linters
- Starting development servers
- Building documentation
- Managing dependencies
CI/CD Integration
- Build and deployment tasks
- Test automation
- Quality checks
- Release management
Project Maintenance
- Database migrations
- Asset compilation
- Code generation
- Environment setup
Cross-platform Support
- Windows/Unix compatibility
- Environment isolation
- Tool-specific configurations

This standardization effort aims to provide a common foundation that supports these use cases while remaining flexible enough for tool-specific extensions.

Specification

The [tasks] table in pyproject.toml defines project-level tasks that can be executed by any compatible task runner.

Task Definition

Tasks can be defined in three ways:

Simple string form (for basic commands):

[tasks]
test = "pytest"
lint = "flake8 ."

Table form (for advanced configuration):

[tasks.test]
cmd = "pytest"
help = "Execute the test suite"
env = { VALUE = "key" }
cwd = "src"

Dotted form (for flat advanced configuration):

[tasks]
test.cmd = "pytest"
test.help = "Execute the test suite"
test.env = { VALUE = "key" }
test.cwd = "src"

Task Properties

Each task (in any of the advanced forms) can have the following properties:

Execution type (only one of)

cmd (string or list): A command to execute as a subprocess (without shell)
shell (string): A command to execute in a shell
call (string): A reference to a Python callable to execute, with optional parameters
tasks (string or list or table): One or more task references or definitions

Environment

env (table): Environment variables for this task
envfile (string): Path to a file containing environment variables (in the form KEY=value) for this task
cwd (string): Working directory for task execution

Documentation

help (string): Documentation string for the task
args (list): Known positional arguments documentation
options (table): Known options documentation

Hooks

pre (string or list or table): One or more task references or definitions to run before the task
post (string or list or table): One or more task references or definitions to run after the task

Behavior

parallel (boolean, default: false): For composite tasks only, run the sub-tasks in parallel
condition (string): Environment marker expression used as the condition of execution
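
The "only one of" rule for execution types is easy to check mechanically. The following is a minimal sketch, not a reference implementation; execution_type is a hypothetical helper name, and it expects a task that has already been parsed into a dictionary.

```python
EXECUTION_KEYS = ("cmd", "shell", "call", "tasks")


def execution_type(task):
    """Return the single execution key of a task table, or raise ValueError."""
    found = [key for key in EXECUTION_KEYS if key in task]
    if len(found) != 1:
        raise ValueError(f"expected exactly one of {EXECUTION_KEYS}, found {found}")
    return found[0]


print(execution_type({"cmd": "pytest", "help": "Execute the test suite"}))  # cmd
```

Task groups carry no execution key at all, so a runner would have to identify them before applying this check.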

Tasks types

A task must be of one of the following kinds, and must define exactly one of the execution type properties.

Command

This is the most basic task type and the default one when using the simple form.

Task will be executed as a sub-process without a shell.

The command is given by the cmd property:

[tasks.test]
cmd = "pytest"

All extra unparsed arguments and options will be passed through to the command (i.e. <runner> test -k some will execute pytest -k some).
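
A minimal sketch of this pass-through, assuming the runner splits a string cmd with POSIX rules (shlex) and keeps a list cmd as is. build_argv is a hypothetical helper, and Windows quoting rules are not covered.

```python
import shlex


def build_argv(cmd, extra_args):
    """Build the argument vector for a command task, without any shell."""
    argv = shlex.split(cmd) if isinstance(cmd, str) else list(cmd)
    return argv + list(extra_args)


argv = build_argv("pytest", ["-k", "some"])
print(argv)  # ['pytest', '-k', 'some']
```

The resulting list can be handed to subprocess.run(argv), which does not spawn a shell when given a list.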

Shell

This kind of task executes the entire shell property in a sub-shell.

This means that line endings, variable references and operators are processed by the shell.

[tasks.test]
shell = """
echo "Running tests"
pytest
"""

All extra unparsed arguments and options are passed through to the command.

This means that in a multiline string, you might need to escape the last line ending.

For example, to ensure that pytest receives all remaining arguments (i.e. for <runner> test -k some to execute pytest -k some), you might have to write:

[tasks.test]
shell = """
echo "Running tests"
pytest \
"""

For more control over argument processing, see the Arguments processing section.

Python callable

This kind of task executes a Python callable. The fully qualified name of the callable is given by the call property:

[tasks]
# a function in a module
my_function.call = "module:my_function"
# a function in a package module
my_other_function.call = "package.module:my_function"

Arguments are not processed and parsing is the responsibility of the callable (they will be available in sys.argv).

It is possible to pass parameters to the callable:

[tasks.my_task]
call = "module:my_function('arg1', param=True)"
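
One possible way to interpret a call value, shown as a sketch only: split on the colon, parse the right-hand side with the ast module, and accept literal arguments only. parse_call and resolve are hypothetical helpers, and restricting parameters to literals is my assumption; it is not stated above.

```python
import ast
import importlib


def parse_call(spec):
    """Split 'module:func(args)' into (module, name, args, kwargs)."""
    module_name, _, expression = spec.partition(":")
    node = ast.parse(expression.strip(), mode="eval").body
    if isinstance(node, ast.Call):
        name = ast.unparse(node.func)
        args = [ast.literal_eval(arg) for arg in node.args]
        kwargs = {kw.arg: ast.literal_eval(kw.value) for kw in node.keywords}
        return module_name, name, args, kwargs
    return module_name, expression.strip(), [], {}


def resolve(spec):
    """Import the module and look up the callable named in the spec."""
    module_name, name, args, kwargs = parse_call(spec)
    target = importlib.import_module(module_name)
    for part in name.split("."):
        target = getattr(target, part)
    return target, args, kwargs


# Demonstrated with a standard library function instead of a project module.
func, args, kwargs = resolve("json:dumps({'a': 1}, sort_keys=True)")
print(func(*args, **kwargs))  # {"a": 1}
```

Using ast.literal_eval instead of eval keeps arbitrary expressions in pyproject.toml from being executed at parse time.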

Composite tasks

This kind of task executes multiple sub-tasks referenced by the tasks property. They can be existing tasks referenced by name:

[tasks.composite]
tasks = ["subtask1", "subtask2"]

Or they can be inline task definitions.

[tasks.composite]
tasks = [
    {shell = 'echo "Hello $NAME"', env = {NAME = "John Doe"}},
    {call = "module:hello"},
    {use = "existing-task"},
]

Processed arguments can be given explicitly.

In the case of a composite task with a single sub-task, the list can be omitted:

[tasks]
# Single task by name
first = "do something"
second.tasks = "first --with args"
third.tasks = {use = "second", env={KEY = "value"}}

Tasks may be executed in parallel using parallel = true.
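
Since the tasks property accepts a string, a table or a list mixing both, a runner will probably want to normalize it first. A minimal sketch, with normalize_subtasks as a hypothetical helper; it relies on the rule that a plain string is equivalent to an inline definition with the use property.

```python
def normalize_subtasks(value):
    """Turn a string, a table or a list of both into a list of inline definitions."""
    items = value if isinstance(value, list) else [value]
    return [{"use": item} if isinstance(item, str) else dict(item) for item in items]


print(normalize_subtasks("first --with args"))
# [{'use': 'first --with args'}]
print(normalize_subtasks(["subtask1", {"call": "module:hello"}]))
# [{'use': 'subtask1'}, {'call': 'module:hello'}]
```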

Inline task definition

An inline task definition can have the following properties:

Execution type (only one of)

cmd (string or list): A command to execute as a subprocess (without shell)
shell (string): A command to execute in a shell
call (string): A reference to a Python callable to execute, with optional parameters
use (string): The name of an existing task. This is the property implied by a string-only definition

Environment

env (table): Environment variables for this task
envfile (string): Path to a file containing environment variables (in the form KEY=value) for this task
cwd (string): Working directory for task execution

Behavior

condition (string): Environment marker expression used as the condition of execution

All of these properties are reserved keywords and cannot be used as task names.

Execution environment

Environment Variables

Global environment variables:

[tasks.env]
PYTHONPATH = "src"
DEBUG = "1"

Task environment variables:

# Inline table form
[tasks.first]
env = {PYTHONPATH = "src", DEBUG = "1"}
# Table form
[tasks.second.env]
PYTHONPATH = "src"
DEBUG = "1"

Variable substitution syntax:

Shell style: ${VAR}
Python style: {env.VAR}
Task properties: {task.name}, {task.cwd}
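
The three substitution styles can be handled in a single pass. This is a sketch under assumptions: interpolate is a hypothetical helper, and replacing unknown variables with an empty string is my choice because the behavior is not defined above.

```python
import re

# ${VAR} (shell style), {env.VAR} (Python style) and {task.property}
PATTERN = re.compile(r"\$\{(\w+)\}|\{(env|task)\.(\w+)\}")


def interpolate(text, env, task):
    """Replace variable references in text using the env and task mappings."""
    def replace(match):
        if match.group(1) is not None:
            return str(env.get(match.group(1), ""))
        scope = env if match.group(2) == "env" else task
        return str(scope.get(match.group(3), ""))

    return PATTERN.sub(replace, text)


env = {"PYTHONPATH": "src", "DEBUG": "1"}
task = {"name": "test", "cwd": "src"}
print(interpolate("${PYTHONPATH}:{env.DEBUG}:{task.name}", env, task))  # src:1:test
```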

Environment variables in `dotenv` file

Environment variables can be provided in a dotenv file, either globally or per task:

[tasks]
# Global
envfile = ".env"

# Per task
test.envfile = "tests.env"

Task execution should not fail if the file is missing. Task-local dotenv variables are loaded after global ones.
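
The two rules above (a missing file is not an error, and task-local values are loaded last) can be sketched as follows. read_envfile and load_env are hypothetical helpers. The parser only understands plain KEY=value lines, and skipping blank lines and lines starting with # is my assumption.

```python
import tempfile
from pathlib import Path


def read_envfile(path):
    """Parse KEY=value lines; a missing file yields an empty mapping."""
    path = Path(path)
    if not path.is_file():
        return {}
    variables = {}
    for line in path.read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        variables[key.strip()] = value.strip()
    return variables


def load_env(*paths):
    """Load files in order, so later (task-local) files override earlier ones."""
    env = {}
    for path in paths:
        env.update(read_envfile(path))
    return env


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / ".env").write_text("DEBUG=0\nPYTHONPATH=src\n")
    (root / "tests.env").write_text("DEBUG=1\n")
    env = load_env(root / ".env", root / "tests.env")
    missing = load_env(root / ".env", root / "absent.env")

print(env)      # {'DEBUG': '1', 'PYTHONPATH': 'src'}
print(missing)  # {'DEBUG': '0', 'PYTHONPATH': 'src'}
```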

Variable substitution syntax:

Shell style: ${VAR}

Working dir

The working directory can be specified using cwd. It can be absolute or relative to the directory containing pyproject.toml.

# Global
[tasks]
cwd = "src"

# Task-only
[tasks.doc]
cwd = "doc"
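
A minimal sketch of the resolution rule, using pathlib: joining a relative path keeps it under the project root, while joining an absolute path replaces the root. resolve_cwd is a hypothetical helper. It assumes that a task-level cwd replaces the global one rather than nesting under it, which is not spelled out above.

```python
from pathlib import PurePosixPath


def resolve_cwd(project_root, global_cwd=None, task_cwd=None):
    """Resolve the working directory relative to the pyproject.toml directory."""
    # Assumption: the task-level value wins over the global one.
    chosen = task_cwd or global_cwd or "."
    return PurePosixPath(project_root) / chosen


print(resolve_cwd("/repo", global_cwd="src"))                  # /repo/src
print(resolve_cwd("/repo", global_cwd="src", task_cwd="doc"))  # /repo/doc
print(resolve_cwd("/repo", task_cwd="/var/data"))              # /var/data
```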

String interpolation

Arguments processing

Tasks can receive arguments via:

Positional args: {args} or {args.0}
Named args: {args.name}

They can have a default value: {args:default}

[tasks.serve]
cmd = "flask run -p {args.port:5000}"
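
The placeholders above can be expanded with a single regular expression. This is a sketch, not a reference implementation: interpolate_args is a hypothetical helper that receives arguments already split into positional and named values, and how the runner extracts named values from the command line is deliberately left out.

```python
import re

# Matches {args}, {args.0}, {args.name}, each with an optional :default suffix.
ARG_PATTERN = re.compile(r"\{args(?:\.(\w+))?(?::([^}]*))?\}")


def interpolate_args(template, positional=(), named=None):
    """Expand argument placeholders, falling back to defaults when unset."""
    named = named or {}

    def replace(match):
        key, default = match.group(1), match.group(2)
        if key is None:
            value = " ".join(positional) or None
        elif key.isdigit():
            index = int(key)
            value = positional[index] if index < len(positional) else None
        else:
            value = named.get(key)
        if value is None:
            return default or ""
        return value

    return ARG_PATTERN.sub(replace, template)


print(interpolate_args("flask run -p {args.port:5000}"))                    # flask run -p 5000
print(interpolate_args("flask run -p {args.port:5000}", named={"port": "8000"}))  # flask run -p 8000
print(interpolate_args("pytest {args:tests}", positional=["-k", "some"]))   # pytest -k some
```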

Tool specific

Each tool can add its own string interpolation.

`pre` and `post` hooks

Pre/post hooks have the exact same format as the tasks property of composite tasks.

[tasks.test]
pre = "clean"
post = [
    "coverage",
    {cmd = "echo 'Done'"},
]
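
The resulting execution order can be sketched as a flat plan: pre hooks, then the task itself, then post hooks. execution_plan and its skip_hooks flag are hypothetical; the flag illustrates the optional ability of a runner to skip hooks.

```python
def execution_plan(name, tasks, skip_hooks=False):
    """Return the ordered steps for a task: pre hooks, the task, post hooks."""
    def as_list(value):
        if value is None:
            return []
        return value if isinstance(value, list) else [value]

    if skip_hooks:
        return [name]
    task = tasks[name]
    return as_list(task.get("pre")) + [name] + as_list(task.get("post"))


tasks = {
    "test": {
        "cmd": "pytest",
        "pre": "clean",
        "post": ["coverage", {"cmd": "echo 'Done'"}],
    }
}
print(execution_plan("test", tasks))
# ['clean', 'test', 'coverage', {'cmd': "echo 'Done'"}]
```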

Task runners may provide the ability to skip them.

Conditional execution

Each task can have a PEP 508 environment marker expression as its condition of execution. The task only runs if the expression evaluates to true.

[tasks.build]
condition = "python_version < '3.8' or platform_system == 'Windows'"

An additional files marker is also available to test for file presence. It only supports the in and not in operators.

condition = "'local.py' in files"
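
The files marker is not part of PEP 508, so a runner has to evaluate it itself. Here is a minimal sketch for the standalone form only. evaluate_files_marker is a hypothetical helper, and resolving the file name relative to the project root is my assumption. Combining it with standard markers through and / or is not covered.

```python
import re
import tempfile
from pathlib import Path

FILES_MARKER = re.compile(
    r"""^\s*(['"])(?P<name>.+?)\1\s+(?P<op>not\s+in|in)\s+files\s*$"""
)


def evaluate_files_marker(expression, root):
    """Evaluate "'name' in files" or "'name' not in files" against a directory."""
    match = FILES_MARKER.match(expression)
    if match is None:
        raise ValueError(f"unsupported files marker: {expression!r}")
    exists = (Path(root) / match["name"]).exists()
    return exists if match["op"] == "in" else not exists


with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "local.py").write_text("")
    present = evaluate_files_marker("'local.py' in files", tmp)
    absent = evaluate_files_marker("'other.py' not in files", tmp)

print(present, absent)  # True True
```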

Documentation

Each task can be documented using the help property.

Arguments can also be documented using the args property, as a single string or as a list of strings documenting each positional argument.

Options can be documented with the options table.

[tasks.serve]
help = "Run the server"
options = {port = "Port to listen to (default: 5000)"}

[tasks.deploy]
help = "Deploy to environment"
args = "The target environment (staging|prod)"

Task Groups

Tasks can be organized into groups:

[tasks.format]
help = "Code formatting tasks"

[tasks.format.black]
cmd = "black ."

[tasks.format.isort]
cmd = "isort ."

A group can't have any of the execution type properties (cmd, shell, call, tasks).

A group can have its own help, but it can't have args or options.

env, envfile and cwd are inherited by the tasks of the group.

A group runs its tasks in sequence unless parallel = true is defined.

Individual group tasks can be run as subcommands.

[tasks.build.assets]
call = "scripts.assets:build"

[tasks.build.docs]
cmd = "mkdocs build"

# Only execute sub tasks
$runner build assets
$runner build docs
# All at once
$runner build
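
A sketch of how a runner might flatten a group into runnable subcommands while applying the inheritance rule. expand is a hypothetical helper. It treats any table without an execution key as a group, and any non-reserved key inside it as a child task. Merging env key by key, with child values winning, is my assumption.

```python
EXECUTION_KEYS = {"cmd", "shell", "call", "tasks"}
RESERVED_KEYS = EXECUTION_KEYS | {
    "env", "envfile", "cwd", "help", "args", "options",
    "pre", "post", "parallel", "condition",
}


def expand(name, definition, inherited=None):
    """Return (subcommand, task) pairs for a task or, recursively, a group."""
    if isinstance(definition, str):
        definition = {"cmd": definition}
    inherited = inherited or {}
    # envfile and cwd are inherited as-is; env tables are merged.
    context = {key: definition.get(key, inherited.get(key)) for key in ("envfile", "cwd")}
    context["env"] = {**inherited.get("env", {}), **definition.get("env", {})}
    if definition.keys() & EXECUTION_KEYS:
        effective = {key: value for key, value in context.items() if value}
        return [(name, {**definition, **effective})]
    pairs = []
    for child, child_definition in definition.items():
        if child not in RESERVED_KEYS:
            pairs += expand(f"{name} {child}", child_definition, context)
    return pairs


group = {
    "help": "Code formatting tasks",
    "env": {"DEBUG": "1"},
    "black": {"cmd": "black ."},
    "isort": {"cmd": "isort ."},
}
for subcommand, task in expand("format", group):
    print(subcommand, task)
# format black {'cmd': 'black .', 'env': {'DEBUG': '1'}}
# format isort {'cmd': 'isort .', 'env': {'DEBUG': '1'}}
```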

Appendix A: Prior Art in Non-Python Languages

Makefile

Documentation: https://www.gnu.org/software/make/manual/make.html

Example:

.PHONY: test lint
test:  ## Run tests
    python -m pytest

lint:  ## Run linters
    flake8 .
    mypy .

Key points:

Cons: Limited Windows support, syntax can be cryptic, no built-in Python environment management
Best for: Universal build systems, language-agnostic projects
Task composition: Target dependencies and includes
Environment variables: Built-in support but syntax heavy
Documentation: Limited to target comments
Python integration: None native, requires external scripts

npm scripts

Documentation: https://docs.npmjs.com/cli/v8/using-npm/scripts

Example:

{
  "scripts": {
    "test": "python -m pytest",
    "lint": "flake8 .",
    "format": "black . && isort .",
    "pretest": "npm run lint",
    "dev": "python -m flask run"
  }
}

Key points:

Pros: Simple JSON configuration, huge ecosystem, good environment variables support
Cons: Node.js dependency, not Python-native, limited to shell commands
Best for: JavaScript/TypeScript projects, simple command aliasing
Task composition: Via pre/post hooks and composition operators
Environment variables: Supports both built-in and custom variables
Documentation: Can include description field for each script

mise tasks

Documentation: https://mise.jdx.dev/tasks.html

Example:

[tasks.test]
run = "pytest"
description = "Run tests"

[tasks.lint]
depends = ["test"]
run = "flake8 ."

Key points:

Cons: Another tool to learn, not Python-specific
Best for: Polyglot projects, version management
Task composition: Good support for task dependencies
Environment variables: Strong environment management
Documentation: Supports task descriptions
Python integration: Limited to environment variables

cargo-make

Documentation: https://sagiegurari.github.io/cargo-make/

Example:

[tasks.test]
command = "pytest"
description = "Run tests"

[tasks.lint]
dependencies = ["test"]
command = "flake8"

Key points:

Cons: Rust ecosystem tool, requires Rust installation
Best for: Complex build workflows, cross-platform projects
Task composition: Rich dependency system with conditions
Environment variables: Comprehensive environment control
Documentation: Extensive task documentation support
Python integration: Can run Python scripts but not native

rake

Documentation: https://ruby.github.io/rake/

Example:

desc "Run tests"
task :test do
  sh "python -m pytest"
end

desc "Run linters"
task :lint => [:test] do
  sh "flake8 ."
end

Key points:

Cons: Requires Ruby knowledge, Ruby installation
Best for: Ruby projects, complex task automation
Task composition: Rich task dependency system
Environment variables: Full programmatic control
Documentation: Task descriptions via comments
Python integration: Can execute Python via shell commands

Features Matrix

Feature	Makefile	npm scripts	mise tasks	cargo-make	rake
Configuration Format	Makefile syntax	package.json	mise.toml	TOML	Ruby code
Task Dependencies	Yes	Yes	Yes	Yes	Yes
Environment Management	No	node_modules	Yes	Yes	Bundler
Parallel Execution	Yes	Yes (with npm-run-all)	Yes	Yes	Yes
Shell Completion	Yes	Yes	Yes	Yes	Yes
Cross-platform	Limited	Yes	Yes	Yes	Yes
Learning Curve	Steep	Low	Medium	Medium	Medium
Python Native	No	No	No	No	No
Shell vs Command Mode	Shell only	Shell only	Shell only	Both	Both
ENV Variables Support	Yes	Yes	Yes	Yes	Yes
Task Composition	Limited	Yes	Yes	Yes	Yes
Arguments Passing	Limited	Yes	Yes	Yes	Yes
Python String Interpolation	No	No	No	No	No
Task Help/Documentation	Limited	Yes	Yes	Yes	Yes
Python Module/Function Call	No	No	No	No	No

Appendix B: Prior Art in Python

Declaration-based Tools

Key points: Easy to set up, integrated with Python tooling

PDM scripts

Documentation: https://pdm-project.org/latest/usage/scripts/

Example:

[tool.pdm.scripts]
test = "pytest"
lint = {shell = "flake8 src/"}
serve = {call = "myapp.main:start_server"}

Key points:

Cons: Requires PDM adoption
Best for: Pure Python projects using modern tooling
Task composition: Supports both shell commands and Python functions
Environment variables: Full support with variable substitution
Documentation: Supports detailed help text in pyproject.toml
Python integration: Can use Python expressions in string interpolation

Rye scripts

Documentation: https://rye.astral.sh/guide/pyproject/#toolryescripts

Example:

[tool.rye.scripts]
test = "pytest"
serve = "python -m myapp.server"
"format:black" = "black ."
"format:isort" = "isort ."
format = { chain = ["format:black", "format:isort"] }

Key points:

Cons: New tool, limited features compared to alternatives
Best for: Projects already using Rye for Python management
Task composition: Basic shell command execution

Hatch scripts

Documentation: https://hatch.pypa.io/latest/config/environment/overview/#scripts

Example:

[tool.hatch.envs.default.scripts]
test = "pytest {args:tests}"
lint = [
  "black .",
  "flake8 src/"
]

Key points:

Cons: Limited to environment-specific scripts
Best for: Projects using Hatch for packaging

tox

Documentation: https://tox.wiki/en/latest/

Example:

tox.ini format

[tox]
envlist = py39,py310,lint

[testenv]
deps = pytest
commands = pytest {posargs:tests}

[testenv:lint]
deps = flake8
commands = flake8 src/

pyproject.toml format

[tool.tox]
env_list = ["py39", "py310", "lint"]

[tool.tox.env_run_base]
description = "Run test under {base_python}"
commands = [["pytest", "{posargs:tests}"]]

[tool.tox.env.lint]
deps = ["flake8"]
commands = [["flake8", "src/"]]

Key points:

Cons: Complex configuration, slower than alternatives
Best for: Testing across multiple Python versions/environments
Task composition: Supports dependencies and factor-conditional environments
Environment variables: Rich environment isolation and passthrough
Documentation: Detailed configuration documentation
Python integration: Supports substitutions and environment isolation
Module/Function execution: Supports module execution via python -m and direct test runners

Poe the Poet

Documentation: https://poethepoet.natn.io/

Example:

[tool.poe.tasks]
test = "pytest"
lint = "flake8"
format = { shell = "black . && isort ." }
serve = { script = "myapp.server:main()" }

Key points:

Cons: Less powerful than pure Python solutions
Best for: Poetry-based projects, straightforward task automation
Task composition: Supports task dependencies and chaining
Environment variables: Good support with variable substitution
Documentation: Integrated help system
Python integration: Can execute Python code and modules
Module/Function execution: Supports Python module and function execution

Feature matrix

Feature	PDM scripts	Rye scripts	Hatch scripts	tox	Poe
Configuration Format	pyproject.toml	pyproject.toml	pyproject.toml	tox.ini/pyproject.toml	pyproject.toml
Task Dependencies	No	No	No	Yes (`depends`)	Yes (`deps`)
Pre/post hooks	Yes	No	No	Yes	`pre` only
Parallel Execution	No	No	No	Yes	No
Python Native call	Yes	Yes	No	No	Yes
Shell vs Command Mode	Both	Command only	Shell only	Shell only	Both
ENV Variables Support	Yes	Yes	Yes	Yes	Yes
Task Composition	`composite`	`chain`	Limited	No	`sequence`
Arguments Passing	Yes	Yes	Limited	Yes	Yes
String interpolation	Yes	No	Yes	Yes	Yes

Makefile-like Tools

Key points: More flexibility and power when needed

Invoke

Documentation: https://www.pyinvoke.org/

Example:

from invoke import task

@task
def test(c):
    """Run tests"""
    c.run("pytest")

@task(test)
def lint(c):
    """Run linters"""
    c.run("flake8 .")

Key points:

Best for: Complex Python-based automation
Task composition: Rich Python-based composition with decorators
Environment variables: Full programmatic control
Documentation: Auto-generated help from docstrings
Python integration: Native Python code with full access to runtime
Module/Function execution: Native support through direct Python imports and calls

duty

Documentation: https://duty.readthedocs.io/

Example:

from duty import duty

@duty
def test(ctx):
    """Run the test suite."""
    ctx.run("pytest tests/", title="Running tests")

@duty(pre=["test"])
def lint(ctx):
    """Run linting tools."""
    ctx.run("flake8 src/", title="Linting code")

Key points:

Cons: Requires Python code for configuration like Invoke
Best for: Type-safe Python automation projects
Task composition: Strong support via Python functions and decorators
Environment variables: Full programmatic control
Documentation: Auto-generated help from docstrings
Python integration: Native Python code with type safety
Module/Function execution: Native support through Python imports and calls

Features matrix

Feature	Invoke	duty
Configuration Format	Python code	Python code
Task Dependencies	Yes	Yes
Environment Management	No	No
Parallel Execution	Yes	Yes
Shell Completion	Yes	Yes
Cross-platform	Yes	Yes
Learning Curve	Medium	Medium
Python Native	Yes	Yes
Shell vs Command Mode	Both	Both
ENV Variables Support	Yes	Yes
Task Composition	Yes	Yes
Arguments Passing	Yes	Yes
Python String Interpolation	Yes	Yes
Task Help/Documentation	Yes	Yes
Python Module/Function Call	Yes	Yes

Appendix C: Use Cases

Local development

Shell

Common shell-based development tasks:

Running tests (unit, integration, e2e)
Code formatting (black, isort, prettier)
Linting (flake8, pylint, mypy)
Building documentation (Sphinx, MkDocs)
Managing virtual environments (venv, conda)
Database migrations (alembic, django)
Starting development servers (flask, django)
Cleaning build artifacts (pycache, dist)
Running code generators (protobuf, openapi)
Managing dependencies (pip, poetry, pdm)
Asset compilation (webpack, sass)
Local deployment tasks
Pre-commit hook management

IDE

IDE integration requirements:

Task discovery and listing in UI
Task execution from UI/keyboard shortcuts
Environment variable configuration
Debug configuration integration
Output capturing and formatting
Error parsing with source file linking
Task dependencies visualization
Quick access to frequent tasks
Configuration file editing support
Task documentation display
Shell/terminal integration
Live reload capabilities
Multi-root workspace support

Continuous testing

Requirements for continuous testing workflows:

Watch mode for file changes
Fast test selection and execution
Test environment isolation
Coverage reporting and enforcement
Parallel test execution
Test categorization (unit/integration/e2e)
Fixture management
Database reset/cleanup
Log capture and formatting
Test result reporting
Failed test rerunning
Resource cleanup
CI pipeline integration
Cross-platform compatibility
Performance metrics collection

Packaging docker projects

Tasks are an easy way to package the commands provided by a Docker image.

The main use case is for Docker-based services with administrative tasks as extra commands.

pdm-dockerize is an example of reusing project provided tasks as docker extra commands.

Appendix D: Semantics: `tasks` vs `scripts`

Project tasks are often referred to by two terms: tasks and scripts. While this might seem secondary, it has consequences. tasks is chosen over scripts for the following reasons:

tasks is more user- and project-oriented, while scripts describes the implementation. Since the purpose is to provide users and developers with tools, tasks seems more appropriate.
Project tasks might be scripts, but they may also run binaries, which by definition are not scripts.
project.scripts is already used by PEP 621 to describe script entry points, so reusing the same word would be confusing.

Appendix E: `dependencies` vs `hooks` vs `composition`

Task dependencies, or declarative dependencies, are a way to document task requirements. Each declared dependency must have run at least once before a task can be executed.

Task hooks (pre and post) are tasks that are systematically executed before or after a task. They may be deactivated or skipped, and external systems can potentially "hook" into them using plugins.

Task composition is another concept: a task can aggregate other tasks.

We chose hooks and composition because they cover more cases, while dependencies often force the runner to include a dependency resolver, which is more complex.

noirbizarre/pyproject-tasks.md
