Writing a Task

This guide walks through adding a new bioinformatics task to Stargazer.

1. Define the Asset Type (if needed)

If your task produces a new kind of output, add an asset subclass in src/stargazer/types/:

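from typing import ClassVar

class MyOutput(Asset):  # Asset: Stargazer's base asset class
    _asset_key: ClassVar[str] = "my_output"  # unique key used for registration
    sample_id: str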

The _asset_key must be unique. The class auto-registers via __init_subclass__.
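
To illustrate how registration through __init_subclass__ can enforce unique keys, here is a minimal sketch. The Asset class below is a stand-in written for this example, and the _registry attribute and try_duplicate helper are hypothetical; Stargazer's actual base class may differ in its details.

```python
from typing import ClassVar


class Asset:
    """Stand-in base class for illustration (not Stargazer's real Asset)."""

    _asset_key: ClassVar[str] = ""
    # Hypothetical registry: maps each asset key to the class that claimed it.
    _registry: ClassVar[dict[str, type]] = {}

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        # Look only at the subclass's own namespace, not inherited values.
        key = cls.__dict__.get("_asset_key")
        if not key:
            raise TypeError(f"{cls.__name__} must define a non-empty _asset_key")
        if key in Asset._registry:
            raise ValueError(f"duplicate _asset_key: {key!r}")
        Asset._registry[key] = cls


class MyOutput(Asset):
    _asset_key: ClassVar[str] = "my_output"


def try_duplicate() -> str:
    """Try to reuse an existing key and return the resulting error message."""
    try:
        class Clash(Asset):
            _asset_key: ClassVar[str] = "my_output"
    except ValueError as exc:
        return str(exc)
    return "no error"
```

In this sketch, defining MyOutput is enough to register it, and a second class that reuses "my_output" fails at class-definition time rather than at first use.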

2. Create the Task Module

Add a file in src/stargazer/tasks/ named after the tool (e.g., my_tool.py):

import asyncio
from pathlib import Path

import flyte

from stargazer.types.reference import Reference
from stargazer.types.alignment import Alignment
# Also import MyOutput from the module where you defined it in step 1.

tool_env = flyte.TaskEnvironment(name="my_tool")

@tool_env.task
async def run_my_tool(ref: Reference, aln: Alignment) -> MyOutput:
    """One-line description of what this task does."""
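    # Fetch both inputs in parallel before touching their local paths.
    await asyncio.gather(ref.fetch(), aln.fetch())

    output_path = Path("/tmp/output.ext")
    # ... run tool subprocess ...

    # Register the output file in storage along with its metadata.
    result = MyOutput()
    await result.update(output_path, sample_id=aln.sample_id)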
    return result
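
The "run tool subprocess" step is left open in the template above. One possible way to fill it in, sketched here with the standard library only, is asyncio.create_subprocess_exec, which keeps the event loop free while the tool runs. The names run_tool, build_command, and demo are hypothetical helpers for this example, and the placeholder command simply copies a file with the Python interpreter; a real task would invoke the actual tool instead.

```python
import asyncio
import sys
import tempfile
from pathlib import Path


async def run_tool(cmd: list[str]) -> str:
    """Run a command without blocking the event loop; return its stdout."""
    proc = await asyncio.create_subprocess_exec(
        *cmd,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.PIPE,
    )
    stdout, stderr = await proc.communicate()
    if proc.returncode != 0:
        # Surface the tool's stderr so failures are easy to diagnose.
        raise RuntimeError(
            f"{cmd[0]} exited with {proc.returncode}: {stderr.decode()}"
        )
    return stdout.decode()


def build_command(input_path: Path, output_path: Path) -> list[str]:
    """Build the argv list for a placeholder tool that copies input to output."""
    script = "import shutil, sys; shutil.copy(sys.argv[1], sys.argv[2]); print('done')"
    # Paths stay as pathlib.Path until here; str() is applied only for argv.
    return [sys.executable, "-c", script, str(input_path), str(output_path)]


async def demo() -> tuple[str, str]:
    """Run the placeholder tool in a temporary directory."""
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp) / "input.txt"
        dst = Path(tmp) / "output.txt"
        src.write_text("ACGT\n")
        stdout = await run_tool(build_command(src, dst))
        return stdout.strip(), dst.read_text()
```

Raising on a non-zero exit code means a failed tool run stops the task before update() is called, so a missing or partial output is never registered.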

3. Key Rules

One task, one operation: don't combine multiple tool calls unless there's a good reason, such as piping between tools where the intermediate output would have little long-term value for re-analysis
Always fetch() inputs before accessing their paths
Always update() outputs to register them in storage
Use asyncio.gather to fetch multiple inputs in parallel
Specify resources via TaskEnvironment for CPU/memory/GPU-intensive tools
Use pathlib.Path for all filesystem operations; convert to str only for subprocess calls
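
The benefit of fetching inputs with asyncio.gather can be seen in a small sketch. FakeAsset is a hypothetical stand-in whose fetch() simulates a download with a sleep; it is not a Stargazer class, and fetch_all is a helper written for this example.

```python
import asyncio
import time
from pathlib import Path
from typing import Optional


class FakeAsset:
    """Stand-in asset; fetch() simulates a download with a short sleep."""

    def __init__(self, name: str, delay: float = 0.2) -> None:
        self.name = name
        self.delay = delay
        self.path: Optional[Path] = None  # unset until fetch() completes

    async def fetch(self) -> None:
        await asyncio.sleep(self.delay)
        self.path = Path("/tmp") / self.name


async def fetch_all(*assets: FakeAsset) -> float:
    """Fetch every asset concurrently and return the elapsed seconds."""
    start = time.perf_counter()
    await asyncio.gather(*(asset.fetch() for asset in assets))
    return time.perf_counter() - start
```

With two assets that each take 0.2 seconds, the concurrent fetch finishes in roughly the time of the slowest one rather than the sum of both. The path attribute stays None until fetch() has run, which is why inputs must be fetched before their paths are read.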

4. Register in the MCP Server

Tasks are automatically discovered if they're importable from the stargazer.tasks package. Ensure your module is imported in src/stargazer/tasks/__init__.py.

5. Test

Write a test in tests/unit/ that mocks storage and verifies the task produces the expected output type with correct keyvalues.
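
As a rough sketch of what such a test could look like, the example below uses unittest.mock.AsyncMock in place of a storage backend. Everything here is a simplified stand-in: this MyOutput, its keyvalues attribute, the storage.upload method, and the storage parameter on run_my_tool are assumptions made for illustration, not Stargazer's actual interfaces. In the real codebase you would more likely patch the storage client that the asset classes use.

```python
import asyncio
from pathlib import Path
from unittest.mock import AsyncMock


class MyOutput:
    """Stand-in asset (hypothetical); the real one would subclass Asset."""

    def __init__(self, storage) -> None:
        self._storage = storage
        self.keyvalues: dict[str, str] = {}

    async def update(self, path: Path, **keyvalues: str) -> None:
        # Hand the file and its metadata to storage, then record the metadata.
        await self._storage.upload(path, keyvalues)
        self.keyvalues.update(keyvalues)


async def run_my_tool(sample_id: str, storage) -> MyOutput:
    """Simplified stand-in for the task body: register one output file."""
    result = MyOutput(storage)
    await result.update(Path("/tmp/output.ext"), sample_id=sample_id)
    return result


def test_run_my_tool_registers_output() -> None:
    storage = AsyncMock()  # no real storage backend is touched
    result = asyncio.run(run_my_tool("sample-1", storage))

    # The task returns the expected type with the expected metadata.
    assert isinstance(result, MyOutput)
    assert result.keyvalues == {"sample_id": "sample-1"}
    # The output was registered in storage exactly once.
    storage.upload.assert_awaited_once_with(
        Path("/tmp/output.ext"), {"sample_id": "sample-1"}
    )
```

Checking both the returned object and the awaited storage call catches the two most likely mistakes: returning an asset without calling update(), and calling update() with the wrong metadata.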