Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pipelines with multiple inputs #29

Open
lschr opened this issue Oct 9, 2017 · 5 comments
Open

Pipelines with multiple inputs #29

lschr opened this issue Oct 9, 2017 · 5 comments

Comments

@lschr
Copy link
Contributor

lschr commented Oct 9, 2017

I was thinking about implementing support for multiple inputs (and maybe outputs) to pipelines. For example, it would be nice to be able to

@pipeline
def add(a1, a2):
    return a1 + a2

i1 = pims.open("img1.tif")
i2 = pims.open("img2.tif")
i12 = add(i1, i2)

Before I start I wanted to ask how to go about it. A few possibilities, decreasing in elegance (in my opinion) but increasing in API compatibility:

  • Extend the Pipeline class (and pipeline decorator)
    • Rearrange __init__ arguments: def __init__(proc_func, *ancestors, propagate_attrs=None). This breaks the API.
    • Keep __init__ arguments in order: def __init__(*args, propagate_attrs=None), where args[-1] is proc_func. This turns propagate_attrs into a keyword-only and proc_func and ancestor into positional arguments but otherwise preserves the API.
    • Preserve the API: def __init__(ancestor, proc_func, propagate_attrs=None, *other_ancestors)
    • Some middle ground between the above
  • Create a new class (and decorator). Complete freedom since there is no backward-compatibility to think of.
  • Something else I did not think about

Which is the preferred way?

@caspervdw
Copy link
Member

@lschr Thanks for this great initiative! I would personally start extending the Pipeline class. I am not very worried about breaking the API at this point, as there is still no PIMS release that depends on it (ref soft-matter/pims#247). So I would say, just go for the most elegant option.

@lschr
Copy link
Contributor Author

lschr commented Oct 10, 2017

This should be easy enough. The only problem I have come across so far is attribute propagation. A few possibilities:

  • Propagate only attributes from the first ancestor
  • Propagate attributes from all ancestors as long as there are no name conflicts. If there are conflicts, use the attribute from the first ancestor that has the attribute.
  • Return a tuple containing the respective attribute values from all ancestors.
  • Do some name mangling to avoid conflicts.

I'd strongly prefer the first or second solution. The third option is cumbersome to use in my opinion; the propagated attribute would need different treatment from the original attribute. The fourth option is messy altogether. Anyone needing non-trivial treatment of attributes can do so by subclassing Pipeline.

Any thoughts?

@nkeim
Copy link
Contributor

nkeim commented Oct 10, 2017 via email

@lschr
Copy link
Contributor Author

lschr commented Oct 17, 2017

#30 is an initial implementation. I suggest moving the discussion over there.

@lschr
Copy link
Contributor Author

lschr commented Jan 3, 2018

The implementation in #30 is finished now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants