docs: update pipeline
iluwatar committed May 16, 2024
1 parent 5171682 commit f485c3d
Showing 1 changed file with 80 additions and 47 deletions.
127 changes: 80 additions & 47 deletions pipeline/README.md
@@ -3,60 +3,72 @@ title: Pipeline
category: Behavioral
language: en
tag:
- Decoupling
- API design
- Data processing
- Decoupling
- Extensibility
- Functional decomposition
- Scalability
---

## Also known as

* Chain of Operations
* Processing Pipeline

## Intent

Allows processing of data in a series of stages by taking an initial input and passing each
stage's processed output on to the next stage.
The Pipeline design pattern is intended to allow data processing in discrete stages, where each stage is represented by a different component and the output of one stage serves as the input for the next.

## Explanation

The Pipeline pattern uses ordered stages to process a sequence of input values. Each implemented task is represented by a stage of the pipeline. You can think of pipelines as similar to assembly lines in a factory, where each item in the assembly line is constructed in stages. The partially assembled item is passed from one assembly stage to another. The outputs of the assembly line occur in the same order as that of the inputs.

Real world example

> Suppose we wanted to pass a string through a series of filtering stages and convert it to a
> char array in the last stage.
> A real-world analogy for the Pipeline design pattern is an **assembly line in a car manufacturing plant**.
>
> In this analogy, the car manufacturing process is divided into several discrete stages, each stage handling a specific part of the car assembly. For example:
>
> 1. **Chassis Assembly:** The base frame of the car is assembled.
> 2. **Engine Installation:** The engine is installed onto the chassis.
> 3. **Painting:** The car is painted.
> 4. **Interior Assembly:** The interior, including seats and dashboard, is installed.
> 5. **Quality Control:** The finished car is inspected for defects.
>
> Each stage operates independently and sequentially, where the output of one stage (e.g., a partially assembled car) becomes the input for the next stage. This modular approach allows for easy maintenance, scalability (e.g., adding more workers to a stage), and flexibility (e.g., replacing a stage with a more advanced version). Just like in a software pipeline, changes in one stage do not affect the others, facilitating continuous improvements and efficient production.
In plain words

> Pipeline pattern is an assembly line where partial results are passed from one stage to another.
Wikipedia says

> In software engineering, a pipeline consists of a chain of processing elements (processes, threads, coroutines, functions, etc.), arranged so that the output of each element is the input of the next; the name is by analogy to a physical pipeline.
**Programmatic Example**

Let's create a string processing pipeline example. The stages of our pipeline are called `Handler`s.

```java
interface Handler<I, O> {
  O process(I input);
}
```

In our string processing example we have 3 different concrete `Handler`s.

```java
class RemoveAlphabetsHandler implements Handler<String, String> {
  // ...
}

class RemoveDigitsHandler implements Handler<String, String> {
  // ...
}

class ConvertToCharArrayHandler implements Handler<String, char[]> {
  // ...
}
```

@@ -65,56 +77,77 @@ Here is the `Pipeline` that will gather and execute the handlers one by one.
```java
class Pipeline<I, O> {

  // The composition of every handler added so far.
  private final Handler<I, O> currentHandler;

  Pipeline(Handler<I, O> currentHandler) {
    this.currentHandler = currentHandler;
  }

  // Returns a new pipeline that feeds this pipeline's output into newHandler.
  <K> Pipeline<I, K> addHandler(Handler<O, K> newHandler) {
    return new Pipeline<>(input -> newHandler.process(currentHandler.process(input)));
  }

  // Runs the input through all stages in the order they were added.
  O execute(I input) {
    return currentHandler.process(input);
  }
}
```

And here's the `Pipeline` in action processing the string.

```java
var filters = new Pipeline<>(new RemoveAlphabetsHandler())
    .addHandler(new RemoveDigitsHandler())
    .addHandler(new ConvertToCharArrayHandler());
filters.execute("GoYankees123!");
```
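
The handler bodies are elided in the listings above, so the following is only a minimal, self-contained sketch of how the pieces could fit together. The three `process` implementations and the `PipelineDemo` class are assumptions made for illustration (they strip ASCII letters and digits with regular expressions) and are not necessarily how the project implements them.

```java
import java.util.Arrays;

// A pipeline stage: maps an input of type I to an output of type O.
interface Handler<I, O> {
  O process(I input);
}

// Hypothetical body: removes ASCII letters only.
class RemoveAlphabetsHandler implements Handler<String, String> {
  @Override
  public String process(String input) {
    return input.replaceAll("[A-Za-z]", "");
  }
}

// Hypothetical body: removes ASCII digits only.
class RemoveDigitsHandler implements Handler<String, String> {
  @Override
  public String process(String input) {
    return input.replaceAll("[0-9]", "");
  }
}

// Hypothetical body: converts the remaining text to a char array.
class ConvertToCharArrayHandler implements Handler<String, char[]> {
  @Override
  public char[] process(String input) {
    return input.toCharArray();
  }
}

// Same shape as the Pipeline class shown above.
class Pipeline<I, O> {
  private final Handler<I, O> currentHandler;

  Pipeline(Handler<I, O> currentHandler) {
    this.currentHandler = currentHandler;
  }

  <K> Pipeline<I, K> addHandler(Handler<O, K> newHandler) {
    return new Pipeline<>(input -> newHandler.process(currentHandler.process(input)));
  }

  O execute(I input) {
    return currentHandler.process(input);
  }
}

// Hypothetical driver class, not part of the original example.
class PipelineDemo {
  static char[] run(String input) {
    Pipeline<String, char[]> filters = new Pipeline<>(new RemoveAlphabetsHandler())
        .addHandler(new RemoveDigitsHandler())
        .addHandler(new ConvertToCharArrayHandler());
    return filters.execute(input);
  }

  public static void main(String[] args) {
    // Letters are removed first ("123!"), then digits ("!").
    System.out.println(Arrays.toString(run("GoYankees123!"))); // prints [!]
  }
}
```

Note that the type parameters change along the chain: the pipeline starts as `Pipeline<String, String>` and becomes `Pipeline<String, char[]>` once the last handler is added, so a mismatched stage is rejected at compile time.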

## Class diagram

![alt text](./etc/pipeline.urm.png "Pipeline pattern class diagram")
![Pipeline](./etc/pipeline.urm.png "Pipeline pattern class diagram")

## Applicability

Use the Pipeline pattern when you want to

* Execute individual stages that yield a final value.
* Add readability to a complex sequence of operations by providing a fluent builder as an interface.
* Improve testability of code since stages will most likely be doing a single thing, complying with
the [Single Responsibility Principle (SRP)](https://java-design-patterns.com/principles/#single-responsibility-principle)
* Process data in a sequence of stages.
* Keep each processing stage independent so that it can be easily replaced or reordered.
* Improve the scalability and maintainability of data processing code.
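
To illustrate what replacing or reordering independent stages looks like, here is a small sketch. It uses the JDK's `java.util.function.Function` and its `andThen` method as a stand-in for `Handler` and `addHandler`, which is a substitution made to keep the example short. The two stages and the `ReorderStagesSketch` class are hypothetical.

```java
import java.util.function.Function;

// Hypothetical class illustrating stage reordering with Function.andThen.
class ReorderStagesSketch {

  // Stage that removes ASCII digits.
  static final Function<String, String> REMOVE_DIGITS = s -> s.replaceAll("[0-9]", "");

  // Stage that keeps at most the first three characters.
  static final Function<String, String> KEEP_FIRST_THREE =
      s -> s.substring(0, Math.min(3, s.length()));

  // Pipeline A: remove digits, then truncate.
  static String digitsThenTruncate(String input) {
    return REMOVE_DIGITS.andThen(KEEP_FIRST_THREE).apply(input);
  }

  // Pipeline B: the same stages in the opposite order.
  static String truncateThenDigits(String input) {
    return KEEP_FIRST_THREE.andThen(REMOVE_DIGITS).apply(input);
  }

  public static void main(String[] args) {
    System.out.println(digitsThenTruncate("a1b2c3")); // prints abc
    System.out.println(truncateThenDigits("a1b2c3")); // prints ab
  }
}
```

Neither stage knows about the other, so reordering only changes how they are wired together. The differing results also show that stage order is part of the pipeline's behavior and is worth covering in tests.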

## Tutorials

* [The Pipeline Pattern — for fun and profit](https://medium.com/@aaronweatherall/the-pipeline-pattern-for-fun-and-profit-9b5f43a98130)
* [The Pipeline design pattern (in Java)](https://medium.com/@deepakbapat/the-pipeline-design-pattern-in-java-831d9ce2fe21)

## Known Uses

* Data transformation and ETL (Extract, Transform, Load) processes.
* Compilers for processing source code through various stages such as lexical analysis, syntax analysis, semantic analysis, and code generation.
* Image processing applications where multiple filters are applied sequentially.
* Logging frameworks where messages pass through multiple handlers for formatting, filtering, and output.

## Consequences

Benefits:

## Known uses
* Decoupling: Each stage of the pipeline is a separate component, making the system more modular and easier to maintain.
* Reusability: Individual stages can be reused in different pipelines.
* Extensibility: New stages can be added without modifying existing ones.
* Scalability: Pipelines can be parallelized by running different stages on different processors or threads.
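
The scalability point can be sketched with `CompletableFuture`, giving each stage its own single-threaded executor so that one stage can work on the next item while the following stage is still busy with the previous one. This is one possible approach under stated assumptions, not something the example project does; the class name, the two string stages, and the executor setup are hypothetical.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.stream.Collectors;

// Hypothetical sketch: each stage runs on its own thread.
class ParallelPipelineSketch {

  static List<String> processAll(List<String> inputs) {
    ExecutorService stageOne = Executors.newSingleThreadExecutor();
    ExecutorService stageTwo = Executors.newSingleThreadExecutor();
    try {
      // Submit every item; stage two of an item starts once its stage one completes.
      List<CompletableFuture<String>> futures = inputs.stream()
          .map(item -> CompletableFuture
              .supplyAsync(() -> item.replaceAll("[A-Za-z]", ""), stageOne)
              .thenApplyAsync(s -> s.replaceAll("[0-9]", ""), stageTwo))
          .collect(Collectors.toList());

      // Joining in submission order keeps outputs in the same order as inputs.
      return futures.stream().map(CompletableFuture::join).collect(Collectors.toList());
    } finally {
      stageOne.shutdown();
      stageTwo.shutdown();
    }
  }

  public static void main(String[] args) {
    System.out.println(processAll(List.of("Go1!", "ab2?"))); // prints [!, ?]
  }
}
```

Handing work between threads has a cost of its own, so for stages as cheap as these the sequential pipeline is likely faster; the sketch only shows the wiring.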

* [java.util.stream](https://docs.oracle.com/javase/8/docs/api/java/util/stream/package-summary.html)
* [Maven Build Lifecycle](http://maven.apache.org/guides/introduction/introduction-to-the-lifecycle.html)
* [Functional Java](https://github.com/functionaljava/functionaljava)
Trade-offs:

## Related patterns
* Complexity: Managing the flow of data through multiple stages can introduce complexity.
* Performance Overhead: Each stage can add overhead, for example from passing data between stages and, when stages run on separate threads, from context switching.
* Debugging Difficulty: Debugging pipelines can be more challenging since the data flows through multiple components.

* [Chain of Responsibility](https://java-design-patterns.com/patterns/chain-of-responsibility/)
## Related Patterns

* [Chain of Responsibility](https://java-design-patterns.com/patterns/chain-of-responsibility/): Both patterns involve passing data through a series of handlers, but in Chain of Responsibility, handlers can decide not to pass the data further.
* [Decorator](https://java-design-patterns.com/patterns/decorator/): Both patterns involve adding behavior dynamically, but Decorator wraps additional behavior around objects, whereas Pipeline processes data in discrete steps.
* [Composite](https://java-design-patterns.com/patterns/composite/): Both patterns compose smaller units into a larger whole, but Composite models part-whole tree hierarchies, whereas Pipeline arranges its stages in a linear sequence.

## Credits

* [Design Patterns: Elements of Reusable Object-Oriented Software](https://amzn.to/3w0pvKI)
* [Java Design Patterns: A Hands-On Experience with Real-World Examples](https://amzn.to/3yhh525)
* [Patterns of Enterprise Application Architecture](https://amzn.to/3WfKBPR)
* [Pipelines | Microsoft Docs](https://docs.microsoft.com/en-us/previous-versions/msp-n-p/ff963548(v=pandp.10))
