Frequently Asked Questions

General

What is NPipeline?

NPipeline is a composable data orchestration library for .NET. It processes data through connected nodes (sources, transforms, sinks) with built-in support for parallelism, resilience, observability, and extensibility.

What .NET versions are supported?

NPipeline targets .NET 8.0 and later. Use the latest LTS release for production.

Is NPipeline free?

The core engine and base abstractions are free and open source under the MIT License. Concrete connectors, storage providers, and extensions are licensed under the Business Source License 1.1 (BSL 1.1):

Non-production use (dev, CI/CD, testing): always free.
Production use: free for organizations with four or fewer developers and annual revenue of AUD $5M or less.
Commercial license required for larger organizations - see npipeline.com.
Every BSL-licensed package automatically converts to MIT two years after its release.

How does NPipeline compare to other tools?

Feature	NPipeline	Apache Airflow	Azure Data Factory
Programming	C# code-first	Python DAGs	Visual designer
Infrastructure	In-process	Distributed	Managed cloud
Type safety	Strong	Dynamic	Limited
Unit testing	Simple	Requires framework	Complex
Cost	Free core (MIT); BSL connectors/extensions	Free (self-hosted)	Per-operation

Choose NPipeline when you build .NET applications, need lightweight in-process pipelines, and want strong type safety with code-based configuration.

Deployment

Can I use NPipeline in Azure Functions?

Yes, but account for startup time and memory limits. NPipeline works well in Azure Functions for moderate data volumes. Cold starts may be slow for large pipelines, so pre-warm functions in production.

Can I use NPipeline in AWS Lambda?

Yes, for small, fast pipelines. Lambda caps each invocation at 15 minutes, so it is a poor fit for long-running operations. Consider Step Functions to orchestrate larger workloads.

Can I use NPipeline in Kubernetes?

Yes, Kubernetes is a good fit. Deploy pipelines as containerized console apps that use the Worker pattern:

csharp

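using System.Reflection;
using Microsoft.Extensions.Hosting;
// Also import the NPipeline namespace that provides AddNPipeline.

var host = Host.CreateDefaultBuilder()
    .ConfigureServices((ctx, services) =>
    {
        // Register NPipeline services for this assembly.
        services.AddNPipeline(Assembly.GetExecutingAssembly());
    })
    .Build();

// Runs until the host is asked to shut down (for example, SIGTERM from Kubernetes).
await host.RunAsync();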

Horizontal scaling works naturally with multiple pod instances processing independent data partitions.
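
How the data is split into independent partitions is up to the application. The following is a minimal sketch of one option, hash-based ownership, in which every pod applies the same rule to decide whether a key belongs to it. `PartitionAssignment` is a hypothetical helper, not an NPipeline type, and how each pod learns its index and the total pod count (for example from environment variables) is an assumption left to the deployment.

```csharp
using System;
using System.Text;

// Hypothetical helper (not part of NPipeline): decides which pod owns a key.
public static class PartitionAssignment
{
    // 32-bit FNV-1a over the key's UTF-8 bytes. string.GetHashCode() is
    // randomized per process in modern .NET, so pods could not agree on it.
    public static uint StableHash(string key)
    {
        const uint offsetBasis = 2166136261;
        const uint prime = 16777619;
        uint hash = offsetBasis;
        unchecked // FNV-1a relies on wrap-around multiplication
        {
            foreach (byte b in Encoding.UTF8.GetBytes(key))
            {
                hash ^= b;
                hash *= prime;
            }
        }
        return hash;
    }

    // True when the pod with the given 0-based index owns the key.
    public static bool IsOwnedBy(string key, int podIndex, int podCount)
    {
        if (podCount <= 0)
            throw new ArgumentOutOfRangeException(nameof(podCount));
        if (podIndex < 0 || podIndex >= podCount)
            throw new ArgumentOutOfRangeException(nameof(podIndex));
        return StableHash(key) % (uint)podCount == (uint)podIndex;
    }
}
```

A source could call `PartitionAssignment.IsOwnedBy(key, podIndex, podCount)` and skip items owned by another pod. Note that changing the pod count reassigns most keys, so this simple scheme suits workloads where that is acceptable.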

Pipeline Design

How many nodes should a pipeline have?

Sweet spot: 3–10 nodes for most pipelines
Complex: 10–50 nodes are manageable with good organization
50+: Break into multiple pipelines using composition

Can I have multiple sources or sinks?

Yes. Connect multiple sources to downstream transforms, and fan out to multiple sinks:

csharp

var source1 = builder.AddSource<DbSource, Order>("db");
var source2 = builder.AddSource<ApiSource, Order>("api");
var transform = builder.AddTransform<Enrich, Order, Order>("enrich");
var sink1 = builder.AddSink<FileSink, Order>("file");
var sink2 = builder.AddSink<DbSink, Order>("db-out");

// Fan-in: both sources feed the same transform.
builder.Connect(source1, transform);
builder.Connect(source2, transform);

// Fan-out: the transform's output goes to both sinks.
builder.Connect(transform, sink1);
builder.Connect(transform, sink2);

Should transforms be stateless?

Stateless is preferred - easier to test, thread-safe, and compatible with parallel execution. Stateful transforms (running totals, caches) are allowed but require synchronization if used with parallel execution. See Thread Safety.
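
To illustrate the synchronization requirement, here is a minimal sketch of state that stays correct when several threads update it at once. `RunningTotal` is a hypothetical helper, not an NPipeline type; it only shows the `Interlocked` pattern that a stateful transform could use internally.

```csharp
using System.Threading;

// Hypothetical helper (not part of NPipeline): a running total that is safe
// to update from several threads at once.
public sealed class RunningTotal
{
    private long _total;

    // Interlocked.Add performs the read-modify-write atomically, so
    // concurrent callers never lose an update. Returns the new total.
    public long Add(long amount) => Interlocked.Add(ref _total, amount);

    // Interlocked.Read gives an atomic 64-bit read, including on 32-bit platforms.
    public long Current => Interlocked.Read(ref _total);
}
```

State that cannot be updated with a single atomic operation, such as a cache, usually needs a `lock` or a concurrent collection like `ConcurrentDictionary<TKey, TValue>` instead.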

When should I use composition vs a flat pipeline?

Use composition when:

You have reusable sub-workflows (validation, enrichment) shared across pipelines
A section of your pipeline is complex enough to warrant independent testing
You want clear separation of concerns

Use a flat pipeline when:

The workflow is simple and linear
Performance overhead of sub-pipeline context creation matters (high-throughput, CPU-bound)

Performance

How do I make my pipeline faster?

Profile first - enable observability to find the bottleneck
Use parallel execution for CPU-bound transforms - see Parallel Execution
Stream data - use DataStream<T> instead of materializing entire datasets
Override ExecuteValueTaskAsync - avoid Task allocations for synchronous transforms (see Synchronous Fast Paths)
Batch I/O operations - use batching for database writes and API calls
Avoid LINQ in hot paths - analyzer NP9103 catches this
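
The streaming and batching tips above can be combined. The sketch below groups a streamed sequence into fixed-size batches so that each database write or API call handles many items, without loading the whole dataset into memory. It uses plain `IAsyncEnumerable<T>` rather than `DataStream<T>`, whose API is not shown here, and `BatchAsync` is a hypothetical helper name, not an NPipeline method.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.CompilerServices;
using System.Threading;

public static class Batching
{
    // Groups a streamed sequence into lists of up to batchSize items.
    // Only one batch is buffered at a time; the final batch may be smaller.
    // Because this is an iterator, validation runs when enumeration starts.
    public static async IAsyncEnumerable<IReadOnlyList<T>> BatchAsync<T>(
        this IAsyncEnumerable<T> source,
        int batchSize,
        [EnumeratorCancellation] CancellationToken cancellationToken = default)
    {
        if (batchSize <= 0)
            throw new ArgumentOutOfRangeException(nameof(batchSize));

        var batch = new List<T>(batchSize);
        await foreach (var item in source.WithCancellation(cancellationToken))
        {
            batch.Add(item);
            if (batch.Count == batchSize)
            {
                yield return batch;
                // Start a fresh list; the consumer may still hold the previous one.
                batch = new List<T>(batchSize);
            }
        }

        // Flush the trailing partial batch, if any.
        if (batch.Count > 0)
            yield return batch;
    }
}
```

A sink could then iterate with `await foreach (var batch in items.BatchAsync(500))` and issue one bulk write per batch. The right batch size depends on the target system, so measure before settling on a value.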

How many pipelines can run concurrently?

It depends on resource usage per pipeline. Lightweight in-memory pipelines can run dozens to hundreds concurrently. I/O-heavy pipelines are typically limited by connection pool sizes and external service capacity.
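
When connection pools or external services are the limit, it can help to cap the number of concurrent runs explicitly. The following is a minimal sketch using `SemaphoreSlim`. `RunLimiter` is a hypothetical helper, not an NPipeline type, and the delegate stands in for whatever starts a pipeline run.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper (not part of NPipeline): caps how many pipeline runs
// execute at the same time.
public sealed class RunLimiter : IDisposable
{
    private readonly SemaphoreSlim _slots;

    public RunLimiter(int maxConcurrentRuns)
    {
        if (maxConcurrentRuns <= 0)
            throw new ArgumentOutOfRangeException(nameof(maxConcurrentRuns));
        _slots = new SemaphoreSlim(maxConcurrentRuns, maxConcurrentRuns);
    }

    // Waits for a free slot, runs the delegate, and always releases the slot,
    // even if the run throws or is cancelled partway through.
    public async Task RunAsync(
        Func<CancellationToken, Task> run,
        CancellationToken cancellationToken = default)
    {
        await _slots.WaitAsync(cancellationToken).ConfigureAwait(false);
        try
        {
            await run(cancellationToken).ConfigureAwait(false);
        }
        finally
        {
            _slots.Release();
        }
    }

    public void Dispose() => _slots.Dispose();
}
```

Share one `RunLimiter` across all callers and size it to the scarcest resource, for example the database connection pool, rather than to the CPU count.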

What if I only need to process data once?

Use a console app - no background service or long-lived host needed:

csharp

using System.Reflection;
using Microsoft.Extensions.DependencyInjection;

var services = new ServiceCollection();
services.AddNPipeline(Assembly.GetExecutingAssembly());
await using var provider = services.BuildServiceProvider();
var runner = provider.GetRequiredService<IPipelineRunner>();
await runner.RunAsync<MyPipeline>(new PipelineContext());

Next Steps

Your First Pipeline - hands-on tutorial
Key Concepts - core model
Samples - runnable examples
