The Pipeline Framework You've Been Looking For!

Simple solutions for common data pipeline challenges

Stream Processing

  • Connect to Kafka topics instantly
  • Process message queues effortlessly
  • Handle real-time data feeds

Simple Configuration

  • Define pipelines in YAML
  • No streaming code required
  • Built-in error handling

Built-in Integrations

  • Connect to databases directly
  • Process API data streams
  • Write to vector stores

Common Use Cases

Real solutions for real streaming challenges

Stream to AI Vector Store

Transform streams into embeddings for real-time AI search and recommendations

Event Stream Processing

Process Kafka topics and message queues with simple YAML configuration

Real-time Data Feeds

Handle IoT sensors, logs, and live data streams without complex code

API Integration

Connect and transform API data streams automatically

Simple by Design

Transform any data source to any target with simple YAML configurations. DataYoga Transform handles the rest.

[Diagram: DataYoga Transform pipeline flow]
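
Every pipeline follows the same three-part shape. As an illustrative sketch (the block types and field names below are invented for illustration, mirroring the examples further down this page):

source:
  type: postgres          # illustrative source: read rows from a database table
  table: users
transform:
  type: map               # illustrative transform: reshape each record
  fields:
    user_id: id
target:
  type: rest-api          # illustrative target: push records to an HTTP endpoint
  endpoint: /users

Swapping the source or target block redirects the flow without touching the rest of the definition.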

Get Started in Minutes

Run your first data pipeline with our built-in example

1. Install DataYoga

Using the pip package manager, run 'pip install datayoga' to install the framework.

2. Initialize Project

Run 'datayoga init hello_world' to create a new sample project with examples.

3. Run Sample Pipeline

See it in action: execute 'datayoga run sample.hello' to transform and display sample user data.
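
For a sense of what a pipeline definition looks like, here is a hypothetical hello-world style job written in the illustrative schema used by the examples on this page (this is a sketch, not the actual contents of sample.hello):

source:
  type: file              # hypothetical: read the bundled sample user records
  path: users.csv         # hypothetical file name
transform:
  type: map               # hypothetical transform: build a display name per record
  fields:
    full_name: "first_name || ' ' || last_name"   # hypothetical expression syntax
target:
  type: stdout            # print the transformed records to the console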

Built for Real-World Applications

Processing Features

  • Back-pressure handling
  • Automatic retries
  • Stream checkpointing
  • Rate limiting (see the configuration sketch after these lists)

Integration Support

  • Apache Kafka
  • RabbitMQ
  • AWS SQS
  • REST APIs
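
As a sketch of how these capabilities might surface in a pipeline definition (the option names below are illustrative assumptions, not DataYoga Transform's actual configuration keys):

source:
  type: kafka
  topic: events
  checkpoint: true        # illustrative: resume from the last committed offset after a restart
processing:
  retries: 3              # illustrative: retry a failed record before giving up
  rate_limit: 1000        # illustrative: cap throughput at 1000 records per second
target:
  type: rest-api
  endpoint: /ingest       # back-pressure: consumption slows when this target lags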

Pipeline Examples

Flexible integrations across diverse sources and targets

Kafka to Vector DB

source:
  type: kafka
  topic: user-content     # consume messages from this topic
transform:
  type: embedding
  model: openai           # generate a vector embedding for each message
target:
  type: vectordb
  store: pinecone  # or milvus/weaviate/etc
  index: real-time-content

API to Queue

source:
  type: rest-api
  endpoint: /events       # read incoming events from this endpoint
target:
  type: rabbitmq
  queue: events           # illustrative queue name

Log Processing

source:
  type: file
  pattern: "*.log"        # match all log files
target:
  type: elasticsearch
  index: logs             # illustrative index name

Frequently Asked Questions

Common questions about DataYoga Transform

How is DataYoga Transform different from traditional ETL tools?

DataYoga Transform focuses on simplicity and flexibility. Instead of complex workflows or proprietary interfaces, you define pipelines in simple YAML files. This means faster development, easier maintenance, and no vendor lock-in.

Can I use DataYoga Transform alongside my existing data tools?

Yes! DataYoga Transform is designed to complement your existing stack. Use it for specific pipelines while keeping your current tools, or gradually migrate processes as needed.

Can I use DataYoga Transform for AI/ML pipelines?

Absolutely! DataYoga Transform makes it easy to build AI-enabled data pipelines. You can transform data streams into embeddings, connect to vector databases, and power real-time AI applications, all using the same simple YAML configuration you use for traditional pipelines.

Do I need to be a Python expert to use DataYoga Transform?

Not at all. While DataYoga is built in Python, you define pipelines using YAML configuration files. No Python coding required for standard pipelines.

How scalable is DataYoga Transform?

DataYoga Transform handles everything from simple one-off pipelines to production streaming workloads. Built-in features like back-pressure handling and checkpointing ensure reliable processing at scale.

Can I extend DataYoga Transform's functionality?

Yes! While the built-in blocks cover most needs, you can easily create custom blocks for specific requirements. The pluggable architecture makes extending functionality straightforward.
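
As a hypothetical sketch of how a custom block might be referenced once registered (the block name and options below are invented; consult the documentation for the actual extension API):

source:
  type: kafka
  topic: user-content
transform:
  type: my_company.pii_scrubber   # hypothetical custom block supplied by your project
  fields:
    - email
    - phone
target:
  type: vectordb
  store: pinecone

In this sketch, the custom block slots into the same source/transform/target structure as the built-in blocks.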

Ready to Build Your Data & AI Pipelines?

Start building flexible pipelines in minutes with DataYoga Transform