Introducing effGen — The Future of SLM Agents

Build Powerful AI Agents
with Small Language Models

Optimized for SLMs. 5-10x faster inference with complexity routing, automatic task decomposition, multi-agent orchestration, and vLLM.

10x
Faster
14
Built-in Tools
5
Presets
3
Protocols
100%
Open Source
quick_start.py
from effgen import Agent, load_model
from effgen.tools.builtin import Calculator, PythonREPL

# Load a small but mighty model
model = load_model("Qwen/Qwen2.5-1.5B-Instruct", quantization="4bit")

# Create agent with tools
agent = Agent(name="math_agent", model=model, tools=[Calculator(), PythonREPL()])

result = agent.run("What is 24344 * 334?")
print(f"Answer: {result.output}")  # 8,130,896
OUTPUT
Answer: 8,130,896
Features

Everything You Need to Build
Production-Ready AI Agents

Optimized for Small Language Models with enterprise-grade features

Intelligent Task Decomposition

Automatically breaks down complex tasks with multi-dimensional complexity analysis and spawns specialized sub-agents.

  • Automatic complexity scoring
  • Sub-agent routing
  • Parallel execution
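
The scoring-and-routing idea above can be sketched in plain Python. This is a conceptual toy, not effGen's actual decomposition API: the `complexity_score` heuristic and `route` helper are hypothetical names invented for illustration.

```python
import re

def complexity_score(task: str) -> float:
    """Rough multi-dimensional score: number of sub-steps plus task length."""
    steps = len(re.split(r",\s*then\s+|\band\b|;", task))  # crude sub-step count
    length = len(task.split())
    return steps + length / 25

def route(task: str, threshold: float = 2.0) -> str:
    """Send simple tasks to one agent; decompose complex ones for sub-agents."""
    return "decompose" if complexity_score(task) >= threshold else "single_agent"

print(route("What is 2 + 2?"))                                              # single_agent
print(route("Research five vendors, then compare pricing and draft a report"))  # decompose
```

In effGen the score feeds sub-agent routing and parallel execution; here it just picks a label, but the shape of the decision is the same.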

Universal Tool Integration

14 built-in tools with full MCP, A2A, and ACP protocol support. Create custom tools in minutes.

  • 14 tools (search, code, JSON, weather...)
  • Plugin system for custom tools
  • MCP/A2A/ACP protocols
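
A custom tool in the plugin system boils down to a name, a description the model can read, and a `run` method. The sketch below is hypothetical — effGen's real `Tool` base class and registration hook may differ — but it shows the minutes-not-hours shape of the interface.

```python
from dataclasses import dataclass

@dataclass
class WordCountTool:
    """Illustrative custom tool; effGen's actual base class may differ."""
    name: str = "word_count"
    description: str = "Count the words in a piece of text."

    def run(self, text: str) -> str:
        return f"{len(text.split())} words"

tool = WordCountTool()
print(tool.run("small models, big results"))  # 4 words
```

Once defined, a tool like this would be passed to an agent the same way the built-ins are, e.g. `Agent(..., tools=[WordCountTool()])`.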

SLM-Optimized Prompts

Advanced prompt engineering designed for smaller models with Jinja2 templates and few-shot learning.

  • Template management
  • Context compression
  • Chain orchestration
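
The few-shot template pattern can be sketched with Jinja2 directly. The template text below is illustrative only — it is not one of effGen's shipped prompts.

```python
from jinja2 import Template

# Illustrative few-shot prompt template (not effGen's shipped prompt).
FEW_SHOT = Template(
    "You are a concise assistant.\n"
    "{% for ex in examples %}"
    "Q: {{ ex.q }}\nA: {{ ex.a }}\n"
    "{% endfor %}"
    "Q: {{ question }}\nA:"
)

prompt = FEW_SHOT.render(
    examples=[{"q": "2 + 2?", "a": "4"}, {"q": "3 * 3?", "a": "9"}],
    question="5 * 6?",
)
print(prompt)
```

Keeping examples in template variables rather than hard-coded strings is what makes context compression possible: a small model's limited window can be filled with only the most relevant examples at render time.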

Multi-Agent Orchestration

Coordinate multiple specialized agents with lifecycle management and agent-to-agent communication.

  • Task routing
  • Shared memory
  • A2A protocol
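
Task routing with shared memory can be illustrated with a few lines of plain Python. The `Orchestrator` and `EchoAgent` classes here are hypothetical stand-ins, not effGen's orchestration API.

```python
class EchoAgent:
    """Toy agent that records its work in a shared scratchpad."""
    def __init__(self, name):
        self.name = name

    def run(self, task, memory):
        memory.append((self.name, task))  # shared memory across agents
        return f"[{self.name}] handled: {task}"

class Orchestrator:
    """Route each task to the first agent whose keyword matches."""
    def __init__(self, agents):
        self.agents = agents   # keyword -> agent
        self.memory = []       # shared by every agent

    def dispatch(self, task):
        for keyword, agent in self.agents.items():
            if keyword in task.lower():
                return agent.run(task, self.memory)
        return self.agents["general"].run(task, self.memory)

orch = Orchestrator({
    "calculate": EchoAgent("math_agent"),
    "search": EchoAgent("research_agent"),
    "general": EchoAgent("general_agent"),
})
print(orch.dispatch("Calculate the median"))  # routed to math_agent
print(orch.dispatch("Summarize this memo"))   # falls back to general_agent
```

In effGen the routing decision would come from complexity analysis rather than keyword matching, and agent-to-agent messages would travel over the A2A protocol instead of a shared list, but the lifecycle is the same: route, run, record.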

Ultra-Fast vLLM Integration

Native vLLM support delivers 5-10x faster inference, with automatic multi-GPU tensor parallelism and PagedAttention.

  • 5-10x faster inference
  • 60% memory reduction
  • Auto multi-GPU support

Production Infrastructure

Docker sandboxed execution, comprehensive logging, state persistence, and enterprise security.

  • CI/CD pipelines
  • OpenTelemetry tracing
  • Prometheus metrics

Real Token Streaming

True token-by-token streaming via generate_stream() with callbacks for thoughts, tool calls, and answers.

  • Token-by-token output
  • SSE API endpoint
  • Streaming callbacks
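
`generate_stream()` is effGen's API; the toy generator below only imitates its callback-driven shape with a canned token list, so the function body is purely illustrative.

```python
def generate_stream(prompt, on_token=None):
    """Yield tokens one at a time, firing a callback for each (toy version)."""
    for token in ["The", " answer", " is", " 42", "."]:  # stand-in for model output
        if on_token:
            on_token(token)
        yield token

pieces = []
answer = "".join(generate_stream("question", on_token=pieces.append))
print(answer)  # The answer is 42.
```

The same callback slot is where effGen hooks handlers for thoughts, tool calls, and final answers, and where an SSE endpoint would forward each token to the browser.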

Agent Presets

One-line agent creation with ready-to-use configurations. Math, research, coding, general, and minimal.

  • create_agent() factory
  • 5 built-in presets
  • CLI --preset flag

Integrated Memory System

Short-term, long-term, and vector memory connected to every agent. Persistent multi-turn context.

  • ShortTerm + LongTerm memory
  • Vector store (FAISS/Chroma)
  • Auto-summarization
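
The vector-memory idea can be shown with a toy bag-of-words store and cosine similarity. effGen's real vector memory backs onto FAISS or Chroma with learned embeddings; the `embed` function here is deliberately naive.

```python
import math
from collections import Counter

def embed(text):
    """Naive bag-of-words 'embedding' (real systems use learned vectors)."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

class VectorMemory:
    def __init__(self):
        self.items = []

    def add(self, text):
        self.items.append((text, embed(text)))

    def recall(self, query, k=1):
        vec = embed(query)
        ranked = sorted(self.items, key=lambda it: cosine(vec, it[1]), reverse=True)
        return [text for text, _ in ranked[:k]]

mem = VectorMemory()
mem.add("user prefers metric units")
mem.add("project deadline is friday")
print(mem.recall("what units does the user prefer?"))  # ['user prefers metric units']
```

Short-term memory is just the recent turn buffer; the vector store is what lets an agent pull a weeks-old fact back into a small model's limited context window.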
Built-in Tools

14 Tools Ready to Use

From web search to code execution — everything your agent needs, built in

🧮

Calculator

Math, conversions, statistics

🔍

WebSearch

DuckDuckGo search

CodeExecutor

Sandboxed code execution

🐍

PythonREPL

Interactive Python

📁

FileOperations

File read/write/search

📖

Retrieval

RAG + BM25 hybrid search

🔎

AgenticSearch

ripgrep-based search

💻

BashTool

Shell commands

🌤️

WeatherTool

Open-Meteo API (free)

📋

JSONTool

Parse, query, validate JSON

🕐

DateTimeTool

Timezones, date arithmetic

📝

TextProcessing

Regex, word count, diff

🌐

URLFetchTool

Web page text extraction

📚

WikipediaTool

Wikipedia article search

Agent Presets

One-Line Agent Creation with 5 Presets

Ready-to-use configurations optimized for common use cases

🧮

math

Mathematical computations

Calculator, PythonREPL
create_agent("math", model)
🔬

research

Web research & information

WebSearch, URLFetch, Wikipedia
create_agent("research", model)
💻

coding

Code execution & development

CodeExecutor, PythonREPL, FileOps, Bash
create_agent("coding", model)
🚀

general

All 11 tools for any task

All 11 tools included
create_agent("general", model)

minimal

Direct inference, no tools

No tools — pure LLM
create_agent("minimal", model)
Quick Start

Up and Running in 60 Seconds

Three simple steps to your first AI agent

01

Install

Get started with pip. Includes vLLM for blazing-fast inference.

bash
pip install effgen[vllm]
02

Create Agent

One-line agent creation with built-in presets and all 11 tools.

python
from effgen import load_model
from effgen.presets import create_agent

model = load_model("Qwen/Qwen2.5-3B-Instruct", quantization="4bit")
agent = create_agent("general", model)  # All 11 tools included
03

Execute Tasks

Run tasks with real-time token streaming and tool execution.

python
result = agent.run(
    "Analyze the latest tech trends and "
    "create a comprehensive report"
)
Start Building Today — It's Free

Ready to Build the
Future of AI?

Join thousands of developers building next-gen agents with effGen. Open source, production-ready, and blazing fast.
