Track your platform usage in real time and understand the costs associated with your agents, gateways, and capabilities.
Usage Dashboard
Accessing Usage
- Go to Account > Usage
- View metrics for:
  - Tokens consumed (input + output)
  - Cost per model
  - Request counts
  - Error rates
- Select a time period (Today, Week, Month, Custom)
Key Metrics
| Metric | Description |
|---|---|
| Tokens | Total input + output tokens across all models |
| Cost | Calculated based on model pricing and region |
| Requests | Number of API calls and tool executions |
| Errors | Failed requests and their error types |
| Latency | Average response time (p50, p95, p99) |
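To make the p50, p95, and p99 latency figures concrete, here is a minimal sketch that computes them from a list of response times using the nearest-rank method. The sample latencies are made up, the `percentile` helper is hypothetical, and the dashboard may calculate percentiles differently.

```python
import math
from typing import Sequence


def percentile(values: Sequence[float], p: float) -> float:
    """Return the p-th percentile (0 < p <= 100) using the nearest-rank method."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    # Nearest rank: the smallest value with at least p% of samples at or below it.
    rank = max(1, math.ceil(p / 100 * len(ordered)))
    return ordered[rank - 1]


# Made-up response times in milliseconds, for illustration only.
latencies_ms = [120, 80, 95, 300, 110, 105, 90, 1000, 100, 115]

for p in (50, 95, 99):
    print(f"p{p}: {percentile(latencies_ms, p)} ms")
```

With only ten samples, p95 and p99 both land on the slowest request, which is why tail percentiles are most informative over larger time periods.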
Token Accounting
Tokens are the primary unit of usage and cost calculation.
Token Types
Input Tokens:
- Prompt text you send to the model
- System prompt
- Context windows and memory
- Tool descriptions
- Prior conversation history
Output Tokens:
- Model response text
- Tool execution results
- Streaming overhead
Token Pricing
Pricing varies by model. For current rates, see Pricing.
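As a rough illustration of how token counts turn into cost, the sketch below multiplies input and output tokens by per-million-token rates. The rates, the `ModelRate` type, and the `estimate_cost` helper are placeholders invented for this example, not actual Noorle pricing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelRate:
    """Hypothetical price per one million tokens, in USD."""
    input_per_million: float
    output_per_million: float


def estimate_cost(input_tokens: int, output_tokens: int, rate: ModelRate) -> float:
    """Return the estimated cost in USD for a single request."""
    input_cost = input_tokens / 1_000_000 * rate.input_per_million
    output_cost = output_tokens / 1_000_000 * rate.output_per_million
    return input_cost + output_cost


# Placeholder rates for illustration only; not real pricing.
example_rate = ModelRate(input_per_million=2.00, output_per_million=8.00)

cost = estimate_cost(input_tokens=1_500, output_tokens=500, rate=example_rate)
print(f"Estimated cost: ${cost:.4f}")
```

Input and output tokens are kept separate because models commonly price them differently.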
Cost Optimization
Use Smart Routing
Let Noorle optimize model selection automatically. Smart Routing analyzes request complexity and picks an appropriate model:
{
  "routing_strategy": {
    "strategy": "smart"
  }
}
Pair Primary + Budget Model
Use a cheaper model for simple tasks and a more capable model for complex ones:
{
  "model_name": "gpt-4o",
  "budget_model": "gpt-4o-mini"
}
The budget model typically saves 50-80% on simple requests.
Optimize System Prompts
Shorter system prompts mean fewer input tokens on every request:
Verbose (high cost):
You are an AI assistant. You should be helpful, harmless, and honest.
You should always provide detailed explanations...
[continues for 2KB]
Concise (low cost):
Helpful AI assistant. Provide clear, brief responses.
Reduce context window size for cheaper operations:
- Go to Agent Settings > Memory Configuration
- Set working_memory_size to the minimum needed
- Reduce summary_token_threshold
- Set summary_message_threshold to trigger summaries earlier
Batch Requests
Process multiple items in one request instead of individual calls:
Expensive (separate calls):
3 API calls × 1000 tokens each = 3000 tokens
Efficient (batch):
1 API call × 2500 tokens = 2500 tokens (500 tokens saved)
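One way to see where the saving comes from: every separate call repeats a fixed overhead (system prompt, tool descriptions), while a batch pays it once. The split between overhead and per-item tokens below is an assumption chosen to reproduce the 3000 and 2500 token totals above; real overhead depends on your agent configuration.

```python
def separate_calls_tokens(items: int, overhead: int, per_item: int) -> int:
    """Every call pays the fixed overhead plus one item's tokens."""
    return items * (overhead + per_item)


def batched_call_tokens(items: int, overhead: int, per_item: int) -> int:
    """A single call pays the overhead once, then every item's tokens."""
    return overhead + items * per_item


# Assumed split: 250 overhead tokens + 750 tokens per item = 1000 per call.
separate = separate_calls_tokens(items=3, overhead=250, per_item=750)
batched = batched_call_tokens(items=3, overhead=250, per_item=750)
print(f"separate={separate} batched={batched} saved={separate - batched}")
```

The saving grows with the number of items and with the size of the repeated overhead, so batching helps most for agents with long system prompts.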
Use Caching and Memory
Leverage conversation memory to avoid re-processing:
- Enable agent memory (3-tier system)
- Reuse context across requests
- Let platform manage summarization
Platform Limits
| Limit | Value | Notes |
|---|---|---|
| Upload Size | 20MB | Maximum request payload |
| MCP Request Timeout | 240 seconds | Per gateway request |
For plan-specific limits and quotas, see Pricing.
Contact sales for custom enterprise limits.
Hitting Limits
If you exceed a rate limit, the API responds with:
HTTP 429 Too Many Requests
The response body includes:
{
  "error": "rate_limit_exceeded",
  "retry_after_seconds": 60
}
Handling:
- Implement exponential backoff
- Batch requests
- Request custom limits
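The exponential backoff step could look something like the sketch below. It assumes a caller-supplied `send` function (hypothetical) that returns a `(status_code, body)` pair, and it uses the `retry_after_seconds` field from the 429 response as a lower bound on the wait. The attempt count, delays, and jitter are illustrative defaults, not recommended values.

```python
import random
import time
from typing import Callable, Tuple

# A response is modeled as (status_code, parsed_json_body); this shape is an assumption.
Response = Tuple[int, dict]


def call_with_backoff(
    send: Callable[[], Response],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 120.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Call send() until it stops returning 429 or attempts run out."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        status, body = send()
        if status != 429:
            return status, body
        if attempt == max_attempts - 1:
            break  # Out of attempts; return the last 429 to the caller.
        # Exponential backoff: base_delay, 2x, 4x, ... capped at max_delay.
        delay = min(max_delay, base_delay * (2 ** attempt))
        # Never retry sooner than the server asked for.
        retry_after = body.get("retry_after_seconds")
        if isinstance(retry_after, (int, float)):
            delay = max(delay, float(retry_after))
        # Add up to 10% random jitter so many clients do not retry in lockstep.
        sleep(delay + random.uniform(0, 0.1 * delay))
    return status, body
```

Passing `sleep` as a parameter keeps the function easy to test without real waiting; in production the default `time.sleep` applies.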
Usage Breakdown by Resource
By Agent
- Go to Agents > Select Agent
- Click Usage tab
- See tokens and cost for this agent only
By Gateway
- Go to Gateways > Select Gateway
- Click Usage tab
- See tokens and cost for this gateway only
By Model
- Go to Account > Usage
- View “By Model” table
- See which models are most expensive
By Capability
- Go to Account > Usage
- View “By Capability” table
- See cost of each capability (Web Search, Code Runner, etc.)
Capability Limits Reference
Web Search
| Limit | Value | Notes |
|---|---|---|
| Timeout | 30 seconds | Per request |
| Max Results | 10 | Per query |
| Max Retries | 3 | Automatic |
Optimization Tips: Use specific queries to reduce result processing. Cache recurring searches in the Files capability.
Browser
| Limit | Value | Notes |
|---|---|---|
| Timeout | 30 seconds | Configurable |
| Default Viewport | 1280 × 720 | Configurable |
| JavaScript | Enabled | Fully renders dynamic content |
Optimization Tips: Use text extraction instead of screenshots when you only need content. Set appropriate timeouts for slow pages.
Code Runner
| Limit | Value | Notes |
|---|---|---|
| Timeout | 30 seconds | Maximum execution time |
| Memory | 128MB | Per execution (configurable up to 512MB) |
| CPU Time | ~100ms | Maximum CPU time |
| Code Size | 10KB | Maximum code length |
| Output Size | 1MB | Console output |
| Network | Disabled | No HTTP requests |
| File I/O | Disabled | No file read/write |
Optimization Tips: Keep executions under 100ms CPU time. Process large datasets in chunks. Use Sandbox for anything needing external packages or network access.
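Processing a dataset in chunks might look like the following sketch, where each chunk is small enough to be handled in its own short execution. The `chunked` helper and the chunk size are illustrative; the right size depends on how much work each item needs within the CPU time limit.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def chunked(items: Sequence[T], size: int) -> Iterator[List[T]]:
    """Yield consecutive slices of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


# Each chunk could be processed as its own small execution,
# with the partial results combined afterwards.
readings = list(range(10))
partial_sums = [sum(chunk) for chunk in chunked(readings, 4)]
print(partial_sums, sum(partial_sums))
```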
Sandbox
| Limit | Value | Notes |
|---|---|---|
| Sizes | XS to XL | 5 predefined tiers |
| vCPU Range | 0.25 – 4.0 | Based on size |
| Memory Range | 1 GB – 12 GB | Based on size |
| Disk Range | 4 GB – 20 GB | Based on size |
| Command Timeout | 5 min default, 30 min max | Configurable |
| Max Lifetime | 60 minutes | Auto-terminates |
| Idle Timeout | 20 min | Auto-terminates |
| Containers per Session | 1 | One active at a time |
| Network | Outbound only | No inbound connections |
Optimization Tips: Start with the smallest size (XS) and scale up as needed. Use Code Runner for simple operations that don’t need packages. Sandboxes auto-terminate after 20 min idle to save costs.
HTTP Client
| Limit | Value | Notes |
|---|---|---|
| Request Body | 2MB | Maximum size |
| Response Body | 2MB | Maximum size |
| Timeout | 30s default | Min 1s, max 300s |
| Rate Limit | 1,000/min | Per user |
| Max Redirects | 10 | Per request chain |
Optimization Tips: Implement retry logic with exponential backoff. Compress large payloads. Use connection pooling for repeated calls to the same host.
Files
| Limit | Value | Notes |
|---|---|---|
| Per-File Size | 10MB | All scopes |
| Files per Scope | 100 | Per scope |
| Agent Home Storage | 50MB | Persistent agent storage |
Optimization Tips: Use session-scoped files for temporary processing. Only persist results to agent home when needed. Clean up unneeded files to stay within limits.
Knowledge Retrieval
| Limit | Value | Notes |
|---|---|---|
| Query Length | 500 chars | Search query max |
| Results per Search | 100 | Maximum |
| Document Size | 10MB | Per document |
| Index Size | 10GB | Per knowledge base |
| Concurrent Searches | 100/sec | Per knowledge base |
Optimization Tips: Pre-filter with metadata before searching. Use appropriate chunk sizes for your documents. Cache frequent queries.
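Caching frequent queries can be sketched with a small in-process cache. In the example below, `run_search` is a hypothetical stand-in for a real knowledge base call, and the 500-character check mirrors the Query Length limit in the table above. The cache size is an arbitrary choice.

```python
from functools import lru_cache
from typing import Tuple

MAX_QUERY_CHARS = 500  # Query Length limit from the table above

# Counts how often the (stand-in) backend is actually called.
search_calls = {"count": 0}


def run_search(query: str) -> Tuple[str, ...]:
    """Hypothetical stand-in for a real knowledge base search call."""
    search_calls["count"] += 1
    return (f"result for {query!r}",)


@lru_cache(maxsize=256)
def cached_search(query: str) -> Tuple[str, ...]:
    """Validate the query length, then search; repeated queries hit the cache."""
    if len(query) > MAX_QUERY_CHARS:
        raise ValueError(f"query exceeds {MAX_QUERY_CHARS} characters")
    return run_search(query)
```

Calling `cached_search("refund policy")` twice reaches the backend only once. Results are returned as tuples so cached values cannot be mutated by accident, and a cache like this should be cleared (`cached_search.cache_clear()`) whenever the knowledge base is re-indexed.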
Computer
| Limit | Value | Notes |
|---|---|---|
| Sizes | x2 to x32 | 5 predefined tiers |
| vCPU Range | 2 – 16 | Based on size |
| RAM Range | 2 GB – 32 GB | Based on size |
| Disk Range | 40 GB – 360 GB | Based on size |
| Default Size | x4 | 3 vCPU, 4 GB RAM, 80 GB disk |
| SSH Timeout | 30 seconds | Per command |
| Browser Max Tabs | 5 | Per session |
| Browser Max Download | 50 MB | Per file |
| Snapshot Elements | 500 | Truncated if exceeded |
Optimization Tips: Start with x4 (default) and scale up for memory-intensive or multi-process workloads. Enable the browser subsystem only if needed. Computer is agent-only and is not available on MCP gateways.
Plugins (WebAssembly)
| Limit | Value | Notes |
|---|---|---|
| Memory | 128MB default, 512MB max | Per invocation |
| CPU Time | 100ms max | Per request |
| Timeout | 30s default, 120s max | Configurable |
| Module Size | 10MB | Compressed npack file |
Optimization Tips: Configure memory limits based on your workload (up to 512MB max). Set appropriate timeouts (up to 120s max). Keep module size small for faster deployment.
Pricing and Billing
For current pricing details, plan comparisons, and billing information, see Pricing.
Alerts and Budgets
Budget Alerts
Set a spending threshold to receive notifications:
- Go to Account > Billing > Budget Alerts
- Click Create Alert
- Set threshold amount
- Choose notification method (email, webhook)
- Click Save
You’ll be notified when spending reaches the threshold.
Usage Alerts
Get notified when you approach limits:
- Go to Account > Usage > Alerts
- Toggle the limits you want to monitor:
- Daily API calls
- Storage usage
- Token consumption
- Set notification threshold
- Click Save
Enterprise Limits
For enterprise plans with custom limits, contact sales@noorle.com or see Pricing.
Troubleshooting
High Unexpected Cost
Check:
- Go to Usage dashboard
- Filter by time period (daily, hourly)
- Identify which resource is expensive
- Check token breakdown by model
Common causes:
- Large system prompt
- Expensive model selected
- Budget model not triggering
- Streaming overhead
- Knowledge base queries with large documents
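If you export usage data, a short script can point at the most expensive model quickly. The sketch below assumes each usage record is a dictionary with `model` and `cost` fields; that record shape is an assumption made for this example, not a documented format.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def cost_by_model(records: Iterable[dict]) -> List[Tuple[str, float]]:
    """Sum cost per model and return (model, total) pairs, most expensive first."""
    totals: Dict[str, float] = defaultdict(float)
    for record in records:
        totals[record["model"]] += record["cost"]
    return sorted(totals.items(), key=lambda pair: pair[1], reverse=True)


# Hypothetical usage records; the field names are assumptions for this sketch.
sample = [
    {"model": "gpt-4o", "cost": 1.25},
    {"model": "gpt-4o-mini", "cost": 0.10},
    {"model": "gpt-4o", "cost": 0.75},
]
for model, total in cost_by_model(sample):
    print(f"{model}: ${total:.2f}")
```

A budget model that barely appears in the totals is a hint that it is not being triggered.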
Hitting Rate Limit
Solutions:
- Implement exponential backoff
- Batch requests
- Reduce concurrent requests
- Upgrade plan
- Contact support for custom limits
Usage Not Updating
Give it time: Usage data refreshes every 5-15 minutes, so check again in a few minutes.
Force refresh: Switch to a different tab, then return.
API Access
Query usage programmatically:
# Get usage statistics
curl "https://api.noorle.com/v1/usage/summary" \
  -H "X-API-Key: ak-{your_key}"
# Get detailed usage (the URL is quoted so the shell does not interpret "?")
curl "https://api.noorle.com/v1/usage/detailed?period=month" \
  -H "X-API-Key: ak-{your_key}"
# Get resource-specific usage
curl "https://api.noorle.com/v1/agents/{agent_id}/usage" \
  -H "X-API-Key: ak-{your_key}"
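The same requests can be made from a script. The sketch below uses only the Python standard library and the endpoints and `X-API-Key` header shown in the curl commands. The helper names are hypothetical, and because the response schema is not described here, the JSON is returned as-is rather than parsed into specific fields.

```python
import json
import urllib.request
from typing import Any, Optional
from urllib.parse import urlencode

BASE_URL = "https://api.noorle.com/v1"


def build_usage_request(
    api_key: str, path: str, params: Optional[dict] = None
) -> urllib.request.Request:
    """Build a GET request for a usage endpoint without sending it."""
    url = f"{BASE_URL}{path}"
    if params:
        url = f"{url}?{urlencode(params)}"
    return urllib.request.Request(url, headers={"X-API-Key": api_key})


def fetch_json(request: urllib.request.Request, timeout: float = 30.0) -> Any:
    """Send the request and return the decoded JSON body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


# Example (requires a valid key and network access):
# summary = fetch_json(build_usage_request("ak-...", "/usage/summary"))
# monthly = fetch_json(
#     build_usage_request("ak-...", "/usage/detailed", {"period": "month"})
# )
```

`urlopen` raises `urllib.error.HTTPError` for non-2xx responses, including 429, so callers that expect rate limiting should catch it and retry.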
Next Steps