Building a Multi-Tenant Data Analysis Service
Architecture guide for wrapping DataStoryBot in a multi-tenant service: user isolation, quota management, result caching.
If you're building a product where multiple customers each upload their own data and run their own analyses, DataStoryBot is a capable backend — but it doesn't handle tenancy for you. A raw DataStoryBot integration gives every user the same API key, the same rate limit pool, and no isolation between their data. That's fine for a prototype. It's not fine for production.
This guide covers the architecture for wrapping DataStoryBot in a proper multi-tenant service: how to isolate tenant data, enforce per-tenant quotas, cache results to control costs, and track usage for billing. Code examples use TypeScript with Node.js, but the patterns apply to any backend.
Before diving in, make sure you're comfortable with the basic API flow. The getting started guide covers the three-call pipeline (upload → analyze → refine) that everything here builds on.
Architecture Overview
A multi-tenant wrapper sits between your end users and DataStoryBot. No user ever calls DataStoryBot directly.
Tenant A ──┐
Tenant B ──┼──> [ Your Service Layer ] ──> DataStoryBot API
Tenant C ──┘ │
├── Tenant DB (quotas, billing, metadata)
├── Cache (Redis / S3)
└── Key Vault (API credentials)
Your service layer handles:
- Authentication — identifying which tenant made the request
- Quota enforcement — blocking requests that exceed limits before calling DataStoryBot
- Request proxying — forwarding the actual analysis work with per-tenant tracking
- Result caching — storing results so identical analyses don't re-run
- Usage recording — writing billable events for downstream invoicing
Each layer is independent. You can add them incrementally.
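Before looking at each layer in detail, the whole flow can be sketched as one orchestration function with its dependencies injected. Everything here (the function shape, the names) is illustrative rather than DataStoryBot SDK code; the point is that each layer stays independently stubbable:

```typescript
// Hypothetical orchestration of the five layers as injected functions.
type AnalyzeDeps = {
  authenticate: (apiKey: string) => Promise<string>; // resolves a tenantId
  enforceQuota: (tenantId: string) => Promise<void>; // throws when over limit
  getCached: (tenantId: string, fileId: string) => Promise<unknown | null>;
  runAnalysis: (fileId: string) => Promise<unknown>; // the DataStoryBot call
  recordUsage: (tenantId: string, event: string) => Promise<void>;
};

export async function handleAnalyze(
  deps: AnalyzeDeps,
  apiKey: string,
  fileId: string
): Promise<unknown> {
  const tenantId = await deps.authenticate(apiKey); // 1. identify the tenant
  await deps.enforceQuota(tenantId);                // 2. block before spending
  const cached = await deps.getCached(tenantId, fileId);
  if (cached !== null) {                            // 3. serve from cache
    await deps.recordUsage(tenantId, "cache_hit");
    return cached;
  }
  const result = await deps.runAnalysis(fileId);    // 4. proxy the real work
  await deps.recordUsage(tenantId, "analysis_complete"); // 5. billable event
  return result;
}
```

Because every dependency is a plain async function, each layer can be swapped for an in-memory stub in tests without touching the others.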
Tenant Identification and Middleware
Every request to your service needs to carry a tenant identity. The simplest approach is an API key that maps to a tenant record. Issue each customer their own key; they never see yours.
// middleware/tenant.ts
import { Request, Response, NextFunction } from "express";
import { db } from "../db";
export interface TenantContext {
tenantId: string;
plan: "starter" | "pro" | "enterprise";
quotas: {
analysesPerMonth: number;
maxFileSizeMb: number;
cacheEnabled: boolean;
};
}
declare global {
namespace Express {
interface Request {
tenant: TenantContext;
}
}
}
export async function tenantMiddleware(
req: Request,
res: Response,
next: NextFunction
) {
const apiKey = req.headers["x-api-key"];
if (!apiKey || typeof apiKey !== "string") {
return res.status(401).json({ error: "Missing API key" });
}
// Hash the key before the DB lookup — never store plaintext keys
const keyHash = hashApiKey(apiKey);
const tenant = await db.tenants.findByKeyHash(keyHash);
if (!tenant || !tenant.active) {
return res.status(401).json({ error: "Invalid or inactive API key" });
}
req.tenant = {
tenantId: tenant.id,
plan: tenant.plan,
quotas: getPlanQuotas(tenant.plan),
};
next();
}
import { createHash } from "node:crypto";

function hashApiKey(key: string): string {
  // Store and compare only the hash; the raw key never touches the database
  return createHash("sha256").update(key).digest("hex");
}

function getPlanQuotas(plan: string): TenantContext["quotas"] {
  const quotas: Record<string, TenantContext["quotas"]> = {
    starter: { analysesPerMonth: 50, maxFileSizeMb: 10, cacheEnabled: false },
    pro: { analysesPerMonth: 500, maxFileSizeMb: 50, cacheEnabled: true },
    enterprise: { analysesPerMonth: 10000, maxFileSizeMb: 50, cacheEnabled: true },
  };
  return quotas[plan] ?? quotas.starter; // fall back safely on unknown plan data
}
Apply this middleware to every route. From here on, req.tenant is always populated and trustworthy.
Data Isolation
DataStoryBot uses file IDs to reference uploaded CSVs. A file ID returned from one upload session is not accessible to other sessions, so there is baseline isolation at the storage level — but your service needs to enforce that tenants can only reference their own file IDs.
Store every file ID your service creates, tagged to the tenant:
// services/fileRegistry.ts
export async function registerFile(
tenantId: string,
fileId: string,
metadata: { fileName: string; sizeBytes: number; uploadedAt: Date }
) {
await db.files.insert({ tenantId, fileId, ...metadata });
}
export async function assertFileOwnership(
tenantId: string,
fileId: string
): Promise<void> {
  const record = await db.files.findOne({ tenantId, fileId });
  if (!record) {
    // ForbiddenError is your app's error class; map it to a 403 in error middleware
    throw new ForbiddenError(
      `File ${fileId} does not belong to tenant ${tenantId}`
    );
  }
}
Call assertFileOwnership before any downstream request that takes a fileId parameter. This prevents tenant A from referencing tenant B's file IDs even if they somehow obtain one.
Never expose raw DataStoryBot file IDs to end users if you can avoid it. Instead, maintain your own opaque IDs that map to DataStoryBot file IDs server-side. This gives you an additional indirection layer and makes key rotation easier.
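A minimal sketch of that indirection, with an in-memory Map standing in for the tenant database (the `file_` prefix and helper names are made up for illustration):

```typescript
import { randomUUID } from "node:crypto";

// In production this mapping lives in the tenant DB, not process memory.
const idMap = new Map<string, { tenantId: string; providerFileId: string }>();

export function issueOpaqueId(tenantId: string, providerFileId: string): string {
  const opaque = `file_${randomUUID()}`;
  idMap.set(opaque, { tenantId, providerFileId });
  return opaque; // this is all the end user ever sees
}

export function resolveOpaqueId(tenantId: string, opaqueId: string): string {
  const record = idMap.get(opaqueId);
  if (!record || record.tenantId !== tenantId) {
    // Ownership check is built into resolution: the wrong tenant gets
    // the same error as a nonexistent ID, leaking nothing.
    throw new Error("File not found");
  }
  return record.providerFileId;
}
```

Note that resolution doubles as the ownership check, so a stolen opaque ID is useless to another tenant.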
Quota Management
Quotas need to be checked before the analysis runs, not after. A post-hoc check means you've already incurred the cost.
// services/quota.ts
import { redis } from "../redis";
export async function checkAndIncrementQuota(
  tenantId: string,
  limit: number
): Promise<void> {
  const key = quotaKey(tenantId);
  // INCR is atomic, so two concurrent requests can't both read the same
  // count and sneak under the limit (a GET-then-INCR sequence would race).
  const used = await redis.incr(key);
  if (used === 1) {
    // First request this month: expire the counter at the billing boundary
    await redis.expireat(key, endOfMonthTimestamp());
  }
  if (used > limit) {
    throw new QuotaExceededError(
      `Monthly analysis quota of ${limit} exceeded. Used: ${used}`
    );
  }
}
function quotaKey(tenantId: string): string {
const month = new Date().toISOString().slice(0, 7); // "2026-03"
return `quota:${tenantId}:${month}`;
}
function endOfMonthTimestamp(): number {
const now = new Date();
const end = new Date(now.getFullYear(), now.getMonth() + 1, 1);
return Math.floor(end.getTime() / 1000);
}
Using Redis with atomic INCR prevents double-spending under concurrent requests. The key expires at the billing period boundary so counts reset automatically.
Wire the quota check into your analyze route:
// routes/analyze.ts
router.post("/analyze", tenantMiddleware, async (req, res) => {
const { fileId, steeringPrompt } = req.body;
const { tenant } = req;
// 1. Verify file ownership
await assertFileOwnership(tenant.tenantId, fileId);
// 2. Check quota before doing any work
await checkAndIncrementQuota(tenant.tenantId, tenant.quotas.analysesPerMonth);
// 3. Check cache (if plan allows)
if (tenant.quotas.cacheEnabled) {
const cached = await getCachedResult(tenant.tenantId, fileId, steeringPrompt);
if (cached) {
await recordUsageEvent(tenant.tenantId, "cache_hit", { fileId });
return res.json({ ...cached, fromCache: true });
}
}
// 4. Forward to DataStoryBot
const result = await runAnalysis(fileId, steeringPrompt);
// 5. Cache and record
if (tenant.quotas.cacheEnabled) {
await cacheResult(tenant.tenantId, fileId, steeringPrompt, result);
}
await recordUsageEvent(tenant.tenantId, "analysis_complete", { fileId, result });
res.json(result);
});
Note that the quota is incremented before the analysis runs. If the downstream call fails, you have two choices: decrement the quota (optimistic) or eat the cost (conservative). The conservative approach is simpler and avoids abuse patterns where users trigger failures intentionally to avoid quota charges.
For robust handling of DataStoryBot failures in this proxy layer, see the error handling and retry patterns guide.
Result Caching
Analysis results are deterministic for a given file and prompt. If tenant A runs "show me sales trends" on the same CSV twice, the second call should return the cached result rather than re-running the full pipeline.
// services/cache.ts
import { createHash } from "crypto";
import { s3 } from "../s3";
const CACHE_TTL_SECONDS = 60 * 60 * 24 * 7; // 7 days
function cacheKey(tenantId: string, fileId: string, prompt: string): string {
const normalized = prompt.trim().toLowerCase();
const hash = createHash("sha256")
.update(`${fileId}:${normalized}`)
.digest("hex")
.slice(0, 16);
return `cache/${tenantId}/${hash}.json`;
}
export async function getCachedResult(
tenantId: string,
fileId: string,
prompt: string
): Promise<AnalysisResult | null> {
const key = cacheKey(tenantId, fileId, prompt);
try {
const obj = await s3.getObject({ Bucket: process.env.CACHE_BUCKET!, Key: key });
const body = await obj.Body!.transformToString();
return JSON.parse(body);
} catch (err: any) {
if (err.name === "NoSuchKey") return null;
throw err;
}
}
export async function cacheResult(
tenantId: string,
fileId: string,
prompt: string,
result: AnalysisResult
): Promise<void> {
const key = cacheKey(tenantId, fileId, prompt);
await s3.putObject({
Bucket: process.env.CACHE_BUCKET!,
Key: key,
Body: JSON.stringify(result),
ContentType: "application/json",
// S3 lifecycle rule should match this tag for TTL enforcement
Tagging: `ttl=${CACHE_TTL_SECONDS}`,
});
}
A few notes on this approach:
The cache key combines the file ID and a normalized prompt. Normalization (trim, lowercase) prevents cache misses from trivial variations in whitespace or capitalization. You may want more aggressive normalization — stripping filler words — if your users repeat prompts with minor rewording.
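For example, a slightly more aggressive normalizer might collapse whitespace, drop punctuation, and strip a small stop list (the word list here is illustrative; over-stripping risks merging genuinely different prompts into one cache key):

```typescript
// Illustrative stop list; tune it to how your users actually phrase prompts.
const FILLER = new Set(["please", "can", "you", "me", "the", "a"]);

export function normalizePrompt(prompt: string): string {
  return prompt
    .toLowerCase()
    .replace(/[^\w\s]/g, " ") // drop punctuation
    .split(/\s+/)             // collapse any run of whitespace
    .filter((w) => w.length > 0 && !FILLER.has(w))
    .join(" ");
}
```

With this, "Please show me the sales trends!" and "show SALES trends" normalize to the same key.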
Use S3 (or equivalent object storage) rather than Redis for result caching. Analysis results include chart metadata and narrative text that can run to several kilobytes per entry. Redis works, but S3 is cheaper at scale, and lifecycle rules keyed on object tags handle expiry for you.
Cache the chart file IDs that DataStoryBot returns, not the chart bytes themselves. Chart files are already stored by DataStoryBot; caching the reference is enough. When you serve a cached result, users fetch charts through your files proxy, which re-downloads from DataStoryBot on demand.
Billing Events
Usage-based billing requires a clean event log. Write a record for every billable action:
// services/billing.ts
export type UsageEventType =
| "file_upload"
| "analysis_complete"
| "cache_hit"
| "file_download";
export async function recordUsageEvent(
tenantId: string,
eventType: UsageEventType,
metadata: Record<string, unknown>
) {
await db.usageEvents.insert({
tenantId,
eventType,
metadata,
occurredAt: new Date(),
});
}
Keep the event table append-only. Never update or delete rows. This gives you an audit trail for billing disputes and lets you replay events if your billing logic changes.
Common billing models for data analysis services:
- Per analysis — charge for each /analyze + /refine call pair. Cache hits can be free or discounted.
- Per row analyzed — charge based on the CSV row count, extracted from DataStoryBot's response metadata.
- Flat monthly by tier — quota-based plans where overages trigger tier upgrades or hard blocks.
The event log supports all three. Query analysis_complete events for per-analysis counts, join with file metadata for row counts, aggregate by month for tier billing.
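As a sketch of the first model, here is the per-month aggregation expressed as a pure function over events of the shape recordUsageEvent writes; a real implementation would push this into a SQL GROUP BY instead:

```typescript
type UsageEvent = { tenantId: string; eventType: string; occurredAt: Date };

// Count billable analyses and (possibly discounted) cache hits for one
// tenant in one billing month ("YYYY-MM").
export function summarizeMonth(
  events: UsageEvent[],
  tenantId: string,
  month: string
): { analyses: number; cacheHits: number } {
  const inMonth = events.filter(
    (e) =>
      e.tenantId === tenantId &&
      e.occurredAt.toISOString().slice(0, 7) === month
  );
  return {
    analyses: inMonth.filter((e) => e.eventType === "analysis_complete").length,
    cacheHits: inMonth.filter((e) => e.eventType === "cache_hit").length,
  };
}
```

Because the event log is append-only, re-running this function for a past month always reproduces the same invoice.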
API Key Management
Never hardcode your DataStoryBot API key. Use a secrets manager (AWS Secrets Manager, HashiCorp Vault, or even environment variables injected at deploy time). Rotate the key without redeploying by fetching it at runtime with a short cache TTL:
// services/datastorybot.ts
import { SecretsManager } from "@aws-sdk/client-secrets-manager";
const sm = new SecretsManager({});
let cachedKey: { value: string; fetchedAt: number } | null = null;
const KEY_CACHE_TTL_MS = 5 * 60 * 1000; // refresh every 5 minutes
async function getApiKey(): Promise<string> {
const now = Date.now();
if (cachedKey && now - cachedKey.fetchedAt < KEY_CACHE_TTL_MS) {
return cachedKey.value;
}
const secret = await sm.getSecretValue({ SecretId: "datastorybot/api-key" });
cachedKey = { value: secret.SecretString!, fetchedAt: now };
return cachedKey.value;
}
export async function callDataStoryBot(
endpoint: string,
options: RequestInit
): Promise<Response> {
const key = await getApiKey();
return fetch(`https://datastory.bot${endpoint}`, {
...options,
headers: {
...options.headers,
Authorization: `Bearer ${key}`,
},
});
}
This means a key rotation requires updating the secret in the vault — no code changes, no deploy.
Security Considerations
Tenant data never mixes. The file ownership check at the start of every request is your primary isolation guarantee. Test it explicitly: create two tenants, upload a file as tenant A, and verify that tenant B cannot analyze or download it.
Sanitize file uploads before forwarding. Validate MIME type and extension on your side before passing to DataStoryBot. Reject files that don't match expected CSV formats at your layer rather than relying on DataStoryBot's validation.
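A pre-flight check might look like the following sketch; the header sniff is a cheap sanity test, not a full CSV parse, and the return convention (null means accepted) is just one way to shape it:

```typescript
// Returns null when the upload passes pre-flight checks, otherwise a
// human-readable rejection reason.
export function validateCsvUpload(
  fileName: string,
  bytes: Buffer,
  maxFileSizeMb: number
): string | null {
  if (!fileName.toLowerCase().endsWith(".csv")) {
    return "Only .csv files are accepted";
  }
  if (bytes.length > maxFileSizeMb * 1024 * 1024) {
    return `File exceeds the ${maxFileSizeMb} MB plan limit`;
  }
  // Sniff the first line for a delimiter before forwarding anything upstream
  const firstLine = bytes.subarray(0, 4096).toString("utf8").split("\n")[0] ?? "";
  if (!firstLine.includes(",")) {
    return "File does not look like comma-delimited CSV";
  }
  return null;
}
```

The maxFileSizeMb argument comes straight from the tenant's plan quotas resolved by the middleware.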
Log tenant IDs, not user data. Your service logs should record tenantId and fileId for debugging, but should never log CSV content, narrative results, or file metadata that could contain PII. Configure your logger to redact request bodies.
Rate limit at the network layer too. Quota enforcement in application code is vulnerable to thundering herd — 1000 concurrent requests from one tenant can all pass the quota check before the first increment commits. Complement application-level quotas with a reverse proxy rate limiter (nginx limit_req, AWS API Gateway throttling) set to something like 10 req/s per API key.
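As an illustration, an nginx limit_req setup keyed on the tenant API key header might look like this (zone name, rate, and upstream name are all placeholders to adapt):

```nginx
# Bucket requests by the tenant's x-api-key header: 10 req/s steady state,
# short bursts absorbed, everything beyond rejected with 429.
limit_req_zone $http_x_api_key zone=per_tenant:10m rate=10r/s;
limit_req_status 429;

server {
    listen 443 ssl;

    location /api/ {
        limit_req zone=per_tenant burst=20 nodelay;
        # "service_layer" is an upstream block defined elsewhere in your config
        proxy_pass http://service_layer;
    }
}
```

This limiter fires before a request ever reaches your application code, so it closes the window where concurrent requests race the quota check.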
Scope the DataStoryBot API key minimally. If DataStoryBot introduces scoped API keys, use the narrowest scope that covers your use case. A key that can only upload and analyze is less dangerous than one that can also delete files or access account settings.
Putting It Together
The complete request path for a tenant analysis looks like this:
POST /analyze (x-api-key: tenant-key)
│
├── tenantMiddleware → resolve tenantId, plan, quotas
├── assertFileOwnership → verify fileId belongs to tenant
├── checkAndIncrementQuota → block if limit exceeded
├── getCachedResult → return cached result if available
├── callDataStoryBot → upload → analyze → refine
├── cacheResult → store in S3 for future hits
└── recordUsageEvent → append billing event
Each step is independently testable and can fail without taking down the others. The ownership assertion and quota check are cheap guards that run before any expensive work; the cache lookup and DataStoryBot calls are the heavier I/O and can be instrumented separately.
For the full DataStoryBot API surface you're proxying, see the API reference. It covers every endpoint parameter and response field you'll need to handle correctly in your proxy layer.
Start with tenant identification and quota enforcement — those are the pieces that protect you from runaway costs and data leakage. Add caching once you have real usage data and can see which analyses repeat most often. Build billing event recording from day one, even if you're not charging yet; retrofitting an event log into a live system is painful.