
How to Generate a Data Report from CSV in One API Call

Turn any CSV file into a complete data report with narrative insights, charts, and supporting data — using a single API call. No templates, no formatting.

By DataStoryBot Team


Every Monday morning, the same ritual plays out across thousands of teams: someone opens a CSV export, fires up a spreadsheet, builds a few charts, writes some commentary, formats it all into a slide deck or PDF, and emails it out. The whole process takes one to three hours — and most of that time is spent on formatting, not thinking.

The actual analytical work — identifying what changed, why it matters, and what to do about it — gets maybe 15 minutes. The rest is layout, labeling axes, writing bullet points that say things like "Revenue increased 12% month-over-month."

What if you could skip the formatting entirely and go from a raw CSV to a complete data report — narrative, charts, and filtered dataset — in under 30 seconds?

What a "Complete Data Report" Actually Means

Before diving into the API, let's define what we're generating. A DataStoryBot report includes three components:

  1. A written narrative in markdown — not bullet points, but actual prose that explains what the data shows, highlights key insights in bold, and uses blockquote callouts for critical findings.
  2. Publication-quality charts — dark-themed PNGs with proper labels, legends, and annotations. These aren't matplotlib defaults; they're styled for direct use in presentations and dashboards.
  3. A filtered dataset — a CSV containing only the rows and columns relevant to the story, so stakeholders can dig deeper without wading through the full dataset.

This is the output of DataStoryBot's three-step pipeline: upload your CSV, discover story angles, then refine one into a full report.

The DataStoryBot API: Three Endpoints, One Report

The pipeline uses three API calls. Here's how they work.

Step 1: Upload Your CSV

curl -X POST https://datastory.bot/api/upload \
  -F "file=@quarterly_sales.csv"

The response gives you a container ID (an ephemeral sandbox that lives for 20 minutes), a file ID, and metadata about your dataset:

{
  "containerId": "ctr_abc123",
  "fileId": "file-xyz789",
  "metadata": {
    "fileName": "quarterly_sales.csv",
    "rowCount": 2847,
    "columnCount": 12,
    "columns": ["date", "region", "product", "revenue", "units", "cost", ...]
  }
}

This metadata is useful for deciding how to steer the analysis in the next step.
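
For example, a small helper can turn the column list into a steering prompt for the next call. `build_steering_prompt` below is a hypothetical convenience function, not part of the API — just a sketch of one way to use the metadata:

```python
# Sketch: derive a steering prompt from the upload metadata. The column
# names mirror the example response above; build_steering_prompt is a
# hypothetical helper, not part of the API.
def build_steering_prompt(columns):
    """Pick an analysis focus based on which columns are present."""
    hints = []
    if "date" in columns:
        hints.append("trends over time")
    if "region" in columns:
        hints.append("regional performance differences")
    if {"revenue", "cost"} <= set(columns):
        hints.append("margin analysis (revenue vs. cost)")
    if not hints:
        return None  # let the API pick its own angles
    return "Focus on " + " and ".join(hints)
```

The result can be passed as the optional steeringPrompt parameter covered later in this post.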

Step 2: Discover Story Angles

curl -X POST https://datastory.bot/api/analyze \
  -H "Content-Type: application/json" \
  -d '{
    "containerId": "ctr_abc123"
  }'

DataStoryBot's Code Interpreter runs statistical analysis inside the container — profiling distributions, computing correlations, detecting outliers — and returns three distinct story angles:

[
  {
    "id": 1,
    "title": "West Region Drives 68% of Q4 Revenue Growth",
    "summary": "While overall revenue grew 23%, the West region accounted for...",
    "chartFileId": "file-chart001"
  },
  {
    "id": 2,
    "title": "Premium Product Line Margins Eroding Since September",
    "summary": "Unit sales for premium products are up 15%, but margins have...",
    "chartFileId": "file-chart002"
  },
  {
    "id": 3,
    "title": "Tuesday-Thursday Sales Pattern Suggests B2B Dominance",
    "summary": "A day-of-week analysis reveals that 74% of transactions occur...",
    "chartFileId": "file-chart003"
  }
]

Each story comes with a preview chart you can fetch via GET /api/files/ctr_abc123/file-chart001. These are real charts generated from your data, not placeholders.

Step 3: Refine Into a Full Report

Pick the story that matters most and generate the full report:

curl -X POST https://datastory.bot/api/refine \
  -H "Content-Type: application/json" \
  -d '{
    "containerId": "ctr_abc123",
    "selectedStoryTitle": "West Region Drives 68% of Q4 Revenue Growth"
  }'

The response contains everything you need:

{
  "narrative": "## West Region Drives 68% of Q4 Revenue Growth\n\nThe West region...",
  "charts": [
    { "fileId": "file-chart010", "caption": "Regional revenue contribution, Q3 vs Q4" },
    { "fileId": "file-chart011", "caption": "West region monthly revenue with trendline" }
  ],
  "resultDataset": {
    "fileId": "file-data001",
    "fileName": "west_region_q4_analysis.csv",
    "rowCount": 743,
    "colCount": 8
  }
}

The narrative is markdown with bold key insights and blockquote callouts for the most important findings. The charts are dark-themed, publication-ready PNGs. The filtered dataset contains only the rows relevant to the West region Q4 story.

Complete Python Example

Here's the full pipeline in Python:

import requests

BASE_URL = "https://datastory.bot/api"

# Step 1: Upload
with open("quarterly_sales.csv", "rb") as f:
    upload_resp = requests.post(f"{BASE_URL}/upload", files={"file": f})
    upload_data = upload_resp.json()

container_id = upload_data["containerId"]
print(f"Uploaded: {upload_data['metadata']['rowCount']} rows, "
      f"{upload_data['metadata']['columnCount']} columns")

# Step 2: Analyze — discover 3 story angles
analyze_resp = requests.post(f"{BASE_URL}/analyze", json={
    "containerId": container_id
})
stories = analyze_resp.json()

for story in stories:
    print(f"\n[{story['id']}] {story['title']}")
    print(f"    {story['summary'][:100]}...")

# Step 3: Refine — generate full report from selected story
selected = stories[0]["title"]
refine_resp = requests.post(f"{BASE_URL}/refine", json={
    "containerId": container_id,
    "selectedStoryTitle": selected
})
report = refine_resp.json()

# Save narrative
with open("report.md", "w") as f:
    f.write(report["narrative"])

# Download charts
for i, chart in enumerate(report["charts"]):
    chart_resp = requests.get(
        f"{BASE_URL}/files/{container_id}/{chart['fileId']}"
    )
    with open(f"chart_{i+1}.png", "wb") as f:
        f.write(chart_resp.content)
    print(f"Saved chart: {chart['caption']}")

# Download filtered dataset
ds = report["resultDataset"]
ds_resp = requests.get(f"{BASE_URL}/files/{container_id}/{ds['fileId']}")
with open(ds["fileName"], "wb") as f:
    f.write(ds_resp.content)
print(f"Saved dataset: {ds['fileName']} ({ds['rowCount']} rows)")

That's roughly 40 lines of code to go from CSV to a complete data report. No chart styling, no narrative writing, no dataset filtering — the API handles all of it.

Steering the Analysis

The /api/analyze endpoint accepts an optional steeringPrompt parameter that focuses the analysis:

analyze_resp = requests.post(f"{BASE_URL}/analyze", json={
    "containerId": container_id,
    "steeringPrompt": "Focus on regional performance differences and year-over-year changes"
})

Similarly, /api/refine accepts a refinementPrompt to adjust the output:

refine_resp = requests.post(f"{BASE_URL}/refine", json={
    "containerId": container_id,
    "selectedStoryTitle": selected,
    "refinementPrompt": "Emphasize actionable recommendations for the sales team"
})

This is useful when you know the audience. An executive summary needs different framing than a report for the analytics team.

Building an Automated Report Pipeline

The real power shows up when you stop running this manually. Here's a Python script that generates a weekly report and emails it:

import requests
import smtplib
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText
from email.mime.image import MIMEImage
import markdown

BASE_URL = "https://datastory.bot/api"

def generate_and_send_report(csv_path, recipients, steering=None):
    # Upload
    with open(csv_path, "rb") as f:
        upload = requests.post(f"{BASE_URL}/upload", files={"file": f}).json()
    cid = upload["containerId"]

    # Analyze
    analyze_payload = {"containerId": cid}
    if steering:
        analyze_payload["steeringPrompt"] = steering
    stories = requests.post(f"{BASE_URL}/analyze", json=analyze_payload).json()

    # Pick the first story (or implement your own selection logic)
    refine = requests.post(f"{BASE_URL}/refine", json={
        "containerId": cid,
        "selectedStoryTitle": stories[0]["title"]
    }).json()

    # Build HTML email
    html_body = markdown.markdown(refine["narrative"])

    msg = MIMEMultipart("related")
    msg["Subject"] = f"Weekly Data Report: {stories[0]['title']}"
    msg["From"] = "reports@example.com"
    msg["To"] = ", ".join(recipients)

    # Embed charts inline; collect the images so the HTML part can be
    # attached first (the root part of a multipart/related message)
    images = []
    for i, chart in enumerate(refine["charts"]):
        chart_bytes = requests.get(
            f"{BASE_URL}/files/{cid}/{chart['fileId']}"
        ).content
        html_body += f'<p><img src="cid:chart{i}"><br><em>{chart["caption"]}</em></p>'
        img = MIMEImage(chart_bytes)
        img.add_header("Content-ID", f"<chart{i}>")
        images.append(img)

    msg.attach(MIMEText(html_body, "html"))
    for img in images:
        msg.attach(img)

    # Send (swap in real credentials from your secret store)
    with smtplib.SMTP("smtp.example.com", 587) as server:
        server.starttls()
        server.login("reports@example.com", "password")
        server.send_message(msg)

# Run it
generate_and_send_report(
    csv_path="/data/exports/weekly_sales.csv",
    recipients=["team@example.com"],
    steering="Focus on week-over-week changes and anomalies"
)

Schedule this with cron, Airflow, or any task runner, and your weekly report writes itself.
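
As a sketch, a crontab entry for Monday at 08:00 might look like this (the interpreter, script, and log paths are placeholders for your environment):

```shell
# Run every Monday at 08:00 and append output to a log file
0 8 * * 1 /usr/bin/python3 /opt/reports/weekly_report.py >> /var/log/weekly_report.log 2>&1
```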

Formatting Outputs for Different Channels

The narrative comes back as markdown, which gives you flexibility in how you deliver it.

HTML email — use Python's markdown library (shown above) or any markdown-to-HTML converter.

PDF reports — pipe the HTML through WeasyPrint or Puppeteer:

# Using WeasyPrint
weasyprint report.html report.pdf

# Using Puppeteer (Node.js)
node -e "
const puppeteer = require('puppeteer');
(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  await page.setContent(require('fs').readFileSync('report.html', 'utf8'));
  await page.pdf({ path: 'report.pdf', format: 'A4' });
  await browser.close();
})();
"

Slack messages — convert the markdown to Slack's block kit format, attach charts as file uploads, and post via the Slack API.
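
A minimal sketch of that conversion: `slack_blocks_from_narrative` is a hypothetical helper, and Slack's mrkdwn dialect differs from standard markdown, so this handles only headings and bold text.

```python
# Sketch: convert the report narrative into Slack Block Kit blocks.
# Hypothetical helper; Slack bolds with *text* rather than **text**,
# and section blocks are capped at 3000 characters.
def slack_blocks_from_narrative(narrative, max_len=3000):
    blocks = []
    for para in narrative.split("\n\n"):
        if para.startswith("## "):
            blocks.append({
                "type": "header",
                "text": {"type": "plain_text", "text": para[3:].strip()},
            })
        else:
            text = para.replace("**", "*")[:max_len]
            blocks.append({
                "type": "section",
                "text": {"type": "mrkdwn", "text": text},
            })
    return blocks
```

Post the resulting list as the blocks field of a chat.postMessage call, then upload the chart PNGs separately.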

Container Lifecycle: What to Know

DataStoryBot runs analysis in ephemeral OpenAI Code Interpreter containers with a 20-minute TTL. This means:

  • Your data is automatically deleted after 20 minutes. No cleanup needed.
  • You need to download all files (charts, datasets) before the container expires.
  • If you need to re-analyze, upload again. Containers are cheap and fast to spin up.

For automated pipelines, always download outputs immediately after the refine step. Don't store container IDs for later — they won't be there.
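
One way to make that habit systematic is to enumerate every downloadable file from the refine response up front. `files_to_download` is a hypothetical helper based on the response shape shown earlier:

```python
# Sketch: list every file the refine response references, so all of them
# can be fetched before the container's 20-minute TTL expires.
def files_to_download(report):
    """Return (fileId, local_name) pairs for all charts plus the dataset."""
    files = [
        (chart["fileId"], f"chart_{i + 1}.png")
        for i, chart in enumerate(report["charts"])
    ]
    dataset = report["resultDataset"]
    files.append((dataset["fileId"], dataset["fileName"]))
    return files
```

Loop over the pairs and GET /api/files/{containerId}/{fileId} for each, writing the bytes to the local name.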

When This Approach Works Best

Automated report generation from CSV works well when:

  • The data structure is consistent — weekly exports from the same system, monthly financial pulls, daily operational metrics.
  • The audience wants narrative, not dashboards — executives, board members, clients who receive reports via email.
  • You need to scale reporting — generating reports for 50 regional offices from a single pipeline.
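
The scale-out case can be sketched as a small planning step that maps each export file to a report job. The naming convention ("<office>_weekly.csv") and the addresses are assumptions for illustration; in practice you would feed this the contents of your export directory and run the upload/analyze/refine pipeline once per job.

```python
# Sketch: map each regional export to a report job. The filename
# convention and recipient addresses are hypothetical.
def plan_report_jobs(csv_names):
    """Build one job per '<office>_weekly.csv' export (assumed convention)."""
    suffix = "_weekly.csv"
    jobs = []
    for name in sorted(csv_names):
        if not name.endswith(suffix):
            continue
        office = name[: -len(suffix)]
        jobs.append({
            "csv_path": name,
            "recipients": [f"{office}-team@example.com"],
        })
    return jobs
```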

It works less well when you need real-time dashboards, highly customized visualizations, or reports that require domain-specific statistical models. For those cases, DataStoryBot's chart generation capabilities can handle the visualization layer while you bring your own analysis.

What to Read Next

If you're new to the API, start with the getting started guide for authentication and setup details.

For more on what happens during the analysis step — how DataStoryBot profiles your CSV, detects column types, and identifies statistical patterns — read how to analyze a CSV file with AI.

And if you want to experiment before writing any code, try the DataStoryBot playground — upload a CSV and see the full pipeline in action, no API key required.

Ready to find your data story?

Upload a CSV and DataStoryBot will uncover the narrative in seconds.

Try DataStoryBot →