Build a RAG AI System in 5 Minutes with Cloudflare and Firecrawl

Retrieval Augmented Generation (RAG) systems are revolutionizing how we interact with AI by feeding it real-time, custom data. However, building a RAG system has traditionally been complex, requiring vector embeddings, vector databases, and sophisticated query processing. Today, I'll show you how to build a complete RAG system in under 5 minutes using Cloudflare's powerful tools and Firecrawl for data extraction.

What You'll Need for This RAG System Tutorial

Firecrawl: An open-source tool for scraping and cleaning website data
Cloudflare R2: S3-compatible object storage
Bun S3 client: For uploading data to R2
Cloudflare AutoRAG: For vectorizing and querying your data
MCP server: To connect your RAG system to Cursor or other AI tools

Step 1: Scraping Website Data with Firecrawl

First, we'll use Firecrawl to extract and clean website data. Firecrawl is perfect for RAG systems as it converts messy HTML into clean markdown format that's ideal for AI processing.

Documentation page that will be scraped and processed by our RAG system, showing a lightweight data fetching library's introduction and installation instructions

To get started with Firecrawl, install it via npm and create a simple script to crawl your target website:

JAVASCRIPT

import { crawlUrl } from 'firecrawl';

const url = 'https://yourdocumentation.com';

const result = await crawlUrl(url, {
  maxPages: 15,
  exportFormat: 'markdown'
});

console.log(result);

This code will crawl up to 15 pages from your target website and convert them to markdown format. The result includes not just the content but also valuable metadata like page title, description, and language information.

Terminal showing the execution of our RAG system code with metadata extraction and processing from the scraped documentation

Step 2: Storing Data in Cloudflare R2

Next, we need to store our scraped data in Cloudflare R2, which will serve as the data source for our RAG system. We'll use the Bun S3 client for this task since R2 is S3-compatible.

JAVASCRIPT

import { S3Client, PutObjectCommand } from '@aws-sdk/client-s3';

// Configure the S3 client for Cloudflare R2
const client = new S3Client({
  region: 'auto',
  endpoint: 'https://your-account-id.r2.cloudflarestorage.com',
  credentials: {
    accessKeyId: 'YOUR_ACCESS_KEY_ID',
    secretAccessKey: 'YOUR_SECRET_ACCESS_KEY'
  }
});

// Function to upload data to R2
async function uploadToR2(data, title) {
  const fileName = `${title.replace(/[^a-zA-Z0-9]/g, '-').toLowerCase()}.md`;
  
  await client.send(new PutObjectCommand({
    Bucket: 'your-bucket-name',
    Key: fileName,
    Body: data.content,
    ContentType: 'text/markdown'
  }));
  
  console.log(`Uploaded ${fileName} to R2`);
}

// Upload each page from Firecrawl results
for (const page of result.pages) {
  await uploadToR2(page, page.metadata.title);
}

This code creates an S3 client configured for Cloudflare R2, then defines a function to upload each page of our scraped content as a separate markdown file. The file names are derived from the page titles, ensuring they're easy to identify later.

Step 3: Setting Up Cloudflare AutoRAG

Now comes the magic part: Cloudflare's AutoRAG. This tool will automatically create vector embeddings from our markdown files and store them in Cloudflare's vector database, making our data queryable by AI models.

Cloudflare's AutoRAG dashboard showing the configured retrieval-augmented generation system ready to process queries against our scraped documentation data

To set up AutoRAG, navigate to the AI menu in your Cloudflare dashboard and follow these steps:

Create a new AutoRAG instance
Connect it to your R2 bucket containing the markdown files
Choose a model for vectorizing your data (the "Auto" option works well)
Select a model for retrieving responses (again, "Auto" is a good choice)
Name your RAG system and create it

Cloudflare will then index your data, creating vector embeddings and storing them in its vector database. This process may take a few minutes depending on the amount of data.

Step 4: Connecting to Cursor with an MCP Server

To make our RAG system truly useful, we can connect it to AI tools like Cursor using Cloudflare's MCP (Model Completion Provider) server. This allows us to query our custom knowledge base directly from within the AI interface.

JAVASCRIPT

// MCP server code for AutoRAG
import { Ai } from '@cloudflare/ai';

export default {
  async fetch(request, env) {
    const ai = new Ai(env.AI);
    const { messages } = await request.json();
    
    const response = await ai.run('@cf/meta/llama-2-7b-chat-int8', {
      messages,
      rag: {
        data_source: 'YOUR_AUTORAG_NAME'
      }
    });
    
    return new Response(JSON.stringify(response));
  }
};

Add this code as a global MCP server in your Cursor settings, and you'll be able to query your RAG system directly. The system will retrieve the most relevant information from your custom knowledge base and provide accurate answers based on your data.

Benefits of This RAG Implementation

Speed: Build a complete RAG system in under 5 minutes
Simplicity: No need to manually manage vector embeddings or databases
Accuracy: Get responses based on your specific data, not just general LLM knowledge
Up-to-date information: Easily refresh your knowledge base by re-crawling your sources
Integration: Works seamlessly with tools like Cursor through the MCP server

Practical Applications for Your RAG System

This quick RAG system setup has numerous practical applications for developers and organizations:

Internal documentation: Create an AI assistant that knows your company's specific processes and protocols
Customer support: Build a knowledge base from customer conversations to provide consistent support
Technical documentation: Keep up with the latest framework or library documentation
Research: Analyze and query large collections of papers or articles
Personal knowledge base: Create a system that understands your notes and references

Conclusion

Building a RAG AI system no longer requires extensive knowledge of vector databases or embedding models. With Cloudflare's AutoRAG, Firecrawl for data scraping, and R2 for storage, you can create a powerful custom knowledge base in minutes rather than days or weeks. This democratization of RAG technology opens up exciting possibilities for developers to create more accurate, context-aware AI applications with minimal effort.

The combination of website data scraping, vector embeddings, and AI retrieval creates a powerful system that can answer questions based on your specific data, making it invaluable for documentation, support, and knowledge management use cases.

How to Build a Powerful RAG AI System in Just 5 Minutes Using Cloudflare and Firecrawl

What You'll Need for This RAG System Tutorial

Step 1: Scraping Website Data with Firecrawl

Step 2: Storing Data in Cloudflare R2

Step 3: Setting Up Cloudflare AutoRAG

Step 4: Connecting to Cursor with an MCP Server

Benefits of This RAG Implementation

Practical Applications for Your RAG System

Conclusion

Let's Watch!

Build a RAG AI System in 5 Minutes with Cloudflare and Firecrawl

Ready to enhance your neural network?