NY English Messenger Bot — Bilingual AI Assistant for Facebook Messenger
AI Automation
Production

The production Facebook Messenger assistant for New York English — an executive English coaching studio in Guadalajara — answering prospective students 24/7 in English and Spanish.

The Problem

New York English reactivated its Facebook page as a marketing channel, and prospective students — Mexican professionals weighing executive English coaching — began sending Messenger questions about pricing, scheduling, methodology, and online vs. in-person classes, in both English and Spanish. The studio can't watch the inbox around the clock, prospects expect fast answers, and a generic auto-reply or a chatbot that invents course details would cost real enrollments.

The Solution

A two-layer router sends quick pleasantries to Claude Haiku for fast, low-cost replies and real questions to Claude Sonnet with RAG context — so answers about pricing, scheduling, and methodology are grounded in New York English's own bilingual content, never invented, and delivered in the prospect's detected language. When someone wants a person, Meta's Handover Protocol passes the thread to the studio's Page inbox and auto-resumes the bot after a 30-minute timeout. HMAC-verified webhooks, Sentry monitoring, and a /health endpoint keep it production-solid.

Key Features

  • 24/7 answers to prospective-student questions about NY English's pricing, scheduling, and methodology
  • Bilingual EN/ES with automatic language detection — Mexican professionals are answered in their own language
  • RAG-grounded responses pulled from New York English's real course and pricing content — never invented
  • Handover to the studio's Page inbox when a prospect is ready to enroll or needs a personal reply
  • Proven in production — the reference build the multi-tenant CushLabs Messenger platform was extracted from

Results

  • Instant, accurate answers on pricing, scheduling, and methodology, in the prospect's own language
  • Hours of repetitive Messenger triage reclaimed for the studio owner
  • RAG-grounded on real course content, with zero invented details
  • Production reference build the CushLabs Messenger platform was extracted from

Overview

NY English Messenger Bot is a production Cloudflare Worker that powers the Facebook Messenger assistant for the New York English page, an executive English coaching business based in Guadalajara, Mexico. It handles inquiries from students and prospective clients 24/7 with bilingual AI responses grounded in the studio's own content through retrieval-augmented generation (RAG).

The bot uses intelligent model routing: simple pleasantries (greetings, thanks) go to Claude Haiku for speed and cost efficiency, while knowledge questions go to Claude Sonnet with RAG context retrieved from a Cloudflare Vectorize index built from bilingual content files. Language is auto-detected on the first message and locked for the session to prevent mid-conversation flipping.
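
The detect-once-then-lock behavior can be sketched as follows. This is a minimal illustration under stated assumptions, not the bot's actual code: the `KVLike` interface, the `lang:` key prefix, and the keyword heuristic in `detectLanguage` are all hypothetical. Only the one-hour expiry is taken from the write-up.

```typescript
type Lang = "en" | "es";

// Hypothetical minimal view of a Cloudflare KV binding (only the calls this sketch uses).
interface KVLike {
  get(key: string): Promise<string | null>;
  put(key: string, value: string, options?: { expirationTtl?: number }): Promise<void>;
}

// Assumed heuristic: Spanish punctuation, accented letters, or a few common words.
const SPANISH_HINTS =
  /[áéíóúñ¿¡]|\b(hola|gracias|precio|precios|clases|quiero|cuesta|buenos|buenas)\b/i;

function detectLanguage(text: string): Lang {
  return SPANISH_HINTS.test(text) ? "es" : "en";
}

const LANGUAGE_LOCK_TTL_SECONDS = 60 * 60; // one hour, matching the stated session TTL

async function resolveLanguage(kv: KVLike, senderId: string, text: string): Promise<Lang> {
  const key = `lang:${senderId}`; // hypothetical key scheme
  const locked = await kv.get(key);
  if (locked === "en" || locked === "es") return locked; // already locked for this session

  // First message of the session: detect once, then store with a TTL so KV expires it.
  const detected = detectLanguage(text);
  await kv.put(key, detected, { expirationTtl: LANGUAGE_LOCK_TTL_SECONDS });
  return detected;
}
```

Because the lock is read before any detection runs, a Spanish-speaking prospect who later types "ok thanks" keeps getting Spanish replies until the key expires.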

This is the first deployment of a reusable CushLabs Messenger bot template. Once fully production-validated, the architecture will be forked into a standalone template repo for the CushLabs Facebook page and future client deployments.

The Challenge

  • 24/7 availability gap: The business owner can't monitor Messenger around the clock, but prospective students expect fast responses — especially from a page that's being reactivated as a marketing channel.
  • Bilingual market: The target audience is Mexican professionals who may write in English or Spanish. The bot needs to detect language automatically and respond in kind, without awkward language switching.
  • Accurate, grounded answers: Generic chatbots can hallucinate or give vague responses. Prospective clients asking about pricing, services, or methodology need answers drawn from actual business content, not invented ones.
  • Graceful human handover: Some conversations need a real person. The bot must recognize when to step aside and hand control to the business owner, then resume automatically after a timeout.

The Solution

Intelligent model routing: A two-layer routing system classifies incoming messages. Pleasantries (greetings, thanks, goodbyes) are handled by Claude Haiku for sub-second responses at minimal cost. Knowledge questions — anything about services, pricing, methodology, scheduling — trigger Claude Sonnet with full RAG context for accurate, grounded answers.
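
A rough sketch of such a router is shown below. The write-up does not say how the two layers are divided, so the split used here (a knowledge-keyword check first, then a pleasantry-only check, with Sonnet as the fallback) is an assumption, as are both regular expressions. The function returns a model tier rather than a concrete model ID, since the IDs in use are not stated.

```typescript
type Route = { tier: "haiku" | "sonnet"; useRag: boolean };

// Layer 1 (assumed): keywords or a question mark that signal a knowledge question, EN and ES.
const KNOWLEDGE_HINTS =
  /(pric|cost|schedul|method|class|course|online|precio|cuesta|horario|metodolog|clase|curso|\?)/i;

// Layer 2 (assumed): the whole message is nothing but a greeting, thanks, or goodbye.
const PLEASANTRY_ONLY =
  /^[¡\s]*(hi|hello|hey|thanks|thank you|bye|goodbye|hola|gracias|adiós|adios)[\s!.,]*$/i;

function routeMessage(text: string): Route {
  // Anything that looks like a real question gets the stronger model plus retrieval.
  if (KNOWLEDGE_HINTS.test(text)) return { tier: "sonnet", useRag: true };
  // Pure pleasantries take the fast, cheap path with no retrieval.
  if (PLEASANTRY_ONLY.test(text)) return { tier: "haiku", useRag: false };
  // Unclassified messages fall back to the grounded path, which is the safer default.
  return { tier: "sonnet", useRag: true };
}
```

Defaulting to the grounded path means a misclassified message costs a little more, but it is never answered without context.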

RAG-powered bilingual knowledge base: Bilingual content files covering New York English's services, pricing, and methodology are chunked, embedded via Cloudflare Workers AI (native binding, no external HTTP), and stored in Cloudflare Vectorize. At query time, the bot retrieves the most relevant chunks, filtered by the user's detected language, and injects them into Claude's system prompt.
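
The query-time half of that pipeline might look like the sketch below. The `Embedder` and `VectorIndex` interfaces are hypothetical stand-ins loosely modeled on the Workers AI and Vectorize bindings, not their real type definitions. The `lang` and `text` metadata fields, the `topK` default, and the prompt wording are likewise assumptions. Returning an empty context on failure mirrors the graceful-degradation behavior listed under Technical Highlights.

```typescript
type Lang = "en" | "es";

// Hypothetical stand-ins for the embedding and vector-index bindings.
interface Embedder {
  embed(text: string): Promise<number[]>;
}
interface VectorIndex {
  query(
    vector: number[],
    options: { topK: number; filter: { lang: Lang } },
  ): Promise<{ matches: { score: number; metadata?: { text?: string } }[] }>;
}

async function retrieveContext(
  embedder: Embedder,
  index: VectorIndex,
  question: string,
  lang: Lang,
  topK = 3,
): Promise<string[]> {
  try {
    const vector = await embedder.embed(question);
    // Restrict matches to chunks written in the user's locked language.
    const result = await index.query(vector, { topK, filter: { lang } });
    return result.matches
      .map((m) => m.metadata?.text ?? "")
      .filter((t) => t.length > 0);
  } catch {
    // Retrieval is best-effort: with no context the bot still answers from its base prompt.
    return [];
  }
}

function buildSystemPrompt(basePrompt: string, chunks: string[], lang: Lang): string {
  const languageRule = lang === "es" ? "Responde siempre en español." : "Always reply in English.";
  if (chunks.length === 0) return `${basePrompt}\n\n${languageRule}`;
  const context = chunks.map((c, i) => `[${i + 1}] ${c}`).join("\n");
  return `${basePrompt}\n\n${languageRule}\n\nAnswer only from this context:\n${context}`;
}
```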

Meta Handover Protocol: When a user asks for a human (detected by 27 regex patterns covering English and Spanish), the bot hands thread control to the Page Inbox. The business owner can reply from Meta Business Suite while the bot stays silent. The /bot-resume command restores bot control manually; otherwise a 30-minute inactivity timeout restores it automatically.
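
The handover step can be illustrated with a small sketch. The two patterns below are invented stand-ins for the 27 the write-up mentions, and `buildPassThreadControl`, its parameters, and the metadata string are hypothetical. The request targets Meta's `pass_thread_control` endpoint. The Page Inbox app ID is the one Meta's Handover Protocol documentation lists, to the best of my knowledge, and is worth verifying before use. The Graph API version is left as a parameter because the write-up does not state one.

```typescript
// Illustrative patterns only; the production bot is described as using 27.
const HUMAN_REQUEST_PATTERNS: RegExp[] = [
  /\b(human|real person|agent|representative|talk to someone)\b/i,
  /\b(humano|persona real|asesor|hablar con alguien)\b/i,
];

function wantsHuman(text: string): boolean {
  return HUMAN_REQUEST_PATTERNS.some((pattern) => pattern.test(text));
}

// App ID documented by Meta for the Page Inbox (assumption: confirm against current docs).
const PAGE_INBOX_APP_ID = 263902037430900;

interface GraphRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

// Builds the request without sending it, so the caller stays in charge of fetch() and errors.
function buildPassThreadControl(
  psid: string,
  pageAccessToken: string,
  graphVersion: string,
): GraphRequest {
  const token = encodeURIComponent(pageAccessToken);
  return {
    url: `https://graph.facebook.com/${graphVersion}/me/pass_thread_control?access_token=${token}`,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({
        recipient: { id: psid },
        target_app_id: PAGE_INBOX_APP_ID,
        metadata: "User asked for a human", // hypothetical note for the inbox
      }),
    },
  };
}
```

A caller would check `wantsHuman(text)`, send the request with `fetch(req.url, req.init)`, and then record the handoff state with the 30-minute TTL described under Technical Highlights.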

Production-grade security and observability: HMAC-SHA256 webhook signature verification (strict mode) rejects forged requests. Sentry monitors errors end-to-end. Cloudflare AI Gateway provides request logging and cost tracking. A /health endpoint enables uptime monitoring.
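
Signature verification of this kind can be sketched with the Web Crypto API that Workers expose. Meta sends the signature in the `X-Hub-Signature-256` header as `sha256=` followed by a hex digest of the raw request body, keyed with the app secret. The function name and the choice to reject malformed headers outright are assumptions about what "strict mode" means here.

```typescript
function hexToBytes(hex: string): Uint8Array {
  const bytes = new Uint8Array(hex.length / 2);
  for (let i = 0; i < bytes.length; i++) {
    bytes[i] = parseInt(hex.slice(i * 2, i * 2 + 2), 16);
  }
  return bytes;
}

// rawBody must be the exact bytes Meta sent; re-serialized JSON will not match.
async function verifyMetaSignature(
  rawBody: string,
  signatureHeader: string | null,
  appSecret: string,
): Promise<boolean> {
  if (!signatureHeader) return false; // strict: a missing header is a rejection
  const match = /^sha256=([0-9a-f]{64})$/i.exec(signatureHeader);
  if (!match) return false; // strict: malformed or wrong-algorithm headers are rejected

  const encoder = new TextEncoder();
  const key = await crypto.subtle.importKey(
    "raw",
    encoder.encode(appSecret),
    { name: "HMAC", hash: "SHA-256" },
    false,
    ["verify"],
  );
  // subtle.verify recomputes the MAC and compares it, avoiding a hand-rolled string compare.
  return crypto.subtle.verify("HMAC", key, hexToBytes(match[1]), encoder.encode(rawBody));
}
```

In a Worker this would run on `await request.text()` before any JSON parsing, returning a 4xx response when it yields `false`.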

Technical Highlights

  • No SDK abstractions: Raw fetch() calls to every external API (Anthropic, Meta, Cloudflare AI). Every request is explicit and debuggable, with no hidden middleware.
  • KV-backed session state with TTLs: Conversation history (1hr), language lock (1hr), and handoff state (30min) all auto-expire via Cloudflare KV TTLs — no cleanup jobs needed.
  • Graceful RAG degradation: If Vectorize is unreachable, the bot still responds using its system prompt — no crash, no error message to the user.
  • Immediate 200 response pattern: Meta requires webhook acknowledgment within 5 seconds. The Worker returns 200 immediately and processes the message asynchronously via ctx.waitUntil().
  • Two-Meta-app architecture: Separate Meta apps for the bot runtime (webhook signatures) and page administration (content management), enforcing least-privilege token separation.
  • Facebook Page admin CLI: A companion fb-admin.ts script manages page metadata, posts, and diagnostics programmatically via the Graph API — no manual dashboard clicking.
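
The immediate-200 pattern from the list above can be sketched as follows. `ExecutionContextLike` is a hypothetical one-method view of the Worker execution context, and `handleWebhookPost`, the `processEvent` callback, and the `EVENT_RECEIVED` body are illustrative names rather than details from the project.

```typescript
// Hypothetical minimal view of the Worker execution context.
interface ExecutionContextLike {
  waitUntil(promise: Promise<unknown>): void;
}

async function handleWebhookPost(
  request: Request,
  ctx: ExecutionContextLike,
  processEvent: (payload: unknown) => Promise<void>,
): Promise<Response> {
  const payload: unknown = await request.json();

  // Hand the slow work (routing, retrieval, model call, Send API) to the runtime.
  // waitUntil keeps the Worker alive for it after the response has been returned.
  ctx.waitUntil(
    processEvent(payload).catch((err) => {
      // Failures here can no longer change the HTTP response, so they are only logged.
      console.error("message processing failed", err);
    }),
  );

  // Acknowledge right away so Meta's webhook deadline is met regardless of model latency.
  return new Response("EVENT_RECEIVED", { status: 200 });
}
```

The trade-off is that any failure after the acknowledgment is invisible to Meta, which is one reason error monitoring matters in this design.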

Results

For the End User / Business:

  • Prospective students get instant, accurate answers about services, pricing, and scheduling — in their preferred language
  • The business owner reclaims hours previously spent answering repetitive Messenger inquiries
  • Human handover ensures complex conversations still get personal attention

Technical Demonstration:

  • Full Cloudflare Workers production deployment with KV state management and Workers AI integration
  • RAG pipeline: content ingestion, embedding, vector storage, retrieval, and context injection
  • Claude API integration with model routing for cost optimization
  • Meta Platform integration: webhooks, Send API, Handover Protocol, Graph API
  • Security hardening: HMAC signature verification, token separation, Sentry monitoring
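
As an illustration of the raw-fetch Claude integration listed above, the sketch below builds a request for Anthropic's Messages API without any SDK. It targets the direct API endpoint, whereas the project is described as routing through Cloudflare AI Gateway, so the URL in production would differ. The function names, the `maxTokens` default, and the `ChatTurn` type are assumptions, and the model ID is a parameter because the write-up does not state which IDs are used.

```typescript
interface ChatTurn {
  role: "user" | "assistant";
  content: string;
}

interface PreparedRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

function buildAnthropicRequest(
  apiKey: string,
  model: string, // supplied by the router; no specific model ID is assumed here
  system: string,
  history: ChatTurn[],
  maxTokens = 512,
): PreparedRequest {
  return {
    url: "https://api.anthropic.com/v1/messages",
    init: {
      method: "POST",
      headers: {
        "content-type": "application/json",
        "x-api-key": apiKey,
        "anthropic-version": "2023-06-01",
      },
      body: JSON.stringify({ model, max_tokens: maxTokens, system, messages: history }),
    },
  };
}

// The Messages API returns a list of content blocks; this joins the text ones.
function extractText(response: { content?: { type: string; text?: string }[] }): string {
  return (response.content ?? [])
    .filter((block) => block.type === "text" && typeof block.text === "string")
    .map((block) => block.text as string)
    .join("");
}
```

A caller would send it with `fetch(req.url, req.init)` and pass the parsed JSON to `extractText`, which keeps every header and field visible at the call site.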
