Product Announcement

Introducing Arcade

Your lab's AI-powered institutional memory. Ask natural-language questions about every experiment, route, dataset, and protocol your team has ever run.

February 2026 · Rasyn AI

Chemistry labs generate enormous amounts of data—retrosynthesis sessions, executed protocols, analytical datasets, accepted routes, sample inventories. But this knowledge is scattered across tools, spreadsheets, and the memories of individual researchers. When someone leaves, their institutional knowledge walks out the door.

Arcade changes this. It automatically indexes every piece of structured data your lab produces and makes it searchable through natural language. Ask “What were the highest-yielding Suzuki coupling conditions we've ever used?” and get a grounded answer with citations back to the original experiments.

No manual tagging. No data migration. Arcade ingests from your existing Rasyn workflows—sessions, routes, execute runs, datasets, and samples—and builds a unified knowledge graph that your entire team can query.

Architecture

From question to cited answer in seconds.

Question

Natural language query from the user

Query Planner

Claude Haiku parses intent, keywords, numeric filters, and chemical references

Hybrid Search

BM25 + vector similarity + numeric range filters executed in parallel

Evidence Chunks

Paragraph-level retrieval from protocol steps, observations, and notes

Answer Generation

Claude Sonnet synthesizes an answer grounded in retrieved evidence

Citations

Each claim linked back to source cards with full provenance chain

Retrieval

Three retrieval streams, one unified ranking.

Hybrid Search Architecture

Full-Text (BM25)

PostgreSQL GIN indexes with weighted fields: title (A), summary (B), content (C), tags (B). English stemming and stop-word removal.

Ranking weight: 40%

Vector Similarity

OpenAI text-embedding-3-small (1536 dims) with HNSW cosine index. Captures semantic meaning beyond exact keyword matches.

Ranking weight: 60%

Numeric Filters

Range queries on yield, conversion, purity, scale (g), temperature (°C), and time (min) stored in structured JSONB with indexed access.

Filter
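
As a rough illustration of how the three streams could come together, the sketch below applies numeric ranges as a hard filter and then orders the surviving cards by a 40/60 blend of text and vector scores. It is not Arcade's actual implementation: the `Candidate` type, the `rank` and `passes_filters` functions, and the assumption that both scores are already normalized to the range 0 to 1 are all hypothetical.

```python
from dataclasses import dataclass, field

TEXT_WEIGHT = 0.4    # full-text share stated in the announcement
VECTOR_WEIGHT = 0.6  # vector-similarity share stated in the announcement


@dataclass
class Candidate:
    """Hypothetical search hit; both scores are assumed normalized to [0, 1]."""
    card_id: str
    text_score: float
    vector_score: float
    metrics: dict = field(default_factory=dict)


def passes_filters(metrics, filters):
    """filters maps a metric name to an inclusive (low, high) range.

    A bound of None means "unbounded". Cards missing a filtered metric are dropped.
    """
    for name, (low, high) in filters.items():
        value = metrics.get(name)
        if value is None:
            return False
        if low is not None and value < low:
            return False
        if high is not None and value > high:
            return False
    return True


def rank(candidates, filters):
    """Filter first, then sort by the weighted blend of the two scores."""
    kept = [c for c in candidates if passes_filters(c.metrics, filters)]
    return sorted(
        kept,
        key=lambda c: TEXT_WEIGHT * c.text_score + VECTOR_WEIGHT * c.vector_score,
        reverse=True,
    )
```

Treating the numeric stream as a filter rather than a third score means a card outside the requested range never appears, however well it matches the text of the question.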

Data Sources

Every piece of data your lab produces, indexed.

Sessions

Full retrosynthesis planning sessions with target molecules, models used, and discovered routes

Accepted Routes

Curated synthesis routes with step-by-step reactions, conditions, and scoring

Execute Runs

Executed experiments with protocols, reagent lists, reaction conditions, and QC reports

Datasets

Analytical results (HPLC, NMR) with QC metrics and instrument metadata

Samples

Physical samples tracked in inventory with lot/batch numbers and storage info

Features

Built for chemistry, not generic search.

Hybrid Retrieval

Combines BM25 full-text search, vector similarity (OpenAI embeddings), and numeric range filters in a single query. Text and vector scores are weighted 40/60 in the final ranking, a split chosen for chemistry-domain queries.

AI Query Planning

Claude parses natural-language questions into structured query plans, recognizing chemical names, SMILES notation, reaction types, and percentage ranges automatically.

Citations & Provenance

Every answer is grounded in your actual lab records. Citations link back to source cards with full audit trails to the original experiment, route, or dataset.

Auto-Ingestion

Background hooks automatically index new sessions, accepted routes, executed experiments, and analytical datasets. No manual ETL required.

Metric-Aware Filtering

Numeric range queries on yield, conversion, purity, scale, temperature, and reaction time. Ask "experiments with yield > 70% and temp < 80°C" and get precise results.
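
To make the example question concrete, here is a minimal sketch of turning phrases such as "yield > 70%" into structured ranges with a regular expression. It is an illustration only: the `ALIASES` table, the `extract_filters` function, and the `(low, high)` output shape are assumptions, and the sketch ignores the difference between strict and inclusive comparisons.

```python
import re

# Hypothetical mapping from words in a question to metric names.
ALIASES = {
    "yield": "yield",
    "conversion": "conversion",
    "purity": "purity",
    "scale": "scale",
    "temp": "temperature",
    "temperature": "temperature",
    "time": "time",
}

# Matches "<word> <comparison> <number>", e.g. "yield > 70" or "temp <= 80.5".
PATTERN = re.compile(
    r"\b(?P<name>[a-z]+)\s*(?P<op><=|>=|<|>)\s*(?P<value>\d+(?:\.\d+)?)",
    re.IGNORECASE,
)


def extract_filters(question):
    """Return {metric: (low, high)}; None marks an open bound."""
    filters = {}
    for match in PATTERN.finditer(question):
        metric = ALIASES.get(match.group("name").lower())
        if metric is None:
            continue  # unknown word before the comparison, skip it
        low, high = filters.get(metric, (None, None))
        value = float(match.group("value"))
        if match.group("op") in (">", ">="):
            low = value   # lower bound (strictness ignored for brevity)
        else:
            high = value  # upper bound (strictness ignored for brevity)
        filters[metric] = (low, high)
    return filters


print(extract_filters("experiments with yield > 70% and temp < 80°C"))
```

Units such as "%" and "°C" are simply left unparsed here; a fuller version would need to normalize them before comparing against stored values.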

Molecule-Level Aggregation

Canonical SMILES indexing and Morgan fingerprint similarity enable structural queries. Find every experiment that has ever used a specific molecule or its close analogues.
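
Fingerprint similarity is commonly measured with the Tanimoto coefficient: the number of bits two fingerprints share divided by the number of bits set in either. The sketch below assumes that metric (the announcement does not name one) and assumes the fingerprints have already been computed elsewhere, for example by a cheminformatics toolkit. The function names and the 0.7 threshold are hypothetical.

```python
def bits_to_int(on_bits):
    """Pack a collection of set-bit positions into one Python int."""
    fp = 0
    for bit in on_bits:
        fp |= 1 << bit
    return fp


def tanimoto(fp_a, fp_b):
    """Tanimoto similarity of two fingerprints stored as ints (one bit per feature)."""
    union = bin(fp_a | fp_b).count("1")
    if union == 0:
        return 0.0  # two empty fingerprints: define similarity as zero
    return bin(fp_a & fp_b).count("1") / union


def close_analogues(query_fp, library, threshold=0.7):
    """Return (name, similarity) pairs at or above the threshold, best first.

    `library` maps a molecule name to its fingerprint; the 0.7 cutoff is an
    illustrative assumption, not a value from the announcement.
    """
    hits = [(name, tanimoto(query_fp, fp)) for name, fp in library.items()]
    hits = [hit for hit in hits if hit[1] >= threshold]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)
```

With 1024-bit fingerprints every bit position would fall between 0 and 1023; the arithmetic is the same at any length.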

Under the Hood

Technical details.

Database Architecture

Arcade uses PostgreSQL with the pgvector extension, giving us full-text search (GIN indexes), vector similarity (HNSW with cosine distance), and structured JSONB queries in a single database. Six core tables handle the data model:

arcade_cards

Primary search documents. One card per canonical entity with title, summary, content, embedding (1536-dim), key_metrics (JSONB), tags, and molecule references.

arcade_chunks

Paragraph-level evidence chunks for deep retrieval. Protocol steps are grouped into chunks of 3–5 steps, each with its own embedding.
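
One way to keep every chunk within the 3–5 step range is to cut the protocol into groups of four and then repair a short tail. The sketch below shows that approach; `chunk_steps` and its parameters are hypothetical, and Arcade's actual grouping rule may differ.

```python
def chunk_steps(steps, target=4, min_size=3, max_size=5):
    """Split steps into consecutive chunks, keeping sizes within min_size..max_size.

    Protocols shorter than min_size come back as a single, smaller chunk.
    """
    chunks = [steps[i:i + target] for i in range(0, len(steps), target)]
    if len(chunks) >= 2 and len(chunks[-1]) < min_size:
        tail = chunks.pop()
        if len(chunks[-1]) + len(tail) <= max_size:
            # Short tail fits into the previous chunk.
            chunks[-1].extend(tail)
        else:
            # Otherwise rebalance the last two chunks so both reach min_size.
            merged = chunks.pop() + tail
            split = len(merged) - min_size
            chunks.extend([merged[:split], merged[split:]])
    return chunks


protocol = [f"step {n}" for n in range(1, 11)]
for chunk in chunk_steps(protocol):
    print(chunk)
```

Each resulting chunk would then be embedded on its own, so a question about one stage of a long protocol can match that stage directly.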

arcade_molecules

Canonical molecule entries with SMILES, InChIKey, computed properties, and Morgan fingerprints (1024-bit) for structural similarity.

arcade_events

Full audit log of every ingest, update, delete, reindex, and embed operation with timing and error tracking.

arcade_interactions

User behavior tracking (views, clicks, pins, copies) for future ranking improvements.

arcade_conversations

Persistent multi-turn chat history with linked source cards for provenance.

LLM Stack

Arcade uses three AI models, each chosen for a specific role in the pipeline:

Query Planning

Claude Haiku

Parses natural language into structured query plans. Fast and cheap for high-frequency calls.

Embeddings

text-embedding-3-small

1536-dimension vectors at $0.02/1M tokens. Batched in groups of 100 with 30K char truncation.
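
The batching and truncation described above can be sketched in a few lines. This is an illustration under assumptions: "30K" is read as 30,000 characters, the function names are hypothetical, and the call to the embedding API itself is left out.

```python
MAX_CHARS = 30_000   # assumed reading of "30K char truncation"
BATCH_SIZE = 100     # batch size stated in the announcement


def prepare_batches(texts, batch_size=BATCH_SIZE, max_chars=MAX_CHARS):
    """Truncate each text, then split the list into batches for embedding."""
    truncated = [text[:max_chars] for text in texts]
    return [
        truncated[i:i + batch_size]
        for i in range(0, len(truncated), batch_size)
    ]


def estimate_cost_usd(total_tokens, price_per_million=0.02):
    """Embedding cost at the quoted $0.02 per 1M tokens."""
    return total_tokens / 1_000_000 * price_per_million
```

At the quoted price, embedding 5 million tokens would cost about $0.10.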

Answer Generation

Claude Sonnet

Synthesizes grounded answers from retrieved evidence. Answers follow an enforced citation format and stay within what the evidence supports.

Graceful Degradation

Arcade is designed to work even when external services are unavailable. If OpenAI's embedding API is down, search falls back to BM25-only text matching. If Claude is unavailable, query planning uses regex-based keyword extraction and answers return raw search results instead of synthesized responses. Ingestion continues even if individual cards fail to embed, and cards are flagged for re-embedding once the service recovers.
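
The embedding fallback can be pictured as a try/except around the embedding call. The sketch below is a hypothetical illustration, not Arcade's code: `hybrid_search` and the three callables passed to it (`embed`, `text_search`, `vector_search`) are assumed names, and scores are assumed to be dictionaries keyed by card ID.

```python
def hybrid_search(query, embed, text_search, vector_search):
    """Return (mode, scores), degrading to text-only search if embedding fails.

    All three callables are supplied by the caller; `text_search` and
    `vector_search` are assumed to return {card_id: score} dictionaries.
    """
    text_scores = text_search(query)
    try:
        query_vector = embed(query)
    except Exception:
        # Embedding service unavailable: fall back to text matching only.
        return "bm25_only", text_scores

    vector_scores = vector_search(query_vector)
    combined = {}
    for card_id in set(text_scores) | set(vector_scores):
        combined[card_id] = (
            0.4 * text_scores.get(card_id, 0.0)
            + 0.6 * vector_scores.get(card_id, 0.0)
        )
    return "hybrid", combined
```

Returning the mode alongside the scores lets the caller tell the user, or a log, that results came from the degraded path.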

Start building your lab's institutional memory.

Arcade is available now on all Rasyn plans. Every experiment you run, every route you accept, every dataset you upload is automatically indexed and searchable.

Get started free

No credit card required
