LeetRule

Real-world AI configuration challenges. Write prompts, run tests, iterate until every edge case passes.

55 challenges: 21 easy, 23 medium, 11 hard

Capital City Responder

easy · Prompt

You're wiring up a capital city lookup for a legacy pipeline. The system expects minimal, machine-readable responses — no fluff, no sentences. Some questions are straightforward. Others are ambiguous or nonsensical. The pipeline needs to handle both gracefully.

14 tests · L1

Country Info JSON

medium · Prompt

An internal API returns country information as JSON. The backend deserializes the response directly, so it must be valid JSON with a consistent shape every time. The system handles various inputs.

14 tests · L2

Context-Only Question Answering

medium · Prompt

A retrieval-augmented QA system. The model receives context from a knowledge base along with a user question. It should answer based only on what's in the context. Outside knowledge isn't trusted here. When the context doesn't support an answer, the system needs a clear signal.

12 tests · L2

Safe Command Filter

hard · Prompt

A safety filter that gates requests before they reach the main AI system. It classifies each request and emits a routing decision. The downstream infrastructure only understands two states. No explanations, no hedging — just the decision.

24 tests · L3

Sum of Squares Program

medium · Python

The model generates Python code that gets executed in a pipeline. Input comes via stdin, output goes to stdout. The pipeline expects exact numeric output — nothing extra. The code should handle basic arithmetic on integers.

8 tests · L2

Planner & Coder Math Solver

hard · Multi-Agent

A two-agent pipeline for solving math word problems. The first agent thinks through the problem, the second produces the final answer. Downstream systems only care about the numeric result from the second agent — no labels, no formatting.

15 tests · L3

Sentiment Classifier

easy · Prompt

A customer feedback pipeline needs sentiment classification. Each piece of feedback gets routed based on sentiment. The system only understands three labels. Mixed signals and sarcasm are common.

17 tests · L1

Email Subject Generator

easy · Prompt

An email automation system needs subject lines generated from email bodies. Subjects should be concise and informative. The downstream system has strict length limits and format requirements.

12 tests · L1

Unit Converter

easy · Prompt

A measurement conversion API. Takes a value with a unit and converts it. Output must be just the number — the calling system adds the unit label itself.

4 tests · L1

Date Parser

medium · Prompt

A date normalization service. Takes various natural language date expressions and converts them to a standard format. The backend database expects a specific format.

5 tests · L2

Language Detector

easy · Prompt

A language detection endpoint for a translation pipeline. Returns ISO language codes. The routing system downstream only understands specific codes.

6 tests · L1

One-Line Summarizer

medium · Prompt

A summarization endpoint for a news aggregator. Takes article text and produces a single-sentence summary. The UI has limited space — brevity is essential.

3 tests · L2

FizzBuzz Generator

medium · Python

Classic programming challenge as a code generation task. The model writes Python that processes input and produces exact output. Edge cases matter. Format is strict.

7 tests · L2

Word Counter

medium · Python

Text analysis tool. Reads text and counts words. The pipeline expects a specific output format for downstream processing.

4 tests · L2

Intent Classifier

medium · Prompt

A chatbot's NLU layer. Messages get classified into intents before reaching specialized handlers. Ambiguous messages, typos, and off-topic queries are common. The routing system only understands specific labels.

19 tests · L2

Priority Tagger

hard · Prompt

A ticket triage system. Support tickets get priority levels assigned based on content. The queue management system routes by priority. False urgency, spam, and ambiguous requests are common.

20 tests · L3

Code Review Pipeline

hard · Multi-Agent

A two-agent code review system. The first agent reviews code and identifies issues. The second agent fixes the code. The final output should be working code only — no commentary, no markdown.

8 tests · L3

Log Level Normalizer

easy · Prompt

An observability pipeline needs log messages mapped to standard levels so routing rules stay simple. The model reads a raw log line and outputs a single normalized level token that downstream systems understand.

10 tests · L1

HTTP Status Mapper

easy · Prompt

An API gateway needs textual error descriptions mapped to HTTP status codes. The model sees a short summary of what happened and responds with a single numeric status code.

10 tests · L1

API Contract Validator

medium · Prompt

An internal tool checks whether incoming JSON payloads match a strict contract. The model receives a JSON string and must output either VALID or INVALID so the gateway can accept or reject the request.

10 tests · L2

Feature Flag Evaluator

medium · Prompt

A feature flag service decides whether a flag is enabled for a given user and environment. The model receives a small JSON context and must respond with ENABLED or DISABLED so callers can gate behavior.

9 tests · L2

Rate Limit Decider

medium · Prompt

An API edge proxy decides what to do with each incoming request based on quota usage. The model reads a small JSON record and outputs ALLOW, THROTTLE, or BLOCK so the proxy can react.

8 tests · L2

Alert Router

medium · Prompt

An incident management system needs to decide where each alert should go. The model reads an alert description and outputs one of ONCALL, TICKETING, or IGNORE so the system can route it.

9 tests · L2

Rollout Strategy Decider

medium · Prompt

A deployment planner chooses rollout strategies based on risk and blast radius. The model reads a short change description and outputs one of SIMPLE, CANARY, or BLUE_GREEN for the orchestrator.

8 tests · L2

Log Redactor

hard · Prompt

A logging pipeline must strip sensitive data before logs leave the cluster. The model receives a raw log line and must return a redacted version, replacing secrets with placeholders while leaving the rest intact.

5 tests · L3

SQL Query Classifier

hard · Prompt

A database firewall classifies incoming SQL before deciding how to handle it. The model sees a single SQL statement and must output READ_ONLY, MUTATING, DDL, or UNKNOWN.

10 tests · L3

Experiment Bucketing

medium · Prompt

An experimentation platform assigns users deterministically to experiment variants. The model reads a small JSON payload and outputs CONTROL or TREATMENT based on a stable bucketing strategy.

5 tests · L2
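
One common way to get a stable assignment like the one described for Experiment Bucketing is to hash the user and experiment identifiers and map the hash to a bucket. The sketch below is only an illustration: the `user_id` and `experiment` parameters and the even 50/50 split are assumptions, since the challenge does not state its payload shape or its actual strategy.

```python
import hashlib


def assign_variant(user_id: str, experiment: str) -> str:
    """Map a (user, experiment) pair to a variant, identically on every call."""
    # Hypothetical key format; any fixed, unambiguous encoding would work.
    key = f"{experiment}:{user_id}".encode("utf-8")
    digest = hashlib.sha256(key).digest()
    # Use the first 8 bytes as an integer and reduce it to a bucket in 0..99.
    bucket = int.from_bytes(digest[:8], "big") % 100
    # Assumed split: buckets 0..49 are TREATMENT, 50..99 are CONTROL.
    return "TREATMENT" if bucket < 50 else "CONTROL"


print(assign_variant("user-42", "new-checkout"))
```

Because the result depends only on the inputs, the same user lands in the same variant across requests and processes, which Python's built-in `hash()` would not guarantee for strings.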

Regex Pattern Generator

hard · Prompt

Your team's junior dev keeps asking you to write regex patterns for them. You're tired of it, so you're building a tool that generates regex from plain English. The catch? It needs to output raw patterns that go straight into code.

10 tests · L3

Error Message Formatter

easy · Prompt

Your product manager is furious. Users keep seeing cryptic error messages like "ECONNREFUSED 127.0.0.1:5432" and flooding support with tickets. You need to build a translator that converts these technical nightmares into friendly messages that won't scare grandma.

10 tests · L1

Git Commit Message Writer

easy · Prompt

Every code review, your tech lead leaves the same comment: "Please follow conventional commits." You're automating this once and for all. The tool takes a description of changes and spits out a proper commit message. One line. No excuses.

12 tests · L1

Natural Language to SQL

hard · Prompt

The business team keeps asking for "quick data pulls" but refuses to learn SQL. You're building a translator so they can ask questions in plain English. The database has three tables: users(id, name, email, created_at), orders(id, user_id, amount, status, created_at), products(id, name, price, category). Raw SQL only — no markdown, no explanations.

12 tests · L3

Cron Expression Translator

medium · Prompt

Nobody on the ops team can read cron expressions except Dave, and Dave is on vacation. You're building a translator so everyone can understand when scheduled jobs run. Turn cryptic cron syntax into plain English that anyone can understand.

11 tests · L2

Semantic Version Comparator

easy · Prompt

Your package manager's auto-update logic is broken and it's your fault. The fix is simple: compare two semantic versions and say which one wins. GREATER, LESS, or EQUAL. That's it. Your CI pipeline depends on this.

10 tests · L1
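
For reference, the comparison the Semantic Version Comparator asks for can be sketched in a few lines. This version assumes plain `MAJOR.MINOR.PATCH` inputs; pre-release tags and build metadata are not covered, and `compare_versions` is an illustrative name rather than anything the challenge defines.

```python
def compare_versions(a: str, b: str) -> str:
    """Compare two dotted numeric versions field by field."""
    # Assumption: every field is a plain integer (no "-rc.1" or "+build" suffix).
    left = tuple(int(part) for part in a.strip().split("."))
    right = tuple(int(part) for part in b.strip().split("."))
    # Tuples compare element by element, so 1.10.0 correctly beats 1.9.3.
    if left > right:
        return "GREATER"
    if left < right:
        return "LESS"
    return "EQUAL"


print(compare_versions("1.10.0", "1.9.3"))  # GREATER
```

Converting each field to an integer matters here: comparing the raw strings would rank "1.9.3" above "1.10.0".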

Currency Amount Formatter

easy · Prompt

The finance team exports numbers like "1234.5" and expects them to look like "$1,234.50" in reports. They're tired of doing it manually in Excel. Build a formatter that handles multiple currencies and gets the commas, decimals, and symbols right.

10 tests · L1
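
The "1234.5" to "$1,234.50" transformation from the Currency Amount Formatter can be sketched with the standard `decimal` module. The symbol table, the symbol-before-amount layout, and the half-up rounding are all assumptions; the challenge does not list its currencies or locale rules.

```python
from decimal import Decimal, ROUND_HALF_UP

# Hypothetical symbol table; extend as needed.
SYMBOLS = {"USD": "$", "EUR": "€", "GBP": "£"}


def format_amount(raw: str, currency: str = "USD") -> str:
    """Format a numeric string with a symbol, thousands commas, and 2 decimals."""
    # Decimal avoids binary floating-point surprises such as 2.675 rounding down.
    amount = Decimal(raw).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
    return f"{SYMBOLS[currency]}{amount:,.2f}"


print(format_amount("1234.5"))  # $1,234.50
```

Currencies that place the symbol after the amount, or that use a different number of decimals, would need extra handling that this sketch leaves out.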

Phone Number Validator

medium · Prompt

Users keep entering phone numbers in creative ways: with dashes, dots, parentheses, country codes, or just mashing random digits. Your signup form needs to know: is this actually a phone number? VALID or INVALID. No maybes.

12 tests · L2

Email Address Validator

medium · Prompt

Your signup form accepts literally anything as an email. Last week someone registered as "notanemail" and now your mailing list is broken. Time to add validation. Is it VALID or INVALID? Be reasonable but don't let garbage through.

14 tests · L2

Password Strength Checker

easy · Prompt

Security audit time. Half your users have "password123" as their password. You need a strength checker that rates passwords as WEAK, MEDIUM, or STRONG. Consider length, variety, and whether they're using obvious patterns.

12 tests · L1

Markdown to Plain Text

medium · Prompt

Your CMS stores everything in Markdown, but the mobile team needs plain text for push notifications. They can't have "**SALE TODAY**" showing up on someone's lock screen. Strip the formatting, keep the message.

12 tests · L2

Time Zone Converter

medium · Prompt

Your remote team is spread across 8 time zones and nobody can figure out when the standup actually is. Build a converter that takes a time in one zone and tells you what time that is elsewhere. 24-hour format, no fluff.

10 tests · L2

URL Component Extractor

medium · Prompt

Your analytics pipeline needs to break down URLs into pieces: domain, path, query params, fragments. The upstream service sends messy URLs and expects clean extractions.

12 tests · L2
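
As a point of comparison for the URL Component Extractor, Python's standard `urllib.parse` module already splits a URL into the four pieces named in the description. The output dictionary shape below is an assumption; the challenge does not say what format its tests expect.

```python
from urllib.parse import parse_qs, urlsplit


def url_components(url: str) -> dict:
    """Split a URL into domain, path, query parameters, and fragment."""
    parts = urlsplit(url.strip())
    return {
        # hostname is lowercased and excludes any port or credentials.
        "domain": parts.hostname or "",
        "path": parts.path,
        # parse_qs keeps repeated keys by mapping each key to a list of values.
        "query": parse_qs(parts.query),
        "fragment": parts.fragment,
    }


print(url_components("https://Example.com:8080/a/b?x=1&y=2&x=3#top"))
```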

File Type Classifier

easy · Prompt

Your file upload system needs to sort incoming files into buckets: IMAGE, VIDEO, AUDIO, DOCUMENT, CODE, ARCHIVE, or OTHER. The icons, preview generation, and storage paths all depend on getting this right.

16 tests · L1

Color Code Converter

easy · Prompt

Your design team uses hex codes, your CSS-in-JS library wants RGB, and the animation team needs HSL. Everyone's manually converting colors and making mistakes. Build a converter that speaks all formats fluently.

10 tests · L1
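
The hex, RGB, and HSL conversions mentioned for the Color Code Converter can be checked against a small reference using the standard `colorsys` module. The function names and the rounding to whole degrees and percentages are assumptions; the challenge does not state its expected output format.

```python
import colorsys


def hex_to_rgb(code: str) -> tuple[int, int, int]:
    """Convert '#RRGGBB' or the '#RGB' shorthand to an (r, g, b) tuple."""
    digits = code.strip().lstrip("#")
    if len(digits) == 3:
        # Expand shorthand: 'f0a' becomes 'ff00aa'.
        digits = "".join(ch * 2 for ch in digits)
    return tuple(int(digits[i:i + 2], 16) for i in (0, 2, 4))


def rgb_to_hsl(r: int, g: int, b: int) -> tuple[int, int, int]:
    """Convert 0-255 RGB to (hue in degrees, saturation %, lightness %)."""
    # colorsys works on 0..1 floats and returns hue, LIGHTNESS, saturation.
    h, l, s = colorsys.rgb_to_hls(r / 255, g / 255, b / 255)
    return round(h * 360), round(s * 100), round(l * 100)


print(hex_to_rgb("#FF0000"))   # (255, 0, 0)
print(rgb_to_hsl(255, 0, 0))   # (0, 100, 50)
```

Note the argument order: `colorsys` uses HLS rather than HSL, which is an easy place to swap saturation and lightness by mistake.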

Prime Number Checker

easy · Python

The math tutoring app needs a prime checker for its number theory module. Kids enter a number, and the system tells them if it's prime. Simple stdin/stdout contract. Handle the edge cases or the 8-year-olds will find them.

10 tests · L1
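
The edge cases the Prime Number Checker hints at are mostly the inputs below 2. A minimal trial-division sketch follows; it returns a boolean and leaves the stdout wording open, because the challenge does not say what string it expects.

```python
from math import isqrt


def is_prime(n: int) -> bool:
    """Trial division up to the integer square root of n."""
    if n < 2:
        return False  # 0, 1, and negatives are not prime
    if n < 4:
        return True   # 2 and 3
    if n % 2 == 0:
        return False
    # Only odd divisors up to sqrt(n) need checking.
    for divisor in range(3, isqrt(n) + 1, 2):
        if n % divisor == 0:
            return False
    return True


print([n for n in range(20) if is_prime(n)])  # [2, 3, 5, 7, 11, 13, 17, 19]
```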

Palindrome Checker

easy · Python

The word game app needs a palindrome detector. Players submit words and phrases, and the game checks if they read the same forwards and backwards. Ignore case and punctuation — "A man, a plan, a canal: Panama!" should pass.

10 tests · L1
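
The rule stated for the Palindrome Checker (ignore case and punctuation) comes down to filtering the input before comparing it with its reverse. A minimal sketch, assuming that only letters and digits count and that an empty result is treated as a palindrome:

```python
def is_palindrome(text: str) -> bool:
    """Compare the alphanumeric characters of text, case-insensitively."""
    # Keep letters and digits only; spaces and punctuation are dropped.
    cleaned = [ch.lower() for ch in text if ch.isalnum()]
    return cleaned == cleaned[::-1]


print(is_palindrome("A man, a plan, a canal: Panama!"))  # True
print(is_palindrome("Hello"))                            # False
```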

Binary Search Implementation

medium · Python

The search service is too slow with linear scan on sorted data. Implement binary search: given a sorted list and a target, find the index. First line is the sorted list, second line is the target. Print the 0-based index or -1 if not found.

11 tests · L2
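
The Binary Search input contract (sorted list on the first line, target on the second) can be sketched as follows. The description does not say how the list is delimited, so this version assumes space-separated integers; `solve` is a hypothetical helper that takes the full stdin text so the logic can be tested without a real pipe.

```python
def binary_search(items: list[int], target: int) -> int:
    """Return the 0-based index of target in sorted items, or -1."""
    lo, hi = 0, len(items) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if items[mid] == target:
            return mid
        if items[mid] < target:
            lo = mid + 1   # target can only be in the right half
        else:
            hi = mid - 1   # target can only be in the left half
    return -1


def solve(stdin_text: str) -> str:
    """Parse 'list line, target line' input and return the text to print."""
    lines = stdin_text.splitlines()
    # Assumption: integers separated by whitespace; a blank first line is an empty list.
    items = [int(token) for token in lines[0].split()]
    target = int(lines[1])
    return str(binary_search(items, target))


print(solve("1 3 5 7 9\n7\n"))  # 3
```

In a pipeline this would be called as `print(solve(sys.stdin.read()))`. With duplicate values the function returns some matching index, not necessarily the first one.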

Fibonacci Generator

easy · Python

The coding bootcamp curriculum needs a Fibonacci generator for teaching recursion vs iteration. Given N, output the Nth Fibonacci number. Use 0-indexing: F(0)=0, F(1)=1, F(2)=1. Keep it fast enough for large N.

10 tests · L1
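
The 0-indexed definition in the Fibonacci Generator description, F(0)=0, F(1)=1, F(2)=1, maps directly onto an iterative loop. This is a sketch of the iterative approach only; how large "large N" gets in the tests is not stated.

```python
def fib(n: int) -> int:
    """Return the nth Fibonacci number with F(0)=0 and F(1)=1."""
    a, b = 0, 1
    # After k iterations, a holds F(k) and b holds F(k+1).
    for _ in range(n):
        a, b = b, a + b
    return a


print(fib(10))  # 55
```

The loop runs in linear time and constant space, which avoids both the exponential blow-up and the recursion limit of the naive recursive version.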

Anagram Checker

easy · Python

The word puzzle game needs an anagram detector. Two words enter, one answer leaves: YES if they're anagrams, NO if they're not. Ignore case and spaces. "Listen" and "Silent" are anagrams. "Hello" and "World" are not.

11 tests · L1
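
The Anagram Checker rule (ignore case and spaces, answer YES or NO) amounts to comparing letter counts. A minimal sketch; how the two words arrive on stdin is not specified, so the function takes them as arguments.

```python
from collections import Counter


def are_anagrams(first: str, second: str) -> str:
    """Return 'YES' if both inputs use the same letters the same number of times."""
    def letter_counts(text: str) -> Counter:
        # Normalize: drop spaces and ignore case, as the challenge requires.
        return Counter(text.replace(" ", "").lower())

    return "YES" if letter_counts(first) == letter_counts(second) else "NO"


print(are_anagrams("Listen", "Silent"))  # YES
print(are_anagrams("Hello", "World"))    # NO
```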

Bug Report Analyzer Pipeline

hard · Multi-Agent

Your on-call rotation is drowning in bug reports. Some are critical security issues, others are just someone complaining about font sizes. Build a two-agent triage system: one analyzes the bug, the other assigns priority. Final output: CRITICAL, HIGH, MEDIUM, or LOW.

11 tests · L3

Translation Quality Pipeline

hard · Multi-Agent

The localization team is overwhelmed. You're building an AI translation pipeline: one agent does the initial translation, another reviews and polishes it. The final output should be just the clean translated text — no notes, no "here's my translation," just the words.

8 tests · L3

Data Extraction Pipeline

hard · Multi-Agent

Your data team gets unstructured text dumps from partners and manually copies info into spreadsheets. Build an ETL pipeline: one agent extracts the key fields, another normalizes them into clean JSON. No markdown code blocks, just raw JSON.

8 tests · L3

JSON Path Value Extractor

medium · Prompt

The API returns deeply nested JSON and your frontend team is tired of writing obj.data.results[0].user.email. Build a path extractor: give it JSON and a dot-notation path, get back just the value. No wrapper, no quotes around strings, just the raw value.

11 tests · L2
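
A path such as `data.results[0].user.email` from the JSON Path Value Extractor description can be walked with a small tokenizer. The sketch below assumes paths contain only dot-separated keys and `[n]` list indexes; `extract` and `render` are illustrative names, and the behavior for missing paths (here, an exception) is not something the challenge specifies.

```python
import json
import re

# A token is either a key (no dots or brackets) or a bracketed list index.
TOKEN = re.compile(r"([^.\[\]]+)|\[(\d+)\]")


def extract(document: str, path: str):
    """Follow a dot-notation path through parsed JSON and return the value."""
    value = json.loads(document)
    for key, index in TOKEN.findall(path):
        value = value[int(index)] if index else value[key]
    return value


def render(value) -> str:
    """Strings are returned without quotes; everything else as JSON text."""
    return value if isinstance(value, str) else json.dumps(value)


doc = '{"data": {"results": [{"user": {"email": "a@example.com"}}]}}'
print(render(extract(doc, "data.results[0].user.email")))  # a@example.com
```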

HTML to Plain Text

easy · Prompt

The email service stores messages as HTML but the SMS gateway needs plain text. You're building the converter. Strip all tags, keep the content. No one wants to receive "<b>URGENT</b>" in a text message.

11 tests · L1
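
For the HTML to Plain Text task, tag stripping can be sketched with the standard `html.parser` module instead of regular expressions. Collapsing whitespace runs into single spaces is an assumption here, and the sketch does not skip `<script>` or `<style>` contents or insert breaks between block elements.

```python
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Collect text nodes and ignore every tag."""

    def __init__(self) -> None:
        # convert_charrefs=True decodes entities such as &amp; into text.
        super().__init__(convert_charrefs=True)
        self.parts: list[str] = []

    def handle_data(self, data: str) -> None:
        self.parts.append(data)


def html_to_text(markup: str) -> str:
    collector = TextCollector()
    collector.feed(markup)
    collector.close()  # flush any buffered trailing text
    # Assumption: collapse runs of whitespace into single spaces.
    return " ".join("".join(collector.parts).split())


print(html_to_text("<b>URGENT</b>"))  # URGENT
```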

Base64 Encoder/Decoder

easy · Prompt

The legacy API only accepts Base64-encoded payloads, and the new API returns plain text. You need a bidirectional converter. Encode or decode based on the instruction. Just the result, nothing else.

11 tests · L1
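
Expected outputs for the Base64 Encoder/Decoder can be checked against the standard `base64` module. The `"encode"` and `"decode"` instruction strings and the UTF-8 text encoding are assumptions; the challenge does not show how its instructions are worded.

```python
import base64


def convert(instruction: str, payload: str) -> str:
    """Encode text to Base64 or decode Base64 back to text."""
    if instruction == "encode":
        return base64.b64encode(payload.encode("utf-8")).decode("ascii")
    if instruction == "decode":
        # validate=True rejects characters outside the Base64 alphabet.
        return base64.b64decode(payload, validate=True).decode("utf-8")
    raise ValueError(f"unknown instruction: {instruction!r}")


print(convert("encode", "Hello, World!"))         # SGVsbG8sIFdvcmxkIQ==
print(convert("decode", "SGVsbG8sIFdvcmxkIQ=="))  # Hello, World!
```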

URL Slug Generator

easy · Prompt

The blog CMS needs URL-friendly slugs for article titles. "What's New in JavaScript 2024?" becomes "whats-new-in-javascript-2024". Lowercase, hyphens, no special characters. The SEO team is counting on you.

12 tests · L1
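
The slug example in the URL Slug Generator description implies that apostrophes are removed outright while other separators become hyphens. A minimal sketch consistent with that one example follows; how accented or non-Latin characters should be handled is not stated, so this version simply drops them.

```python
import re


def slugify(title: str) -> str:
    """Lowercase the title and join its alphanumeric runs with hyphens."""
    text = title.lower()
    # Drop apostrophes first so "what's" becomes "whats", not "what-s".
    text = re.sub(r"['’]", "", text)
    # Replace every other run of non-alphanumeric characters with one hyphen.
    text = re.sub(r"[^a-z0-9]+", "-", text)
    return text.strip("-")


print(slugify("What's New in JavaScript 2024?"))  # whats-new-in-javascript-2024
```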

YAML to JSON Converter

medium · Prompt

DevOps writes config in YAML, the app reads JSON. The deploy script needs a converter.

10 tests · L2