
📊 Detailed Evaluation Reports

Complete transparency: C source, Rust output, compilation errors, and scoring details

â„šī¸ About These Reports: Each report includes the original C code, generated Rust code from both Claude and Gemini, full compilation output (including errors), and detailed scoring breakdowns across all 6 evaluation dimensions.

Latest Evaluations (2026-02-16): With Real Compilation

string_utils

Latest | Both Compiled

String utility functions: duplicate, trim, lowercase, concat, char count

Claude: 76/100 | Gemini: 81/100

View Report →

buffer

Latest | Both Compiled

Dynamic buffer with growing byte array management

Claude: 77/100 | Gemini: 66/100

View Report →

hashmap

Latest | Both Failed

Hash map with separate chaining (borrow checker challenge)

Claude: 53/100 (E0506) | Gemini: 52/100 (E0277, E0506)

View Report →
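
For context on the hashmap result: E0506 is the Rust compiler's "cannot assign to a value because it is borrowed" error. The sketch below is a hypothetical illustration, not code from either model's output or from the original C source. It shows one way a separately chained insert might be written so that no borrow of a bucket is still live when the bucket is reassigned. The names `ChainedMap`, `Node`, and `bucket_index`, and the djb2-style hash, are assumptions made for the example.

```rust
// Hypothetical sketch of a separately chained map in safe Rust.
// Not taken from the Claude or Gemini translations.

struct Node {
    key: String,
    value: i32,
    next: Option<Box<Node>>,
}

struct ChainedMap {
    buckets: Vec<Option<Box<Node>>>,
}

impl ChainedMap {
    fn new(bucket_count: usize) -> Self {
        assert!(bucket_count > 0, "need at least one bucket");
        ChainedMap {
            buckets: (0..bucket_count).map(|_| None).collect(),
        }
    }

    // djb2-style string hash reduced to a bucket index (assumed for illustration).
    fn bucket_index(&self, key: &str) -> usize {
        let mut h: u64 = 5381;
        for b in key.bytes() {
            h = h.wrapping_mul(33).wrapping_add(b as u64);
        }
        (h % self.buckets.len() as u64) as usize
    }

    fn insert(&mut self, key: &str, value: i32) {
        let idx = self.bucket_index(key);

        // Walk the chain with a mutable cursor; update in place if the key exists.
        let mut cur = self.buckets[idx].as_deref_mut();
        while let Some(node) = cur {
            if node.key == key {
                node.value = value;
                return;
            }
            cur = node.next.as_deref_mut();
        }

        // The cursor is no longer used here, so its borrow has ended.
        // `take()` moves the old head out, and the new node becomes the head.
        let old_head = self.buckets[idx].take();
        self.buckets[idx] = Some(Box::new(Node {
            key: key.to_string(),
            value,
            next: old_head,
        }));
    }

    fn get(&self, key: &str) -> Option<i32> {
        let mut cur = self.buckets[self.bucket_index(key)].as_deref();
        while let Some(node) = cur {
            if node.key == key {
                return Some(node.value);
            }
            cur = node.next.as_deref();
        }
        None
    }
}

fn main() {
    let mut map = ChainedMap::new(4);
    map.insert("alpha", 1);
    map.insert("beta", 2);
    map.insert("alpha", 3); // overwrites the earlier value
    println!("{:?} {:?}", map.get("alpha"), map.get("beta"));
}
```

The design choice here is to finish all traversal before touching the bucket slot, and to use `Option::take` to move the old chain out rather than assigning through a reference that is still in scope.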

Previous Evaluations (2026-02-15): Syntax Check Only

These evaluations used basic syntax checking only, because rustc was not available. Scores may differ from the real-compilation results above.

string_utils (Feb 15)

Claude: 76/100 | Gemini: 81/100

View Report →

buffer (Feb 15)

Claude: 77/100 | Gemini: 63/100

View Report →

hashmap (Feb 15)

Claude: 72/100 | Gemini: 74/100

View Report →

📖 Report Structure

Each markdown report contains: