Rams Design Score
How the score works
The Rams Design Score is a single number from 0–100 that summarizes the design quality of a codebase's UI layer. It's applied the same way every time, against the same ruleset, by the same reviewer.
What it measures
Rams reviews the UI code itself — the .tsx, .jsx, .vue, .svelte, and styling files that produce the rendered interface. Issues are evaluated against 109 rules across 8 design categories, each catching a specific failure mode.
It does not run the app, take screenshots, or evaluate visual aesthetics. It reads code the way a senior designer reviewing a pull request would.
The 8 categories
Accessibility (a11y)
Semantic HTML, keyboard navigation, screen-reader compatibility, focus management, ARIA correctness, color contrast on text. Things WCAG 2.2 AA cares about and that real users with disabilities run into.
Color (color)
Hardcoded hex values that bypass design tokens, contrast ratios, semantic color usage, hover and focus states.
Typography (type)
Heading hierarchy correctness, font scale adherence, line height, font weight, text truncation patterns.
Spacing (space)
Whether layout values come from a scale (4/8/16) or are arbitrary one-offs. Padding and margin consistency. Gap usage.
Components (comp)
Component composition, prop API design, primitive reuse, design-system token leaks, inline style overrides.
UX (ux)
Loading states, empty states, error handling, destructive action confirmations, form patterns, scroll behavior.
Motion (motion)
Animation duration and easing, prefers-reduced-motion support, infinite loops, layout shifts caused by animations.
Anti-slop (slop)
Patterns specific to AI-generated UI code: arbitrary radius values, magic numbers, overly defensive null checks, unused props, copy-paste duplication.
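To make "reading code" concrete, here is a minimal sketch of what a single static check could look like, using the Color category's hardcoded-hex concern as the example. It is an illustration only, not Rams's actual rule implementation: the names `HEX_COLOR`, `Finding`, and `findHardcodedHex` are hypothetical, and a plain regular expression like this one will also flag lookalikes such as URL fragments.

```typescript
// Hypothetical illustration of a source-level check; not Rams's real rule code.
// Matches #rgb, #rgba, #rrggbb, and #rrggbbaa literals.
const HEX_COLOR = /#(?:[0-9a-fA-F]{8}|[0-9a-fA-F]{6}|[0-9a-fA-F]{3,4})\b/g;

interface Finding {
  line: number; // 1-based line number in the source text
  value: string; // the hex literal that was found
}

function findHardcodedHex(source: string): Finding[] {
  const findings: Finding[] = [];
  source.split("\n").forEach((text, index) => {
    const matches = text.match(HEX_COLOR);
    if (matches) {
      matches.forEach((value) => findings.push({ line: index + 1, value }));
    }
  });
  return findings;
}
```

A check like this works on text alone, which fits the claim that the review never runs the app or takes screenshots.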
How issues are weighted
Each issue carries a severity. The score deducts points based on severity:
- Critical — breaks core functionality, blocks accessibility, ships visible bugs. Heaviest deduction.
- Serious — degrades quality, leaks design-system intent, will cause user-facing regressions. Moderate deduction.
- Moderate — minor polish or convention issue. Light deduction. (Excluded from the public score view to focus attention.)
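The deduction model above can be sketched as follows. The point values in `DEDUCTION` are invented for illustration, since the actual weights are not published here, and subtracting from 100 with a floor of 0 is likewise an assumption. The sketch leaves open whether Moderate issues affect the public number; it only models hiding them from the public view.

```typescript
type Severity = "critical" | "serious" | "moderate";

interface Issue {
  rule: string; // hypothetical rule identifier
  severity: Severity;
}

// Assumed weights: only the ordering (critical > serious > moderate)
// comes from the description above, not the numbers themselves.
const DEDUCTION: Record<Severity, number> = {
  critical: 10,
  serious: 4,
  moderate: 1,
};

// Start at 100, subtract each issue's deduction, never go below 0.
function scoreFrom(issues: Issue[]): number {
  const total = issues.reduce((sum, issue) => sum + DEDUCTION[issue.severity], 0);
  return Math.max(0, 100 - total);
}

// The public view lists only Critical and Serious issues.
function publicIssues(issues: Issue[]): Issue[] {
  return issues.filter((issue) => issue.severity !== "moderate");
}
```

With these assumed weights, one Critical, one Serious, and one Moderate issue would give 100 - 15 = 85.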
Score bands
- 90+: Low risk
- 75–89: Moderate
- 55–74: Elevated
- <55: High
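The bands translate directly into a lookup. This is a minimal sketch: the function name `bandFor` and the `RiskBand` type are hypothetical, and treating fractional scores by threshold (so 89.5 falls in Moderate) is an assumption, since the bands are stated for whole numbers.

```typescript
type RiskBand = "Low risk" | "Moderate" | "Elevated" | "High";

// Maps a 0–100 score to the band listed above.
function bandFor(score: number): RiskBand {
  // Rejects out-of-range values and NaN (NaN fails both comparisons).
  if (!(score >= 0 && score <= 100)) {
    throw new RangeError("score must be between 0 and 100");
  }
  if (score >= 90) return "Low risk";
  if (score >= 75) return "Moderate";
  if (score >= 55) return "Elevated";
  return "High";
}
```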
What the score doesn't measure
- Visual aesthetics, brand fit, or taste
- Performance (Lighthouse covers that)
- Bundle size or build output
- Backend, business logic, or data layer
- Code that doesn't render UI
How public scoring samples a repo
When you score a public repo via the homepage widget, Rams pulls up to 30 UI files prioritized by location (app/, pages/, components/) and reviews them as a representative sample. For continuous review on every pull request, install the GitHub App — it reviews exactly the files that changed.
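The sampling step might look roughly like the sketch below. Only the 30-file cap, the three directory names, and the UI file types come from the description above. The relative ranking of app/, pages/, and components/, the choice of `.css` and `.scss` as "styling files", the alphabetical tie-break, and all identifiers are assumptions made for illustration.

```typescript
const MAX_FILES = 30;
// Assumed ranking: earlier entries are sampled first.
const PRIORITY_DIRS = ["app", "pages", "components"];
// UI file types named earlier on this page, plus assumed styling extensions.
const UI_EXTENSIONS = [".tsx", ".jsx", ".vue", ".svelte", ".css", ".scss"];

function isUiFile(path: string): boolean {
  return UI_EXTENSIONS.some((ext) => path.slice(-ext.length) === ext);
}

// Lower rank means higher priority; files outside the known dirs rank last.
function rankOf(path: string): number {
  const dirs = path.split("/").slice(0, -1);
  for (let i = 0; i < PRIORITY_DIRS.length; i++) {
    if (dirs.indexOf(PRIORITY_DIRS[i]) !== -1) return i;
  }
  return PRIORITY_DIRS.length;
}

function sampleUiFiles(paths: string[]): string[] {
  return paths
    .filter(isUiFile)
    .map((path) => ({ path, rank: rankOf(path) }))
    .sort((a, b) => a.rank - b.rank || (a.path < b.path ? -1 : a.path > b.path ? 1 : 0))
    .slice(0, MAX_FILES)
    .map((entry) => entry.path);
}
```

Because the sample is capped, a large repo's public score reflects only the highest-priority files, which is why the per-pull-request review of changed files is the more exact option.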
Methodology stability
The rubric is intentionally stable. We do not loosen rules to make scores look better, and we do not retroactively rescore repos when the rubric is updated. When new rules are added or weights are revised, we publish the change here so you can read the diff.