Rams Design Score

How the score works

The Rams Design Score is a single number from 0–100 that summarizes the design quality of a codebase's UI layer. It's applied the same way every time, against the same ruleset, by the same reviewer.

What it measures

Rams reviews the UI code itself: the .tsx, .jsx, .vue, .svelte, and styling files that produce the rendered interface. That code is checked against 109 rules across 8 design categories, and each rule catches a specific failure mode.

It does not run the app, take screenshots, or evaluate visual aesthetics. It reads code the way a senior designer reviewing a pull request would.

The 8 categories

Accessibility (a11y)

Semantic HTML, keyboard navigation, screen-reader compatibility, focus management, ARIA correctness, color contrast on text. Things WCAG 2.2 AA cares about and that real users with disabilities run into.

Color (color)

Hardcoded hex values that bypass design tokens, contrast ratios, semantic color usage, hover and focus states.
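
As an illustration of the first item, here is a minimal sketch of how a hardcoded hex value might be spotted in source text. The regex and the function name findHardcodedHex are hypothetical and are not taken from Rams; a real rule would need more context to avoid false positives such as anchor links like "#add".

```typescript
// Hypothetical sketch: the pattern and function name are illustrative,
// not Rams internals. Matches #rgb, #rgba, #rrggbb, and #rrggbbaa.
const HEX_COLOR = /#(?:[0-9a-fA-F]{8}|[0-9a-fA-F]{6}|[0-9a-fA-F]{3,4})\b/g;

function findHardcodedHex(source: string): string[] {
  // String.prototype.match with a global regex returns all matches or null.
  return source.match(HEX_COLOR) ?? [];
}

const sample = `
  .button { color: #1a73e8; background: var(--color-surface); }
  .badge  { border-color: #fff; }
`;
console.log(findHardcodedHex(sample)); // [ '#1a73e8', '#fff' ]
```

The token reference var(--color-surface) is not reported, which is the distinction the rule description draws.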

Typography (type)

Heading hierarchy correctness, font scale adherence, line height, font weight, text truncation patterns.

Spacing (space)

Whether layout values come from a scale (4/8/16) or are arbitrary one-offs. Padding and margin consistency. Gap usage.
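
A rough sketch of the scale check, assuming a scale of 0, 4, 8, and 16 pixels. The text above cites 4/8/16 only as an example, so the exact scale, the inclusion of 0, and the name offScaleValues are all assumptions made for illustration.

```typescript
// Assumed scale for illustration; the real scale would come from the
// project's design tokens.
const SPACING_SCALE: number[] = [0, 4, 8, 16];

// Returns the values that do not appear on the scale (the "one-offs").
function offScaleValues(values: number[], scale: number[] = SPACING_SCALE): number[] {
  return values.filter((v) => scale.indexOf(v) === -1);
}

console.log(offScaleValues([4, 8, 13, 16, 22])); // [ 13, 22 ]
```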

Components (comp)

Component composition, prop API design, primitive reuse, design-system token leaks, inline style overrides.

UX (ux)

Loading states, empty states, error handling, destructive action confirmations, form patterns, scroll behavior.

Motion (motion)

Animation duration and easing, prefers-reduced-motion support, infinite loops, layout shifts caused by animations.

Anti-slop (slop)

Patterns specific to AI-generated UI code: arbitrary radius values, magic numbers, overly defensive null checks, unused props, copy-paste duplication.

How issues are weighted

Each issue carries a severity. The score deducts points based on severity:

  • Critical — breaks core functionality, blocks accessibility, ships visible bugs. Heaviest deduction.
  • Serious — degrades quality, leaks design-system intent, will cause user-facing regressions. Moderate deduction.
  • Moderate — minor polish or convention issue. Light deduction. (Excluded from the public score view to focus attention.)
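
To make the severity ordering concrete, here is a sketch of a severity-weighted deduction. The point values (10, 4, 1), the linear subtraction, and the name sketchScore are assumptions chosen for illustration; this page states only that critical issues deduct the most and moderate issues the least, not the actual weights or formula.

```typescript
type Severity = "critical" | "serious" | "moderate";

// Hypothetical weights: only the ordering (critical > serious > moderate)
// comes from the text above, not these numbers.
const ASSUMED_DEDUCTION: Record<Severity, number> = {
  critical: 10,
  serious: 4,
  moderate: 1,
};

function sketchScore(issues: Severity[]): number {
  const deducted = issues.reduce((sum, s) => sum + ASSUMED_DEDUCTION[s], 0);
  // Clamp so the result stays inside the documented 0 to 100 range.
  return Math.max(0, 100 - deducted);
}

console.log(sketchScore(["critical", "serious", "serious", "moderate"])); // 81
```

Under these assumed weights, one critical issue costs as much as ten moderate ones, which is the kind of asymmetry the severity levels describe.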

Score bands

  • 90+: Low risk
  • 75–89: Moderate
  • 55–74: Elevated
  • <55: High
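
The bands above can be expressed as a small lookup. The thresholds come from this page; the function name scoreBand is hypothetical, and the sketch assumes integer scores (a fractional score such as 89.5 would fall into the lower band here).

```typescript
type Band = "Low risk" | "Moderate" | "Elevated" | "High";

// Maps a 0 to 100 score onto the published bands.
function scoreBand(score: number): Band {
  if (score >= 90) return "Low risk";
  if (score >= 75) return "Moderate";
  if (score >= 55) return "Elevated";
  return "High";
}

console.log(scoreBand(92)); // Low risk
console.log(scoreBand(75)); // Moderate
console.log(scoreBand(54)); // High
```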

What the score doesn't measure

  • Visual aesthetics, brand fit, or taste
  • Performance (Lighthouse covers that)
  • Bundle size or build output
  • Backend, business logic, or data layer
  • Code that doesn't render UI

How public scoring samples a repo

When you score a public repo via the homepage widget, Rams pulls up to 30 UI files prioritized by location (app/, pages/, components/) and reviews them as a representative sample. For continuous review on every pull request, install the GitHub App — it reviews exactly the files that changed.
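
The sampling step could look roughly like the sketch below. The 30-file cap and the three directory names come from the paragraph above; everything else is an assumption: the ranking order among app/, pages/, and components/, the tie-breaking by original order, the omission of styling files, and the names priorityRank and sampleUiFiles.

```typescript
const UI_FILE = /\.(tsx|jsx|vue|svelte)$/; // styling files omitted for brevity
const PRIORITY_DIRS = ["app", "pages", "components"];
const MAX_FILES = 30;

// Lower rank means higher priority. Ranking app/ above pages/ above
// components/ is an assumption; the source lists them without an order.
function priorityRank(path: string): number {
  const segments = path.split("/");
  for (let i = 0; i < PRIORITY_DIRS.length; i++) {
    if (segments.indexOf(PRIORITY_DIRS[i]) !== -1) return i;
  }
  return PRIORITY_DIRS.length;
}

function sampleUiFiles(paths: string[]): string[] {
  return paths
    .filter((p) => UI_FILE.test(p))
    .map((path, index) => ({ path, index, rank: priorityRank(path) }))
    // Sort by rank, keeping the original order among equal ranks.
    .sort((a, b) => a.rank - b.rank || a.index - b.index)
    .slice(0, MAX_FILES)
    .map((entry) => entry.path);
}

console.log(sampleUiFiles([
  "README.md",
  "src/utils/format.ts",
  "src/components/Button.tsx",
  "src/app/page.tsx",
  "src/widgets/Chart.vue",
]));
// [ 'src/app/page.tsx', 'src/components/Button.tsx', 'src/widgets/Chart.vue' ]
```

Files outside the prioritized directories still qualify in this sketch; they simply sort last and are the first to be dropped once the cap is reached.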

Methodology stability

The rubric is intentionally stable. We do not loosen rules to make scores look better, and we do not retroactively rescore repos when the rubric is updated. When new rules are added or weights are revised, we publish the change here so you can read the diff.