Methodology
How New Music Roundup collects, matches, and scores albums from 13+ sources each week.
Step 1 — Album Matching
Each source returns its own list of new releases. Before scores can be aggregated, releases from different sources that represent the same album must be unified. Matching uses a two-stage approach.
Stage 1 — MusicBrainz ID Lookup
For each release, the artist and album title are normalised (lowercased, diacritics stripped, articles like "The" removed, punctuation collapsed), then queried against the MusicBrainz release-group API. If a match is returned with sufficient confidence, the canonical MusicBrainz Release Group ID is assigned. This is the preferred method — a shared MBID guarantees two records refer to the same album regardless of how each source spells the title or artist name.
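The normalisation step described above can be sketched roughly as follows. This is an illustrative sketch, not the pipeline's actual code: the function name `normalise` and the `ARTICLES` set are assumptions (the text only names "The" as an example of a removed article), and the MusicBrainz query itself is left out.

```python
import re
import unicodedata

# Assumed article list; the methodology only names "The" explicitly.
ARTICLES = {"the", "a", "an"}


def normalise(text: str) -> str:
    """Lowercase, strip diacritics, remove articles, collapse punctuation."""
    # Decompose accented characters, then drop the combining marks.
    decomposed = unicodedata.normalize("NFKD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Collapse every run of punctuation, whitespace, or underscores to one space.
    cleaned = re.sub(r"[\W_]+", " ", stripped.lower())
    # Remove articles wherever they appear as standalone words.
    words = [word for word in cleaned.split() if word not in ARTICLES]
    return " ".join(words)
```

With this sketch, "The Beatles" and "Beatles, The" both reduce to the same string, so either spelling produces the same lookup key.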
Stage 2 — Fuzzy Matching Fallback
When no MusicBrainz match is found (e.g. for very new releases not yet in the database), fuzzy string matching is used instead. The normalised album title and artist name from two candidates are compared using the token_sort_ratio algorithm from the fuzzball library. A combined confidence score is calculated:
confidence = (title_similarity × 0.6) + (artist_similarity × 0.4)
Title is weighted more heavily (60%) as it is the more distinctive identifier;
artist name variations like "The Beatles" vs "Beatles, The" are common and
handled by token_sort_ratio's word-order independence.
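To make the weighting concrete, here is a minimal sketch of the confidence calculation. fuzzball is a JavaScript library, so this example substitutes a simplified stand-in for token_sort_ratio built on Python's standard `difflib`; it sorts tokens before comparing, but its similarity measure differs from fuzzball's, so the exact scores will not match. The function `match_confidence` is a hypothetical name.

```python
from difflib import SequenceMatcher


def token_sort_ratio(a: str, b: str) -> float:
    """Stand-in for fuzzball's token_sort_ratio; returns a value from 0 to 100.

    Tokens are sorted alphabetically before comparison, so word order
    does not affect the result.
    """
    sorted_a = " ".join(sorted(a.split()))
    sorted_b = " ".join(sorted(b.split()))
    return 100.0 * SequenceMatcher(None, sorted_a, sorted_b).ratio()


def match_confidence(title_a: str, artist_a: str,
                     title_b: str, artist_b: str) -> float:
    """Combine title and artist similarity using the 60/40 weighting."""
    title_similarity = token_sort_ratio(title_a, title_b)
    artist_similarity = token_sort_ratio(artist_a, artist_b)
    return title_similarity * 0.6 + artist_similarity * 0.4
```

The inputs are expected to be already normalised, so a pair such as "the beatles" and "beatles the" scores as identical on the artist component.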
Confidence Thresholds
| Confidence | Action | Description |
|---|---|---|
| ≥ 90% | Auto-match | High confidence — albums are treated as the same release |
| 80 – 89% | Pending review | Flagged for manual confirmation via the admin interface |
| < 80% | Skip | Treated as distinct albums; not merged |
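The thresholds in the table could be applied with a small helper like the one below. The function name and the returned labels are illustrative assumptions. The table does not say how fractional values between 89 and 90 are handled; this sketch treats everything from 80 up to, but not including, 90 as pending review.

```python
def classify_match(confidence: float) -> str:
    """Map a 0-100 confidence score to one of the three actions."""
    if confidence >= 90:
        return "auto-match"      # treated as the same release
    if confidence >= 80:
        return "pending-review"  # flagged for manual confirmation
    return "skip"                # treated as distinct albums
```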
Step 2 — Score Normalisation
Different sources use different rating scales. All scores are converted to a common 0–10 scale before aggregation.
| Scale | Original Range | Conversion | Example |
|---|---|---|---|
| 100-point | 0 – 100 | ÷ 10 | 85 → 8.5 |
| 5-star | 0 – 5 | × 2 | 4.5 → 9.0 |
| 10-point | 0 – 10 | No change | 8.3 → 8.3 |
| Letter grade | A – F | Fixed mapping | A → 10.0, B → 7.5, C → 5.0, D → 2.5, F → 0 |
Some sources (Bandcamp Daily, Invisible Oranges) publish curated lists without numeric scores. These are assigned a fixed representative score (7.5 and 7.0 respectively), reflecting that inclusion is itself an editorial recommendation.
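The conversions in the table above amount to a simple lookup on the source's scale. The sketch below is one possible way to write it; the function name and the scale labels are assumptions chosen for this example, and plus or minus letter grades are not handled because the methodology does not define them.

```python
# Fixed mapping for letter grades, taken from the conversion table.
LETTER_GRADES = {"A": 10.0, "B": 7.5, "C": 5.0, "D": 2.5, "F": 0.0}


def normalise_score(raw, scale: str) -> float:
    """Convert a raw score to the common 0-10 scale."""
    if scale == "100-point":
        return float(raw) / 10
    if scale == "5-star":
        return float(raw) * 2
    if scale == "10-point":
        return float(raw)
    if scale == "letter":
        # Raises KeyError for grades outside A, B, C, D, F.
        return LETTER_GRADES[str(raw).upper()]
    raise ValueError(f"unknown scale: {scale}")
```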
Step 3 — Weighted Aggregation
Once all scores are on the 0–10 scale, a weighted average is computed. Each source has a base credibility weight. Specialist sources receive a 1.2× genre bonus when reviewing albums in their area of expertise — e.g. Metal Archives gets a higher effective weight for a death metal album than for a jazz record.
aggregate = Σ(normalised_score × base_weight × genre_bonus) / Σ(base_weight × genre_bonus)
Source Weights
| Source | Base Weight | Scale | Fixed Score | Genre Bonus (1.2×) |
|---|---|---|---|---|
| Metacritic | 1.0 | 100-point | — | — |
| AllMusic | 0.9 | 100-point | — | — |
| Metal Archives | 0.85 | 100-point | — | Metal genres |
| All About Jazz | 0.85 | 100-point | — | Jazz genres |
| ProgArchives | 0.8 | 5-star | — | Prog genres |
| AnyDecentMusic | 0.75 | 10-point | — | — |
| Bandcamp Daily | 0.75 | — | 7.5 | — |
| Saving Country Music | 0.7 | 10-point | — | Country / folk genres |
| Album of the Year | 0.7 | 100-point | — | — |
| Invisible Oranges | 0.65 | — | 7.0 | Metal genres |
| RateYourMusic | 0.6 | 5-star | — | — |
| Spotify | 0.5 | 100-point | — | — |
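The weighted average can be sketched in a few lines. The `Review` type and the `specialist` flag are illustrative assumptions; how the pipeline actually decides that an album falls within a source's specialist genre is not described here.

```python
from typing import Iterable, NamedTuple

GENRE_BONUS = 1.2  # multiplier for specialist sources reviewing in their genre


class Review(NamedTuple):
    normalised_score: float  # already converted to the 0-10 scale
    base_weight: float       # from the Source Weights table
    specialist: bool         # True if the album is in the source's specialist genre


def aggregate(reviews: Iterable[Review]) -> float:
    """Weighted average of normalised scores, with the genre bonus applied."""
    weighted_sum = 0.0
    total_weight = 0.0
    for review in reviews:
        bonus = GENRE_BONUS if review.specialist else 1.0
        weight = review.base_weight * bonus
        weighted_sum += review.normalised_score * weight
        total_weight += weight
    if total_weight == 0:
        raise ValueError("no reviews with non-zero weight")
    return weighted_sum / total_weight
```

Because the bonus multiplies the weight rather than the score, a specialist source pulls the aggregate towards its own rating without being able to push it above the 0–10 range.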
Worked Example
A progressive metal album reviewed by three sources:
| Source | Raw Score | Normalised | Base Weight | Genre Bonus | Final Weight | Weighted Score |
|---|---|---|---|---|---|---|
| Metacritic | 85 / 100 | 8.5 | 1.0 | 1.0 | 1.00 | 8.50 |
| ProgArchives | 4.2 / 5 | 8.4 | 0.8 | 1.2 | 0.96 | 8.06 |
| Metal Archives | 82 / 100 | 8.2 | 0.85 | 1.2 | 1.02 | 8.36 |
| Totals | | | | | 2.98 | 24.93 |
Aggregate = 24.93 / 2.98 ≈ 8.37
Data & Attribution
Album metadata and canonical identifiers are provided by MusicBrainz (CC BY-NC-SA 4.0). Cover art is sourced from the Cover Art Archive. The pipeline runs every Friday at midnight UTC via GitLab CI/CD. See the About page for the full source list.