Fix sorting order based on order_column condition by jonaslandsgesell · Pull Request #46 · sherbold/autorank

jonaslandsgesell · 2026-05-06T12:25:59Z

Disclaimer: Claude generated code.

Observation: sorting with "higher raw values are better" produces an unexpected ordering in LaTeX tables: the entity with the worst rank is placed at the top and used as the baseline for the effect size calculation.

The autorank library currently couples the input metric direction with the output dataframe sorting, leading to incorrect baseline selection in _util.py.

Current Mechanism: _create_result_df_skeleton uses the asc boolean (derived from the order parameter) to sort the final rankdf.

The Conflict: When order='descending' (higher-is-better), asc is set to False. The code then executes .sort_values(by='meanrank', ascending=False).

Result: The model with the highest mean rank (the worst performer) is placed at the first index.

Consequence: Because the first index serves as the control group, effect sizes and post-hoc tests are incorrectly calculated relative to the worst model instead of the best.

Proposed Fix:
Force meanrank to always sort in ascending order (lowest rank first) regardless of the raw metric's direction:
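The coupling described above can be illustrated outside of autorank with plain pandas. This is only a minimal sketch, not the PR's diff and not autorank's code: the `scores` frame and the names `coupled` and `decoupled` are made up for illustration, and it assumes that rank 1 is assigned to the best model within each dataset.

```python
import pandas as pd

# Hypothetical scores: rows are datasets, columns are models, higher is better.
scores = pd.DataFrame({
    "best":   [0.95, 0.91, 0.97],
    "middle": [0.70, 0.65, 0.75],
    "worst":  [0.30, 0.25, 0.40],
})

# Rank within each dataset so that rank 1 is the best model (assumption).
# For a higher-is-better metric this means ranking by descending value.
ranks = scores.rank(axis=1, ascending=False)
meanrank = ranks.mean().to_frame("meanrank")

# Sort direction coupled to the metric direction, as described in the PR text.
coupled = meanrank.sort_values(by="meanrank", ascending=False)

# Sort direction fixed to ascending, as the PR proposes.
decoupled = meanrank.sort_values(by="meanrank", ascending=True)

print("coupled  :", coupled.index.tolist())
print("decoupled:", decoupled.index.tolist())
```

In this toy example the coupled sort lists `worst` first, while the ascending sort lists `best` first, so whichever row is treated as the baseline changes with the sort direction.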

Minimal script which passes with the proposed fix but does not pass without the fix:

import numpy as np
import pandas as pd
from autorank import autorank

rng = np.random.default_rng(42)
n_datasets = 20

# Test 1: Higher-is-better (like R²)
print("=" * 60)
print("TEST 1: Higher-is-better (R²) with order='descending'")
print("=" * 60)
data_hib = pd.DataFrame({
    "best":   rng.uniform(0.85, 1.00, n_datasets),
    "middle": rng.uniform(0.60, 0.80, n_datasets),
    "worst":  rng.uniform(0.20, 0.45, n_datasets),
})

print("\nInput: 3 models, higher value = better (like R²)")
print(f"  best   median ≈ {data_hib['best'].median():.3f}")
print(f"  middle median ≈ {data_hib['middle'].median():.3f}")
print(f"  worst  median ≈ {data_hib['worst'].median():.3f}")

result = autorank(data_hib, alpha=0.05, order='descending')
rdf = result.rankdf

central = 'mean' if 'mean' in rdf.columns else 'median'
print("\nautorank rankdf (order='descending'):")
print(rdf[['meanrank', central, 'effect_size', 'magnitude']].to_string())

print()
best_first = rdf.index[0]
print(f"rankdf.index[0] (baseline for effect size): '{best_first}'")

if best_first == 'best' and rdf.at['best', 'effect_size'] == 0.0:
    print("✓ CORRECT — best model is first, effect_size=0 for 'best'")
else:
    print("✗ BUG     — wrong model is first; effect sizes relative to wrong baseline!")
    print()
    print("Expected order : best  → middle → worst")
    print("Actual order   :", " → ".join(rdf.index.tolist()))

sherbold · 2026-05-17T09:15:34Z

Thanks for the PR. I do not have time for a review right now, but wanted to let you know that I have seen this and hope to get around to it next week.

sherbold · 2026-05-21T08:28:20Z

Yes, this is indeed undesirable behavior in the LaTeX table generation. However, the suggested fix is a good demonstration that Claude often cannot understand broader considerations. It breaks the complete sorting logic, effectively rendering the sorting parameter useless for meanrank-based sorting, without even documenting this.

A better solution would be to emit some sort of warning during LaTeX generation, possibly suggesting ascending order instead. Additionally, a new parameter could be introduced in the LaTeX table generation to allow descending sorting while still comparing effect sizes to the best model, e.g., a parameter specifying the row relative to which the effect sizes etc. should be reported (an integer index following Python logic, such that -1 is the last row).
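The suggestion above could look roughly like the following sketch. Nothing here exists in autorank: `pick_baseline` and its `baseline_row` parameter are hypothetical names, the table is modeled as a plain list of model names, and the warning text is only an example.

```python
import warnings

def pick_baseline(ordered_models, mean_ranks, baseline_row=0):
    """Return the model that effect sizes would be reported against.

    ordered_models: model names in the order they appear in the table.
    mean_ranks: dict mapping model name to mean rank (lower is better).
    baseline_row: hypothetical parameter, an integer row index using
        Python indexing, so -1 refers to the last row.
    """
    baseline = ordered_models[baseline_row]
    best = min(ordered_models, key=mean_ranks.get)
    if baseline != best:
        # Warn instead of silently reporting against a non-best model.
        warnings.warn(
            f"Effect sizes are reported relative to '{baseline}', "
            f"but '{best}' has the lowest mean rank. Consider ascending "
            f"order or a different baseline_row."
        )
    return baseline

# Table sorted by descending mean rank, so the worst model is in row 0.
mean_ranks = {"best": 1.0, "middle": 2.0, "worst": 3.0}
table_order = sorted(mean_ranks, key=mean_ranks.get, reverse=True)

# Selecting the last row keeps the descending order but uses the best model.
print(pick_baseline(table_order, mean_ranks, baseline_row=-1))
```

With the default `baseline_row=0` the same call would return `worst` and emit the warning, which keeps the existing sorting parameter meaningful while making the baseline choice explicit.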

jonaslandsgesell closed this May 21, 2026

jonaslandsgesell wants to merge 1 commit into sherbold:master from jonaslandsgesell:jonaslandsgesell-change-sorting
