[perf] Reduce logger hot-path overhead and sync IO pressure by Xekep · Pull Request #3210 · Pryaxis/TShock

Xekep · 2026-02-27T03:09:15Z

Summary

optimize caller resolution for log writes
reduce sync flush pressure by batching text log flushes
replace SQL failure-path First() with index access

Scope

focused perf-only change in TextLog and SqlLog`n- no unrelated file changes

Batch text log flushes, serialize writes under a lock, and avoid expensive caller stack introspection for info/verbose log events. Also remove LINQ First() allocation in SQL failure replay.

hakusaro · 2026-02-27T05:09:00Z

@greptile review?

greptile-apps · 2026-02-27T05:11:36Z

Greptile Summary

This PR implements targeted performance optimizations for TShock's logging subsystem without changing functionality. Key improvements include:

Optimized caller resolution by skipping expensive stack traces for high-volume Info/Verbose logs
Reduced sync I/O pressure in TextLog by batching flushes (every 32 writes or 1 second) while preserving immediate flushing for Warning/Error messages
Added thread safety to TextLog with proper locking
Replaced LINQ .First() with direct index access in SQL failure recovery path
Correctly adjusted StackFrame depth from 2 to 3 to account for extracted ResolveCaller method

The changes are well-designed, maintain correctness, and deliver meaningful performance gains for logging hot paths.

Confidence Score: 5/5

This PR is safe to merge with no risk
Focused performance-only changes with correct logic throughout, proper thread safety added, and no behavioral changes to existing functionality
No files require special attention

Important Files Changed

Filename	Overview
TShockAPI/SqlLog.cs	Optimized caller resolution with early-exit for high-volume logs and replaced LINQ with index access
TShockAPI/TextLog.cs	Added thread safety and intelligent flush batching to reduce sync I/O pressure

_{Last reviewed commit: 15e4ce8}

hakusaro · 2026-02-27T05:38:10Z

@Xekep do you have any instrumentation that shows the performance gains on these PRs? E.g., before and after profiling, or some indication that these performance gains aren't purely theoretical or, at the very least, within the margin of error for CPU scheduling, etc?

Xekep · 2026-02-27T06:37:55Z

@Xekep do you have any instrumentation that shows the performance gains on these PRs? E.g., before and after profiling, or some indication that these performance gains aren't purely theoretical or, at the very least, within the margin of error for CPU scheduling, etc?

That matches my real-world experience: before this patch series the server became progressively unresponsive under load (chat and general packet handling lagged noticeably), and after applying the changes responsiveness improved enough to make gameplay stable again.

hakusaro · 2026-02-27T06:53:34Z

That matches my real-world experience: before this patch series the server became progressively unresponsive under load (chat and general packet handling lagged noticeably), and after applying the changes responsiveness improved enough to make gameplay stable again.

You changed like, 9 things at the same time it seems?

sgkoishi · 2026-02-27T07:08:32Z

This PR probably won’t make a noticeable difference unless:

Your hard drive has a response time of 100ms or more (for the Flush change), or
You can notice differences at the microsecond level (for the StackFrame vs. StackTrace change).

Xekep · 2026-02-27T07:15:23Z

This PR probably won’t make a noticeable difference unless:

Your hard drive has a response time of 100ms or more (for the Flush change), or

You can notice differences at the microsecond level (for the StackFrame vs. StackTrace change).

Thanks for the feedback - I agree this won’t matter much on low-log/idle servers.

The two changes are cumulative under sustained log volume:

Flush() is still a synchronous call per log write, even if the OS buffers disk I/O, so removing per-line flush pressure reduces contention and syscall frequency.
StackTrace vs StackFrame is small per call, but this path executes very frequently, so the CPU/allocation cost adds up under load.

So this is not about needing a 100ms disk or perceiving microseconds directly; it’s about reducing repeated overhead in a hot path at scale.

hakusaro · 2026-02-27T07:48:24Z

@Xekep do you know what the juice number * 64 / 128 is? You can find it under valid channels

Xekep · 2026-02-27T08:02:37Z

@Xekep do you know what the juice number * 64 / 128 is? You can find it under valid channels

😐TShock is pretty sluggish under load right now, so I’d like to focus on fixes.

hakusaro · 2026-02-27T08:07:37Z

@Xekep (antiparticles here, stepping in for particles) 😅 You're absolutely right! Let's focus on the performance improvements, rather than worrying about the juice number. How many players are you playing with, and what are the resources allocated to your server?

Xekep · 2026-02-27T08:13:16Z

@Xekep (antiparticles here, stepping in for particles) 😅 You're absolutely right! Let's focus on the performance improvements, rather than worrying about the juice number. How many players are you playing with, and what are the resources allocated to your server?

The slowdown starts even at low player counts, not only at peak load.

Server specs:

Windows Server
Ryzen 9 5950X
16 GB RAM
NVMe storage

sgkoishi · 2026-02-27T08:26:58Z

Thanks for the feedback - I agree this won’t matter much on low-log/idle servers.
it’s about reducing repeated overhead in a hot path at scale

This method isn’t on a hot path at all, and most of the time in real world it’s called less than once per second. Even at peak load it is still unlikely to be calling it 100 times per second. The StackFrame change saves about 2 microseconds per call, and the similar goes for the syscall, which is a sub-millisecond level difference that human will never notice.

hakusaro · 2026-02-27T12:06:55Z

@Xekep

So this is not about needing a 100ms disk or perceiving microseconds directly; it’s about reducing repeated overhead in a hot path at scale.

At the scale of how many players? Realistically I'm worried that you've combined and changed so many things that there are superfluous changes that don't impact much conflating things.

The issue here is that your PRs all slightly change performance characteristics, while changing a lot of code in more complex ways. Highly optimized code is typically less understandable because it has to do more acrobatics to do the same thing.

TShock is already a highly complex codebase. A big barrier to entry for new contributors is understanding the control flow. We sincerely want to avoid changing the codebase to make it harder to maintain, especially if the performance overhead is negligible at the cost of code maintainability.

As you may have noticed, we have not had a steady set of releases. We have, now, had real contributors diving in to solve major problems. Changes like this are difficult to accept without hard numbers.

How do we know which PRs you've submitted actually make meaningful improvements and warrant the extra complexity? It is difficult to know without concrete numbers.

Reduce logger hot-path overhead and sync IO pressure

15e4ce8

Batch text log flushes, serialize writes under a lock, and avoid expensive caller stack introspection for info/verbose log events. Also remove LINQ First() allocation in SQL failure replay.

pull-request-size bot added the size/L label Feb 27, 2026

Uh oh!

Conversation

Xekep commented Feb 27, 2026

Summary

Scope

Uh oh!

hakusaro commented Feb 27, 2026

Uh oh!

greptile-apps bot commented Feb 27, 2026

Greptile Summary

Confidence Score: 5/5

Important Files Changed

Uh oh!

hakusaro commented Feb 27, 2026

Uh oh!

Xekep commented Feb 27, 2026

Uh oh!

hakusaro commented Feb 27, 2026

Uh oh!

sgkoishi commented Feb 27, 2026

Uh oh!

Xekep commented Feb 27, 2026

Uh oh!

hakusaro commented Feb 27, 2026

Uh oh!

Xekep commented Feb 27, 2026

Uh oh!

hakusaro commented Feb 27, 2026

Uh oh!

Xekep commented Feb 27, 2026

Uh oh!

sgkoishi commented Feb 27, 2026

Uh oh!

hakusaro commented Feb 27, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants