Skip to content

Feat/streaming optimization#138

Open
mysamimi wants to merge 25 commits into
asticode:masterfrom
mysamimi:feat/streaming-optimization
Open

Feat/streaming optimization#138
mysamimi wants to merge 25 commits into
asticode:masterfrom
mysamimi:feat/streaming-optimization

Conversation

@mysamimi

@mysamimi mysamimi commented Feb 6, 2026

Copy link
Copy Markdown

Description

This PR refactors the WriteToSRT WriteToWebVTT and WriteToSSA methods to use io.Writer streaming (via bufio.Writer) instead of accumulating the entire file content in an in-memory byte slice.

Motivation

Previously, methods like
WriteToSRT
built the entire output in a []byte buffer before writing it. For large subtitle files or high-throughput services, this caused excessive memory allocation (O(N) with file size) and GC pressure. This change ensures memory usage remains constant (O(1)) regardless of the subtitle duration/line count, as data is flushed to the writer in chunks.

Changes

  • SRT: Replaced internal srtBytes() building with writeSRT(w io.Writer)
  • WebVTT: Replaced internal webVTTBytes() (which used append/copy) with writeWebVTT(w io.Writer)
  • SSA: Replaced internal bytes() building with write(w io.Writer)
  • Go Version: Upgraded project to Go 1.21+ (1.24) to support modern standard library features and improved GC.
  • Tests: Updated internal tests to verify streaming behavior. Added Benchmarks.

Benchmarks

While there is a negligible overhead for tiny files due to bufio initialization, the memory scalability is significantly improved.

Results (Apple M3 Max):

BenchmarkWriteToSRT-16 301935 3902 ns/op 6136 B/op 135 allocs/op
BenchmarkWriteToWebVTT-16 237235 4967 ns/op 7030 B/op 160 allocs/op

note: The static B/op in benchmarks reflects the bufio buffer allocation (default 4KB). In real-world large file scenarios, the previous implementation's allocation would grow linearly, whereas this remains constant.

Supersedes

Supersedes #61 (if applicable)

@asticode asticode left a comment

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR and sorry for the slow review! ❤️

I've quite a few remarks 👍

Comment thread go.mod
Comment thread srt.go Outdated
Comment thread srt.go
Comment thread srt.go
Comment thread srt.go Outdated
Comment thread webvtt.go Outdated
// WriteToWebVTT writes subtitles in .vtt format
func (s Subtitles) WriteToWebVTT(o io.Writer) (err error) {
// if set true in second args write index as item index
func (s Subtitles) WriteToWebVTT(args ...interface{}) (err error) {

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you remove replacing the function signature from this PR?

Comment thread webvtt.go Outdated
}

// Init writer
w := bufio.NewWriter(o)

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you avoid using a bufio.Writer here?

Comment thread webvtt.go Outdated
}

func (l Line) webVTTBytes() (c []byte) {
func (l Line) writeWebVTT(w io.Writer) (err error) {

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you provide the *astikit.WriteChainer instead ?

Comment thread webvtt.go Outdated
}

func (li LineItem) webVTTBytes(previous, next *LineItem) (c []byte) {
func (li LineItem) writeWebVTT(w io.Writer, previous, next *LineItem) (err error) {

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you provide the *astikit.WriteChainer instead ?

Comment thread webvtt_internal_test.go
},
}}.webVTTBytes()))
}}.writeWebVTT(w)
assert.NoError(t, err)

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you use require instead of assert?

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you do this too?

Comment thread go.mod
module github.com/asticode/go-astisub

go 1.13
go 1.24.0

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems your "revert to go 1.13" change was reverted by one of the last commit. Could you revert it back?

Comment thread ssa.go
o = appendStringToBytesWithNewLine(o, "; "+c)
// write writes the block to the writer
func (b *ssaScriptInfo) write(c *astikit.WriteChainer) (err error) {
if _, err = c.Write(astikit.WriteWithLabel("script info header", []byte("[Script Info]"))); err != nil {

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you group/chain the 2 writes here?

Comment thread ssa.go
}
for _, comment := range b.comments {
if _, err = c.Write(
astikit.WriteWithLabel("comment start", []byte("; ")),

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you simplify with

if _, err = c.Write(
			astikit.WriteWithLabel("comment", append([]byte("; "+comment), bytesLineSeparator...)),
); err != nil {
			return
		}

Comment thread ssa.go
}
if len(b.collisions) > 0 {
o = appendStringToBytesWithNewLine(o, ssaScriptInfoNameCollisions+": "+b.collisions)
if _, err = c.Write(

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you simplify all writes below with one big write with conditions:

if _, err = c.Write(
  astikit.WriteWithCondition("collisions", append([]byte(ssaScriptInfoNameCollisions+": "b.collisions), bytesLineSeparator...), len(b.collisions) > 0),
  astikit.WriteWithCondition("original editing", append([]byte(ssaScriptInfoNameOriginalEditing+": "b.originalEditing), bytesLineSeparator...), len(b.originalEditing) > 0),
[...]
); err != nil {
			return
		}

Comment thread ssa.go
if len(b.collisions) > 0 {
o = appendStringToBytesWithNewLine(o, ssaScriptInfoNameCollisions+": "+b.collisions)
if _, err = c.Write(
astikit.WriteWithLabel("collisions label", []byte(ssaScriptInfoNameCollisions+": ")),

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also could you delete the appendStringToBytesWithNewLine() function since it's not used anymore?

Comment thread webvtt.go
}

// WriteToWebVTT writes subtitles in .vtt format
// if set true in second args write index as item index

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you remove this comment too?

Comment thread webvtt.go
}

// Lines
lines := r.InlineStyle.WebVTTLines

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you "compute" all variables first (lines, ra, scroll, etc.) and then add one big write with condition writes? Something like

lines := r.InlineStyle.WebVTTLines
[...]
ra := r.InlineStyle.WebVTTRegionAnchor
[...]

if _, err = c.Write(
				astikit.WriteWithLabel("region id", []byte("Region: id="+r.ID)),
				astikit.WriteWithCondition("lines", append(bytesSpace, []byte("lines="+strconv.Itoa(lines))...)),
[...]
			); err != nil {
				return
			}

Comment thread webvtt.go
} else if item.Style != nil && item.Style.InlineStyle != nil && item.Style.InlineStyle.WebVTTPosition != nil {
c = append(c, bytesSpace...)
c = append(c, []byte("position:"+item.Style.InlineStyle.WebVTTPosition.String())...)
// Align

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above, could you compute all variables first and make one big Write?

Comment thread webvtt.go
// Voice name
if l.VoiceName != "" {
c = append(c, []byte("<v "+l.VoiceName+">")...)
if _, err = c.Write(

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you use WriteWithCondition to remove the if instead?

Comment thread webvtt_internal_test.go
},
}}.webVTTBytes()))
}}.writeWebVTT(w)
assert.NoError(t, err)

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you do this too?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants