Skip to content

Conversation

@mimbrown
Copy link

@mimbrown mimbrown commented May 3, 2025

The PR has a couple changes:

  • Moves most of the bidi functionality to the core folder (for lack of maybe a better place). Minimal tweaks to some bidi functions to take, for example, a direction instead of a typesetter, to reduce direct dependency on the typesetter.
  • Move the splitIntoRuns and reorder calls directly into typesetters/base.lua#boxUpNodes
  • Add the splitIntoRuns and reorder calls into typesetters/base.lua#makeHbox
  • Strip the bidi package down to be mostly just deprecated exports.

@mimbrown mimbrown requested a review from alerque as a code owner May 3, 2025 12:59
@mimbrown mimbrown changed the title Initial change to move bidi to core, get bidi working in hboxes WIP: Initial change to move bidi to core, get bidi working in hboxes May 3, 2025
@mimbrown
Copy link
Author

mimbrown commented May 3, 2025

@alerque this does not yet include any test updates or anything like that, just the basic code change. The thing that's controversial is, I've also updated the output logic for hboxes, since I've found them to be wrong by default in RTL mode.

Given the content:

Testing, testing, \hbox{Test \font[family = "Al Tarikh", direction = "RTL"]{مفتاح معايير الويب!} another}

When the document is LTR

Current (RTL text is laid out backwards)

image

New

image

When the document is RTL

Current (RTL text is laid out backwards, content in the hbox starts from the left edge of the box)

image

New

image

@alerque alerque marked this pull request as draft May 5, 2025 21:05
@Omikhleia
Copy link
Member

This does break a lot of tests, including unexpected ones.

@mimbrown
Copy link
Author

Coming back to this after a while. @Omikhleia I've tried to visually inspect the difference between the current typesetter and the changes I've made, and can't make anything out. If I'm reading the diffs correct, it's looking like a lot of these tests are differences of fractions of points, but maybe I'm missing something. Do you have any idea what's going on?

@Omikhleia
Copy link
Member

Omikhleia commented Jun 24, 2025

Do you have any idea what's going on?

Sorry for the delay @mimbrown I failed to see earlier that I was pinged here ^^

I retraced your steps, it turns out that with in the original code, self:shapeAllNodes() was invoked twice (once in the bidi logic, then in the regular typesetter) at different times (on different node lists).

I daresay it's really dubious... It might also explain that when I once tried to remove the whole bidi logic (e.g. in the old code, just commenting out the content of the enabling/disabling methods), the debug output also had some subtle differences of the same sort, for a document that had just French text (so I wouldn't have expected bidi to do something at all). My feeling is that the old code was wrong, despite being "active" all the time by default... Doh.

@Omikhleia
Copy link
Member

Omikhleia commented Jun 24, 2025

it turns out that with in the original code, self:shapeAllNodes() was invoked twice (once in the bidi logic, then in the regular typesetter) at different times (on different node lists).

That is -- the old bidi code was:

  • invoking splitNodelistIntoBidiRuns much earlier
  • Then invoking shapeAllNodes (!)
  • Then call calling the original typesetter - which removed some "discardable" glues, did other fancy things, and invoked shapeAllNodes (thus again?)...

Your proposed code moved all that mess to the base typesetter, a nice move IMHO, but invokes splitNodelistIntoBidiRuns at a later stage (not illogically, IMHO), and there's only one shapeAllNodes then. To me, you might be in the right track here... but don't take this for granted, I don't understand a lot of what SILE does the way it does it, crippled with side-effects and unclear intents, even after a 4-year involvement as a user and casual contributor.

@mimbrown
Copy link
Author

Thanks for looking at this! I see your point, yes I remember thinking that removing the double shapeAllNodes call was a plus for this PR, but I just put it back to see if that fixes a lot of the random test failures. I agree, it's troubling that the behavior would be different, it doesn't seem like it should be, but I'm not prepared to break all tests and stand by the new behavior as better!

@Omikhleia
Copy link
Member

(...) but I just put it back to see if that fixes a lot of the random test failures.

Darn, all tests failed but for some font fetching issue in the CI. Let's invoke our great @alerque here :)

I agree, it's troubling that the behavior would be different, it doesn't seem like it should be, but I'm not prepared to break all tests and stand by the new behavior as better!

I'll add a dedicated comment on the code, but still, this indeed has heavy code smell... Well let's see if it does the trick and confirms the root cause when the CI is repaired.
Yet, personally I'd prefer SILE to address its issues even at the cost of breaking many tests.

If the decision is to keep these moved lines, my opinion is that it would need an extra in-code comment to note the code small and refer to this issue (or a new one), so people who maintain or investigate the code in the future are aware of the potential issue... But let's wait for a result, discussion and decision.

return {}
end
nodelist = bidi.splitNodelistIntoBidiRuns(nodelist, self.frame:writingDirection())
self:shapeAllNodes(nodelist)
Copy link
Member

@Omikhleia Omikhleia Jun 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re. the main discussion in the PR: yes 👍 having those two lines here is also what I was discussing. I arrived at the same conclusion and all other things equal (I haven't checked our code was identical as mine; when just starting from the old code and following the same steps as you), the test tests/image.sil gave the same result as before. That's the only one I checked though.

But the double shapeAllNodes (here and further below), or the reason why this splitting/shaping has to occur here (rather than after some discardable nodes or penalties are removed) are very unclear. I'd like to grasp why, and have proper in-code comments here... ideally ^^

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, I'll add the comment (if this works). Just want to validate that it does before going to all the trouble 🙃

Copy link
Member

@Omikhleia Omikhleia Jun 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wait... from boxUpNodes(), we invoke breakIntoLines() which also invokes shapeAllNodes(). How many times does SILE ends up reshaping the same content (with a few glues and penalties added or removed)?
This is not even cost-less, with the dubious "inplace" boolean there, we browse, re-copy this list multiple times. Inefficient and weird.

Copy link
Author

@mimbrown mimbrown Jun 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤷‍♂️ it definitely does look weird. If we figure out why this changes subtly when calling shapeAllNodes either twice or before/after discarding nodes (not sure which is the culprit yet), that would probably help us think through how it should be fixed.

Looking at the function, it seems like calling it multiple times shouldn't have any effect, unless calling shape() on a node can somehow return more unshaped nodes, which would be problematic. So I would guess that calling shapeAllNodes() after removing the discardable nodes is what is causing the difference?

Copy link
Member

@Omikhleia Omikhleia Jun 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My best guess for the difference in the logs is that it is introduced by some glues, and that the line ratio (affecting their shrinking/stretching) occurs to be slightly different -- but the amount is small and distributed on all glues, so in the end its just a few decimal points (but this would affect "Mx" lines in the logs in the way we observe).

Hmm... I wonder if the issue isn't what we put pack into the state.nodes or not, during all that chain of calls...

See:

function typesetter:breakIntoLines (nodelist, breakWidth)
   self:shapeAllNodes(nodelist)
   local breakpoints = SILE.linebreak:doBreak(nodelist, breakWidth)
   return self:breakpointsToLines(breakpoints)
end

Then breakPointsToLines works on self.states.nodes, not the nodelist. Ouch. That's the big assumption. They might be the same or not and shapeAllNodes works "in place" "by default", so it all depends on what the nodelist originally was. It might have been self.states.nodes initially... but nodelist = bidi.splitNodelistIntoBidiRun returns a new nodelist, and do we put it back to self.states.nodes? Otherwise later at some point this will be inconsistent.

Frankly these user of the internal states and the multiple side-affecting methods are a mess to follow. As as wrote in another comment, the constant hesitation on what we work (putting things back and forth in the states) is a likely source of error. The code is quite a mess there.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When I see such things, I am frightened:

SILE.typesetter.state.nodes = self.nodes -- Horrible breaking of separation of concerns here. :-(

Another cross-boundary side-effect here... Ideally linebreak:doBreak() should do that, but should return the nodelist it worked on...

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I totally agree, there's some major refactoring that would be helpful. But my goal with this PR has been to be minimal, not expansive, with changes. Once @alerque can let us know what's causing the 404 errors in the test runs, we should be able to see if what I'm doing now is actually consistent with what was there before. Based on this, I think we should be able to keep the new shapeAllNodes call before the discarding of other nodes, remove the second (redundant) one, and most tests should pass fine. More expansive changes should be a different PR I think.

@mimbrown
Copy link
Author

Yes I just pulled in the upstream, possibly in between me making this PR and now something in the test setup changed.

self.state.hmodeOnly = true
SILE.process(content)

self.state.nodes = bidi.splitNodelistIntoBidiRuns(self.state.nodes, self.frame:writingDirection())
Copy link
Member

@Omikhleia Omikhleia Jun 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

by the way, do we need to put that back into self.state.nodes? There's a lot of back-and-forth stuff of that kind, I know, in the code base, but it would be best to avoid it, and just use locals.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm mostly following what was there before, because (as discussed) I can't track through if this is necessary so I'm afraid to do anything different!

for i = 1, #vboxes do
local v = vboxes[i]
if v.is_vbox then
v.nodes = bidi.reorder(v.nodes, self.frame:writingDirection())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hello @mimbrown

As I am working on my own rewrite of a typesetter (here), I tried to port your proposed changes to it.

It made me realize there's an issue here, when the document uses "liners" (which I introduced earlier to SILE), i.e. for underlining, striking out, etc. portions of text that can be line broken.1 The liner logic eventually wraps part of lines if liner boxes (responsible for applying the wanted effect), and these can be deeply nested...

So here, bidi re-ordering -- which as far as I can tell, expects a flat list of shaped nodes, comes far too late, and will not handle these liner boxes.

This raises a few questions:

  • Why is it done here so late on the final line vboxes (via another repeated loop on all lines), while it could perhaps have been done on each line individually before it's finalized (i.e. just before the liner reboxing logic applies).
  • But why is it done, actually, after line-breaking has occurred? Is that due to some interpretation with respect to the Knuth-Plass algorithm?
  • And what exactly is it doing? The bidi code is totally undocumented (and even has some code commented out, never a good sign), with opaque variables (what's rv !? why to we have to SU.compress it for holes2); and weird sequences3...

I'd totally be in favor of having bidi also made "standard" in my re-implementation, but it's a tricky piece of code, and I don't even know what to test, without knowing the languages :)

Footnotes

  1. E.g. \underline{....} in SIL.

  2. That's the only use for this compress utility in all the SILE code base, if not mistaken...

  3. E.g. sequences of SU.flip_in_place / reverse_each_node, where the later browses the node list again to SU.flip_in_place internal hboxes, knowing their inner structure. Whether such code is legit is hard to say, for one not knowing the details.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Linked to a 10-year old issue: #76

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Omikhleia I definitely don't have the answers to all the "why" questions. My goal with this PR was to change things as minimally as possible, and only remove/change things that I was logically positive shouldn't have an effect. BiDi is really painful... but, I am pretty sure, we can't reverse before the line-breaking happens. Here's an example of why:

Imagine a string of glyphs (' means the character is RTL):

ABC DEF A'B'C' D'E'F'

If we want to apply the BiDi algorithm, it becomes:

ABC DEF F'E'D' C'B'A'

Easy. But let's say we need a line break in the middle of the RTL text. In that case, we want the output to look like this:

ABC DEF C'B'A'
F'E'D'

But if we reorder before breaking into lines, we'll end up with:

ABC DEF F'E'D'
C'B'A'

Off the top of my head, I don't know why we couldn't do the reordering before the reboxing. Maybe we should! But here's why we can't do it before line-breaking.

@Omikhleia
Copy link
Member

Linked #2268

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Development

Successfully merging this pull request may close these issues.

2 participants