

Ask HN: Is there prior art for this rich text data model?

I've built a rich text data model for a desktop word processor in Python, based on a persistent balanced n-ary tree with cached weights for O(log n) index translation. The document model uses only four element types: Text, Container, Single, and Group — where Group is purely structural (for balancing) and has no semantic meaning in the document. Individual elements are immutable; insert and takeout return new trees rather than mutating the old one. This guarantees that old indices remain valid as long as the old tree exists. I'm aware of Ropes, Finger Trees, and ProseMirror's flat index model. Is there prior art I should know about — specifically for rich text document models with these properties?
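
For readers who want something concrete, here is a minimal sketch of the idea, not the actual implementation. It assumes only two of the four element types (Text and Group), and the names `Group.of`, `insert`, `to_str`, and the `weight` field are hypothetical. Rebalancing is deliberately left out, so this shows path copying and cached weights only, not the O(log n) guarantee.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Text:
    s: str

    @property
    def weight(self) -> int:
        # A Text leaf occupies one index position per character.
        return len(self.s)


@dataclass(frozen=True)
class Group:
    children: Tuple["Node", ...]
    weight: int  # cached sum of the children's weights

    @staticmethod
    def of(*children: "Node") -> "Group":
        return Group(tuple(children), sum(c.weight for c in children))


Node = Union[Text, Group]


def insert(node: Node, index: int, s: str) -> Node:
    """Return a new tree with s inserted at index; the old tree is untouched."""
    if not 0 <= index <= node.weight:
        raise IndexError(index)
    if isinstance(node, Text):
        return Text(node.s[:index] + s + node.s[index:])
    if not node.children:
        return Group.of(Text(s))
    # Use the cached weights to find the child that contains the index.
    offset = index
    for i, child in enumerate(node.children):
        if offset <= child.weight:
            new_child = insert(child, offset, s)
            # Path copying: only this node is rebuilt, siblings are shared.
            return Group.of(*node.children[:i], new_child, *node.children[i + 1:])
        offset -= child.weight
    raise IndexError(index)  # only reachable if cached weights are inconsistent


def to_str(node: Node) -> str:
    if isinstance(node, Text):
        return node.s
    return "".join(to_str(c) for c in node.children)


old = Group.of(Text("Hello "), Group.of(Text("wor"), Text("ld")))
new = insert(old, 6, "big ")
```

After the call, `old` still reads "Hello world" and `new` reads "Hello big world", and the untouched right-hand subtree is the same object in both trees. That sharing is what keeps old indices valid for as long as the old tree is referenced.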

3 points by chrisecker | 2 hours ago | 2 comments
PaulHoule - an hour ago
So my first take is that there is something off about all WYSIWYG editors going back to Microsoft Word. Like you try to select some text to cut-and-paste and you can select the whole text except the first letter but if you try to get the whole text it suddenly jumps to select the whole document. Or hitting the "B" button sometimes bolds the paragraph above where you are, etc. It's like there is something wrong with the conceptual model behind all of these things.

So when I read "Group is purely structural (for balancing) and has no semantic meaning" my blood runs cold. Here's a mysterious something hidden in the document which bizarre misbehaviors will be centered around. Like maybe when I try to select something and I get something else, it is one of those groups.

chrisecker - 16 minutes ago
You're right that hidden structure is often the cause of strange effects, but Group is not a document element in that sense. It's invisible at the index level too: when you iterate over the document by index, you pass through Groups transparently. A Group is purely a tree node for balancing purposes: it has no position in the document model that a cursor could land on, no semantic meaning, and no user-visible boundary. It has no effect the user can see or feel. Think of it like the internal nodes of a B-Tree: they exist to keep the structure efficient, but you never 'see' them when you traverse the data.
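
To make the "transparent" claim concrete, here is a small sketch under simplifying assumptions: only Text and Group exist, and `char_at` and `Group.of` are hypothetical names rather than the real API. The same content is grouped in two different ways, and index lookup gives the same answer for both, because a Group contributes no index positions of its own.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Text:
    s: str

    @property
    def weight(self) -> int:
        return len(self.s)


@dataclass(frozen=True)
class Group:
    children: Tuple["Node", ...]
    weight: int  # cached sum of child weights; the Group itself adds nothing

    @staticmethod
    def of(*children: "Node") -> "Group":
        return Group(tuple(children), sum(c.weight for c in children))


Node = Union[Text, Group]


def char_at(node: Node, index: int) -> str:
    """Translate a document index to a character, descending through Groups."""
    if not 0 <= index < node.weight:
        raise IndexError(index)
    if isinstance(node, Text):
        return node.s[index]
    for child in node.children:
        if index < child.weight:
            return char_at(child, index)
        index -= child.weight  # skip this child's index range
    raise IndexError(index)  # only reachable if cached weights are inconsistent


# Same document content, two different balancing structures.
flat = Group.of(Text("ab"), Text("cd"), Text("ef"))
nested = Group.of(
    Group.of(Text("ab")),
    Group.of(Text("cd"), Group.of(Text("ef"))),
)

flat_chars = [char_at(flat, i) for i in range(flat.weight)]
nested_chars = [char_at(nested, i) for i in range(nested.weight)]
```

Both lists come out identical, and both trees have weight 6, so there is no index at which a cursor or selection boundary could sit "on" a Group.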