For July 2009, I managed to snuff out something about ropes, a kind of heavier programming type than string. Apart from the pun/joke, it was an interesting article to research. I have no recollection any more about how I came across the rope type; presumably it was through a late night surfing session, fueled with some microbrew.
Ropes are like strings in that they store text, but unlike strings in that the bytes making up the text are no longer contiguous in memory. A rope will split up the text into nodes and build up a tree with them. The original text can be recovered by reading the nodes in an in-order traversal.
Why would anyone do such a thing? Well it’s great for building up a string through lots of insertions/deletions — in effect it’s the same performance argument as inserting into/deleting from an array versus a linked list — and it’s good for really long strings. The final benefit is all about heap memory usage: long strings (or any large allocations) in run-times like .NET tend to stick around after disposal since the garbage collector is tuned more to small allocations rather than large ones.
Of course, these benefits come at a cost: for smaller pieces of text, the effort in manipulating them (reading the nth character, copying them, etc) tends to have high overhead, both in terms of memory and in terms of performance. So, not for every situation, by all means.
I couldn’t find a reference implementation of ropes in C# or .NET, but there is one for Java as part of this discussion here. Personally speaking, I’d have to say that the StringBuilder class in .NET would suffice in many scenarios and the effort of creating an implementation of a rope class probably not worth the effort.
This article first appeared in issue 283, July 2009.
You can download the PDF here.
(I write a monthly column for PCPlus, a computer news-views-n-reviews magazine in the UK (actually there are 13 issues a year — there’s an Xmas issue as well — so it’s a bit more than monthly). The column is called Theory Workshop and appears in the Make It section of the magazine. When I signed up, my editor and the magazine were gracious enough to allow me to reprint the articles here after say a year or so.)
Now playing:
Ferry, Bryan - Tokyo Joe
(from In Your Mind)
1 Response
#1 Patrick van Logchem said...
04-Jul-10 3:39 AMWell, there's one advantage that can be quite significant: suffix-sharing. If you have a situation in which strings are often prepended with all sorts of prefixes, ropes are not just quicker, they cost much less memory too! This is, because the suffixes can be re-used and thus take memory only once. This was actually the main attraction to me, when I spoke with Marco Cantu about them. If I'm not mistaken, I send him a proof-of-concept in Delphi I once made. Can't find it here. Ask and I'll look it up.
Leave a response
Note: some MarkDown is allowed, but HTML is not. Expand to show what's available.
_emphasis_
**strong**
[text](url)
`IEnumerable`
* an item
1. an item
> Now is the time...
Preview of response