Speed difference in Gnus
Stephen J. Turnbull
stephen at xemacs.org
Mon Oct 17 21:39:43 EDT 2005
>>>>> "Ben" == Ben Wing <ben at xemacs.org> writes:
Ben> Stephen J. Turnbull wrote:
>> It's really ugly to have a buffer property that's actually a
>> text property just because performance sucks when an extent
>> covers the whole buffer.
I assume the above is what you're talking about?
Ben> what do you mean? as i said before, i have mostly finished
Ben> code that will make text properties efficient by implementing
Ben> them entirely in non-overlapping intervals, as fsf does.
What I mean is that there are a number of things that we do as buffer
attributes or buffer-local variables (language environment, coded
character set, major mode for starters) that conceptually are text
properties. Think about it: wouldn't it be cool if instead of needing
a hack like mmm, we simply attached LISP-major-mode to an extent
covering the buffer, string-major-mode to strings, and
comment-major-mode to comments? Wouldn't it be cool if we could
attach xml-mode to individual strings in xml.el or use html-mode for
strings by default in blog-mode.el?
Ben> in general this is a very tough problem, and using simpler
Ben> solutions when possible (e.g. non-overlapping text
Whatever it is it has to be part of the implementation, not a
constraint on user code. (I'm pretty sure you agree, but it's not
clear to me from that sentence.)
Ben> it might be possible to simulate extents in general using
Ben> non-overlapping intervals, where multiple intervals could
Ben> represent a single extent and each interval contains parts of
Ben> all extents crossing over them. this interval would be
Ben> guaranteed efficient and not too hard to implement, i think
Ben> -- in fact it's the same algorithm i'd be using for
Ben> text-properties, and could be extended to extents in general.
Something like this? A linked list P of (buffer position, extent
list) pairs, with an intervals defined as [P[n].pos,P[n+1].pos) for
each n. An extent is a pair of pointers into P defining which
fragments are covered by the extent plus a property list. P[n].ext is
the list of extents covering that interval. Then for random access,
finding n is O(lg n), and finding the property is O(|P[n].ext|), which
is the best you can get, I guess. (For many operations, in particular
redisplay, which walk P in order, finding n is O(1), right?) P itself
only needs to be updated when text is inserted/deleted in the buffer,
and when an extent is created/deleted. |P| <= |# boundary points of
all extents|, so that is no worse than what we currently have, and we
never have to update the extents themselves.
Ben> but first we need some more profiling data.
Why? Basically in your explanation of how the soe works, you are
saying that large extents (not just extents covering the whole buffer,
but any extent that overlaps many other extents) are very expensive
because they make _all_ the overlapped extents inefficient. So that
means that code like
(let ((header (make-extent (message-beginning) (header-end)))
(body (make-extent (header-end) (message-end))))
necessarily sucks. If I'm not totally missing the point, I have to
ask: Do you consider that restriction acceptable?
School of Systems and Information Engineering http://turnbull.sk.tsukuba.ac.jp
University of Tsukuba Tennodai 1-1-1 Tsukuba 305-8573 JAPAN
Ask not how you can "do" free software business;
ask what your business can "do for" free software.
More information about the XEmacs-Beta