Speed difference in Gnus
ben at xemacs.org
Thu Oct 20 04:29:53 EDT 2005
Stephen J. Turnbull wrote:
>>>>>>"Ben" == Ben Wing <ben at xemacs.org> writes:
> Ben> Stephen J. Turnbull wrote:
> >> It's really ugly to have a buffer property that's actually a
> >> text property just because performance sucks when an extent
> >> covers the whole buffer.
>I assume the above is what you're talking about?
> Ben> what do you mean? as i said before, i have mostly finished
> Ben> code that will make text properties efficient by implementing
> Ben> them entirely in non-overlapping intervals, as fsf does.
>What I mean is that there are a number of things that we do as buffer
>attributes or buffer-local variables (language environment, coded
>character set, major mode for starters) that conceptually are text
>properties. Think about it: wouldn't it be cool if instead of needing
>a hack like mmm, we simply attached LISP-major-mode to an extent
>covering the buffer, string-major-mode to strings, and
>comment-major-mode to comments? Wouldn't it be cool if we could
>attach xml-mode to individual strings in xml.el or use html-mode for
>strings by default in blog-mode.el?
> Ben> in general this is a very tough problem, and using simpler
> Ben> solutions when possible (e.g. non-overlapping text
> Ben> properties).
>Whatever it is it has to be part of the implementation, not a
>constraint on user code. (I'm pretty sure you agree, but it's not
>clear to me from that sentence.)
yes, i do. once i get the new text property implementation down, then
any program that only uses text properties and not directly uses extents
or overlays, or which only uses extents/overlays covering small sections
of the buffer, will be fast. however, in the long run that's not
enough; we'd like for extents to be fast, period, and not make the user
go through the text property interface. (the reason why this is harder
is that the text property interface is higher-level and doesn't
explicitly give you a concept of an entity tied to a particular range.)
if people normally used text properties, and tended to only use a small
number of extents, then we might consider as a stop-gap measure
separating the two internally into a sort of parallel structure -- this
is much like what FSF does, with its overlays, i think. however, it
might turn out to be not much easier to implement that than to just move
entirely to the "correct" implementation of extents in terms of
> Ben> it might be possible to simulate extents in general using
> Ben> non-overlapping intervals, where multiple intervals could
> Ben> represent a single extent and each interval contains parts of
> Ben> all extents crossing over them. this interval would be
> Ben> guaranteed efficient and not too hard to implement, i think
> Ben> -- in fact it's the same algorithm i'd be using for
> Ben> text-properties, and could be extended to extents in general.
>Something like this? A linked list P of (buffer position, extent
>list) pairs, with an intervals defined as [P[n].pos,P[n+1].pos) for
>each n. An extent is a pair of pointers into P defining which
>fragments are covered by the extent plus a property list. P[n].ext is
>the list of extents covering that interval. Then for random access,
>finding n is O(lg n), and finding the property is O(|P[n].ext|), which
>is the best you can get, I guess. (For many operations, in particular
>redisplay, which walk P in order, finding n is O(1), right?) P itself
>only needs to be updated when text is inserted/deleted in the buffer,
>and when an extent is created/deleted. |P| <= |# boundary points of
>all extents|, so that is no worse than what we currently have, and we
>never have to update the extents themselves.
this sounds about right, although i'm a little confused about what
"finding n" means, and finding anything in a linked list is O(n), not
O(lg n). we maintain an ordered gap array (or whatever) of
non-overlapping intervals as defined by the extents, where each interval
is the maximum stretch not crossed by an extent endpoint. Each extent
contains a pointer to the first and last interval covered by the
extent. most likely, these will be indices into the gap array, and
hence need to be adjusted as we do gap movement in the array; this is
just like what happens currently with the endpoints of an extent in the
buffer. we can avoid this update if the gap array, rather than
containing the interval structures themselves, contains pointers to
heap-allocated interval objects, and each such object contains a forward
and probably backward pointers to the next/previous object, so that
essentially the intervals are in a combination linked-list and array.
the array, btw, is necessary so that we can locate the interval
surrounding a particular position in O(lg n) time. it's not possible to
do this in a pure linked list; you'd get O(n) instead. another data
structure would involve having the intervals be objects with pointers
back to the array position they came from. doing gap movement in the
array requires updating the intervals rather than the extents; there
could potentially be significant savings if there are a whole lot of
overlapping extents but in practice this seems unlikely. in reality i'm
somewhat loath to do any of these alternative schemes because it would
add a lot of space overhead, would fragment the memory, etc. in
practice, i think that extents are modified much less often than they
are accessed by redisplay, and you only get O(n) behavior when inserting
or deleting extents when the gap is far away from where you previously
were. but i'm certainly willing to be proven wrong by appropriate
the crucial thing, then, is that each interval contains a some sort of
list of all the extents overlapping that interval. this list will
probably be small; so either a linked list or a Stynarr could be used.
(A Stynarr is a new struct that i invented in my unicode-internal ws.
it is based on existing face-cachel code. the idea is that it uses a
fixed array of small size (e.g. 4 or 6) that's built into the object and
pointer to a dynarr, which is used to handle overflow. it's useful when
you need to maintain a list of objects and handle an arbitrary # of
objects on the list, but where in practice it very rarely exceeds a
small # and would like to avoid lots and lots of heap allocation.)
we might still have to maintain the display-order and e-order for extent
endpoints, like we currently do; i'll have to look into the code, but i
bet there will still be places that need to look stuff up by endpoints,
and it may not be too easy to derive this from the interval data.
what do you mean "we never have to update the extents themselves"?
> Ben> but first we need some more profiling data.
>Why? Basically in your explanation of how the soe works, you are
>saying that large extents (not just extents covering the whole buffer,
>but any extent that overlaps many other extents) are very expensive
>because they make _all_ the overlapped extents inefficient.
why, because i have limited time and i need to schedule things according
to how badly they slow things down. i see that my attempts to speed up
byte<->char conversion actually slowed things down significantly for
Raymond, although they indeed sped things up in my profiling for
whatever i was running into problems with. i think it's because i
deleted the "known range" cache that was the first line of attack. i
did this because fsf didn't have any such thing and seemed to be doing
fine; but they may have a lot more code internally that operates in byte
positions rather than char positions [or even in pairs of the two; that
seems common]. so i'm going to look into putting them back.
meanwhile, i'm mostly done with the rewrite of char tables for
unicode-internal. there isn't much coding left before i will start
compiling; but there's definitely a good deal of work to be done. for
one, the font handling is still by charset, meaning that unicode chars
for which there's no charset won't be displayable! yuck. that requires
some significant work. for another char-classes haven't been
implemented yet. and i don't know what the hell to do with ccl;
hopefully it really won't be necessary because i've expanded the notion
of charset to include any rectangular subset of 256x256 (so we can
properly have a windows-1252 charset, e.g.) and i'm planning on creating
a general mbcs coding system to replace the ad-hoc shift-jis and big5
coding systems; it could also be used for windows-125* coding systems
plus lots of simple coding systems (e.g. iso-8859-1) that are currently
handled using iso2022.
questions for you, stephen:
 how are unicode fonts handled under X windows?
 now that i look more at ccl, i see that it's utterly tied to the old mule encoding, as it expects to look directly at the internal encoding of the text. it would have to be radically restructured for it to work on a character-by-character level, not byte-by-byte. do you think it's reasonable to just make ccl be an optional feature that is compiled in only with old mule?
> So that
>means that code like
>(let ((header (make-extent (message-beginning) (header-end)))
> (body (make-extent (header-end) (message-end))))
>necessarily sucks. If I'm not totally missing the point, I have to
>ask: Do you consider that restriction acceptable?
no, not in the long run.
More information about the XEmacs-Beta