Archives for: June 2006, 16

06/16/06

Regarding string duplication

This entry might bore most of you, upset some others, and could be useful for some. Sorry!!!

For the last 3 days I've been reading (during my study breaks ;-)) about code optimization (both speed and memory use) and DSO pitfalls (pointers and some samples might follow later).

One of the things I read was a paper by Ulrich Drepper (the glibc guy), in which one of the samples was about optimizing string duplication (strdup).

I looked into the GLib (no, that's not glibc) code to see how it was handled there, and found a simple, straightforward (unoptimized) implementation.
So obviously I wanted to find out whether this could be optimized somewhat. I started my precious gvim and wrote a test case which does some (well, a lot of) different duplications, using either g_strdup or my own little implementation, depending on a compile-time flag.

An overview of what the test code does:

- Do one simple strdup using g_strdup and free the result, in both builds (so the test is "fair": think of relocation time)
- Copy, print and free a compile-time constant string, a variable string (argv[0]) and an empty string (its strlen is printed) to make sure the code works fine
- In a loop of 10000000 iterations, duplicate a constant string, a variable one (again, argv[0]) and an empty one, and free them

Why always a constant string, a variable one and an empty one? Simply because my implementation tries to handle these cases in the most efficient way ;-)
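To make the three cases concrete, here is a minimal sketch of what such a macro could look like. It is not the code from this post: the names my_strdup, dup_runtime, dup_sized and my_strdup_demo are made up for illustration, plain malloc/memcpy stands in for g_strdup, and the structure is only assumed to be close to the sample in Drepper's paper. It relies on GCC's __builtin_constant_p and falls back to the ordinary path on other compilers.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Run-time path: the length is unknown, so strlen has to scan the string.
 * Hypothetical stand-in for g_strdup, which the post used for this case. */
static char *dup_runtime(const char *s)
{
    size_t n = strlen(s) + 1;   /* +1 for the terminating NUL */
    char *p = malloc(n);
    if (p != NULL)
        memcpy(p, s, n);
    return p;
}

/* Constant path: the caller passes a size the compiler could already fold. */
static char *dup_sized(const char *s, size_t n)
{
    char *p = malloc(n);
    if (p != NULL)
        memcpy(p, s, n);
    return p;
}

#if defined(__GNUC__)
/* __builtin_constant_p does not evaluate its argument, and only one branch
 * of the conditional runs, so a run-time argument is evaluated once.
 * Constant and empty: a single zeroed byte is enough.
 * Constant and non-empty: strlen on a literal is typically folded by GCC. */
# define my_strdup(s)                                   \
    (__builtin_constant_p(s)                            \
        ? (((const char *)(s))[0] == '\0'               \
            ? (char *)calloc(1, 1)                      \
            : dup_sized((s), strlen(s) + 1))            \
        : dup_runtime(s))
#else
/* Other compilers: no special casing, so no gain but also no loss. */
# define my_strdup(s) dup_runtime(s)
#endif

/* Small sanity check covering the constant, variable and empty cases. */
static void my_strdup_demo(const char *variable)
{
    char *a = my_strdup("foo");
    char *b = my_strdup(variable);
    char *c = my_strdup("");
    assert(a != NULL && strcmp(a, "foo") == 0);
    assert(b != NULL && strcmp(b, variable) == 0);
    assert(c != NULL && c[0] == '\0');
    free(a);
    free(b);
    free(c);
}
```

Whichever branch the compiler picks, the result is the same copy; the constant branch just skips the run-time scan for the length.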

Here's the result:

nicolas@marslander ~/tmp $ gcc -o my_strdup_test -O2 `pkg-config --cflags --libs glib-2.0` g_strdup_test.c
g_strdup_test.c:9:10: warning: #warning "Using own implementation"
nicolas@marslander ~/tmp $ gcc -o g_strdup_test -O2 `pkg-config --cflags --libs glib-2.0` g_strdup_test.c -DUSE_G_STRDUP
g_strdup_test.c:5:10: warning: #warning "Using g_strdup"
nicolas@marslander ~/tmp $ ./g_strdup_test > /dev/null && echo -e "g_strdup_test:\n==============" && time ./g_strdup_test && ./my_strdup_test > /dev/null && echo -e "my_strdup_test:\n===============" && time ./my_strdup_test
g_strdup_test:
==============
foo = foo
./g_strdup_test = ./g_strdup_test
0 = 0

real    0m9.438s
user    0m8.968s
sys     0m0.064s
my_strdup_test:
===============
foo = foo
./my_strdup_test = ./my_strdup_test
0 = 0

real    0m5.906s
user    0m5.627s
sys     0m0.033s

I know 'time' is not a very good benchmark, but I guess the result is fairly obvious.
This result is roughly the average: the g_strdup version ranges between 9.4 and 9.6 seconds, mine between 5.8 and 6.1 seconds. That is 30-40% less run time, without using any extra memory.
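Since 'time' also counts process start-up and dynamic linking, one could time just the loop from inside the program. The sketch below is an assumption about how that might look, not part of the original test: dup_under_test, time_dup_loop and percent_time_saved are hypothetical names, and malloc/memcpy stands in for whichever strdup variant is being measured. The last helper does the arithmetic behind the percentage quoted above.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* The pointer is stored here so the compiler cannot drop the malloc/free
 * pair as dead code when optimizing. */
static char *volatile sink;

/* Stand-in for the function under test (an assumption; swap in the
 * variant you actually want to measure). */
static char *dup_under_test(const char *s)
{
    size_t n = strlen(s) + 1;
    char *p = malloc(n);
    if (p != NULL)
        memcpy(p, s, n);
    return p;
}

/* CPU seconds spent duplicating and freeing s `iterations` times,
 * or -1.0 if an allocation fails. */
static double time_dup_loop(const char *s, unsigned long iterations)
{
    clock_t start = clock();
    for (unsigned long i = 0; i < iterations; i++) {
        char *copy = dup_under_test(s);
        if (copy == NULL)
            return -1.0;
        sink = copy;
        free(copy);
    }
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}

/* Relative reduction in run time, in percent: (before - after) / before. */
static double percent_time_saved(double before, double after)
{
    assert(before > 0.0);
    return (before - after) / before * 100.0;
}
```

With the numbers from the run above, percent_time_saved(9.438, 5.906) comes out at a bit over 37, which sits inside the 30-40% range.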

Obviously, not all duplications become faster: only those of strings which are constant at compile time. In some applications these can make up a major share of the g_strdup calls, though.

I should note that this "enhancement" is only possible when GCC is the compiler, as it uses some GCC-specific extensions. This is not an issue though, as with other compilers the code won't become slower :-)

The code I used can be found here. I added comments to make some things easier to understand (I hope), but the code is not "clean". A check for GCC should be added etc, but hey, this is just a proof of concept.

I really like these rather low-level optimisation things. They allow a lot of applications to become more efficient without changing the actual app, and low-level code is fun.
I'll read some more about this and try to blog about interesting papers/techniques/... too. I've got some things in mind, but lack the time now (yeah, those exams).

Luckily the end is in sight, and so is GUADEC :-) I booked my flight some days ago and am still waiting for one email about lodging. I'm so glad and thankful I'll be able to be there. Great start of the holidays ;-)

Enough now, more algebra and geometry to learn...

Ikke . 07:30:53 pm . Linux, Coding Corner
