[fpc-pascal] Sorted map vs hash map ?

Discussion:

Serguei TARASSOV

2015-07-20 19:12:32 UTC

Hi all,

I did a small test to compare performance of TFPGMap and TFPHashList in
sequential and random accessing values by keys.
http://arbinada.com/main/en/node/1511

The results are not the same than expected.
In theory, the hash map should give O(1) and O(log2 N) for the sorted map.

Any explanations and suggestions are welcome.

Regards,
Serguei
_______________________________________________
fpc-pascal maillist - fpc-***@lists.freepascal.org
http://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-pascal

Michael Van Canneyt

2015-07-21 10:13:33 UTC

Permalink

Post by Serguei TARASSOV
Hi all,
I did a small test to compare performance of TFPGMap and TFPHashList in
sequential and random accessing values by keys.
http://arbinada.com/main/en/node/1511
The results are not the same than expected.
In theory, the hash map should give O(1) and O(log2 N) for the sorted map.
Any explanations and suggestions are welcome.

In my opinion there is a simple explanation:

As a general solution, using generics will always be slower than 'native' classes.
The reason is that any generic implementation which does not place restrictions on the types used,
will be forced to use CompareMem() and Move() for its implementation (as do the classes in fgl).
Even when the generic class is otherwise optimally programmed, calling these routines will
always be slower than a direct comparision in 2 CPU registers in case of integers/pointers.

Michael.
_______________________________________________
fpc-pascal maillist - fpc-***@lists.freepascal.org
http://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-pascal

Sven Barth

2015-07-21 12:25:34 UTC

Permalink

Post by Michael Van Canneyt

Post by Serguei TARASSOV
Hi all,
I did a small test to compare performance of TFPGMap and TFPHashList in

sequential and random accessing values by keys.

Post by Michael Van Canneyt

Post by Serguei TARASSOV
http://arbinada.com/main/en/node/1511
The results are not the same than expected.
In theory, the hash map should give O(1) and O(log2 N) for the sorted map.
Any explanations and suggestions are welcome.

As a general solution, using generics will always be slower than 'native' classes.
The reason is that any generic implementation which does not place

restrictions on the types used,

Post by Michael Van Canneyt
will be forced to use CompareMem() and Move() for its implementation (as

do the classes in fgl). Even when the generic class is otherwise optimally
programmed, calling these routines will always be slower than a direct
comparision in 2 CPU registers in case of integers/pointers.

You are wrong with your assumptions regarding TFPGList: it does not use
Move(). The base implementation in TFPSList does, but the generic class
overrides that with an assignment. Otherwise for example Strings wouldn't
work with TFPGList.

Regards,
Sven

Michael Van Canneyt

2015-07-21 14:32:39 UTC

Permalink

Post by Serguei TARASSOV

Post by Michael Van Canneyt

Post by Serguei TARASSOV
Hi all,
I did a small test to compare performance of TFPGMap and TFPHashList in

sequential and random accessing values by keys.

Post by Michael Van Canneyt

Post by Serguei TARASSOV
http://arbinada.com/main/en/node/1511
The results are not the same than expected.
In theory, the hash map should give O(1) and O(log2 N) for the sorted

map.

Post by Michael Van Canneyt

Post by Serguei TARASSOV
Any explanations and suggestions are welcome.

As a general solution, using generics will always be slower than 'native'

classes.

Post by Michael Van Canneyt
The reason is that any generic implementation which does not place

restrictions on the types used,

Post by Michael Van Canneyt
will be forced to use CompareMem() and Move() for its implementation (as

do the classes in fgl). Even when the generic class is otherwise optimally
programmed, calling these routines will always be slower than a direct
comparision in 2 CPU registers in case of integers/pointers.
You are wrong with your assumptions regarding TFPGList: it does not use
Move(). The base implementation in TFPSList does, but the generic class
overrides that with an assignment. Otherwise for example Strings wouldn't
work with TFPGList.

I am glad it is not as bad as I thought :)

Michael.
_______________________________________________
fpc-pascal maillist - fpc-***@lists.freepascal.org
http://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-pascal

Sven Barth

2015-07-21 15:37:24 UTC

Permalink

Post by Michael Van Canneyt

Post by Serguei TARASSOV

Post by Michael Van Canneyt

Post by Serguei TARASSOV
Hi all,
I did a small test to compare performance of TFPGMap and TFPHashList in

sequential and random accessing values by keys.

Post by Michael Van Canneyt

Post by Serguei TARASSOV
http://arbinada.com/main/en/node/1511
The results are not the same than expected.
In theory, the hash map should give O(1) and O(log2 N) for the sorted

map.

Post by Michael Van Canneyt

Post by Serguei TARASSOV
Any explanations and suggestions are welcome.

As a general solution, using generics will always be slower than 'native'

classes.

Post by Michael Van Canneyt
The reason is that any generic implementation which does not place

restrictions on the types used,

Post by Michael Van Canneyt
will be forced to use CompareMem() and Move() for its implementation (as

do the classes in fgl). Even when the generic class is otherwise optimally
programmed, calling these routines will always be slower than a direct
comparision in 2 CPU registers in case of integers/pointers.
You are wrong with your assumptions regarding TFPGList: it does not use
Move(). The base implementation in TFPSList does, but the generic class
overrides that with an assignment. Otherwise for example Strings wouldn't
work with TFPGList.

I am glad it is not as bad as I thought :)

I had the same feeling as you when I looked at the code the first time, but
then I found out how it is really intended ^^

Regards,
Sven

Serguei TARASSOV

2015-07-21 12:22:40 UTC

Permalink

The problem was in the proper generation of random keys for testing.
Now it looks better.
I added statistics with different key's density ("normal" usage is
unique keys)

Subject: [fpc-pascal] Sorted map vs hash map ?
Content-Type: text/plain; charset=windows-1252; format=flowed
Hi all,
I did a small test to compare performance of TFPGMap and TFPHashList in
sequential and random accessing values by keys.
http://arbinada.com/main/en/node/1511
The results are not the same than expected.
In theory, the hash map should give O(1) and O(log2 N) for the sorted map.
Any explanations and suggestions are welcome.
Regards,
Serguei

_______________________________________________
fpc-pascal maillist - fpc-***@lists.freepascal.org
http://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-pascal