Overclockaholics Forums - View Single Post

Sanmayce · #23 03-27-2012

Thanks a lot, Neuromancer. I regret that I didn't say exactly how to gather the results; you did a lot of editing, but none was needed. Sorry for misleading you.

Something is wrong with the qpress test: Process Time = 0.483 = 339%, which suggests 4 threads?!
Is this an AMD with 6 cores or 4 cores? AMD says that the 1090T has 6 cores.
http://shop.amd.com/us/All/ModelsPer...henomiix6black
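
A quick sketch of why 339% points at 4 threads rather than 6. It assumes the percentage is process (CPU) time relative to wall-clock time, which is my reading of the timer output; the function names are made up for illustration.

```python
import math


def busy_threads(cpu_percent: float) -> int:
    """Smallest whole number of fully busy threads that could
    accumulate this much CPU time (assumes percent = CPU / wall-clock)."""
    return math.ceil(cpu_percent / 100.0)


def elapsed_seconds(process_time: float, cpu_percent: float) -> float:
    """Wall-clock time implied by the process time and its percentage."""
    return process_time / (cpu_percent / 100.0)


# Figures quoted from the qpress run above.
threads = busy_threads(339.0)
wall = elapsed_seconds(0.483, 339.0)
```

With all 6 cores fully busy the figure would be expected somewhere between 500% and 600%, so 339% looks like 4 threads that were not kept fully busy.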

You gave me some valuable information about the AMD Phenom II X6 Black (45nm, 6 cores, 512KB L2, 6144KB L3); it was a missing and much-needed test. I am still an AMD fan despite their recent decline.

Some quick notes:

1]
Roughly speaking, I had some illusions that Railgun_Quadruplet_7Hasherezade (which uses a hashed approach) would shine, but once again the wonderful BNDM_64 eclipses the rest. I still need the full dump, though, in order to examine the exact behavior of all 4 functions across different patterns.

>... also seems wierd that the more times it found a phrase the worse performance was ...
The number of hits is not important; what matters is the length (and mainly the TYPE) of the phrase. This is the reason for my affection toward fine MEMMEM tuning: it needs careful analysis that takes into account different string ranges/lengths.

2]
Sadly, for some reason (I am puzzled here), the Yappy test brings bad news?!
YAPPY: [b 256K] bytes 206908949 -> 95947973 46.4% comp 48.3 MB/s uncomp 1038.5 MB/s
1038.5 MB/s vs 1385.9 MB/s (on i7 2600K tested by rickss69), nah.
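
To put a number on the gap, here is a small sketch that pulls the figures out of a Yappy result line and compares the decompression rate with the i7 one. The regular expression only assumes the layout of the single line quoted above, and `parse_yappy` is a made-up helper name, not part of Yappy.

```python
import re

# Layout assumed from the single result line quoted in this post.
YAPPY_LINE = re.compile(
    r"bytes\s+(\d+)\s*->\s*(\d+)\s+([\d.]+)%\s+"
    r"comp\s+([\d.]+)\s*MB/s\s+uncomp\s+([\d.]+)\s*MB/s"
)


def parse_yappy(line: str) -> dict:
    """Extract sizes and speeds from one Yappy result line."""
    m = YAPPY_LINE.search(line)
    if m is None:
        raise ValueError("not a Yappy result line: " + line)
    original, packed = int(m.group(1)), int(m.group(2))
    return {
        "original": original,
        "packed": packed,
        "ratio": packed / original,        # recomputed from the byte counts
        "comp_mb_s": float(m.group(4)),
        "uncomp_mb_s": float(m.group(5)),
    }


line = ("YAPPY: [b 256K] bytes 206908949 -> 95947973 46.4% "
        "comp 48.3 MB/s uncomp 1038.5 MB/s")
result = parse_yappy(line)

# Decompression speed relative to the i7 2600K figure quoted above.
relative = result["uncomp_mb_s"] / 1385.9
```

Here `relative` comes out at about 0.75, i.e. the 1090T decompresses roughly a quarter slower than the i7 2600K in this test.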

3]
Kazuya_PTHREADed: DEFAULT_THREAD_COUNT: 6
Kazuya_PTHREADed: DEFAULT_COMPRESSION_LEVEL: 3
Kazuya_PTHREADed: DEFAULT_COMPRESS_CHUNK_SIZE: 524288
Kazuya_PTHREADed: Decompression RAM-to-RAM performance: 2525MB/s

A sight for sore eyes, very pleasing indeed, but I am awfully greedy: I need 4400MB/s. Why? Here is why:
One of the nifty benefits of Lasse's lightning-fast Lempel-Ziv library is that it can boost sequential reads from external storage (HDDs, SSDs). For example, if you have a 520MB/s burst read (SATA III SSD), then you need x MB/s of decompression speed in order to double the effective burst load/read into physical/main RAM. The calculation is simple: at 520MB/s, traversing OSHO.TXT (197MB) takes 197/520 = 0.378s. With qpress, reading OSHO.TXT.qp (75MB) and decompressing it takes 75/520 + 197/2525 = 0.222s, which is ((0.378-0.222)/0.222)*100% = about 70% boosting. Now I want 2x520MB/s; this requires 0.378s/2 = 0.189s, so 75/520 + 197/x = 0.189, which gives x = 197/(0.189-(75/520)) = roughly 4400MB/s, a dream soon to come true.
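
The same arithmetic as a small sketch, so that other drive speeds or file sizes can be plugged in. It assumes the synchronous model used above (read the compressed file first, then decompress it, with no overlap); the function names are made up for illustration.

```python
def plain_read_time(size_mb: float, disk_mb_s: float) -> float:
    """Seconds to read the uncompressed file straight from the drive."""
    return size_mb / disk_mb_s


def compressed_read_time(packed_mb: float, size_mb: float,
                         disk_mb_s: float, decomp_mb_s: float) -> float:
    """Seconds to read the compressed file and then decompress it
    (synchronous model: the two steps do not overlap)."""
    return packed_mb / disk_mb_s + size_mb / decomp_mb_s


def required_decomp_rate(size_mb: float, packed_mb: float,
                         disk_mb_s: float, speedup: float) -> float:
    """Decompression rate (MB/s) needed to load `speedup` times faster
    than a plain read."""
    target = plain_read_time(size_mb, disk_mb_s) / speedup
    budget = target - packed_mb / disk_mb_s   # time left for decompression
    if budget <= 0:
        raise ValueError("reading the compressed file alone already takes too long")
    return size_mb / budget


# Figures from this post: OSHO.TXT 197MB, OSHO.TXT.qp 75MB,
# 520MB/s SSD, 2525MB/s qpress decompression.
plain = plain_read_time(197, 520)
packed = compressed_read_time(75, 197, 520, 2525)
boost = plain / packed - 1.0
needed = required_decomp_rate(197, 75, 520, 2.0)
```

Without rounding the intermediate times, `boost` comes out at about 70.5% and `needed` at about 4359MB/s, so the round 4400MB/s above is, if anything, slightly on the safe side.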

And all of this was measured with qpress (PTHREADed QuickLZ) in the dummy synchronous mode, which is slower than the asynchronous one.

4]
Intel's memcpy():
Simplicius says for 'memcpy' performance: 2676 MB/s

Microsoft's memcpy():
Simplicius says for 'memcpy' performance: 2782 MB/s

The pancake is turned: on Intel CPUs the first result (Intel compiler used) is better than the second (Microsoft compiler used), while here on the AMD it is the other way around.

I don't know whether the forum allows it, but the easiest way is to attach a ZIP file of all the resultant text files (the ones you have in NOTEPAD); it is less than 64KB. Alternatively, email the ZIP file to me at sanmayce@sanmayce.com. In future revisions (I want to gather results on some really overclocked monsters) my plan is to create a single HTML file (similar to EVEREST's report) out of all the resultant text files (7 so far) with a simple tool written in C. That way I will eliminate the torture you went through.