Free Rainbow Tables | Forum

Home of the Distributed Generator and Cracker

All times are UTC + 1 hour [ DST ]




Post new topic Reply to topic  [ 112 posts ]  Go to page Previous  1 ... 3, 4, 5, 6, 7, 8  Next
Author Message
PostPosted: 21 Nov 2008, 21:20 
Offline
Perfect Table

Joined: 12 May 2008, 11:02
Posts: 809
I'm not asking you to stop posting on this board.

I just think this thread is about optimisation in principle, not about whether or not to use CUDA.
I also think everyone agrees that we'll have to make the step to CUDA some day. But given the problems the new platform will cause, it would be more useful to have stable ground first, which would be a good hash function.



  
 
PostPosted: 21 Nov 2008, 21:21 
Offline
Dictionary

Joined: 05 Nov 2007, 01:55
Posts: 99
As a reply to both the above posts, all I can say is I would love to have a go at it, but I'm too broke to get myself a new card... :oops:
I'm praying for some dough at Christmas... :roll:

Regarding "worrying about CPU implementations", why not?
All I have done is implement MD4/MD5 as per the RFCs, plus a few optimizations for short lengths (0 to 16). The full, clean source code for MD4 is ~300 lines and it's way faster than OpenSSL's implementation...


 
PostPosted: 22 Nov 2008, 00:14 
Offline
Dictionary

Joined: 05 Nov 2007, 01:55
Posts: 99
This just popped into my mind:

Code:
aMask32  = crc32(nChainPos) ^ nTableIndex;
aMask128 = _mm_set1_epi32(aMask32);


Simple, yet effective!
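
For anyone wanting to try the mask idea, here is a minimal C sketch. It assumes a plain bitwise CRC-32 (the zlib polynomial) over the little-endian bytes of the chain position; the post doesn't say which crc32 routine or byte order was actually used, and the names crc32_bytes, step_mask and apply_mask are made up for illustration. The loop in apply_mask does in scalar code what _mm_set1_epi32 followed by _mm_xor_si128 would do on a 128-bit register.

Code:
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-32 with the reflected polynomial 0xEDB88320 (zlib variant).
 * Assumption: the original post does not say which CRC routine it used. */
static uint32_t crc32_bytes(const unsigned char *data, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++)
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return crc ^ 0xFFFFFFFFu;
}

/* Mask for one chain step: CRC-32 of the chain position (hashed as four
 * little-endian bytes, an assumption) XORed with the table index. */
static uint32_t step_mask(uint32_t chain_pos, uint32_t table_index)
{
    unsigned char le[4] = {
        (unsigned char)(chain_pos),
        (unsigned char)(chain_pos >> 8),
        (unsigned char)(chain_pos >> 16),
        (unsigned char)(chain_pos >> 24)
    };
    return crc32_bytes(le, sizeof le) ^ table_index;
}

/* XOR the same 32-bit mask into all four words of a 128-bit hash.
 * Scalar equivalent of broadcasting the mask and XORing with SSE2. */
static void apply_mask(uint32_t hash[4], uint32_t mask32)
{
    assert(hash != NULL);
    for (int i = 0; i < 4; i++)
        hash[i] ^= mask32;
}
```

Since the mask depends on both the chain position and the table index, each step of each table perturbs the hash differently, and applying the same mask twice restores the original words.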


Update: adding bench/corrected values :)

$ gcc -O3 -march=prescott -msse2 crc32.c -o crc32
$ ./crc32
Code:
Running benchmark, please wait...
Hash performance:  68.293M/sec


 
PostPosted: 22 Nov 2008, 01:14 
Offline
Site Admin

Joined: 11 Oct 2007, 21:17
Posts: 1618
Location: Copenhagen, Denmark
the_drag0n wrote:
It is true, but as there is no one to write CUDA stuff we can't use it.



Not exactly true. As Bitweasil says, it's not that hard.
The reasons the RainbowCrack code hasn't been ported to CUDA are:
1) I was busy with the DistrRTgen client (believe me, it took a lot of time!)
2) The current RainbowCrack code doesn't translate very well to CUDA. Doing 64-bit divides on CUDA is almost impossible according to my results.

So there won't be CUDA code before we find a new reduction function, which is a work in progress.
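
To show where the 64-bit divides come from, here is a rough C sketch of a RainbowCrack-style reduction as I understand it: the hash is turned into an index with a 64-bit modulo, and the index is turned into a plaintext with one 64-bit divide and modulo per character. This is simplified to a fixed plaintext length, and the function and parameter names (hash_to_index, index_to_plain, reduce_offset, plain_space) are illustrative, not the project's actual code.

Code:
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hash -> index. The first 8 hash bytes are read as a little-endian
 * 64-bit value; the final modulo is a full 64-bit division. */
static uint64_t hash_to_index(const unsigned char hash[8], uint64_t reduce_offset,
                              uint64_t chain_pos, uint64_t plain_space)
{
    assert(plain_space > 0);
    uint64_t h = 0;
    for (int i = 7; i >= 0; i--)
        h = (h << 8) | hash[i];
    return (h + reduce_offset + chain_pos) % plain_space;
}

/* Index -> fixed-length plaintext. Each character costs one 64-bit
 * modulo and one 64-bit divide. 'out' must hold len + 1 bytes. */
static void index_to_plain(uint64_t index, const char *charset, size_t len, char *out)
{
    size_t n = strlen(charset);
    assert(n > 0);
    for (size_t i = len; i-- > 0; ) {
        out[i] = charset[index % n];
        index /= n;
    }
    out[len] = '\0';
}
```

For an 8-character plaintext this is nine 64-bit divisions per chain link on top of the hash itself, which is why a divide-free reduction matters so much on hardware without fast 64-bit division.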


 
PostPosted: 22 Nov 2008, 16:21 
Offline
Dictionary

Joined: 05 Nov 2007, 01:55
Posts: 99
Here are preliminary bench results for "octo" MD5 (SSE2) on a single core of a C2D 4500, 2.2 GHz:

$ ./md5_sse
Code:
Running benchmark, please wait...
Hash performance:  2.347M/sec (18.779M/sec total)


My guess is that a fast quad-core CPU could do about 100M/s, which is in the same order of magnitude as a CUDA implementation.

Will update with MD4 results soon.


 
PostPosted: 22 Nov 2008, 16:53 
Offline
Guesser

Joined: 15 Nov 2008, 20:45
Posts: 34
Hi,
Will you post your implementation of MD4 as well? It would be interesting to see how much better it performs than the one from Bitweasil. When you say you optimize for lengths <16, do you mean it's not possible to use lengths >=16, or just that it's slower?
Corni


 
PostPosted: 22 Nov 2008, 17:08 
Offline
Dictionary

Joined: 05 Nov 2007, 01:55
Posts: 99
Corni wrote:
Hi,
Will you post your implementation of MD4 as well? It would be interesting to see how much better it performs than the one from Bitweasil. When you say you optimize for lengths <16, do you mean it's not possible to use lengths >=16, or just that it's slower?
Corni


Works for any length, though 0 through 15 is faster.
The code isn't "production ready" (needs cleaning up, merging of the 4/8 hash SSE variants, etc.), so no actual release yet, but PM me for a link to the current working version.

I'd like to benchmark the one from Bitweasil, if I can find it...
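
One plausible reason short inputs are faster (a guess, since the actual code isn't posted here): per the MD4/MD5 RFCs, a message of up to 15 bytes fits in a single 64-byte block together with its padding, and only words 0 to 3 and word 14 of that block can be non-zero. A minimal C sketch of preparing such a block, with the hypothetical name prepare_short_block:

Code:
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Build the 16 little-endian 32-bit words of the single MD4/MD5 block
 * for a message of at most 15 bytes. Words 4..13 and 15 stay zero, so an
 * optimized round function can treat them as constants. */
static void prepare_short_block(const unsigned char *msg, size_t len, uint32_t w[16])
{
    assert(msg != NULL && len <= 15);
    unsigned char buf[16] = {0};
    memcpy(buf, msg, len);
    buf[len] = 0x80;                    /* padding: a single 1 bit, then zeros */
    memset(w, 0, 16 * sizeof w[0]);
    for (int i = 0; i < 4; i++) {
        w[i] = (uint32_t)buf[4 * i]
             | (uint32_t)buf[4 * i + 1] << 8
             | (uint32_t)buf[4 * i + 2] << 16
             | (uint32_t)buf[4 * i + 3] << 24;
    }
    w[14] = (uint32_t)len * 8u;         /* message length in bits, low word */
}
```

With the zero words known in advance, the additions of those words can be dropped from the round code, and there is never a second block to process.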


 
PostPosted: 22 Nov 2008, 17:40 
Offline
Guesser

Joined: 15 Nov 2008, 20:45
Posts: 34
You'll get a PM in a minute ;)
Here's Bitweasil's implementation:
viewtopic.php?f=6&t=904&start=0#p7940
Edit: you have disallowed PMs from normal members...


 
PostPosted: 22 Nov 2008, 17:56 
Offline
Dictionary

Joined: 05 Nov 2007, 01:55
Posts: 99
Corni wrote:
You'll get a PM in a minute ;)
Here's Bitweasil's implementation:
viewtopic.php?f=6&t=904&start=0#p7940
Edit: you have disallowed PMs from normal members...


Oh, that's just the "reference" algorithm, not a full implementation. If you want a full MD4 (and not a version with reversed steps) you pretty much have to follow that...
The optimization is in preparing variables, moving memory from one place to another, the order of instructions, etc...


 
PostPosted: 22 Nov 2008, 23:52 
Offline
Rainbow Table

Joined: 02 Aug 2008, 08:09
Posts: 230
I can benchmark code on a Core 2 Quad 2.33, if anyone is interested. I've got access to a few of those. PM me.

My code is just the reference implementation, with unused inputs zeroed out, and only a single pass. I didn't do anything fancy with it.

With the reduction function, you're probably looking at around 80M links/s on a quadcore, which is quite respectable and a very significant improvement on the RainbowCrack implementation.


 
PostPosted: 23 Nov 2008, 00:42 
Offline
Developer

Joined: 30 Mar 2008, 15:37
Posts: 865
I have a Core 2 Quad 2.66 GHz @ 3.2 GHz on WinXP 32-bit if you want to run a test ;)


 
PostPosted: 23 Nov 2008, 01:21 
Offline
Rainbow Table

Joined: 02 Aug 2008, 08:09
Posts: 230
... far, FAR too many people run Windows on otherwise perfectly good hardware.

I suppose I need a copy of Visual Studio.


 
PostPosted: 23 Nov 2008, 01:37 
Offline
Shoulder Surfer

Joined: 19 Sep 2008, 04:42
Posts: 15
Visual Studio Express is free, and should have everything you need on it.


 
PostPosted: 23 Nov 2008, 16:10 
Offline
Dictionary

Joined: 05 Nov 2007, 01:55
Posts: 99
Bitweasil wrote:
... far, FAR too many people run Windows on otherwise perfectly good hardware.

I suppose I need a copy of Visual Studio.


So true
I just run XP on VirtualBox if I really need to; it takes 3 seconds from click to working desktop :D
I do my coding on Linux, but I also need to get VS installed for building/testing.


 
PostPosted: 25 Nov 2008, 01:22 
Offline
Dictionary

Joined: 05 Nov 2007, 01:55
Posts: 99
Just to sum up a few things on possible optimizations:

  • New reduction function
    - One possible method is an XOR mask, along with a fixed plaintext length and a "mapping" of at most 16 bytes to the plaintext, each byte selecting a position in the charset.
    - Another method would be to use a PRNG; this needs more detail on a specific implementation.
    - Expected gain: depends on the hash algorithm being used, but a figure of 30% for the whole chain isn't too far off.
  • Computation of chains in parallel
    - To allow the use of SSE-optimized (and possibly CUDA) versions, hashes need to be computed in parallel across different chains.
    - Also, the same app could be multi-threaded, using all the CPU cores in a single instance.
    - Expected gain: from some benchmarks of an 8-in-1 MD5 SSE2/3 version, expect a 200% to 350% gain for a single core (for the hashing alone).
  • Optimized hash routines
    - Implement optimized hash routines and drop OpenSSL.
    - Speed gains for single-hash versions vary, but I would expect anything between 10% and 100%. Also, see above for multiple-hash routines.
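
The "chains in parallel" point can be sketched in C as several independent chains advanced in lockstep, one per lane, which is the layout an SSE or CUDA hash kernel would want. The sketch below is only about the data layout: toy_mix is a made-up stand-in and NOT MD4/MD5, and LANES, step_chains and run_chains are hypothetical names.

Code:
```c
#include <assert.h>
#include <stdint.h>

#define LANES 4   /* one chain per lane; an SSE2 register holds 4 x 32 bits */

/* Toy stand-in for the hash + reduction of one chain link.
 * NOT a real hash; it only gives each lane some work to do. */
static uint32_t toy_mix(uint32_t x)
{
    x ^= x >> 16;
    x *= 0x45D9F3Bu;
    x ^= x >> 16;
    return x;
}

/* Advance LANES independent chains by one link, all at the same chain
 * position. A SIMD version would replace this loop with vector ops. */
static void step_chains(uint32_t state[LANES], uint32_t chain_pos)
{
    for (int lane = 0; lane < LANES; lane++)
        state[lane] = toy_mix(state[lane]) ^ chain_pos;
}

/* Walk all chains from their start points to position chain_len. */
static void run_chains(uint32_t state[LANES], uint32_t chain_len)
{
    assert(state != 0);
    for (uint32_t pos = 0; pos < chain_len; pos++)
        step_chains(state, pos);
}
```

Because every lane sits at the same chain position, position-dependent values such as a reduction mask are computed once per step and shared by all lanes.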

A note on the first point:
- That idea for a new reduction function would mean fixed-length plaintexts... not sure how this would impact others (1 table per length).
- The relative speed gains can be much higher if the hashing itself (using new hash routines) is also faster, because the current reduction function would then take up a larger share of the time per chain link.
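
A minimal C sketch of what such a fixed-length XOR-mask reduction might look like, under my own assumptions: the first plain_len hash bytes are XORed with the bytes of a 32-bit mask (reused every 4 bytes) and each result picks a charset character via modulo. The name reduce_xor and its parameters are hypothetical, and the modulo is slightly biased whenever the charset size does not divide 256.

Code:
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Map a 16-byte hash to a fixed-length plaintext without any 64-bit
 * division. 'out' must hold plain_len + 1 bytes. */
static void reduce_xor(const unsigned char hash[16], uint32_t mask32,
                       const char *charset, size_t plain_len, char *out)
{
    size_t n = strlen(charset);
    assert(n > 0 && n <= 256 && plain_len <= 16);
    for (size_t i = 0; i < plain_len; i++) {
        /* Byte i of the mask, repeating every 4 bytes. */
        unsigned char m = (unsigned char)(mask32 >> (8 * (i % 4)));
        /* One small modulo per character; n is at most 256. */
        out[i] = charset[(unsigned char)(hash[i] ^ m) % n];
    }
    out[plain_len] = '\0';
}
```

The per-character modulo works on a single byte, so it could also be replaced by a 256-entry lookup table per charset.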

Feel free to discuss these points / add more ideas.

