| lh | 9ed821d | 2023-04-07 01:36:19 -0700 | [diff] [blame] | 1 | The following is the README for UFC-crypt, with those portions deleted | 
|  | 2 | that are known to be incorrect for the implementation used with the | 
|  | 3 | GNU C library. | 
|  | 4 |  | 
|  | 5 |  | 
|  | 6 | UFC-crypt: ultra fast 'crypt' implementation | 
|  | 7 | ============================================ | 
|  | 8 |  | 
|  | 9 | @(#)README	2.27 11 Sep 1996 | 
|  | 10 |  | 
|  | 11 | Design goals/non goals: | 
|  | 12 | ---------------------- | 
|  | 13 |  | 
|  | 14 | - Crypt implementation plugin compatible with crypt(3)/fcrypt. | 
|  | 15 |  | 
|  | 16 | - High performance when used for password cracking. | 
|  | 17 |  | 
|  | 18 | - Portable to most 32/64 bit machines. | 
|  | 19 |  | 
|  | 20 | - Startup time/mixed salt performance not critical. | 
|  | 21 |  | 
|  | 22 | Features of the implementation: | 
|  | 23 | ------------------------------ | 
|  | 24 |  | 
|  | 25 | - On most machines, UFC-crypt runs 30-60 times faster than crypt(3) when | 
|  | 26 | invoked repeated times with the same salt and varying passwords. | 
|  | 27 |  | 
|  | 28 | - With mostly constant salts, performance is about two to three times | 
|  | 29 | that of the default fcrypt implementation shipped with Alec | 
|  | 30 | Muffets 'Crack' password cracker. For instructions on how to | 
|  | 31 | plug UFC-crypt into 'Crack', see below. | 
|  | 32 |  | 
|  | 33 | - With alternating salts, performance is only about twice | 
|  | 34 | that of crypt(3). | 
|  | 35 |  | 
|  | 36 | - Requires 165 kb for tables. | 
|  | 37 |  | 
|  | 38 | Author & licensing etc | 
|  | 39 | ---------------------- | 
|  | 40 |  | 
|  | 41 | UFC-crypt is created by Michael Glad, email: glad@daimi.aau.dk, and has | 
|  | 42 | been donated to the Free Software Foundation, Inc. It is covered by the | 
|  | 43 | GNU library license version 2, see the file 'COPYING.LIB'. | 
|  | 44 |  | 
|  | 45 | NOTES FOR USERS OUTSIDE THE US: | 
|  | 46 | ------------------------------ | 
|  | 47 |  | 
|  | 48 | The US government limits the export of DES based software/hardware. | 
|  | 49 | This software is written in Aarhus, Denmark. It can therefore be retrieved | 
|  | 50 | from ftp sites outside the US without breaking US law. Please do not | 
|  | 51 | ftp it from american sites. | 
|  | 52 |  | 
|  | 53 | Benchmark table: | 
|  | 54 | --------------- | 
|  | 55 |  | 
|  | 56 | The table shows how many operations per second UFC-crypt can | 
|  | 57 | do on various machines. | 
|  | 58 |  | 
|  | 59 | |--------------|-------------------------------------------| | 
|  | 60 | |Machine       |  SUN*  SUN*   HP*     DecStation   HP     | | 
|  | 61 | |              | 3/50   ELC  9000/425e    3100    9000/720 | | 
|  | 62 | |--------------|-------------------------------------------| | 
|  | 63 | | Crypt(3)/sec |  4.6    30     15         25        57    | | 
|  | 64 | | Ufc/sec      |  220   990    780       1015      3500    | | 
|  | 65 | |--------------|-------------------------------------------| | 
|  | 66 | | Speedup      |   48    30     52         40        60    | | 
|  | 67 | |--------------|-------------------------------------------| | 
|  | 68 |  | 
|  | 69 | *) Compiled using special assembly language support module. | 
|  | 70 |  | 
|  | 71 | It seems as if performance is limited by CPU bus and data cache capacity. | 
|  | 72 | This also makes the benchmarks debatable compared to a real test with | 
|  | 73 | UFC-crypt wired into Crack. However, the table gives an outline of | 
|  | 74 | what can be expected. | 
|  | 75 |  | 
|  | 76 | Optimizations: | 
|  | 77 | ------------- | 
|  | 78 |  | 
|  | 79 | Here are the optimizations used relative to an ordinary implementation | 
|  | 80 | such as the one said to be used in crypt(3). | 
|  | 81 |  | 
|  | 82 | Major optimizations | 
|  | 83 | ******************* | 
|  | 84 |  | 
|  | 85 | - Keep data packed as bits in integer variables -- allows for | 
|  | 86 | fast permutations & parallel xor's in CPU hardware. | 
|  | 87 |  | 
|  | 88 | - Let adjacent final & initial permutations collapse. | 
|  | 89 |  | 
|  | 90 | - Keep working data in 'E expanded' format all the time. | 
|  | 91 |  | 
|  | 92 | - Implement DES 'f' function mostly by table lookup | 
|  | 93 |  | 
|  | 94 | - Calculate the above function on 12 bit basis rather than 6 | 
|  | 95 | as would be the most natural. | 
|  | 96 |  | 
|  | 97 | - Implement setup routines so that performance is limited by the DES | 
|  | 98 | inner loops only. | 
|  | 99 |  | 
|  | 100 | - Instead of doing salting in the DES inner loops, modify the above tables | 
|  | 101 | each time a new salt is seen. According to the BSD crypt code this is | 
|  | 102 | ugly :-) | 
|  | 103 |  | 
|  | 104 | Minor (dirty) optimizations | 
|  | 105 | *************************** | 
|  | 106 |  | 
|  | 107 | - combine iterations of DES inner loop so that DES only loops | 
|  | 108 | 8 times. This saves a lot of variable swapping. | 
|  | 109 |  | 
|  | 110 | - Implement key access by a walking pointer rather than coding | 
|  | 111 | as array indexing. | 
|  | 112 |  | 
|  | 113 | - As described, the table based f function uses a 3 dimensional array: | 
|  | 114 |  | 
|  | 115 | sb ['number of 12 bit segment']['12 bit index']['48 bit half index'] | 
|  | 116 |  | 
|  | 117 | Code the routine with 4 (one dimensional) vectors. | 
|  | 118 |  | 
|  | 119 | - Design the internal data format & uglify the DES loops so that | 
|  | 120 | the compiler does not need to do bit shifts when indexing vectors. | 
|  | 121 |  | 
|  | 122 | Revision history | 
|  | 123 | **************** | 
|  | 124 |  | 
|  | 125 | UFC patchlevel 0: base version; released to alt.sources on Sep 24 1991 | 
|  | 126 | UFC patchlevel 1: patch released to alt.sources on Sep 27 1991. | 
|  | 127 | No longer rebuilds sb tables when seeing a new salt. | 
|  | 128 | UFC-crypt pl0:	  Essentially UFC pl 1. Released to comp.sources.misc | 
|  | 129 | on Oct 22 1991. | 
|  | 130 | UFC-crypt pl1:    Released to comp.sources.misc in march 1992 | 
|  | 131 | * setkey/encrypt routines added | 
|  | 132 | * added validation/benchmarking programs | 
|  | 133 | * reworked keyschedule setup code | 
|  | 134 | * memory demands reduced | 
|  | 135 | * 64 bit support added |