Code Poetry
and Text Adventures

by catid posted (15 days ago) 9:35pm Sun. Feb 28th 2010 PST
I've been working restlessly on the server-side of my online game for months now, building up a good framework from scratch.  That effort has just started to pay off:

32-bit Splane Chat Client: http://catid.org/code/admin_client32.zip (Windows XP+)
64-bit Splane Chat Client: http://catid.org/code/admin_client64.zip (Windows XP+)

While a lot of the framework is forward-... (more)
last edit by catid edited (2 days ago) 9:39pm Sat. Mar 13th 2010 PST
by catid posted (>30 days ago) 7:10pm Mon. Jan 25th 2010 PST
I threw together a little command-line tool for Windows that can proxy an active-mode FTP connection (where the client PC acts as a server for data transfer) and dumps the data transfers (GET/PUT, etc) to disk in hex to help debug a problem at work.  Source included: http://catid.org/code/ftpproxy.zip
last edit by catid edited (>30 days ago) 7:12pm Mon. Jan 25th 2010 PST
by catid posted (>30 days ago) 12:54pm Sat. Jan 23rd 2010 PST
User logins are authenticated with two factors of identification: Password and key file.

The login process is performed after a secure tunnel is formed with the remote server.  So, as long as the server's public key is known ahead of time, man-in-the-middle attacks are nearly impossible.  The attacks against user authentication that remain only work if the server is compromised or impersonated.... (more)
last edit by catid edited (>30 days ago) 2:22pm Sat. Jan 23rd 2010 PST
by catid posted (>30 days ago) 9:53pm Mon. Jan 18th 2010 PST
Here's an application I wrote to grab and examine all of the ban ids from an earlier post.



You will note that it attempts to determine if the ID is strong enough to be used as a unique ban id, or if it is weak and cannot be used to uniquely identify a user by itself.  The difference is usually whether or not there is a serial numb... (more)
last edit by catid edited (>30 days ago) 10:36am Wed. Jan 20th 2010 PST
by catid posted (>30 days ago) 3:35pm Mon. Jan 18th 2010 PST
Continuing from the previous post (http://catid.mechafetus.com/news/news.php?view=209):

If a game developer has implemented most or all of the ban ids from the previous post, then chances are the best way to bypass a ban in the game is to modify the client code.  It is normally a lot less work to modify the client code to change the values being sent across than to write a lot of cod... (more)
last edit by catid edited (>30 days ago) 4:34pm Mon. Jan 18th 2010 PST
by catid posted (>30 days ago) 1:11am Mon. Jan 18th 2010 PST
When players get themselves into trouble by being obnoxious, the only sensible thing to do is ban them for a while.

Players don't like being banned, so they'll fight back by trying to get around the ban.  I have come up with an exhaustive list of all of the ways you can make this difficult for someone on a Windows PC.  A lot of these methods will apply to other operating systems.


First off... (more)
last edit by catid edited (>30 days ago) 5:42am Wed. Jan 20th 2010 PST
by catid posted (>30 days ago) 9:24am Fri. Jan 15th 2010 PST
There will be a few game servers to write:

Login Server (handles user logins and passes users off to zone servers)
Login Database Server (stores user names and passwords)
Zone Server (for actual gameplay)
Zone Database Server (stores user state between sessions)

For each of these servers, I need to write an administrator tool, which will then be distributed to the moderators for the syste... (more)
last edit by catid edited (>30 days ago) 11:33am Mon. Jan 18th 2010 PST
by catid posted (>30 days ago) 10:26am Mon. Jan 11th 2010 PST
I just noticed a problem with UDP-based game servers that use one port to handle many users:

The kernel's receive buffer size set by SO_RCVBUF is allocated once per port.  If someone sends a ton of UDP data to that port it will eat the receive buffer space and cause lots of packet loss for legitimate users.

I don't see any way for the server to filter out excessive data if someone is able to... (more)
last edit by catid edited (>30 days ago) 8:54pm Mon. Jan 11th 2010 PST
by catid posted (>30 days ago) 10:23am Tue. Jan 5th 2010 PST
Based on what I have envisioned needing for my game, the following protocol is being used after spending weeks looking at alternatives and false-starts.  It provides signaled disconnects, fragmentation, MTU discovery, time synchronization, three reliable ordered streams, one unordered reliable stream, and unreliable delivery.

The Transport object that implements a sender/receiver in the protoco... (more)
last edit by catid edited (>30 days ago) 7:40pm Mon. Jan 11th 2010 PST
by catid posted (>30 days ago) 5:58am Tue. Jan 5th 2010 PST
http://www.loria.fr/~zimmerma/mca/pub226.html

I find the HAC to be more friendly.
last edit by catid edited (>30 days ago) 1:54pm Tue. Jan 5th 2010 PST
by catid posted (>30 days ago) 5:08pm Sun. Jan 3rd 2010 PST
Once a link is established between two endpoints on the Internet, the link MTU must be known in order to maximize transport performance.  With higher MTU, transmitting data requires fewer packets, so less overhead per packet.  Furthermore, many Internet routers balance bandwidth usage with the number of packets sent rather than the number of bytes sent, so fewer packets is better.

Online games ... (more)
last edit by catid edited (>30 days ago) 7:11am Wed. Jan 6th 2010 PST
by catid posted (>30 days ago) 8:23pm Mon. Dec 21st 2009 PST
or "Actors in blue suits playing actors in blue suits"










last edit by catid edited (>30 days ago) 9:52pm Mon. Dec 21st 2009 PST
by catid posted (>30 days ago) 9:45am Tue. Dec 15th 2009 PST
To hash an IP address and a port to get a hash table key, I have found the following function leads to nearly perfect uniform insertion across the hash table:

[code]
u32 ConnectionMap::hash_addr(IP ip, Port port, u32 salt)
{
  u32 a = salt ^ ip;

  // Thomas Wang's integer hash function
  // http://www.cris.com/~Ttwang/tech/inthash.htm
  a = (a ^ 61) ^ (a >> 16);
  a = a + (a << 3);
  a = a ... (more)
last edit by catid edited (>30 days ago) 7:31am Wed. Dec 16th 2009 PST
by catid posted (>30 days ago) 7:48am Sat. Dec 12th 2009 PST
Reconstruct a 32-bit or 64-bit counter that increments by one each time, given a truncated sample of its low bits, and the last accepted value of the counter:

[code]
template<int BITS, typename T> CAT_INLINE T ReconstructCounter(T last_accepted_count, u32 partial_low_bits)
{
  const u32 IV_MSB = (1 << BITS); // BITS < 32
  const u32 IV_MASK = (IV_MSB - 1);

  s32 diff = partial_low_bits - (u... (more)
last edit by catid edited (>30 days ago) 8:23am Sat. Dec 12th 2009 PST
by catid posted (>30 days ago) 2:16pm Tue. Dec 1st 2009 PST
I squeezed about 8% more performance out of the handshake by implementing a better squaring algorithm.  The new algorithm is recursive.  It performs two 4x4 Karatsuba squares, one 4x4 schoolbook multiply, and then shifts a product and adds it in to implement an 8x8 Karatsuba square (8 x 32 = 256 bits).  And it's written in C++ so all 32-bit platforms will benefit from this change.

The schoolboo... (more)
last edit by catid edited (>30 days ago) 2:25pm Tue. Dec 1st 2009 PST
by catid posted (>30 days ago) 5:35pm Wed. Nov 18th 2009 PST
On a 32-bit Intel Xeon E5420 @ 2.50 GHz running Windows XP, I get these timing results from MSVC2008:

Public key handshake algorithms:
Key pair generation = 671 usec (server, one-time only)
Generating challenge = 565 usec (client)
Challenge processing = 624 usec (server)
Validating answer = 721 usec (client)

Encryption algorithms:
Decryption and MAC = 14 usec (varies with packet length,... (more)
last edit by catid edited (>30 days ago) 2:25pm Tue. Dec 1st 2009 PST
by catid posted (>30 days ago) 6:25pm Fri. Nov 6th 2009 PST
http://catid.org/code/QtDemo5.zip

This one is a sphere-normalized cube like the Windows Flower Box screen-saver except without the lighting.  Uses VBO geometry and quaternion camera rotation.

I did this mainly because the vertex coordinate generation can be optimized very far:

[code]
R = number of vertices across one edge of the cube, should be an even number
S = R / 2

#de... (more)
last edit by catid edited (>30 days ago) 6:35pm Fri. Nov 6th 2009 PST
by catid posted (>30 days ago) 3:34pm Thu. Nov 5th 2009 PST
http://catid.mechafetus.com/code/QtDemo4.zip

This one combines the last two to display 10,000 cubes moving in real time to some simple modulation.  Check the release folder for the executables... I included two for different effects that I played with.

Working on generating good spheres right now, and integrating OpenAL for music.
by catid posted (>30 days ago) 11:31am Mon. Nov 2nd 2009 PST
http://catid.mechafetus.com/code/QtDemo3.zip

This demo displays two spinning cubes using VBO when possible.  I'm using GLee to manage my OGL extensions (http://www.elf-stone.com/glee.php).  It's slick and easy to use.
last edit by catid edited (>30 days ago) 11:33am Mon. Nov 2nd 2009 PST
by catid posted (>30 days ago) 5:58am Mon. Nov 2nd 2009 PST
http://catid.org/code/QtDemo2.zip

This one looks about the same except it's using the new linear algebra classes in LibCat to generate the camera rotation matrix.  As an aside, QtCreator fully supports using CMakeLists.txt as a native project file format, so it is trivial to build and link LibCat in QtCreator.
by catid posted (>30 days ago) 6:52am Fri. Oct 30th 2009 PDT
http://www.derkeiler.com/pdf/Newsgroups/sci.crypt/2008-12/msg00326.pdf

I want to be able to write this type of generator.

If I was doing it, I would use the McEliece cryptosystem using the recent work by Bernstein and Lange (http://cr.yp.to/papers.html#mceliec and http://cr.yp.to/papers.html#goppalist) and the signature construction from [url]http://www.crypt... (more)
last edit by catid edited (>30 days ago) 6:40pm Fri. Nov 6th 2009 PST
by catid posted (>30 days ago) 9:21pm Sun. Oct 18th 2009 PDT
Something simple: Two rotating cubes. The one on the left is multisampled and the one on the right is not.

http://catid.org/code/QtDemo1.zip
The binaries are under ./release/.

I did this demo to understand the basics of 3D with OpenGL and Qt Creator.  I've ordered some books on OpenGL:

"OpenGL Distilled" Paul Martz
"OpenGL SuperBible: Comprehensive Tutorial and Reference (4th... (more)
last edit by catid edited (>30 days ago) 6:02am Thu. Oct 22nd 2009 PDT
by catid posted (>30 days ago) 12:16pm Fri. Oct 16th 2009 PDT
Since I have decided to use 32-bit floats as the primary data type for my new game engine's graphical code, I have taken the time to re-familiarize myself with the many pitfalls of floating-point arithmetic.  The floating-point performance of the C language is another great reminder why game developers like to use C.

Here are some interesting links related to floating-point math:

Quick guide... (more)
last edit by catid edited (>30 days ago) 8:39am Sat. Oct 17th 2009 PDT
by (Anonymous) posted (>30 days ago) 7:22am Wed. Oct 14th 2009 PDT
I was going to do more with this but since I never did, here's my original The World hypertext game:

http://catid.mechafetus.com/the_world
by (Anonymous) posted (>30 days ago) 10:19am Tue. Oct 13th 2009 PDT
last edit by catid edited (>30 days ago) 7:50am Thu. Nov 5th 2009 PST
by (Anonymous) posted (>30 days ago) 10:04am Wed. Oct 7th 2009 PDT
Bono is an international humanitarian of celebrity scale and he puts on one hell of a great concert.


Bono on the Space Station monitor.  In his way he says they built "this.. insanity up there.. to bring ourselves closer to you.  As foolish as that sounds it really is what we wanted."

The big draw for this concert was... (more)
last edit by (Anonymous) edited (>30 days ago) 12:40pm Wed. Oct 7th 2009 PDT
by (Anonymous) posted (>30 days ago) 9:36am Wed. Oct 7th 2009 PDT
The Atlanta Botanical Gardens are located in Piedmont Park in Atlanta, Georgia.  The gardens are large and impressive.  They include an extensive greenhouse environment that trills with Phantasmal Poison Frogs, orchids, high altitude jungle, and low altitude desert plants.

Henry Moore's large industrial and abstract sculptures are one of the primary features of the gardens.  Of the sculptures h... (more)
last edit by (Anonymous) edited (>30 days ago) 5:57pm Wed. Oct 7th 2009 PDT
by (Anonymous) posted (>30 days ago) 9:14am Wed. Oct 7th 2009 PDT
The Georgia Renaissance Faire is a sprawling medieval tribute located just south of Atlanta.  It is a great venue for jewelry crafters, tailors, acting and comedy troupes, and of course pirates are ALWAYS welcome.  You can walk around with a turkey leg and some mead and enjoy shows from noon to night without seeing the same one twice.  Of course there is a jousting tournament as well, which was aw... (more)
last edit by (Anonymous) edited (>30 days ago) 9:24am Wed. Oct 7th 2009 PDT
by (Anonymous) posted (>30 days ago) 6:14am Mon. Oct 5th 2009 PDT
Been playing in QtDesigner recently, experimenting with the Qt wrappers for OpenGL.  My impression of QtDesigner so far is that it will become my favorite C++ IDE, even though I've managed to crash it already.

Qt provides a complete development environment that is easy to port between common desktop operating systems.  Porting to the iPhone should be done specially anyhow, so I don't see a disa... (more)
by (Anonymous) posted (>30 days ago) 9:33pm Fri. Sep 25th 2009 PDT
Fucking lame Luddite perversion of sci-fi.

I can see why this movie was made.  It's because World of Warcraft has made online role-play a part of the public consciousness.  Suddenly terms like "mangina" make sense to your average moviegoer.  And people will get a joke about a fat guy controlling a cute girl.

It's not about robots -- it's about avatars.

Robots require power, they require i... (more)
by (Anonymous) posted (>30 days ago) 5:58am Tue. Sep 22nd 2009 PDT
Here's an MMORPG that has had a huge development cycle.  Check this out, it's a video from 2006 showing that it was almost finished by most standards: http://www.youtube.com/watch?v=U1LeEpB52o8.


They're still working on it.


I've ordered this game.  Aion looks amazing in videos.  I want to play this game mainly to appreciate the excellent 3D artwork and animation.  On top of th... (more)
last edit by (Anonymous) edited (>30 days ago) 7:41am Mon. Sep 28th 2009 PDT
by (Anonymous) posted (>30 days ago) 11:10am Sun. Sep 20th 2009 PDT


The heroes of the Trine are an unlikely trio: a boisterous warrior, a female thief, and a wizard who cannot shoot fireballs.  Through the power of the Trine they have been combined into a single body allowing just one of them to exist at a time.  They journey through side-scrolling worlds that are artistically rendered and lighted, sol... (more)
last edit by (Anonymous) edited (>30 days ago) 11:58am Sun. Sep 20th 2009 PDT
by (Anonymous) posted (>30 days ago) 8:53pm Thu. Sep 10th 2009 PDT
Since the whole story is never told in the movie,
the audience is left to fill in the blanks of what actually happened to lead up to this.  Here's what I got from it:

So there's this scientist who creates the first sentient AI, by copying his own brain into a box.  He also has somehow created a talisman that is able to manipulate the human soul (extract it, store it, give it to somethin... (more)
by (Anonymous) posted (>30 days ago) 3:26pm Wed. Sep 2nd 2009 PDT
I found these interesting stories on Schneier's blog over at http://www.schneier.com/blog/.

Reminds me of someone I knew a few years ago:
http://www.rollingstone.com/news/story/29787673/the_boy_who_heard_too_much/print

Here's a blog entry from a biohacker, which draws some fascinating parallels between biology and computing:
[url]http://www.bunniestudios.com/blog/?p=3... (more)
last edit by (Anonymous) edited (>30 days ago) 3:27pm Wed. Sep 2nd 2009 PDT
by (Anonymous) posted (>30 days ago) 6:04am Mon. Aug 31st 2009 PDT
Here are some timing results for a 1.2 GHz Core 2 with DDR2 @ 800, running 64-bit Linux:

Signatures:
Server: Signature generation time = 419 usec
Client: Verifying signature generation time = 479 usec

Key agreement:
Server: Processing challenge took 475 usec

The 64-bit version of LibCat is pretty damn fast, even on modest hardware!


On my Core i7 it takes about 341,994 cycles to:
... (more)
last edit by (Anonymous) edited (>30 days ago) 6:20am Tue. Sep 1st 2009 PDT
by (Anonymous) posted (>30 days ago) 8:03pm Thu. Aug 27th 2009 PDT
So I am trying to figure out if a curve over a 256-bit GF(p) with a generator point of order q (large prime) is secure against Tate-Lichtenbaum pairing, which allows the ECDLP to be solved on an equivalent curve in GF(p^k) using Weil descent.

The value of k is determined using the following process, according to how I am reading the Handbook of Elliptic and Hyperelliptic Curve Cryptography:

... (more)
last edit by (Anonymous) edited (>30 days ago) 3:33pm Fri. Aug 28th 2009 PDT
by (Anonymous) posted (>30 days ago) 8:34am Sun. Aug 23rd 2009 PDT
by (Anonymous) posted (>30 days ago) 9:55am Thu. Aug 20th 2009 PDT
256-bit, 384-bit and 512-bit curves are now secure again, and I have added code to the ECC test that verifies the security of the curve parameters.

Schnorr signatures are implemented and are pretty fast.

The new Tabby protocol has been implemented and optimized to be almost twice as fast as before on the server, by re-using ephemeral keys with some lockless code.


[b]New timing results:[... (more)
last edit by (Anonymous) edited (>30 days ago) 7:59pm Tue. Aug 25th 2009 PDT
by (Anonymous) posted (>30 days ago) 5:07am Wed. Aug 19th 2009 PDT
When using the friendly pseudo-Mersenne modulii of the form p = 2^256 - c...

Something I haven't noticed in any papers yet on Twisted Edwards curves over Fp but is important in practice is the fact that the addition and doubling laws have no exceptions for a=-1 and any suitable d that generates a large prime order subgroup with cofactor 4 so long as the field prime p = 1 mod 4.  Any p = 3 mod 4... (more)
last edit by (Anonymous) edited (>30 days ago) 5:44pm Sat. Aug 22nd 2009 PDT
by (Anonymous) posted (>30 days ago) 5:14pm Tue. Aug 18th 2009 PDT
I slapped these settings together really quick based on the OrenEllenbogen_DarkSchema.  It's dark and mellow -- easy on the eyes and not bright.  I added some more colors and changed the font.

http://catid.org/code/CatidMellow.vssettings
by (Anonymous) posted (>30 days ago) 9:31pm Mon. Aug 17th 2009 PDT
So, I had some trouble while implementing Schnorr Signatures (http://en.wikipedia.org/wiki/Schnorr_signature) for LibCat.  At first I thought there was something wrong with my interpretation of the algorithm.  It turned out to be much worse!

Right now I am searching for new secure curve parameters...

(Right-click : View ... (more)
last edit by (Anonymous) edited (>30 days ago) 6:08pm Wed. Aug 19th 2009 PDT
by (Anonymous) posted (>30 days ago) 8:22pm Sun. Jul 26th 2009 PDT
This humor in this game withstands the test of time, but the art and music do not...

Fortunately, LucasArts has released a Special Edition of the game with completely revamped artwork, music, sound, and voice-overs.  It's hilarious!  Try it, or taste the fizzy wrath of my root beer!

More info at the official site: http://www.lucasarts.com/games/monkeyisland/

Hint: While playing... (more)
last edit by (Anonymous) edited (>30 days ago) 8:28pm Sun. Jul 26th 2009 PDT
by (Anonymous) posted (>30 days ago) 6:06pm Wed. Jul 22nd 2009 PDT
Chromium Developers have invented a novel way to keep their users up to date with the latest version of their browser.  It's called Courgette (http://dev.chromium.org/developers/design-documents/software-updates-courgette), a disassembler and assembler that allows them to improve compression of executable file updates by about 88% over a naive transmit-the-difference approach.

When i... (more)
last edit by (Anonymous) edited (>30 days ago) 6:08pm Wed. Jul 22nd 2009 PDT
by (Anonymous) posted (>30 days ago) 11:23am Tue. Jul 21st 2009 PDT
The new site is hosted by Google Code, here: http://libcatid.googlecode.com/.  The LibCat link on the right will now navigate to this new site.

I am maintaining a wiki, bug tracker, and news for the project at the new address.

The code is now under the Lesser GNU General Public License, allowing usage in proprietary software.
last edit by (Anonymous) edited (>30 days ago) 5:21am Wed. Jul 22nd 2009 PDT
by (Anonymous) posted (>30 days ago) 6:07am Tue. Jul 21st 2009 PDT
by (Anonymous) posted (>30 days ago) 8:45am Mon. Jul 20th 2009 PDT
I have designed and implemented an efficient protocol for key agreement between a user on the Internet and a central server with a published public key.  I have named it "Tabby."

Tabby builds on the canonical 3-pass Elliptic Curve Diffie-Hellman (EC-DH) key exchange that incorporates proofs that both sides know the key.  Tabby improves on the security guarantee of EC-DH by adding Forward Secrec... (more)
last edit by (Anonymous) edited (>30 days ago) 10:04am Mon. Jul 20th 2009 PDT
by (Anonymous) posted (>30 days ago) 11:10am Fri. Jul 17th 2009 PDT
Based on Fortuna algorithm from "Practical Cryptography" section 10.3
Published 2003 by Niels Ferguson and Bruce Schneier

Fortuna supplements the operating system (OS) pseudo-random number generator (PRNG)
by incorporating additional entropy into the seeds that the OS provides.

Modified for use with Skein in PRNG mode, sharing the strengths of both algorithms.

My implementation of Fortu... (more)
last edit by (Anonymous) edited (>30 days ago) 11:20am Fri. Jul 17th 2009 PDT
by (Anonymous) posted (>30 days ago) 11:02am Tue. Jul 14th 2009 PDT
Some background

Skein is a SHA-3 candidate that looks very promising.  I use it as a KDF and a MAC for the key agreement part of my Tunnel protocol.  Skein is a 256, 512 and 1024 bit hash.  I use the 256-bit mode for the 256-bit security level of Tunnel, and I use the 512-bit mode for 384 and 512-bit security levels of Tunnel.

More information here: [url]http://catid.mechafetus.com/ne... (more)
last edit by (Anonymous) edited (>30 days ago) 11:39am Tue. Jul 14th 2009 PDT
by (Anonymous) posted (>30 days ago) 7:09am Mon. Jul 13th 2009 PDT
Notes

The following timings are for the entire server computation, including validation, point multiplication, and reduction to affine coordinates.  I feel it is a very fair comparison to timing of other public key cryptosystems.

Also note that the algorithms used are all resistant to timing attack.  This includes my implementation of memcmp(), point operations (ie. add, double, multi... (more)
last edit by (Anonymous) edited (>30 days ago) 10:03am Tue. Jul 14th 2009 PDT
by (Anonymous) posted (>30 days ago) 7:56pm Sun. Jul 12th 2009 PDT
When the client sends its public key to the server for key exchange, the time the server takes to respond given the particular input the client provided should be constant.  More loosely, the runtime should ideally be completely independent of the input.

I did some testing and found that the input can change the runtime by < 1%, less than a fraction of a microsecond, which is a safe margin if y... (more)
last edit by (Anonymous) edited (>30 days ago) 8:02pm Sun. Jul 12th 2009 PDT
by (Anonymous) posted (>30 days ago) 6:25pm Sun. Jul 12th 2009 PDT
In a finite field Fp, where p is a prime number, there is an efficient way to compute the multiplicative inverse of a number X:

X^-1 = X ^ [E(p) - 1], where E(p) is Euler's totient function.  Since p is a prime, E(p) = p - 1.

So to compute X ^ -1, you just have to compute X ^ (p - 2).

A great feature of this approach is that it is resistant to timing attacks while being very fast (just 16... (more)
last edit by (Anonymous) edited (>30 days ago) 8:06pm Sun. Jul 12th 2009 PDT
by (Anonymous) posted (>30 days ago) 11:39am Sat. Jul 11th 2009 PDT
On a 3.2 GHz Core i7 running 64-bit Windows 7 and compiled under Visual Studio 2008 with best optimization and hand-written assembly inner loops it now takes 100 microseconds on average to perform a 256-bit point multiplication.  There is still room for improvement.

It used to take 250 usec to do the same on a 4.0 GHz Core i7, so it is about 312.5% faster than before.


Here's a thought expe... (more)
last edit by (Anonymous) edited (>30 days ago) 12:10pm Sat. Jul 11th 2009 PDT
by (Anonymous) posted (>30 days ago) 5:52am Fri. Jul 10th 2009 PDT
I have spent the last week refactoring and fixing bugs in the new math code.  The entire math library was rewritten from scratch so that I can simply recompile to use 64-bit legs on 64-bit machines and that caused a ton of problems.

It is about 18% slower on 32-bit machines now.  This is expected and accepted.  Servers will be running the 64-bit version and clients hide their processing time in... (more)
by (Anonymous) posted (>30 days ago) 8:02pm Sun. Jun 28th 2009 PDT
It took me 9 hours but I have successfully implemented 64-bit assembly code that links with my Visual C++ code to implement Comba squaring and multiplication.  Squaring is about 85% of the runtime of multiplication, which is what I was expecting.  The hardest part is over, and the rest is refactoring what I have written already.

The new math library is pretty slick.  It's all wrapped in C++ cla... (more)
last edit by (Anonymous) edited (>30 days ago) 9:44pm Sun. Jun 28th 2009 PDT
by (Anonymous) posted (>30 days ago) 8:50am Sun. Jun 28th 2009 PDT
Do it on the server, but not the client.

The client computer running my network protocol does not need to be particularly fast.  Its public key operations are pipelined with network latency so even if it takes 100 milliseconds for the client to do its two ECC point multiplications, the user won't notice any extra delay.  For the client it makes sense to write portable C++ code to do the math, a... (more)
last edit by (Anonymous) edited (>30 days ago) 9:27am Sun. Jun 28th 2009 PDT
by (Anonymous) posted (>30 days ago) 4:52pm Tue. Jun 23rd 2009 PDT
last edit by (Anonymous) edited (>30 days ago) 7:25am Mon. Jul 13th 2009 PDT
by (Anonymous) posted (>30 days ago) 8:36am Thu. Jun 18th 2009 PDT
The new SHA-3 candidate cryptographic hash function from Bruce Schneier, Skein, is available here: http://www.skein-hash.info/
It was recently patched to version 1.1 to fix a bug.

It is one of the top two fastest SHA-3 candidates, beaten only by Blue Midnight Wish.  I decided to implement Skein first because it is well documented, has some clever example code, has resisted initial c... (more)
last edit by (Anonymous) edited (>30 days ago) 11:34am Fri. Jun 19th 2009 PDT
by (Anonymous) posted (>30 days ago) 7:03pm Wed. Jun 10th 2009 PDT
Following advice from Tom (of Tom's Fast Math Library), I've taken the first steps to making my common code re-usable by other developers.  Here's version 1.0 of libcat:

http://catid.org/code/libcat-1.0.zip

The current release of libcat includes a friendly Tunnel protocol for building a secure tunnel over the internet with minimal effort.  I have included an example server and cli... (more)
last edit by (Anonymous) edited (>30 days ago) 8:42pm Mon. Jun 15th 2009 PDT
by (Anonymous) posted (>30 days ago) 6:00pm Thu. Jun 11th 2009 PDT
by (Anonymous) posted (>30 days ago) 5:24pm Thu. Jun 11th 2009 PDT
by (Anonymous) posted (>30 days ago) 5:17pm Thu. Jun 11th 2009 PDT
by (Anonymous) posted (>30 days ago) 8:54am Tue. Jun 9th 2009 PDT
My ECC implementation (see http://catid.mechafetus.com/news/news.php?view=141) uses a Twisted Edwards curve as defined by axx + yy = 1 + dxxyy over Fp.  This curve was introduced by D. J. Bernstein, Peter Birkner, et al in their paper "Twisted Edwards Curves" (2008) [1].

My addition and doubling formulas are based on the Extended Projective coordinates introduced by H. Hisil, K. Wong... (more)
last edit by (Anonymous) edited (>30 days ago) 10:03pm Mon. Aug 17th 2009 PDT
by (Anonymous) posted (>30 days ago) 7:54am Mon. Jun 8th 2009 PDT
This is old news... The ECC code is now much faster, more secure and part of LibCat.

I've successfully implemented a 256-bit Elliptic Curve Diffie-Hellman key exchange library from scratch.

256-bit ECC is part of the NSA's Suite B of algorithms for protecting SECRET documents.  They require 384-bit ECC for TOP SECRET documents.

Here's the output of my VS2008 (release mode w/ full o... (more)
last edit by (Anonymous) edited (>30 days ago) 10:09pm Mon. Aug 17th 2009 PDT
by (Anonymous) posted (>30 days ago) 7:02am Sat. Jun 6th 2009 PDT
Here's the point multiplication code using w-MOF and Extended Twisted Edwards coordinates:

[code]
PACKED struct {
  u8 add_index; // nth odd number to add: 0=0,1=1,2=3,3=5,4=7,...
  u8 doubles_after; // number of doubles to perform after add
} MOF_LUT[128] = {
{0,0},{0,1},{1,0},{0,2},{2,0},{1,1},{3,0},{0,3},
{4,0},{2,1},{5,0},{1,2},{6,0},{3,1},{7,0},{0,4},
{8,0},{4,1},{9,0},{2,2},{10,0},{5... (more)
last edit by (Anonymous) edited (>30 days ago) 10:37am Thu. Jun 11th 2009 PDT
by (Anonymous) posted (>30 days ago) 6:46pm Thu. Aug 6th 2009 PDT
by (Anonymous) posted (>30 days ago) 3:03pm Fri. Jun 5th 2009 PDT
When doing w-MOF, choosing a window size changes how much precomputation is done versus additions while walking the bits of the number.  Ignoring zeroes where addition can be skipped, the cost of a window size w in point additions is:

cost = 2^(w-2) + bits/w

This cost function depends only on the window size and the number of bits.  We can optimize it:

[code]
d(work,w) = 2^(w-2)ln(2)dw +... (more)
last edit by (Anonymous) edited (>30 days ago) 1:50pm Sat. Jun 6th 2009 PDT
by (Anonymous) posted (>30 days ago) 8:54pm Tue. Jun 2nd 2009 PDT
I consider w-MOF to be an improvement upon w-NAF, since it is faster and requires no pre-computation but provides the same quality result.

http://www.sdl.hitachi.co.jp/crypto/mof/index-e.html

Here's a great paper that compares it to w-NAF: http://www.cdc.informatik.tu-darmstadt.de/reports/reports/Fan.diplom.wNAF_wMOF_final.pdf

It is simple enough that you won't want ... (more)
last edit by (Anonymous) edited (>30 days ago) 2:01pm Sat. Jun 6th 2009 PDT
by (Anonymous) posted (>30 days ago) 2:13pm Tue. Jun 2nd 2009 PDT
Mersenne primes are prime numbers of the form 2^N - 1.

For elliptic curve cryptography, the only suitable Mersenne primes are 2^127-1 and 2^521-1.  Of those two, the smaller prime can be used with the GLV method over an extension field, allowing for ~128-bit security.  The larger prime allows for ~256-bit security without any tricks.  One day I want to actually understand how the GLV method wor... (more)
last edit by (Anonymous) edited (>30 days ago) 2:14pm Tue. Jun 2nd 2009 PDT
by (Anonymous) posted (>30 days ago) 9:31am Tue. Jun 2nd 2009 PDT
I wrote an optimized, portable multiprecision addition algorithm that doesn't use 64-bit registers just to see if the performance is decent:

[code]
  // lhs += rhs, return carry out
  // precondition: lhs_limbs >= rhs_limbs
  u32 Add(u32 *lhs, int lhs_limbs, const u32 *rhs, int rhs_limbs)
  {
    u32 a, sum;
    int ii = 0;

    for (;;)
    {
      a = lhs[ii];

      if ((lhs[ii++] = a + rhs[ii]) >... (more)
last edit by (Anonymous) edited (>30 days ago) 9:52am Tue. Jun 2nd 2009 PDT
by (Anonymous) posted (>30 days ago) 10:15am Mon. Jun 1st 2009 PDT
Here's my first YouTube video with the new video camera too:

http://www.youtube.com/watch?v=IpkZQpi0zco

The darker colors show up much better in HD.

I learned a few things about posting on YouTube through this, like how the "related videos" feature is based partially on the name of the video file, so I'll be sure to include some keywords next time in the file name.

This vide... (more)
last edit by (Anonymous) edited (>30 days ago) 10:19am Mon. Jun 1st 2009 PDT
by (Anonymous) posted (>30 days ago) 9:34pm Sat. May 30th 2009 PDT
I implemented the algorithm for special modulii with either high or low Hamming weight (number of 1s in the binary representation of the number) specified in the Handbook of Applied Cryptography (http://www.cacr.math.uwaterloo.ca/hac/).  The results are pretty good, though I was hoping for a little more.

Run time ratios = {Special Modulus : Montgomery Reduce : Barrett Modulus} = {1 :... (more)
last edit by (Anonymous) edited (>30 days ago) 9:46pm Sat. May 30th 2009 PDT
by (Anonymous) posted (>30 days ago) 8:56am Sat. May 23rd 2009 PDT
I've implemented an algorithm to generate a dictionary of the most common words in a set of sample text data.  It works like this:

[code]
Prepare a hash table of 8-bit integers, 9999991 (a prime) entries in size.

For each string of 6 characters from the text, compute a hash H.  This can be accelerated using a rolling hash, so for each character you just have to subtract the oldest contribut... (more)
by (Anonymous) posted (>30 days ago) 1:33pm Tue. May 19th 2009 PDT
This paper discusses real, working lock-less software design from first principles, and has been very helpful:
http://www.cs.rochester.edu/u/scott/papers/1996_PODC_queues.pdf

By the way, CAS *is* supported on 64-bit architectures.  Visual Studio 2008 has this intrinsic function: http://msdn.microsoft.com/en-us/library/bb514094.aspx.  I've used it to port my own lock-less ... (more)
last edit by (Anonymous) edited (>30 days ago) 1:44pm Tue. May 19th 2009 PDT
by (Anonymous) posted (>30 days ago) 7:21am Sun. May 17th 2009 PDT
http://www.youtube.com/watch?v=mC_ODpxMA10

Rodrigo Rosenberg is a hero and a martyr, and he will change the future of Guatemala.


The text reads: "The country goes from Guatabad to Guatabetter", as opposed to the usual old saying.

Here's my hero worshiping tee-shirt that has led to some conversations about this great man:
[i... (more)
last edit by (Anonymous) edited (>30 days ago) 9:10am Wed. Oct 7th 2009 PDT
by (Anonymous) posted (>30 days ago) 12:57pm Sat. May 16th 2009 PDT
So, I took the time to try higher order static statistics (order-2 instead of order-1) to improve upon the chat text compression ratio.

It works, but I don't like the results:

Compression rate = 34.1 MB/s
Decompression rate = 20.3 MB/s
Compression ratio = 46.16%
Table size = 6.8 MB

So, for 5% better compression it takes almost 7 MB instead of just 64KB, and both operations go ~10% slow... (more)
by (Anonymous) posted (>30 days ago) 5:33pm Thu. May 14th 2009 PDT
http://people.redhat.com/drepper/cpumemory.pdf

A co-worker pointed me at this excellent article.  It goes through pretty much every aspect of modern memory in computers, from DDR3 to L1/L2 cache and everything in between, including how multi-core and hyperthreaded processors interact with memory.

It's longer than I can read in one sitting, but it makes a great reference and explai... (more)
last edit by (Anonymous) edited (>30 days ago) 5:49am Fri. Aug 21st 2009 PDT
by (Anonymous) posted (>30 days ago) 4:20am Thu. May 14th 2009 PDT
Following from the last post: http://catid.mechafetus.com/news/news.php?view=126

New performance results:

Single-threaded performance on my high-end consumer desktop (64-bit Windows 7, Core i7 @ 4 GHz, RAM @ 1920MHz):
Encoding rate: 38.5 MB/s
Decoding rate: 23.8 MB/s (Up from 9 MB/s)
Table size: 69.71KB (Previous tables were < 12KB)

The extra memory is going to an 8-b... (more)
last edit by (Anonymous) edited (>30 days ago) 5:00am Thu. May 14th 2009 PDT
by (Anonymous) posted (>30 days ago) 3:37pm Sun. May 10th 2009 PDT
Continued from Part 1: http://catid.mechafetus.com/news/news.php?view=125



Results

Single-threaded performance on my high-end consumer desktop (32-bit Vista, Core i7 @ 4 GHz, RAM @ 1920MHz):
Encoding rate: 13.5 MB/s
Decoding rate: 6.6 MB/s

Single-threaded performance on my high-end consumer desktop... (more)
last edit by (Anonymous) edited (>30 days ago) 8:00pm Wed. May 13th 2009 PDT
by (Anonymous) posted (>30 days ago) 2:51pm Sun. May 10th 2009 PDT
Compressing short chat messages is difficult.  I've looked at Lempel-Ziv(LZ) and Range/Arithmetic coding as ways to encode chat message text and player names for an online game.  I have developed an algorithm that runs quickly and produces 50% compression ratio on average even for short messages.


Why is it hard?

The online game has a central server with, say, thousands of connected ... (more)
last edit by (Anonymous) edited (>30 days ago) 3:37pm Sun. May 10th 2009 PDT
by (Anonymous) posted (>30 days ago) 4:15pm Thu. Apr 30th 2009 PDT
Here's a cool application of mathematics to detect financial fraud:

http://en.wikipedia.org/wiki/Benford%27s_Law

This technique works when the lying data is generated using a uniform distribution instead of a log-normal one.  This is likely to be the case for naive crooks since it is harder to do.

It's valuable to know how to generate log-normal-distributed random numbers with ... (more)
last edit by (Anonymous) edited (>30 days ago) 4:22pm Thu. Apr 30th 2009 PDT
by (Anonymous) posted (>30 days ago) 11:08am Sun. Apr 26th 2009 PDT
Over UDP, non-deterministic (user-controlled) object motion updates need to be sent immediately to get the lowest possible error in position.   Here is my analysis:

Object position = POS pixels.
Object velocity = VELO pixels per second.
Object max acceleration = ACCEL pixels per second, per second.

On each simulation step:
POS += VELO * t.
VELO += ACCEL * t.
So it's approximating POS = ... (more)
last edit by (Anonymous) edited (>30 days ago) 7:49am Mon. Apr 27th 2009 PDT
by (Anonymous) posted (>30 days ago) 12:08pm Fri. Apr 24th 2009 PDT
So I decided to play with range coding, which is a technique used to nearly optimally compress a message given the statistical frequency of each character or sequence of characters.

Building upon my experience doing multiprecision arithmetic, I came up with the following simple implementation, which has a few trivial preconditions on the input.

[code]
int range_encoder(u8 *out, u8 *in, int ... (more)
last edit by (Anonymous) edited (>30 days ago) 11:11am Sat. Apr 25th 2009 PDT
by (Anonymous) posted (>30 days ago) 11:49am Tue. Apr 21st 2009 PDT
I decided to revisit the square root algorithm to see how far off it is for non-perfect squares.  Surprisingly, the Newton iteration method is more accurate than I had thought.  With the previous algorithm it is right about half the time, and one less than it should be the rest of the time.

I determined this is because the halving operation should be rounded up; when dividing by two, if the low... (more)
last edit by (Anonymous) edited (>30 days ago) 11:58am Tue. Apr 21st 2009 PDT
by (Anonymous) posted (>30 days ago) 10:19pm Sat. Apr 18th 2009 PDT
I read over at Penny-Arcade that Tycho had watched at least a few episodes of this anime series and thought it was post-worthy, so I decided to check it out.  I downloaded the whole thing and set aside a Saturday to watch it all in a marathon.  And I did finish it because I hate to leave things unfinished: Either start something and go all the way, or don't do it at all.

Long story short, it's ... (more)
last edit by (Anonymous) edited (>30 days ago) 10:21pm Sat. Apr 18th 2009 PDT
by (Anonymous) posted (>30 days ago) 3:30pm Thu. Apr 16th 2009 PDT
To do division between two large numbers modulo some prime number, as is done during ECC point multiplication, the best method is to take the inverse of the denominator and then multiply by it.  I call this the InvMod() function for multiplicative modular inverse, in my code.

My copy of "Handbook of Elliptic and Hyperelliptic Curve Cryptography" arrived yesterday.  I was leafing through it idly... (more)
last edit by (Anonymous) edited (>30 days ago) 2:36pm Fri. Apr 24th 2009 PDT
by (Anonymous) posted (>30 days ago) 12:24pm Wed. Apr 15th 2009 PDT
NAF is an approach for doing bigint exponentiation (RSA) and also point multiplication (ECC).

In the case of RSA, NAF improves the speed of calculation by introducing a new operation: multiplication by the modular inverse (or division).

If you want to exponentiate by b"1111" it is a lot cheaper to instead exponentiate by b"10000" and then divide once.  ie. (b ^ 16) * (base ^ -1) is ch... (more)
last edit by (Anonymous) edited (>30 days ago) 1:34pm Wed. Apr 15th 2009 PDT
by (Anonymous) posted (>30 days ago) 11:19am Tue. Apr 14th 2009 PDT
So I've been working on implementing ECC.

The curves used tend to be somewhat like: y^2 = x^3 + ax + b, known as Weierstrass curves.  One of the nice features of this curve is that to represent one point on the curve, you just need to know the x coordinate and the sign of the y coordinate.

People call this point compression and use it to reduce the bit length of public keys by almost one hal... (more)
last edit by (Anonymous) edited (>30 days ago) 11:59am Tue. Apr 14th 2009 PDT
by (Anonymous) posted (>30 days ago) 8:38am Fri. Aug 7th 2009 PDT
by (Anonymous) posted (>30 days ago) 8:21am Fri. Aug 7th 2009 PDT
by (Anonymous) posted (>30 days ago) 8:49pm Mon. Apr 13th 2009 PDT
Barrett's method for calculating the modulus of a number is useful when you need to perform the operation many times with the same modulus.  The method requires pre-computing the quotient of the division of R by the modulus, where R is 2^(bits in product).  I assume that the modulus has the high bit set (it's usually a big prime number that fills all the words).

Here's my implementation:

[co... (more)
last edit by (Anonymous) edited (>30 days ago) 2:59pm Thu. Apr 16th 2009 PDT
by (Anonymous) posted (>30 days ago) 7:05pm Sun. Apr 12th 2009 PDT
The Handbook is the best reference I've been able to find on ECC.  It was published in 2006 and remains relevant to the most recent advances by GLS late last year.

It doesn't go over protocols like ECMQV but those are pretty easy to find information about online.  This handbook is more about solid fundamentals and good algorithms for doing the basic operations (like point multiplication).
last edit by (Anonymous) edited (>30 days ago) 7:06pm Sun. Apr 12th 2009 PDT
by (Anonymous) posted (>30 days ago) 5:12pm Sun. Apr 12th 2009 PDT
The computer had to guess the PAL passcode digits to fire our nukes.  What people didn't know at the time that movie was made was that the launch codes for all our missiles had been set to all zeroes ("00000000") since the silos were first installed.  Actually they were set that way all through the Cold War until 1977.  Basically any rogue agents could have triggered those.

Source: [url]http://... (more)
last edit by (Anonymous) edited (>30 days ago) 8:35pm Sun. Apr 12th 2009 PDT
by (Anonymous) posted (>30 days ago) 10:24am Sun. Apr 12th 2009 PDT
Here's the Montgomery Modular Inverse algorithm I came up with after reading that paper from the last post (http://catid.mechafetus.com/dump/mon_inv_j52moinv.pdf):

[code]
  // Compute inverse of non-Montgomery input in Montgomery domain
  void MonInputInverse(
    int limbs,    // Number of limbs in modulus and a
    const u32 *a,  // Input number (NOT a Montgomery residue)
    u32 *r,      /... (more)
last edit by (Anonymous) edited (>30 days ago) 1:17pm Sun. Apr 12th 2009 PDT
by (Anonymous) posted (>30 days ago) 6:47pm Sat. Apr 11th 2009 PDT
So I set out this weekend to implement elliptic curve cryptography (ECC).  It's well explored now and I've waited far too long to hop on board the ECC train for my personal projects!

ECC has a ton of awesome features that make it nicer to work with than RSA.  The basic operation "feels" the same -- you implement modular point multiplication for ECC with window optimizations just like you... (more)
last edit by (Anonymous) edited (>30 days ago) 6:52pm Sat. Apr 11th 2009 PDT
by (Anonymous) posted (>30 days ago) 9:46am Fri. Apr 10th 2009 PDT
I've been pretty lazy about updating the server-side scripts for my site over the last... year or so.  And in the meantime spam bots have been having a field day.

I've attempted to obfuscate the posting script; we'll see if that helps.

And here's some fun with gongs: http://catid.mechafetus.com/music/gongsong.wma
last edit by (Anonymous) edited (>30 days ago) 6:20pm Fri. Apr 10th 2009 PDT
by (Anonymous) posted (>30 days ago) 1:03pm Wed. Apr 15th 2009 PDT
by (Anonymous) posted (>30 days ago) 7:37pm Wed. Mar 25th 2009 PDT
by (Anonymous) posted (>30 days ago) 6:27am Wed. Mar 25th 2009 PDT
My benchmark results:



The 550 MB/s number comes from the fact that most of these drives aren't used, yet.  Just pay attention to the wiggly line that dips down to 400 MB/s minimum and seems to center around 500 MB/s.  That is probably where the average read would be if the whole drive was in use.  It dips again at 160 GB because ... (more)
last edit by (Anonymous) edited (>30 days ago) 3:06pm Tue. Apr 14th 2009 PDT
by (Anonymous) posted (>30 days ago) 5:29am Sat. Mar 21st 2009 PDT


by (Anonymous) posted (>30 days ago) 7:51pm Wed. Mar 11th 2009 PDT
last edit by (Anonymous) edited (>30 days ago) 8:52pm Wed. Mar 11th 2009 PDT
by (Anonymous) posted (>30 days ago) 8:41am Sun. Mar 8th 2009 PDT
Watchmen was impressive. It has the cerebral quality of a Kubrik film, which is exactly what it deserved, combined with a lucidity that I was not expecting. Pretty much everything that I got from the graphic novel is easily accessible on the screen. Plus the movie goes beyond the graphic novel and brings in even more themes.

It was a long movie (3 hours) but it felt much shorter to me.

Spoi... (more)
last edit by (Anonymous) edited (>30 days ago) 9:13am Sun. Mar 8th 2009 PDT
by (Anonymous) posted (>30 days ago) 9:47am Thu. Jan 29th 2009 PST
by (Anonymous) posted (>30 days ago) 7:01am Fri. Jan 16th 2009 PST
last edit by (Anonymous) edited (>30 days ago) 7:23am Sun. Feb 22nd 2009 PST
by (Anonymous) posted (>30 days ago) 7:36am Wed. Dec 31st 2008 PST

(Right-click the image for options to view it at full size)

I have seen a few people over at xtremesystems.org pushing these chips to 4.2 GHz and running stable for 8 hours, so there may be room for improvement.
last edit by (Anonymous) edited (>30 days ago) 7:38am Wed. Dec 31st 2008 PST
by (Anonymous) posted (>30 days ago) 6:23am Fri. Dec 19th 2008 PST
Kitt is the finished build:



Slideshow: http://flickr.com/photos/lamenta3/sets/72157611470346740/show/

Loop is connected in this order:
Reservoir: XSPC Passive 250mm Reservoir
Pump: Swiftech MCP655 12VDC 5-speed
Radiator: Feser X-Changer Single 120mm with Scythe Ultra Kaze 3000 RPM fan and rubber gro... (more)
last edit by (Anonymous) edited (>30 days ago) 3:05pm Thu. Jan 8th 2009 PST