Bitzi
home of the
Bitpedia
digital media encyclopedia

About, Products, Download, Search, Browse, Discuss, BitSocieties, Help




Bitzi works
best with Bitzi-Powered Applications.
Register or Sign In 

Bitzi Developer Discussion: TigerTree adoption in newsposts and mail.

Main Site : bboard : Bitzi Developer Discussion : One Message

Message:

TigerTree adoption in newsposts and mail.   [forward as email]
In the few last days I exchanged few posts with Curt Welch on news.software.nntp

The topic is the adoption of TigerTree like standard Merkle Hash Tree for posting binary files on Usenet and identify them.

My aims is to have compatible systems in this new standard and existing standards (like in Bitzi, Gnutella, etc.).

This will enable to develop a bridge software from Usenet to Gnutella (the reverse will be more difficult).

There is anything you can do for this?

Under I post the last post:

"Mirco Romanato" <painlord2k@yahoo.it> wrote:

> TTH http://sourceforge.net/projects/tigertree/

> here there is the standard Java and C implementation (8 days old) with

> tha last fixes.

> You are lucky, they must change the code 3 times in 6 months.

The only difference I spot in that code is that they prepend a 0 valued

byte to the leaf blocks (so that makes it 1025 bytes to hash) before

calculating the hash for the block, and a 1 valued byte in front of the two

hash values for each of the nodes. This is the stuff you pointed out

before which I didn't understand and we decided wasn't in the tigertree

code we had seen. It prevents the construnction of obvious collisions by

simply creating a file which contins the two hash values from the top node

of the hash tree.

It does notthing to solve the file size problem that we have with the

Usenet application of this (or which any network would have if you were

going to try and transport partial files securly though the network.

> I reccoment to use the project above like standards, because they are

> used like standards for Gnutella (and eMule).

> I think others network will use the same code in future, because it work

> well.

Unfortunately, they have missed the file size problem when you try to

securely transport partial file segments and don't submit the entire file

to the network for transport.

-- Curt Welch http://CurtWelch.Com/

curt@kcwc.com Webmaster for http://NewsReader.Com/

 
-- painlord2k, April 22, 2003 02:39 am

Replies:

Re: TigerTree adoption in newsposts and mail.   [forward as email]
Thanks for bringing this conversation to our attention.

FYI, the TigerTree algorithm has only been changed once in its history, earlier this year, to address the "trivial missized collision" issue that was first brought to our attention last November. We made the change after consulting with a number of cryptographic experts as to the best solution. The THEX specification was updated in March, and the code in the 'tigertree' and 'bitcollider' projects at Sourceforge since then.

I don't quite know what Curt Welch means by "the file size problem" that he believes remains, and browsing over more of the conversation at news.software.nntp hasn't helped me understand.

TigerTree (or other THEX tree-arranged hash) should be ideal for the issue of Usenet posts split into many parts. If it is necessary to know exactly where a part fits into the whole, and the size of the whole, with every part, then you should include (a portion of) the full tree with each part.

That portion could be (1) all generations of the serialized tree from the node which includes the sent part on up to the root (in which case you could use the THEX serialized form); or (2) all the nodes in the same generation as the node which includes the sent part; or (3) an even more compact "proof", which only contains the barest minimum other node values to "paint in" the sent part's value, up to the root.

So considering a file broken into 8 parts, A B C D E F G H, and assuming you're sending part C:

The tree is:

.......O.......
...M...+...N...
.I.+.J...K.+.L.
A+B.C+D.E+F.G+H
Under strategy (1), you'd send all interim values A B C D E F G H I J K L M N O (the full THEX tree) along with the data for part C. The recipient would know for certain the whole file's size, the whole file's treehash, C's size, where C fits in, and every interim hash at C's level and higher.

Under strategy (2), you'd send just the bottom interim values A B C D E F G H along with the data for part C -- because I J K L M N O are all trivially calculatable from the bottom generation. Again, the recipient would know for certain the whole file's size, the whole file's treehash, C's size, where C fits in, and every interim hash at C's level and higher.

Under strategy (3), you'd only send the barest minimum of other values to prove where C fits in -- D I N -- along with the data for C. (D with C is enough to calculate J, J with I is enough to calculate M, M with N is enough to calculate O.) Here, the recipient would know for certain the whole file's size, the whole file's treehash, C's size, where C fits in, and the interim hashes between C and the root value.

-- Even the "send the whole (top of) the tree" strategy #1 is a small amount of overhead compared to large files: Splitting a 1GB file into 1024 1MB segments only requires a full tree of 2048 interim values, ~48K. That's under 5% overhead per 1MB part. (For strategy #2, that'd be 1024 interim values: 24K, <2.5% overhead. For strategy #3, that'd be just 10 interim values, 240bytes, <0.03% overhead -- but much more complicated code as random parts come in.)

Hope this helps,

 
-- gojomo, April 22, 2003 08:59 am
[ Post a reply ]

© 2009 The Bitzi Corporation | Policies | Company Info | In The Press | Link To Us

296,642 bitizens have contributed 15,877,632 tags about 3,196,049 files.