{"body":"Summary of merkle grinding.\n\nThe header format is described at https://en.bitcoin.it/wiki/Block_hashing_algorithm\n\nversion (4 bytes) prevBlock (32 bytes) merkleRoot (32 bytes) time (4 bytes)\nbits (4 bytes) nonce (4 bytes) = 80 bytes.\n\nsha256 works on 64 byte chunks, so the padded header is processed in two chunks.\n\nsha256 padding appends a 0x80 byte, then 0 bytes, then the 64-bit message\nlength (big-endian, counted in bits) so that the total is a multiple of 64 bytes.\n\nthere is an inner hash and an outer hash.  inner first; what is actually hashed is:\n\ninner hashed data =\nversion (4 bytes) prevBlock (32 bytes) merkleRoot (32 bytes) time (4 bytes)\nbits (4 bytes) nonce (4 bytes) <0x80 then 39 bytes of 0> hiCount (4 bytes)\nloCount (4 byte value 640 = 80 bytes * 8)\n\nhiCount is always 0.\n\nIV is the sha256 initial state (magic constants).\n\nstateA = transform call A( IV, version || prevBlock[0-31] || merkleRoot[0-27] )\n\ninner digest = transform call B( stateA, merkleRoot[28-31] || time || bits ||\nnonce || <0x80 then 39 bytes of 0> || hiCount || loCount )\n\nouter hashed data = <inner digest> || <0x80 then 23 bytes of 0> || hiCount ||\nloCount (4 byte value 256 = 32 bytes * 8)\n\nouter = transform call C( IV, <inner digest> || <0x80 then 23 bytes of 0> ||\nhiCount || loCount (4 byte value 256) )\n\nif outer, read as a number, is at or below the target (roughly: enough of its\nbits are 0), a proof of work is found.\n\n\nstateA is precomputed, and transform call A is only redone when extraNonce\nchanges, which changes merkleRoot.\n\nso most of the work is repeating call B while changing nonce (and maybe some\nlow order bits of time) and then calling transform call C.\n\n\nnow transform itself is in two parts.\n\nW array = transform_part1( data )\nstate = transform_part2( state, W )\n\npart1 does 13 operations (a mix of rightrotate, rightshift, xor and 32-bit\nunsigned add) 48 times.  importantly transform_part1 does not depend on\nstate, so it doesn't depend on the first block.\n\npart2 does 23 operations (a mix of rightrotate, xor, and, and 32-bit\nunsigned add) 64 times.  
it costs more than part1.\n\nnow if we precompute multiple merkleRoots that share the same last\n4 bytes, then the transform_part1 output in transform call B can be computed\nonce and reused across all of them, like this:\n\nexpensive precompute, eg on an FPGA\n(mrA,mrB,mrC,mrD) = precompute_merkle_collision()\nsuch that mrA[28-31]==mrB[28-31]==mrC[28-31]==mrD[28-31]\n\ncheap precompute\n\nstateA1 = transform call A( IV, version || prevBlock || mrA[0-27] )\nstateB1 = transform call A( IV, version || prevBlock || mrB[0-27] )\nstateC1 = transform call A( IV, version || prevBlock || mrC[0-27] )\nstateD1 = transform call A( IV, version || prevBlock || mrD[0-27] )\n\nthen repeat in a loop, changing the 4 byte nonce and maybe some low bits of time.\n\ninner W = transform_part1( mrA[28-31] || time || bits || nonce ||\n<0x80 then 39 bytes of 0> || hiCount || loCount (4 byte value 640, the length in bits) )\n\ninner digest A1 = transform_part2( stateA1, inner W )\ninner digest B1 = transform_part2( stateB1, inner W )\ninner digest C1 = transform_part2( stateC1, inner W )\ninner digest D1 = transform_part2( stateD1, inner W )\n\nouterA = transform call C( IV, <inner digest A1> || <0x80 then 23 bytes of 0> ||\nhiCount || loCount (4 byte value 256, the length in bits) )\nouterB = transform call C( IV, <inner digest B1> || <0x80 then 23 bytes of 0> ||\nhiCount || loCount (4 byte value 256, the length in bits) )\nouterC = transform call C( IV, <inner digest C1> || <0x80 then 23 bytes of 0> ||\nhiCount || loCount (4 byte value 256, the length in bits) )\nouterD = transform call C( IV, <inner digest D1> || <0x80 then 23 bytes of 0> ||\nhiCount || loCount (4 byte value 256, the length in bits) )\n","name":"","extension":"txt","url":"https://www.irccloud.com/pastebin/JjfwEylh","modified":1491435148,"id":"JjfwEylh","size":3175,"lines":89,"own_paste":false,"theme":"","date":1491435148}