Bulletpoints on bulletproofs

Posted by: Adam Gibson | in Bitcoin, Cryptography | 11 months, 3 weeks ago

Bulletproofs

The recent publication of this paper by Benedikt Bünz of Stanford (and coauthors) has caused quite a stir amongst Bitcoin people. It's kind of obscure for a non- or semi- technical audience, but most people who heard about it got the gist: this may be the way to finally make Confidential Transactions (transactions that hide the amounts) practical for Bitcoin.

If you just want the effects of this on a potential future Bitcoin-or-sidechain-with-CT, Greg Maxwell has you covered here. What I write below is for those curious and interested at a more technical/mathematical level (but read Greg's post anyway!).

So in this post I want to go over four things: (1) .. wait, let's use bullet points!

  • The history, and the problem with Confidential Transactions
  • Pedersen commitments extended; an inner product proof
  • A very rough outline of how to use inner product proofs to make rangeproofs smaller
  • The scaling of this new version of CT; aggregation

My role here is as an enthusiastic amateur; I mention this not because there's anything wrong with being an amateur per se, but more because really fully grokking the in-depth details here is, at least so far, a bit beyond me, so this is somewhat of a skim, not a real exposition. Needless to say, all mistakes are my own.

Having said that, I have put together a very basic and rough proof of concept Python implementation here; although it's only an implementation of the simplest case (non-aggregated, see below). This helped me to confirm my understanding of the algebraic execution of the algorithm; it may help you also if that's something that interests you.

History

Confidential Transactions was introduced by Greg Maxwell, and outlined for the general reader here. If you haven't read that, you probably should. Greg gave a talk on it here; I wrote an in-depth investigation here (aimed at those who don't know much about the ECC and other primitives, but builds up the full picture, so it's rather long). A shorter and more elegant overview was given here, although it assumes a tadge more knowledge from the reader (and there are a couple of niggles in terms of how it's presented there, but I don't think they matter). As for implementations, there is that in the Elements project, and it was also folded into Monero's RingCT design as outlined here.

A stupidly high level description would be: use Pedersen commitments to the amounts in the outputs of transactions instead of the plaintext amounts, then use the homomorphism of the Pedersen commitments (the fact that addition is preserved under the commitments) to check that the transaction is valid. Lastly, add in a "rangeproof" to ensure that the amounts (which perforce are integers modulo \(N\), where \(N\) is the size of the secp256k1 group) don't overflow.

These "rangeproofs" are the big fly in the ointment of the construction - they, in the original implementation, are large. A typical output would have to be about 2500 bytes to encapsulate such a rangeproof, which is ten times larger than a typical Bitcoin transaction, in its entirety, today.

Incremental improvements were found (in particular, by Adam Back), and it's also true that the size depended on the granularity; the 2500-byte figure above was for 32-bit amounts, e.g. ~43 BTC with 1 satoshi granularity. But there wasn't a combination of (decent range of values to hide under) and (reasonable size of rangeproof) that seemed feasible for CT on Bitcoin, given how much of a premium there is, naturally, for space on-chain.

Note that this is part of a broad theme; if, even with tweaks, privacy-enabled transactions cost a lot of space, and therefore a lot of money, it degrades the anonymity set so extremely that it makes the privacy enhancement much, much weaker (albeit, non-zero still for sure!).

Pedersen commitments and a direction to a new solution

As a quick reminder, the basic structure of a Pedersen commitment is (using CT style notation):

\(C = xG + aH\)

where \(x\) is a "blinding" factor, and \(a\) is the amount under proof, while \(G\) is secp256k1's generator and \(H\) is an alternate generator (sometimes called "basepoint"), the EC-discrete log of which is not known. I went into some considerable detail in describing all of this in my CT document linked above (sections 2.1-2.2). As a recap, note that a commitment must have two properties: hiding, enabled by the blinding factor \(x\), which is random, and binding, which means you can't open the commitment to a value different to \(a\). Binding requires that the creator does not know the discrete log (= private key) of \(H\): if he did, he could do arithmetic on \(a\) and \(x\) and change them; if he doesn't, he can only open the commitment with exactly those two numbers and no other pair. If you think about it, for the system as a whole to work, that requires that nobody knows such a private key; this is addressed with the concept of "nothing up my sleeve" or NUMS.
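
To make the two properties concrete, here is a minimal Python sketch of a Pedersen commitment and its additive homomorphism, using a toy multiplicative group mod a prime in place of secp256k1 (so \(xG + aH\) becomes \(g^x h^a \bmod p\)); all parameter and name choices here are my own, purely for illustration, and obviously not secure:

```python
# A toy Pedersen commitment in the multiplicative group mod p, standing
# in for secp256k1. Illustration only: these parameters are insecure and
# entirely my own choice, not from any real CT implementation.
import secrets

p = 2**255 - 19        # a convenient large prime (toy modulus)
g = 2                  # plays the role of the generator G
h = 3                  # plays the role of the NUMS generator H; in reality
                       # its discrete log w.r.t. g must be unknown to everyone

def commit(x, a):
    """C = g^x * h^a mod p, the multiplicative analogue of xG + aH."""
    return pow(g, x, p) * pow(h, a, p) % p

x1, x2 = secrets.randbelow(p), secrets.randbelow(p)   # blinding factors
a1, a2 = 100_000, 250_000                             # amounts (satoshis)
c1, c2 = commit(x1, a1), commit(x2, a2)

# Homomorphism: "adding" commitments commits to the sums of blindings and
# amounts. CT verifiers use this to check amounts balance without seeing them.
assert c1 * c2 % p == commit(x1 + x2, a1 + a2)
```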

Vector Pedersen commitments

The first step forward is to consider multiple such NUMS generators. I have a memory back in 2015 of quipping on IRC about "vector spaces in ECC" (also called linear spaces), because after a while looking at these types of things you start to wonder ... since \(xG + aH\) is so useful (not just in CT), wouldn't \(xG + aH + bJ \ldots \) be useful too? The quip about vector spaces is observing that the structure \(x\textbf{e}_1 + y\textbf{e}_2,\ x,y \in \mathbb{R}\) defines a 2-D space spanned by linearly independent basis vectors \(\textbf{e}_1,\textbf{e}_2\), and similarly for N-dimensional; but .. this is weird! The group of points on an elliptic curve is obviously only a 1-D space (each point is defined by a single scalar \(\in \mathbb{Z}_n\)), and yet we can kind of "pretend" it's N-dimensional, in this context, precisely because we don't have a discrete log to convert between any two of these NUMS points.

Indeed, this kind of construction has already been used, afaik, in things like Confidential Assets (Blockstream impl., Chain impl.). I had an intuition it should be possible/useful to leverage it in making another kind of range proof, although I had no idea how to do it.

So the vector Pedersen commitment is the following natural extension of the previous:

\(C = xG + v_1H_1 + v_2H_2 + \ldots + v_nH_n\)

One blinding value is still sufficient here; the \(H_{i}\)s are just as many NUMS generators as we like. They can be created using a "coerce-hash-to-point" approach (again, see my CT doc for details; also see here in my bulletproofs code for one simple-enough way to create a whole set of them). This effectively creates a commitment to a vector of dimension N. And we can go further (indeed, bulletproofs does), and make a commitment to multiple vectors:

\(C = xU + v_1G_1 + v_2G_2 + \ldots + v_nG_n + w_1H_1 + w_2H_2 + \ldots + w_nH_n\)

It's not difficult to imagine that this much richer structure can allow more interesting algebraic constructions. But it has one very big drawback: opening the commitment means communicating the values committed to, which here would be \(2n + 1\) scalar values; in the case of a curve like secp256k1 that means communicating \((2n+1) \times 32\) bytes - a lot of data!
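
As a concrete (toy) illustration of both the generator derivation and the cost of opening, here is a sketch continuing the multiplicative-group example from above; the label format and all details are my own invention, not the construction used in my linked bulletproofs code:

```python
# Deriving a set of NUMS-style generators by hashing an index to a group
# element, then forming C = xG + v_1*H_1 + ... + v_n*H_n. Toy group as
# before; insecure, for illustration only.
import hashlib
import secrets

p, g = 2**255 - 19, 2              # toy modulus and base, as in the earlier sketch

def nums_generator(i):
    # Nobody knows the discrete log of this element w.r.t. g -- that is
    # precisely the NUMS property needed for binding.
    digest = hashlib.sha256(b"demo-generator:" + str(i).encode()).digest()
    return int.from_bytes(digest, "big") % p

def vector_commit(x, values):
    c = pow(g, x, p)               # the single blinding term, xG
    for i, v in enumerate(values):
        c = c * pow(nums_generator(i), v, p) % p
    return c

C = vector_commit(secrets.randbelow(p), [11, 22, 33, 44])
# Opening C means revealing x and every committed value: (n+1) x 32 bytes
# here, and (2n+1) x 32 bytes in the two-vector case described above.
```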

There's a subtlety here about commitments versus simply communicating data: if the commitment is part of a larger protocol, it may actually be acceptable to reveal the contents of the commitment (e.g. because the values inside are blinded with random values), but the commitment itself may still be important as part of that larger protocol (e.g. it is combined with other commitments). In this situation, transferring the contents (here, \(v_1, v_2, \ldots, w_n\)) is an acceptable step to include, but the trouble is that it may be a huge amount of data for large values of \(n\).

More specifically, in Bulletproofs, we're going to focus on a way of proving that the dot product (or scalar product, or inner product) of the two vectors committed to has a certain value. Again, this can be achieved by simply revealing all the vector elements, but that's what we're going to try to avoid for compactness. Concretely, the commitment is structured as:

\(C = \mathbf{v} \cdot \mathbf{w} U + v_1G_1 + v_2G_2 + \ldots + v_nG_n + w_1H_1 + w_2H_2 + \ldots + w_nH_n\)

and a proof is created that the coefficient of \(U\) is indeed the dot product of the two vectors that were committed to against the pre-agreed generator set (\(U, \mathbf{G},\mathbf{H}\)) (bolding means vectors, even of EC points here).

Making the proof smaller

So how to avoid just sending all the data? The way that Bulletproofs constructs this proof involves halving the dimensions of these vectors by changing the basis repeatedly, until they boil down to single scalars, adding an extra two commitments at each step. This reduces the data that needs to be transferred from \(\simeq n \times\ \) the size of the vectors to \(\simeq \textrm{log}_{2}(n) \times\ \).

This central concept is a breakthrough from UCL cryptographers (Bootle et al.) from last year (2016), as per this paper. That paper explains the core idea in section 4.1, but in a form more complicated than that in Bulletproofs (section 3 and Protocols 1 and 2). Bootle also wrote a without-mathematics-gist blog post here which is quite helpful for the casual reader to get an idea.

How to understand this? Consider this problem: I want to prove to you I know 2 values, let's say. Suppose I already have a Pedersen-style commitment of the form \(C = rG + aH_1 + bH_2\), then if I want to prove to you I know \(a\) and \(b\), I have to send you \(a\) and \(b\), right? Even though proving I know them is not quite the same thing as revealing them, I can't do better, can I? Revealing them means sending 2 x 32 bytes in our scenario. Wouldn't it be nice to just send 1 x 32 bytes? Well yeah, but if I constructed \(rG + (a+b)H_1\) as the commitment instead, I haven't proven knowledge of \(a, b\) but only their sum.

But let's try to follow a well-known pattern in cryptography (see sigma protocol) - the verifier can send a random challenge to the prover. I send the commitment \(C\), you (verifier) send me a challenge random value \(x\). How can I use the \(x\) to reduce the amount of data I need to send to prove knowledge of the opening (\(a,b\), also known as the "witness") of the commitment?

A natural way is to send just \(ax + b\), but this is no good, because how does the verifier relate that to the originally sent \(C\)? It's no good if I can "open" a commitment to whatever I like! It must be binding. We need a construction, some \(f(a, b, x)\), that allows an algebraic reconstruction of \(C\) by the verifier, when combined with some function \(g(H_1,H_2)\). This is how the Bulletproofs paper ends up naturally with this arrangement:

\(a' = ax + bx^{-1}\quad ,\ H' = x^{-1}H_1 + xH_2\)

\(\therefore C' = (ax + bx^{-1})(x^{-1}H_1 + xH_2)\)

\(=aH_1 + bH_2 + x^2aH_2 + x^{-2}bH_1 = C + x^2L + x^{-2}R\)

(Note that these formulae ignore the blinding factor \(rG\); we will mention why in the next section.)

What this means is that proving knowledge of the opening of \(C\) means sending not \(a, b\) but instead \(a', L, R\) (note that the verifier has the challenge value \(x\) too, and so can recreate the final equation above).
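
If you want to see that algebra go through numerically, here's a small sanity check in Python: random field elements stand in for the generators \(H_1, H_2\) (legitimate here because the identity above holds symbolically), with the group order replaced by a toy prime of my own choosing:

```python
# Sanity-checking the halving identity C' = a'H' = C + x^2 L + x^-2 R
# over a prime field, with random field elements standing in for H1, H2.
# The blinding term is omitted, exactly as in the text above.
import secrets

q = 2**89 - 1                        # a Mersenne prime, toy group order
def inv(k): return pow(k, q - 2, q)  # modular inverse via Fermat

a, b = secrets.randbelow(q), secrets.randbelow(q)     # committed values
h1, h2 = secrets.randbelow(q), secrets.randbelow(q)   # stand-ins for H1, H2
x = secrets.randbelow(q - 1) + 1                      # nonzero challenge

C = (a * h1 + b * h2) % q            # the original commitment
L = a * h2 % q                       # cross terms the prover sends
R = b * h1 % q

a_new = (a * x + b * inv(x)) % q     # a' = ax + bx^-1
h_new = (inv(x) * h1 + x * h2) % q   # H' = x^-1 H1 + x H2

assert a_new * h_new % q == (C + x * x * L + inv(x) * inv(x) * R) % q
```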

But wait! - I hear you cry. We were trying to reduce the amount of data sent, but now we have to send two EC points \(L, R\) along with the now single scalar value \(a'\). True, this is not helpful; but the marvellous thing is that if you extend from single scalars \(a, b\) to entire vectors \(\textbf{a},\textbf{b}\) and use vector Pedersen commitments, you can reproduce the above process! The difference will be that, instead of "reducing" the transfer from \(a, b\) to \(ax + bx^{-1}, L, R\), you'll be reducing the transfer from \(a_1, a_2, \ldots a_{n}\) to \(xa_1+x^{-1}a_{\frac{n}{2}+1}, \ldots xa_{\frac{n}{2}} + x^{-1}a_{n}, L, R\). Note how this is a new vector \(\textbf{a}'\) which is half the size of the original one.

At the end of that transformation, you've got a new vector, half the length of the first one, for which you're trying to prove knowledge of the entries; that's just a reduced version of the problem you started with, so you apply the protocol recursively. If you started with a vector whose length is a power of 2, you can keep repeating the process until you have only 2 values you want to prove knowledge of, then one final step is required (as for the values \(a, b\) we started with). (See eqns 28 and 29 in the paper.)

This means that you only have to send one pair of L and R points for each halving of the size of the vector, along with the final multiplied result.
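
In code, one round of that recursion is just a fold of the two halves of the witness vector under the challenge; a minimal sketch of my own (field arithmetic only, with the accompanying \(L, R\) points omitted):

```python
# One halving round: fold the two halves of the vector together with the
# challenge x, as in the text: a'_i = x*a_i + x^-1 * a_{n/2+i}.
# (The L and R points that accompany each round are omitted here.)
def fold(vec, x, q):
    half = len(vec) // 2
    x_inv = pow(x, q - 2, q)     # q assumed prime, so Fermat inversion works
    return [(x * vec[i] + x_inv * vec[half + i]) % q for i in range(half)]

# e.g. a vector of length 8 reduces to length 1 in log2(8) = 3 rounds,
# so only 3 (L, R) pairs plus one final scalar need to be sent.
```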

Using this trick for an inner product proof

It takes only a bit more complexity than the idea above to include the fact that the "blinding" factor in the vector Pedersen commitment is actually the inner product of the two vectors, i.e. to prove that the commitment had structure:

\(C = \mathbf{v} \cdot \mathbf{w} U + v_1G_1 + v_2G_2 + \ldots + v_nG_n + w_1H_1 + w_2H_2 + \ldots + w_nH_n\)

The gist of it is to include the necessary cross-terms in the formulae for the points \(L, R\) at each halving of the size of the vectors, such that at each step it is always the case that \(c = \mathbf{v} \cdot \mathbf{w}\); so in the final verification step (when vectors of length 2 are reduced to single values \(v, w\)), the value is \(c = v \times w\), a direct multiplication. The details are in Protocol 1 in the paper, and you can see my simple implementation of proving and verifying, using recursive functions, here and here.
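
Here's a small numerical check of that bookkeeping, in the same toy-field style as before: fold \(\mathbf{v}\) with \(x\) and \(\mathbf{w}\) with \(x^{-1}\), and the new inner product picks up exactly the cross terms that go into \(L\) and \(R\) (again, my own illustrative sketch, not the linked implementation):

```python
# Carrying the inner product through one halving: if v' = x*v_lo + x^-1*v_hi
# and w' = x^-1*w_lo + x*w_hi, then <v', w'> = <v, w> + x^2*cL + x^-2*cR,
# where cL = <v_lo, w_hi> and cR = <v_hi, w_lo> are the cross terms.
import secrets

q = 2**89 - 1
def inv(k): return pow(k, q - 2, q)
def ip(u, w): return sum(ui * wi for ui, wi in zip(u, w)) % q

v = [secrets.randbelow(q) for _ in range(4)]
w = [secrets.randbelow(q) for _ in range(4)]
x = secrets.randbelow(q - 1) + 1

v2 = [(x * v[i] + inv(x) * v[i + 2]) % q for i in range(2)]
w2 = [(inv(x) * w[i] + x * w[i + 2]) % q for i in range(2)]

cL, cR = ip(v[:2], w[2:]), ip(v[2:], w[:2])
assert ip(v2, w2) == (ip(v, w) + x * x * cL + inv(x) * inv(x) * cR) % q
```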

Protocol 2 is just a final tweak to allow not just proving that a commitment \(C\) has the above pattern, but that specifically the inner product is exactly a provided value \(c\).

Quick aside - Fiat-Shamir

The above section explained crudely how you can use a "challenge value" \(x\), sent from verifier to prover, to condense the size of the final proof; in all such cases in cryptography we can replace such an interaction with the use of a cryptographic hash function applied to the previous steps of communication between prover and verifier, to make the proof non-interactive. See here for details. This is of course used and is briefly mentioned in section 4.4 of the paper.
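
A minimal sketch of how that looks in code: each challenge is just a hash of the transcript so far, so the prover can compute it without any interaction (the domain-separation label and encoding here are my own arbitrary choices):

```python
# Fiat-Shamir: derive each "verifier challenge" by hashing the transcript
# of everything the prover has committed to so far.
import hashlib

def fs_challenge(transcript: bytes, q: int) -> int:
    digest = hashlib.sha256(b"demo-challenge||" + transcript).digest()
    x = int.from_bytes(digest, "big") % q
    return x if x != 0 else 1   # a zero challenge has no inverse; avoid it

# Usage: the prover appends each commitment (e.g. serialized L, R points)
# to the transcript, then derives the next challenge deterministically.
```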

Rangeproofs from inner-product-proofs

This is actually the more complex part of the construction, compared to the above!

My description will be quite skimmy, partly because it's really a lot of algebra, and more importantly, because I don't understand it perfectly myself, yet.

We're going to assume that, as in CT currently, there is an existing Pedersen commitment of form \(C = xG + vH\), where \(v\) is the value of the transaction output in satoshis, and we want to prove that it's in range and doesn't overflow; typically we might want to prove that \(v \in 0 .. 2^{32}-1 \), i.e. a 32 bit integer. In this case \(n=32\) in the below.

Bulletpoints time again!

  • Start by making a vector called \(\textbf{a}_L\) which is a representation of \(v\) in bits, least significant bit first: e.g. the value 3 would be represented by the vector [1,1,0,0,0] for proving it's in range 0-31 (this would be \(n=5\)). Note that this means that \(\textbf{a}_L \cdot \textbf{2}^n = v\) (notation: \(\textbf{k}^n\) means the vector [\(1,k,k^2,\ldots ,k^{n-1}\)]); see the sketch after this list.
  • Make a complementary vector \(\textbf{a}_R\) for which each component is the corresponding component of \(\textbf{a}_L\), minus 1. This means that \(\textbf{a}_L \cdot \textbf{a}_R = 0\).
  • Commit to these vectors with a commitment \(A\), and do the same for blinding vectors, creating \(S\). The verifier can respond to these commitments with two challenge values (Fiat-Shamir comment applies of course).
  • The prover then builds two vector-valued linear polynomials of a single scalar variable \(X\) - that is to say, the polynomials have coefficients that are vectors, so that when you plug in specific scalars for \(X\), you get vectors out - using the above variables. These polynomials are called \(l(X)\) and \(r(X)\). They are constructed so that their inner product \(t(X) = l(X) \cdot r(X)\), which is a quadratic polynomial in \(X\), has the following special property: its constant term \(t_0\) is a specific function of only \(v\) and the challenge values, if and only if the vectors \(\textbf{a}_L\), \(\textbf{a}_R\) were constructed honestly so that the previously mentioned properties hold.
  • The prover also constructs commitments to the other coefficients \(t_1, t_2\) of the polynomial, and this data is sent to the verifier, who constructs more challenge values. The data that has to be sent is kept small by carefully constructing the commitment so that the verifier can be convinced that \(v \in 0..2^n -1\) by a proof that the value of \(t\), for the challenged \(X\) value, is indeed the dot product \(t(X) = l(X) \cdot r(X)\); and for this, the prover provides an inner product proof of the type described in the previous section, whose size is logarithmic in \(n\). Since it's not directly obvious that the constructed EC point in the paper (equation (62)) is indeed the required commitment, I've added some algebraic working in Footnote 1, to help if you're trying to figure it out.
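
Here's the promised sketch of the first two bullet points, checking the two vector properties for a tiny example (again in a toy field of my own choosing):

```python
# Building a_L (bits of v, least significant first) and a_R = a_L - 1^n,
# then checking a_L . 2^n = v and a_L . a_R = 0. Toy field for illustration.
q = 2**89 - 1
n, v = 5, 3

a_L = [(v >> i) & 1 for i in range(n)]        # [1, 1, 0, 0, 0]
a_R = [(bit - 1) % q for bit in a_L]          # each entry is 0 or -1 mod q
two_n = [pow(2, i, q) for i in range(n)]      # the vector 2^n = [1, 2, 4, 8, 16]

assert sum(l * t for l, t in zip(a_L, two_n)) % q == v    # a_L . 2^n = v
assert sum(l * r for l, r in zip(a_L, a_R)) % q == 0      # a_L . a_R = 0
```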

Overall, the data sent to the verifier (you can see it concretely in my code here) is something close to:

\(33\times 4 + 32\times 3 + (32\times 2 + 33\times 2\times \textrm{log}_{2}(n))\)

in bytes. The log part is coming from the \(L, R\) points, which, as you may remember, must be sent for each halving of the sizes of the vectors whose inner product we're proving. Notice that this is dramatically smaller than a rangeproof of the former type (based on Borromean ring signatures over the digits of the amount); plugging in \(n=32\) gives you 622 bytes instead of 2000-2500. Of course the real gains occur at higher range sizes, since the growth is logarithmic.
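
For convenience, here is that count as a function of \(n\), mirroring the formula above (the 33- and 32-byte sizes being compressed points and scalars on secp256k1):

```python
# Reproducing the size formula from the text: four points, three scalars,
# two final scalars, plus 2*log2(n) L/R points for the inner product proof.
from math import log2

def proof_size_bytes(n):
    return 33 * 4 + 32 * 3 + (32 * 2 + 33 * 2 * int(log2(n)))

assert proof_size_bytes(32) == 622   # the figure quoted above
assert proof_size_bytes(64) == 688   # growth is logarithmic in the range size
```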

Aggregation

What neither my POC code nor this brief writeup has covered is the very important point that these rangeproofs can be aggregated: a transaction with multiple outputs can create a single proof that all of those outputs are in range, with an additional size which is only logarithmic, an extra term \(\simeq k \times \textrm{log}_{2}(m)\) where \(m\) is the number of outputs being proved. A reduction from 2500 to 650, say, is already big enough that CT may go from being impractical to practical; but aggregation could create scenarios where the number of bytes required per output in a transaction is really very small indeed; even 100 bytes or less, as we scale up to bigger and bigger numbers of outputs.
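
To get a feel for the numbers, here is a rough per-output estimate, on my reading of the paper's aggregated proof size of \(2\lceil \textrm{log}_2(nm) \rceil + 4\) group elements plus 5 scalars (treat the exact constants as an assumption of mine, not a claim from this post):

```python
# Rough per-output size for m aggregated n-bit rangeproofs, assuming
# 2*ceil(log2(n*m)) + 4 points of 33 bytes and 5 scalars of 32 bytes.
from math import ceil, log2

def per_output_bytes(n, m):
    points = 2 * ceil(log2(n * m)) + 4
    return (33 * points + 32 * 5) / m

print(per_output_bytes(32, 1))    # 622.0 -- matches the single-proof figure
print(per_output_bytes(32, 16))   # ~55 bytes per output once aggregated
```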

Note in particular that this creates a strong economic incentive for coinjoin; it makes it significantly cheaper to do your transaction along with someone else, at least compared with a CT model where rangeproofs were not aggregatable. The aggregation is of course interactive, but as Greg notes in his overview on the mailing list:

This scheme also has a straightforward and efficient method for multi-party computation, which means that the aggregates can be used in all-party-private coinjoins ...

And the general positive point about CT + coinjoin still applies, which is that there is no need to match amounts; just coinjoin with anyone at all and you get the same effect. I think if you can also achieve the trick of making coinjoin non-interactive (see previous blog post on SNICKER), it might make coinjoin the default transaction type.

Footnotes

1. Demonstration that the constructed EC point in (62) of the Bulletproofs paper is the same as the commitment in (63) under honest prover conditions. Before we start, note that, following the convention in the paper (except here I use EC rather than exponent notation), a term like \(\textbf{aG}\) (note both scalar and point terms are bolded), actually refers to \(a_1G_1 + a_2G_2 + \ldots + a_nG_n\). Also recall as mentioned above that \(\textbf{k}^n\) (note exponent is not bolded), means the vector \([1, k, k^2, \ldots k^{n-1}]\). We begin the construction in (62):

\(A = \left(a_{L1}G_1 + a_{L2}G_2 + \ldots + a_{Ln}G_n\right) + \left(a_{R1}H_1 + a_{R2}H_2 + \ldots + a_{Rn}H_n\right) + \alpha H \)

\(xS = \left(xs_{L1}G_1 + xs_{L2}G_2 + \ldots + xs_{Ln}G_n\right) + \left(xs_{R1}H_1 + xs_{R2}H_2 + \ldots + xs_{Rn}H_n\right) + x \rho H\)

\(\because \mu = \alpha + \rho x\ \therefore\)

\(A + xS -z\textbf{G} = \mu H + \left(\left(a_{L1} + xs_{L1} -z\right)G_1 + \ldots + \left(a_{Ln} + xs_{Ln} -z\right)G_n\right)\)

\(\quad + \left(\left(a_{R1} + xs_{R1}\right)H_1 + \ldots + \left(a_{Rn} + xs_{Rn}\right)H_n\right)\)

We can now note that the term \(\left(\left(a_{L1} + xs_{L1} -z\right)G_1 + \ldots + \left(a_{Ln} + xs_{Ln} -z\right)G_n\right)\) in the above is exactly the vector \(\textbf{l}(x)\) committed against the \(G_i\) generators, i.e. \(\textbf{l}(x)\textbf{G}\).

In addition to the above, the point \(P\) in (62) has the additional term \(\left(z\textbf{y}^n + z^{2}\textbf{2}^n\right) \textbf{H}'\). The definition of \(\textbf{H}'\) is:

\(\textbf{H}' = [1H_1, y^{-1}H_2, y^{-2}H_3, \ldots , y^{-(n-1)}H_n]\)

So the additional term has form (note \(\circ\) is the Hadamard product as defined in the paper):

\(z\textbf{H} + \textbf{y}^{-n} \circ \left(z^{2}\textbf{2}^n\right)\textbf{H}\)

So we now have:

\(A + xS -z\textbf{G} + \left(z\textbf{y}^n + z^{2}\textbf{2}^n\right) \textbf{H}' = \mu H + \textbf{l}(x)\textbf{G}\)

\(\quad + \left(\textbf{a}_R + x\textbf{s}_R + z\cdot\textbf{1}^n + \textbf{y}^{-n} \circ \left(z^{2}\textbf{2}^n\right)\right)\textbf{H} \)

, the last term of which, according to the definition of \(\textbf{H}'\), can be rewritten as:

\(\left(\textbf{y}^n \circ \left(\textbf{a}_R + x\textbf{s}_R + z\cdot\textbf{1}^n\right) + z^{2}\textbf{2}^n\right)\textbf{H}'\)

but this is exactly \(\textbf{r}(x)\textbf{H}'\). So finally we have that the RHS of (62) can be written as:

\(P = \mu H + \textbf{l}(x)\textbf{G} + \textbf{r}(x)\textbf{H}' \)

, which is the same form as (63). Final note: the verifier must subtract \(\mu H\) and add back in \(tU\) to convert this into the required form for verification of the inner product proof.
