<h2>Joinmarket update for Oct 2020</h2><p>Adam Gibson, 2020-10-25 (joinmarket.me blog archive)</p><h2>About this post</h2>
<p>It seems like a good idea to start using this blog to spread a little bit more information
to users and other interested parties, about Joinmarket, in particular about how it might
change.</p>
<p>First, please note this is a <em>personal</em> blog, there is nothing "official" here (and the same
would go for anyone else's blog about Joinmarket! - this is an open source project).</p>
<p>Second, please note that for years now I have been microblogging <a href="https://x0f.org/web/accounts/1077">here</a>; so,
if you're interested to keep in touch with what I'm doing (and often, reading, or just thinking) day by day,
you're welcome to follow that account. I personally like keeping track of people over RSS with <a href="https://fraidyc.at">fraidyc.at</a>,
but whatever suits you. Just know that Joinmarket related announcements are often made there first (I don't and will not use any corporate-owned social media sites).</p>
<h3>Joinmarket status.</h3>
<p>0.7.1 of Joinmarket was released 12 days ago, and introduced <em>receiving</em> BIP78 payjoins, on the GUI and on command line.</p>
<p>In the next few days 0.7.2 will be released. It is principally a bugfix release.</p>
<p>(Although there will be one small
new feature - not-self broadcasting is finally reimplemented. You'll want to be careful about using it, especially
to start with (since it'll only work with counterparties that also have the latest release); there will of course
be advice about this in the release notes. Consider it an advanced feature, and consider using tor-only in your
Core node if the base level of privacy in broadcasting transactions isn't enough for you.)</p>
<p>The bugs fixed are things that came out of interoperability tests on BIP78.</p>
<p>Over the last few weeks I, Kristaps Kaupe and some people on other dev teams have been running a variety
of testnet, mainnet, regtest tests of Payjoin functionality between btcpayserver, Wasabi and Joinmarket.</p>
<p>We found various edge cases, like hex instead of base64 being transferred (not in spec but people were doing
it anyway), incorrectly shuffled output ordering (my bad!), combinations of parameters in the HTTP request
that <em>I</em> interpreted the BIP as saying were not allowed, but btcpayserver was sending anyway (but: not always! -
testing can be a real pain sometimes!) and a few more.</p>
<p>Remember two things about Payjoin though:</p>
<ol>
<li>It is a protocol designed to accept a failure to negotiate as a common event - <em>the payment goes through anyway, it's just not a coinjoin then</em>.</li>
<li>The most common incompatibility between wallets will be different address types. Then nothing can be done, as it would be slightly silly to do a Payjoin like that - we fall back, as per (1).</li>
</ol>
<p>So hopefully we will have some wallets that can send and receive Payjoins up and running by .. well, now actually! It is already possible and working, we are just smoothing out edge cases here.</p>
<p>If you didn't get a chance, please watch this demo video of sending and receiving payjoins between Joinmarket wallets (note: the dialog is now improved, as I comment here):</p>
<p><a href="https://video.autizmo.xyz/videos/watch/7081ae10-dce0-491e-9717-389ccc3aad0d">JM-JM Payjoin demo video</a></p>
<p>It only has 31 views, many of which were me, so I guess not many people saw it :)</p>
<p>About point (2) above, note that you'll probably need to be using a Joinmarket bech32 wallet (yes, we've had them for quite a while!) if you want to send or receive with Wasabi. So, more on that next:</p>
<h3>Joinmarket future plans (tentative!)</h3>
<h4>Bech32 in 0.8.0</h4>
<p>We have <a href="https://github.com/JoinMarket-Org/joinmarket-clientserver/pull/656">this</a> PR open from jules23 and it represents a very impactful (but happily, not large technically) change that is proposed: to switch to a "bech32 orderbook", by which we mean making changes like this</p>
<ul>
<li>The default wallet changes to native segwit (bech32, bc1.. addresses)</li>
<li>Joinmarket coinjoins (i.e. maker/taker coinjoins) are offered as <code>sw0reloffer</code>, <code>sw0absoffer</code> in the trading pit</li>
</ul>
<p>Both of these changes would not be "mandatory", just as when we switched to segwit in 2017, it was not mandatory, but would be default in the new version. The fees for coinjoins will be significantly reduced from the current "wrapped segwit" addresses, and we would gain better compatibility with Wasabi and a number of other modern wallets that default to bech32.</p>
<p>The general problem with these updates (which we've only done once before) is that they cause a "liquidity split" temporarily, as not everyone migrates to the new address type at the same time. This is unfortunate, but I feel less concerned about it than last time, as the amount of maker liquidity is <em>much</em> larger (more on that below, about IRC).
Another reason to be slightly unsure about this update is that taproot activation may be coming quite soon, but it seems unlikely that real activation on the live network will happen in less than a year from now, so we should probably do this anyway. That's my opinion.</p>
<p>The general idea would be to make a new 0.8.0 version next after this, including this change. More testing is needed, but it's mostly ready. If you have opinions about the technical implementation of this, feel free to discuss on the above github PR thread. For more general discussion I'd suggest using #joinmarket on freenode.</p>
<h4>New message channel implementations vs IRC</h4>
<p>This part is far more speculative. We have had several discussions about message channels over the years. As early as 2016/17 I abstracted out the message channel "layer" so that IRC was just an implementation (see <code>jmclient/jmclient/message_channel.py</code>) of a few key methods. Alternative implementations have always been possible, but nobody either found time, or found a practical way, to make an alternative implementation. This issue is becoming more pressing. As a simple example, only this week we had IRC ops come to us complaining (very politely, it wasn't a disaster) that about 450 bots had suddenly shown up in our joinmarket test pit. This is in some ways less interesting than the real scalability problem: Joinmarket uses broadcast for offers, but also a sort of "anti-broadcast" mechanism: when a new Taker shows up, they ask <em>every</em> Maker for their current offers, and the Makers <em>all</em> send them at the same time, to that one Taker. So this doesn't scale very well and IRC as a messaging layer doesn't like it; this is the main reason negotiation of a Joinmarket coinjoin takes ~ 1 minute instead of 1-5 seconds (we have to deliberately throttle/slow down messages).</p>
<p>We rather badly need a more scalable messaging layer. I'd appeal for help on this, and I'd also appeal for public discussion of ideas on github (we've had such threads in the past, but nothing really happened).</p>
<p>Let's not forget that related to all that is DOS. Depending on implementation, DOS attacks can be a real problem. Chris Belcher's fidelity bond wallets were implemented within Joinmarket's code already, earlier this year, see <a href="https://github.com/JoinMarket-Org/joinmarket-clientserver/blob/c1f34f08c52452c229319e7421bfd930f8d70a7c/docs/fidelity-bonds.md">here</a> for documentation explaining this, but implementing it as a requirement for Makers is another step, and it might be an important part of the puzzle of getting a scalable messaging layer right.</p>
<p>Getting this right won't just help Joinmarket coinjoins, but also various other systems we might want to integrate over time (SNICKER? CoinjoinXT? CoinSwap? something else?).</p><h3>The 445 BTC gridchain case</h3><p>Adam Gibson, 2020-06-15</p><p>An analysis of the "gridchain" blockchain analysis case, and the implications for Joinmarket usage.</p>
<p>For those time-constrained or non-technical, it may make sense to read
only the <a href="index.html#summary">Summary</a> section of this article. It goes
without saying that the details do matter, and reading the other
sections will give you a much better overall picture.</p>
<h2>Contents</h2>
<p><a href="index.html#background">Background - what is the "gridchain case"?</a></p>
<p><a href="index.html#change-peeling">Toxic change and peeling chains</a></p>
<p><a href="index.html#change-joinmarket">Change outputs in a Joinmarket context</a></p>
<p><a href="index.html#toxic-recall">The toxic recall attack</a></p>
<p><a href="index.html#size-factor">The size factor</a></p>
<p><a href="index.html#sudoku">Joinmarket sudoku</a></p>
<p><a href="index.html#maker-taker">Reminder on the maker-taker tradeoff</a></p>
<p><a href="index.html#address-reuse">Address reuse</a></p>
<p><a href="index.html#summary">Summary; lessons learned; advice to users</a></p>
<p><a href="index.html#already">Already implemented improvements</a></p>
<p><a href="index.html#still-needed">Still needed improvements</a></p>
<p><a href="index.html#recommendations">Recommendations for users</a></p>
<h2 id="background">Background - what is the "gridchain case"?</h2>
<p>This is a reflection on a case of reported theft as outlined
<a href="https://old.reddit.com/r/Bitcoin/comments/69duq9/50_bounty_for_anybody_recovering_445_btc_stolen/">here</a>
on reddit in early 2017 by user 'gridchain'.</p>
<p>What I won't do here is discuss the practical details of the case;
things like, whether it was a hack or an inside job, nor anything like
network level metadata, all of which is extremely important in an actual
criminal investigation. But here I'm only focusing on the role played
by Joinmarket specifically and blockchain level activity of the coins,
generally.</p>
<p>The reason for this blog post was
<a href="https://research.oxt.me/the-cold-case-files/1">this</a>
recent report by OXT Research - specifically by analyst
<a href="https://bitcoinhackers.org/@ErgoBTC">ErgoBTC</a>
(they require an email for signup to read the full report, otherwise you
only see the summary).</p>
<p>A short note of thanks here to ErgoBTC and LaurentMT and others
involved, since this kind of detailed analysis is badly needed; I hope
we will see more, specifically in public, over time (we cannot hope for
such from the deeply unethical blockchain analysis companies).</p>
<p>I'm <u>not</u> going to assume here
that you've read that report in full, but I am going to be referring to
its main set of conclusions, and analyzing them. Obviously if you want
to properly assess my statements, it's rather difficult - you'd need
full knowledge of Joinmarket's operation <em>and</em> full details of the OXT
Research analysis - and even then, like me, you will still have some
significant uncertainties.</p>
<p>So the case starts with the claimed theft in 2 parts: 45 BTC in <a href="https://blockstream.info/tx/2f9bfc5f23b609f312faa60902022d6583136cc8e8a0aecf5213b41964963881">this
txn</a>
(note I will use blockstream.info for my tx links because I find their
presentation easiest for single txs specifically; note that oxt.me's
research tool is of course a vastly superior way to see a large network
of txs, which plays a crucial role in this analysis), and a
consolidation of 400BTC in <a href="https://blockstream.info/tx/136d7c862267204c13fec539a89c7b9b44a92538567e1ebbce7fc9dd04c5a7f0">this other
txn</a>
.</p>
<p>We'll assume that both of these utxos are under the control of a single
actor/thief, henceforth just <em>A</em>.</p>
<p>Setting aside the (in some ways remarkable) timing - that <em>A</em> did not
move the coins for about 2 years - let's outline roughly what happened,
and what the report tells us:</p>
<ul>
<li>The 400BTC went into joinmarket as a maker, and did a bunch (11 to
be precise) of transactions that effectively "peeled down" (more
on this later) that 400 to perhaps 335 BTC (with the difference
going into coinjoins).</li>
<li><em>A</em> then switched to a taker role for a while, focusing on higher
denominations, ranging from ~6 BTC to as high as ~58 BTC. Many of
these coinjoins had very low counterparty numbers (say 3-5 being
typical).</li>
<li>At some point some maker activity is seen again in this same
"peeling chain"; the report terms this phase as "alternating",
but it's hard to say for sure whether some particular script is
running, whether <em>A</em> is just randomly switching roles, or what.</li>
</ul>
<p>Be aware that this simplified narrative suggests to the careless reader
that one can easily trace all the coins through all the coinjoins, which
of course is not true at all - each subsequent transaction moves some
portion into a "mixed state", but (a) we'll see later that just
"moved into mixed state" is not the end of the story for some of those
coins and (b) while this narrative is misleading for Joinmarket in
general, it is not <em>as</em> misleading in this particular case.</p>
<p>The distinction between the "second" and "third" phase as listed in
those bullet points is pretty much arbitrary, but what is not in doubt
as important is: that second phase marks a clear jump in coinjoin amount
average size (this could be read as impatience on <em>A</em>'s part - but
that's just speculation), and this resulted in small anonymity sets in
some txs - 4 and 3 in two txs, in particular. Let's continue:</p>
<ul>
<li>Within the second regime, the OXT analysis narrows in on those small
anon set, large denomination txs - can they figure out which equal
sized output belongs to <em>A</em> here? The "toxic recall attack"
(explained below) allows them to identify one coinjoin output
unambiguously - but that goes into another coinjoin. But in a
second case it allows them to reduce the anonymity set (of the equal
sized coinjoin outputs) to 2, and they trace forwards both of those
outputs.</li>
<li>One of those 2 coinjoin outputs (<a href="https://blockstream.info/tx/2dc4e88685269795aafe7459087ab613878ce7d857dd35760eefeb9caf21371b">this
txn</a>
, output index 2) pays, after several hops, into a Poloniex deposit
address in <a href="https://blockstream.info/tx/ab1e604cd959cc94b89ab02b691fe7d727d30637284e5e82908fb28b8db378f4">this
txn</a>
. Although this is several hops, and although it does not deposit
all of that ~58 BTC into Poloniex (only about half of it),
nevertheless this can be (and is) treated as a significant lead.</li>
<li>So the next step was to trace back from that specific Poloniex
deposit address, which it turned out had a bunch of activity on it.
See
<a href="https://blockstream.info/address/16vBEuZD54NzqnnSStPYxFF2aktGhhuaf1">16vBEuZD54NzqnnSStPYxFF2aktGhhuaf1</a>
. Indeed several other deposits to that single address are connected
to the same Joinmarket cluster, and specifically connected to those
smaller-anon set taker-side coinjoins. In total around 270BTC is
eventually linked from <em>A</em>'s joinmarket coinjoins to that deposit
address. Even though some of those connections are ambiguous, due to
address reuse the evidence of co-ownership appears very strong.</li>
<li>Some further evidence is provided (though I am still fuzzy on the
details, largely just because of the time needed to go through it
all) linking more of the coins to final destinations, including some
from the 45BTC original chunk. The claim is that 380BTC is linked at
final destinations to the original 445BTC set. In the remainder
I'll focus on what is already seen with this 270BTC set and only
peripherally mention the rest - there is already a lot to chew on!</li>
</ul>
<h2 id="change-peeling">Toxic change and peeling chains</h2>
<p>The general idea of a "peeling chain" on the Bitcoin blockchain isn't
too hard to understand. Given 100 BTC in a single utxo, if I have to
make a monthly payment of 1 BTC and never use my wallet otherwise, then
clearly the tx sequence is (using (input1, input2..):(output1,
output2..) as a rudimentary format): (100):(1,99), (99):(1,98),
(98):(1,97)... Ignoring fees of course. What matters here is that I
just always have a single utxo and that on the blockchain <em>my</em> utxos
<em>may</em> be linked as (100-99-98-97...) based on a <a href="https://en.bitcoin.it/wiki/Privacy#Change_address_detection">change
heuristic</a>
such as "round amount for payment". To whatever extent change
heuristics work, then to that extent ownership can be traced through
simple payments (especially and mostly if transactions have exactly two
outputs, so that the very <em>idea</em> of change, let alone a change
heuristic, applies straightforwardly).</p>
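<p>The 100 BTC example above can be written as a tiny simulation (hypothetical code, purely for illustration, not part of any wallet), together with the naive change heuristic an analyst might apply:</p>

```python
# Hypothetical sketch (not Joinmarket code): simulate the peel chain in the
# text and "trace" it with a naive change heuristic. Amounts in BTC; fees
# are ignored, as in the example above.

def peel_chain(start, payment, n_peels):
    """Build the (inputs):(outputs) sequence of a simple peel chain."""
    txs, remaining = [], start
    for _ in range(n_peels):
        change = remaining - payment
        txs.append(((remaining,), (payment, change)))
        remaining = change
    return txs

def trace_change(txs):
    """Naive change heuristic: the non-round (here, simply the larger)
    output is taken to be the change, linking each tx to the next."""
    return [max(outputs) for _, outputs in txs]

txs = peel_chain(100, 1, 3)
print(txs[0])             # ((100,), (1, 99))
print(trace_change(txs))  # [99, 98, 97]
```

<p>To whatever extent the heuristic holds, the analyst recovers the whole chain of "my" utxos from public data alone.</p>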
<p><img alt="peeling chain simple example" src="https://web.archive.org/web/20200713230834im_/https://joinmarket.me/static/media/uploads/.thumbnails/PeelingChain1.png/PeelingChain1-418x296.png" width="418" height="296"></p>
<p>In peeling chains, sometimes, the primary heuristic is the <strong>size</strong> of
the output. If you start with 1000 btc and you peel 0.1 btc hundreds of
times, it's obvious from the "size pattern" what the change is (and
indeed it's this case that gives rise to the name <em>peel chain</em> because
"peel" refers to taking off a <em>small</em> part of something, usually its
surface). The above diagram is more similar (but not the same, exactly)
as the initial flow in the gridchain case, with one very large utxo
gradually getting peeled off.</p>
<p>In some cases timing may factor in; sometimes hackers will do hundreds
of such peels off a main originating utxo in a short time.</p>
<p>You can think of a peeling chain as the lowest effort ownership
obfuscation out there. Notice how literally any, even the simplest,
Bitcoin wallet, has to offer the feature required to carry this out -
just make a vanilla payment, for which there is (almost always, but not
always) a change output, back to your wallet.</p>
<p>So in Bitcoin's history, this technique has very often been seen used -
by hackers/thieves moving coins "away" from the original site of the
theft (I remember the <a href="http://www.techienews.co.uk/973470/silk-road-like-sheep-marketplace-scams-users-39k-bitcoins-worth-40-million-stolen/">case of Sheep
Market</a>
for example). Each "peel" raises additional uncertainty; the
non-change output is going somewhere, but who owns that? But the change
outputs represent a link allowing someone, in theory, to keep tracing
the activity of the original actor. Notice here how we talk about one
branch (our (100):(1,99), (99):(1,98), (98):(1,97)... example
illustrates it); but one could keep tracing the payment outputs (the
'1's in that flow) and see if they themselves form other peel chains,
leading to a tree.</p>
<p>We mentioned a 'change heuristic' element to this - which is the
"main branch" if we're not sure which output is the change?</p>
<h3 id="change-joinmarket">Change outputs in a Joinmarket context</h3>
<p>A reader should from this point probably be familiar with the basics of
Joinmarket's design. Apart from the
<a href="https://github.com/Joinmarket-Org/joinmarket-clientserver">README</a>
and <a href="https://github.com/JoinMarket-Org/joinmarket-clientserver/blob/master/docs/USAGE.md">usage
guide</a>
of the main Joinmarket code repo, the diagrams showing the main
Joinmarket transaction types
<a href="https://github.com/AdamISZ/JMPrivacyAnalysis/blob/master/tumbler_privacy.md#joinmarket-transaction-types">here</a>
may be useful as a refresher or a reference point for the following.</p>
<p>We have: \(N\) equal outputs and \(N\) or \(N-1\) non-equal change
outputs, where \(N-1\) happens when the taker does a "sweep",
emptying the mixdepth (= account; joinmarket wallets have 5 accounts by
default) without a change output. <u>This last feature is specific to
Joinmarket, and specific to the taker role: there's no other coinjoin
out there that provides the facility to sweep an arbitrary amount of
coins out to an equal-sized output, with no
change.</u> (I am emphasizing this not
for marketing, but because it's crucial to this topic, and not widely
understood I think).</p>
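<p>To make the output-count arithmetic concrete, here is a minimal sketch (my own illustration, not wallet code) counting outputs for a coinjoin with \(N\) participants, i.e. one taker and \(N-1\) makers:</p>

```python
# Toy illustration (not wallet code) of the output structure described
# above: N equal-sized outputs, one change output per maker, plus the
# taker's change output *unless* the taker does a sweep.

def output_counts(n_participants, taker_sweeps):
    equal_outputs = n_participants
    makers_change = n_participants - 1
    change_outputs = makers_change + (0 if taker_sweeps else 1)
    return equal_outputs, change_outputs

print(output_counts(4, taker_sweeps=False))  # (4, 4): N equal, N change
print(output_counts(4, taker_sweeps=True))   # (4, 3): sweep, no taker change
```
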
<p>As an example of why it's important, here is one line from the OXT
Research article:</p>
<blockquote>
<p><em>Fees taken directly in a mix transaction result in deterministic
links ("unmixed change").</em></p>
</blockquote>
<p>This is false as an absolute statement; fees can be paid by a taker,
inside the transaction, with no unmixed change for the taker (this is
the Joinmarket 'sweep'). Deterministic links between inputs and change
outputs <em>do</em> result from change, and fees <em>do</em> create an additional flag
that can help make those linkages, in cases where there would be more
ambiguity. But a zero fee coinjoin with change outputs still has
deterministic links, usually.</p>
<p>Why does the OXT Research article heavily focus on <em>toxic unmixed
change</em> as a concept and as a key weakness of such protocols as
Joinmarket, and why do I disagree?</p>
<p>As we discussed peeling chains offer a low quality of obfuscation, and
to unpack that: the problem is that if you have any relatively viable
change heuristic (it doesn't <em>have</em> to be large amounts as discussed),
it can let you keep knowledge of ownership of a whole chain of
transactions. That basically gives the blockchain analyst (we'll call
<em>B</em>) a very large attack surface. He can look at <em>all</em> the information
flowing out of, or associated with, a whole chain of transactions. Any
later recombination of outputs from that "large attack surface" is
either a coinjoin or a "smoking gun" that different outward paths were
actually under the control of one owner (this comes back to that central
heuristic - common input ownership, and all the nuance around that).</p>
<p>In Joinmarket or any other coinjoin protocol that does allow change
outputs, "change heuristic" doesn't really apply, it kind of morphs
into something else: it's very obvious which outputs are change, but it
is only <em>in some cases</em> easy to disentangle which change outputs are
associated to which inputs, and that's actually what you need to know
if you want to trace via the change (as per "peeling chains"
description above). In high anonymity sets, it starts to get difficult
to do that disentangling, but more on that ("sudoku") later.</p>
<p>The analysis done in the OXT Research report smartly combines a long
peeling chain with other specific weaknesses in the way <em>A</em> acted, which
we will discuss in the next section. So all this is very valid in my
view.</p>
<p><u>But I think going from the above to the conclusion "coinjoins which
have unmixed change are fundamentally inferior and not viable, compared
to coinjoins without unmixed change" is just flat out
wrong</u>. Consider yourself in the
position of <em>A</em>. You have let's say 400BTC in a single utxo. If you run
a coinjoin protocol that insists on no change always, and without a
market mechanism, you are forced to use a fixed denomination, say 0.1
BTC (an example that seems common), now across thousands of
transactions. In order to create these fixed denomination utxos you are
faced with the same problem of trying to avoid a trivial peeling chain.
By insisting on no deterministic links within the coinjoin, you simply
move the problem to an earlier step, you do not remove it.</p>
<p>Fixed denomination does not solve the problem of having an unusually
large amount to mix compared to your peers.</p>
<p>Having said that, fixed denomination with no change at all, does create
other advantages - I certainly don't mean to disparage that model!
Without going into detail here, consider that a large set or network of
all-equal-in all-equal-out coinjoins can create similar effects to a
single, much larger, coinjoin (but this is a topic for another article).</p>
<h2 id="toxic-recall">The toxic recall attack</h2>
<p>Earlier we explained that one of the steps of the OXT Research analysis
was to identify a low liquidity regime where <em>A</em> was acting as taker,
and we mentioned the "toxic recall attack" was used to reduce the
anonymity sets of the coinjoin outputs, during this, to a level low
enough that simple enumeration could find good candidates for final
destinations of those coins.</p>
<p>Embedded in this was a crucial piece of reasoning, and I think this was
a both excellent, and very important idea:</p>
<ul>
<li><strong>Joinmarket does not allow co-spending of utxos from different
accounts</strong></li>
<li>That means that if a coinjoin output <em>X</em> is spent along with a utxo
from the "peeling chain" (i.e. they are both inputs to the same
tx), then <em>X</em> is not owned by <em>A</em> (assuming correct identification
of <em>A</em>'s peeling chain)</li>
<li>Every time such an event occurs, that <em>X</em> can be crossed off the
list of coinjoin outputs that <em>A</em> might own, thus reducing the
anonymity set of that earlier coinjoin by 1.</li>
</ul>
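<p>The elimination logic in those bullet points can be sketched as follows (a toy model I wrote for illustration; the identifiers and data structures are invented, and this is not the OXT methodology verbatim):</p>

```python
# Toy model of the "toxic recall" elimination step. Every equal-sized
# coinjoin output starts as a candidate for A; any output later co-spent
# with a known peel-chain utxo cannot be A's (Joinmarket won't co-spend
# across accounts), so it is crossed off, shrinking the anonymity set.

def reduce_anon_set(cj_outputs, cospend_txs, peel_chain_utxos):
    """cospend_txs: one set of utxo ids per later transaction's inputs."""
    candidates = set(cj_outputs)
    for inputs in cospend_txs:
        if inputs & peel_chain_utxos:   # this tx spends a peel-chain utxo...
            candidates -= inputs        # ...so co-spent cj outputs aren't A's
    return candidates

outs = {"out1", "out2", "out3", "out4"}  # 4 equal-sized coinjoin outputs
peel = {"peel1", "peel2"}                # utxos on A's identified peel chain
txs = [{"out2", "peel1"}, {"out4", "peel2"}]
print(reduce_anon_set(outs, txs, peel))  # anonymity set shrinks from 4 to 2
```
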
<p>The reasoning is not perfectly watertight:</p>
<p>First, as the report observes: the first assumption behind it is "A
user can only run one mixing client at a time." This is clearly not
literally true, but like many things here, a good-enough guess is fine,
if it eventually leads to outcomes that further strengthen the case. And
that is definitely true here: while a smart operator probably would be
running more than one instance of Joinmarket code, it is not default
behaviour and requires both a little coding and some careful thought.
Most people would not do this.</p>
<p>(Second, nothing stops a user from making a coinjoin to an address in
the same mixdepth (at least in the current software). It's just that
(a) that is heavily discouraged and (b) it's not easy to see a good
reason why someone would <em>try</em> to do that. Still it is possible as a
mistake. But I don't think this is a reason to doubt the effectiveness
of the "toxic recall attack", just, it should be noted.)</p>
<p>So overall the bolded sentence is the most interesting - Joinmarket's
intention is to prevent co-spending outputs which would ruin the effect
of any single coinjoin - i.e. it tries (caveat: above parenthetical) to
prevent you using both a coinjoin output and the change output (or any
other utxo in the same account as the change output and the original
inputs) together. And this small element of 'rigidity' in how coins
are selected for spending is actually another 'bit' of information
that <em>B</em> can use to make deductions, at least some of the time.</p>
<p>The following diagram tries to illustrate how these conditions lead to
the possibility of the attack, to reduce the anonymity set of coinjoin
outputs:</p>
<p><img alt="Toxic recall attack illustration" src="https://web.archive.org/web/20200713230834im_/https://joinmarket.me/static/media/uploads/.thumbnails/ToxicRecall1.png/ToxicRecall1-692x490.png" width="692" height="490"></p>
<p>So in summary we see 4 really important factors leading to the attack's
viability:</p>
<ol>
<li>Joinmarket's strict account separation</li>
<li>Linkability via change - as we'll describe in the next section
"Joinmarket sudoku", this is <em>usually</em> but not always possible, so
while (1) was 99% valid this is more like 75% valid (entirely vague
figures of course).</li>
<li>Reusing the same peers in different coinjoin transactions</li>
<li>Low number of peers</li>
</ol>
<p>Of course, 3 and 4 are closely tied together; reuse of peers happened a
lot precisely because there were so few peers available for large
coinjoin sizes (to remind you, it was between 6 and 58 BTC, and the
average was around 27, and there are/were few Joinmarket peers actually
offering above say 10BTC).</p>
<h2 id="size-factor">The size factor</h2>
<p>This is a thread that's run through the above, but let's be clear
about it: in practice, typical Joinmarket coinjoins run from 0.1 to 10
BTC, which is unsurprising. There are a fair number of much smaller
transactions, many just functioning as tests, while <em>really</em> small
amounts are not very viable due to the fees paid by the taker to the
bitcoin network. Larger than 10 BTC are certainly seen, including up to
50 BTC and even beyond, but they appear to be quite rare.</p>
<p>The actions of <em>A</em> in this regard were clearly suboptimal. They started
by taking 4 x 100 BTC outputs and consolidating them into 1 output of
400 BTC. This was not helpful, if anything the opposite should have been
done.</p>
<p>Second, as a consequence, they placed the entirety of this (I'm
ignoring the 45 BTC output for now as it's not that crucial) in one
mixdepth. For smaller amounts where a user is just casually offering
coins for joining, one output is fine, and will rapidly be split up
anyway, but here this very large size <u>led to most of the large-ish
joining events forming part of one long peeling
chain.</u> This part probably isn't
clear so let me illustrate. A yield generator/maker usually splits up
its coins into random chunks pretty quickly, and while as a maker they
do <strong>not</strong> get the crucial "sweep, no change" type of transaction
mentioned above, they nevertheless do get fragmentation:</p>
<p><code>Initial deposit --> After 1 tx --> After 2 txs --> After many txs</code></p>
<p><code>0: 1BTC --> 0.800 BTC --> 0.800 BTC --> 0.236 BTC</code></p>
<p><code>1: 0 BTC --> 0.205 BTC --> 0.110 BTC --> 0.001 BTC</code></p>
<p><code>2: 0 BTC --> 0.000 BTC --> 0.100 BTC --> 0.555 BTC</code></p>
<p><code>3: 0 BTC --> 0.000 BTC --> 0.000 BTC --> 0.129 BTC</code></p>
<p><code>4: 0 BTC --> 0.000 BTC --> 0.000 BTC --> 0.107 BTC</code></p>
<p>(Final total is a bit more than 1BTC due to fees; the reason it gets
jumbled, with no ordering, is: each tx moves coinjoin output to <em>next</em>
mixdepth, mod 5 (ie it wraps), but when a new tx request comes in it
might be for any arbitrary size, so the mixdepth used as <em>source</em> of
coins for that next transaction, could be any of them. This is
illustrated in the 'after 2 txs' case: the second mixdepth was chosen
as input to the second tx, not the first mixdepth).</p>
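<p>The fragmentation dynamic just described can be mimicked with a toy simulation (my own sketch; the real wallet's coin selection and maker fee income are more involved, and are ignored here):</p>

```python
import random

# Toy model (not wallet code) of how a maker's single deposit fragments
# across 5 mixdepths: each coinjoin spends from whichever mixdepth can
# cover the requested amount, sends the coinjoin output to the *next*
# mixdepth (mod 5), and the change is the balance left in the source.

def maker_fragmentation(deposit, requested_amounts, seed=1):
    rng = random.Random(seed)
    mixdepths = [deposit, 0.0, 0.0, 0.0, 0.0]
    for amount in requested_amounts:
        # any mixdepth with enough balance may serve as the source
        eligible = [i for i, bal in enumerate(mixdepths) if bal >= amount]
        src = rng.choice(eligible)
        mixdepths[src] -= amount             # inputs leave the source;
        mixdepths[(src + 1) % 5] += amount   # cj output lands in next md
    return mixdepths

print(maker_fragmentation(1.0, [0.2, 0.1, 0.15]))
```

<p>Run with more requests, the balances jumble across all five mixdepths while the total is conserved, matching the table above.</p>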
<p>This dynamic does <strong>not</strong> remove the "peeling chain" or "toxic
change" dynamic emphasized in OXT Research's report - because every tx
done by the maker still has its change, <u>precisely because as maker you
don't have the privilege of choosing the
amount</u>.</p>
<p>But it does result in more so to speak "parallelisation" of the mixing
activity, instead of the largest chunk all being in one long chain.</p>
<p>A question remains, if we imagine that we use much smaller amounts - can
the analyst always follow the "peeling chain of each mixdepth" (to
coin a phrase which at this point hopefully makes sense)?</p>
<p>I think actually the answer is more 'no' than you might at first
think. The next section will illustrate.</p>
<h2 id="sudoku">Joinmarket sudoku.</h2>
<p>This concept including its origination is covered in some detail in my
earlier article
<a href="https://github.com/AdamISZ/JMPrivacyAnalysis/blob/master/tumbler_privacy.md#jmsudoku-coinjoin-sudoku-for-jmtxs">here</a>.
Essentially we are talking about making unambiguous linkages between
change outputs and the corresponding inputs in any given Joinmarket
coinjoin. I reproduce one transaction diagram from that article here to
help the reader keep the right idea in mind:</p>
<p><img alt="Coinjoin canonical" src="https://web.archive.org/web/20200713230834im_/https://joinmarket.me/static/media/uploads/cjmtx.svg" width="550" height="389"></p>
<p>So to effect this "sudoku" or disentangling, let's suppose you don't
have any sophistication. You're just going to iterate over every
possible subset of the inputs (they're randomly ordered, of course) and
see if it matches any particular change output (you assume that there is
exactly one change output per participant). In case it wasn't obvious,
"matches" here means "that change output, plus the coinjoin size (3
btc in the diagram above), equals the sum of the subset of inputs".</p>
<p>Now none of them will <em>actually</em> match, because there are fees of two
types being paid out of (and into) the change - the bitcoin network fees
and the coinjoin fees (which are added to most participants' change and
subtracted from one, the taker's, at least usually). So since you don't
know the exact values of those fees, only a general range, you have to
include a "tolerance" parameter, which really complicates the issue.</p>
<p><a href="https://gist.github.com/AdamISZ/15223a5eab940559e5cf55e898354978">This
gist</a>
is a quick and dirty example (barely a 'program', since I just
hardcoded the values of the transaction) of doing such a Joinmarket
sudoku for one of the transactions in the OXT Research analysis of
flows for this case. The pythonistas out there might find this code
snippet for finding the "power set" (the set of all subsets)
particularly interesting:</p>
<pre><code>from itertools import chain, combinations

def power_set(l):
    iil = range(len(l))
    return list(chain.from_iterable(combinations(iil, r) for r in range(len(iil)+1)))
</code></pre>
<p>As per a very beautiful piece of mathematical reasoning, the power set
of a set of size \(N\) has cardinality \(2^{N}\) (every member of the
set is either in, or not in, each subset - think about it!). So this
matters because it illustrates, crudely, how we have here an exponential blowup.</p>
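<p>As a toy illustration of both points (the power-set iteration and the tolerance problem), here is a brute-force sudoku sketch; the amounts are invented for the example, not taken from the real transaction:</p>

```python
from itertools import chain, combinations

# Brute-force "Joinmarket sudoku": find input subsets whose sum matches
# (change output + coinjoin amount) to within a fee tolerance.
def power_set(vals):
    return chain.from_iterable(
        combinations(vals, r) for r in range(len(vals) + 1))

def sudoku(inputs, change_outputs, cj_amount, tolerance):
    matches = []
    for subset in power_set(inputs):
        for change in change_outputs:
            # fees push the subset sum a little above or below the target
            if abs(sum(subset) - (change + cj_amount)) <= tolerance:
                matches.append((subset, change))
    return matches

inputs = [150, 80, 75, 200, 55, 130]   # invented amounts (say, mBTC)
changes = [101, 58]                    # candidate change outputs
solutions = sudoku(inputs, changes, cj_amount=127, tolerance=2)
# loosening the tolerance tends to produce *multiple* candidate
# (input subset, change) pairings - the ambiguity discussed above
```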
<p>That particular transaction had 24 inputs, so the power set's
cardinality would be \(2^{24}\) - but the analysis begins by taking a
subset, of size 4, that you already conclude to be linked, thus
reducing the size of the search space by a factor of \(2^{4}=16\). Now, there's a
lot more to it, but, here's what's interesting: <strong>depending on the
tolerance you choose, you will often find there are multiple sudoku
solutions</strong> if the size of the set of inputs is reasonably large (let's
say 20 and up, but it isn't possible to fix a specific number of
course). In the first couple of attempts of finding the solution for
that transaction, I found between 3 and 7 different possible ways the
inputs and outputs could connect; some of them involve the pre-grouped 4
inputs acting as taker (i.e. paying fees) and some involve them acting
as maker.</p>
<p>Now, if this ambiguity isn't enough, there's another significant
source of ambiguity in these sudokus: previous equal-sized coinjoin
outputs. For example take <a href="https://blockstream.info/tx/5f8747a3837a56dd2f422d137b96b1420fd6885be6d1057f3c4dca102a3138b6?output:5">this
txn</a>:</p>
<p><img alt="un-sudoku-able tx" src="https://web.archive.org/web/20200713230834im_/https://joinmarket.me/static/media/uploads/.thumbnails/tx5f8747.png/tx5f8747-849x617.png" width="849" height="617"></p>
<p>There are 21 inputs, which is already in the "problematic" zone for
sudoku-ing, as discussed, in that it will tend to lead to multiple
possible solutions, with a reasonable tolerance parameter. But in this
case a full sudoku is fundamentally impossible: notice that inputs at
index 7 and 21 (counting from 0) both have amount 6.1212. This means
that any subset that includes the first sums identically to the
corresponding subset that includes the second. Those two inputs are,
unsurprisingly, outputs of the same previous Joinmarket coinjoin (they
don't have to be, though).</p>
<p>In any long "peeling chain" these ambiguities will degrade, perhaps
destroy, the signal over time - unless there is some very strong
watermark effect - such as huge size, which is precisely what we see
with <em>A</em>.</p>
<p>To summarize, here are the key points about the sudoku concept for
identifying chains of ownership:</p>
<ul>
<li>As long as you don't sweep a Joinmarket account (so it is never
emptied), it will keep creating this chain of ownership via change -
though the size of that linked amount dwindles over time.</li>
<li>Thus makers (who cannot sweep) have no guarantee of not having that
specific ownership trace persist, for each of their 5 accounts (but
<em>not</em> across them - the 5 accounts will not be connected on chain,
at least not in a trivial way).</li>
<li>If you use a very large size then this acts as a strong enough
watermark that such tracing is pretty much guaranteed to work (i.e.
the sudoku works much more reliably if you put in a 400BTC utxo and
everyone else in the coinjoin only uses 10BTC at max).</li>
<li>Otherwise, and in general, such tracing is a bit unreliable, and
over a long series of transactions it becomes very unreliable (but
again - this is no kind of privacy guarantee! - we just observe that
there will be increasing uncertainty over a long chain, including
really fundamental ambiguities like the transaction above).</li>
<li>Whenever you <em>do</em> sweep, you create what I called in the previous
article a <a href="https://github.com/AdamISZ/JMPrivacyAnalysis/blob/master/tumbler_privacy.md#joinmarket-wallet-closures">"completed mixdepth
closure"</a>;
there is no change for you as taker, and so an end to that
"chain". This only exists for takers. (You can of course also sweep
<em>without</em> any coinjoin at all.)</li>
</ul>
<h3 id="maker-taker">Reminder on the maker-taker tradeoff</h3>
<p>This illustrates another aspect of the more general phenomenon -
Joinmarket almost by definition exists to serve takers. They pay for
these advantages:</p>
<ul>
<li>As coordinator, they do not reveal linkages to their counterparties.
Makers must accept that the taker in each individual coinjoin <em>does</em>
know <em>their</em> linkages (the maker's), even if they're OK with that
over a long period because there are many disparate takers; that's
a weakness.</li>
<li>They choose the time when the coinjoin happens (within a minute or
so, it's done, if all goes well)</li>
<li>They choose the amount of the coinjoin, so can have a payment as a
coinjoin output.</li>
<li>Corollary of the above: they can control the size of their change,
in particular, reducing it to zero via a "sweep"</li>
<li>Since they run only when they want to coinjoin, they have a smaller
time footprint for attackers (makers have an "always on hot
wallet" <em>which responds to requests rather than initiates them</em> ,
so it's more like a server than a client, which is by definition
difficult to keep properly secure).</li>
</ul>
<p>These 4+ advantages are what the taker pays for, and it's interesting
that in practice the <em>coinjoin</em> fee has fallen to near-zero
<strong>except for larger sizes</strong>. I will point to my earlier thoughts on low
fees
<a href="https://x0f.org/web/statuses/104123055565241054">here</a>
to avoid further sidetrack.</p>
<p>Therefore the cool sounding idea "oh I have this bunch of bitcoin
sitting around, I'll just passively mix it for a while and actually get
paid to do it!" (I have noticed people <em>mostly</em> get interested in
Joinmarket from this perspective) is more limited than it seems.</p>
<h2>Address reuse</h2>
<p>This will probably be the shortest section because it's so obvious.</p>
<p>The fact that 270BTC of the 445 BTC going "into" Joinmarket ended up
at <code>16vBEuZD54NzqnnSStPYxFF2aktGhhuaf1</code> is kind of a big facepalm moment;
I don't think anyone reading this blog would have trouble understanding
that.</p>
<p>I don't dismiss or ignore that such things happen for a reason, and
that reason is mainly actions of centralized exchanges to deliberately
reduce the privacy of their customers ("KYC/AML"). Sometimes, of
course, sheer incompetence is involved. But it's the exception rather
than the rule, since even the most basic consumer wallets do not
generally reuse addresses nowadays. I'll consider these real world
factors out-of-scope of this article, although they will matter in your
practical real life decisions about keeping your privacy (consider <em>not</em>
using such exchanges).</p>
<p>What has to be said, though: 270 does not equal 445 (or 400); it is
entirely conceivable that such a set of deposits to one address might
never have been traced/connected to the original deposit of 400(+) into
Joinmarket (although it would really help if that total weren't so very
large that there are only a few Joinmarket participants in that range
anyway). And indeed, my own examination of the evidence tells me that
the connection of each individual final deposit to
<code>16vBEuZD54NzqnnSStPYxFF2aktGhhuaf1</code> back to that original 445 is <em>not</em>
unambiguous. The problem is of course the compounding effect of
evidence, as we will discuss in the next, final section.</p>
<h2 id="summary">Summary; lessons learned; advice to users</h2>
<p>So, having looked into the details, can we summarize what went wrong for <em>A</em>?
Although we don't actually know with certainty how many of the
attributions in the OXT Research report are correct, they appear to be <em>broadly
correct</em>.</p>
<ol>
<li>400 BTC is a very large amount to move through a system of perhaps
at best a couple hundred users (100 makers on the offer at once is
typical), most of whom are not operating with more than 10 BTC.</li>
<li>One large chunk of 400 is therefore a way worse idea than say 10
chunks of 40 across 10 Joinmarket wallets (just common sense really,
although starting with 400 is not in itself a disaster, it just
makes it harder, and slower). This would have been more hassle, and
more fees, but would have helped an awful lot.</li>
<li>Running passively as a maker proved too slow for <em>A</em> (this is an
assumption that the report makes and that I agree with, but not of
course a 'fact'). This is Joinmarket's failing if anything; there
are just not enough people using it, which relates to the next
point:</li>
<li>When switching to a taker mode (which in itself was a very good
idea), <em>A</em> decided to start doing much larger transaction sizes, but
found themselves unable to get more than a few counterparties in
some cases. This should have been a sign that the effect they were
looking for might not be strong enough, but it's very
understandable that they didn't grok the next point:</li>
<li>The "toxic replay attack" very heavily compounds the low anonymity
set problem mentioned above - reuse of the same counterparties in
successive transactions reduced the anonymity set from "bad" to
"disastrously low" (even down to 1 in one case).</li>
<li>Even with the above failings, all needn't really be lost; repeated
rounds are used, and the output with the '1' anonymity set mentioned
above was sent to another coinjoin anyway. The first chunk of coins
identified as sent to a Poloniex address (first to be identified, not
first in time) was in an amount of about 28 BTC via several hops, then
part of the 76 BTC in <a href="https://blockstream.info/tx/ab1e604cd959cc94b89ab02b691fe7d727d30637284e5e82908fb28b8db378f4">this
txn</a>,
and even the first hop only had a 50% likelihood assigned. So it's
a combination of the address being marked as in the POLONIEX
cluster, the size of the deposit, and then the reuse allowing tracing
back to other transactions, that caused a "high-likelihood
assignment of ownership", which leads into ...</li>
<li>Address reuse as discussed in the previous section is the biggest
failing here. If all the deposits here were to different exchange
addresses, these heuristics would not have led to any clear
outcomes. A few guesses here and there would exist, but they would
remain guesses, with other possibilities also being reasonable.</li>
<li>Circling back to the beginning, notice how making educated guesses
about deposits on exchanges a few hops away from Joinmarket might
already be enough to get some decent guesses at ownership, if the
sizes are large enough compared to the rest of the Joinmarket usage.</li>
</ol>
<p>So overall the post mortem is: <strong>a combination of at least three
different things led to a bad outcome for <em>A</em>: large (much bigger
than typical JM volume) size not split up, heavy address reuse (on a
centralized exchange) and a small anonymity set over a portion of the
sequence.</strong></p>
<p>This issue of "combination of factors" leading to a much worse than
expected privacy loss is explained well on the bitcoin wiki Privacy page
<a href="https://en.bitcoin.it/wiki/Privacy#Method_of_data_fusion">here</a>.</p>
<h3 id="already">Already implemented improvements</h3>
<p>When running as a taker and using the so-called <a href="https://github.com/JoinMarket-Org/joinmarket-clientserver/blob/master/docs/tumblerguide.md">tumbler
algorithm</a>
users should note that in 2019 a fairly meaningful change to the
algorithm was implemented - one part was to start each run with a sweep
transaction out of each mixdepth containing coins as the first step
(with longer randomized waits). This makes a peeling chain direct from a
deposit not possible (you can always try to guess which coinjoin output
to hop to next of course, with the concomitant difficulties).
Additionally average requested anonymity sets are increased, which, as
an important byproduct tends to create larger input sets which are
harder to sudoku (and more likely to have substantial ambiguity). There
are several other minor changes like rounding amounts, see <a href="https://gist.github.com/chris-belcher/7e92810f07328fdfdef2ce444aad0968">Chris
Belcher's document on
it</a>
for more details.</p>
<h3 id="still-needed">Still needed improvements</h3>
<p>Clearly the toxic recall attack concept matters - it is going to matter
more, statistically, as the anonymity set (i.e. the number of coinjoin
counterparties) is reduced, but it matters per se in any context -
reusing the same counterparties <strong><em>in a sequence of coinjoins from the
same mixdepth closure</em></strong> reduces the anonymity set. Notice there are a
couple of ways that situation could be remediated:</p>
<ol>
<li>Reduce the number of coinjoin transactions within the same mixdepth
closure - but this is not clear. If I do 1 coinjoin transaction with
10 counterparties and it's a sweep, closing the mixdepth closure,
is that better than doing 2 coinjoin transactions from it, each of
which has 6 counterparties, if there is a 10% chance of randomly
choosing the same counterparty and thus reducing the anonymity set
of the second coinjoin by 1? That is pretty profoundly unclear and
seems to just "depend". 1 transaction with 12 counterparties <em>is</em>
clearly better, but very large sets like that are very difficult to
achieve in Joinmarket today (particularly if your coinjoin amount is
large).</li>
<li>Actively try to prevent reusing the same counterparty for multiple
transactions in the same mixdepth closure (obviously this is for
takers; makers are not choosing, they are offering). Identification
of bots is problematic, so probably the best way to do this is
simply for a taker to keep track of its earlier txs (especially
within a tumbler run, say) and decide to not include makers when
they provide utxos that are recognized as in that set. This is still
a bit tricky in practice; makers don't want their utxos queried all
the time, but takers for optimal outcomes would like full
transparent vision into those utxo sets - see <a href="https://web.archive.org/web/20200713230834/https://joinmarket.me/blog/blog/poodle/">earlier discussion of
PoDLE</a>
and
<a href="https://web.archive.org/web/20200713230834/https://joinmarket.me/blog/blog/racing-against-snoopers-in-joinmarket-02/">here</a>
on this blog for the tricky points around this.</li>
</ol>
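<p>Remediation (2) above could be sketched, purely hypothetically (this is not Joinmarket code; the class and names are invented for illustration), as a taker-side filter over utxos seen earlier in the same run:</p>

```python
# Hypothetical taker-side filter for remediation (2): refuse makers whose
# offered utxos were already seen earlier in the same mixdepth closure
# (e.g. within one tumbler run). Not Joinmarket's actual implementation.
class CounterpartyFilter:
    def __init__(self):
        self.seen_utxos = set()   # "txid:n" strings seen in earlier txs

    def acceptable(self, offered_utxos):
        """Accept a maker only if none of its utxos were seen before."""
        return self.seen_utxos.isdisjoint(offered_utxos)

    def record(self, offered_utxos):
        """Remember the utxos a chosen maker actually contributed."""
        self.seen_utxos.update(offered_utxos)

f = CounterpartyFilter()
f.record({"txid1:0", "txid2:1"})            # makers used in the first coinjoin
ok = f.acceptable({"txid3:0"})              # a fresh maker is fine
bad = f.acceptable({"txid2:1", "txid4:0"})  # overlaps an earlier maker's utxo
```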
<p>(2) is an example of ideas that were discussed by Joinmarket
developers years ago, but never really went anywhere. Takers probably
<em>should</em> expand the query power given them by the PoDLE tokens to have a
larger set of options to choose from, to gauge the "quality" of what
their counterparties propose as join inputs, but it's a delicate
balancing act, as mentioned.</p>
<h3 id="recommendations">Recommendations for users</h3>
<p>For the final section, some practical advice. Joinmarket can be a
powerful tool - but it's unfortunately not very easy to understand what
you <em>should</em> do, precisely because there is a lot of flexibility.</p>
<ol>
<li>The taker role, and in particular the tumbler role, are designed to
be used to actively improve your privacy. We explain above that it
gives certain advantages over the maker role. So: <strong>use it!</strong> - with
at least the default settings for counterparty numbers and
transaction numbers, and long waits - in fact, increase these
factors above the defaults. Note that tumbles can be safely
restarted, so do it for a day, shut it down and then restart it a
few days later - that's fine. See the docs for more on that. Be
sensitive to bitcoin network fees - these transactions are very
large so they'll be more palatable at times when the network is
clearing 1-5 sats/byte. However ...</li>
<li>... mixing roles definitely has advantages. The more people mix
roles the more unsafe it is to make deductions about which coinjoin
output belonged to the taker, after it gets spent (consider what you
can deduce about a coinjoin output which is then spent via an
ordinary wallet, say to a t-shirt merchant).</li>
<li>The maker role isn't useless for privacy, it's rather best to
think of it as (a) limited and (b) taking a long time to have an
effect. It's most suitable if your threat model is "I don't want
a clear history of my coins over the long term". It also costs
nothing monetarily and brings in some very small income if your size
is large (if small, it's likely not worth mentioning) - but in that
case, take your security seriously.</li>
<li>Consider sizing when acting as a taker. We as a project should
perhaps create more transparency around this, but you can gauge from
your success in arranging big size coinjoins: if you can't easily
find 6+ counterparties to do a coinjoin at a particular size, it may
not be a good idea to rely on the outcomes, as you may be mixing in
too small of a crowd (whether that's at 10 BTC or 20 BTC or 50+ BTC
just depends on market conditions).</li>
<li>Make good use of (a) the accounts (mixdepths) feature, (b) the coin
freeze feature and (c) the sweep feature (taker only). These three
things allow you to better isolate coins going to different
destinations - your cold wallet, your mobile spending wallet, an
exchange etc etc. Accounts let you have the assurance that coins in
one aren't linked with coins in another; you can't accidentally
co-spend them. The freeze feature (see the "Coins" tab on Qt) lets
you spend individual utxos, where that's important to you for some
reason, without connection to others. And the sweep feature lets you
make a coinjoin without any change, breaking a link to future
transactions.</li>
</ol>
<p>We soon (in 0.7.0; the code is basically already done) hope to have more
helpful features, in particular Payjoin as defined in BIP 78, along with
very basic PSBT support.</p>Schnorrless Scriptless Scripts2020-04-15T00:00:00+02:002020-04-15T00:00:00+02:00Adam Gibsontag:joinmarket.me,2020-04-15:/blog/blog/schnorrless-scriptless-scripts/<p>a new ECDSA single-signer adaptor signature construction.</p><h3>Schnorrless Scriptless Scripts</h3>
<h2>Introduction</h2>
<p>The weekend of April 4th-5th 2020 we had a remote "Lightning
Hacksprint" organized by the ever-excellent Fulmo. One challenge was
related to "Payment Points" (see
<a href="https://wiki.fulmo.org/index.php?title=Challenges#Point_Time_Locked_Contracts_.28PTLC.29">here</a>;
see lots more info about the hacksprint at that wiki) and was based
around a new innovation recently seen in the world of adaptor
signatures. Work was led by Nadav Kohen of Suredbits and Jonas Nick of
Blockstream; the latter's API for the tech described below can
currently be seen as a PR to the secp256k1 project
<a href="https://github.com/jonasnick/secp256k1/pull/14">here</a>.
The output from Suredbits was a demo, as shown
<a href="https://www.youtube.com/watch?v=w9o4v7Idjno&feature=youtu.be">here</a>
on their youtube, of a PTLC (point time locked contract; see their
<a href="https://suredbits.com/payment-points-part-1/">blog</a>
for more details on that).</p>
<p>I will not focus here on either the proof of concept code, nor the
potential applications of this tech (which are actually many: not only
LN, but also Discreet Log Contracts, various designs of tumbler and
others), but entirely on the cryptography.</p>
<h2>What you can do with Schnorr adaptors</h2>
<p>Previous blog posts have covered in some detail the concept of adaptor
signatures and how they are simply realizable using the Schnorr signature
primitive. Also noted here and elsewhere is that there are techniques to
create the same effect using ECDSA signatures, but involving considerable
additional crypto machinery (Paillier homomorphic encryption and certain
zero knowledge (range) proofs). This technique is laid out in
<a href="https://lists.linuxfoundation.org/pipermail/lightning-dev/attachments/20180426/fe978423/attachment-0001.pdf">this</a>
brief note, fleshed out fully in
<a href="https://eprint.iacr.org/2017/552">this</a>
cryptographic construction from Lindell, paired with
<a href="https://eprint.iacr.org/2018/472">this</a>
paper on multihop locks (which represents a very important theoretical
step forward for Lightning channel construction). The problem with that
"tech stack" is the complexity in the Lindell construction, as
mentioned.</p>
<p>A recent
<a href="https://github.com/LLFourn/one-time-VES/blob/master/main.pdf">paper</a>
by Lloyd Fournier represents a very interesting step forward, at least
in a certain direction: it allows "single signer" ECDSA adaptor
signatures. The scare quotes in the previous sentence represent the fact
that the use of such adaptor signatures would not literally be single
signer - it would be in the context of Bitcoin's <code>OP_CHECKMULTISIG</code>,
most typically 2 of 2 multisig, so the same as the current
implementation of the Lightning network, in which a contract is enforced
by having funds controlled by both parties in the contract. Here, what
is envisaged is not a cooperative process to construct a single
signature (aggregated), but each party can individually create adaptor
signatures with signing keys they completely control. That this is
possible was a big surprise to me, and I think others will be unclear on
it too, hence this blog post after a week or so of study on my part.</p>
<p>Let's remember that the Schnorr adaptor signature construction is:</p>
<p>\(\sigma'(T, m, x) = k + H(kG+T||xG||m)x\)</p>
<p>where \(k\) is the nonce, \(x\) is the private (signing) key and
\(T\) is the 'adaptor point' or just adaptor. The left-hand-side
parentheses are important: notice that <strong>you don't need the discrete
log of the point T to construct the adaptor signature</strong>. But you <em>do</em>
need the signing key \(x\). Or wait .. do you?</p>
<p>As I explained last year
<a href="https://x0f.org/web/statuses/102897691888130818">here</a>
it's technically not the case: you can construct an adaptor signature
for signing pubkey \(P\) for which you don't know \(x\) s.t.
\(P=xG\), with a fly in the ointment: you won't be able to predict
the adaptor \(T\) or know its discrete log either (this makes it
un-dangerous, but still an important insight; I was calling this
"forgeability" but more on that later).</p>
<p>How you ask? To summarize the mastodon post:</p>
<p>\(q \stackrel{\$}{\leftarrow} \mathbb{Z}_N, \quad Q=qG, \quad \mathrm{assume}\quad
R+T =Q\)</p>
<p>\(\Rightarrow \sigma' G= R + H(P,Q,m)P\)</p>
<p>\(\sigma' \stackrel{\$}{\leftarrow} \mathbb{Z}_N \quad \implies R = \sigma'G -
H(P,Q,m)P \implies T = Q-R\)</p>
<p>Thus anyone can publish an adaptor signature \((T, \sigma')\) on any
message \(m\) for any pubkey \(P\) at any time. It <em>really</em> isn't a
signature.</p>
<p>And equally obvious is that this does not allow the "forger" to
complete the adaptor into a full signature (\(\sigma = \sigma' +
t\)) - because if he could, this would be a way to forge arbitrary
Schnorr signatures!</p>
<p>Setting the caveat in the above little mathematical vignette aside, we note
that the bolded phrase above is the crucial point: adaptors can be
created by non-secret owners, for secret owners to complete.</p>
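<p>To make the ordinary Schnorr adaptor flow concrete (create, verify, complete, extract the secret), here is a toy sketch. It deliberately uses the multiplicative group mod a small prime rather than an elliptic curve, so the algebra is easy to follow; nothing here is remotely secure or a real API:</p>

```python
import hashlib
import secrets

# Toy Schnorr adaptor signature in the multiplicative group mod p,
# i.e. sigma' = k + H(R*T || P || m) * x, completed by adding t.
p, q, g = 1019, 509, 4   # p = 2q + 1; g generates the order-q subgroup

def H(*parts):
    h = hashlib.sha256("|".join(str(v) for v in parts).encode()).digest()
    return int.from_bytes(h, "big") % q

x = secrets.randbelow(q - 1) + 1   # signing key
P = pow(g, x, p)                   # public key
t = secrets.randbelow(q - 1) + 1   # adaptor secret
T = pow(g, t, p)                   # adaptor point
k = secrets.randbelow(q - 1) + 1   # nonce
R = pow(g, k, p)
m = "message"

e = H(R * T % p, P, m)             # hash covers the *tweaked* nonce point
sigma_prime = (k + e * x) % q      # adaptor signature: not yet a valid sig

# Anyone can verify the adaptor against the untweaked nonce point R:
assert pow(g, sigma_prime, p) == R * pow(P, e, p) % p

# Completing with t yields a valid Schnorr signature on nonce point R*T:
sigma = (sigma_prime + t) % q
assert pow(g, sigma, p) == (R * T % p) * pow(P, e, p) % p

# And anyone who sees both sigma and sigma' extracts the secret t:
assert (sigma - sigma_prime) % q == t
```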
<h2>Adaptors in ECDSA with less wizardry</h2>
<p>I was alerted to this trick via <a href="https://lists.linuxfoundation.org/pipermail/lightning-dev/2019-November/002316.html">this mailing list
post</a>
and the work of the Suredbits guys, in particular Nadav Kohen, who blogs
on payment points, DLCs and related topics
<a href="https://suredbits.com/payment-points-part-1/">here</a>.
The idea can be summarised as "tweak the nonce multiplicatively instead
of linearly". Take the following notation for the base (complete) ECDSA
signature:</p>
<p>\(\sigma = k^{-1}\left(\mathbb{H}(m) + R_{\mathrm{x}}x\right)
\)</p>
<p>Here we're using the most common, if sometimes confusing, notation. As
usual \(k\) is the nonce (generated deterministically usually),
\(R=kG\), \(m\) is the message and \(x\) is the private signing
key whose public key by convention is \(P\). Meanwhile
\(R_{\mathrm{x}}\) indicates the x-coordinate of the curve point
\(R\), with the usual caveats about the difference between the curve
order and the order of the finite field from which the coordinates are
drawn (feel free to ignore that last part if it's not your thing!).</p>
<p>Now clearly you cannot just add a secret value \(t\) to the nonce and
expect the signature \(\sigma\) to be shifted by some simple factor.
Multiplication looks to make more sense, since after all the nonce is a
multiplicative factor on the RHS. But it's not so simple, because the
nonce-<em>point</em> appears as the term \(R_{\mathrm{x}}\) inside the
multiplied factor. The clever idea is how to get around this problem. We
start by defining a sort-of "pre-tweaked" nonce:</p>
<p>\(R' = kG\)</p>
<p>and then the real nonce that will be used will be multiplied by the
adaptor secret \(t\):</p>
<p>\(R = kT = ktG\)</p>
<p>Then the adaptor signature will be published as:</p>
<p>\(\sigma' = k^{-1}\left(\mathbb{H}(m) + R_{\mathrm{x}}x\right)
\)</p>
<p>... which may look strange as here the RHS is identical to what we
previously had for the <em>complete</em> signature \(\sigma\). The
difference of course is that here, the terms \(k\) and \(R\) don't
match up; \(R\) has private key \(kt\) not \(k\). And hence we can
easily see that:</p>
<p>\(\sigma = t^{-1} \sigma'\)</p>
<p><em>will</em> be a valid signature, whose nonce is \(kt\).</p>
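<p>The whole construction can be checked end to end in a few lines. The sketch below uses secp256k1 parameters with schoolbook affine point arithmetic; it illustrates the equations only (non-constant-time, no input validation) and is not production code:</p>

```python
import hashlib
import secrets

# End-to-end check of the ECDSA adaptor equations above, on secp256k1 with
# schoolbook affine point arithmetic. Illustration only - never use with
# real keys.
p = 2**256 - 2**32 - 977
n = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)

def add(A, B):
    if A is None: return B
    if B is None: return A
    if A[0] == B[0] and (A[1] + B[1]) % p == 0: return None  # inverse points
    if A == B:
        lam = 3 * A[0] * A[0] * pow(2 * A[1], -1, p) % p     # doubling (a = 0)
    else:
        lam = (B[1] - A[1]) * pow(B[0] - A[0], -1, p) % p
    x3 = (lam * lam - A[0] - B[0]) % p
    return (x3, (lam * (A[0] - x3) - A[1]) % p)

def mul(k, P):
    Q = None                        # double-and-add scalar multiplication
    while k:
        if k & 1: Q = add(Q, P)
        P = add(P, P)
        k >>= 1
    return Q

Hm = int.from_bytes(hashlib.sha256(b"the spending tx").digest(), "big") % n
x = secrets.randbelow(n - 1) + 1
P = mul(x, G)                       # signing key and pubkey
t = secrets.randbelow(n - 1) + 1
T = mul(t, G)                       # adaptor secret and adaptor point
k = secrets.randbelow(n - 1) + 1
Rp = mul(k, G)                      # R' = kG, the "pre-tweaked" nonce point
R = mul(k, T)                       # R = kT = ktG, the nonce actually used
r = R[0] % n                        # R_x

sigma_p = pow(k, -1, n) * (Hm + r * x) % n   # the adaptor signature sigma'

# Non-owner verification: sigma' * R' == H(m)*G + R_x * P
# (the proof that R, R' share a discrete log is handled in the text)
assert mul(sigma_p, Rp) == add(mul(Hm, G), mul(r, P))

# Completion: sigma = t^-1 * sigma' is a standard valid ECDSA sig (r, sigma)
sigma = pow(t, -1, n) * sigma_p % n
w = pow(sigma, -1, n)
assert add(mul(Hm * w % n, G), mul(r * w % n, P))[0] % n == r

# Broadcasting sigma reveals t = sigma' * sigma^-1 to the other party:
assert sigma_p * pow(sigma, -1, n) % n == t
```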
<p>However, we do not operate in a world without adversaries, so to be sure
of the statement "if I get given the discrete log of \(T\), I will be
able to construct a fully valid \(\sigma\)", we need a proof of that
claim. This is the key innovation, because this can be done <em>very</em>
simply with a proof-of-discrete-log, or a "PoDLE" as was described in
one of the first <a href="https://web.archive.org/web/20200803123741/https://joinmarket.me/blog/blog/poodle/">blog
posts</a>
here. To prove that \(R'/G = R/T = k\), where we somewhat abuse / to
mean "elliptic curve discrete log", you just create an AND of two
\(\Sigma\)-protocols, using the same commitment (i.e., nonce), let's
call it \(k_2\), and output a Schnorr-style response \(s = k_2 +
ek\), where the hash \(e\) covers both points \(k_2 G, k_2 T\), as
has been explained in the just-mentioned PoDLE blog post and also in a
bit more generality in the <a href="https://web.archive.org/web/20200803123741/https://joinmarket.me/blog/blog/ring-signatures/">post on ring
signatures</a>.</p>
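<p>The PoDLE itself, the AND of two \(\Sigma\)-protocols with a shared nonce just described, can be sketched in a toy multiplicative group (all group and hash choices here are illustrative, not a real implementation):</p>

```python
import hashlib
import secrets

# Toy PoDLE: prove log_g(R') == log_T(R) == k without revealing k, as an
# AND of two Sigma-protocols sharing one commitment nonce k2.
p, q, g = 1019, 509, 4                 # p = 2q + 1; g generates order-q subgroup

def H(*parts):
    h = hashlib.sha256("|".join(str(v) for v in parts).encode()).digest()
    return int.from_bytes(h, "big") % q

k = secrets.randbelow(q - 1) + 1       # the secret being proved about
t = secrets.randbelow(q - 1) + 1
T = pow(g, t, p)                       # adaptor point: base of the second leg
R1 = pow(g, k, p)                      # R' = g^k
R = pow(T, k, p)                       # R  = T^k

# Prover: one commitment nonce k2 shared across both statements
k2 = secrets.randbelow(q - 1) + 1
A1, A2 = pow(g, k2, p), pow(T, k2, p)
e = H(A1, A2, R1, R, T)                # Fiat-Shamir challenge covers both legs
s = (k2 + e * k) % q

# Verifier: both checks must pass with the *same* (e, s)
assert pow(g, s, p) == A1 * pow(R1, e, p) % p
assert pow(T, s, p) == A2 * pow(R, e, p) % p
```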
<p>It's thus intuitive, though not entirely obvious, that an "adaptor
signature" in this context is really a combination of the same idea as
in Schnorr, but with a PoDLE additionally tacked on:</p>
<p>Input:</p>
<p>an adaptor point \(T\), a message \(m\), a signing key \(x\)</p>
<p>Output:</p>
<p>adaptor signature \((\sigma', R, R')\), adaptor signature PoDLE:
\((s, e)\)</p>
<p>Verification for non-owner of adaptor secret \(T\):</p>
<p>1. Verify the PoDLE - proves that \(R, R'\) have same (unknown)
discrete log w.r.t. \(T, G\) respectively.</p>
<p>2. Verify \(\sigma' R' \stackrel{?}{=} \mathbb{H}(m)G +
R_{\mathrm{x}} P\)</p>
<h2>Swapping ECDSA coins with this method</h2>
<p>Fundamentally, if not exclusively, adaptor signatures as originally
conceived, and still here, allow the swap of a coin for a secret (in
that broadcast of a spending transaction necessarily implies broadcast
of a signature which can be combined with a pre-existing adaptor
signature to reveal a secret), and the crudest example of how that can
be used is the coinswap or atomic swap, see
<a href="https://web.archive.org/web/20200803123741/https://joinmarket.me/blog/blog/coinswaps/">these</a>
<a href="https://web.archive.org/web/20200803123741/https://joinmarket.me/blog/blog/flipping-the-scriptless-script-on-schnorr/">previous</a>
blog posts for a lot of detail on pre-existing schemes to do this, both
with and without the Schnorr signature primitive that was previously
thought to be near-required to do adaptor signatures.</p>
<p>The ECDSA scheme above can be used in a slightly different way than I
had originally described for Schnorr adaptor signatures, but it appears
that was partly just oversight on my part: the technique described below
<em>can</em> be used with Schnorr too. So the advantage here is principally
that we can do it right now.</p>
<p>1. Alice spends into a 2/2 A1 B1 after first negotiating a timelocked
refund transaction with Bob, so she doesn't risk losing funds.</p>
<p>2. Bob does the same, spending into a 2/2 A2 B2 after negotiating a
timelocked refund transaction with Alice, so he also doesn't risk, but
his timelock is closer.</p>
<p>3. Alice creates an adaptor \(\sigma_{1}^{'}\) spending with key
A1 to Bob's destination and adaptor point \(T\) for which she knows
discrete log \(t\).</p>
<p>4. Bob verifies \(\sigma_{1}^{'}\) and the associated data
mentioned above, including crucially the PoDLE provided.</p>
<p>5. Bob creates an adaptor \(\sigma_{2}^{'}\) spending with key B2
to Alice's destination and adaptor point \(T\) for which he does
<strong>not</strong> know the \(t\).</p>
<p>6. Alice can now safely complete the adaptor she receives: \(\sigma_2
= t^{-1}\sigma_{2}^{'}\) and co-sign with A2 and broadcast,
receiving her funds.</p>
<p>7. Bob can see on the blockchain (or communicated directly for
convenience): \(t = \sigma_{2}^{'}\sigma_{2}^{-1}\) and use it
to complete: \(\sigma_{1} = t^{-1}\sigma_{1}^{'}\), and co-sign
with B1 and broadcast, receiving his funds.</p>
<h3>Comparisons to other coinswaps:</h3>
<p>This requires 2/2 P1 P2 type scriptPubKeys; these can be p2sh multisig
or p2wsh multisig using, as mentioned, <code>OP_CHECKMULTISIG</code>. Notice that
in a future Taproot/Schnorr world, this will still be possible, using
the linear style adaptor signatures previously described. However in
that case a musig-style combination of keys will almost certainly be
preferred, as it will create transaction styles that look
indistinguishable from single or any other script types. For now, the
system above does share one very valuable anonymity set: the set of
Lightning channel opens/closes, but doesn't share an anonymity set with
the full set of general single-owner ECDSA coins (which includes both
legacy and segwit).</p>
<p>For now, this method has the principal advantage that the only failure
mode is the timelocked backout, which can be a transaction that looks
entirely normal - having a non-zero <code>nLockTime</code> somewhere around the
current block is actually very normal. Meanwhile the atomic enforcement part
is, just like with Schnorr adaptors, entirely invisible. So apart from the
smaller anonymity set (2-2, so mostly LN), it has excellent privacy
properties.</p>
<h2>Reframing adaptors \(\rightarrow\) otVES</h2>
<p>The aforementioned
<a href="https://github.com/LLFourn/one-time-VES/blob/master/main.pdf">paper</a>
of 2019 by Lloyd Fournier is titled "<em>One Time Verifiably Encrypted
Signatures A.K.A. Adaptor Signatures</em>" - at first this new name
(henceforth otVES) seemed a bit strange, but after reading the paper I
came away pretty convinced. Not only is the conceptual framework very clean,
but it also links back to earlier work on the general concept of
Verifiably Encrypted Signatures. Most particularly the work of the same
guys that brought us BLS signatures from bilinear pairing crypto, in
<a href="http://crypto.stanford.edu/~dabo/papers/aggreg.pdf">this
paper</a>
(namely, Boneh, Lynn, Shacham but also Gentry of FHE fame). The context
considered there was wildly different, as Fournier helpfully explains:
this earlier work imagined that Alice and Bob wanted to fairly exchange
signatures that might be useful as authorization for some purpose. To
achieve that goal, they imagined a trusted third party acting between
them, and that an encrypted-to-third-party-adjudicator but still
<em>verifiable</em> signature could serve as the first step of a fair protocol,
assuming honesty of that third party. However what makes the Bitcoin
use-case special is that signatures <strong>are usable if and only if
broadcast</strong>. All of this coinswap/HTLC/second-layer stuff relies on
that property. In this scenario, having not only a VES but an otVES is
exactly desirable.</p>
<p>Why is one-time desirable here? It's a little subtle. For those
familiar with cryptography 101 it'll make sense to think about the <a href="https://en.wikipedia.org/wiki/One-time_pad">one
time
pad</a>.
The absolutely most basic concept of encryption (which also happens to
be perfectly secure, when considered in the most <a href="https://en.wikipedia.org/wiki/Spherical_cow">spherical
cow</a>
kind of way): take a plaintext \(p\) and a key \(k\), bitstrings of
the exact same length. Then make the ciphertext \(c\):</p>
<p>\(c = p \oplus k\)</p>
<p>and the thing about this that makes it perfect is exactly also something
that can be considered a "bug": the symmetry of the \(\oplus\)
(xor) operation is such that, given both the plaintext and the
ciphertext, the key can be derived: \(k = c \oplus p\). So any
broadcast of \(p\), after an earlier transfer of \(c\) (to Bob,
let's say), means that the secret key is revealed.</p>
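<p>A minimal sketch of this one-time-pad property: encrypting is XOR with the key, and anyone holding both plaintext and ciphertext recovers the key by the same XOR:</p>

```python
import secrets

def xor(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))

p = b"attack at dawn"            # plaintext
k = secrets.token_bytes(len(p))  # one-time key, same length as the plaintext
c = xor(p, k)                    # ciphertext: c = p XOR k

# Decryption works as expected ...
assert xor(c, k) == p
# ... but the "bug"/feature: plaintext plus ciphertext reveal the key.
assert xor(c, p) == k
```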
<p>The same is true in our adaptor signature or VES scenario: the adaptor
signature \(\sigma'\) is an "encrypted signature", and is
verifiable using the verification algorithm already discussed, by anyone
who has that encrypted signature and the adaptor "public key" which we
called \(T\). Notice how this is analogous to <em>public</em> key encryption,
in that you only need a public key to encrypt; but also notice that the
one-time pad is <em>secret key </em>encryption, which is why the plaintext and
ciphertext are enough to reveal the key (note: more developed secret key
algorithms than OTP handle this problem). This is some kind of hybrid of
those cases. Once the "plaintext" signature \(\sigma\) is revealed,
the holder of the "encrypted" signature \(\sigma'\) can derive the
private key: \(t\).</p>
<p>So hopefully this makes clear why "one-time-ness" is not so much in
itself desirable, as what is implied by it: that the "private key"
(the <em>encryption</em> key, not the <em>signing</em> key, note!) is revealed on one
usage.</p>
<h2>Security properties - deniability, forgeability, validity, recoverability ...</h2>
<p>At a high level, what security properties do we want from these
"encrypted signatures"? I think there's a strong argument to focus
on two properties:</p>
<ul>
<li>Handing over such encrypted signatures should not leak any
information to any adversary, including the recipient (it may or may
not be needed to keep the transfer private, that is not considered
in the model).</li>
<li>Given an encrypted signature for a message and key, I should be able
to convince myself that when the plaintext signature is revealed, I
will get the secret key \(t\); or, complementarily: when the secret
key \(t\) is revealed, I should be able to recover the plaintext
signature.</li>
</ul>
<p>We'll deal with both of these points in the following subsections.</p>
<h3>Deniability</h3>
<p>The Schnorr version of the otVES is deniable in the specific sense that
given an unencrypted signature, a corresponding encrypted signature for
any chosen key (\(t\)) can be claimed, as was explained
<a href="https://web.archive.org/web/20200803123741/https://joinmarket.me/blog/blog/flipping-the-scriptless-script-on-schnorr/">here</a>
("Deniability" subsection). For anyone familiar with the basic
construction of zero knowledge proofs, this will be immediately
recognized as being the definition of a "Simulator", and therefore
proves that such an adaptor signature/encrypted signature leaks zero
information to recipients.</p>
<p>It is interesting to observe that the same trick does <strong>not</strong> work with
the ECDSA variant explained above:</p>
<p>Given \(\sigma, R\) satisfying \(\sigma R = \mathbb{H}(m)G +
R_{\mathrm{x}}P\) for the verifying pubkey \(P\), you can try to
assert that \(k = tk_2\) but <strong>you have no way to generate a PoDLE for
\(R, R'\) if you don't know \(k\)</strong> - this means that such a
"retrofitted" encrypted signature (which by definition <em>includes</em> the
PoDLE) is not possible for a party not knowing the original secret
nonce, and thus the simulator argument (the argument that an external
observer <em>not knowing the secret</em> can create fake transcripts with a
distribution indistinguishable from the real transcripts) is not
available, hence we cannot claim that such encrypted signatures are
fully zero knowledge. More on this shortly.</p>
<h3>Forgeability</h3>
<p>I am abusing terms here, because unforgeability is the central property
of a valid signature scheme, but here let's talk about the forgeability
of an <em>encrypted</em> signature, so perhaps "adaptor forgeability". Here I
mean the ability to create arbitrary encrypted signatures <em>without</em> the
signing key. This was demonstrated as possible for Schnorr in the first
section of this blog post (noting the obvious caveat!). For ECDSA, we
hit the same snag as for 'Deniability'. Without possessing the signing
key \(x\), you want to make the verification \(\sigma' R' =
\mathbb{H}(m)G + R_{\mathrm{x}}P\) pass for some \(R, R', T, R =
tR'\) such that you can prove DLOG equivalence w.r.t. \(G, T\). You
can do this by "back-solving" the same way as for Schnorr:</p>
<p>\(\stackrel{\$}{\leftarrow} k^{*}, R=k^{*}G, \quad Q =
\mathbb{H}(m)G + R_{\mathrm{x}}P\)</p>
<p>\(\stackrel{\$}{\leftarrow} \sigma', \quad \Rightarrow \sigma'
R' = Q \Rightarrow R' = (\sigma')^{-1}Q\)</p>
<p>But since this process did <em>not</em> allow you to deduce the scalar \(q\)
s.t. \(Q = qG\), it did not allow you to deduce the corresponding
scalar for \(R'\). Thus you can output a set \(\sigma', R, R'\)
but you cannot also know, and thus prove equivalence of, the discrete
logs of \(R\) and \(R'\).</p>
<p>The previous two sections demonstrate clearly that the otVES
construction for ECDSA is fundamentally different from that for Schnorr
in that it requires providing, and proving a relationship between two
nonces, and this also impacts quite significantly the security arguments
that follow.</p>
<h3>Validity, Recoverability</h3>
<p>These are aspects of the same thing, so grouped together, and they talk
about the most central and unique property for an otVES scheme, but
fortunately it is almost tautological to see that they hold for these
schemes.</p>
<p>The concern it addresses: what if Alice gave Bob an encrypted signature
to a key \(T\) but it turned out that when decrypted with the
corresponding key \(t\), a valid signature wasn't actually revealed.
That this is impossible is called <strong>validity</strong>. The flip side is
<strong>recoverability</strong>: if Alice gave Bob an encrypted signature and then
published the corresponding decrypted signature ("plaintext"), the
secret key for the encryption (\(t\)) must be revealed.</p>
<p>The Schnorr case illustrates the point clearly, see Lemma 4.1 in
Fournier's paper; \(\sigma' = \sigma -t\) in our notation and we
can see by the definition of Schnorr signature verification that this
must hold, given there cannot be another \(t' \ne t\) s.t. \(t'G =
T\) (there is a one-one mapping between scalars mod n and group
points). Recoverability is also unconditionally true in the same way.</p>
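<p>The additive relation behind validity and recoverability can be checked at the scalar level. In this sketch \(\sigma\) is just a random scalar standing in for a valid Schnorr \(s\)-value; the point is only the \(\pm t\) algebra, which works modulo the group order:</p>

```python
import secrets

# secp256k1 group order (any prime modulus would illustrate the algebra).
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141

sigma = secrets.randbelow(N)  # stand-in for a valid Schnorr s-value
t = secrets.randbelow(N)      # the encryption key (T = tG)

sigma_prime = (sigma - t) % N  # the "encrypted signature" sigma' = sigma - t

# Validity: decrypting with t must yield the original signature.
assert (sigma_prime + t) % N == sigma
# Recoverability: seeing both sigma and sigma' reveals t - and only t,
# since scalars mod n map one-to-one onto multiples of G.
assert (sigma - sigma_prime) % N == t
```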
<p>For the ECDSA case, it is nearly the same, except: we rely on the PoDLE
between \(R, R'\), which has the same properties itself as a Schnorr
signature, and so the properties hold conditional on the inability to
break ECDLP (because that would allow Schnorr forgery, and thus PoDLE
forgery).</p>
<p>Note how an ECDLP break can obviously destroy the usefulness of all these
schemes, in particular the underlying signature schemes, but even that
does not alter the fact that the Schnorr encrypted signature is valid
and recoverable (though it becomes a mere technicality in that case).</p>
<h3>EUF-CMA for otVES using Schnorr</h3>
<p>EUF-CMA was discussed in the previous blogs on the Schnorr signature and
on ring signatures, in brief it is a technical term for "this signature
scheme is secure in that signatures cannot be forged by
non-secret-key-owners under this specific set of (fairly general)
assumptions".</p>
<p>Proving this for the Schnorr otVES turns out to be a fairly standard
handle-cranking exercise. This is essentially what I have focused on in
previous work as "proving soundness by running an extractor",
including patching up the random oracle. See the above linked post on
the Schnorr signature for more detail.</p>
<p>Note that unforgeability referred to here <strong>is not the same as "adaptor
forgeability" discussed above</strong>. Here we are specifically trying to
prove that access to such encrypted signatures does not help the
adversary in his pre-existing goal of forging <em>real </em>signatures.</p>
<p>So the handle-cranking simply involves adding an "encrypted signature
oracle" to the attacker's toolchest. EUF-CMA[VES] basically refers
to the inability to create signatures on new messages even when you have
access to arbitrary encrypted signatures, as well as arbitrary earlier
<em>complete</em> signatures, again, on different messages.</p>
<p>As Fournier points out here:</p>
<blockquote>
<p><em>EUF-CMA[VES] says nothing about the unforgeability of signature
encryptions. In fact, an adversary who can produce valid VES
ciphertexts without the secret signing key is perfectly compatible. Of
course, they will never be able to forge a VES ciphertext under a
particular encryption key. If they could do that, then they could
trivially forge an encrypted signature under a key for which they know
the decryption key and decrypt it.</em></p>
</blockquote>
<p>... which is the reason for my (I hope not too confusing) earlier
section on "adaptor forgeability". It <em>is</em> actually possible, for
Schnorr, but not ECDSA, to do what is mentioned in the second sentence
above.</p>
<h3>EUF-CMA[VES] for ECDSA</h3>
<p>Here is the most technical, but the most important and difficult point
about all this. In producing an encrypted ECDSA signature you output:</p>
<p>\((\sigma', R, R', m, P), \quad \textrm{DLEQ}(R, R')\)</p>
<p>(while \(m, P\) may be implicit of course), and this means you output
one piece of information in addition to the signature: that two nonce
points are related in a specific way. It turns out that this can be
expressed differently as the Diffie Hellman key of the key pair \((P,
T)\) (or, in Fournier's parlance, the signing key and the encryption
key). That DH key would be \(tP = xT = xtG\). Here's how; starting
from the verification equation for a published encrypted signature,
using the notation that we've used so far:</p>
<p>\(s'R' = \mathbb{H}(m)G + R_{\mathrm{x}}P\)</p>
<p>isolate the public key P (this is basically "pubkey recovery"):</p>
<p>\(P = R_{\mathrm{x}}^{-1}\left(s'R' - \mathbb{H}(m)G\right)\)</p>
<p>\(\Rightarrow tP = R_{\mathrm{x}}^{-1}\left(s'tR' -
\mathbb{H}(m)tG\right)\)</p>
<p>\(\Rightarrow xT = tP = R_{\mathrm{x}}^{-1}\left(s'R -
\mathbb{H}(m)T\right)\)</p>
<p>Notice how we - a verifier, possessing neither the nonce \(k\) nor
the secret \(t\) - were able to deduce <em>this</em> DH key because we knew
the DH key of the key pair \((R', T)\) - it's \(R\), which we were
explicitly given. So this, in some sense "breaks" the <a href="https://en.wikipedia.org/wiki/Computational_Diffie%E2%80%93Hellman_assumption">CDH
assumption</a>:
that given only points on the curve \(A=aG, B=bG\) you should not be
able to calculate the third point \(abG\) (but "breaks" - because
actually we were given a related DH key to start with).</p>
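<p>The derivation above can be checked numerically by working purely "in the exponent": each point is replaced by its discrete log, and the x-coordinate \(R_{\mathrm{x}}\) by an opaque random scalar \(r\). This verifies the algebraic identities only - a real verifier manipulates curve points, not these scalars:</p>

```python
import secrets

N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141  # secp256k1 order

def inv(a: int) -> int:
    return pow(a, -1, N)

x  = secrets.randbelow(N - 1) + 1   # signing key (P = xG)
t  = secrets.randbelow(N - 1) + 1   # encryption key (T = tG)
k2 = secrets.randbelow(N - 1) + 1   # nonce for R' = k2*G, so R = t*k2*G
h  = secrets.randbelow(N)           # H(m)
r  = secrets.randbelow(N - 1) + 1   # stand-in for the x-coordinate R_x

# Encrypted signature satisfying s'R' = H(m)G + R_x P, in exponent form:
s_prime = (h + r * x) % N * inv(k2) % N
assert s_prime * k2 % N == (h + r * x) % N

# The leak: the DH key tP = xT = t*x*G is computable from the published
# data, via xT = R_x^{-1}(s'R - H(m)T); here R has exponent t*k2.
dh = inv(r) * ((s_prime * (t * k2 % N) - h * t) % N) % N
assert dh == t * x % N
```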
<p>Fournier addresses this point in two ways. First, he argues that
the requirement of the CDH problem being hard is not part of the protocols
for which this scheme is useful and that keys are by design one-time-use
in these applications. The more important point though, is that an
attempt is made to show the scheme secure <strong>if the CDH problem is
easy</strong>. A classic example of backwards cryptography logic ;)</p>
<p>The framework for this is non-trivial, and it is exactly the framework
developed by Fersch et al that was discussed in the section on ECDSA in
<a href="https://web.archive.org/web/20200803123741/https://joinmarket.me/blog/blog/liars-cheats-scammers-and-the-schnorr-signature/">this</a>
earlier blog post (subsection "What about ECDSA?"). I have not studied
this framework in any detail, only cursorily, and would encourage anyone
interested to at least watch the linked video of Fersch's talk on it,
which was quite interesting. With the addition of the assumption "CDH
is easy", Fournier claims that ECDSA can be said to have this
EUF-CMA[VES] security guarantee, which is intended to prove,
basically, that <strong>the leak of the DH key is the only leak of information
and that the scheme is secure against forgery</strong>. I can't claim to be
able to validate this; I can only say the argument appears plausible.</p>Avoiding Wagnerian Tragedies2019-12-15T00:00:00+01:002019-12-15T00:00:00+01:00Adam Gibsontag:joinmarket.me,2019-12-15:/blog/blog/avoiding-wagnerian-tragedies/<p>Wagner's attack</p><h3>Avoiding Wagnerian tragedies</h3>
<p><em>This blog post is all about
<a href="https://people.eecs.berkeley.edu/~daw/papers/genbday.html">this</a>
paper by David Wagner from 2002.</em></p>
<p><em>It is a personal investigation; long, mainly because I wanted to answer
a lot of questions for myself about it. If you are similarly motivated
to understand the algorithm, this may provide useful guideposts. But
there are no guarantees of accuracy.</em></p>
<p>_________________________________________________________________________</p>
<p>In the Berlin Lightning Conference, Jonas Nick gave a short talk (slides
<a href="https://nickler.ninja/slides/2019-tlc.pdf">here</a>)
that included a topic that had been on my "TODO list" for some
considerable time - the so-called Wagner attack. The talk was concise
and well thought out, and for me it made a lot of sense, but I suspect a
lot of the audience lost the key point, as indeed was evidenced by the
only audience question at the end, which was something along the lines
of "but doesn't the birthday attack mean you can only find a hash
collision in \(\sqrt{N}\) time, where \(N\) is the size of the hash
output?" - the questioner had, quite understandably, misunderstood
exactly what the attack does, and remembered what he (and most people
who take an interest in these things) saw as the key security property
that protects how SHA2 and similar are used in cryptocurrency.</p>
<p>So .. should you care? If so, why? I think the main value of this
<em>practically</em>, if, as likely, you're reading this from the perspective
of Bitcoin, is that it matters to various non-vanilla signing protocols:
it can matter to blind signatures, and multisignatures, and very likely
a whole slew of different complex contracts that might be based on such
things. And unfortunately, it is <strong>not intuitive</strong>, so it would be very
easy to miss it and leave a security hole.</p>
<p>My goal in this blog post will be to try to provide some intuition as to
what the hell Wagner's attack is, and why it could be dangerous.</p>
<h2>The Birthday Attack .. or Paradox ... (or just Party?)</h2>
<p>Just as the famous <a href="https://en.wikipedia.org/wiki/Twin_paradox">Twin
Paradox</a>
is not actually a paradox, nor is the perhaps even more famous <a href="https://en.wikipedia.org/wiki/Birthday_problem">Birthday
Paradox</a>.
The result shown in both of these thought experiments (and actual
experiments - the former <em>has</em> actually been done with atomic clocks and
small fractions of \(c\)) is just surprising, that's all. It violates
some simple intuitions we have. Here it is stated simply in words:</p>
<p>Given a set of 23 people (such as children in a classroom), it is a
<strong>better than 50-50 chance</strong> that at least some pair of them will share
the exact same birthday.</p>
<p>The simple argument is: the probability of at least one such pair
existing is \(1 - \) the probability \(p\) of there being <em>no</em> such
pair, which is the case exactly, and only, when <em>every child has a
different birthday.</em> Now we can easily see that \(p = 0\) when there
are 366 children (ignore leap years), and \(p=\frac{364}{365}\) when
there are only 2 children. The case for \(N\) children would be \(p =
\frac{364 \times 363 \times \ldots \times (366-N)}{ 365 \times 365 \times
\ldots \times 365}\) (with \(N-1\) factors above and below), where we're using the fact that probabilities
multiply when we want the AND of different events. This \(1 - p\),
where \(p\) is a function of \(N\), just happens to be \(\simeq
0.5\) when \(N=23\), hence the result.</p>
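<p>The calculation is a one-liner to check - a direct product of the per-child "no match so far" probabilities:</p>

```python
def p_no_match(n: int, days: int = 365) -> float:
    """Probability that n people all have distinct birthdays."""
    p = 1.0
    for i in range(n):
        p *= (days - i) / days  # i-th person must avoid i taken days
    return p

# The 50% threshold is crossed exactly at n = 23:
assert 1 - p_no_match(22) < 0.5 < 1 - p_no_match(23)
print(round(1 - p_no_match(23), 4))  # 0.5073
```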
<h3>Why intuitions about birthdays are (slightly) wrong.</h3>
<p>The 23 datapoint does surprise people, usually, but it doesn't shock
them. It just seems low. Why does it seem low? Is it because when we
hear the problem statement, we naturally think in more specific terms:
usually, when I am trying to make a match of two things, I am trying to
make a match from <em>one specific thing</em> against some other set of
comparable things. In case of birthdays, we might look for someone with
the same birthday as <em>us</em>, which is a very different problem to finding
<em>any pairwise match</em>, as here.</p>
<p>Also, it's probabilistic, and people don't have good intuitions about
probability generally.</p>
<p>But let's delve a little deeper than that. We're going to need to, to
understand the meat of this blogpost, i.e. Wagner's algorithm.</p>
<p>To <em>very roughly</em> quantify why there's a bit more of a chance of success
in getting a match, than you'd expect, imagine a square grid. Every new
child we add to the list adds another row <em>and</em> column; because this
is a <em>square</em>, this is a <strong>quadratic </strong>function, or effect, or scaling.</p>
<p><img src="https://web.archive.org/web/20200428222140im_/https://joinmarket.me/static/media/uploads/.thumbnails/simplesquare5.png/simplesquare5-300x367.png" width="300" height="367" alt="Simple illustration of search space for birthday matches" /></p>
<p><em>(Pictured above: simple example assuming only 3 children. The blue
stars represent possible matches; there are 3 choose 2 for 3 children,
i.e. 3. The lines illustrate that this is the same as 3x3/2 - 3/2. The
bottom left squares are redundant, and those on the diagonal don't
apply.)</em></p>
<p>If the set of children is \(\{L\}\), and we denote the size of the
set (number of elements) as \(|L|\), then we can see that the size
of the <a href="https://en.wikipedia.org/wiki/Cartesian_product"><strong>Cartesian
product</strong></a>
of the set with itself, is \(|L|^{2}\). In the problem
statement - getting a single match - we only need one of the elements of
this set to be a match. But let's qualify/correct a <em>little</em> bit so our
toy example is a little bit better defined. If Alice matches Carol on
the top row, she'll also match in the first column (A = C means also C =
A). Further the squares on the main diagonal don't count, A=A is not a
solution to the problem. So for a set \(\{L\}\), if we want the
number of chances of a 'hit' to be about the same as the number of
possible values (the 'sample space' - which for birthdays has size 365),
then we have this very rough approximation:</p>
<p>\(\frac{|L|^{2}}{2} - \frac{|L|}{2} \simeq 365\)</p>
<p>Notice this is a very artificial equation: there's no guarantee that
anything magical happens exactly when the size of the sample space of
each event (the 365) is equal to the number of 'events' (pairs of
children, in this case, that might have the same birthday). But it does
give us the right order of magnitude of <span
style="text-decoration: underline;">roughly how many children would be
needed for the probability to get at least one match in the set to be
'appreciable'</span> . Clearly if \(|L|\) was <em>much</em> bigger than the
positive solution to the above quadratic equation, the probability is
going to become overwhelming; eventually once it reaches 365 we must
have a solution, by the pigeonhole principle, and the probability will
be very close to 1 way before that. And indeed the positive solution is
\(\simeq 28\), which is around the same as the exact answer 23, if
our exact question is how large the set should be to get a 50%
probability.</p>
<p>So while as belaboured above, the calculation above is rough and
artificial, it conveys the key scaling information - <strong>the chance of
success scales with the square of the size of the set, because we are
comparing the set with itself</strong>.</p>
<h3>The birthday attack on hash functions</h3>
<p>This line of thinking is commonly applied to the problem of finding
<a href="https://en.wikipedia.org/wiki/Collision_(computer_science)">collisions in hash
functions</a>.</p>
<p>Suppose you had a hash function whose digests were of length 20 bytes
(SHA1 was of this type). This is 160 bits of 'entropy' - if you assume
it's doing a good job of producing unpredictably random output. However,
as a reminder, there is more than one attack against a hash function
that cryptographers worry about - finding a preimage for an output,
finding <em>another</em> preimage, and the third one relevant to our discussion
- just finding <strong>any</strong> collision, i.e. finding any two preimages giving
the same output hash. For this, the above "birthday paradox" scenario
applies exactly: we have a sample space of \(2^{160}\) possibilities,
and we're going to select a set \(\{L\}\) from it, with the
intention of finding at least one pair in the set with the same output
value. The mathematics is identical and we need something like
\(|L|^{2}\ \simeq 2^{160}\), or in other words, the size of the
set we'd have to generate to get a good chance of a collision in the
hash function, is \(\sqrt{2^{160}}=2^{80}\). Hence a common, if
approximate, statement, is that <span
style="text-decoration: underline;">hash functions have security against
collision of only half the bits of the output</span>. So here, SHA1
could crudely be considered as having 80 bits of security against
collisions ... unfortunately, this statement ignores the fact that
collisions in SHA1 have already been
<a href="https://security.googleblog.com/2017/02/announcing-first-sha1-collision.html">found</a>.
This blog post is, however about non-broken cryptographic constructs;
collisions are supposed to not be possible to find other than by brute
force search, so that's a side story here.</p>
<h2>Wagner's algorithm</h2>
<p><a href="https://people.eecs.berkeley.edu/~daw/papers/genbday.html">Wagner's
paper</a>,
"A Generalized Birthday Problem", considers the question of what happens
if we don't just want a single match between items in a list like this,
but if we want, instead, <em>a relation between a set of items</em>. The
relation considered in particular is applying bitwise XOR, hereafter
\(\oplus\), e.g. :</p>
<p>\(a \oplus b \oplus c \oplus d = 0 \quad (1)\)</p>
<p>(this equation is only "e.g.", because we are not restricted to 4; any
number more than 2 is considered by the paper, and for a number \(k\)
of items, this is referred to as the <span
style="text-decoration: underline;">\(k\)-sum problem</span>, but for
now we'll keep it simple and stick to 4).</p>
<p>First, let's not forget the obvious: this is a trivial problem if we
just <strong>choose</strong> \(a, b, c, d\); just choose the first 3 at random and
make the last one fit. What Wagner's algorithm is addressing is the
scenario where the numbers are all drawn from a uniformly random
distribution (this observation also applies to the children's birthdays;
we are not <em>choosing</em> them but getting random ones), but we can generate
as many such randoms as is appropriate.</p>
<p>Next observation: this "generalised" problem is intuitively likely to be
easier than the original problem of finding only <em>one</em> pairwise match -
you can think of the original birthday problem of a match being the same
as: \(a \oplus b = 0\) (this means a perfect match between \(a\)
and \(b\)). There, we could think of ourselves as being constrained in
"roughly" one variable (imagine that \(a\) is fixed and you are
hunting for \(b\), with the caveat of course that it's crucial to the
argument of square-root scaling that that is <em>not</em> the correct problem
statement!). If we extend to 4 items holding a relation, as above in
\((1)\), then we have "roughly" three degrees of freedom to work with.
It'll always tend to be easier to find solutions to puzzles when you
have more pieces available to play with.</p>
<p>However, the meat of the paper is to explain just how much this problem
is easier than the original (pairwise) birthday problem to solve, and to
give an explicit algorithm for how to do so. Just like with the number
23, it is a bit surprising how effective this algorithm is.</p>
<h3>The algorithm</h3>
<p>To set the stage: suppose we are considering hash functions (so we'll
forget about birthdays now), and the values \(a, b, c, d\) in
\((1)\) are outputs of a hash function. Let's go with SHA256 for now,
so they will all be bit strings of length 256.</p>
<p>We can generate an arbitrary number of them by just generating random
inputs (one particularly convenient way: start with random \(x\),
calculate \(y = \mathbb{H}(x)\), then calculate \(\mathbb{H}(y)
\ldots \); this 'deterministic random' approach, which should still
give a completely random sequence if the hash function is well behaved,
can be very useful in many search algorithms, see e.g. Chapter 3 and 14
of
<a href="https://www.math.auckland.ac.nz/~sgal018/crypto-book/crypto-book.html">Galbraith</a>).
As in earlier sections, we can call this list of such values
\(\{L\}\).</p>
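<p>The 'deterministic random' generation just described is a couple of lines with a real hash function - here SHA256 via the standard library:</p>

```python
import hashlib
import secrets

def hash_chain(seed: bytes, count: int) -> list:
    """Generate `count` pseudorandom 256-bit values by repeated
    hashing: y1 = H(seed), y2 = H(y1), ..."""
    out, y = [], seed
    for _ in range(count):
        y = hashlib.sha256(y).digest()
        out.append(y)
    return out

L = hash_chain(secrets.token_bytes(32), 1000)
# If SHA256 is well behaved, no repeats are expected at this scale.
assert len(set(L)) == 1000
```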
<p>Wagner's suggested approach is to break the problem up, in two ways:
first, take the list of items in \(L\) and split it into 4 (or
\(k\)) sublists \(L_1 , L_2, L_3, L_4\). Second, we will take 2
lists in pairs and then apply the birthday problem to each of them, but
with a twist: we'll only insist on a <strong>partial match</strong>, not a full
match.</p>
<p><em>(Historical note: this idea of using a subset of values satisfying a
simple verifying criterion is also seen in discrete log finding algorithms
as well as hash collision finding algorithms, and is often known as
"distinguished points"; the idea seems to go back as far as the early
80s and is due to Rivest according to 14.2.4 of
<a href="https://www.math.auckland.ac.nz/~sgal018/crypto-book/crypto-book.html">Galbraith</a>.
(Of note is that it's intriguingly analogous to Back's or Dwork's proof
of work computation idea).)</em></p>
<p>The following diagram to illustrate the idea is taken directly from the
Wagner paper:</p>
<p><img src="https://web.archive.org/web/20200428222140im_/https://joinmarket.me/static/media/uploads/.thumbnails/wagnerpic1.png/wagnerpic1-527x472.png" width="527" height="472" alt="Wagner algorithm schematic from paper" /></p>
<p>The \(\bowtie\) symbol may not be familiar: it is here intended to
represent a
<a href="https://www1.udel.edu/evelyn/SQL-Class2/SQLclass2_Join.html">join</a>
operation; the non-subscripted variant at the top is what may be called
an 'inner join' (just, find matches between the two sets), whereas
\(\bowtie_{l}\) represents the novel part: here, we search not for
full matches, but only matches in the lowest \(l\) bits of the hash
values, and we store as output the \(\oplus\) of the pair (more on
this in a bit). A concrete example:</p>
<p>\(L_1 = \{\textrm{0xaabbcc}, \textrm{0x112804}, \textrm{0x1a1dee}
\ldots \}, \quad L_2 = \{\textrm{0x8799cc}, \textrm{0x54ea3a},
\textrm{0x76332f} \ldots \}\)</p>
<p>Here we're showing toy hash outputs of 3 bytes (instead of 32), written
in hexadecimal for ease of reading. We're going to use list lengths of
\(2^{l}\) (which will be justified later; we could have picked any
length). If \(l\) were 8 (and the lists length 256 therefore), then
we're searching for matches on the lowest 8 bits of the values, and we
have:</p>
<p>\(L_1 \bowtie_{l} L_2 = \{(\textrm{0xaabbcc} \oplus
\textrm{0x8799cc} = \textrm{0x2d2200}) \ldots \}\)</p>
<p>... plus any other matches if the lists are longer. So the output of
the low-\(l\)-bit join on these two lists of items contains at
least this single item, which is the \(\oplus\) of the "partial
match", and perforce it will always have its lowest \(l\) bits as zero
(because of the properties of \(\oplus\)).</p>
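<p>The \(\bowtie_{l}\) operation itself is short to implement - bucket one list by its low \(l\) bits, then look up the other. The toy values below are the ones from the text:</p>

```python
from collections import defaultdict

def join_low_bits(L1, L2, l):
    """Wagner's low-l-bit join: for each pair agreeing in the lowest
    l bits, output the XOR of the pair (its low l bits are zero)."""
    mask = (1 << l) - 1
    buckets = defaultdict(list)
    for b in L2:
        buckets[b & mask].append(b)
    return [a ^ b for a in L1 for b in buckets[a & mask]]

# The example from the text: 3-byte values, matching on the low 8 bits.
L1 = [0xAABBCC, 0x112804, 0x1A1DEE]
L2 = [0x8799CC, 0x54EA3A, 0x76332F]
assert join_low_bits(L1, L2, 8) == [0x2D2200]
```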
<p>Having done this first step for \(L_1 , L_2\) we then do exactly the
same for \(L_3 , L_4\) (remember - we took an original large random
list and split it into 4 (equal-sized) sub-lists).</p>
<p>That leaves us with two lists that'll look something like this:</p>
<p>\(L_1 \bowtie_{l} L_2 = \{\textrm{0x2d2200}, \textrm{0xab3100},
\textrm{0x50a200}, \ldots\}\)</p>
<p>... and the same for \(L_3 , L_4\). Wagner's idea is now to <strong>solve
the original birthday problem directly on this pair of lists</strong> - this is
the simple \(\bowtie\) operator - and he knows it will be easier
precisely because he has reduced the number of bits to be attacked (in
this case, by 8, from 24 to 16). To repeat, this <em>isn't</em> a way to solve
the original birthday problem (which we restated as \(a \oplus b =
0\)), but it <em>is</em> a way to solve the generalised problem of \(a \oplus
b \oplus c \oplus d = 0\).</p>
<p>To give concrete completeness to the above fictitious examples, we can
imagine:</p>
<p>\(L_3 \bowtie_{l} L_4 = \{\textrm{0x2da900}, \textrm{0x896f00},
\textrm{0x50a200}, \ldots\}\)</p>
<p>So we've found this one positive result of the join operation (ignoring
others from a longer list): \(\textrm{0x50a200}\). What can we deduce
from that?</p>
<h3>From partial solutions to an overall solution</h3>
<p>The reason the above steps make any sense in unison is because of these
key properties of the \(\oplus\) operation:</p>
<ul>
<li>Associativity: \(a \oplus (b \oplus c) = (a \oplus b) \oplus
c\)</li>
<li>\(a = b \Rightarrow a \oplus b = 0 \)</li>
<li>The above two imply: \( a \oplus b = c \oplus d \Rightarrow a
\oplus b \oplus c \oplus d = 0\)</li>
</ul>
<p>I hope it's clear that the third of the above is the reason why finding:</p>
<p>\((L_1 \bowtie_{l} L_2) \bowtie (L_3 \bowtie_{l} L_4)\)</p>
<p>... means exactly finding sets of 4 values matching \(a \oplus b
\oplus c \oplus d = 0\).</p>
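<p>Putting the pieces together, a sketch of the whole 4-list algorithm on toy 24-bit "hashes" (list sizes chosen so roughly \(2^{l+l-l}\) partial matches survive each join; each entry carries along the original values it was built from, so a full solution can be read off at the end):</p>

```python
import secrets
from collections import defaultdict

NBITS, LBITS, SIZE = 24, 8, 512  # toy 24-bit hashes, join on the low 8 bits

def join(L1, L2, mask):
    """XOR together pairs agreeing on the masked bits, carrying along
    the tuple of original values each entry was built from."""
    buckets = defaultdict(list)
    for v, src in L2:
        buckets[v & mask].append((v, src))
    return [(x ^ v, sx + sv)
            for x, sx in L1 for v, sv in buckets[x & mask]]

def wagner_4sum():
    low, full = (1 << LBITS) - 1, (1 << NBITS) - 1
    while True:  # ~16 solutions expected per attempt at these sizes
        Ls = [[(v, (v,)) for v in (secrets.randbits(NBITS)
                                   for _ in range(SIZE))]
              for _ in range(4)]
        L12 = join(Ls[0], Ls[1], low)  # low LBITS zero in every entry
        L34 = join(Ls[2], Ls[3], low)
        # Full join: a match on all bits means the XOR of all four is 0.
        for _, (a, b, c, d) in join(L12, L34, full):
            if len({a, b, c, d}) == 4:  # skip degenerate repeats
                return a, b, c, d

a, b, c, d = wagner_4sum()
assert a ^ b ^ c ^ d == 0
```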
<h2>Efficiency of the algorithm</h2>
<p>Here's why the above idea even matters: it means that finding such
multi-value matches can be <strong>much</strong> faster than finding pairwise
matches. Wagner goes through the reasoning as follows to give an
approximate feel for how much faster:</p>
<p>First, we can observe that it's likely that the efficiency of following
the above algorithm will depend on the value \(l\). Second, because
it's hard to grasp in the abstract, let's stick to our concrete toy example
where the hash function has only three bytes of output (so 24 bits),
and \(l=8\).</p>
<p>The chance of a match on <em>any one pair</em> of elements from \(L_1 ,
L_2\) respectively is about \(2^{-l}\) (they have to match in
\(l\) bits and each bit is a coin flip); the number of possible
matches is about \(|L_1| \times |L_2|\). But given that we
arbitrarily chose the length of the lists as \(2^{8}\) - then we
expect the number of matches in \(L_1 \bowtie_{l} L_2\) to be
around \((2^{8} \times 2 ^{8} \times 2^{-8}) = 2 ^{8}\). At first it
may sound strange to say we expect so many matches but consider a
smaller example and it's obvious: if there are 10 possible values, and
we have <span style="text-decoration: underline;">two</span> lists of 10
items, then there are 100 possible matches and a probability 1/10 for
each one (roughly), so we again expect 10 matches.</p>
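<p>This expectation is easy to check empirically; a quick sketch with the same list size and bit-widths as the toy example:</p>

```python
import random

random.seed(1)
l, size = 8, 1 << 8
A = [random.getrandbits(24) for _ in range(size)]
B = [random.getrandbits(24) for _ in range(size)]
# count pairs agreeing in their low l bits;
# expectation is |A| * |B| * 2^-l = 2^8 = 256
matches = sum(1 for a in A for b in B if (a ^ b) & ((1 << l) - 1) == 0)
assert 128 <= matches <= 512   # a loose band around the expected 256
```
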
<p>To complete the analysis we only have to judge how many matches there
are likely to be between the output of \((L_1 \bowtie_{l} L_2)\)
and that of \((L_3 \bowtie_{l} L_4)\). As shown in our toy
example, all of those values have their lowest \(l\) bits zero; a full
solution of \(a \oplus b \oplus c \oplus d = 0\) will therefore be
obtained if the remaining bits of the \(\oplus\) of pairs of items
from the two lists are also zero (keep this deduction I just slid in
there, in mind! It will be crucial!); the probability of that for one
pair is clearly \(2^{-(n-l)}\) which in our toy case is
\(2^{-(24-8)}\), and since each of the lists is length
\(2^{l}=2^{8}\), we have finally that the expected number of solutions
from the whole process is around \(|L_{12}| \times |L_{34}|
\times 2^{-(24-8)} = 2^{8 + 8 - (24-8)} = 1\). This was not an
accident; we deliberately chose the lengths of the lists to make it so.
If we call this length \(2^{k}\), and generalise back to \(l\) bits
for the first partial match step, and \(n\) bits for the hash function
output, then we have an expected number of solutions of \(2^{2k}
\times 2^{-(n-l)}\). Clearly we have room for maneuver in what values
we choose here, but if we choose both \(l\) and \(k = f(l)\) so as
to make the expected number of matches around 1, then we can choose
\(k=l\) and \(l = \frac{n}{3}\), as the reader can easily verify.</p>
<p>Note that the choice \(l=n/3\) and \(k=l\) (or, in words: have the
4 sublists of length \(2^{l}\), and have \(l\) be one third of the
size of the hash output) is not in fact arbitrary: it is what optimises
our space and time usage. We discuss how this
generalises to more than 4 items in the next section, but for 4, this
means that we need space to store lists of size \(\simeq
2^{\frac{n}{3}}\).</p>
<p>Compare this with the already-explained well-known scaling of the
original birthday problem: the time-space usage is of the order of
\(2^{\frac{n}{2}}\) for the same definition of \(n\). This
difference is big: consider, if a hash function had a 150 bit output
(let's forget that that's not a whole number of bytes!), then the
birthday problem is 'defended' by about 75 bits, whereas the 4-list
"generalised birthday problem" here is defended by only 50 bits (which
isn't a reasonable level of defence, at all, with modern hardware).</p>
<h3>Bigger \(k\)-sum problems and bigger trees.</h3>
<p>Clearly while the 4-sum problem illustrated above is already quite
powerful, it will be even more powerful if we can realise instances of
the problem statement with more lists. If we stick with powers of 2 for
simplicity, then, in the case of \(k=256\), we will be able to
construct a larger, complete binary tree with depth 8, combining pairs
of lists just as above and passing to the next level up the tree. At
each step, the number of bits matched increases until we search for full
matches (birthday) right at the top or root of the tree.</p>
<p><strong>This results in overall a time/space usage for these algorithms of
roughly \(O(2^{\frac{n}{\log_{2}k+1}})\). So while for our earlier
\(k=4\) we had \(O(2^{\frac{n}{3}})\), for \(k=256\) we have
\(O(2^{\frac{n}{9}})\), i.e. the attack could be very powerful
indeed!</strong></p>
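<p>As a sanity check on these exponents (the plain birthday bound, for comparison, is \(n/2\)), a tiny helper; the function name is mine, not Wagner's:</p>

```python
import math

def wagner_exponent(n_bits, k):
    # approximate work/space exponent: 2^(n / (log2(k) + 1)) for k lists
    return n_bits / (math.log2(k) + 1)

assert wagner_exponent(24, 4) == 8.0        # the toy example: lists of ~2^8
assert wagner_exponent(150, 4) == 50.0      # the 150-bit example above
assert round(wagner_exponent(256, 256)) == 28   # k=256: only ~n/9 bits of defence
```
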
<p>If you're still a bit bewildered as to how it might be possible to so
drastically reduce the difficulty of finding matches just by
constructing a tree, note that it's part of a broader theme in much of
mathematics: recall what is sometimes called the triangle inequality:</p>
<p>\(|a| + |b| \ge |a+b|\)</p>
<p>and in cases where a homomorphism applies, i.e. \(f(a+b) = f(a) +
f(b)\), it can sometimes be the case that the ability to shift from one
to the other - from "process each object individually" to "process the
combined object" allows one to collapse down the computational
difficulty of a problem. And that's what's happening here - the fact
that one can process <em>parts</em> of these objects individually - i.e., find
matches on <em>subsets</em> of the bits of the random numbers, and then combine
those linearly, gives a better outcome (performance wise) than if one
were to try to find total matches all at once.</p>
<p>This is just a very vague musing though; feel free to ignore it :)</p>
<h2>Generalising the algorithm</h2>
<p>First let's briefly mention the important but fairly simple point: you
can generalise from \(a \oplus b \oplus c \oplus d = 0\) to \(a
\oplus b \oplus c \oplus d = t\) for some non-zero target \(t\); just
replace one of the lists, e.g. \(L_4\), with a corresponding list
where all terms are xor-ed with the value \(t\), so that the final
result of xor-ing the 4 terms found by the above algorithm will now be
\(t\) instead of zero.</p>
<p>Also let's note that we ended up finding solutions only from a small
set: those for which there was a match in the final \(l\) bits of
pairs of elements. This restriction can be changed from a match to an
offset in the bit values, but it's only of minor interest here.</p>
<p>A far more important question though, which we will expand upon in the
next section: can we generalise from groups with the
\(\oplus\)-operation to groups with addition? Solving, say:</p>
<p>\(a+b+c+d=0\ \textrm{mod}\ 2^{n}\)</p>
<p>(it's a little easier mod \(2^{n}\) than for arbitrary sized additive
groups, but that's a detail, explained in the paper).</p>
<p>The answer is yes, but it's worth taking a moment to consider why:</p>
<p>We need to slightly alter the algorithm to make it fit the properties of
addition: to replicate the property \(a \oplus b = 0\) we replace
\(b\) with \(-b\), and we do this in both the two "layers" of the
algorithm for the 4 list case (see paper for details). Now what's
crucial is that, in doing this, we preserve the property that <strong>a match
in the lowest \(l\) bits in the first step is retained after
combination in the second step</strong>. As Wagner puts it: "The reason
this works is that \(a \equiv b \pmod{2^{l}}\) implies
\((a+c \bmod 2^{n}) \equiv (b+c \bmod 2^{n}) \pmod{2^{l}}\):
the carry bit propagates in only one direction." In other words, the
match is not 'polluted' by the way in which addition differs from xor,
namely the carry of bits. The reader can, and probably should, verify
this for themselves with toy examples of numbers written as bitstrings,
using e.g. \(l=2, n=4\) or similar.</p>
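<p>For parameters this small the carry observation can even be checked exhaustively rather than with a few hand examples; a sketch:</p>

```python
# exhaustive check of Wagner's carry observation for l=2, n=4:
# a ≡ b (mod 2^l)  implies  (a + c) mod 2^n ≡ (b + c) mod 2^n  (mod 2^l)
l, n = 2, 4
for a in range(1 << n):
    for b in range(1 << n):
        if a % (1 << l) != b % (1 << l):
            continue  # only consider pairs already matching in the low l bits
        for c in range(1 << n):
            assert ((a + c) % (1 << n)) % (1 << l) == ((b + c) % (1 << n)) % (1 << l)
```
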
<p>Because of the carry of bits (or digits) when we add, this isn't
perfectly obvious, but in the \(\oplus\) case it really is: what
makes the algorithm work is the preservation of a distinguishing
property after multiple applications of the operation, reducing a large
set into a smaller one.</p>
<h3>Does it work for all groups?</h3>
<p>Since the above algorithm seems to be kind of generic, it's natural to
start wondering (and worrying!) that it may apply also to other
apparently hard collision problems. In particular, couldn't you do
something similar with elliptic curve points?</p>
<p>The main point of this blog post, apart from just trying to explain the
Wagner algorithm, was to answer this question in the negative. As we'll
see shortly, there is a concise academic argument that the answer
<em>should</em> be no, but I want to give some insight as to <em>why</em> it's no,
that is, why you cannot use this approach to find sets of scalars which,
when passed through the randomising function of elliptic curve scalar
multiplication to produce points on the curve, result in a sum to a
provided point, and thus solve the ECDLP.</p>
<h3>Wei Dai's argument</h3>
<p>Before we begin, an amusing piece of trivia: the long version of
Wagner's paper cites both Wei Dai and Adam Back, in a curious similarity
to ... another well known paper that came out 6 years later :)</p>
<p>What is cited as coming from private correspondence with Wei Dai is the
following logic, which superficially appears fairly trivial. But it's
nonetheless crucial. It's a <strong>reduction argument</strong> of the type we
discussed in some considerable detail in the last two blog posts (on
signatures):</p>
<blockquote>
<p>If the \(k\)-sum problem can be solved on any cyclic group \(G\)
in time \(t\), then the discrete logarithm problem on that group can
also be solved in time \(O(t)\).</p>
</blockquote>
<p>The words are carefully chosen here. Note that both \((\mathbb{Z}_n ,
+)\) and, for prime \(p\), the multiplicative group \((\mathbb{Z}_p^{*} ,
\times )\) are cyclic groups (of order \(n\) and \(p-1\)
respectively). In the former, we have already explained that the \(k\)-sum
problem can be solved efficiently; so this is really only an important
statement about the multiplicative group, not the additive group.</p>
<p>And that makes sense, because the "discrete logarithm problem" (defined
in the broadest possible way) is only hard in the multiplicative group
(and even then, only if the group order has a large prime factor, or
ideally just is a prime) and not in the additive group. To illustrate:
take the group \(G = (\mathbb{Z}_{11} , +)\), and define a
'generator' element 3 (any non-zero element works as a generator when the modulus is prime);
if I were to ask you for the 'discrete log' of 7 in this group, it would
really mean finding \(x \in G\) such that \(3x = 7\) which is
really just the problem of finding \(x = 7 \times 3^{-1} \
\textrm{mod} 11\), which is a trivial problem (see: the <a href="https://en.m.wikipedia.org/wiki/Extended_Euclidean_algorithm">Extended
Euclidean
Algorithm</a>),
even if you replace 11 with a very large prime. It's for this reason
that it would be a terribly naive error to try to do cryptography on an
additive group of integers; basically, division, being the additive
analog of logarithms for multiplication, is trivially easy.</p>
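<p>To make the triviality concrete, here is that 'discrete log' in \((\mathbb{Z}_{11}, +)\) computed in one line; Python's built-in <code>pow</code> exposes the modular inverse from the Extended Euclidean Algorithm:</p>

```python
# 'discrete log' in the additive group (Z_11, +) with generator 3:
# solve 3x ≡ 7 (mod 11) via the modular inverse of 3
x = (7 * pow(3, -1, 11)) % 11
assert (3 * x) % 11 == 7   # x = 6 solves it
```

<p>The same one-liner works if you replace 11 with a 2048-bit prime; nothing gets harder.</p>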
<p>But Wei Dai's argument goes a bit further than that concrete reasoning,
because he's saying the "if-then" (which can also be reversed, by the
way - see the paper, "Theorem 3") can be applied to any, arbitrary
groups - and that includes elliptic curve groups. If the DLP is hard in
that group, the \(k\)-sum problem can't be solved easily, and vice
versa. The argument is something like (we use \(\cdot\) specifically
to indicate <em>any</em> group operation):</p>
<p>If you can find a solution to:</p>
<p>\(x_1 \cdot x_2 \cdot \ldots x_k = y\)</p>
<p>...using an efficient \(k\)-sum problem algorithm applied to uniformly
randomly generated \(x_i\)s (each sampled as \(x_i = g^{w_i}\) for a
known random exponent \(w_i\)), where the group's generator is written
as \(g\), and the dlog of \(y\) in this group is \(\theta\), i.e.
\(y=g^{\theta}\), then you can use that solution to find
\(\theta\):</p>
<p>Thus, we have, essentially, a <span
style="text-decoration: underline;">reduction of the discrete logarithm
problem to the k-sum problem</span>.</p>
<h3>But why doesn't the algorithm work for DLP hard groups?</h3>
<p>We've already seen the key point in "Generalising the algorithm" above,
so if you skipped the last part of that section, do read it!</p>
<p>To reiterate, notice that the main description of solving this problem
with groups using \(\oplus\) or just addition required finding
partial matches and then preserving the features of partial matches
through repeated operations. It's precisely this that does not work in a
multiplicative group.</p>
<p>Here's a concrete example of doing that, with an additive group of the
simplest type, where we are working modulo a power of 2, let's say
\(n=4\) and \(l=2\) so we are examining the lowest 2 bits, in
numbers of 4 bits (i.e. modulo 16):</p>
<p>Take \(a=17, \ b=41\) which are both 1 mod 4. Now we apply an offset
value \(c=9\) (can be anything). We find:</p>
<p>\((a+c)_{16} = 26_{16}=10,\quad (b+c)_{16}=50_{16} = 2\)</p>
<p>and both the answers (10 and 2) are 2 mod 4, which verifies the point:
equality in the lowest order bits can be preserved when adding members.
This is what allows Wagner's trick to work.</p>
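<p>Checking those numbers directly:</p>

```python
# additive group mod 16 (n=4 bits), examining the low 2 bits (l=2)
a, b, c = 17, 41, 9
assert a % 4 == 1 and b % 4 == 1                  # a, b agree in the low two bits
assert (a + c) % 16 == 10 and (b + c) % 16 == 2   # the worked values from the text
assert (a + c) % 16 % 4 == (b + c) % 16 % 4 == 2  # equality mod 4 is preserved
```
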
<p>If we talk about multiplication, though, particularly in a group of
prime order, we find we don't get these properties preserved; in such a
group, multiplication has a strong <strong>scrambling effect</strong>. We'll take one
concrete example: \((\mathbb{Z}_{29}, \times)\). If I start with
any number and just keep multiplying by itself (this is basically how
'generators' work), we get this sequence:</p>
<p>\(3,9,27,23,11,4,12,7,21,5,15,16,19,28,26,20,2,6,18,25,17,22,8,24,14,13,10,1,3,\ldots
\)</p>
<p>(e.g. the 4th element is 23 because \(27 \times 3 = 81 \equiv 23\ \textrm{mod}\ 29\)).</p>
<p>The pattern repeats after 28 steps (the order of the multiplicative
group mod 29); but within the sequence we have an essentially
random-looking ordering. This is a direct consequence of the
fact that the numbers 3 and 29 have no common factors, there's nowhere
they can "line up".</p>
<p>To illustrate further, consider what happens with addition instead:
still working modulo 29, let's see what happens if we add a number to
itself repeatedly (note I chose 25 to be a slightly less obvious case -
but it's still obvious enough!):</p>
<p>\(25,21,17,13,9,5,1,26,22,18,14,10,6,2,27,23,19,15,11,7,3,28,24,20,16,12,8,4,0,
\ldots \)</p>
<p>Note that you're seeing it dropping by 4 each time because \(25 \equiv
-4\) in mod 29. There is always such a simple pattern in these
sequences in additive groups, and that's why division is trivial while
discrete logarithm is not.</p>
<p>So, as a consequence of this scrambling effect, we also find that
Wagner's observation about adding integers and then taking modulo
\(l\) no longer works, in multiplicative groups, at least in general.
Again, a concrete example using \((\mathbb{Z}_{29}, \times)\):</p>
<p>Let \(a=17,\ b=13\); both integers modulo 29. We'll, as before, check
the value modulo 4, both before and after adding an offset: they are
both 1 modulo 4. Let the offset we're going to apply to both, be 9. But
this time we're not going to <em>add</em> 9 but multiply it, because that is
the group operation now; we get:</p>
<p>\((17\times 9)_{29} = 153_{29} = 8_{29} \quad \rightarrow 0_{4}
\)</p>
<p>but:</p>
<p>\((13\times 9)_{29} = 117_{29} = 1_{29} \quad \rightarrow 1_{4}
\)</p>
<p>and, so unlike in the additive group case, we failed (at least for this
example, and this group - I haven't <span
style="text-decoration: underline;">proved</span> anything!) to preserve
the two low order bits (or the value mod 4, equivalently).</p>
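<p>And the failing multiplicative example, checked directly:</p>

```python
# multiplicative group mod 29: the same low-bit check fails
a, b, c = 17, 13, 9
assert a % 4 == 1 and b % 4 == 1        # a, b agree mod 4 before the operation
ra = (a * c) % 29                       # 153 mod 29 = 8
rb = (b * c) % 29                       # 117 mod 29 = 1
assert ra == 8 and rb == 1
assert ra % 4 != rb % 4                 # agreement mod 4 destroyed by multiplication
```
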
<p>In summary, as far as the current state of mathematics goes, it is
believed that there is not a way to do such a property preservation
"through" multiplication - but specifically this statement only applies
in groups where the discrete log is <em>actually</em> hard.</p>
<p>All of the above cross-applies to elliptic curves: like in
multiplicative groups (certain of them), the DLP is hard because the
group operator is essentially a 'scrambler', so the preservation of
properties, that Wagner requires, doesn't work.</p>
<h2>Applications to real systems</h2>
<h3>The OR of sigma protocols.</h3>
<p>This is a topic that was covered in an earlier <a href="https://joinmarket.me/blog/blog/ring-signatures/">blog
post</a>,
so I will not give the outline here - but you'll need that context to
understand the following. But we see here a fascinating implication of
Wagner's idea to these protocols. Recall that the verification uses the
following equation:</p>
<p>\(e_1 \oplus e_2 \oplus \ldots \oplus e_k = e\)</p>
<p>... look familiar at all? This of course is <em>exactly</em> the \(k\)-sum
problem that Wagner attacks! Therefore a dishonest prover has a much
better chance of fooling a verifier (by providing a valid set of
\(e_i\)-s) than one might expect naively if one hadn't thought about
this algorithm. Fortunately, there is a huge caveat: <strong>this attack
cannot be carried out if the protocol has special soundness</strong>. Special
soundness is a technical term meaning that if an extractor can generate
two validating transcripts, it can extract the witness. In this case,
the Wagner algorithm could not be performed <em>without already knowing the
secret/witness </em>(details: the attack would be to generate huge lists of
transcripts \(R, e, s\) (notation as per previous blogs), where \(e,
s\) are varied, keeping \(R\) fixed - but that's exactly how an
extractor works) - so in that sense it wouldn't be an attack at all.
However, not all zero knowledge protocols do have the special soundness
property. So while this is very in the weeds and I am not able to
illustrate further, it is certainly an interesting observation, and the
discussion in the full version of the Wagner paper is worth a read.</p>
<h3>Musig</h3>
<p>Obviously Wagner did not discuss this one :) This will be a very high
level summary of the issue in the context of
<a href="https://eprint.iacr.org/2018/068">Musig</a>,
the newly proposed scheme for constructing multisignatures via
aggregated Schnorr signatures. Read the Musig paper for more detail.</p>
<p>Recall that the naive aggregation of Schnorr signatures is insecure in
the multisig context due to what can be loosely called "related key
attacks" or "key subtraction attacks":</p>
<p>\(P_1 = x_1 G\quad P_2 =x_2G\)</p>
<p>\(s_1 = k_1 + ex_1\ ,\ s_2 = k_2 + ex_2\quad
s_{\textrm{agg}} = k_1 + k_2 + e(x_1+x_2)\)</p>
<p>fails in the multisig context of user-generated keys due to attacker
choosing:</p>
<p>\(P_2 = P^{*}_2 - P_1\quad P^{*}_2 = x^{*}_2 G\)</p>
<p>and then the attacker is able to construct a valid signature without
knowledge of \(x_1\).</p>
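<p>The key subtraction attack can be demonstrated end-to-end in a toy Schnorr group (written multiplicatively, \(P = g^x \bmod p\), with deliberately tiny, insecure parameters; this is my sketch, not code from the Musig paper):</p>

```python
import hashlib, random

# toy Schnorr group: p = 2q + 1, g generates the order-q subgroup of Z_p^*
p, q, g = 2039, 1019, 4

def H(*ints):
    data = b"".join(i.to_bytes(2, "big") for i in ints)
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % q

def sign(x, msg):
    k = random.randrange(1, q)
    R = pow(g, k, p)
    return R, (k + H(R, msg) * x) % q

def verify(P, msg, R, s):
    return pow(g, s, p) == (R * pow(P, H(R, msg), p)) % p

random.seed(5)
x1 = random.randrange(1, q); P1 = pow(g, x1, p)      # honest victim's key
x2s = random.randrange(1, q)                         # attacker's known key x2*
P2 = (pow(g, x2s, p) * pow(P1, -1, p)) % p           # published P2 = P2* - P1
P_agg = (P1 * P2) % p                                # naive aggregation
assert P_agg == pow(g, x2s, p)      # aggregate key is fully attacker-controlled!
R, s = sign(x2s, 42)                # attacker signs without ever knowing x1
assert verify(P_agg, 42, R, s)
```

<p>The aggregate key collapses to \(g^{x_2^{*}}\), so the "2-of-2" signature needs only the attacker's key.</p>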
<p>The paper explains that a naive fix for this problem <span
style="text-decoration: underline;">is actually susceptible to Wagner's
attack!</span></p>
<p>If you write each key as \(P^{*}_{i} = \mathbb{H}(P_i)P_i\), in
words, you (scalar) multiply each key by its hash, then you still know
the private key (just also multiply it by the same hash value), and you
might think you have removed the key subtraction attack, because an
attacker wants to create \(P_2\) such that it's the difference
between a key he knows and \(P_1\); but he can't know the hash value
before he computes it, so he will never be able to arrange for
\(\mathbb{H}(P_2)P_2\) to take a value of his choosing. This same logic is
seen in many places, e.g. in the fixing of public keys inside a basic
Schnorr signature challenge. But here, it's not enough, because there
are more degrees of freedom:</p>
<p>Suppose the attacker controls all \(n-1\) keys \(P_i\) except the
first, \(P_1\), which the honest victim provides. The attacker's
goal is then to make signing work without the honest victim's participation.
Now the aggregate key in this naive form of Musig is:</p>
<p>\(P_{agg} = \sum\limits_{i=1}^{n} \mathbb{H}(P_i)P_i\)</p>
<p>So the attacker's goal is to find all the other keys as offsets to the
first key such that the first key is removed from the equation. He sets:</p>
<p>\(P_i = P_1 + y_iG \quad \forall i \in 2\ldots n\)</p>
<p>i.e. the \(y_i\) values are just linear tweaks. Then let's see what
the aggregated key looks like in this naive version of Musig:</p>
<p>\(P_{agg} = \mathbb{H}(P_1)P_1 + \sum\limits_{i=2}^{n}
\mathbb{H}(P_1 + y_i G)(P_1 + y_i G) \)</p>
<p>\(P_{agg} = \mathbb{H}(P_1)P_1 + \sum\limits_{i=2}^{n}
\mathbb{H}(P_1 + y_i G)(P_1) + \sum\limits_{i=2}^{n}
\mathbb{H}(P_1 + y_i G)(y_i G)\)</p>
<p>Now, note that there are three terms and <strong>the last term is an
aggregated key which the attacker controls entirely</strong>. Consequently, if
the attacker can arrange for the first and second terms to cancel out,
he will succeed in signing without the victim's assent. Luckily that's
exactly an instance of Wagner's \(k\)-sum problem!:</p>
<p>\(\sum\limits_{i=2}^{n} \mathbb{H}(P_1 + y_i G) =
-\mathbb{H}(P_1) \)</p>
<p>Notice crucially that we've reduced this to an equation in <strong>integers</strong>,
not elliptic curve points, as per the long discussion above about Wei
Dai's observation. This is soluble, and becomes more easily soluble
(much more easily than one might naively expect!) as the value of \(n\)
increases, for arbitrarily chosen \(y_i\)-s. The attack requires the attacker to
control some subset of keys (in this simple illustration, \(n-1\)
keys, but it can actually be fewer), but since the whole point is to
remove trust in other key-owners, this is certainly enough to reject
this construction.</p>
<p>The solution is nearly obvious, if unfortunately it makes the equation a
little more complicated: <strong>fix the entire keyset, not just your own key,
in the hash</strong> (notice an echo here to the discussion of ring signatures
in an earlier blog post). By doing so, you cannot separate out the
dependence in \(P_1\) and thus cancel it out. So replace
\(\mathbb{H}(P_1)P_1\) with \(\mathbb{H}(P_1, P_2, \ldots ,
P_n)P_1\). The authors of the Musig construction tend to use the term
'delinearization' specifically to describe this.</p>
<h3>Other examples</h3>
<p>In fact, probably the most striking example of how Wagner's attack may
have implications for the security of real systems, is the attack he
describes against Schnorr blind signatures. But it is unfortunately also
the most complicated, so I will just briefly mention here that he shows
that a certain kind of such blind signatures can be forged given a
number \(k\) of parallel interactions with a signing oracle (which is
often a realised thing in systems that actually use blind signatures;
they are often used as kind of tokens/certificates), using the
corresponding \(k\)-sum problem.</p>
<p>He shows that certain specialised hash constructions (which may well be
outdated now, nearly 20 years later) have weaknesses exposed by this
kind of attack.</p>
<p>Curiously, he discusses the case of KCDSA, a Korean variant of DSA,
pointing out that it's possible to collide signatures (specifically the
\(s\) in an \(r, s\) pair), in the sense of having two different
messages with the same signature. A similar concept w.r.t. ECDSA can be
found in <a href="https://link.springer.com/content/pdf/10.1007%2F3-540-45708-9_7.pdf">this
paper</a>
- there it exploits a simple symmetry of the algorithm, but requires
that the public/private key pair be created as part of the 'stunt'.
Wagner on the other hand shows his algorithm can be used to find
"collisions" of this type in the KCDSA algorithm, but without the
restriction of having to create a key pair specially for the purpose
(i.e. it works for an existing key).</p>
<p>Several other possible applications are listed in the long version of
the paper.</p>
<hr/>
<h2>Ring Signatures</h2>
<p>2019-02-28, Adam Gibson</p>
<p>Construction of several different ring signatures relevant to Bitcoin.</p>
<h3>Ring signatures</h3>
<h2>Outline:</h2>
<ul>
<li>Basic goal of 1-of-\(N\) ring signatures</li>
<li>Recap: the \(\Sigma\)-protocol</li>
<li>OR of \(\Sigma\)-protocols, CDS 1994</li>
<li>Abe-Ohkubo-Suzuki (AOS) 2002 (broken version)</li>
<li>Security weaknesses</li>
<li>Key prefixing</li>
<li>Borromean, Maxwell-Poelstra 2015</li>
<li>Linkability and exculpability</li>
<li>AND of \(\Sigma\)-protocols, DLEQ</li>
<li>Liu-Wei-Wong 2004</li>
<li>Security arguments for the LWW LSAG</li>
<li>Back 2015; compression, single-use</li>
<li>Fujisaki-Suzuki 2007 and Cryptonote 2014</li>
<li>Monero MLSAG</li>
</ul>
<h2>Basic goal of 1-of-\(N\) ring signatures</h2>
<p>The idea of a <a href="https://en.wikipedia.org/wiki/Ring_signature">ring
signature</a>
(the term itself is a bit sloppy in context, but let's stick with it
for now) is simple enough:</p>
<p>An owner of a particular private key \(x\) signs a message \(m\) by
taking, usually without setup or interaction, a whole set of public
keys, one of which is his (\(P=xG\)), and forms a signature (exact
form unspecified) such that there is proof that <strong>at least one</strong> of the
private keys is known to the signer, but which one was responsible for
the signature is not known by the verifier, and not computable.</p>
<p>Obviously that's pretty vague but captures the central idea. We often
use the term "ring" because the construction must have some symmetry
over the entire set of \(n\) public keys, and a ring/circle represents
symmetry of an arbitrarily high order (limit of an \(n\)-gon). Less
abstractly it could be a good name because of some "loop"-ing aspect
of the algorithm that constructs the signature, as we'll see.</p>
<p>What properties do we want then, in summation?</p>
<ul>
<li>Unforgeability</li>
<li>Signer ambiguity</li>
</ul>
<p>We may want additional properties for some ring signatures, as we'll
see.</p>
<p>In the following sections I want to cover some of the key conceptual
steps to the kinds of ring signatures currently used in cryptocurrency
protocols; most notably Monero, but also several others; and also in the
Confidential Transactions construction (see: Borromean ring signatures,
briefly discussed here). I will also discuss security of such
constructions, in much less detail than the <a href="https://web.archive.org/web/20200713230948/https://joinmarket.me/blog/blog/liars-cheats-scammers-and-the-schnorr-signature/">previous
blog</a>
(on the security of Schnorr signatures), but showing how there are
several tricky issues to be dealt with, here.</p>
<h2>Recap: the \(\Sigma\)-protocol</h2>
<p>We consider a prover \(\mathbb{P}\) and a verifier \(\mathbb{V}\).</p>
<p>A \(\Sigma\)-protocol is a three step game, in which the prover
convinces the verifier of something (it can be \(\mathbb{P}\)'s
knowledge of a secret, but it can also be something more complicated),
in zero knowledge. Readers interested in a much more detailed discussion
of the logic behind this and several applications of the idea can read
Sections 3 and 4 of my <a href="https://github.com/AdamISZ/from0k2bp">From Zero (Knowledge) to
Bulletproofs</a>
writeup, especially section 4.1.</p>
<p>In brief, the three step game is:</p>
<p>\(\mathbb{P} \rightarrow \mathbb{V}\): <strong>commitment</strong></p>
<p>\(\mathbb{V} \rightarrow \mathbb{P}\): <strong>challenge</strong></p>
<p>\(\mathbb{P} \rightarrow \mathbb{V}\): <strong>response</strong></p>
<p>A few minor notes on this: obviously the game is not literally over with
the response step; the verifier will examine the response to establish
whether it is valid or invalid.</p>
<p>The <strong>commitment</strong> will usually in this document be written \(R\) and
will here always be a point on an elliptic curve, which the prover may
(or may not! in these protocols) know the corresponding scalar multiple
(private key or nonce) \(k\) such that \(R=kG\).</p>
<p>The <strong>challenge</strong> will usually be written \(e\) and will usually be
formed as the hash of some transcript of data; the subtleties around
exactly <em>what</em> is hashed can be vitally important, as we'll see. (This
is in the "Fiat-Shamir transform" case; we discussed the pure
interactive challenge case a bit in the previous blog and many other
places!)</p>
<p>The <strong>response</strong> will usually be a single scalar which will usually be
denoted \(s\).</p>
<p>We will be playing with this structure a lot: forging transcripts \(R,
e, s\); running multiple instances of a \(\Sigma\)-protocol in
parallel and performing logical operations on them. All of this will
play out <em>mostly</em> in the form of a Schnorr signature; again, refer to
previous blog posts or elementary explanations (including those written
by me) for more on that.</p>
<h2>OR of \(\Sigma\)-protocols, CDS 1994</h2>
<p>Let's start with the OR of \(\Sigma\)-protocols. I <em>believe</em> this
solution is due to <a href="https://link.springer.com/content/pdf/10.1007%2F3-540-48658-5_19.pdf">Cramer, Damgård and Schoenmakers
'94</a></p>
<p>(Historical note: the "believe" is because I've seen it cited to that
paper (which is famous for good reason, I guess); but in the paper they
actually attribute <em>this specific idea</em> to "M. Ito, A. Saito, and T.
Nishizeki: Secret Sharing Scheme realizing any Access Structure, Proc.
Glob.Com. (1987)" ; unfortunately I can't find that on the 'net).</p>
<p>It is also described, with a brief discussion of its security proof, in
<a href="https://crypto.stanford.edu/~dabo/cryptobook/BonehShoup_0_4.pdf">Boneh-Shoup</a>
Sec 19.7.2.</p>
<p>This is not, as far as I know, used at all(?) nor that widely discussed,
but it is in some sense the most simple and logical way to get a 1 out
of \(N\) ring signature; use the XOR (\(\oplus\)) operation:</p>
<p>We have in advance a set of public keys \(P_i\). We only know one
private key for index \(j\), \(x_j\).</p>
<p>We'll now use a standard three move \(\Sigma\)-protocol to prove
knowledge of <strong>at least one key</strong> without revealing which index is
\(j\).</p>
<p>We're going to fake the non-\(j\)-index signatures in advance. Choose
\(s_i \stackrel{\$}{\leftarrow} \mathbb{Z}_N\ ,\ e_i
\stackrel{\$}{\leftarrow} \mathbb{Z}_N \quad \forall i \neq j\).</p>
<p>Calculate \(R_i = s_iG - e_iP_i \quad \forall i \neq j\).</p>
<p>For the real signing index, \(k_j \stackrel{\$}{\leftarrow}
\mathbb{Z}_N\ ,\quad R_j = k_jG\).</p>
<p>We now have the full set of commitments: \((R_i \ \forall i)\)</p>
<p>Now for the clever part. In an interactive \(\Sigma\)-protocol, we
would at this point receive a random challenge \(e \in
\mathbb{Z}_N\). For the Fiat Shamir transformed case,
noninteractively (as for a signature), we use the constructed
\(R\)-values as input to a hash function, i.e. \(e = H(m||R_i
\ldots)\). We have already set the non-signing index \(e\)-values,
for the signing index we set \(e_j = e \oplus (\bigoplus_{i \ne
j}{e_i})\).</p>
<p>This allows us to calculate \(s_j = k_j + e_j x_j\), and we now have
the full set of 'responses' for all the \(\Sigma\)-protocols:
\(s_i \ \forall i\). (but here we are using Fiat Shamir, so it's
not actually a response).</p>
<p>By working this way we have ensured that the signature verifier can
verify that the logical XOR of the three \(e\)-values is equal to the
Fiat Shamir based hash-challenge, e.g. for the case of three
"signatures", we will have:</p>
<p>\(e = e_1 \oplus e_2 \oplus e_3 \stackrel{?}{=}
H(m||R_1||R_2||R_3)\)</p>
<p>where the verifier would calculate each \(R_i\) as \(s_iG -
e_iP_i\).</p>
<p>The excellent feature of this of course is that it is perfectly hidden
which of the three indexes was genuine. But the bad news is that the
protocol as stated, used let's say as a signature scheme, requires
about twice as many field elements as members of the group of signers.
The verifier needs to be given \((s_1, \ldots s_n),(e_1 \ldots
e_n)\).</p>
<p>Another excellent feature: this is not restricted to the Schnorr ID
protocol. It can work with another identity protocol, and even better,
it could work with a <em>mix</em> of them; they only have to share the one
challenge \(e\).</p>
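<p>A minimal sketch of the whole construction, in a toy Schnorr group \(p=2q+1\) rather than on an elliptic curve (parameters deliberately tiny and insecure; function names are mine):</p>

```python
import hashlib, random

# toy Schnorr group: p = 2q + 1, g generates the order-q subgroup of Z_p^*
p, q, g = 2039, 1019, 4
T = 32  # challenge bit-width, so XOR is over fixed-width values

def H(msg, Rs):
    data = msg + b"".join(R.to_bytes(2, "big") for R in Rs)
    return int.from_bytes(hashlib.sha256(data).digest()[:4], "big")  # T bits

def ring_sign(msg, pubkeys, j, xj):
    # 1-of-N OR proof: fake every index except j, then solve e_j by XOR
    n = len(pubkeys)
    e, s, R = [0] * n, [0] * n, [0] * n
    for i in range(n):
        if i == j:
            continue
        s[i], e[i] = random.randrange(q), random.getrandbits(T)
        R[i] = (pow(g, s[i], p) * pow(pubkeys[i], -e[i], p)) % p  # simulated
    k = random.randrange(1, q)
    R[j] = pow(g, k, p)                 # the only honestly committed nonce
    ej = H(msg, R)
    for i in range(n):
        if i != j:
            ej ^= e[i]                  # e_j = e XOR (all other e_i)
    e[j] = ej
    s[j] = (k + e[j] * xj) % q          # only here is the private key used
    return s, e

def ring_verify(msg, pubkeys, s, e):
    R = [(pow(g, s[i], p) * pow(pubkeys[i], -e[i], p)) % p
         for i in range(len(pubkeys))]
    x = 0
    for ei in e:
        x ^= ei
    return x == H(msg, R)

random.seed(1)
xs = [random.randrange(1, q) for _ in range(3)]
Ps = [pow(g, x, p) for x in xs]
s, e = ring_sign(b"hello", Ps, 1, xs[1])   # sign with only the middle key
assert ring_verify(b"hello", Ps, s, e)
```

<p>Note the verifier's recomputation of \(R_j\) gives back \(g^{k}\) exactly because \(g^{s_j}P_j^{-e_j} = g^{k+e_j x_j - e_j x_j}\); the faked and real indices are indistinguishable in the transcript.</p>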
<h2>Abe-Ohkubo-Suzuki (AOS) 2002 (broken version)</h2>
<p>This is an excellent
<a href="https://www.iacr.org/cryptodb/archive/2002/ASIACRYPT/50/50.pdf">paper</a>
generally, but its stand-out contribution, in this context, is a <strong>more
compact</strong> version of the 1 of n ring signature above. To clarify here,
both this and the previous are \(O(n)\) where \(n\) is the group
size, so "more compact" is about the constant factor (scale, not
scaling!); we reduce it from roughly 2 field elements per ring member to roughly 1.
<p>"Broken version" - here I'll present a slightly simpler form than the
one in the paper, and then explain the serious problem with it - which I
hope will be productive. <strong>Please don't mistake this as meaning that
the AOS design was broken, it was never presented like this in the
paper!</strong></p>
<p>Anyway, I think the best explanation for what's going on here
conceptually is due to A. Poelstra in the <a href="https://github.com/Blockstream/borromean_paper">Borromean ring signatures
paper</a>,
particularly Section 2; the reference to time travel may seem whimsical
but it gets to the heart of what's going on here; it's about having a
simulated form of causality with one way functions, and then violating
that.</p>
<p>In short: creating an ordinary Schnorr sig without the key (i.e.
forging) is impossible because, working at the curve point level of the
equation (\(sG = R + H(m||R)P\)), you need to know the hash value
before you can calculate \(R\), but you need to know the value of
\(R\) before you can calculate the hash. So we see that two one way
functions are designed to conflict with one another; only by removing
one of them (going from curve points to scalar eqn: (\(s = k +
H(m||kG)x\)), can we now create a valid \(s, R, m\) set.</p>
<p>To achieve that goal over a set of keys, we can make that "simulated
causality enforcement" be based on the same principle, but over a set
of equations instead of one. The idea is to make the commitment
\(H(m||R)\) use the \(R\) value from the "previous"
signer/key/equation, where "previous" is modulo \(N\), i.e. there is
a loop of dependencies (a ring, in fact).</p>
<p>[Quick description:]{style="text-decoration: underline;"}</p>
<p>Our goal is a list of \(N\) correctly verifying Schnorr signature
equations, with the tweak as mentioned that each hash-value refers to
the "previous" commitment. We will work with \(N=4\) and index from
zero for concreteness. Our goal is:</p>
<p>\(s_0 G = R_0 + H(m||R_3)P_0\)</p>
<p>\(s_1 G = R_1 + H(m||R_0)P_1\)</p>
<p>\(s_2 G = R_2 + H(m||R_1)P_2\)</p>
<p>\(s_3 G = R_3 + H(m||R_2)P_3\)</p>
<p>Again for concreteness, we imagine knowing specifically the private key
\(x_2\) for index 2, only. We can successfully construct the above,
but only in a certain sequence:</p>
<p>Choose \(k_2 \stackrel{\$}{\leftarrow} \mathbb{Z}_N,\ R_2 =
k_2G\), choose \(s_3 \stackrel{\$}{\leftarrow} \mathbb{Z}_N\).</p>
<p>\(\Rightarrow R_3 = s_3 G - H(m||R_2)P_3\). Now choose \(s_0
\stackrel{\$}{\leftarrow} \mathbb{Z}_N\).</p>
<p>\(\Rightarrow R_0 = s_0 G - H(m||R_3)P_0\). Now choose \(s_1
\stackrel{\$}{\leftarrow} \mathbb{Z}_N\).</p>
<p>\(\Rightarrow R_1 = s_1 G - H(m||R_0)P_1\).</p>
<p>Last, do not choose but <strong>calculate</strong> \(s_2\): it must be \(s_2 = k_2
+ H(m||R_1)x_2\).</p>
<p>After this set of steps, the set of data: \(e_0, s_0, s_1, s_2, s_3\)
can be verified without exposing which private key was known. Here is
the verification:</p>
<p>Given \(e_0, s_0\), reconstruct \(R_0 = s_0G -e_0P_0\).</p>
<p>\(\Rightarrow e_1 =H(m||R_0)\ ,\ R_1 = s_1 G - e_1P_1\)</p>
<p>\(\Rightarrow e_2 =H(m||R_1)\ ,\ R_2 = s_2 G - e_2P_2\)</p>
<p>\(\Rightarrow e_3 =H(m||R_2)\ ,\ R_3 = s_3 G - e_3P_3\)</p>
<p><strong>Check</strong>: \(e_0 \stackrel{?}{=} H(m||R_3)\).</p>
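<p>Here's that signing-then-verifying dance as a toy Python sketch, generalised to any \(n\) and signing index \(\pi\). Same caveats as before: a tiny insecure multiplicative group stands in for the curve, and this is a sketch of the (deliberately broken) idea, not the real AOS scheme:</p>

```python
import hashlib, random

p, q, g = 2039, 1019, 4   # toy Schnorr group (INSECURE, illustration only)

def H(*args):
    data = b"|".join(str(a).encode() for a in args)
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % q

def aos_sign(m, pubkeys, pi, x_pi):
    """'Broken' AOS ring sign (no key-prefixing): each challenge commits
    to the previous R value, and the loop is closed at the real index pi."""
    n = len(pubkeys)
    s, e, R = [0] * n, [0] * n, [0] * n
    k = random.randrange(q)
    R[pi] = pow(g, k, p)                          # honest commitment
    i = (pi + 1) % n
    while i != pi:
        e[i] = H(m, R[(i - 1) % n])               # challenge from previous R
        s[i] = random.randrange(q)
        R[i] = pow(g, s[i], p) * pow(pubkeys[i], -e[i], p) % p  # simulate
        i = (i + 1) % n
    e[pi] = H(m, R[(pi - 1) % n])
    s[pi] = (k + e[pi] * x_pi) % q                # only here the key is needed
    return e[0], s

def aos_verify(m, pubkeys, e0, s):
    e = e0
    for i in range(len(pubkeys)):
        R = pow(g, s[i], p) * pow(pubkeys[i], -e, p) % p
        e = H(m, R)
    return e == e0          # the ring of dependencies must close

xs = [random.randrange(1, q) for _ in range(4)]
Ps = [pow(g, x, p) for x in xs]
e0, s = aos_sign("ring msg", Ps, 2, xs[2])        # we know only x_2
assert aos_verify("ring msg", Ps, e0, s)
```

<p>With \(n=4, \pi=2\) the loop reproduces exactly the sequence of choices above.</p>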
<h3>Security weaknesses</h3>
<p>The construction above can't be described as secure.</p>
<p>To give a hint as to what I mean: is there something <strong>not completely
fixed</strong> in the above construction? Maybe an issue that's not even
specific to the "ring" construction, but even for any one of the
signature equations?</p>
<p>....</p>
<p>The answer is the keys, \(P_i\). We can in the most general case
consider three scenarios, although there may be some gray areas between
them:</p>
<ul>
<li>Key(s) fixed in advance: \(P_1 \ldots P_N\) are all specified
before doing anything, and not allowed to change by the verifier.
Every signature must be on that set of keys.</li>
<li>The <em>set of possible keys</em> is fixed in advance exactly as
described above, but the <em>set of keys used in the ring</em> is chosen by
the signer, dynamically, in signing oracle queries or forgery
attempts.</li>
<li>Even the set of possible keys is dynamic. That is to say, any valid
curve point (for EC case) is a valid potential key in (ring)
signature.</li>
</ul>
<p>This is not a full taxonomy of possible attack scenarios, either. Not
only must we consider the difference between EUF-CMA and SUF-CMA as was
discussed in the previous blog (a reminder: with SUF, a forger should
not be able to even create a second signature on the same message -
ECDSA doesn't have this in naive form), but much more: we must also
consider which of the above three key settings applies.</p>
<p>Even outside of ring signature settings, just considering a large scale
deployment of a signature scheme across millions or billions of keys,
could mean that the difference between these cases really matters. In
<a href="https://eprint.iacr.org/2015/996">this</a>
paper by Dan Bernstein the term MU-UF-CMA is used to refer to the
"multi-user" setting for this, where only single-key signatures are
used but one must consider whether having billions of other keys and
signing oracles for them might impact the security of <strong>any one</strong> key
(notice how huge the difference is, in this scenario, between "I want to
forge on \(P\)" and "I want to forge on any existing key").</p>
<p>So enough about settings, what exactly constitutes a security problem
with the above version of the AOS ring sig?</p>
<p>Consider any one element in the ring like:</p>
<p>\(s_0 G = R_0 + H(m||R_3)P_0\)</p>
<p>where, for concreteness, I choose \(n=4\) and look at the first of 4
signature equations. Because of Schnorr's linearity (see <a href="https://web.archive.org/web/20200713230948/https://joinmarket.me/blog/blog/flipping-the-scriptless-script-on-schnorr/">this earlier
blog
post</a>
for some elucidations on the <em>advantage</em> of this linearity, although it
was also noted there that it had concomitant dangers (worse,
actually!)), there are two obvious ways we could tweak this equation:</p>
<p>(1) Tweaked \(s\) values on fixed message and tweaked keys:</p>
<p>Choose \(\alpha \in \mathbb{Z}_N\) and set \(s' = s_0
+\alpha\). We will not alter \(R=kG\), but we alter \(P_0
\rightarrow P_0 + e_0^{-1}\alpha G\). This makes the verification
still work <strong>without altering the fixing of the nonce in the hash value
\(e_0\):</strong></p>
<p>\(s_0 G + \alpha G = R_0 + e_0 P_0 + \alpha G = R_0 + e_0\left(P_0 +
e_0^{-1}\alpha G\right)\)</p>
<p>So it's really not clear how bad this failing is; it's <em>kinda</em> a
failure of strong unforgeability, but that notion doesn't precisely
capture it: we created a new, valid signature against a
<span style="text-decoration: underline;">new</span> key, but with two severe
limitations: we weren't able to alter the message, and also, we
weren't able to <em>choose</em> the new key \(P'\). That last is slightly
unobvious, but crucial: if I have a pre-prepared \(P^{*}\), I
cannot choose \(\alpha\) to get \(P' = P^{*}\) as that would
require a discrete logarithm break.</p>
<p>A final statement, hopefully obvious: the above can apply to any and all
of the elements of the ring, so the forgery could consist of an entirely
different and random set of keys, not related to the starting set; but
the message would be the same, as would the \(R\) values.</p>
<p>(2) Completely different messages on tweaked keys, with the same
signature</p>
<p>This one is almost certainly more important. Algebraically, we here
allow alterations to the \(e\) values, using multiplication rather
than addition:</p>
<p>Given the same starting \(s_0\) as in (1), we take a chosen new
message \(m^{*}\) and calculate the new \(e_0^{*} =
H(m^{*}||R_3)\). If we likewise tweak the public key we get that
\(s_0, R_0\) is a valid signature on the new message, with the tweaked
key:</p>
<p>\(s_0 G = R_0 + e_0^{*}\left(\frac{e_0}{e_0^{*}} P_0\right)\)</p>
<p>We can see here that this produces a forgery with the same signature
values (but different hash values) on the new keys.</p>
<p>Most definitions of security against forgery require the attacker to
create a signature on a not-previously-queried message - so this <em>is</em> a
successful attack, by most measures.</p>
<p>However it does share the same limitation with (1) mentioned above -
that you cannot "control" the keys on which you get a signature,
unless you know a relative discrete log between one of the existing keys
and your new key, which implies you knew the secret key of the first (in
which case all this is pointless; whenever you have a private key, there
is no forgery on it).</p>
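<p>To see attack (2) in action, here's a toy Python demonstration for a single Schnorr equation without key-prefixing; the same tweak applies independently at each index of the ring. As before this uses an insecure toy group and my own naming:</p>

```python
import hashlib, random

p, q, g = 2039, 1019, 4   # toy Schnorr group (INSECURE)

def H(*args):
    data = b"|".join(str(a).encode() for a in args)
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % q

def verify(m, P, R, s):
    # plain Schnorr verification, challenge e = H(m, R): NO key-prefixing
    return pow(g, s, p) == R * pow(P, H(m, R), p) % p

# an honest signature on a message, with key P = g^x
x = random.randrange(1, q)
P = pow(g, x, p)
k = random.randrange(q)
R = pow(g, k, p)
s = (k + H("pay Alice", R) * x) % q
assert verify("pay Alice", P, R, s)

# forgery (2): keep (R, s), pick a fresh message, tweak the key by e/e*
e = H("pay Alice", R)
m2 = "pay Bob"
e_star = H(m2, R)
while e_star == 0:          # e* must be invertible mod q (a non-issue at real sizes)
    m2 += "!"
    e_star = H(m2, R)
P_star = pow(P, e * pow(e_star, -1, q) % q, p)
assert verify(m2, P_star, R, s)   # the same (R, s) verifies on the new message
```

<p>As the text says, the attacker has no control over where \(P^{*}\) lands; but most forgery definitions are still violated.</p>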
<p><strong>All of this should make very clear the reason why the real AOS (see
Section 5.1 of the paper) discrete-log ring signature fixes the entire
set of keys inside the hash, i.e. \(e_i = H(m || R_{(i-1)\%n}||
P_0 \ldots P_{n-1})\).</strong></p>
<h3>Key Prefixing</h3>
<p>The method in the previous bolded sentence is sometimes called
"key-prefixing". One way of looking at it: the Fiat-Shamir transform
that takes the Identity Protocol into a signature scheme, should hash
the conversation transcript between the prover and verifier, previous to
the challenge step; by including the public keys in this hash, we are
treating the keyset as part of the conversation transcript, rather than
something ex-protocol-run.</p>
<p>Also, the discussion above (both cases (1) and (2)) show clearly that
the same weakness exists for a single (\(n=1\)) key case.</p>
<p><span style="text-decoration: underline;">And yet, for the single key case, it was not a done deal historically -
this caused real world arguments!</span>
After all, there are many use cases where the key <em>is</em> a given
ex-protocol-run, plus there may be some practical disadvantage to doing
the key-prefixing.</p>
<p>In
<a href="https://rd.springer.com/chapter/10.1007%2F978-3-662-53008-5_2">this</a>
paper from CRYPTO-2016, the controversy arising out of this is
elucidated, showing that these theoretical concerns had very substantial
impact on arguably the largest real world crypto usage (TLS):</p>
<blockquote>
<p>"Key-prefixing comes with the disadvantage that the entire public-key
has to be available at the time of signing. Specifically, in a CFRG message
from September 2015 Hamburg [32] argues "having to hold the public key along
with the private key can be annoying" and "can matter for constrained
devices". Independent of efficiency, we believe that a cryptographic protocol
should be as light as possible and prefixing (just as any other component)
should only be included if its presence is justified. Naturally, in light of
the GMLS proof, Hamburg [32] and Struik [44] (among others) recommended
against key prefixing for Schnorr. Shortly after, Bernstein [10] identifies
the error in the GMLS theorem and posts a tight security proof for the
key-prefixed variant of Schnorr signatures. In what happens next, the
participant of the CFRG mailing list switched their minds and mutually agree
that key-prefixing should be preferred, despite of its previously discussed
disadvantages. Specifically, Brown writes about Schnorr signatures that
"this justifies a MUST for inclusion of the public key in the message of the
classic signature" [16]. As a consequence, key-prefixing is contained in
the current draft for EdDSA [33]..."</p>
</blockquote>
<p><em>Technical note: the "GMLS proof" mentioned in the above is the proof
given in
<a href="https://www.researchgate.net/publication/256720499_Public_key_signatures_in_the_multi-user_setting">this</a>
paper, that was intended to reduce the security of the multi-user
setting to that of the single-user setting, and that Dan Bernstein's
<a href="https://eprint.iacr.org/2015/996">paper</a>
previously mentioned proved to be invalid.</em></p>
<p>What's the TLDR? Fix the keys in any group/ring/multisignature. And
even that may not be enough, see
<a href="https://eprint.iacr.org/2018/068">MuSig</a>
for details of why it really isn't, in the scenario of Bitcoin
aggregated multisig.</p>
<h2>Borromean, Maxwell-Poelstra 2015</h2>
<p>I covered this extensively (including description of AOS as above) in my
<a href="https://github.com/AdamISZ/ConfidentialTransactionsDoc/">CT
writeup</a>
section 3.2.</p>
<p>The idea of the construction as outlined in <a href="https://github.com/Blockstream/borromean_paper">the paper by Maxwell,
Poelstra</a>
is to increase the space-efficiency of the published proof even more. By
having several ring signatures joined at a single index we get a
reduction in the number of \(e\) values we publish. This is basically
the same idea as the "AND of \(\Sigma\)-protocols" discussed a
little later in this document (although here we will only be using it
for achieving a specific goal, "Linkability", see more on this next).</p>
<p>For the real world context - Borromean ring signatures are used in
certain implementations of Confidential Transactions (e.g. Liquid by
Blockstream) today, and were previously used also in Monero for the same
goal of CT. They are a radically different use-case of ring signatures
to the one mostly described in the below; instead of using a ring
signature to hide the identity of a signer, they are used to hide which
exponent contains values in the encoding of a value committed to in a
Pedersen commitment. This allows arithmetic to be done on the
Pedersen-committed amount without worrying about overflow into negative
values modulo \(N\).</p>
<h2>Linkability and Exculpability</h2>
<p>In this section we'll briefly describe certain key features that turn
out to be useful in some real-world applications of a ring signature,
before in the following sections laying out how these features are, or
are not, achieved.</p>
<h3>Linkability (and spontaneity)</h3>
<p>At first glance, the idea "linkability" with a ring signature seems to
be a contradiction. Since we are trying to achieve signer
ambiguity/anonymity, we don't really want any "linking" being done.
But the idea is rather clever, and proves to be very interesting for
digital cash.</p>
<p>In a <strong>linkable</strong> ring signature, a participant with key \(P \in L\)
(i.e. \(L\) is a particular set of public keys), should be able to
produce one ring signature on a given message, but should not be able to
do so again without the two ring signatures being linked. Thus,
functionally, each participant can only make such a signature once
(note: they can still retain anonymity if double-signing).</p>
<p>This restriction-to-one-signature-while-keeping-anonymity is easily seen
to be valuable in cases like electronic voting or digital cash, as well
as the oft-cited example explained in the next paragraph.</p>
<p>The <strong>spontaneity</strong> property should be a lot more obvious. Consider the
example of a whistleblower. We would want individuals in some large
group (e.g. government bureaucrats) to attest to a statement, while only
revealing group membership and not individual identity. Clearly this is
not workable if it requires cooperation of other members of the group
(even in any setup phase), so it's necessary that the individual can
create the ring signature "spontaneously", knowing only the public key
of other participants.</p>
<p>The paper uses the abbreviation LSAG for this type of signature:
"Linkable Spontaneous Anonymous Group" signature.</p>
<p>Note that the previous two constructions (CDS, AOS) can also have this
spontaneity property; but not the linkability property.</p>
<h3>Culpability, Exculpability and Claimability</h3>
<p>A ring signature can be described as exculpable if, even given knowledge
of the signing private key, an adversary cannot deduce that that signing
key was the one used to create the ring signature.</p>
<p>Notice that such a property may be immensely important in a range of
scenarios where a ring sig is useful - e.g. for a whistleblower whose
cryptographic keys were stolen or extracted by force, he could still
plausibly deny being the origin of a leak.</p>
<p>The reader can easily verify that the AOS construction, for example, has
this exculpability. The fact that a particular key is released e.g.
\(x_2\) in our concrete example, does not allow inference of it having
been used to create that signature. Any other key could have created the
signature, using the same signing algorithm.</p>
<p>The LWW LSAG, which we'll describe shortly, is on the other hand
<strong>culpable</strong>, i.e. the opposite - because the key image can be verified
to be tied to one particular key.</p>
<p>It's easy to see that the two properties <strong>exculpability</strong> and
<strong>linkability</strong> are somewhat in conflict, although I'm not aware of a
theorem that <em>absolutely requires</em> linkability to somehow tag one key in
case it is leaked.</p>
<p>Lastly, I'll mention <strong>claimability</strong>, which is briefly described also
in the LWW paper (see below). It may be possible for the owner of a key
to independently/voluntarily prove that they were the source of a given
ring signature, which doesn't logically require culpability.
Claimability is generally easy to achieve with some proof of knowledge
technique.</p>
<h2>AND of \(\Sigma\)-protocols, DLEQ</h2>
<p>The thoughtful reader probably won't have much trouble in imagining
what it would mean to do the logical AND of 2 \(\Sigma\)-protocols.</p>
<p>"AND" here just means you need to prove to the Verifier that you know
both secrets / both conditions are true. So this only requires that you
can answer both challenges (second step) with correct responses. Using
the standard notation, that means generating two transcripts:</p>
<p>\((R_1, e, s_1) \quad (R_2, e, s_2)\)</p>
<p>i.e. the same \(e\)-value is given to both protocol runs after
receiving the initial commitments from each. Fiat-Shamir-ising this
protocol will work the same as the usual logic; if considering a
signature scheme, we'll be hashing something like
\(H(m||R_1||R_2||P_1||P_2)\), if we include, as we have learnt
to, key-prefixing.</p>
<p>As we already mentioned, the Borromean ring signature design uses this
idea to compactify a set of ring signatures, since only one
\(e\)-value is being published, rather than \(M\) for \(M\) ring
signatures.</p>
<p>This much is not super-interesting; but we can tighten this up a bit and
only use <strong>one</strong> commitment and response in a special case:</p>
<h3>Proof of Discrete Log Equivalence (DLEQ, PoDLE)</h3>
<p>See one of the first posts on this
<a href="https://web.archive.org/web/20200713230948/https://joinmarket.me/blog/blog/poodle">blog</a>
for a description of this technique; here we're giving a slightly
deeper look at the meaning.</p>
<p>If you are proving not only knowledge of a secret \(x\), but also that
two curve points have the same discrete log \(x\) w.r.t. different
bases \(G\) and \(J\) (whose relative discrete log must not be
known; see earlier blog post etc.), you can condense the above AND by
reusing the commitment and challenge for the two bases:</p>
<p>\(\mathbb{P} \rightarrow \mathbb{V}\): \(R_1= kG,R_2=kJ\)</p>
<p>\(\mathbb{V} \rightarrow \mathbb{P}\): \(e =
H(m||R_1||R_2||P_1||P_2)\)</p>
<p>\(\mathbb{P} \rightarrow \mathbb{V}\): \(s\), (in secret:
\(=k+ex\))</p>
<p>Now, if the prover acted honestly, his construction of \(s\) will
correctly pass verification <strong>twice</strong>:</p>
<p>\(sG \stackrel{?}{=}R_1 +e P_1 \quad sJ \stackrel{?}{=} R_2 +
eP_2\)</p>
<p>... and notice that it would be impossible to make that work for
different \(x\)-values on the two bases \(G\) and \(J\) because
you would need to find \(k_1, k_2 \in \mathbb{Z}_N, x_1, x_2 \in
\mathbb{Z}_N\) such that, <strong>without knowing \(e\) in advance,</strong>
\(s = k_1 + ex_1 =k_2 + ex_2\), which is clearly impossible.</p>
<p>Proof of soundness is easy to see using the standard rewinding technique
(see e.g. previous blog post amongst many other places); after the two
upfront commitments are fixed and the \(e\)-values are "forked", we
will get two \(s\) values as usual and extract \(x\).</p>
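<p>Here's a toy Python sketch of the DLEQ/PoDLE proof, again over the insecure toy Schnorr group. The "second base" \(J\) is manufactured NUMS-style by hashing; in the elliptic curve setting it would be a hashed curve point. All naming is mine:</p>

```python
import hashlib, random

p, q, g = 2039, 1019, 4   # toy Schnorr group (INSECURE)

def H(*args):
    data = b"|".join(str(a).encode() for a in args)
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % q

# second base J, NUMS-style: nobody knows log_g(J); mapping into [1, p-1]
# and squaring lands it in the order-q subgroup
t = int.from_bytes(hashlib.sha256(b"nothing-up-my-sleeve").digest(), "big") % (p - 1) + 1
J = pow(t, 2, p)

def dleq_prove(m, x):
    P1, P2 = pow(g, x, p), pow(J, x, p)
    k = random.randrange(q)
    R1, R2 = pow(g, k, p), pow(J, k, p)   # ONE nonce, used on both bases
    e = H(m, R1, R2, P1, P2)              # ONE challenge for the AND
    s = (k + e * x) % q                   # ONE response
    return P1, P2, R1, R2, s

def dleq_verify(m, P1, P2, R1, R2, s):
    e = H(m, R1, R2, P1, P2)
    return (pow(g, s, p) == R1 * pow(P1, e, p) % p and
            pow(J, s, p) == R2 * pow(P2, e, p) % p)

x = random.randrange(1, q)
assert dleq_verify("msg", *dleq_prove("msg", x))
```

<p>The compression relative to a generic AND is exactly that one \((R, e, s)\) triple does double duty on both bases.</p>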
<h2>Liu-Wei-Wong 2004 LSAG</h2>
<p>Shortly after the AOS paper, Liu, Wei and Wong published a
<a href="https://www.researchgate.net/publication/220798466_Linkable_Spontaneous_Anonymous_Group_Signature_for_Ad_Hoc_Groups_Extended_Abstract">paper</a>
outlining how the same basic idea could be extended to a slightly more
complex context of requiring <strong>linkability</strong>, as earlier mentioned. It
uses a combination of the above: DLEQ via AND of
\(\Sigma\)-protocols, and OR of \(\Sigma\)-protocols for the ring
signature hiding effect. Detailed algorithm with commentary follows.</p>
<h3>Liu-Wei-Wong's LSAG algorithm</h3>
<p>We start with a keyset \(L = \{P_0 \ldots P_{n-1}\}\) chosen by
the signer, whose index will be \(\pi\) (note the ambiguities about
"what is the set of valid keys?" as was discussed under "Key
Prefixing"). We then form a special new kind of curve point that we'll
name from now on as the <strong>key image</strong> (for reasons that'll become
clear):</p>
<p>\(I =x_{\pi} \mathbb{H}(L)\)</p>
<p>Here \(\mathbb{H}\) is a hash function whose output space is points
on the curve, rather than scalar numbers. (<em>The mechanical operation for
doing this is sometimes described as "coerce to point"; for example,
take the 256 bit number output by SHA256 and interpret it as an
\(x-\)coordinate on secp256k1, find the "next" valid point
\(x,y\), incrementing \(x\) if necessary, or whatever; just has to
be deterministic</em>). \(\mathbb{H}(L)\) is therefore going to play the
same role as \(J\) in the previous section, and we assume
intractability of relative discrete log due to the hashing.</p>
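<p>Here is roughly what "coerce to point" might look like on secp256k1, as a try-and-increment sketch. This illustrates the idea only - it is not any particular library's hash-to-curve, and modern schemes use more careful constructions:</p>

```python
import hashlib

# secp256k1 field prime; p % 4 == 3, so sqrt is pow(., (p+1)//4, p)
P = 2**256 - 2**32 - 977

def coerce_to_point(data):
    """Interpret H(data) as an x-coordinate and increment x until
    x^3 + 7 is a quadratic residue, i.e. until (x, y) is on the curve."""
    x = int.from_bytes(hashlib.sha256(data).digest(), "big") % P
    while True:
        y2 = (pow(x, 3, P) + 7) % P
        # Euler criterion: y2 is a square iff y2^((P-1)/2) == 1 mod P
        if pow(y2, (P - 1) // 2, P) == 1:
            y = pow(y2, (P + 1) // 4, P)
            return x, y
        x = (x + 1) % P

x, y = coerce_to_point(b"some keyset L")
assert (y * y - (x**3 + 7)) % P == 0   # the point satisfies y^2 = x^3 + 7
```

<p>On average only a couple of increments are needed, since roughly half the field elements are quadratic residues.</p>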
<h3>Signing LWW LSAG</h3>
<p>The following steps are very similar "in spirit" to AOS; we still
"extend the causality loop" (bastardising Poelstra's description)
over the whole set of signatures instead of just one, but this time we
also "lift" the loop onto a base of \(\mathbb{H}(L)\) and replicate
the signatures there, too:</p>
<ul>
<li>Set \(k_{\pi} \stackrel{\$}{\leftarrow} \mathbb{Z}_N\)</li>
<li>Form the hash-challenge at the next index: \(e_{\pi+1} =
H(m||L||k_{\pi}G||k_{\pi}\mathbb{H}(L)||I)\)</li>
<li>Note to the above: \(k_{\pi}G\) was previously called
\(R_{\pi}\) in AOS; we are trying to preserve here, the same
notation where possible; and of course it's the \(R\) value, not
the \(k\)-value that will be known/calculated by the verifier. The
same applies to the "lifted" nonce-point which follows it in the
concatenation. With respect to the key image, note that it <em>will</em> be
published and known to the verifier; but he won't know which index
it corresponds to.</li>
<li>Pick \(s_{\pi+1} \stackrel{\$}{\leftarrow} \mathbb{Z}_N\);
then we do as in AOS, but duplicated; we set:</li>
<li>\(R_{\pi+1} = s_{\pi+1}G - e_{\pi+1}P_{\pi+1}\) and
\(R^{*}_{\pi+1} = s_{\pi+1}\mathbb{H}(L) - e_{\pi+1}I\)</li>
<li>I realise the last line is pretty dense, so let's clarify: the
first half is exactly as for AOS; calculate \(R\) given the random
\(s\) and the just-calculated hash value \(e\). The <em>second</em>
half is <strong>the same thing with the base point \(G\) replaced with
\(\mathbb{H}(L)\), and the pubkey replaced with \(I\) at every
index</strong>. We used a shorthand \(R^{*}\) to mean
\(k_{\pi+1}\mathbb{H}(L)\), because of course we don't
actually <em>know</em> the value \(k_{\pi+1}\).</li>
<li>Calculate the next hash-challenge as \(e_{\pi+2} =
H(m||L||R_{\pi+1}||R^{*}_{\pi+1}||I)\)</li>
<li>Etc...</li>
<li>As with AOS, we can now forge all the remaining indices, wrapping
around the loop, by repeating the above operation, generating a new
random \(s\) at each step, until we get back to the signing index
\(\pi\), when we must calculate \(s_{\pi}\) as: \(s_{\pi}
= k_{\pi} + e_{\pi}x_{\pi}\).</li>
<li>Signature is published as \(\sigma_{L}(m) = (s_0 \ldots
s_{n-1}, e_0, I)\). (As before, if the keyset \(L\) is not
specified in advance, it will have to be published for the
verifier).</li>
</ul>
<p>So what we're doing here is OR(DLEQ(0), DLEQ(1), ..., DLEQ(n-1)). And
as observed, each DLEQ is actually an AND: "AND(I know x for P, x for P
is same as x for P2)". Hence this represents a clever combination of
AND- and OR- of \(\Sigma\)-protocols.</p>
<p><em>On a personal note, when I first saw something of this type (I think it
was Cryptonote, see below), I found it quite bewildering, and I'm sure
I'm not alone! But what partially saved me is having already studied
PoDLE/DLEQ as well as AOS ring sigs, so I could intuit that something
combining the two ideas was going on. I hope the previous painstaking
introductions make it all a lot clearer!</em></p>
<p>Note the key similarities and difference(s) in the published signature,
to the AOS case: you still only need to publish one hash \(e_0\) since
the others are determined by it, but you <strong>must</strong> publish also the key
image \(I\); if another LWW LSAG is published using the same private
key, it will perforce have the same key image, and be recognized as
having come from the same key [without revealing which
key]{style="text-decoration: underline;"}.</p>
<p>The protocol using the LSAG can thus reject a "double-sign", if
desired.</p>
<p>Let's sanity check that we understand the verification algorithm, since
it is slightly different than AOS:</p>
<h3>Verifying LWW LSAG</h3>
<p>Start with the given keyset \(L\), the message \(m\) and the
signature \((s_0 \ldots s_{n-1}, e_0, I)\)</p>
<ul>
<li>Construct \(e_{1} = H(m||L||R_{0}||R^{*}_{0}||I)\)
using \(R_0 = s_0G - e_0 P_0\) and \(R^{*}_{0} = s_0
\mathbb{H}(L) - e_0 I\)</li>
<li>Repeat at each index using the new \(e_j\) until \(e_0\) is
calculated at the last step and verify it matches: \(e_0
\stackrel{?}{=} H(m||L||R_{n-1}||R^{*}_{n-1}||I)\).
Accept if so, reject if not.</li>
</ul>
<p>(with the additional point mentioned: the protocol using the sig scheme
may also reject this as valid if \(I\) has already been used; this
additional protocol step is usually described as "LINK" in the
literature).</p>
<h3>A brief note on the key image</h3>
<p>Make sure you get the difference between this \(\mathbb{H}(L)\) and
the previous \(J\) as per the general DLEQ. In the latter case we can
(and should) choose an arbitrary globally-agreed NUMS point, for example
hashing the standard curve base point \(G\) (with the
"coerce-to-point" technique mentioned). In this case, we have chosen
something that both signer and verifier agree on, as part of the
<strong>setting</strong> of this particular run of the protocol - it's
deterministically tied to the keyset \(L\). The key image \(I\) is
analogous to \(P_2\) in my PoDLE blog post; it's the signer's
"hidden", one-time key.</p>
<p>This changes in the next construction, Back 2015. But first, a few words
on security.</p>
<h2>Security arguments for the LWW LSAG</h2>
<p>The general approach to proving <strong>unforgeability</strong> of this ring
signature is the same as that for the basic Schnorr signature as
described in the previous blog post.</p>
<p>A wrapper around an attacker \(\mathbb{A}\) who we posit to have the
ability to construct a forgery without knowing any private key
\(x_i\), will, as before, have to guess which random oracle query
corresponds to the forgery, and will want to provide two different
"patched" answers to the RO query at that point. As before, there will
be some reduced probability of success due to having to make this kind
of guess, and so the reduction will be even less tight than before.</p>
<p>Also as before, in the EUF-CMA model, we must allow for an arbitrary
number of signing oracle as well as RO queries, which complicates the
statistical analysis considerably, but the basic principles remain the
same. If at some point forgery is successfully achieved twice at the
same index, we will have something like:</p>
<p>\(x_{\pi} =
\frac{s^{*}_{\pi}-s_{\pi}}{e^{*}_{\pi}-e_{\pi}}\)</p>
<p>where the * superscripts indicate the second run, and the
\(e\)-values being the patched RO responses.</p>
<p>And as usual, with appropriate statistical arguments, one can generate a
reduction such that forgery ability with a certain probability \(p\)
thus implies the ability to solve ECDLP with a related probability
\(p'\).</p>
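<p>The extraction step itself is just this arithmetic; here's a five-line sanity check with made-up toy numbers (group order \(q=1019\), "forked" challenges \(e_1, e_2\) standing in for \(e_{\pi}, e^{*}_{\pi}\)):</p>

```python
q = 1019   # toy prime group order

# two accepting transcripts at the forged index, same nonce k,
# different patched challenges: s1 = k + e1*x, s2 = k + e2*x  (mod q)
x, k = 777, 123
e1, e2 = 31, 954
s1 = (k + e1 * x) % q
s2 = (k + e2 * x) % q

# the extractor's computation: x = (s2 - s1) / (e2 - e1)  mod q
x_ext = (s2 - s1) * pow(e2 - e1, -1, q) % q
assert x_ext == x
```
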
<p>For proving <strong>signer ambiguity</strong> - for simplicity, we break this into
two parts. If <em>all</em> of the private keys are known to the attacker (e.g.
by subpoena), then this property completely fails. This is what we
called <strong>culpability</strong>. It's easy to see why - we have the key image as
part of the signature, and that is deterministically reproducible given
the private key. If <em>none</em> of the private keys are known to the
attacker, the problem is reduced to the <strong>solution of the <a href="https://en.wikipedia.org/wiki/Decisional_Diffie%E2%80%93Hellman_assumption">Decisional
Diffie Hellman
Problem</a></strong>,
which is considered computationally hard. The reduction is quite
complicated, but as in a standard zero knowledgeness proof, the idea is
that a Simulator can generate a transcript that's statistically
indistinguishable from a genuine transcript.</p>
<p>For proving <strong>linkability</strong> - in the LWW paper an argument is made that
this reduces to ECDLP in more or less the same way as for the
unforgeability argument, using two pairs of transcripts for two
different signatures which are posited to be based on the same private
key but having different key images. Examination of the two pairs of
transcripts allows one to deduce that the private key in the two cases
are the same, else ECDLP is broken.</p>
<p>Notice that these security arguments are [much more complicated than for
the single Schnorr signature case]{style="text-decoration: underline;"}
and perhaps for two distinct reasons: one, because the ring signature is
a more complex algebraic construction, with more degrees of freedom, but
also, because we are asking for a significantly richer set of properties
to hold. In particular notice that even for unforgeability, the EUF-CMA
description is not good enough (we've already discussed this a bit); we
need to consider what happens when creating multiple signatures on
different keysets and how they overlap. Signer anonymity/ambiguity is
especially difficult for LWW and its successors (see below), because
by design it has been weakened (culpability).</p>
<h2>Back 2015; compression, single-use</h2>
<p><em>This is a good place to note that the constructions starting with LWW
are described in some detail in the useful document
<a href="https://www.getmonero.org/library/Zero-to-Monero-1-0-0.pdf">Zero-To-Monero</a></em>.</p>
<p>Adam Back
<a href="https://bitcointalk.org/index.php?topic=972541.msg10619684#msg10619684">posted</a>
in 2015 on bitcointalk about a potential space saving over the
cryptonote ring signature, based on using AOS and tweaking it to include
a key image.</p>
<p>As was noted above, it's a space saving of asymptotically about 50% to
use a scheme like AOS that only requires publication of one hash
challenge as opposed to one for each index (like the CDS for example).</p>
<p>He then followed up noting that a very similar algorithm had already
been published, namely the LWW we've just described in the above, and
moreover it was published three years before Fujisaki-Suzuki that was
the basis of cryptonote (see below). So it was <em>somewhat</em> of an
independent re-discovery, but there is a significant tweak. I'll
outline the algorithm below; it'll look very similar to LWW LSAG, but
there's a difference.</p>
<h3>Signing Back-LSAG</h3>
<ul>
<li>Define key image \(I =x_{\pi}\mathbb{H}(P_{\pi})\);</li>
<li>Set \(k_{\pi} \stackrel{\$}{\leftarrow} \mathbb{Z}_N\)</li>
<li>Form the hash-challenge at the next index: \(e_{\pi+1} =
H(m||k_{\pi}G||k_{\pi}\mathbb{H}(P_{\pi}))\)</li>
<li>Pick \(s_{\pi+1} \stackrel{\$}{\leftarrow} \mathbb{Z}_N\);
then:</li>
<li>\(R_{\pi+1} = s_{\pi+1}G - e_{\pi+1}P_{\pi+1}\) and
\(R^{*}_{\pi+1} = s_{\pi+1}\mathbb{H}(P_{\pi+1}) -
e_{\pi+1}I\)</li>
<li>Calculate the next hash-challenge as \(e_{\pi+2} =
H(m||R_{\pi+1}||R^{*}_{\pi+1})\)</li>
<li>Etc...</li>
<li>As with AOS and LWW, we can now forge all the remaining indices,
wrapping around the loop, by repeating the above operation,
generating a new random \(s\) at each step, until we get back to
the signing index \(\pi\), when we must calculate \(s_{\pi}\)
as: \(s_{\pi} = k_{\pi} + e_{\pi}x_{\pi}\).</li>
<li>Signature is published as \(\sigma_{L}(m) = (s_0 \ldots
s_{n-1}, e_0, I)\), as in LWW (\(L\) being the set of \(P\)s).</li>
</ul>
<p>Verification for this is near-identical to that for LWW, so is left as an
exercise for the reader.</p>
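<p>To make the ring structure concrete, here is a toy sketch of the signing and
verification logic above in Python. A small multiplicative group (the order-\(q\)
subgroup of squares mod a safe prime) stands in for the elliptic curve group, and
<code>Hp</code> is an illustrative stand-in for a real hash-to-point function; none of this is
production code, and all names are my own.</p>
<div class="highlight"><pre>

```python
import hashlib
import random

# Toy group standing in for the EC group: the order-q subgroup of squares
# in Z_p^*, with p = 2q + 1 a safe prime. Illustrative only.
p, q, g = 2039, 1019, 4

def Hs(*args):
    """Hash to a scalar mod q (the scheme's H)."""
    data = "|".join(str(a) for a in args).encode()
    return int(hashlib.sha256(data).hexdigest(), 16) % q

def Hp(P):
    """Toy 'hash to group element': squaring forces the result into
    the order-q subgroup (a real scheme needs a proper hash-to-point)."""
    return pow(Hs("Hp", P) + 2, 2, p)

def lsag_sign(msg, ring, x, pi):
    """ring: list of pubkeys; x: private key of ring[pi]."""
    n = len(ring)
    I = pow(Hp(ring[pi]), x, p)           # key image: depends on the key only
    s, e = [0] * n, [0] * n
    k = random.randrange(1, q)
    e[(pi + 1) % n] = Hs(msg, pow(g, k, p), pow(Hp(ring[pi]), k, p))
    j = (pi + 1) % n
    while j != pi:                        # 'forge' around the loop
        s[j] = random.randrange(1, q)
        R  = pow(g, s[j], p) * pow(ring[j], q - e[j], p) % p       # sG - eP
        Rs = pow(Hp(ring[j]), s[j], p) * pow(I, q - e[j], p) % p   # sH(P) - eI
        e[(j + 1) % n] = Hs(msg, R, Rs)
        j = (j + 1) % n
    s[pi] = (k + e[pi] * x) % q           # close the ring at the true signer
    return s, e[0], I

def lsag_verify(msg, ring, sig):
    s, e0, I = sig
    e = e0
    for j in range(len(ring)):
        R  = pow(g, s[j], p) * pow(ring[j], q - e, p) % p
        Rs = pow(Hp(ring[j]), s[j], p) * pow(I, q - e, p) % p
        e = Hs(msg, R, Rs)
    return e == e0                        # the ring of challenges must close

# Demo: sign, verify, and observe key-image linkability across two rings.
random.seed(5)
xs = [random.randrange(1, q) for _ in range(4)]
ring1 = [pow(g, x, p) for x in xs]
sig1 = lsag_sign(b"tx1", ring1, xs[2], 2)
assert lsag_verify(b"tx1", ring1, sig1)
ring2 = [ring1[2]] + [pow(g, random.randrange(1, q), p) for _ in range(3)]
sig2 = lsag_sign(b"tx2", ring2, xs[2], 0)
assert sig1[2] == sig2[2]   # same key, different ring: same key image
```

</pre></div>
<p>The last assertion is the whole point of the tweak discussed next: the key image
is a function of the key alone, so the same key signing in two entirely different
rings is detectable.</p>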
<h3>What's the difference, and what's the purpose?</h3>
<p>The tweak - which is very similar to Cryptonote (makes sense as it was
an attempt to improve that) - is basically this: by making each of the
signatures in the shifted base point version symmetrical (example:
\(s_2 \mathbb{H}(P_2) = k_2 \mathbb{H}(P_2) + e_2 I\)), it means
that a key image will be valid <em>independent of the set of public keys,
\(L\).</em> This is crucial in a cryptocurrency application - we need the
key image to be a unique double spend signifier across many different
ring signatures with different keysets - the keys are ephemeral and
change between transactions.</p>
<p>So it's a blend of the LWW LSAG, which has the advantage of space
compaction for the same reason as AOS - only one hash must be published,
the others can be deduced from the ring structure - with the
F-S-2007/Cryptonote design, which fixes the key image to the key and not
just the specific ring.</p>
<p>However I have to here leave open whether the security arguments of LWW
carry across to this case. I note that the original description did
<em>not</em> include the keyset in the hash challenge (notice absence of
\(L\)); but see the note on MLSAG below.</p>
<h2>Fujisaki-Suzuki 2007 and Cryptonote</h2>
<p><a href="https://cryptonote.org/whitepaper.pdf">Cryptonote</a>
was adapted from a paper of <a href="https://eprint.iacr.org/2006/389.pdf">Fujisaki and
Suzuki</a>
describing an alternate version of a linkable (here "traceable") ring
signature, in 2007. We won't dwell on these constructions here (except
inasmuch as we referred to them above), as they provide the same
linkability function as the above LSAG, but are less compact. Instead,
in the final section, I'll describe how Monero has applied LWW LSAG and
the Back LSAG to their specific requirements.</p>
<h2>Monero MLSAG</h2>
<p>For anyone paying close attention all the way through, there will be
nothing surprising here!</p>
<p>For a cryptocurrency, we build transactions consisting of multiple
inputs. Each input in Monero's case uses a ring signature, rather than
a single signature, to authenticate the transfer, referring back to
multiple pubkeys possessing coins as outputs of earlier transactions.</p>
<p>So here we need <strong>one ring signature per input</strong>. Moreover, per normal
transaction logic, we obviously need <em>all</em> of those ring signatures to
successfully verify. So this is another case for the "AND of
\(\Sigma\)-protocols". We just run \(M\) cases of Back's LSAG and
combine them with a single \(e\) hash challenge at each key index (so
the hash challenge kind of "spans over the inputs"). Additionally,
note that the hash challenge here is assumed to include the keyset with
a generic \(L\) (limiting tiresome subscripting to a minimum...).</p>
<p>To sign \(M\) inputs each of which have \(n\) keys:</p>
<ul>
<li>For each input, define key image \(I_i
=x_{i,\pi}\mathbb{H}(P_{i,\pi}) \ \forall i \in 0 \ldots
M-1\);</li>
<li>Set \(k_{i, \pi} \stackrel{\$}{\leftarrow} \mathbb{Z}_N \
\forall i \in 0 \ldots M-1\)</li>
<li>Form the hash-challenge at the next index: \(e_{\pi+1} =
H(m||L||k_{0, \pi}G||k_{0,
\pi}\mathbb{H}(P_{0,\pi})||k_{1, \pi}G||k_{1,
\pi}\mathbb{H}(P_{1,\pi}) ...)\)</li>
<li>Pick \(s_{i, \pi+1} \stackrel{\$}{\leftarrow} \mathbb{Z}_N\
\forall i \in 0 \ldots M-1\); then:</li>
<li>\(R_{i, \pi+1} = s_{i, \pi+1}G - e_{\pi+1}P_{i, \pi+1}\)
and \(R^{*}_{i, \pi+1} = s_{i, \pi+1}\mathbb{H}(P_{i,
\pi+1}) - e_{\pi+1}I_i \ \forall i \in 0 \ldots M-1\)</li>
<li>Calculate the next hash-challenge as \(e_{\pi+2} =
H(m||L||R_{0, \pi+1}||R^{*}_{0,\pi+1}||R_{1,
\pi+1}||R^{*}_{1,\pi+1} ...)\)</li>
<li>Etc...</li>
<li>Logic as for AOS, LWW but duplicated at every input with single
\(e\)-challenge, and at signing index for all inputs (\(\pi\)):
\(s_{i, \pi} = k_{i, \pi} + e_{\pi}x_{i, \pi}\ \forall
i \in 0 \ldots M-1\).</li>
<li>Signature is published as \(\sigma_{L}(m) = (s_{0,0} \ldots
s_{0,n-1}, \ldots, s_{M-1,0} \ldots s_{M-1,n-1}, e_0, I_0
\ldots I_{M-1})\).</li>
</ul>
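<p>The "single \(e\) spanning the inputs" idea can be sketched directly, extending
the toy group used earlier (a multiplicative group in place of the EC group,
an illustrative <code>Hp</code> in place of a real hash-to-point; this is my own sketch,
not Monero's code):</p>
<div class="highlight"><pre>

```python
import hashlib
import random

# Toy group: order-q subgroup of Z_p^*, p = 2q + 1 a safe prime.
p, q, g = 2039, 1019, 4

def Hs(*args):
    data = "|".join(str(a) for a in args).encode()
    return int(hashlib.sha256(data).hexdigest(), 16) % q

def Hp(P):
    return pow(Hs("Hp", P) + 2, 2, p)

def mlsag_sign(msg, rings, xs, pi):
    """rings[i][j]: key j of input i; xs[i]: signer's key for input i,
    sitting at the same index pi in every ring."""
    M, n = len(rings), len(rings[0])
    L = str(rings)                            # keyset included in the hash
    I = [pow(Hp(rings[i][pi]), xs[i], p) for i in range(M)]
    s = [[0] * n for _ in range(M)]
    e = [0] * n
    ks = [random.randrange(1, q) for _ in range(M)]
    parts = [v for i in range(M)
             for v in (pow(g, ks[i], p), pow(Hp(rings[i][pi]), ks[i], p))]
    e[(pi + 1) % n] = Hs(msg, L, *parts)      # one challenge spans all inputs
    j = (pi + 1) % n
    while j != pi:
        parts = []
        for i in range(M):
            s[i][j] = random.randrange(1, q)
            parts.append(pow(g, s[i][j], p) * pow(rings[i][j], q - e[j], p) % p)
            parts.append(pow(Hp(rings[i][j]), s[i][j], p)
                         * pow(I[i], q - e[j], p) % p)
        e[(j + 1) % n] = Hs(msg, L, *parts)
        j = (j + 1) % n
    for i in range(M):                        # the single e closes every ring
        s[i][pi] = (ks[i] + e[pi] * xs[i]) % q
    return s, e[0], I

def mlsag_verify(msg, rings, sig):
    s, e0, I = sig
    M, n = len(rings), len(rings[0])
    L = str(rings)
    e = e0
    for j in range(n):
        parts = []
        for i in range(M):
            parts.append(pow(g, s[i][j], p) * pow(rings[i][j], q - e, p) % p)
            parts.append(pow(Hp(rings[i][j]), s[i][j], p)
                         * pow(I[i], q - e, p) % p)
        e = Hs(msg, L, *parts)
    return e == e0

random.seed(1)
M, n, pi = 2, 3, 1
secrets = [[random.randrange(1, q) for _ in range(n)] for _ in range(M)]
rings = [[pow(g, x, p) for x in row] for row in secrets]
sig = mlsag_sign(b"tx", rings, [secrets[i][pi] for i in range(M)], pi)
assert mlsag_verify(b"tx", rings, sig)
```

</pre></div>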
<p>Note:</p>
<p>(1) This algorithm as described requires each input to have the genuine
signer at the same key-index in the set of pubkeys for each input, which
is a limitation.</p>
<p>(2) Monero has implemented Confidential Transactions, and this is
folded in with the above into a new design which seems to have two
variants "RingCTFull" and "RingCTSimple". This can be investigated
further in the documents on RingCT as referenced in the previously
mentioned
<a href="https://www.getmonero.org/library/Zero-to-Monero-1-0-0.pdf">ZeroToMonero</a>.</p>Liars, cheats, scammers and the Schnorr signature2019-02-01T00:00:00+01:002019-02-01T00:00:00+01:00Adam Gibsontag:joinmarket.me,2019-02-01:/blog/blog/liars-cheats-scammers-and-the-schnorr-signature/<p>security arguments for Schnorr</p><h3>Liars, cheats, scammers and the Schnorr signature</h3>
<p>How sure are <em>you</em> that the cryptography underlying Bitcoin is secure?
With regard to one future development of Bitcoin's crypto, in
discussions in public fora, I have more than once confidently asserted
"well, but the Schnorr signature has a security reduction to ECDLP".
Three comments on that before we begin:</p>
<ul>
<li>If you don't know what "reduction" means here, fear not, we will
get deeply into this here.</li>
<li>Apart from simply <em>hearing</em> this and repeating it, I was mostly
basing this on a loose understanding that "it's kinda related to
the soundness proof of a sigma protocol" which I discussed in my
<a href="https://github.com/AdamISZ/from0k2bp">ZK2Bulletproofs</a>
paper, which is true - but there's a lot more involved.</li>
<li>The assertion is true, but there are caveats, as we will see. And
Schnorr is different from ECDSA in this regard, as we'll also see,
at the end.</li>
</ul>
<p>But why write this out in detail? It actually came sort of out of left
field. Ruben Somsen was asking on slack about some aspect of Monero, I
forget, but it prompted me to take another look at those and other ring
signatures, and I realised that attempting to understand the
<strong>security</strong> of those more complex constructions is a non-starter unless
you <strong>really understand why we can say "Schnorr is secure" in the
first place</strong>.</p>
<h3>Liars and cheats</h3>
<p>The world of "security proofs" in cryptography appears to be a set of
complex stories about liars - basically made up magic beans algorithms
that <em>pretend</em> to solve things that nobody <em>actually</em> knows how to
solve, or someone placing you in a room and resetting your clock
periodically and pretending today is yesterday - and cheats, like
"let's pretend the output of the hash function is \(x\), because it
suits my agenda for it to be \(x\)" (at least in this case the lying
is consistent - the liar doesn't change his mind about \(x\); that's
something!).</p>
<p>I hope that sounds crazy, it mostly really is :)</p>
<p>(<em>Concepts I am alluding to include: the random oracle, a ZKP simulator,
extractor/"forking", an "adversary" etc. etc.</em>)</p>
<h2>Preamble: the reluctant Satoshi scammer</h2>
<p>The material of this blog post is pretty abstract, so I decided to spice
it up by framing it as some kind of sci-fi :)</p>
<p><img alt="" src="https://web.archive.org/web/20200428212652im_/https://joinmarket.me/static/media/uploads/cube-250082_6402.png"></p>
<p>Imagine you have a mysterious small black cube which you were given by
an alien that has two slots you can plug into to feed it input data and
another to get output data, but you absolutely can't open it (so like
an Apple device, but more interoperable), and it does one thing only,
but that thing is astonishing: if you feed it a message and a <strong>public</strong>
key in its input slot, then it'll <em>sometimes</em> spit out a valid Schnorr
signature on that message.</p>
<p>Well in 2019 this is basically useless, but after considerable
campaigning (mainly by you, for some reason!), Schnorr is included into
Bitcoin in late 2020. Delighted, you start trying to steal money but it
proves to be annoying.</p>
<p>First, you have to know the public key, so the address must be reused or
something similar. Secondly (and this isn't a problem, but is weird and
will become relevant later): the second input slot is needed to pass the
values of the hash function sha2 (or whatever is the right one for our
signature scheme) into the black box for any data it needs to hash. Next
problem: it turns out that the device only works if you feed it a few
<em>other</em> signatures of other messages on the same public key, first.
Generally speaking, you don't have that. Lastly, it doesn't <em>always</em>
work for any message you feed into it (you want to feed in 'messages'
which are transactions paying you money), only sometimes.</p>
<p>With all these caveats and limitations, you fail to steal any money at
all, dammit!</p>
<p>Is there anything else we can try? How about we pretend to be someone
else? Like Satoshi? Hmm ...</p>
<p>For argument's sake, we'll assume that people use the Schnorr Identity
Protocol (henceforth SIDP), which can be thought of as "Schnorr
signature without the message, but with an interactive challenge".
We'll get into the technicals below, for now note that a signature
doesn't prove anything about identity (because it can be passed
around), you need an interactive challenge, a bit like saying "yes,
give me a signature, but *I* choose what you sign".</p>
<p>So to get people to believe I'm Satoshi (and thus scam them into
investing in me perhaps? Hmm sounds familiar ...) I'm going to somehow
use this black box thing to successfully complete a run of SIDP. But as
noted it's unreliable; I'll need a bunch of previous signatures
(let's pretend that I get that somehow), but I *also* know this thing
doesn't work reliably for every message, so the best I can do is
probably to try to <strong>scam 1000</strong> <strong>people simultaneously</strong>. That way
they might reasonably believe that their successful run represents
proof; after all it's supposed to be <em>impossible</em> to create this kind
of proof without having the private key - that's the entire idea of it!
(the fact that it failed for other people could be just a rumour, after
all!)</p>
<p>So it's all a bit contrived, but weirder scams have paid off - and they
didn't even use literally alien technology!</p>
<p>So, we'll need to read the input to our hash function slot from the
magic box; it's always of the form:</p>
<p><code>message || R-value</code></p>
<p>... details to follow, but basically \(R\) is the starting value in
the SIDP, so we pass it to our skeptical challenger(s). They respond
with \(e\), intended to be completely random to make our job of
proving ID as hard as possible, then <strong>we trick our black box</strong> - we
don't return SHA2(\(m||R\)) but instead we return \(e\). More on
this later, see "random oracle model" in the below. Our magic box
outputs, if successful, \(R, s\) where \(s\) is a new random-looking
value. The challenger will be amazed to see that:</p>
<p>\(sG = R + eP_{satoshi}\)</p>
<p>is true!! And the millions roll in.</p>
<p>If you didn't get in detail how that scam operated, don't worry,
we're going to unpack it, since it's the heart of our technical story
below. The crazy fact is that <strong>our belief that signatures like the
Schnorr signature (and ECDSA is a cousin of it) are secure mostly relies on
basically the argument above.</strong></p>
<p>But 'mostly' is an important word there: what we actually do, to make
the argument that it's secure, is stack that argument on top of at
least 2 other arguments of a similar nature (using one algorithm as a
kind of 'magic black box' and feeding it as input to a different
algorithm) and to relate the digital signature's security to the
security of something else which ... we <em>think</em> is secure, but don't
have absolute proof.</p>
<p>Yeah, really.</p>
<p>We'll see that our silly sci-fi story has <em>some</em> practical reality to
it - it really <em>is</em> true that to impersonate is a bit more practically
feasible than to extract private keys, and we can even quantify this
statement, somewhat.</p>
<p>But not the magic cube part. That part was not real at all, sorry.</p>
<h2>Schnorr ID Protocol and signature overview</h2>
<p>I have explained SIDP with reference to core concepts of Sigma Protocols
and Zero Knowledge Proofs of Knowledge in Section 3.2
<a href="https://github.com/AdamISZ/from0k2bp">here</a>
. A more thorough explanation can be found in lots of places, e.g.
Section 19.1 of <a href="https://crypto.stanford.edu/~dabo/cryptobook/">Boneh and
Shoup</a>.
Reviewing the basic idea, cribbing from my own doc:</p>
<p>Prover \(\mathbf{P}\) starts with a public key \(P\) and a
corresponding private key \(x\) s.t. \(P = xG\).</p>
<p>\(\mathbf{P}\) wishes to prove in zero knowledge, to verifier
\(\mathbf{V}\), that he knows \(x\).</p>
<p>\(\mathbf{P}\) → \(\mathbf{V}\): \(R\) (a new random curve
point, but \(\mathbf{P}\) knows \(k\) s.t. \(R = kG\))</p>
<p>\(\mathbf{V}\) → \(\mathbf{P}\): \(e\) (a random scalar)</p>
<p>\(\mathbf{P}\) → \(\mathbf{V}\): \(s\) (which \(\mathbf{P}\)
calculated from the equation \(s = k + ex\))</p>
<p>Note: the transcript of the conversation would here be: \((R, e,
s)\).</p>
<p>Verification works fairly trivially: the verifier checks \(sG
\stackrel{?}{=} R+eP\). See previously mentioned doc for details on
why this is supposedly <em>zero knowledge</em>, that is to say, the verifier
doesn't learn anything about the private key from the procedure.</p>
<p>As to why it's sound - why it really proves that the Prover knows
\(x\) - see the same doc, but in brief: if we can convince the prover
to re-run the third step with a modified second step (but the same first
step!), then he'll be producing a second signature \(s'\) on a
second random \(e'\), but with the same \(k\) and \(R\), thus:</p>
<p>\(x = \frac{s-s'}{e-e'}\)</p>
<p>So we say it's "sound" in the specific sense that only a
knower-of-the-secret-key can complete the protocol. But more on this
shortly!</p>
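<p>The extraction step is mechanical enough to run directly. A toy sketch
(same caveat as everywhere in this post: a small multiplicative group stands
in for the EC group; requires Python 3.8+ for the modular inverse via
<code>pow</code>):</p>
<div class="highlight"><pre>

```python
import random

# Toy group: order-q subgroup of Z_p^*, p = 2q + 1 (stand-in for the EC group)
p, q, g = 2039, 1019, 4

random.seed(0)
x = random.randrange(1, q)       # the prover's secret key
P = pow(g, x, p)

# The prover commits once ...
k = random.randrange(1, q)
R = pow(g, k, p)

# ... but we rewind it and issue two distinct challenges on the same R:
e1, e2 = 17, 23
s1 = (k + e1 * x) % q
s2 = (k + e2 * x) % q

# Extractor: x = (s - s') / (e - e') mod q
x_extracted = (s1 - s2) * pow(e1 - e2, -1, q) % q
assert x_extracted == x
```

</pre></div>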
<p>What about the famous "Schnorr signature"? It's just a
noninteractive version of the above. There is btw a brief summary in
<a href="https://web.archive.org/web/20200428212652/https://joinmarket.me/blog/blog/flipping-the-scriptless-script-on-schnorr/">this</a>
earlier blog post, also. Basically replace \(e\) with a hash (we'll
call our hash function \(H\)) of the commitment value \(R\) and the
message we want to sign \(m\):</p>
<p>\(e = H(m||R)\)</p>
<p>; as mentioned in the just-linked blog post, it's also possible to add
other stuff to the hash, but these two elements at least are necessary
to make a sound signature.</p>
<p>As was noted in the 80s by <a href="https://link.springer.com/content/pdf/10.1007%2F3-540-47721-7_12.pdf">Fiat and
Shamir</a>,
this transformation is generic to any zero-knowledge identification
protocol of the "three pass" or sigma protocol type - just use a hash
function to replace the challenge with H(message, commitment) to create
the new signature scheme.</p>
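<p>In code, the Fiat-Shamir transform of the SIDP is only a few lines. A
sketch under the usual toy assumptions (multiplicative group in place of
the EC group, sha256 as \(H\)):</p>
<div class="highlight"><pre>

```python
import hashlib
import random

p, q, g = 2039, 1019, 4   # toy order-q subgroup of Z_p^*, not a real EC group

def H(m, R):
    return int(hashlib.sha256(f"{m}|{R}".encode()).hexdigest(), 16) % q

def schnorr_sign(m, x):
    k = random.randrange(1, q)
    R = pow(g, k, p)
    e = H(m, R)                # the hash replaces the interactive challenge
    return R, (k + e * x) % q

def schnorr_verify(m, P, sig):
    R, s = sig
    e = H(m, R)
    return pow(g, s, p) == R * pow(P, e, p) % p   # sG = R + eP, multiplicatively

random.seed(0)
x = random.randrange(1, q)
P = pow(g, x, p)
assert schnorr_verify("pay 1 coin", P, schnorr_sign("pay 1 coin", x))
```

</pre></div>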
<p>Now, if we want to discuss security, we first have to decide what that
even means, for a signature scheme. Since we're coming at things from a
Bitcoin angle, we're naturally focused on preventing two things:
forgery and key disclosure. But really it's the same for any usage of
signatures. Theorists class security into at least three types (usually
more, these are the most elementary classifications):</p>
<ul>
<li>Total break</li>
<li>Universal forgery</li>
<li>Existential forgery</li>
</ul>
<p>(Interesting historical note: this taxonomy is due to Goldwasser, Micali
and Rivest - the first two of whom (with Rackoff) introduced the revolutionary
notion of a "Zero Knowledge Proof" in the 1980s.)</p>
<p>Total break means key disclosure. To give a silly example: if \(k=0\)
in the above, then \(s = ex\) and, on receipt of \(s\), the verifier
could simply multiply it by the modular inverse of \(e\) to extract
the private key \(x\). A properly random \(k\) value, or 'nonce',
as explained ad nauseam elsewhere, is critical to the security. Since
this is the worst possible security failure, being secure against it is
considered the weakest notion of "security" (note this kind of
"reverse" reasoning, it is very common and important in this field).</p>
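<p>To make the \(k=0\) disaster concrete (toy group again; any verifier can
perform this recovery):</p>
<div class="highlight"><pre>

```python
# Toy group as before; if the nonce k is 0, then s = e*x and x falls out.
p, q, g = 2039, 1019, 4
x = 777                        # the 'secret' key
P = pow(g, x, p)
e = 101                        # the challenge (or hash output)
s = (0 + e * x) % q            # prover fatally used k = 0
x_recovered = s * pow(e, -1, q) % q   # multiply by the modular inverse of e
assert x_recovered == x
```

</pre></div>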
<p>The next weakest notion of security would be security against universal
forgery - the forger should not be able to generate a signature on any
message they are given. We won't mention this too much; we will focus
on the next, stronger notion of "security":</p>
<p>"Security against existential forgery under adaptive chosen message
attack", often shortened to EUF-CMA for sanity (the 'adaptive(ly)'
sometimes seems to be dropped, i.e. understood), is clearly the
strongest notion out of these three, and papers on this topic generally
focus on proving this. "Chosen message" here refers to the idea that
the attacker even gets to choose <em>what</em> message he will generate a
verifying forgery for; with the trivial restriction that it can't be a
message that the genuine signer has already signed.</p>
<p>(A minor point: you can also make this definition more precise with
SUF-CMA (S = "strongly"), where the attacker wins even by producing a
<em>different</em> signature on a message the genuine signer has already
signed - only the (message, signature) pair must be new. The famous problem of <strong>signature
malleability</strong> experienced in ECDSA/Bitcoin relates to this, as noted by
Matt Green
<a href="https://blog.cryptographyengineering.com/euf-cma-and-suf-cma/">here</a>.)</p>
<p>I believe there are even stronger notions (e.g. involving active
attacks) but I haven't studied this.</p>
<p>In the next, main section of this post, I want to outline how
cryptographers try to argue that both the SIDP and the Schnorr signature
are secure (in the latter case, with that strongest notion of security).</p>
<h2>Why the Schnorr signature is secure</h2>
<h3>Why the SIDP is secure</h3>
<p>Here, almost by definition, we can see that only the notion of "total
break" makes sense: there is no message, just an assertion of key
ownership. In the context of SIDP this is sometimes called the
"impersonation attack" for obvious reasons - see our reluctant
scammer.</p>
<p>The justification of this is somehow elegantly and intriguingly short:</p>
<blockquote>
<p>The SIDP is secure against impersonation = The SIDP is <em>sound</em> as a
ZKPOK.</p>
</blockquote>
<p>You can see that these are just two ways of saying the same thing. But
what's the justification that either of them are true? Intuitively the
soundness proof tries to isolate the Prover as a machine/algorithm and
screw around with its sequencing, in an attempt to force it to spit out
the secret that we believe it possesses. If we hypothesise an adversary
\(\mathbb{A}\) who <em>doesn't</em> possess the private key to begin with,
or more specifically, one that can pass the test of knowing the key for
any public key we choose, we can argue that there's only one
circumstance in which that's possible: <strong>if \(\mathbb{A}\) can solve
the general Elliptic Curve Discrete Logarithm Problem(ECDLP) on our
curve.</strong> That's intuitively <em>very</em> plausible, but can we prove it?</p>
<h3>Reduction</h3>
<p>(One of a billion variants on the web, taken from
<a href="https://jcdverha.home.xs4all.nl/scijokes/6_2.html">here</a>
:))</p>
<blockquote>
<p>A mathematician and a physicist were asked the following question:</p>
<p><em>Suppose you walked by a burning house and saw a hydrant and
a hose not connected to the hydrant. What would you do?</em></p>
<p>P: I would attach the hose to the hydrant, turn on the water, and put out
the fire.</p>
<p>M: I would attach the hose to the hydrant, turn on the water, and put out
the fire.</p>
<p>Then they were asked this question:</p>
<p><em>Suppose you walked by a house and saw a hose connected to
a hydrant. What would you do?</em></p>
<p>P: I would keep walking, as there is no problem to solve.</p>
<p>M: I would disconnect the hose from the hydrant and set the house on fire,
reducing the problem to a previously solved form.</p>
</blockquote>
<p>The general paradigm here is:</p>
<blockquote>
<p>A protocol X is "reducible to" a hardness assumption Y if a
hypothetical adversary \(\mathbb{A}\) who can break X can also
violate Y.</p>
</blockquote>
<p>In the concrete case of X = SIDP and Y = ECDLP we have nothing to do,
since we've already done it. SIDP is intrinsically a test that's
relying on ECDLP; if you can successfully impersonate (i.e. break SIDP)
on any given public key \(P\) then an "Extractor" which we will now
call a <strong>wrapper</strong>, acting to control the environment of
\(\mathbb{A}\) and running two executions of the second half of the
transcript, as already described above, will be able to extract the
private key/discrete log corresponding to \(P\). So we can think of
that Extractor itself as a machine/algorithm which spits out the \(x\)
after being fed in the \(P\), in the simple case where our
hypothetical adversary \(\mathbb{A}\) is 100% reliable. In this
specific sense:</p>
<blockquote>
<p><strong>SIDP is reducible to ECDLP</strong></p>
</blockquote>
<p>However, in the real world of cryptographic research, such an analysis
is woefully inadequate; because to begin with ECDLP being "hard" is a
computational statement: if the group of points on the curve is only of
order 101, it is totally useless since it's easy to compute all
discrete logs by brute force. So, if ECDLP is "hard" on a group of
size \(2^k\), let's say its hardness is measured as the probability
of successfully cracking by guessing, i.e. \(2^{-k}\) (here
<strong>deliberately avoiding</strong> the real measure based on smarter than pure
guesses, because it's detail that doesn't affect the rest). Suppose
\(\mathbb{A}\) has a probability of success \(\epsilon\); what
probability of success does that imply in solving ECDLP, in our
"wrapper" model? Is it \(\epsilon\)?</p>
<p>No; remember the wrapper had to actually extract <strong>two</strong> successful
impersonations in the form of valid responses \(s\) to challenge
values \(e\). We can say that the wrapper <strong>forks</strong> \(\mathbb{A}\):</p>
<p><img alt="Fork your sigma protocol if you want
fork" src="https://web.archive.org/web/20200428212652im_/https://joinmarket.me/static/media/uploads/.thumbnails/forking.png/forking-659x466.png"></p>
<p><em>Fork your sigma protocol if you want fork</em></p>
<p>Crudely, the success probability is \(\epsilon^2\); both of those
impersonations have to be successful, so we multiply the probabilities.
(More exact: by a subtle argument we can see that the size of the
challenge space being reduced by 1 for the second run of the protocol
implies that the probability of success in that second run is reduced,
and the correct formula is \(\epsilon^2 - \frac{\epsilon}{n}\),
where \(n\) is the size of the hash function output space; obviously
this doesn't matter too much).</p>
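<p>The "multiply the probabilities" claim is easy to simulate crudely. A toy
model: here the two forked runs are treated as fully independent coin flips,
which ignores the \(\epsilon/n\) correction just discussed:</p>
<div class="highlight"><pre>

```python
import random

random.seed(7)
eps = 0.3                      # adversary's standalone success probability
trials = 20000

# The wrapper wins only if *both* forked runs of the adversary succeed.
wins = sum(1 for _ in range(trials)
           if random.random() < eps and random.random() < eps)
estimate = wins / trials
print(estimate)                # close to eps**2 = 0.09
```

</pre></div>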
<p>How does this factor into a real world decision? We have to go back to
the aforementioned "reverse thinking". The reasoning is something
like:</p>
<ul>
<li>We believe ECDLP is hard for our group, let's say we think you
can't do better than success probability \(p\) (I'll ignore running time and
just use probability of success as a measure, for simplicity).</li>
<li>The above reduction implies that <em>if</em> we can break SIDP with prob
\(\epsilon\), we can also break ECDLP with prob \(\simeq
\epsilon^2\).</li>
<li>This reduction is thus <strong>not tight</strong> - if it's really the case that
"the way to break SIDP is only to break ECDLP" then a certain
hardness \(p\) only implies a hardness \(\sqrt{p}\) for SIDP,
which we may not consider sufficiently improbable (remember that if
\(p=2^{-128}\), it means halving the number of bits: \(\sqrt{p}
=2^{-64}\)). See
<a href="https://crypto.stackexchange.com/questions/14439/proofs-by-reduction-and-times-of-adversaries">here</a>
for a nice summary on "non-tight reductions".</li>
<li>And <em>that</em> implies that if I want 128 bit security for my SIDP, I
need to use 256 bits for my ECDLP (so my EC group, say). This is all
handwavy but you get the pattern: these arguments are central to
deciding what security parameter is used for the underlying hardness
problem (here ECDLP) when it's applied in practice to a specific
protocol (here SIDP).</li>
</ul>
<p>I started this subsection on "reductions" with a lame math joke; but I
hope you can see how delicate this all is ... we start with something
we believe to be hard, but then "solve" it with a purely hypothetical
other thing (here \(\mathbb{A}\) ), and from this we imply a two-way
connection (I don't say <em>equivalence</em>; it's not quite that) that we
use to make concrete decisions about security. Koblitz (he of the 'k'
in secp256k1) had some interesting thoughts about 'reductionist'
security arguments in Section 2.2 and elsewhere in
<a href="https://cr.yp.to/bib/2004/koblitz.pdf">this</a>
paper. More from that later.</p>
<p>So we have sketched out how to think about "proving our particular SIDP
instance is/isn't secure based on the intractability of ECDLP in the
underlying group"; but that's only 2 stacks in our jenga tower; we
need MOAR!</p>
<h2>From SIDP to Schnorr signature</h2>
<p>So putting together a couple of ideas from previous sections, I hope it
makes sense to you now that we want to prove that:</p>
<blockquote>
<p>"the (EC) Schnorr signature has existential unforgeability against
chosen message attack (EUFCMA) <strong>if</strong> the Schnorr Identity Protocol is
secure against impersonation attacks."</p>
</blockquote>
<p>with the understanding that, if we succeed in doing so, we have proven
also:</p>
<blockquote>
<p>"the (EC) Schnorr signature has existential unforgeability against
chosen message attack (EUFCMA) <strong>if</strong> the Elliptic Curve discrete
logarithm problem is hard in our chosen EC group."</p>
</blockquote>
<p>with the substantial caveat, as per the previous section, that the
reduction involved in making this statement is not tight.</p>
<p>(there is another caveat though - see the next subsection, <em>The Random
Oracle Model</em>).</p>
<p>This second (third?) phase is much less obvious and indeed it can be
approached in more than one way.
<a href="https://crypto.stanford.edu/~dabo/cryptobook/">Boneh-Shoup</a>
deals with it in a lot more detail; I'll use this as an outline but
dumb it down a fair bit. There is a simpler description
<a href="http://web.stanford.edu/class/cs259c/lectures/schnorr.pdf">here</a>.</p>
<p>The "CMA" part of "EUFCMA" implies that our adversary
\(\mathbb{A}\), who we are now going to posit has the magical ability
to forge signatures (so it's the black cube of our preamble), should be
able to request signatures on an arbitrarily chosen set of messages
\(m_i\), with \(i\) running from 1 to some defined number \(S\).
But we must also allow him to make queries to the hash function, which
we idealise as a machine called a "random oracle". Brief notes on that
before continuing:</p>
<h3>Aside: The Random Oracle Model</h3>
<p>Briefly described
<a href="https://en.wikipedia.org/wiki/Random_oracle">here</a>
. It's a simple but powerful idea: we basically idealise how we want a
cryptographic hash function \(f\) to behave. We imagine an output
space for \(f\) of size \(C\). For any given input \(x\) from a
predefined input space of one or more inputs, we will get a
deterministic output \(y\), but it should be unpredictable, so we
imagine that the function is <em>randomly</em> deterministic. Not a
contradiction - the idea is only that there is no <strong>public</strong> law or
structure that allows the prediction of the output without actually
passing it through the function \(f\). The randomness should be
uniform.</p>
<p>In using this in a security proof, we encounter only one problem: we
will usually want to model \(f\) by drawing its output \(y\) from a
uniformly random distribution (you'll see lines like \(y
\stackrel{\$}{\leftarrow} \mathbb{Z}_N\) in papers, indicating
\(y\) is set randomly). But in doing this, we have set the value of
the output for that input \(x\) permanently, so if we call \(f\)
again on the same \(x\), whether by design or accident, we <em>must</em>
again return the same "random" \(y\).</p>
<p>We also find sometimes that in the nature of the security game we are
playing, one "wrapper" algorithm wants to "cheat" another, wrapped
algorithm, by using some hidden logic to decide the "random" \(y\)
at a particular \(x\). This <em>can</em> be fine, because to the "inner"
algorithm it can look entirely random. In this case we sometimes say we
are "<strong>patching the value of the RO at \(x\) to \(y\)"</strong> to
indicate that this artificial event has occurred; as already mentioned,
it's essential to remember this output and respond with it again, if a
query at \(x\) is repeated.</p>
<p>Finally, this "perfectly random" behaviour is very idealised. Not all
cryptographic protocols involving hash functions require this behaviour,
but those that do are said to be "secure in the random oracle model
(ROM)" or similar.</p>
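<p>This "lazily-sampled function plus patching" view of the random oracle can
be written down directly. A toy sketch, with the <code>patch</code> method playing the
role of the "cheat" described above:</p>
<div class="highlight"><pre>

```python
import random

class RandomOracle:
    """A lazily-sampled random function: a fresh uniform output for each
    new input, remembered forever so that repeated queries agree."""
    def __init__(self, outputs):
        self.outputs = outputs
        self.table = {}

    def query(self, x):
        if x not in self.table:
            self.table[x] = random.randrange(self.outputs)
        return self.table[x]

    def patch(self, x, y):
        # The wrapper's 'cheat': fix the output at x. Only safe if x was
        # never queried, else the inner algorithm sees a contradiction.
        if x in self.table:
            raise ValueError("point already fixed")
        self.table[x] = y

ro = RandomOracle(2**16)
ro.patch(("m", "R"), 12345)            # programme the oracle at (m, R)
assert ro.query(("m", "R")) == 12345   # inner algorithm sees a valid 'hash'
assert ro.query(("m", "R")) == 12345   # ... and the same one on repeat queries
```

</pre></div>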
<h3>Wrapping A with B</h3>
<p><img alt="B tries to win the impersonation game against C, by wrapping the
signature forger
A" src="https://web.archive.org/web/20200428212652im_/https://joinmarket.me/static/media/uploads/.thumbnails/EUFCMA1.png/EUFCMA1-584x413.png"></p>
<p>So we now wrap \(\mathbb{A}\) with \(\mathbb{B}\).
And \(\mathbb{B}\)'s job will be to succeed at winning the SIDP
"game" against a challenger \(\mathbb{C}\) .</p>
<p>Now \(\mathbb{A}\) is allowed \(S\) signing queries; given his
messages \(m_i\), we can use \(S\) eavesdropped conversations \(R,
e, s\) from the actual signer (or equivalently, just forge transcripts
- see "zero knowledgeness" of the Schnorr signature), and for each,
\(\mathbb{B}\) can patch up the RO to make these transcripts fit
\(\mathbb{A}\)'s requested messages; just do
\(H(m_i||R_i)=e_i\). Notice that this part of the process represents
\(S\) queries to the random oracle.</p>
<p>Observe that \(\mathbb{B}\) is our real "attacker" here: he's the
one trying to fool/attack \(\mathbb{C}\) 's identification
algorithm; he's just using \(\mathbb{A}\) as a black box (or cube,
as we say). We can say \(\mathbb{A}\) is a "subprotocol" used by
\(\mathbb{B}\).</p>
<p>It's all getting a bit complicated, but by now you should probably have
a vague intuition that this will work, although of course not reliably,
and as a function of the probability of \(\mathbb{A}\) being able to
forge signatures of course (we'll again call this \(\epsilon\)).</p>
<h3>Toy version: \(\epsilon = 1\)</h3>
<p>To aid understanding, imagine the simplest possible case, when
\(\mathbb{A}\) works flawlessly. The key \(P\) is given to him and
he chooses a random \(k, R =kG\), and also chooses his message \(m\)
as is his right in this scenario. The "CMA" part of EUF-CMA is
irrelevant here, since \(\mathbb{A}\) can just forge immediately
without signature queries:</p>
<ul>
<li>\(\mathbb{A}\) asks for the value of \(H(m||R)\), by passing
across \(m,R\) to \(\mathbb{B}\).</li>
<li>\(\mathbb{B}\) receives this query and passes \(R\) as the
first message in SIDP to \(\mathbb{C}\).</li>
<li>\(\mathbb{C}\) responds with a completely random challenge value
\(e\).</li>
<li>\(\mathbb{B}\) "patches" the RO with \(e\) as the output for
input \(m, R\), and returns \(e\) to \(\mathbb{A}\).</li>
<li>\(\mathbb{A}\) takes \(e\) as \(H(m||R)\), and provides a
valid \(s\) as signature.</li>
<li>\(\mathbb{B}\) passes \(s\) through to \(\mathbb{C}\), who
verifies \(sG = R + eP\); identification passed.</li>
</ul>
<p>You can see that nothing here is new except the random oracle patching,
which is trivially non-problematic as we make only one RO query, so
there can't be a conflict. The probability of successful impersonation
is 1.</p>
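<p>The six bullet points above fit in a dozen lines of Python. To make the game actually run, we cheat and hand our stand-in \(\mathbb{A}\) the secret key (a real forger of course wouldn't have it); everything else - the single RO query, the patch, \(\mathbb{C}\)'s verification - is as described, with toy group sizes that are purely illustrative:</p>

```python
import secrets

p, q, g = 23, 11, 2                 # toy Schnorr group: g has order q mod p

x = 1 + secrets.randbelow(q - 1)    # A's secret (our cheat to run the game)
P = pow(g, x, p)

# A chooses k, R and queries H(m||R) by passing (m, R) across to B
k = 1 + secrets.randbelow(q - 1)
R = pow(g, k, p)

# B forwards R as the first SIDP message; C replies with a random challenge
e = secrets.randbelow(q)

# B patches the RO so that H(m||R) = e and hands e back to A;
# A, being a perfect forger, completes the signature
s = (k + e * x) % q

# B passes s to C, who runs the check sG = R + eP (multiplicative form here)
assert pow(g, s, p) == (R * pow(P, e, p)) % p
```

<p>The check passes every run: with \(\epsilon = 1\) and a single RO query there is nothing to go wrong, matching the "probability 1" conclusion above.</p>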
<p>Note that this implies the probability of successfully breaking ECDLP is
also \(\simeq 1\). We just use a second-layer wrapper around
\(\mathbb{B}\), and fork its execution after the provision of
\(R\), providing two separate challenges and thus in each run getting
two separate \(s\) values and solving for \(x\), the private
key/discrete log as has already been explained.</p>
<p>Why \(\simeq\)? As noted in the SIDP to ECDLP reduction above, there
is a tiny probability of a reused challenge value which must be factored
out, but it's of course negligible in practice.</p>
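<p>The extraction step itself is just two linear equations in \(x\). A sketch in the same toy-group style (illustrative sizes only):</p>

```python
import secrets

p, q, g = 23, 11, 2                 # toy group: g of prime order q mod p
x = 1 + secrets.randbelow(q - 1)    # the discrete log we will recover
P = pow(g, x, p)

# Fork after R is fixed: same nonce k, two distinct challenges e1 != e2
k = 1 + secrets.randbelow(q - 1)
e1 = secrets.randbelow(q)
e2 = secrets.randbelow(q)
while e2 == e1:                     # the negligible reused-challenge case
    e2 = secrets.randbelow(q)
s1 = (k + e1 * x) % q
s2 = (k + e2 * x) % q

# s1 - s2 = (e1 - e2) * x mod q, so divide modulo the prime q:
recovered = ((s1 - s2) * pow(e1 - e2, q - 2, q)) % q
assert recovered == x
```

<p>Two valid answers for the same \(R\) give up the key; everything in the rest of the section is about how often we can engineer that situation.</p>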
<p>If we assert that the ECDLP is not trivially broken in reasonable time,
we must also assert that such a powerful \(\mathbb{A}\) does not
exist, given similarly limited time (well; <em>in the random oracle model</em>,
of course...).</p>
<h3>Full CMA case, \(\epsilon \ll 1\)</h3>
<p>Now we give \(\mathbb{A}\) the opportunity to make \(S\) signing
queries (as already mentioned, this is what we mean by an "adaptive
chosen message attack"). The sequence of events will be a little longer
than the previous subsection, but we must think it through to get a
sense of the "tightness of the reduction" as already discussed.</p>
<p>The setup is largely as before: \(P\) is given. There will be \(h\)
RO queries allowed (additional to the implicit ones in the signing
queries).</p>
<ul>
<li>For any signing query from \(\mathbb{A}\), as we covered in
"Wrapping A with B", a valid response can be generated by patching
the RO (or using real transcripts). We'll have to account for the
possibility of a conflict between RO queries (addressed below), but
it's a minor detail.</li>
<li>Notice that as per the toy example previously, during
\(\mathbb{A}\)'s forgery process, his only interaction with his
wrapper \(\mathbb{B}\) is to request a hash value
\(H(m||R)\). So it's important to understand that, first
because of the probabilistic nature of the forgery (\(\epsilon
\ll 1\)), and second because \(\mathbb{A}\)'s algorithm is
unknown, <strong>\(\mathbb{B}\) does not know which hash function query
(and therefore which RO response) will correspond to a successful
forgery.</strong> This isn't just important to the logic of the game; as
we'll see, it's a critical limitation of the security result we
arrive at.</li>
<li>So to address the above, \(\mathbb{B}\) has to make a decision
upfront: which query should I use as the basis of my impersonation
attempt with \(\mathbb{C}\)? He chooses an index
\(\omega \in \{1, \dots, h\}\).</li>
<li>There will be a total of \(S+h+1\) queries to the random oracle,
at most (the +1 is a technical detail I'll ignore here). We
discussed in the first bullet point that if there is a repeated
\(m, R\) pair in one of the \(S\) signing queries, it causes a
"conflict" on the RO output. In the very most pessimistic
scenario, the probability of this causing our algorithm to fail can
be no more than \(\frac{S+h+1}{n}\) for each individual signing
query, and \(\frac{S(S+h+1)}{n}\) for all of them (as before we
use \(n\) for the size of the output space of the hash function).</li>
<li>So \(\mathbb{B}\) will <strong>fork</strong> \(\mathbb{A}\)'s execution,
just as for the SIDP \(\rightarrow\) ECDLP reduction, <strong>at index
\(\omega\)</strong>, without knowing in advance whether \(\omega\) is
indeed the index at which the hash query corresponds to
\(\mathbb{A}\)'s final output forgery. There's a \(1/h\)
chance of this guess being correct. So the "partial success
probability", if you will, for this first phase, is
\(\epsilon/h\), rather than purely \(\epsilon\), as we had for
the SIDP case.</li>
<li>In order to extract \(x\), though, we need that the execution
<em>after</em> the fork, with the new challenge value, at that same index
\(\omega\), also outputs a valid forgery. What's the probability
of both succeeding together? Intuitively it's of the order of
\(\epsilon^2\) as for the SIDP case, but clearly the factor
\(1/h\), based on accounting for the guessing of the index
\(\omega\), complicates things, and it turns out that the
statistical argument is rather subtle; you apply what has been
called the <strong>Forking Lemma</strong>, described on
<a href="https://en.wikipedia.org/wiki/Forking_lemma">Wikipedia</a>
and with the clearest statement and proof in
<a href="https://cseweb.ucsd.edu/~mihir/papers/multisignatures-ccs.pdf">this</a>
paper of Bellare-Neven '06. The formula for the success probability
of \(\mathbb{B}\) turns out to be:</li>
</ul>
<blockquote>
<p>\(\epsilon_{\mathbb{B}} = \epsilon\left(\frac{\epsilon}{h} -
\frac{1}{n}\right)\)</p>
</blockquote>
<ul>
<li><a href="https://crypto.stanford.edu/~dabo/cryptobook/">Boneh-Shoup</a>
in Section 19.2 bundle this all together (with significantly more
rigorous arguments!) into a formula taking account of the Forking
Lemma, the accounting for collisions in the signing queries, to
produce the more detailed statement, where \(\epsilon\) on the
left here refers to the probability of success of \(\mathbb{B}\),
and "DLAdv" on the right refers to the probability of success in
solving the discrete log. The square root term of course corresponds
to the "reduction" from Schnorr sig. to ECDLP being roughly a
square:</li>
</ul>
<blockquote>
<p>\(\epsilon \le \frac{S(S+h+1)}{n} + \frac{h+1}{n} +
\sqrt{(h+1)\ \times \ \textrm{DLAdv}}\)</p>
</blockquote>
<p>So in summary: we see that analysing the full CMA case in detail is
pretty complicated, but by far the biggest takeaway should be: <strong>The
security reduction for Schnorr sig to ECDLP has the same
\(\epsilon^2\) dependency, but is nevertheless far less tight,
because the success probability is also reduced by a factor \(\simeq
h\) due to having to guess which RO query corresponds to the successful
forgery.</strong></p>
<p>(<em>Minor clarification: basically ignoring the first two terms on the RHS
of the preceding as "minor corrections", you can see that DLAdv is
very roughly \(\epsilon^2/h\)</em>).</p>
<p>The above bolded caveat is, arguably, very practically important, not
just a matter of theory - because querying a hash function is something
that it's very easy for an attacker to do. If the reduction loses
\(h\) in tightness, and the attacker is allowed \(2^{60}\) hash
function queries (note - they can be offline), then we (crudely!) need
60 bits more of security in our underlying cryptographic hardness
problem (here ECDLP); at least, <em>if</em> we are basing our security model on
the above argument.</p>
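<p>That "crudely, 60 bits" arithmetic can be checked directly; a throwaway calculation using the hypothetical numbers from the paragraph above:</p>

```python
import math

h = 2 ** 60            # hash queries the attacker may make (offline!)
epsilon = 2 ** -20     # forging advantage we want to rule out

# Ignoring the small additive terms, the bound reads DLAdv >~ epsilon^2 / h:
dl_adv_needed = epsilon ** 2 / h
print(math.log2(dl_adv_needed))        # prints -100.0

# Without the factor of h, epsilon^2 alone would only demand 2^-40;
# the reduction "loses" log2(h) = 60 bits of security, as stated.
```

<p>So to rule out even a modest \(2^{-20}\) forging advantage against such an attacker, this argument asks for a discrete-log advantage below \(2^{-100}\).</p>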
<p>Although I haven't studied it, <a href="https://eprint.iacr.org/2012/029">the 2012 paper by Yannick
Seurin</a>
makes an argument (as far as I understand) that we cannot do better than
this, in the random oracle model, i.e. the factor of \(h\) cannot be
removed from this security reduction by some better kind of argument.</p>
<h2>Summary - is Schnorr secure?</h2>
<p>For all that this technical discussion has exposed the non-trivial guts
of this machine, it's still true that the argument provides some pretty
nice guarantees. We can say something like "Schnorr is secure if:"</p>
<ul>
<li>The hash function behaves to all intents and purposes like an ideal
random oracle as discussed</li>
<li>The ECDLP on our chosen curve (secp256k1 in Bitcoin) is hard to the
extent we reasonably expect, given the size of the curve and any
other features it has (in secp256k1, we hope, no features at all!)</li>
</ul>
<p>This naturally raises the question "well, but how hard <em>is</em> the
Elliptic Curve discrete logarithm problem, on secp256k1?" Nobody really
knows; there are known, standard ways of attacking it, which are better
than brute force unintelligent search, but their "advantage" is a
roughly known quantity (see e.g. <a href="https://en.wikipedia.org/wiki/Pollard%27s_rho_algorithm">Pollard's
rho</a>).
What there isn't, is some kind of proof "we know that \(\nexists\)
algorithm solving ECDLP on (insert curve) faster than \(X\)".</p>
<p>Not only don't we know this, but it's even rather difficult to make
statements about analogies. I recently raised the point on
#bitcoin-wizards (freenode) that I thought there must be a relationship
between problems like RSA/factoring and discrete log finding on prime
order curves, prompting a couple of interesting responses, agreeing that
indirect evidence points to the two hardness problems being to some
extent or other connected. Madars Virza kindly pointed out a
<a href="https://wstein.org/projects/john_gregg_thesis.pdf#page=43">document</a>
that details some ideas about the connection (obviously this is some
pretty highbrow mathematics, but some may be interested to investigate
further).</p>
<h2>What about ECDSA?</h2>
<p>ECDSA (and more specifically, DSA) were inspired by Schnorr, but have
design decisions embedded in them that make them <em>very</em> different when
it comes to security analysis. ECDSA looks like this:</p>
<blockquote>
<p>\(s = k^{-1}\left(H(m) + rx\right), \quad r=R.x, \ R = kG\)</p>
</blockquote>
<p>The first problem with trying to analyse this is that it doesn't
conform to the
three-move-sigma-protocol-identification-scheme-converts-to-signature-scheme-via-Fiat-Shamir-transform.
Why? Because the hash value is \(H(m)\) and doesn't include the
commitment to the nonce, \(R\). This means that the standard
"attack" on Schnorr, via rewinding and resetting the random oracle
doesn't work. This doesn't of course mean that it's insecure -
there's another kind of "fixing" of the nonce, in the setting
of \(R.x\). This latter "conversion function" is a kind of random
function, but really not much like a hash function; it's trivially
"semi-invertible", in as much as given an output x-coordinate one can
easily extract the two possible input R-values.</p>
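<p>Both the scheme and that "semi-invertibility" are easy to see in code. Here is a toy ECDSA over the textbook curve \(y^2 = x^3 + 2x + 2\) over \(\mathbb{F}_{17}\), which has prime group order 19 and generator \((5,1)\); real parameters are astronomically larger, and all concrete values below are merely illustrative:</p>

```python
FP, A, B, N, G = 17, 2, 2, 19, (5, 1)   # field, curve coeffs, order, generator

def inv(a, m):
    return pow(a, m - 2, m)             # m is prime

def add(Pt, Qt):                        # elliptic curve point addition
    if Pt is None: return Qt
    if Qt is None: return Pt
    (x1, y1), (x2, y2) = Pt, Qt
    if x1 == x2 and (y1 + y2) % FP == 0:
        return None                     # point at infinity
    lam = ((3*x1*x1 + A) * inv(2*y1, FP) if Pt == Qt
           else (y2 - y1) * inv(x2 - x1, FP)) % FP
    x3 = (lam*lam - x1 - x2) % FP
    return (x3, (lam*(x1 - x3) - y1) % FP)

def mul(k, Pt):                         # double-and-add scalar multiplication
    Rp = None
    while k:
        if k & 1: Rp = add(Rp, Pt)
        Pt = add(Pt, Pt); k >>= 1
    return Rp

x_priv, h_m, k = 7, 5, 3                # key, stand-in for H(m), nonce
Ppub = mul(x_priv, G)
R = mul(k, G)
r = R[0] % N                            # the "conversion function": R's x-coord
s = (inv(k, N) * (h_m + r * x_priv)) % N

# Standard verification: R' = (h_m/s)G + (r/s)Ppub, then check R'.x mod N == r
w = inv(s, N)
Rp = add(mul(h_m * w % N, G), mul(r * w % N, Ppub))
assert Rp[0] % N == r

# Semi-invertibility: from r alone, the two candidate R points fall out
cands = [(r, y) for y in range(FP) if (y*y - (r**3 + A*r + B)) % FP == 0]
assert R in cands and len(cands) == 2
```

<p>(In general the candidate x-coordinates are \(r, r+n, \dots\); with these toy sizes \(x = r\) suffices. The point is only that the conversion function, unlike a hash, gives up its preimages almost completely.)</p>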
<p>Some serious analysis has been done on this, for the obvious reason that
(EC)DSA is <strong>very widely used in practice.</strong> There is work by
<a href="https://www.iacr.org/archive/pkc2003/25670309/25670309.pdf">Vaudenay</a>
and
<a href="https://www.cambridge.org/core/books/advances-in-elliptic-curve-cryptography/on-the-provable-security-of-ecdsa/69827A20CC94C54BBCBC8A51DBAF075A">Brown</a>
(actually a few papers but I think most behind academic paywalls) and
most recently <a href="https://dl.acm.org/citation.cfm?doid=2976749.2978413">Fersch et
al</a>.
Fersch gave a talk on this work
<a href="https://www.youtube.com/watch?v=5aUPBT4Rdr8">here</a>
.</p>
<p>The general consensus seems to be "it's very likely secure - but
attempting to get a remotely "clean" security reduction is very
difficult compared to Schnorr".</p>
<p>But wait; before we trail off with an inaudible mumble of "well, not
really sure..." - there's a crucial logical implication you may not
have noticed. Very obviously, ECDSA is not secure if ECDLP is not secure
(because you just get the private key; game over for any signature
scheme). Meanwhile, in the long argument above we <strong>reduced</strong> Schnorr to
ECDLP. This means:</p>
<blockquote>
<p><strong>If ECDSA is secure, Schnorr is secure, but we have no security
reduction to indicate the contrary.</strong></p>
</blockquote>
<p>The aforementioned Koblitz paper tells an interesting historical
anecdote about all this, when the new DSA proposal was first put forth
in '92 (emphasis mine):</p>
<blockquote>
<p>"At the time, the proposed standard --- which soon after became the
first digital signature algorithm ever approved by the industrial
standards bodies --- encountered stiff opposition, especially from
advocates of RSA signatures and from people who mistrusted the NSA's
motives. Some of the leading cryptographers of the day tried hard to
find weaknesses in the NIST proposal. A summary of the most important
objections and the responses to them was published in the Crypto'92
proceedings[17]. The opposition was unable to find any significant
defects in the system. <u>In retrospect, it is amazing that none of the
DSA opponents noticed that when the Schnorr signature was modified,
the equivalence with discrete logarithms was
lost.</u>"</p>
</blockquote>
<h2>More exotic constructions</h2>
<p>In a future blog post, I hope to extend this discussion to other
constructions, which are based on Schnorr in some way or other, in
particular:</p>
<ul>
<li>The AOS ring signature</li>
<li>The Fujisaki-Suzuki, and the cryptonote ringsig</li>
<li>the Liu-Wei-Wong, and the Monero MLSAG (via Adam Back) ringsig</li>
<li>The MuSig multisignature</li>
</ul>
<p>These are all quite complicated (to say the least!), so there's no
guarantee of covering all of that, but the security arguments follow similar
lines to the discussion in this post. Of course ring signatures have
their own unique features and foibles, so I will hopefully cover that a
bit, as well as the security question.</p>
<p>Payjoin - 2018-12-15 - Adam Gibson - tag:joinmarket.me,2018-12-15:/blog/blog/payjoin/</p>
<p>coinjoins in payments</p>
<h2>PayJoin</h2>
<p>You haven't read any other blog posts here? No worries, here's what
you need to know (<em>unless you're an expert, read them anyway...</em>):</p>
<ul>
<li>A utxo is an "unspent transaction output" - a Bitcoin transaction
creates one or more of these, and each contains a specific amount of
Bitcoin. Those outputs get "used up" in the transaction that
spends them (somewhat like physical coins: someone gives them to you
in a payment, then you pass them to someone else when you spend
them; bitcoins aren't coins, utxos are coins - the only difference is
that physical coins don't get destroyed in transactions).</li>
<li>The fees you have to pay for a Bitcoin transaction depend on how
many bytes it takes up; this is <em>somewhat</em> dominated by how many
inputs you provide, although there are other factors.</li>
<li>CoinJoin basically means - two or more people provide inputs (utxos)
to a transaction and co-sign without needing trust, because each
can check, before signing, that the output amounts and addresses are what they
expect. <strong>Note that CoinJoin requires interaction, almost always.</strong></li>
<li>Traditional "equal-sized" CoinJoin means a bunch of people paying
<em>themselves</em> the same fixed amount in a single transaction
(according to the process just mentioned), with the intention that
nobody can tell which of the equal sized outputs belong to who
(basically!).</li>
</ul>
<h2>The drawbacks of CoinJoin as implemented</h2>
<p>Current implementations of CoinJoin are of the "equal-sized" variety
(see above). This requires coordination, but it's possible to get a
decent number of people to come together and agree to do a CoinJoin of a
certain fixed amount. The negative is that this kind of transaction is
trivially distinguishable from an "ordinary" transaction, in
particular a payment from one counterparty to another. Here's a typical
Joinmarket CoinJoin (and other implementations are just as, or more,
distinguishable):</p>
<p><img alt="Equal-outs-coinjoin-example" src="/web/20200803124759im_/https://joinmarket.me/static/media/uploads/.thumbnails/screenshot_from_2019-01-18_15-00-33.png/screenshot_from_2019-01-18_15-00-33-807x433.png" width="807" height="433"></p>
<p>The biggest flag of "this is CoinJoin" is exactly the multiple
equal-value (0.18875417 here) outputs that are the core premise of the
idea, that give the anonymity. Here, you get anonymity in an "anonymity
set" of all participants of <em>this</em> transaction, first, but through
repeated rounds, you <em>kind of</em> get a much bigger anonymity set,
ultimately of all participants of that CoinJoin implementation in the
absolute best scenario. But it's still only a small chunk of Bitcoin
usage generally.</p>
<p>And while this obviously gets better if more people use it, there is a
limit to that thinking: because <strong>all participants are forced to use the
same denomination for any single round</strong>, it isn't possible to fold in
the payments you're doing using Bitcoin as a currency (don't laugh!)
into these CoinJoin rounds (notice: this problem mostly disappears with
blinded amounts).</p>
<p>So a world where "basically everyone uses CoinJoin" is cool for
privacy, but could end up pretty bad for scalability, because these
transactions are <em>in addition to</em> the normal payments.</p>
<p>Also, the fact that these transactions are trivially watermarked means
that, if the blockchain analyst is not able to "crack" and unmix such
transactions, he can at least isolate them in analysis. That's
something; "these coins went from Exchange A to wallet B and then into
this mixset" may be a somewhat negative result, but it's still a
result. There are even noises made occasionally that coins might be
blocked from being sent to certain exchange-type entities if they're
seen to have come from a "mixer" (doesn't matter that CoinJoin is
<em>trustless</em> mixing here; just that it's an activity specific for
obfuscation).</p>
<p>I don't mean to scaremonger - I have used such CoinJoin for years
(measured in the thousands) and will continue to do so, and never had
payments blocked because of it. But this is another angle that must be
borne in mind.</p>
<p>So let's say our primary goal is to minimize the negative privacy
effects of blockchain analysis; can we do better? It's debatable, but
we <em>do</em> have another angle of attack.</p>
<h2>Hiding in a much bigger crowd ... ?</h2>
<p><u>One angle is to make your behaviour look more like other, non-coinjoin
transactions</u>. (For the
philosophically/abstract inclined people, <a href="https://web.archive.org/web/20200803124759/https://joinmarket.me/blog/blog/the-steganographic-principle/">this post might be of
interest</a>,
but it sidetracks us here, so - later!). Let's think of the naive way
to do that. Suppose just Alice and Bob make a 2 party CoinJoin:</p>
<p><code>0.05 BTC --->| 0.05 BTC 3AliceSAddReSs</code></p>
<p><code>0.05 BTC --->| 0.05 BTC 3BobSAddReSs</code></p>
<p>This first attempt is a clear failure - it "looks like an ordinary
payment" <em>only</em> in the sense that it has two outputs (one change, one
payment). But the failure is not <em>just</em> the obvious, that the output
amounts are equal and so "obviously CoinJoin". There's another aspect
of that failure, illustrated here:</p>
<p><code>0.01 BTC --->| 0.05 BTC 3AliceSAddReSs</code></p>
<p><code>0.04 BTC --->| 0.06 BTC 3BobSAddReSs</code></p>
<p><code>0.03 BTC --->|</code></p>
<p><code>0.03 BTC --->|</code></p>
<p>This at least is <em>more</em> plausible as a payment, but it shows the
<strong>subset sum</strong> problem that I was describing in my <a href="https://web.archive.org/web/20200803124759/https://joinmarket.me/blog/blog/coinjoinxt/">CoinJoinXT
post</a>
- and trying to solve with CoinJoinUnlimited (i.e. using a Lightning
channel to break the subset sum problem and feed-back the LN privacy
onto the main chain). While the blockchain analyst <em>could</em> interpret
this as a payment, semi-reasonably, of 0.05 btc by one participant, he
could also notice that there are two subsets of the inputs that add up
to 0.05, 0.06. And also splitting the outputs doesn't fundamentally
solve that problem, notice (they'd also have to split into subsets),
and it would anyway break the idea of "looking like a normal payment"
(one payment, one change):</p>
<p><code>0.01 BTC --->| 0.011 BTC 3AliceSAddReSs</code></p>
<p><code>0.04 BTC --->| 0.022 BTC 3BobSAddReSs</code></p>
<p><code>0.03 BTC --->| 0.039 BTC 3Alice2</code></p>
<p><code>0.03 BTC --->| 0.038 BTC 3Bob2</code></p>
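<p>The analyst's subset-sum search is a few lines of brute force. Taking the four-input example above (amounts in hundredths of BTC, fees ignored for simplicity):</p>

```python
from itertools import combinations

inputs  = [1, 4, 3, 3]        # 0.01, 0.04, 0.03, 0.03 BTC
outputs = [5, 6]              # 0.05 to Alice, 0.06 to Bob

def matching_subsets(ins, target):
    return [c for k in range(1, len(ins) + 1)
              for c in combinations(ins, k) if sum(c) == target]

for out in outputs:
    print(out, matching_subsets(inputs, out))
# prints:
# 5 [(1, 4)]
# 6 [(3, 3)]
```

<p>So the inputs partition cleanly into \(\{0.01, 0.04\}\) and \(\{0.03, 0.03\}\) against the two outputs - exactly the giveaway described above.</p>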
<p>After you think about this problem for a while you come to the
conclusion - only if there's actually a transfer of coins from one
party to the other is it solved. Hence
<a href="https://web.archive.org/web/20200803124759/https://joinmarket.me/blog/blog/coinjoinxt/">CoinJoinXT</a>.</p>
<p>But also, hence <strong>PayJoin</strong> - why not actually do a CoinJoin <u>while you
are making a payment?</u></p>
<h2>PayJoin advantages</h2>
<p>I'm not sure who first thought of doing CoinJoins (see bullet point at
start) of this particular flavour, but a <a href="https://blockstream.com/2018/08/08/improving-privacy-using-pay-to-endpoint/">blogpost from Matthew
Haywood</a>
last summer detailed an implementation approach which came out of a
technical workshop in London shortly before, and a little later a
<a href="https://github.com/bitcoin/bips/blob/master/bip-0079.mediawiki">BIP</a>
was put out by Ryan Havar.</p>
<p>The central idea is:</p>
<ul>
<li>Let Bob do a CoinJoin with his customer Alice - he'll provide at
least one utxo as input, and that/those utxos will be consumed,
meaning that in net, he will have no more utxos after the
transaction than before, and an obfuscation of ownership of the
inputs will have happened <u>without it looking different from an
ordinary payment.</u></li>
</ul>
<p>Before we look in detail at the advantages, it's worth answering my
earlier question ("Why not actually do a CoinJoin while you are making
a payment?") in the negative: it's not easy to coordinate that. It
means that either (a) all wallets support it and have a way for
<em>anyone</em> to connect to <em>anyone</em> to negotiate this (2-party) CoinJoin
or (b) it's only limited to peer to peer payments between owners of a
specific wallet that has a method for them to communicate. So let's be
clear: this is not going to suddenly take over the world, but
incremental increases in usage could be tremendously valuable (I'll
explain that statement shortly; but you probably already get
it).</p>
<ul>
<li><strong>Advantage 1: Hiding the payment amount</strong></li>
</ul>
<p>This is what will immediately stand out from looking at the idea. Bob
"chips in" a utxo (or sometimes more than one). So the payment
<em>output</em> will be more than the actual payment, and it will be profoundly
unobvious what the true payment amount was. Here's an example:</p>
<p><code>0.05 BTC --->| 0.04 BTC 3AliceSAddReSs</code></p>
<p><code>0.09 BTC --->| 0.18 BTC 3BobSAddReSs</code></p>
<p><code>0.08 BTC --->|</code></p>
<p>Now, actually, Alice paid Bob 0.1 BTC using 0.09 and 0.05, getting back
0.04 change. But what does a blockchain analyst think? His first
interpretation will certainly be that there is a payment <em>either</em> of
0.04 BTC or 0.18 BTC, by the owner of the wallet containing all the
inputs. Now, it probably seems very unlikely that the <em>payment</em> was 0.04
and the <em>change</em> 0.18. Why? Because, if the payment output were 0.04,
why would you use all three of those utxos, and not just the first, say?
(0.05). This line of reasoning we have called "UIH1" in the comments
to <a href="https://gist.github.com/AdamISZ/4551b947789d3216bacfcb7af25e029e">this
gist</a>
(h/t Chris Belcher for the nomenclature - "unnecessary input
heuristic") for the details. To be fair, this kind of deduction by a
blockchain analyst is unreliable, as it depends on wallet selection
algorithms; many are not nearly so simplistic that this deduction would
be correct. But possibly combined with wallet fingerprinting and
detailed knowledge of wallet selection algorithms, it's one very
reasonable line of attack to finding the change output and hence the
payment output.</p>
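<p>As a deliberately simplified sketch of that UIH1 reasoning (the gist linked above has the careful version; fees and realistic coin selection are ignored here, and the function name is my own):</p>

```python
from itertools import combinations

def uih1_flagged(inputs, candidate_payment):
    """Flag the candidate if some strict subset of the inputs already
    covers it: the remaining inputs would then have been 'unnecessary'."""
    return any(sum(c) >= candidate_payment
               for k in range(1, len(inputs))
               for c in combinations(inputs, k))

inputs = [5, 9, 8]                 # 0.05, 0.09, 0.08 BTC (in hundredths)
print(uih1_flagged(inputs, 4))     # True: 0.05 alone covers a 0.04 payment
print(uih1_flagged(inputs, 18))    # False: no strict subset reaches 0.18
```

<p>On this transaction the heuristic points away from 0.04 and towards 0.18 as the payment - but in the PayJoin the true amount was 0.10, so the analyst's tool gives the wrong answer, which is rather the point.</p>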
<p>For those interested in the "weeds" I've reproduced the key points
about this UIH1 and UIH2 (probably more important) including stats
collected by LaurentMT of oxt.me, in an "Appendix" section at the end
of this post.</p>
<p>Anyway, what else <em>could</em> the payment amount be, in the transaction
above? As well as 0.04 and 0.18, there is 0.09 and 0.01. Do you see the
reasoning? <em>If</em> we assume that PayJoin is a possibility, then one party
could be consuming 0.09 and 0.08 and getting back 0.01. And similarly
for other contributions of inputs. In the simplest case, I would claim
there are 4 potential payment amounts if there are only two inputs and
we assume that one of the two is owned by the receiver. For the
blockchain analyst, this is a huge mess.</p>
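<p>We can enumerate every amount consistent with that reasoning (the post highlights four; allowing the receiver to own any subset of the inputs produces a few more). A quick sketch, amounts in hundredths of BTC:</p>

```python
from itertools import combinations

def candidate_payments(ins, outs):
    """Each candidate reading: one output is the receiver's, and the
    receiver may also own any subset of the inputs, so
    payment = output - sum(receiver's inputs).
    The empty subset is the ordinary no-CoinJoin reading."""
    cands = set()
    for out in outs:
        for k in range(len(ins) + 1):
            for sub in combinations(ins, k):
                pay = out - sum(sub)
                if pay > 0:
                    cands.add(pay)
    return sorted(cands)

print(candidate_payments([5, 9, 8], [4, 18]))
# prints [1, 4, 5, 9, 10, 13, 18]
```

<p>Note that 10 - the true payment of 0.10 - hides among the candidates, and the analyst has no way to single it out.</p>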
<ul>
<li><strong>Advantage 2 - breaking Heuristic 1</strong></li>
</ul>
<p>I discussed Heuristic 1 in the <a href="https://joinmarket.me/blog/blog/coinjoinxt/">CoinJoinXT
post</a>. Simple
description: people (analysts) assume that all the inputs to any
particular transaction are owned by one wallet/owner; i.e. they assume
coinjoin is not used, usually. Following the overall logic of our
narrative here, it's obvious what the main point is with PayJoin - we
break the heuristic <em>without flagging to the external observer that the
breakage has occurred.</em> This is enormously important, even if the
breakage of the assumption of common input ownership on its own seems
rather trivial (especially if PayJoin is used by only few people), with
only 2 counterparties in each transaction.</p>
<ul>
<li><strong>Advantage 3 - Utxo sanitization</strong></li>
</ul>
<p>This one might not occur to you immediately, at all, but is actually
really nice. Consider the plight of the merchant who sells 1,000 widgets
per day for Bitcoin. At the end of the day he has 1,000 utxos that he
has to spend. Perhaps the next day he pays his supplier with 80% of the
money; he'll have to construct a transaction (crudest scenario) with
800 inputs. It's not just that that costs a lot in fees (it does!); we
can't really directly solve that problem (well - use layer 2! - but
that's another blog post); but we can solve something else about it -
the privacy. The merchant immediately links <em>almost</em> <em>all</em> of his
payments in the 800-input payout transaction - horrible!</p>
<p>But PayJoin really helps this; each payment that comes in can consume
the utxo of the last payment. Here are two fictitious widget payments in
sequence to illustrate; Bob's utxos are bolded for clarity:</p>
<p><u>PayJoin 1 - Alice pays Bob 0.1 for a
widget:</u></p>
<p><code>0.05 BTC --->| 0.04 BTC 3AliceSAddReSs</code></p>
<p><code>0.09 BTC --->| 0.18 BTC 3BobSAddReSs</code></p>
<p><code>0.08 BTC --->|</code></p>
<p>(notice: Bob used up one utxo and created one utxo - no net change)</p>
<p><u>PayJoin 2 - Carol pays Bob 0.05 for a discount
widget:</u></p>
<p><code>0.01 BTC --->| 0.02 BTC 3CarolSAddReSs</code></p>
<p><code>0.06 BTC --->| 0.23 BTC 3BobSAddReSs</code></p>
<p><code>0.18 BTC --->|</code></p>
<p>In the naive interpretation, this creates a kind of "snowball" utxo
that gets bigger and bigger with each payment. In the fantasy case of every
payment being PayJoin, the merchant has a particularly easy wallet to
deal with - a wallet that only ever has 1 coin/utxo! (I know it's quite
dubious to think that nobody could trace this sequence, there are other
potential giveaways <em>in this case</em> than just Heuristic 1; but with
Heuristic 1 gone, you have a lot more room to breathe, privacy-wise).</p>
<p>It's worth mentioning though that the full snowball effect can damage
the anonymity set: after several such transactions, Bob's utxo is
starting to get big, and may dwarf other utxos used in the transaction.
In this case, the transaction will violate "UIH2" (you may remember
UIH1 - again, see the Appendix for more details on this) because a
wallet <em>probably</em> wouldn't choose other utxos if it can fulfil the
payment with only one. So this may create a dynamic where it's better
to mix PayJoin with non-PayJoin payments.</p>
<ul>
<li><strong>Advantage 4 - hiding in (and being helpful to) the large crowd</strong></li>
</ul>
<p>"...but incremental increases in usage could be tremendously
valuable..." - let's be explicit about that now. If you're even
reasonably careful, these PayJoin transactions will be basically
indistinguishable from ordinary payments (see earlier comments about
UIH1 and UIH2 here, which don't contradict this statement). It's a
good idea to decide on a specific locktime and sequence value that
fits in with commonly used wallets (transaction version 2 makes the most
sense). Now, here's the cool thing: suppose a small-ish uptake of this
was publicly observed. Let's say 5% of payments used this method.
<strong>The point is that nobody will know which 5% of payments are PayJoin</strong>.
That is a great achievement (one that we're not yet ready to achieve
for some other privacy techniques which use custom scripts, for example;
that may happen after Schnorr/taproot but not yet), because <em>it means
that all payments, including ones that don't use PayJoin, gain a
privacy advantage!</em></p>
<h2>Merchants? Automation?</h2>
<p>The aforementioned
<a href="https://github.com/bitcoin/bips/blob/master/bip-0079.mediawiki">BIP79</a>
tries to address how this might work in a standardized protocol;
there's probably still significant work to do before this becomes
actualized. As it stands, it may be enough to have the following
features:</p>
<ul>
<li>Some kind of "endpoint" (hence "pay to endpoint"/p2ep) that a
customer/payer can connect to encoded as some kind of URL. A Tor
hidden service would be ideal, in some cases. It could be encoded in
the payment request similar to BIP21 for example.</li>
<li>Some safety measures on the server side (the merchant/receiver) to
make sure that an attacker doesn't use the service to connect,
request, and block: thus enumerating the server's (merchant's)
utxos. BIP79 has given one defensive measure against this that may
be sufficient, Haywood's blog post discussed some more advanced
ideas on that score.</li>
<li>To state the obvious friction point - wallets would have to
implement such a thing, and it is not trivial compared to features
like RBF which are pure Bitcoin.</li>
</ul>
<h2>Who pays the fees?</h2>
<p>The "snowball effect" described above, where the merchant always has
one utxo, may lead you to think that we are saving a lot of fees (no 800
input transactions). But that's not true, beyond some second/third
order effects: every payment to the merchant creates a utxo, and every
one of those must be paid for in fees when consumed in some transaction.
The effect here is to pay those fees slowly over time. And it's left
open to the implementation how to distribute the bitcoin transaction
fees of the CoinJoin. Most logically, each participant pays according to
the number of utxos they consume; I leave the question open here.</p>
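<p>One possible convention, sketched for concreteness (the helper name and numbers are my own; nothing here is prescribed by any implementation or BIP):</p>

```python
def split_fee(total_fee_sats, sender_inputs, receiver_inputs):
    """Split the fee in proportion to the number of utxos each side consumes."""
    total = sender_inputs + receiver_inputs
    sender_share = total_fee_sats * sender_inputs // total
    return sender_share, total_fee_sats - sender_share

print(split_fee(3000, 2, 1))   # prints (2000, 1000)
```

<p>Under this convention the merchant pays for his single chipped-in utxo on each payment, which is the "paying those fees slowly over time" effect just described.</p>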
<h2>Implementation in practice</h2>
<p>As far as I know as of this writing (mid-January 2019), there are two
implementations of this idea in the wild. One is from Samourai Wallet,
called
<a href="https://samouraiwallet.com/stowaway">Stowaway</a>
and the other is in
<a href="https://github.com/Joinmarket-Org/joinmarket-clientserver/blob/master/docs/PAYJOIN.md">Joinmarket</a>
as of version 0.5.2 (just released).</p>
<p>I gave a demo of the latter in my last <a href="https://web.archive.org/web/20200803124759/https://joinmarket.me/blog/blog/payjoin-basic-demo/">post on this
blog</a>.</p>
<p>In both cases this is intended for peers to pay each other, i.e. it's
not something for large scale merchant automation (as per discussion in
previous section).</p>
<p>It requires communication between parties, as does any CoinJoin, except
arguably
<a href="https://web.archive.org/web/20200803124759/https://joinmarket.me/blog/blog/snicker/">SNICKER</a>.</p>
<p>The sender of the payment always sends a non-CoinJoin payment
transaction to start with; it's a convenient/sane thing to do, because
if connection or software problems occur, the receiver can
simply broadcast this "fallback" payment instead.</p>
<p>In Joinmarket specifically, the implementation looks crudely like this:</p>
<div class="highlight"><pre><span></span><span class="err">Sender                         Receiver</span>
<span class="err">pubkey+versionrange  --></span>
<span class="err">                     &lt;--   pubkey and version</span>
<span class="err">     (ECDH e2e encryption set up)</span>
<span class="err">fallback tx          ---></span>
<span class="err">                     &lt;---  PayJoin tx partial-signed</span>
<span class="err">co-signs and broadcasts</span>
</pre></div>
<p>Before starting that interchange of course, the receiver must "send"
(somehow) the sender the payment amount and destination address, as well
as (in Joinmarket) an ephemeral "nick" to communicate over the message
channel. Details here will of course vary, but bear in mind that, as with any
normal payment, there <em>must</em> be some mechanism for the receiver to
communicate payment information to the sender.</p>
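<p>The "ECDH e2e encryption set up" step above can be illustrated schematically. The real implementation uses elliptic-curve keys on secp256k1; the toy Diffie-Hellman below (classic modular exponentiation, insecure parameters) only shows the shape of the exchange: each side combines its own private key with the other's public key and derives the same symmetric key.</p>

```python
import hashlib
import secrets

# Illustration only: real implementations use ECDH on secp256k1 with a
# proper KDF. Do not use this for actual key agreement.
P = 2**127 - 1  # a prime, but not a safe DH group; toy parameters
G = 5

def keypair():
    priv = secrets.randbelow(P - 2) + 1
    return priv, pow(G, priv, P)

def shared_key(my_priv, their_pub):
    # Both sides compute G^(ab) mod P and hash it into a symmetric key.
    secret = pow(their_pub, my_priv, P)
    return hashlib.sha256(str(secret).encode()).digest()

sender_priv, sender_pub = keypair()
receiver_priv, receiver_pub = keypair()
k_sender = shared_key(sender_priv, receiver_pub)
k_receiver = shared_key(receiver_priv, sender_pub)
```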
<h2>Conclusion</h2>
<p>This is another nail in the coffin of blockchain analysis. If 5% of us
do this, it will <em>not</em> be safe to assume that a totally ordinary looking
payment is not a CoinJoin. That's basically it.</p>
<hr>
<h3>Appendix: Unnecessary Input Heuristics</h3>
<p>The health warning to this reasoning has already been given: wallets
will definitely not <em>always</em> respect the logic given below - I know of
at least one such case (h/t David Harding). However I think it's worth
paying attention to (this is slightly edited from the comment section of
the referenced gist):</p>
<p><u>Definitions:</u></p>
<p>"UIH1" : one output is smaller than any input. This heuristically
implies that <em>that</em> output is not a payment, and must therefore be a
change output.</p>
<p>"UIH2": one input is larger than any output. This heuristically
implies that <em>no output</em> is a payment, or, to say it better, it implies
that this is not a normal wallet-created payment, it's something
strange/exotic.</p>
<p>Note: UIH2 does not necessarily imply UIH1.</p>
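<p>Taking the two definitions above literally, the checks are one-liners; the sample amounts (in satoshis) are invented to show a transaction that triggers UIH2 but not UIH1:</p>

```python
def uih1(input_amounts, output_amounts):
    """UIH1 as defined above: some output is smaller than every input,
    heuristically marking that output as change."""
    return min(output_amounts) < min(input_amounts)

def uih2(input_amounts, output_amounts):
    """UIH2 as defined above: some input is larger than every output,
    heuristically marking the transaction as strange/exotic."""
    return max(input_amounts) > max(output_amounts)

# Amounts in satoshis, invented for illustration: the 500k input exceeds
# both outputs (UIH2), yet no output is below the smallest input (no UIH1).
ins, outs = [500_000, 100_000], [400_000, 190_000]
```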
<p>So we just have to focus on UIH2. Avoiding the UIH1 condition is nice,
because it means that both outputs could be the payment; but in any case
the normal blockchain analysis will be wrong about the payment amount.
If we don't avoid the UIH2 condition, though, we lose the
steganographic aspect which is at least 50% of the appeal of this
technique.</p>
<p>Joinmarket's current implementation does its best to avoid UIH2, but
proceeds with PayJoin anyway even if it can't. The reasoning is
partially as already discussed: not all wallets follow this logic; the
other part of the reasoning is the actual data, as we see next:</p>
<p><u>Data collection from LaurentMT:</u></p>
<p>From block 552084 to block 552207 (One day: 01/12/2018)</p>
<ul>
<li>Txs with 2 outputs and more than 1 input = 35,349<ul>
<li>UIH1 Txs (identifiable change output) = 19,020 (0.54)</li>
<li>!UIH1 Txs = 16,203 (0.46)</li>
<li>Ambiguous Txs = 126 (0.00)</li>
</ul>
</li>
</ul>
<p>From block 552322 to block 553207 (One week: 03/12/2018 - 09/12/2018)</p>
<ul>
<li>Txs with 2 outputs and more than 1 input = 268,092<ul>
<li>UIH1 Txs (identifiable change output) = 145,264 (0.54)</li>
<li>!UIH1 Txs = 121,820 (0.45)</li>
<li>Ambiguous Txs = 1,008 (0.00)</li>
</ul>
</li>
</ul>
<p>And here are a few stats for UIH2:</p>
<p>Stats from block 552084 to block 552207 (One day: 01/12/2018)</p>
<ul>
<li>Txs with 2 outputs and more than 1 input = 35,349<ul>
<li>UIH2 Txs = 10,986 (0.31)</li>
<li>!UIH2 Txs = 23,596 (0.67)</li>
<li>Ambiguous Txs = 767 (0.02)</li>
</ul>
</li>
</ul>
<p>From block 552322 to block 553207 (One week: 03/12/2018 - 09/12/2018)</p>
<ul>
<li>Txs with 2 outputs and more than 1 input = 268,092<ul>
<li>UIH2 Txs = 83,513 (0.31)</li>
<li>!UIH2 Txs = 178,638 (0.67)</li>
<li>Ambiguous Txs = 5,941 (0.02)</li>
</ul>
</li>
</ul>CoinjoinXT2018-09-15T00:00:00+02:002018-09-15T00:00:00+02:00Adam Gibsontag:joinmarket.me,2018-09-15:/blog/blog/coinjoinxt/<p>a proposal for multi-transaction coinjoins.</p><h3>CoinJoinXT</h3>
<h1>CoinJoinXT - a more flexible, extended approach to CoinJoin</h1>
<p><em>Ideas were first discussed
<a href="https://gist.github.com/AdamISZ/a5b3fcdd8de4575dbb8e5fba8a9bd88c">here</a>.
Thanks again to arubi on IRC for helping me flesh them out.</em></p>
<h2>Introduction</h2>
<p>We assume that the reader is familiar with CoinJoin as a basic idea -
collaboratively providing inputs to a transaction so that it may be
made difficult or impossible to distinguish ownership/control of the
outputs.</p>
<p>The way that CoinJoin is used in practice (today mainly using
JoinMarket, but with other implementations over Bitcoin's history) is to
create large-ish transactions with multiple outputs of exactly the same amount. This can
be called an "intrinsic fungibility" model - since, although the
transactions created are unambiguously recognizable as CoinJoins, the
indistinguishability of said equal outputs is kind of "absolute".</p>
<p>However, as partially discussed in the earlier blog post <a href="https://web.archive.org/web/20200603010653/https://joinmarket.me/blog/blog/the-steganographic-principle/">"the
steganographic
principle"</a>,
there's at least an argument for creating fungibility in a less
explicit way - that is to say, creating transactions that have a
fungibility effect but aren't <em>necessarily</em> visible as such - they
<em>may</em> look like ordinary payments. I'll call this the <em>deniability</em>
model vs the <em>intrinsic fungibility</em> model. It's harder to make this
work, but it has the possibility of being much more effective than the
<em>intrinsic fungibility model</em>, since it gives the adversary (who we'll
talk about in a minute) an additional, huge problem: he doesn't even
know where to start.</p>
<h2>The adversary's assumptions</h2>
<p>In trying to create privacy, we treat the "blockchain analyst" as our
adversary (henceforth just "A").</p>
<p>Blockchain analysis consists, perhaps, of two broad areas (not sure
there is any canonical definition); we can call the first one
"metadata", vaguely, and think of it as every kind of data that is not
directly recorded on the blockchain, such as personally identifying
information, exchange records, network info etc. In practice, it's
probably the most important. The second is stuff recorded directly on
the blockchain - pseudonyms (scriptPubKeys/addresses) and amount
information (on non-amount-blinded blockchains as Bitcoin's is
currently; for a discussion about that see this earlier <a href="https://web.archive.org/web/20200603010653/https://joinmarket.me/blog/blog/the-steganographic-principle/">blog
post</a>);
note that amount information includes the implicit amount - network fee.</p>
<p>Timing information perhaps straddles the two categories, because while
transactions are (loosely) timestamped, there is also the business of
trying to pick up timing and perhaps geographic information from
snooping the P2P network.</p>
<p>With regard to that second category, the main goal of A is to correlate
ownership of different utxos. An old
<a href="https://cseweb.ucsd.edu/~smeiklejohn/files/imc13.pdf">paper</a>
of Meiklejohn et al 2013 identified two Heuristics (let's call them
probabilistic assumptions), of which the first was by far the most
important:</p>
<ul>
<li>Heuristic 1 - All inputs to a transaction are owned by the same
party</li>
<li>Heuristic 2 - One-time change addresses are owned by the same party
as the inputs</li>
</ul>
<p>The second is less important mainly because it had to be caveat-ed quite
a bit and wasn't reliable in naive form; but, identification of change
addresses generally is a plausible angle for A. The first has been, as
far as I know, the bedrock of blockchain analysis and has been referred
to in many other papers, was mentioned in Satoshi's whitepaper, and you
can see one functional example at the long-existent website
<a href="https://www.walletexplorer.com/">walletexplorer</a>.</p>
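<p>Heuristic 1 is typically applied as transitive clustering: whenever two addresses ever co-spend, they land in the same cluster. A minimal union-find sketch of that idea (addresses and transactions invented for illustration):</p>

```python
class AddressClusters:
    """Cluster addresses under Heuristic 1: all input addresses of a
    transaction are assumed to belong to one wallet. Union-find with
    path halving."""
    def __init__(self):
        self.parent = {}

    def find(self, addr):
        self.parent.setdefault(addr, addr)
        while self.parent[addr] != addr:
            self.parent[addr] = self.parent[self.parent[addr]]
            addr = self.parent[addr]
        return addr

    def add_tx(self, input_addresses):
        it = iter(input_addresses)
        root = self.find(next(it))
        for addr in it:
            self.parent[self.find(addr)] = root

    def same_wallet(self, a, b):
        return self.find(a) == self.find(b)

c = AddressClusters()
c.add_tx(["addr1", "addr2"])
c.add_tx(["addr2", "addr3"])  # transitively links addr3 to addr1
```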
<p><u>But I think it's important to observe that this list is
incomplete.</u></p>
<p>I'll now add two more items to the list; the first is omitted because
it's elementary, the other, because it's subtle (and indeed you might
find it a bit dumb at first sight):</p>
<ul>
<li><code>Heuristic/Assumption 0</code>: All inputs controlled by only one pubkey
are unilaterally controlled</li>
<li>Heuristic/Assumption 1: All inputs to a transaction are owned by the
same party</li>
<li>Heuristic/Assumption 2(?): One-time change addresses are owned by
the same party as the inputs</li>
<li><code>Heuristic/Assumption 3</code>: Transfer of ownership between parties in
one transaction implies payment</li>
</ul>
<p>So, "Heuristic/Assumption", because "assumption" is probably a better
word for all of these generally, but I want to keep the existing
nomenclature; the "?" for 2 is simply because, as mentioned, this one
is problematic (although still worthy of consideration).</p>
<p><strong>Assumption 0</strong>: the idea that a non-multisig input is
unilaterally controlled was never fully safe; there was always <a href="https://en.wikipedia.org/wiki/Shamir's_Secret_Sharing">Shamir's secret
sharing</a>
to share shards of a key, albeit that's very rarely used, and you can
argue pedantically that full reconstruction means unilateral control.
But Assumption 0 is a lot less safe now due to the recent
<a href="https://eprint.iacr.org/2018/472">work</a>
by Moreno-Sanchez et al. which means, at the very least, that 2 parties
can easily use a 2-party computation based on the Paillier encryption
system to effectively use a single ECDSA pubkey as a 2-2 multisig. So
this assumption is generally unspoken, but in my opinion is now
generally important (i.e. not necessarily correct!).</p>
<p><strong>Assumption 3</strong>: this is rather strange and looks tautological; I could
have even written "transfer of ownership between parties in one
transaction implies transfer of ownership" to be cheeky. The point, if
it is not clear to you, will become clear when I explain what
"CoinJoinXT" means.</p>
<p>Our purpose, now, is to make A's job harder <strong>by trying to invalidate
all of the above assumptions at once</strong>.</p>
<h2>Quick refresher: BIP141</h2>
<p>This has been discussed in other blog posts about various types of
"CoinSwap", so I won't dwell on it.</p>
<p>Segwit fixes transaction malleability
(<a href="https://github.com/bitcoin/bips/blob/master/bip-0141.mediawiki">BIP141</a>,
along with BIP143,144 were the BIPs that specified segwit). One of the
most important implications of this is explained directly in BIP 141
itself, to
<a href="https://github.com/bitcoin/bips/blob/master/bip-0141.mediawiki#Trustfree_unconfirmed_transaction_dependency_chain">quote</a>
from it:</p>
<blockquote>
<p><em>Two parties, Alice and Bob, may agree to send certain amount of
Bitcoin to a 2-of-2 multisig output (the "funding transaction").
Without signing the funding transaction, they may create another
transaction, time-locked in the future, spending the 2-of-2 multisig
output to third account(s) (the "spending transaction"). Alice and
Bob will sign the spending transaction and exchange the signatures.
After examining the signatures, they will sign and commit the funding
transaction to the blockchain. Without further action, the spending
transaction will be confirmed after the lock-time and release the
funding according to the original contract.</em></p>
</blockquote>
<p>In short, if we agree a transaction, then we can fix its txid and sign
transactions which use its output(s). The BIP specifically references
the Lightning Network as an example of the application of this pattern,
but of course it's not restricted to it. We can have Alice and Bob
agree to any arbitrary set of transactions and pre-sign them, in
advance, with all of them having the funding transaction as the root.</p>
<h2>CoinJoinXT - the basic case</h2>
<p>CoinJoin involves 2 or more parties contributing their utxos into 1
transaction, but using the above model they can do the same to a funding
transaction, but then pre-sign a set of more than one spending
transaction. Here's a simple schematic:</p>
<div class="highlight"><pre><span></span><span class="err">A 1btc ---></span>
<span class="err"> F (2,2,A,B) --+</span>
<span class="err">B 1btc ---> |</span>
<span class="err"> |</span>
<span class="err"> +-->[Proposed transaction graph (PTG) e.g. ->TX1->TX2->TX3 ..]</span>
</pre></div>
<p>In human terms, you can envisage that: Alice and Bob would like to start
to negotiate a set of conditional contracts about what happens to their
money. Then they go through these steps:</p>
<ol>
<li>One side proposes F (the funding transaction) and a full graph of
unsigned transactions to fill out the PTG above; e.g. Alice
proposes, Bob and Alice share data (pubkeys, destination addresses).
Note that the set doesn't have to be a chain (TX1->TX2->TX3...),
it can be a tree, but each transaction must require sign-off of both
parties (either, at least one 2-2 multisig utxo, or at least one
utxo whose key is owned by each party).</li>
<li>They exchange signatures on all transactions in the PTG, in either
order. Of course, they abort if signatures don't validate.</li>
<li>With this in place (i.e. <strong>only</strong> after valid completion of (2)),
they both sign (in either order) F.</li>
<li>Now both sides have a valid transaction set, starting with F. Either
or both can broadcast them. <u>The transactions are <em>all</em> guaranteed
to occur as long as at least one of them wants
it</u>. Contrariwise, <strong>none</strong> of
them is valid without F being broadcast.</li>
</ol>
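<p>The ordering invariant in steps 1-4 can be sketched as a toy simulation; the <code>Party</code> class and the string "signatures" are illustrative stand-ins, not real cryptography or Joinmarket code. The point it demonstrates: F is only co-signed after every PTG signature has been exchanged and verified.</p>

```python
# Toy model of steps 1-4: exchange and verify all PTG signatures first,
# and only then co-sign the funding transaction F.

class Party:
    def __init__(self, name):
        self.name = name

    def sign(self, tx):
        return f"sig({self.name},{tx})"  # stand-in for a real signature

    def verify(self, tx, sig, signer):
        return sig == f"sig({signer},{tx})"

def negotiate(alice, bob, funding_tx, ptg):
    # Step 2: signatures on every PTG transaction, in either order.
    for tx in ptg:
        sig_a, sig_b = alice.sign(tx), bob.sign(tx)
        if not (bob.verify(tx, sig_a, alice.name)
                and alice.verify(tx, sig_b, bob.name)):
            return None  # abort: F is never signed, nothing is broadcast
    # Step 3: only after (2) completes do both sign F.
    return (alice.sign(funding_tx), bob.sign(funding_tx))

result = negotiate(Party("A"), Party("B"), "F", ["TX1", "TX2", "TX3"])
```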
<p>This does achieve one significant thing: <strong>one transaction such as TX2
can transfer coins to, say, Bob's wallet, giving Alice nothing; and yet
we can still get the overall effect of a CoinJoin. In other words,
we've opened up the possibility to violate Heuristic 3 as well as
Heuristic 1, in the same (short) interaction.</strong></p>
<p>This construction works fine if <em>all</em> inputs used in transactions in the
PTG are descendants of F; but this makes the construction very limited.
So we'll immediately add more details to allow a more general use-case,
in the next section.</p>
<h2>Introducing Promises</h2>
<p>If we allowed any of the transactions (TX1, TX2, ...) in the PTG in our
previous example to have an input which did <em>not</em> come from the funding
transaction F, then we would have introduced a risk; if Alice added utxo
UA to, say, TX2, then, before Bob attempted to broadcast TX2, she could
double spend it. This would break the atomicity of the graph, which was
what allowed the crucial additional interesting feature (in bold,
above): that an individual transaction could transfer funds to one
party, without risks to the other. To address this problem, we call
these additional inputs <strong>promise utxos</strong> and make use of <strong>refund
transactions</strong>.</p>
<div class="highlight"><pre><span></span><span class="err">A 1btc ---></span>
<span class="err"> F (2,2,A,B) ---</span>
<span class="err">B 1btc ---> | +--> external payout 0.5 btc to Bob</span>
<span class="err"> | |</span>
<span class="err"> +->[TX1 --> TX2 --> TX3 --> TX4]</span>
<span class="err"> | ^</span>
<span class="err"> | |</span>
<span class="err"> | |</span>
<span class="err"> | +--- utxo A1</span>
<span class="err"> |</span>
<span class="err"> +--> refund locktime M, pay out *remaining* funds to A: 1btc, B: 0.5btc</span>
</pre></div>
<p>In words: if, between the negotiation time and the time of broadcast of
TX3, Alice spends A1 in some other transaction, Bob will still be safe;
after block M he can simply broadcast the presigned refund transaction
to claim the exact number of coins he is owed at that point in the
graph.</p>
<p>The above addresses the case of a single external input being included
in a chain of transactions in the PTG (here, TX1,2,3,4). Extending this,
and generalising to allowing external inputs in many transactions, is
straightforward; we can add such in-PTG backouts at every step,
redeeming all remaining funds to parties according to what they're
owed.</p>
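<p>The arithmetic of such a backout is simple bookkeeping: what each party is owed at a break point is their initial contribution minus whatever the already-broadcast transactions paid them externally. A toy sketch using the diagram's amounts (fees and promise utxos ignored for simplicity):</p>

```python
def refund_balances(contributions, payouts_per_tx, break_after):
    """Amounts a pre-signed refund must return if the PTG chain stops
    after `break_after` transactions: each party's contribution minus
    external payouts already received. Toy model; fees and promise
    utxos are ignored."""
    owed = dict(contributions)
    for payouts in payouts_per_tx[:break_after]:
        for party, amount in payouts.items():
            owed[party] -= amount
    return owed

# As in the diagram: both fund 1 btc, TX2 pays Bob 0.5 btc externally;
# a refund after TX2 must return 1 btc to Alice and 0.5 btc to Bob.
owed = refund_balances({"A": 1.0, "B": 1.0},
                       [{}, {"B": 0.5}, {}, {}], break_after=2)
```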
<p>To summarize this section and how it differs from the original, simpler
construction:</p>
<p>Alice and Bob have a choice:</p>
<ol>
<li>They can set up a fully trustless PTG, without promises. They are
then guaranteed to achieve "all or nothing": either all
cooperative signing works, then all transactions can be broadcast
(as long as <em>at least one</em> of them wants to), or nothing
(including F) is broadcast at all.</li>
<li>They can set up a PTG including promises from one or both parties.
Now they don't get "all or nothing" but only ensure that the
transactions that complete are a subset, in order, from the start F.
To achieve this they add presigned backouts at (probably every)
step, so that if the chain "breaks" somewhere along, they will
recover all the funds remaining that are owed to them.</li>
</ol>
<p>The tradeoff is: (2) is not perfectly atomic, but it allows the
transaction graph to include utxos from outside of F's ancestry,
particularly useful for privacy applications. In a sequence of 10
coinjoins, you may be happy to risk that TXs 6-10 don't end up
happening, if it doesn't cost you money. Case (2) is more likely to be
of interest.</p>
<h2>Interlude - overview of features of CoinJoinXT</h2>
<p>There's a large design space here.</p>
<ul>
<li>We can have N parties, not just 2.</li>
<li>We can have as many transactions as we like.</li>
<li>We can have a tree with F as root, rather than a chain.</li>
<li>We can have as many promise utxos from any of the N parties as we
like.</li>
</ul>
<p>A mixture of these features may give different tradeoffs in terms of
<em>intrinsic fungibility</em> vs <em>deniability</em> vs <em>cost</em>; the tradeoff
discussed in the introduction.</p>
<p><strong>Interactivity</strong> - unlike either a CoinSwap of types discussed earlier
in this blog, or doing multiple CoinJoins (to get a better fungibility
effect than just a single one), this only requires one "phase" of
interactivity (in terms of rounds, it may be 3). The two parties
connect, exchange data and signatures, and then immediately disconnect.
(This is what I called no-XBI in the previous <a href="https://web.archive.org/web/20200603010653/https://joinmarket.me/blog/blog/the-half-scriptless-swap/">blog
post</a>).</p>
<p><strong>Boundary</strong> - the adversary A, as was hinted at in the introduction, in
this model, will not necessarily be able to easily see on the blockchain
where the start and end points of this flow of transactions was. To the
extent that this is true, it's an enormous win, but more on this later.</p>
<h2>Example</h2>
<p><img alt="ExampleCJXT" src="../../../../../../20200603010653im_/https:/joinmarket.me/static/media/uploads/.thumbnails/onchaincontract3.png/onchaincontract3-614x422.png" width="614" height="422"></p>
<p>Here we are still restricting to 2 parties for simplicity of the
diagram. There is still a chain of 4 TXs, but here we flesh out the
inputs and outputs. About colors:</p>
<p>Blue txos are co-owned by the two parties, envisioned as 2 of 2 multisig
(although as originally mentioned, the technical requirement is only
that each transaction is signed by both parties).</p>
<p>Red inputs are <strong>promise utxos</strong> as described in the earlier section.</p>
<p>Each promise has a corresponding backout transaction, pre-signed,
consuming the bitcoins of the transaction
<u>previous</u> to the one
consuming that promise.</p>
<p>Notice that this example contains two possible setups for each
individual transaction in the chain; it can pay out only to one party
(like TX3 which pays bob 0.6btc), or it can pay "CoinJoin-style"
equal-sized outputs to 2 (or N) parties. Choosing this latter option
means you are consciously deciding to blur the line between the
<em>intrinsic-fungibility</em> model and the <em>deniability</em> <em>model,</em> which, by
the way, is not necessarily a bad idea.</p>
<h2>The return of A - amounts leak.</h2>
<p>As mentioned, our adversary A has a very important problem - he may not
know that the above negotiation has happened, unlike a simple CoinJoin
where the transactions are watermarked as such (and this is particularly
true if Alice and Bob do <em>not</em> use equal-sized outputs). The boundary
may be unclear to A.</p>
<p>So, what strategy <em>can</em> A use to find the transaction graph/set? He can
do <a href="https://en.wikipedia.org/wiki/Subset_sum_problem">subset
sum</a>
analysis.</p>
<p>If Alice and Bob are just 'mixing' coins, so that they are paid out
the same amount that they paid in, I'll assert that subset sum is
likely to work. It's true that A's job is quite hard, since in
general, he would have to do such subset-sum analysis on a huge array of
different possible sets of (inputs, outputs) on chain; but nevertheless
it's the kind of thing that can be done by a professional adversary,
over time. The fact that subset sum analysis is theoretically
exponential time and therefore not feasible for very large sets may not
be relevant in practice.</p>
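<p>A brute-force version of that analysis fits in a few lines; Alice's amounts are taken from the example below, while Bob's input and the fee tolerance are invented for illustration:</p>

```python
from itertools import combinations

def subset_sum_partitions(inputs, outputs, fee_tolerance=10_000):
    """Brute-force subset-sum analysis: find subsets of inputs whose
    total matches (within a fee tolerance) some subset of outputs.
    Amounts in satoshis. Exponential time, so only feasible for small
    sets, as noted in the text."""
    matches = []
    for r in range(1, len(inputs)):
        for in_set in combinations(range(len(inputs)), r):
            in_total = sum(inputs[i] for i in in_set)
            for s in range(1, len(outputs)):
                for out_set in combinations(range(len(outputs)), s):
                    out_total = sum(outputs[j] for j in out_set)
                    if abs(out_total - in_total) <= fee_tolerance:
                        matches.append((in_set, out_set))
    return matches

# Alice's inputs (1 btc, 0.3 btc) and outputs (0.8, 0.2, 0.3 btc); Bob's
# side (0.58 btc in, 0.5799 btc out) is invented. Indices (0,1) of the
# inputs match outputs (0,1,2), attributing them to one party:
found = subset_sum_partitions([100_000_000, 30_000_000, 58_000_000],
                              [80_000_000, 20_000_000, 30_000_000, 57_990_000])
```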
<p>In our example above it may not be hard to identify the two inputs from
Alice (1btc, 0.3btc) as corresponding to 3 outputs (0.8btc, 0.2btc,
0.3btc), albeit that the latter two - 0.2, 0.3 were part of CoinJoins.
Remember that this was a tradeoff - if we <em>didn't</em> make equal sized
outputs, to improve deniability/hiding, we'd no longer have any
ambiguity there.</p>
<h2>Breaking subset-sum with Lightning</h2>
<p><img alt="" src="../../../../../../20200603010653im_/https:/joinmarket.me/static/media/uploads/.thumbnails/amtdecorr2.png/amtdecorr2-711x392.png" width="711" height="392"></p>
<p>Here's one way of addressing the fact that A can do subset-sum on such
a privacy-enhancing CoinJoinXT instantiation. The PTG is unspecified but
you can imagine it as something similar to the previous example.</p>
<p>Marked in blue is what the adversary A doesn't know, even if he has
identified the specific transaction/graph set (as we've said, that in
itself is already hard). Subset-sum analysis won't work here to
identify which output belongs to Alice and which to Bob; since 5.5 + 1.5
!= 6.6, nor does 5.4 fit, nor does such an equation fit with Alice's
input 5.8 on the right hand side of the equation.</p>
<p>The trick is that the 1.5 output is actually a <strong>dual funded Lightning
channel</strong> between Alice and Bob. The actual channel balance is shown in
blue again because hidden from A: (0.3, 1.2). If the channel is then
immediately closed we have fallen back to a case where subset sum works,
as the reader can easily verify.</p>
<p>But if, as is usually the intent, the channel gets used, the balance
will shift over time, due to payments over HTLC hops to other
participants in the Lightning network. This will mean that the final
closing balance of the channel will be something else; for example,
(0.1, 1.4), and then subset-sum will still not reveal which of the 2
outputs (5.4, 5.5) belong to Alice or Bob.</p>
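<p>A small check makes the ambiguity concrete. Using the figure's amounts (fees ignored): with Alice's 5.8 btc input, plain outputs of 5.4 and 5.5, and a 1.5 btc channel whose internal split is invisible on-chain, <em>both</em> attributions of the plain outputs admit a valid channel balance, so subset-sum cannot decide between them:</p>

```python
from itertools import combinations

def feasible_attributions(in_alice, plain_outputs, channel_capacity):
    """For each way of assigning the plain (non-channel) outputs to
    Alice, check whether some channel balance in [0, capacity] makes her
    amounts add up (fees ignored). More than one feasible assignment
    means the analyst cannot resolve ownership. Amounts in btc, taken
    from the figure."""
    feasible = []
    idx = range(len(plain_outputs))
    for r in range(len(plain_outputs) + 1):
        for alice_set in combinations(idx, r):
            alice_plain = sum(plain_outputs[i] for i in alice_set)
            alice_channel = in_alice - alice_plain  # balance Alice needs
            if 0 <= alice_channel <= channel_capacity:
                feasible.append(alice_set)
    return feasible

# Both single-output assignments are feasible: channel balance 0.4 or 0.3.
options = feasible_attributions(5.8, [5.4, 5.5], 1.5)
```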
<p>At a high level, you can understand this as a <strong>bleed-through and
amplification of off-chain privacy to on-chain.</strong></p>
<p>It's worth noting that you clearly get a significant part of this
effect from just the dual-funded Lightning channel; if you consider
change outputs in such a single funding transaction, you see the same
effect:</p>
<div class="highlight"><pre><span></span><span class="err">Alice</span>
<span class="err">2.46</span>
<span class="err"> -> Lightning funding 0.1</span>
<span class="err"> -> Change 2.41</span>
<span class="err"> -> Change 2.37</span>
<span class="err">2.42</span>
<span class="err">Bob</span>
</pre></div>
<p>It's easy to see that there is no delinking effect on the change-outs
<em>if</em> we know that the funding is equal on both sides. However, there's
no need for that to be the case; if the initial channel balance is
(Alice: 0.09, Bob: 0.01) then the change-outs are going to the opposite
parties compared to if the channel funding is (Alice: 0.05, Bob: 0.05).
So this concrete example should help you to understand a crucial aspect
of this:</p>
<ul>
<li>Such a fungibility effect is only achieved if the difference between
the two parties' initial inputs is small enough compared to the
size of the dual-funded Lightning channel</li>
<li>If the size of the inputs is very large compared to the Lightning
channel overall size, which currently at maximum is 2**24 satoshis
(about 0.16btc), then, in order to achieve this obfuscation effect,
we "converge" to the case of something like a 2-in and 2-out
equal-sized coinjoin. It's hard for 2 parties to arrange to have
inputs of equal sizes, and it somewhat loses the deniability feature
we were going for. (You can easily confirm for yourself that there
will be no ambiguity if Alice and Bob's inputs are of completely
different sizes).</li>
</ul>
<p>So how does the picture change if instead of just doing a single
dual-funded Lightning channel, we include it as an output in a
CoinJoinXT structure?</p>
<p>The answer again is deniability. Any contiguous subset of the entire
blockchain has the property of sum preservation, modulo fees: the input
total is approximately the output total. So no particular contiguous subset on the
blockchain flags itself as being such a CoinJoinXT structure - unless
subset sum works for some N subsets (2, as in our examples, or higher).
But with the dual funded Lightning output of the type shown here, at
least for the 2 of 2 case, this doesn't work.</p>
<h2>Remove all traces?</h2>
<p>What's been described up to now doesn't quite achieve the desired goal
of "deniability"; there are still what we might call "fingerprints"
in such a CoinJoinXT structure:</p>
<ul>
<li>Timing correlation: if we don't use nLockTime on these
transactions, then one party might choose to broadcast them all at
once. This is at the least a big clue, although not unambiguous. To
avoid it, have the pre-signed transactions in the PTG all be given
specific timelocks.</li>
<li>Shared control utxos. If we use 2 of 2, or N of N, multisig outputs,
of the current normal p2sh type, then they are observable as such,
and this could easily help A to find the "skeleton" of such a
CoinJoinXT structure. Of course, let's not forget that we can do
CoinJoinXT with various equal sized outputs too, mixing the
"intrinsic fungibility" and "deniability" approaches together,
as discussed, so it's not that CoinJoinXT with p2sh multisig
connecting utxos is useless. But we may want to focus on less
detectable forms, like Schnorr/MuSig based multisig with key
aggregation so that N of N is indistinguishable from 1 of 1, or the
new
<a href="https://eprint.iacr.org/2018/472">construction</a>
that allows an ECDSA pubkey to be effectively a 2 of 2 multisig.</li>
</ul>
<h2>Conclusion</h2>
<p><strong>Proof of Concept</strong> - I put together some very simple <a href="https://github.com/AdamISZ/CoinJoinXT-POC">PoC
code</a>;
it only covers something like the above first "Example" with 2
parties. Going through such an exercise in practice at least allows one
to see concretely that the interaction between the parties is very
minimal (sub-second), which is great of course; but it gets a little
hairy when you think about how to set up a template of such a
transaction chain that 2 parties can agree on using whatever utxos they
have available as inputs. A substantial chunk of that PoC code was
devoted to that - there is a general <code>Template</code> class for specifying a
graph of transactions, with parametrized input/output sizes.</p>
<p><strong>Practicality today</strong> - Although it can be done today (see previous),
there are barriers to making this work well. Ideally we'd have Schnorr
key aggregation for multisig, and support for dual funded Lightning
channels for the amount decorrelation trick mentioned. Without either of
those, such a transaction graph on the blockchain will be <em>somewhat</em>
identifiable, but I still think there can be a lot of use doing it as an
alternative to large sets of clearly identifiable CoinJoins.</p>
<p><strong>Cost tradeoffs</strong> - left open here is the tradeoffs in terms of
blockchain space usage for each "unit of fungibility", i.e. how much
it costs to gain privacy/fungibility this way. I think it's almost
impossible to come up with definitive mathematical models of such
things, but my feeling is that, exactly to the extent any
"deniability" is achieved, it's cost-effective, and to the extent
it's not, it's not cost-effective.</p>
<p><strong>Coordination model</strong> - Currently we have "in play" at least two
models of coordination for CoinJoin - Joinmarket's market-based model,
and the Chaumian server model currently championed by
<a href="https://github.com/nopara73/ZeroLink">ZeroLink</a>.
<strong>CoinJoinXT as an idea is orthogonal to the coordination mechanism</strong>.
The only "non-orthogonal" aspect, perhaps, is that I think the
CoinJoinXT approach may still be pretty useful with only 2 parties (or
3), more so than CoinJoin with only 2/3.</p>
<p>Finally, where should this fit in one's fungibility "toolchest"?
Lightning is <em>hopefully</em> going to emerge as a principal way that people
gain fungibility for their everyday payments. The area it can't help
with now, and probably not in the future due to its properties, is with
larger amounts of money. So you might naturally want to ensure that in,
say, sending funds to an exchange, making a large-ish payment, or
perhaps funding a channel, you don't reveal the size of your cold
storage wallet. I would see the technique described on this blog post as
fitting into that medium-large sized funds transfer situation. CoinJoin
of the pure "intrinsic fungibility" type, done in repeated rounds or
at least in very large anonymity sets, is the other alternative (and
perhaps the best) for large sizes.</p>The Steganographic Principle2018-04-15T00:00:00+02:002018-04-15T00:00:00+02:00Adam Gibsontag:joinmarket.me,2018-04-15:/blog/blog/the-steganographic-principle/<p>a framework for thinking about blockchain privacy issues</p><h3>The steganographic principle</h3>
<h1>The Steganographic Principle</h1>
<p>Some time ago I wrote
<a href="https://gist.github.com/AdamISZ/83a17befd84992a7ad74">this</a>
gist, which is an ill-formed technical concept about a way you could do
steganography leveraging randomness in existing network protocols; but I
also called it a "manifesto", jokingly, because I realised the thinking
behind it is inherently political.</p>
<h2>Cryptography is for terrorists, too</h2>
<p>There are a few reasons why the phrase "If you have nothing to hide, you
have nothing to fear" is wrong and insidiously so. One of the main ones
is simply this: my threat model is <strong>not only my government</strong>, even if
my government is perfect and totally legitimate (to me). But no
government is perfect, and some of them are literally monstrous.</p>
<p>So while it's true that there are uses of cryptography harmonious with a
PG13 version of the world - simply protecting obviously sensitive data
<em>within</em> the control of authorities - there are plenty where it is
entirely ethically right and necessary to make that protection
<strong>absolute</strong>.</p>
<p>The question then arises, as was raised in the above gist, what are the
properties of algorithms that satisfy the requirement of defence even
against hostile authorities?</p>
<p>The modern tradition of cryptography uses Kerckhoffs's principle as one
of its axioms, and steganography does not fit into this model. But that's
because the tradition is built by people in industry who are fine with
people <strong>knowing they are using cryptography</strong>. In an environment where
that is not acceptable, steganography is not just one item on a list of
options - it's more like the sine qua non.</p>
<h2>Steganography on blockchains</h2>
<p>On a blockchain, we have already understood this "freedom fighter"
model. It's an essential part of how the thing was even created, and why
it exists. And there are essentially two principal complaints about
Bitcoin and its blockchain, both of which are somewhat related to this:</p>
<ul>
<li>Privacy</li>
<li>Scalability</li>
</ul>
<p>The first is obvious - if we don't create "steganographic" transactions,
then governments, and everyone else, may get to know at least
<em>something</em> about our transactions. The second is less so - but in the
absence of scale we have a small anonymity set. Smaller payment network
effects and smaller anonymity sets obviously hamper use of these systems
by a "freedom fighter". But remember the scale limitations come directly
out of the design of the system with censorship resistance and
independent verification in mind.</p>
<p>Attempts to improve privacy by altering the <em>way</em> in which
transactions are done have a tendency to make scalability worse -
the obvious example being CoinJoin, which with unblinded amounts
inevitably involves larger numbers of outputs, and even larger numbers
of transactions.</p>
<p>A less obvious example is Confidential Transactions; when we blind
outputs we need to use up more space to create the necessary guarantees
about the properties of the amounts - see the range proof, which with
Borromean ring signatures or bulletproofs needs a lot of extra space. The
same is true of ring signature approaches to confidentiality generally.</p>
<p>You can trade off space usage for computation though - e.g. zkSNARKs
which are quite compact in space but take a lot of CPU time to create
(and in a way they take a lot of space in a different sense - memory
usage for proof creation).</p>
<h2>Localised trust</h2>
<p>You can improve this situation by localising trust in space or time.
There are obvious models - the bank of the type set up by DigiCash. See
the concept of <a href="https://en.wikipedia.org/wiki/Blind_signature">Chaumian
tokens</a>
generally. One project that looked into creating such things was
<a href="https://github.com/Open-Transactions/">OpenTransactions</a>,
another was Loom, also see Truledger.</p>
<p>Trust can be localised in time as well - and the aforementioned zkSNARKs
are an example; they use a trusted setup as a bootstrap. This trust can
be ameliorated with a multiparty computation protocol such that trust is
reduced by requiring all participants to be corrupt for the final result
to be corrupt; but it is still trust.</p>
<h2>The tension between privacy and security</h2>
<p>For any attribute which is perfectly (or computationally) hidden, we
have a corresponding security downgrade. If attribute A is required to
satisfy condition C by the rules of protocol P, and attribute A is
blinded to A* by a privacy mechanism M, in such a way that we use the
fact that C* is guaranteed by A*, then we can say that P's security is
"downgraded" by M in the specific sense that the C-guarantee has been
changed to the C*-guarantee, where (inevitably) the C* guarantee is
not as strong, since it requires the soundness of M as well as whatever
assumptions already existed for the soundness of C.</p>
<p>However, the situation is worse - precisely because M is a privacy
mechanism, it reduces public verifiability, and specifically
verifiability of the condition C, meaning that if the C* guarantee
(which we <em>can</em> publicly verify) fails to provide C, there will be no
public knowledge of that failure.</p>
<p>To give a concrete example of the above template, consider what happens
to Bitcoin under Confidential Transactions with Pedersen commitments
(set aside the range proof for a moment). Since Pedersen commitments are
perfectly hiding but only computationally binding, we have:</p>
<p>P = Bitcoin</p>
<p>A = Bitcoin amounts of outputs</p>
<p>C = amount balance in transactions</p>
<p>M = CT with Pedersen commitments</p>
<p>A* = Pedersen commitments of outputs</p>
<p>C* = Pedersen commitment balance in transactions</p>
<p>Here the downgrade in security is specifically the computational binding
of Pedersen commitments (note: that's assuming both ECDLP intractability
<em>and</em> NUMS-ness of a curve point). Without Pedersen/CT, there are
<em>no</em> assumptions about amount balance, since integers are "perfectly
binding" :) With it, any failure of the computational binding is
catastrophic, since we won't see it.</p>
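<p>To make this tangible, here is a minimal Python sketch, written over a tiny multiplicative group rather than an elliptic curve (exponentiation plays the role of scalar multiplication). All parameters are invented toy values with zero real security, chosen only so the arithmetic is checkable: it verifies the homomorphic balance property, then shows how knowledge of the discrete log between the two generators (i.e. a failure of NUMS-ness) breaks binding silently:</p>

```python
# Tiny multiplicative group: order-q subgroup of Z_p^*, p = 2q + 1.
q = 83; p = 167; g = 4    # toy parameters, zero real security
k = 13                    # discrete log of h w.r.t. g; in real CT, h is
h = pow(g, k, p)          # chosen NUMS so that nobody can know k

def commit(v, r):
    # Pedersen commitment, additively vG + rH; multiplicatively g^v * h^r
    return (pow(g, v, p) * pow(h, r, p)) % p

# Homomorphic: the product of commitments commits to the sum of values,
# which is what lets a verifier check amount balance without seeing amounts.
assert (commit(5, 7) * commit(11, 2)) % p == commit(5 + 11, 7 + 2)

# Binding is only computational: anyone who knows k can open the same
# commitment to a different value, and nobody watching can tell.
v, r = 5, 7
v2 = 20
r2 = (r + (v - v2) * pow(k, -1, q)) % q   # compensating blinder
assert commit(v, r) == commit(v2, r2)     # two valid openings: silent break
```

<p>The final assertion is exactly the "catastrophic, since we won't see it" failure: both openings pass public verification.</p>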
<h2>The tension between privacy and scalability</h2>
<p>For any attribute A which is obfuscated by a privacy mechanism M in
protocol P (note: I'm choosing the word "obfuscation" here to indicate
that the hiding is not perfect - note the contrast with the previous
section), we have a corresponding scalability failure. M may obfuscate
an attribute A by expanding the set of possible values/states from A to
A[N]. To commit to the obfuscation soundly it must publish data of
order ~N x size(A). Also note that it is <em>possible</em> for the
obfuscation goal to be achieved without an increase in space usage, if
multiple parties can coordinate their transactions, but here we ignore
this possibility because it requires all parties to agree that all
attributes except A are identical (example: multiple participants must
accept that their newly created outputs are of equal value). This is not
really a "transaction" in the normal sense.</p>
<p>A concrete example: equal-sized Coinjoin in Bitcoin:</p>
<p>P = Bitcoin</p>
<p>A = receiver of funds in a transaction</p>
<p>A[N] = set of N outputs of equal size</p>
<p>M = Coinjoin</p>
<p>A less obvious example, but one fitting the same pattern: ElGamal
commitment based Confidential Transactions (as opposed to Pedersen
commitment based).</p>
<p>P = Bitcoin</p>
<p>A = output amount in a transaction</p>
<p>A[N] = ElGamal commitment to amount, here 2 curve points, N=2</p>
<p>M = ElGamal commitments</p>
<p>Here N=2 requires some explaining. An ElGamal commitment is perfectly
binding, and to achieve that goal the commitment must have 2 points, as
the input has two values (scalars), one for blinding and the other for
binding the amount. So in this case the expansion is from a single
bitcoin-encoded integer to two curve points. The details obviously vary;
the general concept is that, to whatever extent we obfuscate without
throwing in extra security assumptions, we require more data.</p>
<h2>Verification - public or private?</h2>
<p>The structure above is trying to make an argument, which I believe is
pretty strong - that this represents searching for privacy, in a
blockchain context, in slightly the wrong way.</p>
<p>If we try to make the <em>blockchain</em> itself private, we are slightly
pushing against its inherent nature. Its crucial feature is
<strong>public verifiability</strong>, and
while it's true that this does not require all attributes and properties
to be "unblinded" or "unobfuscated", we see above that introducing
blinding or obfuscation is problematic; you either degrade security in a
way that's not acceptable because it introduces invisible breaks, or you
degrade scalability (such as using a perfectly binding commitment
requiring no compression, or a zero knowledge proof taking up a lot of
space or computation time), or you degrade trustlessness (see: trusted
setup zkps). I have no absolute theorem that says that you cannot get
rid of all of these problems simultaneously; but it certainly seems
hard!</p>
<p>This is where the idea of a "steganographic blockchain" comes in:
instead of trying to hide attributes of transactions, we make the
<em>meaning</em> of transactions something not explicit to the chain, but
agreed upon by arbitrary participants using mechanisms outside it. This
allows one to leverage the blockchain's principal feature - censorship
resistant proof of state changes, in public, without inheriting its main
bugs - lack of privacy and scalability, and without degrading its own
security.</p>
<p>Examples:</p>
<ul>
<li>Colored coins</li>
<li>Crude example: atomic swaps</li>
<li>Lightning and second-layer</li>
<li>Chaumian tokens</li>
<li>Client-side validation (single use seals)</li>
<li>Scriptless scripts</li>
</ul>
<h2>High bandwidth steganography</h2>
<p>The biggest practical problem with steganography has always been
bandwidth; if you use non-random data such as images or videos, which
are often using compression algorithms to maximise their signal to noise
ratio, you have the problem of getting sufficient "cover traffic" over
your hidden message.</p>
<p>Note that this problem does not occur <strong>at all</strong> in cases where your
hidden message is embedded into another message which is random. This is
the case with digital signatures; ECDSA and Schnorr signatures, for
example, are both published as two random values, each of which is about
32 bytes.</p>
<p>To go back to the previously mentioned example of scriptless scripts, we
can see that the atomic swap protocol based on it as described in my
<a href="https://web.archive.org/web/20200603112526/https://joinmarket.me/blog/blog/flipping-the-scriptless-script-on-schnorr/">blog
post</a>,
exploits this directly. On chain we see two (not obviously related)
transactions with Schnorr signatures that are, to the outside observer,
in no way related; the hiding of the connection is perfect, but the
binding/atomicity of the two payments is still secure, just not
perfectly so (it's based on the ECDLP hardness assumption, but then so
are ordinary payments).</p>
<p>Note how this is a different philosophy/approach to hiding/privacy:
since such a swap leaves no fingerprint on-chain, the concept of
anonymity set blurs; it's strictly all transactions (assuming Schnorr in
future, or ECDSA-2PC now), even if most people do not use the technique.
To get that same effect with an enforced privacy overlay mechanism M for
all participants, we would incur the security or scalability tradeoffs
mentioned above.</p>
<p>This is the reason for my slightly click-bait-y subtitle "High
Bandwidth Steganography". A big chunk of the Bitcoin blockchain is
random (as those who've tried to compress it have learned to their
chagrin), and so it's not quite as hard as usual to hide transaction
semantics (the ideal case will be inside signatures, using scriptless
script type constructs), so in a sense we can get a very high bandwidth
of data communicated client to client without using any extra space on
chain, and without "polluting" the chain with extra security
assumptions.</p>Flipping the scriptless script on Schnorr2018-03-15T00:00:00+01:002018-03-15T00:00:00+01:00Adam Gibsontag:joinmarket.me,2018-03-15:/blog/blog/flipping-the-scriptless-script-on-schnorr/<p>using scriptless scripts for atomic swaps</p><h3>Flipping the scriptless script on Schnorr</h3>
<h2>Outline</h2>
<p>It's by now very well known in the community of Bitcoin enthusiasts that
the <a href="https://en.wikipedia.org/wiki/Schnorr_signature">Schnorr
signature</a>
may have great significance; and "everyone knows" that its significance
is that it will enable signatures to be aggregated, which could be
<strong>great</strong> for scalability, and nice for privacy too. This has been
elucidated quite nicely in a Bitcoin Core <a href="https://bitcoincore.org/en/2017/03/23/schnorr-signature-aggregation/">blog
post</a>.</p>
<p>This is very true.</p>
<p>There are more fundamental reasons to like Schnorr too; it can be shown
with a simple proof that Schnorr signatures are secure if the elliptic
curve crypto that prevents someone stealing your coins (basically the
"Elliptic Curve Discrete Logarithm Problem" or ECDLP for short) is
secure, and assuming the hash function you're using is secure (see <a href="https://blog.cryptographyengineering.com/2011/09/29/what-is-random-oracle-model-and-why-3/">this
deep dive into the random oracle
model</a>
if you're interested in such things). ECDSA doesn't have the same level
of mathematical surety.</p>
<p>Perhaps most importantly of all Schnorr signatures are <strong>linear</strong> in the
keys you're using (while ECDSA is not).</p>
<p>Which brings me to my lame pun-title : another way that Schnorr
signatures may matter is to do with, in a sense, the <strong>opposite</strong> of
Schnorr aggregation - Schnorr subtraction. The rest of this very long
blog post is intended to lead you through the steps to showing how
clever use of signature subtraction can lead to <span
style="text-decoration: underline;">one</span> very excellent outcome
(there are others!) - a private Coinswap that's simpler and better than
the private Coinswap outlined in my <a href="https://web.archive.org/web/20200506162002/https://joinmarket.me/blog/blog/coinswaps">previous blog
post</a>.</p>
<p>The ideas being laid out in the rest of this post are an attempt to
concretize work that, as far as I know, is primarily that of Andrew
Poelstra, who has coined the term "<strong>scriptless scripts</strong>" to describe a
whole set of applications, usually but not exclusively leveraging the
linearity of Schnorr signatures to achieve goals that otherwise are not
possible without a system like Bitcoin's
<a href="https://en.bitcoin.it/wiki/Script">Script</a>.
This was partly motivated by Mimblewimble (another separate, huge
topic), but it certainly isn't limited to that. The broad overview of
these ideas can be found in these
<a href="https://download.wpsoftware.net/bitcoin/wizardry/mw-slides/2017-05-milan-meetup/slides.pdf">slides</a>
from Poelstra's Milan presentation last May.</p>
<p>So what follows is a series of constructions, starting with Schnorr
itself, that will (hopefully) achieve a goal: an on-chain atomic
coinswap where the swap of a secret occurs, on chain, inside the
signatures - but the secret remains entirely invisible to outside
observers; only the two parties can see it.</p>
<p>If you and I agree between ourselves that the number to subtract is 7,
you can publish "100" on the blockchain and nobody except me will know
that our secret is "93". Something similar (but more powerful) is
happening here; remember signatures are actually just numbers; the
reason it's "more powerful" is that we can enforce the revealing of the
secret by the other party if the signature is valid, and coins
successfully spent.</p>
<p>Before we therefore dive into how it works, I wanted to mention why this
idea struck me as so important; after talking to Andrew and seeing the
slides and talk referenced above, I
<a href="https://twitter.com/waxwing__/status/862724170802761728">tweeted</a>
about it:</p>
<p><strong>If we can take the <em>semantics</em> of transactions off-chain in this kind
of way, it will more and more improve what Bitcoin (or any other
blockchain) can do - we can transact securely without exposing our
contracts to the world, and we can reduce blockchain bloat by using
secrets embedded in data that is already present. The long term vision
would be to allow the blockchain itself to be a *very* lean contract
enforcement mechanism, with all the "rich statefulness" .. client-side
;)</strong></p>
<h4>Preliminaries: the Schnorr signature itself</h4>
<p><em>(Notation: We'll use <code>||</code> for concatenation and capitals for elliptic
curve points and lower case letters for scalars.)</em></p>
<p>If you want to understand the construction of a Schnorr signature well,
I can recommend Oleg Andreev's compact and clear
<a href="http://blog.oleganza.com/post/162861219668/eli5-how-digital-signatures-actually-work">description</a>
; also nice is Section 1 in the Maxwell/Poelstra Borromean Ring
Signatures
<a href="https://github.com/Blockstream/borromean_paper">paper</a>,
although there are of course tons of other descriptions out there. We'll
write it in basic form as:</p>
<div class="highlight"><pre><span></span><span class="err">s = r + e * x</span>
<span class="err">e = H(P||R||m)</span>
</pre></div>
<p>(Note: we can hash, as the "challenge" a la <a href="https://en.wikipedia.org/wiki/Proof_of_knowledge#Sigma_protocols">sigma
protocol</a>,
just <code>R||m</code> in some cases, and more complex things than just <code>P||R||m</code>,
too; this is just the most fundamental case, fixing the signature to a
specific pubkey. The nonce point <code>R</code> is always required.)</p>
<p>For clarity, in the above, <code>x</code> is the private key, <code>m</code> is the message,
<code>r</code> is the "nonce" and <code>s</code> is the signature. The signature is published
as either <code>(s, R)</code> or <code>(s, e)</code>, the former will be used here if
necessary.</p>
<p>Apologies if people are more used to <code>s = r - ex</code>, for some reason it's
always <code>+</code> to me!</p>
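<p>As a concrete illustration of <code>s = r + e * x</code>, here is a minimal Python sketch over a tiny multiplicative group rather than an elliptic curve (exponentiation plays the role of scalar multiplication). The parameters are invented toy values with zero real security, chosen only so the arithmetic is easy to check:</p>

```python
import hashlib

# Toy multiplicative group: order-q subgroup of Z_p^*, p = 2q + 1.
q = 83; p = 167; g = 4       # tiny parameters, zero real security

def H(*parts):
    # "challenge" hash e = H(P || R || m), reduced into the scalar field
    data = b"".join(v.to_bytes(4, "big") for v in parts)
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % q

def sign(x, m, r):
    P, R = pow(g, x, p), pow(g, r, p)
    e = H(P, R, m)
    return (r + e * x) % q, R          # publish (s, R)

def verify(P, m, s, R):
    # s*G ?= R + e*P, written multiplicatively: g^s ?= R * P^e
    e = H(P, R, m)
    return pow(g, s, p) == (R * pow(P, e, p)) % p

x = 17                      # private key
P = pow(g, x, p)            # public key P = xG
s, R = sign(x, m=42, r=29)  # nonce r must be fresh and random in practice
assert verify(P, 42, s, R)
```

<p>A different message (or key) would, with overwhelming probability, change <code>e</code> and fail the check.</p>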
<p>Note the linearity, in hand-wavy terms we can say:</p>
<div class="highlight"><pre><span></span><span class="err">s_1 = r_1 + e * x_1</span>
<span class="err">s_2 = r_2 + e * x_2</span>
<span class="err">e = H(P_1 + P_2 || R_1 + R_2 || m)</span>
<span class="err">=></span>
<span class="err">s_1 + s_2 is a valid signature for public key (P_1 + P_2) on m.</span>
</pre></div>
<p>But this is <strong>NOT</strong> a useable construction as-is: we'll discuss how
aggregation of signatures is achieved properly later, briefly.</p>
<h4>Construction of an "adaptor" signature</h4>
<p>This is the particular aspect of Poelstra's "scriptless script" concept
that gets us started leveraging the Schnorr signature's linearity to do
fun things. In words, an "adaptor signature" is a not a full, valid
signature on a message with your key, but functions as a kind of
"promise" that a signature you agree to publish will reveal a secret, or
equivalently, allows creation of a valid signature on your key for
anyone possessing that secret.</p>
<p>Since this is the core idea, it's worth taking a step back here to see
how the idea arises: you want to do a similar trick to what's already
been done in atomic swaps: to enforce the atomicity of (spending a coin:
revealing a secret); but without Script, you can't just appeal to
something like <code>OP_HASH160</code>; if you're stuck in ECC land, all you have
is scalar multiplication of elliptic curve points; but luckily that
function operates similar to a hash function in being one-way; so you
simply share an elliptic curve point (in this case it will be <code>T</code>), and
the secret will be its corresponding private key. The beatiful thing is,
it <em>is</em> possible to achieve that goal directly in the ECC Schnorr
signing operation.</p>
<p>Here's how Alice would give such an adaptor signature to Bob:</p>
<p>Alice (<code>P = xG</code>) constructs for Bob:</p>
<ul>
<li>Calculate <code>T = tG</code>, <code>R = rG</code></li>
<li>Calculate <code>s = r + t + H(P || R+T || m) * x</code></li>
<li>Publish (to Bob, others): <code>(s', R, T)</code> with <code>s' = s - t</code> (so <code>s'</code>
is the "adaptor signature"; this notation is retained for the
rest of the document).</li>
</ul>
<p>Bob can verify the adaptor sig <code>s'</code> for <code>T,m</code>:</p>
<div class="highlight"><pre><span></span><span class="err">s' * G ?= R + H(P || R+T || m) * P</span>
</pre></div>
<p>This is not a valid sig: the hashed nonce point is <code>R+T</code>, not <code>R</code>.</p>
<p>Bob cannot retrieve a valid sig: recovering <code>s'+t</code> requires solving
the ECDLP.</p>
<p>After validation of adaptor sig by Bob, though, he knows:</p>
<p>Receipt of <code>t</code> <=> receipt of valid sig <code>s = s' + t</code></p>
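<p>The same toy-group sketch as before (invented parameters, zero real security) lets us run Alice's adaptor construction end to end and check both directions of the equivalence:</p>

```python
import hashlib

q = 83; p = 167; g = 4   # toy group, zero real security

def H(*parts):
    data = b"".join(v.to_bytes(4, "big") for v in parts)
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % q

x, r, t, m = 17, 29, 9, 42
P, R, T = pow(g, x, p), pow(g, r, p), pow(g, t, p)

# Alice: s = r + t + H(P || R+T || m) * x, then hands over s' = s - t
e = H(P, (R * T) % p, m)       # R+T written multiplicatively as R*T
s = (r + t + e * x) % q
s_adaptor = (s - t) % q

# Bob verifies the adaptor sig: s'*G ?= R + e*P
assert pow(g, s_adaptor, p) == (R * pow(P, e, p)) % p

# Receipt of t <=> receipt of a valid sig with nonce point R+T:
s_full = (s_adaptor + t) % q
assert pow(g, s_full, p) == (((R * T) % p) * pow(P, e, p)) % p
# ...and seeing s_full published reveals t to the adaptor holder:
assert (s_full - s_adaptor) % q == t
```
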
<h4>Deniability:</h4>
<p>This is a way of concretizing the concept that all of this will be
indistinguishable to an observer of the blockchain, that is to say, an
observer only of the final fully valid signatures:</p>
<p>Given any <code>(s, R)</code> on chain, create <code>(t, T)</code>, and assert that the
adaptor signature was: <code>s' = s - t</code>, with <code>R' = R - T</code>, so adaptor
verify eqn was: <code>s'G = R' + H(P || R'+T || m)P</code></p>
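<p>This deniability claim can also be checked mechanically in the toy group (invented parameters, no security): starting from an ordinary signature, anyone can retro-fit a fake adaptor transcript that satisfies the verification equation:</p>

```python
import hashlib

q = 83; p = 167; g = 4   # toy group, zero real security

def H(*parts):
    data = b"".join(v.to_bytes(4, "big") for v in parts)
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % q

# An ordinary signature (s, R) on m under key P, as seen on-chain:
x, r, m = 17, 29, 42
P, R = pow(g, x, p), pow(g, r, p)
e = H(P, R, m)
s = (r + e * x) % q

# After the fact, anyone can invent (t, T) and claim the adaptor sig was
# s' = s - t with nonce point R' = R - T; since R' + T == R, the same
# challenge hash comes out, and the adaptor verification equation holds:
t = 33
T = pow(g, t, p)
R2 = (R * pow(T, -1, p)) % p     # R' = R - T, multiplicatively R * T^-1
s2 = (s - t) % q                 # the claimed adaptor sig s'
assert H(P, (R2 * T) % p, m) == e
assert pow(g, s2, p) == (R2 * pow(P, e, p)) % p
```
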
<h4>Moving to the 2-of-2 case, with Schnorr</h4>
<p>For the remainder, we're considering the matter of signing off
transactions from outpoints jointly owned (2 of 2) by Alice and Bob.</p>
<p>Start by assuming Alice has keypair <code>(x_A, P_A)</code>, and Bob <code>(x_B, P_B)</code>.
Each chooses a random nonce <code>r_A</code>, <code>r_B</code>, and they exchange the curve
points (<code>P_A, R_A, P_B, R_B</code>) with each other to create a
scriptPubKey/destination address.</p>
<h4>2-of-2 Schnorr without adaptor sig</h4>
<p>To avoid related-key attacks (if you don't know what that means see e.g.
the "Cancelation" section in
<a href="https://diyhpl.us/wiki/transcripts/scalingbitcoin/milan/schnorr-signatures/">https://diyhpl.us/wiki/transcripts/scalingbitcoin/milan/schnorr-signatures/</a>),
the "hash challenge" is made more complex here, as was noted in the
first section on Schnorr signatures. The two parties Alice and Bob,
starting with pubkeys <code>P_A</code>, <code>P_B</code>, construct for themselves a "joint
key" thusly:</p>
<div class="highlight"><pre><span></span><span class="err">P_A' = H(H(P_A||P_B) || P_A) * P_A ,</span>
<span class="err">P_B' = H(H(P_A||P_B) || P_B) * P_B ,</span>
<span class="err">joint_key = P_A' + P_B'</span>
</pre></div>
<p>Note that Alice possesses the private key for <code>P_A'</code> (it's
<code>H(H(P_A||P_B) || P_A) * x_A</code>, we call it <code>x_A'</code> for brevity), and
likewise does Bob. From now on, we'll call this "joint_key" <code>J(A, B)</code>
to save space.</p>
<p>Common hash challenge:</p>
<div class="highlight"><pre><span></span><span class="err">H(J(A, B) || R_A + R_B || m) = e</span>
<span class="err">s_agg = r_A + r_B + e(x_A' + x_B')</span>
<span class="err">-> s_agg * G = R_A + R_B + e * J(A, B)</span>
</pre></div>
<p>Alice's sig: <code>s_A = r_A + e * x_A'</code>, Bob's sig: <code>s_B = r_B + e * x_B'</code>
and of course: <code>s_agg = s_A + s_B</code>.</p>
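<p>The joint-key construction and aggregate verification can be sketched in the same toy group as earlier (invented parameters, zero real security; a real implementation would use secp256k1 and a vetted MuSig-style scheme):</p>

```python
import hashlib

q = 83; p = 167; g = 4   # toy group, zero real security

def H(*parts):
    data = b"".join(v.to_bytes(4, "big") for v in parts)
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % q

xA, xB, rA, rB, m = 17, 23, 29, 31, 42
PA, PB = pow(g, xA, p), pow(g, xB, p)
RA, RB = pow(g, rA, p), pow(g, rB, p)

# Tweaked keys P_A', P_B' and the joint key, exactly as in the text:
ell = H(PA, PB)                            # H(P_A || P_B)
cA, cB = H(ell, PA), H(ell, PB)
xA2, xB2 = (cA * xA) % q, (cB * xB) % q    # x_A', x_B'
J = (pow(PA, cA, p) * pow(PB, cB, p)) % p  # joint_key = P_A' + P_B'

e = H(J, (RA * RB) % p, m)   # e = H(J(A,B) || R_A + R_B || m)
sA = (rA + e * xA2) % q      # Alice's partial signature
sB = (rB + e * xB2) % q      # Bob's partial signature
s_agg = (sA + sB) % q

# s_agg * G ?= R_A + R_B + e * J(A, B)
assert pow(g, s_agg, p) == (((RA * RB) % p) * pow(J, e, p)) % p
```
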
<p>There is, as I understand it, more to say on this topic, see
e.g.<a href="http://diyhpl.us/wiki/transcripts/bitcoin-core-dev-tech/2017-09-06-signature-aggregation/">here</a>,
but it's outside my zone of knowledge, and is somewhat orthogonal to the
topic here.</p>
<h4>2-of-2 with adaptor sig</h4>
<p>Now suppose Bob chooses <code>t</code> s.t. <code>T = t * G</code>, and Bob is going to
provide an adaptor signature for his half of the 2-of-2.</p>
<p>Then:</p>
<ol>
<li>Alice, Bob share <code>P_A, P_B, R_A, R_B</code> as above; Bob gives <code>T</code> to
Alice</li>
<li>Alice and Bob therefore agree on
<code>e = H(J(A, B) || R_A + R_B + T || m)</code> (note difference, <code>T</code>)</li>
<li>Bob provides adaptor <code>s' = r_B + e * x_B'</code> (as in previous section,
not a valid signature, but verifiable)</li>
<li>Alice verifies: <code>s' * G ?= R_B + e * P_B'</code></li>
<li>If OK, Alice sends to Bob her sig: <code>s_A = r_A + e * x_A'</code></li>
<li>Bob completes, atomically releasing <code>t</code>: first, construct
<code>s_B = r_B + t + e * x_B'</code>, then combine: <code>s_agg = s_A + s_B</code> and
broadcast, then Alice sees <code>s_agg</code></li>
<li>Alice subtracts:
<code>s_agg - s_A - s' = (r_B + t + e * x_B') - (r_B + e * x_B') = t</code></li>
</ol>
<p>Thus the desired property is achieved: <code>t</code> is revealed by a validating
"completion" of the adaptor signature.</p>
<p><strong>Note</strong>, however that this has no timing control, Bob can jam the
protocol indefinitely at step 6, forcing Alice to wait (assuming that
what we're signing here is a transaction out of a shared-control
outpoint); this is addressed in the fleshed out protocol in the next
section, though.</p>
<p>For the remainder, we'll call the above 7 steps the 22AS protocol, so
<code>22AS(Bob, t, Alice)</code> for Bob, secret <code>t</code>, and Alice. Bob is listed first
because he holds <code>t</code>.</p>
<p>Since this is the most important part of the construction, we'll
summarize it with a schematic diagram:</p>
<p><img src="/web/20200506162002im_/https://joinmarket.me/static/media/uploads/.thumbnails/22AS.jpg/22AS-1056x816.jpg" width="1056" height="816" alt="22AS protocol" /></p>
<p>So this <code>22AS</code> was a protocol to swap a coin for a secret, to do atomic
swaps we need to extend it slightly: have two transactions atomic via
the same secret <code>t</code>.</p>
<h3>The Atomic Swap construct, using 2-of-2 schnorr + adaptor signatures</h3>
<p>This is now <em>fairly</em> straightforward, inheriting the main design from
the existing "atomic swap" protocol.</p>
<p>A. Alice and Bob agree on a pair of scriptPubkeys which are based on 2
of 2 pubkeys using Schnorr, let's name them using <code>D</code> for destination
address (<code>A</code> is taken by Alice): <code>D_1</code> being 2-2 on (<code>P_A1</code>, <code>P_B1</code>) and
<code>D_2</code> being 2-2 on (<code>P_A2</code>, <code>P_B2</code>). Note that these pubkeys, and
therefore destination addresses, are not dependent in any way on
"adaptor" feature (which is a property only of nonces/sigs, not keys).</p>
<p>B. Alice prepares a transaction TX1 paying 1 coin into <code>D_1</code>, shares
txid_1, and requires backout transaction signature from Bob. Backout
transaction pays from txid_1 to Alice's destination but has locktime
<code>L1</code>.</p>
<p>C. Bob does the (nearly) exact mirror image of the above: prepares TX2
paying 1 coin into <code>D_2</code>, shares txid_2, requires backout transaction
signature from Alice. Backout transaction pays from txid_2 to Bob's
destination with locktime <code>L2</code> which is <em>significantly later</em> than <code>L1</code>.</p>
<p>D. Then Alice and Bob broadcast TX1 and TX2 respectively and both sides
wait until both confirmed. If one party fails to broadcast, the other
uses their backout to refund.</p>
<p>E. If both txs confirmed (N blocks), Alice and Bob follow steps 1-4 of
<code>22AS(Bob, t, Alice)</code> (described in previous section) for some <code>t</code>, for
both the scriptPubkeys <code>D_1</code> and <code>D_2</code>, in parallel, but with the same
secret <code>t</code> in each case (a fact which Alice verifies by ensuring use of
same <code>T</code> in both cases). For the first (<code>D_1</code>) case, they are signing a
transaction spending 1 coin to Bob. For the second, <code>D_2</code>, they are
signing a transaction spending 1 coin to Alice. Note that at the end of
these steps Alice will possess a verified adaptor sig <code>s'</code> for <em>both</em> of
the spend-outs from <code>D_1, D_2</code>.</p>
<p>E(a). Any communication or verification failure in those 1-4 steps (x2),
both sides must fall back to timelocked refunds.</p>
<p>F. The parties then complete (steps 5-7) the first <code>22AS(Bob, t, Alice)</code>
for the first transaction TX1, spending to <code>D_1</code> to give Bob 1 coin.
Alice receives <code>t</code> as per step 7.</p>
<p>F(a). As was mentioned in the previous section, Bob can jam the above
protocol at step 6: if he does, Alice can extract her coins from her
timelocked refund from <code>D_1</code> in the period between <code>L1</code> and <code>L2</code>. The
fact that <code>L2</code> is (significantly) later is what prevents Bob from
backing out his own spend into <code>D_2</code> <em>and</em> claiming Alice's coins from
<code>D_1</code> using the signature provided in step 5. (Note this time asymmetry
is common to all atomic swap variants).</p>
<p>G. (Optionally Bob may transmit <code>t</code> directly over the private channel,
else Alice has to read it from the blockchain (as per above <code>22AS</code>
protocol) when Bob publishes his spend out of <code>D_1</code>).</p>
<p>H. Alice can now complete the equivalent of steps 5-7 without Bob's
involvement for the second parallel run for <code>D_2</code>: she has <code>t</code>, and adds
it to the already provided <code>s'</code> adaptor sig for the transaction paying
her 1 coin from <code>D_2</code> as per first 4 steps. This <code>s' + t</code> is guaranteed
to be a valid <code>s_B</code>, so she adds it to her own <code>s_A</code> to get a valid
<code>s_agg</code> for this spend to her of 1 coin, and broadcasts.</p>
<h2>Summing up</h2>
<h4>Privacy implications</h4>
<p>In absence of backouts being published (i.e. in cooperative case), these
scriptPubkeys will be the same as any other Schnorr type ones (N of N
multisig will not be distinguishable from 1 of 1). The signatures will
not reveal anything about the shared secret <code>t</code>, or the protocol carried
out, so the 2 transaction pairs (pay-in to <code>D_1,D_2</code>, pay out from same)
will not be tied together in that regard.</p>
<p>This construction, then, will (at least attempt to) gain the anonymity
set of all Schnorr sig based transactions. The nice thing about
Schnorr's aggregation win is that, perhaps even more than with segwit,
the economic incentive to use it will be strong due to the size compaction,
so this anonymity set should be big (although this is all a bit pie in
the sky for now; we're a way off from it being concrete).</p>
<p>The issue of amount correlation, however, has <strong>not</strong> been in any way
addressed by this, of course. It's a sidebar, but one interesting idea
about amount correlation breaking was brought up by Chris Belcher
<a href="https://github.com/AdamISZ/CoinSwapCS/issues/47">here</a>
; this may be a fruitful avenue whatever the flavour of Coinswap we're
discussing.</p>
<h4>Comparison with other swaps</h4>
<p>Since we've now, in this blog post and the previous, seen 3 distinct
ways to do an atomic coin swap, the reader is forgiven for being
confused. This table summarizes the 3 different cases:</p>
<table>
<thead>
<tr>
<th><strong>Type</strong></th>
<th><strong>Privacy on-chain</strong></th>
<th><strong>Separate "backout/refund" transactions for non-cooperation</strong></th>
<th><strong>Requires segwit</strong></th>
<th><strong>Requires Schnorr</strong></th>
<th><strong>Number of transactions in cooperative case</strong></th>
<th><strong>Number of transactions in non-cooperative case</strong></th>
<th><strong>Space on chain</strong></th>
</tr>
</thead>
<tbody>
<tr>
<td>Atomic swap</td>
<td>None; trivially linkable</td>
<td>None; backout is directly in script</td>
<td>No</td>
<td>No</td>
<td>2 + 2</td>
<td>2 + 2</td>
<td>Medium</td>
</tr>
<tr>
<td>CoinSwap</td>
<td>Anonymity set: 2 of 2 transactions (+2 of 3 depending on setup)</td>
<td>Presigned backouts using H(X) and CLTV, break privacy if used</td>
<td>Yes</td>
<td>No</td>
<td>2 + 2</td>
<td>3 + 3</td>
<td>Large-ish</td>
</tr>
<tr>
<td>Scriptless script</td>
<td>Anonymity set: all Schnorr transactions</td>
<td>Presigned backouts using locktime; semi-break privacy (other txs may use locktime)</td>
<td>Yes</td>
<td>Yes</td>
<td>2 + 2</td>
<td>2 + 2</td>
<td>Small</td>
</tr>
</tbody>
</table>
<p>The reason that there are "3 + 3" transactions in the non-cooperative
case for CoinSwap is that, in that case, both sides pay into a 2-of-2, then
in non-cooperation, they must both spend into the custom "HTLC" (IF
hash, pub, ELSE CLTV, pub), and then redeem <em>out</em> of it.</p>
<p>A fundamental difference of the latter 2 cases, compared with the
first, is that they must pay into shared-ownership 2-of-2 outputs in the
pay-in transaction; this is to allow backout transactions to be arranged
(a two-party multi-transaction contract requires this; see e.g.
Lightning for the same thing). The first, bare atomic swap is a
single-transaction contract, with the contract conditions embedded
entirely in the scriptPubKey of that one transaction (for each side).</p>
<p>Finally, the on-chain size of the transactions is boiled down to
hand-waving, because it's a bit of a complex analysis. The first type
always uses a large redeem script but one signature on the pay-out,
whether cooperative or non-cooperative; the second uses 2 or 3
signatures (assuming something about how we attack the anonymity set
problem) but no big redeem script in the cooperative case, while taking
up a <em>lot</em> of room in the non-cooperative case; the third is always
compact (even non-cooperative backouts take no extra room).
Schnorr-sig-scriptless-scripts are the big winner on space.</p>
<h4>Extending to multi-hop; Lightning, Mimblewimble</h4>
<p>The first time I think this was discussed was in the mailing list post
<a href="https://lists.launchpad.net/mimblewimble/msg00086.html">here</a>,
which discusses how conceivably one could achieve the same setup as HTLC
for Mimblewimble lightning, using this scriptless-script-atomic-swap.
Doubtless these ideas are a long way from being fleshed out, and I
certainly haven't kept up with what's going on there :)</p>
<h4>Other applications of the scriptless script concept</h4>
<p>As a reminder, this document was just about fleshing out how the atomic
swap gets done in a Schnorr-signature-scriptless-script world; the
<a href="https://download.wpsoftware.net/bitcoin/wizardry/mw-slides/2017-05-milan-meetup/slides.pdf">slides</a>
give several other ideas that are related. Multisignature via
aggregation is of course part of it, and is already included even in the
above protocol (for 2 of 2 as a subset of N of N); earlier ideas like
pay-to-contract-hash and sign-to-contract-hash already exist, and don't
require Schnorr, but share a conceptual basis; same for ZKCP, etc.</p>
<h4>Cross chain swap</h4>
<p>I admit to not sharing <em>quite</em> the same breathless excitement about
cross-chain swaps as some people, but it is no doubt very interesting,
if somewhat more challenging (not least because of different "clocks"
(block arrivals) affecting any locktime analysis and confirmation
depth). Poelstra has however also made the very intriguing point that it
is <strong>not</strong> actually required for the two blockchains to be operating on
the same elliptic curve group for the construction to work.</p>SNICKER2017-09-15T00:00:00+02:002017-09-15T00:00:00+02:00Adam Gibsontag:joinmarket.me,2017-09-15:/blog/blog/snicker/<p>a proposal for non-interactive coinjoins.</p><h3>SNICKER</h3>
<h2>SNICKER - Simple Non-Interactive Coinjoin with Keys for Encryption Reused</h2>
<p>I'm going to do this backwards - start with the end goal user
experience, and then work backwards to the technical design. This way,
those not wanting to get lost in technical details can still get the
gist.</p>
<h3><img alt="Me misusing a meme as a symbol and not adding any text." height="330" src="../../../../../../20200510162733im_/https:/joinmarket.me/static/media/uploads/.thumbnails/evilplanbaby.jpg/evilplanbaby-400x330.jpg" width="400"></h3>
<p><em>Pictured above: me misusing a meme as a symbol and deliberately not
adding any text to it.</em></p>
<h3><strong>Scenario</strong></h3>
<p><strong>Alisa</strong> lives in Moscow; she is a tech-savvy Bitcoin user, uses Linux
and the command line, and runs a fully verifying Bitcoin Core node. She
doesn't have indexing enabled, but she (sometimes, or long-running)
runs a tool called <code>snicker-scan</code> on the blocks received by her node. It
scans recent Bitcoin blocks looking for transactions with a particular
pattern, and returns to her in a file a list of candidate transactions.
She pipes this list into another tool which uses her own Bitcoin wallet
and constructs proposals: new transactions involving her own utxos and
utxos from these newly found transactions, which she signs herself.
Then, for each one, she makes up a secret random number and sends the
proposed transaction plus the secret, encrypted in each case to a certain
public key so that no one but its owner can read it, to a Tor hidden
service which accepts such submissions. For now, her job is done and she
gets on with her day.</p>
<p><strong>Bob</strong> lives in New York. He's a Bitcoin enthusiast who uses it a lot,
and likes to test out new features, but has never written code and
isn't tech-savvy like that. A few hours after Alisa went to bed he
opens one of his mobile wallets and a message pops up:
<code>New coinjoin proposals found. Check?</code>. He heard about this, and heard
that you can improve your privacy with this option, and even sometimes
gain a few satoshis in the process. So he clicks <code>Yes</code>. In the
background his mobile wallet downloads a file of some 5-10MB (more on
this later!). Bob did this once before and was curious about the file;
when he opened it he saw it was text with lots of unintelligible
encrypted stuff like this:</p>
<p><code>QklFMQOVXvpqgjaJFm00QhuJ1iWsnYYV4yJLjE0LaXa8N8c34Hzg5CeQduV.....</code><br>
<code>QklFMQI2JR50dOGEQdDdmeX0BwMH4c+yEW1v5/IyT900WBGdYRA/T5mqBMc.....</code></p>
<p>Now his mobile does some processing on this file; it takes a little
while, some seconds perhaps, processing in the background. At the end it
pops up a new message:
<code>Coinjoin transaction found. Would you like to broadcast it?</code> and
underneath it shows the transaction spending 0.2433 BTC out of his
wallet and returning 0.2434 BTC in one of the outputs. It shows that the
other inputs and outputs are not his, although one of them is also for
0.2434 BTC. Does he want to accept? Sure! Free money even if it's only
cents. Even with no free money, he knows that coinjoin makes his privacy
better. So he clicks <code>Yes</code> and it's broadcast. Done.</p>
<h3>The NIC in SNICKER</h3>
<p>Non-interactivity is a hugely desirable property in protocols; this is
particularly the case where privacy is a priority. Firstly, it avoids
the need to synchronize (<strong>Alisa</strong>, and her computer, had gone to sleep
when <strong>Bob</strong> performed his step). Second, to avoid malicious
interruption of an interactive protocol, it can help to identify the
participants, but that is very damaging to the whole point of a protocol
whose goal is privacy. Non-interactivity cuts this particular Gordian
knot; one side can send the message anonymously and the other
participant simply uses the data, but this has the limitation that the
sender must find the receiver, which means some weak identification of the
latter. Even better is if the request can be sent encrypted to the
receiver; then it can be broadcast anywhere for the receiver to notice.
That latter model is the most powerful, and is used here, but it does
have practicality drawbacks as we'll discuss.</p>
<p>So, note that in the above scenario <strong>Alisa</strong> and <strong>Bob</strong> do not meet,
do not synchronize, and need never meet or find out who each other are
in future either. Their "meeting" is entirely abstracted out to one
side publishing an encrypted message and the other side receiving <em>all</em>
such encrypted messages and only reading the one(s) encrypted to his
pubkey. The <em>all</em> part helps preserve Bob's privacy, if he finds a way
to broadcast the final transaction with a reasonable anonymity defence
(see e.g.
<a href="https://github.com/gfanti/bips/blob/master/bip-dandelion.mediawiki">Dandelion</a>;
I'm of the opinion that that battle - making Bitcoin transaction
broadcast anonymous - is something we <em>will</em> win, there is a massive
asymmetry in favour of the privacy defender there).</p>
<h3>Quick background - how to do a Coinjoin</h3>
<p>Here's the obligatory
<a href="https://bitcointalk.org/index.php?topic=279249.0">link</a>
to the Coinjoin OP. You can skip this section if you know Coinjoin well.</p>
<p>Otherwise, I'll give you a quick intro here, one that naturally leads
into the SNICKER concept:</p>
<p>Each input to a transaction requires (for the transaction to be valid) a
signature by the owner of the private key (using the singular deliberately,
restricting consideration to p2pkh or its segwit equivalent here) over a
message which is approximately the transaction. Each of these signatures can
be constructed separately, by separate parties, if indeed the private keys
for the inputs are owned by separate parties. The "normal" coinjoining
process thus involves the following steps (for now, not specifying <em>who</em>
carries out each step):</p>
<ul>
<li>Gather all of the inputs - the utxos that will be spent</li>
<li>Gather all of the destination addresses to various parties, and the
amounts to be paid</li>
<li>Distribute a "template" of the transaction to all parties (i.e.
the transaction without any signatures)</li>
<li>In some order all of the parties sign the transaction; whoever has
a transaction with all signatures complete can broadcast it to the
Bitcoin network</li>
</ul>
<p>There are different protocols one can choose to get all these steps
done, ranging from simple to complex. A server can be the coordinating
party; blinding can be used to prevent the server knowing input-output
mapping.
<a href="http://crypsys.mmci.uni-saarland.de/projects/CoinShuffle/">Coinshuffle</a>
can be used, creating a kind of onion-routing approach to prevent
parties involved knowing the linkages (doesn't require a server to
coordinate, but requires more complex interactivity). One of the parties
in the join can be the "server", thus that party gains privacy that
the others don't (Joinmarket). Etc.</p>
<p>The difficulties created by any interactivity are considerably
ameliorated in a client-server model (see e.g. the old blockchain.info
<a href="https://en.bitcoin.it/wiki/Shared_coin">SharedCoin</a>(link
outdated) model), the serious tradeoff is the server knowing too much,
and/or a coordination/waiting problem (which may be considered
tolerable; see both SharedCoin and
<a href="https://github.com/darkwallet/darkwallet">DarkWallet</a>;
with a sufficient liquidity pool the waiting may be acceptable).</p>
<p>There are a lot of details to discuss here, but there is always <em>some</em>
interactivity (you can only sign once you know the full transaction,
assuming no custom sighashing<sup>1</sup>), and a model with a server is
basically always going to be more problematic, especially at scale.</p>
<p>Hence we try to construct a way of doing at least simple Coinjoins,
in at least some scenarios, without any server requirement or
coordination. Now I'll present the basic technical concept of how to do
this in SNICKER, in 2 versions.</p>
<h3>First version - snicKER = Keys for Encryption Reused</h3>
<p>To make the Coinjoin non-interactive, we need it to be the case that
Alisa can post a message for Bob, without explicitly requesting to
create a private message channel with him. This requires encrypting a
message that can then be broadcast (e.g. over a p2p network or on a
bulletin board).</p>
<p><em>(In case it isn't clear that either encryption or a private message
channel is required, consider that Alisa must pass to Bob a secret which
identifies Bob's output address (explained below), critically, and also
her signature, which is on only her inputs; if these are seen in public,
the input-output linkages are obvious to anyone watching, defeating the
usual purpose of Coinjoin.)</em></p>
<h5>Encryption</h5>
<p>To achieve this we need a public key to encrypt a message to Bob. This
is the same kind of idea as is used in tools like PGP/gpg - only the
owner of the public key's private key can read the message.</p>
<p>In this "First version" we will assume something naughty on Bob's
part: that he has <strong>reused an address</strong>! Thus, a public key will exist
on the blockchain which we assume (not guaranteed but likely; nothing
dangerous if he doesn't) he still holds the private key for.</p>
<p>Given this admittedly unfortunate assumption, we can use a simple and
established encryption protocol such as
<a href="https://en.wikipedia.org/wiki/Integrated_Encryption_Scheme">ECIES</a>
to encrypt a message to the holder of that public key.</p>
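The shape of this encryption step can be sketched as follows. This is a toy ECIES-like construction (ECDH, then a KDF, then cipher plus MAC) for illustration only: Electrum-compatible ECIES, which the POC below matches, uses AES-128-CBC with HMAC-SHA256 from a vetted library, whereas here an HMAC-SHA256 keystream stands in for the cipher, and the pure-Python EC arithmetic is throwaway code, not safe for real keys.

```python
# Toy ECIES-shaped encryption to Bob's reused pubkey PB. Illustration only:
# real ECIES uses AES-128-CBC + HMAC-SHA256 via a vetted library; here an
# HMAC-SHA256 keystream stands in for the cipher.
import hashlib, hmac, os

p = 2**256 - 2**32 - 977  # secp256k1 field prime
n = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141  # group order
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)

def ec_add(P, Q):
    """Affine point addition; None is the point at infinity."""
    if P is None: return Q
    if Q is None: return P
    if P[0] == Q[0] and (P[1] + Q[1]) % p == 0: return None
    lam = ((3 * P[0] ** 2) * pow(2 * P[1], -1, p) if P == Q
           else (Q[1] - P[1]) * pow(Q[0] - P[0], -1, p)) % p
    x3 = (lam * lam - P[0] - Q[0]) % p
    return (x3, (lam * (P[0] - x3) - P[1]) % p)

def ec_mul(k, P=G):
    """Double-and-add scalar multiplication."""
    R = None
    while k:
        if k & 1: R = ec_add(R, P)
        P = ec_add(P, P)
        k >>= 1
    return R

def keystream(key, length):
    out, ctr = b"", 0
    while len(out) < length:
        out += hmac.new(key, ctr.to_bytes(4, "big"), hashlib.sha256).digest()
        ctr += 1
    return out[:length]

def ecies_encrypt(PB, plaintext):
    e = int.from_bytes(os.urandom(32), "big") % n        # ephemeral key
    shared = ec_mul(e, PB)                               # ECDH point
    keys = hashlib.sha512(shared[0].to_bytes(32, "big")).digest()
    ct = bytes(a ^ b for a, b in zip(plaintext, keystream(keys[:32], len(plaintext))))
    tag = hmac.new(keys[32:], ct, hashlib.sha256).digest()
    return ec_mul(e), ct, tag                            # ephemeral pubkey travels in clear

def ecies_decrypt(x, blob):
    eph_pub, ct, tag = blob
    keys = hashlib.sha512(ec_mul(x, eph_pub)[0].to_bytes(32, "big")).digest()
    assert hmac.compare_digest(tag, hmac.new(keys[32:], ct, hashlib.sha256).digest())
    return bytes(a ^ b for a, b in zip(ct, keystream(keys[:32], len(ct))))

xB = int.from_bytes(os.urandom(32), "big") % n           # Bob's key for PB
PB = ec_mul(xB)
blob = ecies_encrypt(PB, b"coinjoin proposal bytes")
assert ecies_decrypt(xB, blob) == b"coinjoin proposal bytes"
```

The key property: only the holder of the private key for `PB` can compute the ECDH point, so the proposal can be published anywhere.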
<p>Alisa, upon finding such a pubkey, call it <code>PB</code>, and noting the
corresponding utxo <code>UB</code>, will need to send, ECIES encrypted to <code>PB</code>,
several items (mostly wrapped up in a transaction) to Bob to give him
enough material to construct a valid coinjoin without any interaction
with herself:</p>
<ul>
<li>Her own utxos (just <code>UA</code> for simplicity)</li>
<li>Her proposed destination address(s)</li>
<li>Her proposed amounts for output</li>
<li>Her proposed bitcoin transaction fee</li>
<li>The full proposed transaction template using <code>UA</code> and <code>UB</code> as inputs
(the above 4 can be implied from this)</li>
<li>Her own signature on the transaction using the key for <code>UA</code></li>
<li>Her proposed destination address <strong>for Bob</strong>.</li>
</ul>
<h4>Destination</h4>
<p>The last point in the above list is of course at first glance not
possible, unless you made some ultra dubious assumptions about shared
ownership, i.e. if Alisa somehow tried to deduce other addresses that
Bob already owns (involving <em>more</em> address reuse). I don't dismiss this
approach <em>completely</em> but it certainly looks like a bit of an ugly mess
to build a system based on that. Instead, we can use a very well known
construct in ECC; in English something like "you can tweak a
counterparty's pubkey by adding a point that <em>you</em> know the private key
for, but you still won't know the private key of the sum". Thus in
this case, Alisa, given Bob's existing pubkey <code>PB</code>, which is the one
she is using to encrypt the message, can construct a new pubkey:</p>
<div class="highlight"><pre><span></span><span class="err">PB2 = PB + k*G</span>
</pre></div>
<p>for some 32 byte random value <code>k</code>.</p>
<p>Alisa will include the value of <code>k</code> in the encrypted message, so Bob can
verify that the newly proposed destination is under his control (again
we'll just assume a standard p2pkh address based on <code>PB2</code>, or a segwit
equivalent).</p>
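The tweak construction can be checked numerically. This is a throwaway pure-Python secp256k1 sketch (a real wallet would use libsecp256k1; the example scalars are arbitrary):

```python
# Verify PB2 = PB + k*G is spendable with private key (x + k) mod n.
# Toy secp256k1 arithmetic: illustration only, not for real keys.
p = 2**256 - 2**32 - 977  # field prime
n = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141  # group order
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)

def ec_add(P, Q):
    """Affine point addition; None is the point at infinity."""
    if P is None: return Q
    if Q is None: return P
    if P[0] == Q[0] and (P[1] + Q[1]) % p == 0: return None
    lam = ((3 * P[0] ** 2) * pow(2 * P[1], -1, p) if P == Q
           else (Q[1] - P[1]) * pow(Q[0] - P[0], -1, p)) % p
    x3 = (lam * lam - P[0] - Q[0]) % p
    return (x3, (lam * (P[0] - x3) - P[1]) % p)

def ec_mul(k, P=G):
    """Double-and-add scalar multiplication."""
    R = None
    while k:
        if k & 1: R = ec_add(R, P)
        P = ec_add(P, P)
        k >>= 1
    return R

x = 112233445566778899          # Bob's existing private key (hypothetical)
k = 998877665544332211          # Alisa's random 32-byte tweak (hypothetical)
PB = ec_mul(x)                  # the reused pubkey Alisa found on-chain
PB2 = ec_add(PB, ec_mul(k))     # Alisa derives this knowing only PB and k
# Bob, once told k, controls PB2 via the sum of private keys:
assert PB2 == ec_mul((x + k) % n)
```

Note that Alisa never learns the private key of `PB2`: she knows `k` but not `x`.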
<p>Assuming Bob somehow finds this message and successfully ECIES-decrypts
it using the private key of <code>PB</code>, he now has everything he needs to (if
he chooses), sign and broadcast the coinjoin transaction.</p>
<h4>A protocol for the most naive version, in broad strokes:</h4>
<ol>
<li>Alisa must have the ability to scan the blockchain to some extent;
she must find scriptSigs or witnesses containing pubkeys which were
later reused in new addresses/scriptPubKeys.</li>
<li>Alisa will use some kind of filtering mechanism to decide which are
interesting. The most obvious two examples are: amounts under
control in Bob's utxos matching her desired range, and perhaps age
of utxos (so likely level of activity of user) or some watermarking
not yet considered.</li>
<li>Having found a set of potential candidates, for each case <code>PB, UB</code>:
Construct a standard formatted message; here is a simple suggestion
although in no way definitive:</li>
</ol>
<div class="highlight"><pre><span></span>8(?) magic bytes and 2 version bytes for the message type
k-value 32 bytes
Partially signed transaction in standard Bitcoin serialization
(optionally padding to some fixed length)
</pre></div>
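As a sketch, the plaintext could be packed and parsed like this. The magic and version values are hypothetical, and the zero-byte padding/stripping is naive (it assumes the serialized transaction doesn't end in zero bytes); nothing here is a finalized format.

```python
# Sketch of packing/parsing the suggested plaintext layout, pre-encryption.
# MAGIC and VERSION are made-up illustrative values.
import struct

MAGIC = b"SNICKER\x00"   # 8 hypothetical magic bytes
VERSION = 1

def pack_proposal(k, partial_tx, pad_to=0):
    assert len(k) == 32
    body = MAGIC + struct.pack(">H", VERSION) + k + partial_tx
    return body + b"\x00" * max(0, pad_to - len(body))  # optional fixed-length pad

def unpack_proposal(blob):
    if blob[:8] != MAGIC:    # wrong-key decryptions look random, so they fail here
        return None
    version = struct.unpack(">H", blob[8:10])[0]
    # Naive padding strip: assumes the tx serialization doesn't end in zeros.
    return version, blob[10:42], blob[42:].rstrip(b"\x00")

blob = pack_proposal(b"\x01" * 32, b"\x02partial-tx-bytes", pad_to=128)
assert len(blob) == 128
assert unpack_proposal(blob) == (1, b"\x01" * 32, b"\x02partial-tx-bytes")
```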
<p>We defer discussing how in practice Bob will get access to the message
until later; but note that if he has done this, he already knows the value
of <code>PB</code> and will thus also know <code>UB</code>. He ECIES-decrypts it, and
recognizes it's for him through the correct magic bytes (other messages,
encrypted to other pubkeys, will come out random).</p>
<p>Then, this format has sufficient information for Bob to evaluate easily.
First, he can verify that <code>UB</code> is in the inputs. Then he can verify
that 1 of the 2 outputs (simple model) has a scriptPubKey
corresponding to <code>PB2 = PB + k*G</code>. He can then verify that the output
amounts fit his requirements. Finally he can verify the ECDSA signature
provided on <code>UA</code> (hence "partially signed transaction"). Given this he
can, if he chooses, sign on <code>UB</code> using <code>PB</code> and broadcast. He must of
course keep a permanent record of either <code>k</code> itself or, more likely, the
private key <code>k + x</code> (assuming <code>PB = x * G</code>).</p>
<h3>A proof-of-concept</h3>
<p>Before going further into details, and discussing the second (probably
superior but not as obviously workable) version of SNICKER, I want to
mention that I very quickly put together some proof of concept code in
<a href="https://github.com/AdamISZ/SNICKER-POC">this github
repo</a>;
it uses
<a href="https://github.com/Joinmarket-Org/joinmarket-clientserver">Joinmarket-clientserver</a>
as a dependency, implements ECIES in a compatible form to that used by
<a href="https://electrum.org/">Electrum</a>,
and allows testing on regtest or testnet, admittedly with a bunch of
manual steps, using the python script <code>snicker-tool.py</code>. The workflow
for testing is in the README. To extend the testing to more wallets
requires some way to do ECIES as well as some way to construct the
destination addresses as per <code>PB2 = PB + kG</code> above. I did note that,
usefully, the partially signed transactions can be signed directly in
Bitcoin Core using <code>signrawtransaction</code> and then <code>sendrawtransaction</code>
for broadcast, but note that somehow you'll have to recover the
destination address, as receiver, too. Note that there was no attempt at
all to construct a scanning tool for any reused-key transactions here,
and I don't intend to do that (at least, in that codebase).</p>
<h2>Practical issues</h2>
<p>In this section will be a set of small subsections describing various
issues that will have to be addressed to make this work.</p>
<h3>Wallet integration</h3>
<p>One reason this model is interesting is because it's much more
plausible to integrate into an existing wallet than something like
Joinmarket - which requires dealing with long term interactivity with
other participants, communicating on a custom messaging channel,
handling protocol negotiation failures etc. To do SNICKER as a receiver,
a wallet needs the following elements:</p>
<ul>
<li>ECIES - this is really simple if you have the underlying secp256k1
and HMAC dependencies; see
<a href="https://github.com/spesmilo/electrum/blob/master/lib/bitcoin.py#L774-L817">here</a>
and
<a href="https://github.com/AdamISZ/SNICKER-POC/blob/master/ecies/ecies.py#L10-L50">here</a>;
note that the root construction in ECIES is ECDH.</li>
<li>The ability to calculate <strong>and store</strong> the newly derived keys of the
form <code>P' = P + kG</code> where <code>k</code> is what is passed to you, and <code>P</code> is
the pubkey of your existing key controlling the output to be spent.
I would presume that you would have to treat <code>k+x</code>, where <code>P=xG</code>, as
a newly imported private key. Note that we <em>cannot</em> use a
deterministic scheme for this from <code>P</code>, since that would be
calculatable by an external observer; it must be based on a secret
generated by "Alisa". This could be a bit annoying for a wallet,
although of course it's easy in a naive sense.</li>
<li>Ability to parse files containing encrypted coinjoin proposals in
the format outlined above - this is trivial.</li>
<li>Ability to finish the signing of a partially signed transaction.
Most wallets have this out of the box (Core does for example); there
might be a problem for a wallet if it tacitly assumes complete
ownership of all inputs.</li>
</ul>
<p>If a wallet only wanted to implement the receiver side (what we called
"Bob" above), that's it.</p>
<h4>Compatibility/consensus between different wallets</h4>
<p>The only "consensus" part of the protocol is the format of the
encrypted coinjoin proposals (and the ECIES algorithm used to encrypt
them). We could deal with different transaction types being proposed
(i.e. different templates, e.g. 3 outputs or 4, segwit or not), although
obviously it'll be saner if there are a certain set of templates that
everyone knows is acceptable to others.</p>
<h3>Notes on scanning for candidates</h3>
<p>There is no real need for each individual "Alisa" to scan, although
she might wish to if she has a Bitcoin node with indexing enabled. This
is a job that can be done by any public block explorer and anyone can
retrieve the data, albeit there are privacy concerns just from you
choosing to download this data. The data could be replicated on Tor
hidden services for example for better privacy. So for now I'm assuming
that scanning, itself, is not an issue.</p>
<p>A much bigger issue might be finding <strong>plausible</strong> candidates. Even in
this version 1 model of looking only for reused keys, which are
hopefully not a huge subset of the total utxo set, there are tons of
potential candidates and, to start with, none of them at all are
plausible. How to filter them?</p>
<ul>
<li>Filter on amount - if Alisa has X coins to join, she'll want to
work with outputs &lt; X.</li>
<li>Filter on age - this is more debatable, but very old utxos are less
likely to be candidates for usage.</li>
<li>An "active" filter - this is more likely to be how things work.
Are certain transactions intrinsically watermarked in a way that
indicates that the "Bob" in question is actually interested in
this function? One way this can happen is if we know that the
transaction is from a certain type of wallet, which already has this
feature enabled.</li>
</ul>
<h4>Bootstrapping</h4>
<p>If a set of users were using a particular wallet or service (preferably
a <em>large</em> set), it might be possible to identify their transactions as
"Acme wallet transactions". Funnily enough, Joinmarket, because it
uses a fixed and unusual coinjoin pattern, satisfies this property in a
very obvious way; but there might be other cases too. See the notes in
"second version", below, on how Joinmarket might work specifically in
that case.</p>
<p>Better of course, is if we achieved that goal with a more user-friendly
wallet with a much bigger user-base; I'd ask wallet developers to
consider how this might be achieved.</p>
<p>Another aspect of bootstrapping is the Joinmarket concept - i.e. make a
financial incentive to help bootstrap. If creators/proposers are
sufficiently motivated they may offer a small financial incentive to
"sweeten the pot", as was suggested in the scenario at the start of
this post. This will help a lot if you want the user-set to grow
reasonably large.</p>
<h3>Scalability</h3>
<p>This is of course filed under "problems you really want to have", but
it's nevertheless a very real problem, arguably the biggest one here.</p>
<p>Imagine 10,000 plausible utxo candidates and 1,000 active
proposers. If each proposer makes proposals for a large-ish subset
of the total candidates, we could easily have 1,000,000 outstanding
proposals at a particular time. Each encrypted record takes 500-800 bytes
of space, let's say. Just the data transfer starts to get huge - hundreds
of megabytes. Perhaps this is not as bad as it looks, <em>if</em> the data is
being received in small amounts over long periods.</p>
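The "hundreds of megabytes" estimate follows directly from those (illustrative) figures:

```python
# Back-of-envelope check of the data-transfer burden described above.
proposals = 1_000_000      # ~ plausible candidates x active proposers
record_bytes = 650         # midpoint of the 500-800 byte estimate
total_mb = proposals * record_bytes / 1e6
assert total_mb == 650.0   # hundreds of MB per full sync of the proposal pool
```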
<p>And let's say we can find a way to get the data out to everybody - they
still have to try to decrypt <strong>every</strong> proposal with <strong>every</strong> pubkey
they have that is a valid candidate (in version 1, that's reused keys,
let's say, or some subset of them). The computational requirement of
that is huge, even if some cleverness could reduce it (decrypt only one
AES block; use high performance C code e.g. based on libsecp256k1).
Again, perhaps if this is happening slowly, streamed over time, or in
chunks at regular intervals, it's not as bad. Still.
<p>It's true that these problems don't arise at small scale, but then the
real value of this would be if it scaled up to large anonymity sets.</p>
<p>Even if this is addressed, there is another problem arising out of the
anonymous submission - any repository of proposals could be filled with
junk, to waste everyone's time. Apart from a
<a href="https://en.wikipedia.org/wiki/Hashcash">hashcash</a>-like
solution (not too implausible but may impose too much cost on the
proposer), I'm not sure how one could address that while keeping
submission anonymity.</p>
<p>At least we have the nice concept that this kind of protocol can improve
privacy on Bitcoin's blockchain without blowing up bandwidth and
computation for the Bitcoin network itself - it's "off-band", unlike
things like <a href="https://www.elementsproject.org/elements/confidential-transactions/investigation.html">Confidential
Transactions</a>
(although, of course, the effect of that is much more powerful). I think
ideas that take semantics and computation off chain are particularly
interesting.</p>
<h3>Conflicting proposals</h3>
<p>This is not really a problem: if Alisa proposes a coinjoin to Bob1 and
Bob2, and Bob1 accepts, then when Bob2 checks, he will find that one of the
inputs for his proposed coinjoin is already spent, so it's not valid.
Especially in cases where there is a financial incentive, this just
incentivizes Bobs to be more proactive - or be out of luck.</p>
<h3>Transaction structure and 2 party joins</h3>
<p>We have thus far talked only about 2-party coinjoins, which <em>ceteris
paribus</em> are an inferior privacy model compared to any larger number
(consider that in a 2-party coinjoin, the <em>other</em> party necessarily
knows which output is yours). The SNICKER model is not easily extendable
to N parties, although it's not impossible. But DarkWallet used 2-of-2
joins, and it's still in my opinion valuable: costs are kept lower, and
over time these joins heavily damage blockchain analysis. A larger
number of joins, and a larger anonymity set, could greatly outweigh the
negatives.</p>
<p>Structure: the model used in the aforementioned
<a href="https://github.com/AdamISZ/SNICKER-POC">POC</a>,
although stupid simple, is still viable: 2 inputs, one from each party
(easily extendable to 1+N), 3 outputs, with the receiver getting back
exactly one output of approximately the same size as the one he started with. The
proposer then has 1 output of exactly that size (so 2 equal outputs) and
one change. Just as in Joinmarket, the concept is that fungibility is
gained specifically in the equal outputs (the "coinjoin outputs"); the
change output is of course trivially linked back to its originating
input(s).</p>
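The amounts in this template can be sketched as follows; the function name is hypothetical, and the specific figures are taken from the opening scenario (Bob spends 0.2433 BTC and gets back 0.2434 BTC):

```python
# Amounts for the 2-input / 3-output SNICKER template (all in satoshis):
# Bob receives one output equal to his input plus an optional incentive
# delta; the proposer takes an equal-sized coinjoin output plus change.
def snicker_amounts(ua, ub, fee, delta=0):
    """Return (bob_out, proposer_coinjoin_out, proposer_change)."""
    coinjoin_amt = ub + delta                 # the two equal "coinjoin outputs"
    change = ua - coinjoin_amt - delta - fee  # proposer pays the fee and delta
    assert change > 0, "proposer's utxo too small for this template"
    return coinjoin_amt, coinjoin_amt, change

bob_out, prop_out, change = snicker_amounts(
    ua=40_000_000, ub=24_330_000, fee=10_000, delta=10_000)
assert bob_out == prop_out == 24_340_000   # 0.2434 BTC, as in the scenario
assert 40_000_000 + 24_330_000 == bob_out + prop_out + change + 10_000  # balances
```

Only the two equal "coinjoin outputs" gain fungibility; the change output, as noted, is trivially linked back to the proposer's input.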
<p>But there's no need for us to be limited to just one transaction
structure; we could imagine many, perhaps some templates that various
wallets could choose to support; and it'll always be up to the receiver
to decide if he likes the structure or not. Even the stupid X->X, Y->Y
"coinjoin" I mused about in my Milan presentation
<a href="https://youtu.be/IKSSWUBqMCM?t=47m21s">here</a>(warning:youtube)
might be fun to do (for some reason!). What a particularly good or
"best" structure is, I'll leave open for others to discuss.</p>
<h3>Second version - snicKER = Keys Encrypted to R</h3>
<p>We've been discussing all kinds of weird and whacky "Non-Interactive
Coinjoin" models on IRC for years; and perhaps there will still be
other variants. But arubi was mentioning to me yesterday that he was
looking for a way to achieve this goal <em>without</em> the nasty requirement
of reused keys, and between us we figured out that it is a fairly
trivial extension, <em>if</em> you can find a way to get confidence that a
particular existing utxo is co-owned with an input (or any input).
That's because if you have an input, you have not only a pubkey, but
also a <strong>signature</strong> (both will either be stored in the scriptSig, or in
the case of segwit, in the witness section of the transaction). An
<a href="https://en.wikipedia.org/wiki/Elliptic_Curve_Digital_Signature_Algorithm">ECDSA</a>
signature is published on the blockchain as a pair: <code>(r, s)</code>, where <code>r</code>
is the x-coordinate of a point <code>R</code> on the secp256k1 curve. Now, any
elliptic curve point can be treated as a pubkey, assuming someone knows
the private key for it; in the case of ECDSA, we call the private key
for <code>R</code>, <code>k</code>, that is: <code>R = kG</code>. <code>k</code> is called the nonce (="number used
once"), and is usually today calculated using the algorithm
<a href="https://tools.ietf.org/html/rfc6979">RFC6979</a>,
which determines its value deterministically from the private key
you're signing with, and the message. But what matters here is, the
signer either already knows <code>k</code>, or can calculate it trivially from the
signing key and the transaction. This provides us with exactly the same
scenario as in the first version; Bob knows the private key of <code>R</code>, so
Alisa can send a proposal encrypted to that public key, and can derive a
new address for Bob's destination using the same formula:</p>
<div class="highlight"><pre><span></span><span class="err">PB2 = R + k'G</span>
</pre></div>
<p>Here I used <code>k'</code> to disambiguate from the signature nonce <code>k</code>, but it's
exactly the same as before. As before, Bob, in order to spend the output
from the coinjoin, will need to store the new private key <code>k+k'</code>. For a
wallet it's a bit more work because you'll have to keep a record of
past transaction <code>k</code> values, or perhaps keep the transactions and
retrieve <code>k</code> as and when. Apart from that, the whole protocol is
identical.</p>
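Both claims can be checked numerically: the signer's recovery of the nonce from their own signature (from s = k<sup>-1</sup>(z + r·x) mod n it follows that k = s<sup>-1</sup>(z + r·x) mod n), and the tweaking of <code>R</code> for Bob's destination exactly as <code>PB</code> was tweaked in version 1. This is a toy pure-Python secp256k1 sketch with arbitrary example values; note that Bitcoin's low-s normalization may negate s, in which case the recovered nonce is n - k.

```python
# The signer can recover their own ECDSA nonce k from (r, s), the message
# digest z, and the signing key x. Toy secp256k1 arithmetic, illustration
# only; all values are arbitrary.
import hashlib

p = 2**256 - 2**32 - 977  # field prime
n = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141  # group order
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)

def ec_add(P, Q):
    """Affine point addition; None is the point at infinity."""
    if P is None: return Q
    if Q is None: return P
    if P[0] == Q[0] and (P[1] + Q[1]) % p == 0: return None
    lam = ((3 * P[0] ** 2) * pow(2 * P[1], -1, p) if P == Q
           else (Q[1] - P[1]) * pow(Q[0] - P[0], -1, p)) % p
    x3 = (lam * lam - P[0] - Q[0]) % p
    return (x3, (lam * (P[0] - x3) - P[1]) % p)

def ec_mul(k, P=G):
    """Double-and-add scalar multiplication."""
    R = None
    while k:
        if k & 1: R = ec_add(R, P)
        P = ec_add(P, P)
        k >>= 1
    return R

def ecdsa_sign(x, z, k):
    r = ec_mul(k)[0] % n
    return r, pow(k, -1, n) * (z + r * x) % n  # s = k^-1 (z + r*x) mod n

x = 112233445566778899                       # signing key (hypothetical)
z = int.from_bytes(hashlib.sha256(b"tx sighash").digest(), "big") % n
k = 424242424242424242                       # nonce; in practice via RFC6979
r, s = ecdsa_sign(x, z, k)
k_recovered = pow(s, -1, n) * (z + r * x) % n
assert k_recovered == k
# Alisa encrypts to R = k*G; Bob's new destination key is R + k'*G,
# spendable with (k + k') mod n, exactly as in version 1:
kprime = 777777777777777777                  # Alisa's tweak (hypothetical)
assert ec_add(ec_mul(k), ec_mul(kprime)) == ec_mul((k + kprime) % n)
```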
<h4>Finding candidates in the second version</h4>
<p>In version 2, we no longer need Bob to do something dubious (reusing
addresses). But now the proposer (Alisa) has a different and arguably
harder problem than before; she has to find transactions where she has
some reasonable presumption that a specific output and a specific input
are co-owned. You could argue that this is good, because now Alisa is
proposing coinjoins where linkages <em>are</em> known, so she's improving
privacy exactly where it's needed :) (only half true, but amusing). In
a typical Bitcoin transaction there are two outputs - one to the
destination, one change; if you can identify the change, even with
say 90% likelihood rather than 100%, you could make proposals on this
basis. This vastly expands the set of <em>possible</em> candidates, if not
necessarily plausible ones (see above on bootstrapping).</p>
<p>Somewhat paradoxically, Joinmarket transactions <em>do</em>
have that property! The change outputs are unambiguously linkable to
their corresponding inputs through subset-sum analysis, see e.g.
<a href="https://github.com/AdamISZ/JMPrivacyAnalysis/blob/master/tumbler_privacy.md#jmsudoku-coinjoin-sudoku-for-jmtxs">here</a>.</p>
<p>Thus, Adlai Chandrasekhar's
<a href="http://adlai.uncommon-lisp.org:5000/">cjhunt</a>
tool (appears down as of writing),
<a href="https://github.com/adlai/cjhunt">code</a>,
identifies all very-likely-to-be Joinmarket transactions through
blockchain scanning, and its output could be used to generate candidates
(the proposed joins could be with those change outputs, using the <code>R</code>
values from one of the identified-as-co-owned inputs). See also
<a href="https://citp.github.io/BlockSci/chain/blockchain.html">BlockSci</a>.
Then if Joinmarket had both proposer- and receiver- side code
integrated, it would create a scenario where these type of coinjoins
would most likely be quite plausible to achieve.</p>
<h3>Conclusion</h3>
<p>I think this idea might well be viable. It's simple enough that
cryptographic vulnerabilities are unlikely. The short version of the pros and
cons:</p>
<h4>Pros</h4>
<ul>
<li>No interactivity (the point), which has many positive consequences,
and a high anonymity standard</li>
<li>Relative ease of wallet integration (esp. compared to e.g.
Joinmarket), consensus requirement between them is limited.</li>
<li>Potentially huge anonymity set (different for version 1 vs version
2, but both very large)</li>
</ul>
<h4>Cons</h4>
<ul>
<li>For now only 2 parties and probably stuck there; limited coinjoin
model (although many transaction patterns possible).</li>
<li>Finding plausible candidates is hard, needs a bootstrap</li>
<li>Sybil attack on the encrypted messages; how to avoid the "junk
mail" problem</li>
</ul>
<p>Lastly, it should be fine with Schnorr (to investigate: aggregation in
this model), in version 1 and version 2 forms.</p>
<h3>Footnotes</h3>
<p>1. Sighashing - attempting a non-interactive coinjoin with some
interesting use of <code>SIGHASH_SINGLE</code> and <code>SIGHASH_ANYONECANPAY</code> seems at
least plausible (see
<a href="https://en.bitcoin.it/wiki/OP_CHECKSIG#Procedure_for_Hashtype_SIGHASH_SINGLE">here</a>),
although it's not exactly heartening that no one ever uses
<code>SIGHASH_SINGLE</code> (and its rules are arcane and restrictive), not to even
speak of watermarking. Hopefully the idea expressed here is better.</p>
<hr/>
<h2>P(o)ODLE</h2>
<p>2016-06-15, Adam Gibson</p>
<p>DLEQ proofs as tokens for anti-snooping in coinjoin</p>
<h3>P(o)ODLE</h3>
<blockquote>
<p><em>Here is a purse of monies ... which I am not going to give to you.</em></p>
</blockquote>
<p>- <a href="https://en.wikipedia.org/wiki/Bells_(Blackadder)">Edmund
Blackadder</a></p>
<p><img src="/web/20200712194227im_/https://joinmarket.me/static/media/uploads/.thumbnails/poodle.jpeg/poodle-225x308.jpeg" width="225" height="308" /></p>
<h3>P(o)ODLE, not POODLE</h3>
<p>This post, fortunately, has nothing to do with faintly ridiculous <a href="https://en.wikipedia.org/wiki/POODLE">SSL 3
downgrade
attacks</a>.
Irritatingly, our usage here has no made-up need for the parenthetical
(o), but on the other hand "podle" is not actually a word.</p>
<h3>The problem</h3>
<p>You're engaging in a protocol (like Joinmarket) where you're using
bitcoin utxos regularly. We want to enforce some scarcity; you can't use
the same utxo more than once, let's say. Utxos can be created all the
time, but at some cost of time and money; so it can be seen as a kind of
rate limiting.</p>
<p>So: you have a bitcoin utxo. You'd like someone else to know that you
have it, <strong>and that you haven't used it before, with them or anyone
else</strong>, <strong>in this protocol,</strong> but you don't want to show it to them. For
that second property (hiding), you want to make a <em>commitment</em> to the
utxo. Later on in the protocol you will open the commitment and reveal
the utxo.</p>
<p>Now, a <a href="https://en.wikipedia.org/wiki/Commitment_scheme">cryptographic
commitment</a>
is a standard kind of protocol, usually it works something like:</p>
<div class="highlight"><pre><span></span>Alice -> Bob: commit: h := hash(secret, nonce)
(do stuff)
Alice -> Bob: open: reveal secret, nonce
Bob: verify: h =?= hash(secret, nonce)
</pre></div>
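A minimal Python sketch of this commit/open pattern (function names here are illustrative, not from any Joinmarket module):

```python
import hashlib
import secrets

def commit(secret: bytes):
    """Alice: commit to a secret under a fresh random nonce."""
    nonce = secrets.token_bytes(32)
    h = hashlib.sha256(secret + nonce).digest()
    return h, nonce  # h is sent to Bob; nonce is kept for the opening

def verify_opening(h: bytes, secret: bytes, nonce: bytes) -> bool:
    """Bob: check the revealed (secret, nonce) against the commitment."""
    return hashlib.sha256(secret + nonce).digest() == h
```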
<p>Hashing a secret is <em>not</em> enough to keep it secret, at least in general,
because the verifier might be able to guess it, especially if the data is
from a small-ish set (utxos in bitcoin being certainly a small enough
set; and that list is public). So usually, this protocol, with a
large-enough random nonce, would be enough for the purposes of proving
you own a bitcoin utxo without revealing it.</p>
<p>But in our case it doesn't suffice - because of the bolded sentence in
the problem description. You could pretend to commit to <em>different</em>
utxos at different times, simply by using different nonces. If you tried
to do that <em>just with me</em>, well, no big deal - I'll just block your
second use. But you <em>could</em> use the same utxos with different
counterparties, and they would be none the wiser, unless they all shared
all private information with each other. Which we certainly don't want.</p>
<p>Contrariwise, if you ditch the nonce and just use Hash(utxo) every time
to every counterparty, you have the failure-to-hide-the-secret problem
mentioned above.</p>
<p>In case you didn't get that: Alice wants to prove to Bob and Carol and
... that she owns utxo \(U\), and she never used it before. Bob and
Carol etc. are keeping a public list of all previously used commitments
(which shouldn't give away what the utxo is, for privacy). If she just
makes a commitment: Hash(\(U +\) nonce) and sends it to Bob and Carol,
they will check and see it isn't on the public list of commitments and
if not, OK, she can open the commitment later and prove honest action.
But her conversations with Bob and Carol are separate, on private
messaging channels. How can Bob know she didn't use <em>the same utxo as
previously used with Carol, but with a different nonce</em>?</p>
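The problem is easy to see in code: with independent nonces, two commitments to the <em>same</em> utxo are unlinkable (a sketch; the utxo string is a placeholder):

```python
import hashlib
import secrets

utxo = b"txid:0"  # hypothetical utxo identifier

# Alice commits to the *same* utxo twice, with fresh random nonces
c_for_bob = hashlib.sha256(utxo + secrets.token_bytes(32)).digest()
c_for_carol = hashlib.sha256(utxo + secrets.token_bytes(32)).digest()

# Neither commitment appears on the shared list of used commitments,
# and Bob and Carol cannot link the two without sharing the openings.
assert c_for_bob != c_for_carol
```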
<h3>The solution</h3>
<p>This is a bit of a headscratcher; after several IRC discussions, Greg
Maxwell suggested the idea of <strong>proof of discrete logarithm
equivalence</strong> (hence the title), and pointed me at <a href="http://crypto.stackexchange.com/questions/15758/how-can-we-prove-that-two-discrete-logarithms-are-equal">this
crypto.stackexchange
thread</a>.
It's a cool idea (although note that the description there is based on
DL rather than the ECDL used here): "shift" the EC point to a new
base/generator point, so that nobody else can read it (crudely put), then
append a Schnorr signature acting as proof that the two points have the
same discrete logarithm (= private key) with respect to the two base
points. In detail, consider a Bitcoin private, public keypair \((x,
P)\) for the usual base point/generator \(G\), and consider a
<a href="https://en.wikipedia.org/wiki/Nothing_up_my_sleeve_number">NUMS</a>
alternative generator \(J\) (a little more on this later).</p>
<p>$$P = xG$$</p>
<p>$$P_2 = xJ$$</p>
<p>Next, Alice will provide her commitment as \(H(P_2)\) in the
handshake initiation stage of the protocol. Then, when it comes time for
Alice to request private information from Bob, on their private message
channel, she will have to open her commitment with this data:</p>
<p>$$P, U, P_2, s, e$$</p>
<p>Here \(s,e\) are a Schnorr signature proving equivalence of the
private key (we called it \(x\) above) with respect to \(G,J\), but
of course without revealing that private key. It is constructed, after
choosing a random nonce \(k\), like this:</p>
<p>$$K_G = kG$$</p>
<p>$$K_J = kJ$$</p>
<p>$$e = H(K_G || K_J || P || P_2)$$</p>
<p>$$s = k + xe$$</p>
<p>Then Bob, receiving this authorisation information, proceeds to verify
the commitment before exchanging private information:</p>
<ol>
<li>Does \(H(P_2)\) equal the previously provided commitment? If yes:</li>
<li>Check that the commitment is not repeated on the public list (or
whatever the policy is)</li>
<li>Verify via the blockchain that \(P\) matches the utxo \(U\)</li>
<li>\(K_G = sG - eP\)</li>
<li>\(K_J = sJ - eP_2\)</li>
<li>Schnorr sig verify operation: Does \(H(K_G || K_J || P ||
P_2) = e\) ?</li>
</ol>
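To make the generation and verification steps concrete, here is a self-contained pure-Python sketch (Joinmarket's actual implementation uses libsecp256k1; the NUMS derivation below is a simplified stand-in for the construction discussed in the next section, and all function names are illustrative):

```python
import hashlib
import secrets

# secp256k1 parameters
P = 2**256 - 2**32 - 977  # field prime
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141  # order
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)

def ec_add(p, q):
    """Affine point addition; None is the point at infinity."""
    if p is None: return q
    if q is None: return p
    if p[0] == q[0] and (p[1] + q[1]) % P == 0:
        return None
    if p == q:
        lam = 3 * p[0] * p[0] * pow(2 * p[1], P - 2, P) % P
    else:
        lam = (q[1] - p[1]) * pow((q[0] - p[0]) % P, P - 2, P) % P
    x = (lam * lam - p[0] - q[0]) % P
    return (x, (lam * (p[0] - x) - p[1]) % P)

def ec_mul(k, pt):
    """Scalar multiplication by double-and-add."""
    r = None
    while k:
        if k & 1:
            r = ec_add(r, pt)
        pt = ec_add(pt, pt)
        k >>= 1
    return r

def ser(pt):
    """Compressed point serialization (02/03 prefix + x-coordinate)."""
    return bytes([2 + (pt[1] & 1)]) + pt[0].to_bytes(32, "big")

def nums_J():
    """Simplified NUMS generator: hash ser(G) to an x-coordinate,
    incrementing until it lands on the curve."""
    x = int.from_bytes(hashlib.sha256(ser(G)).digest(), "big") % P
    while True:
        y2 = (pow(x, 3, P) + 7) % P
        y = pow(y2, (P + 1) // 4, P)  # valid square root since P % 4 == 3
        if y * y % P == y2:
            return (x, y)
        x += 1

J = nums_J()

def podle_commit(x):
    """Generate commitment H(P2) and opening (P, P2, s, e) for privkey x."""
    Pt, P2 = ec_mul(x, G), ec_mul(x, J)
    k = secrets.randbelow(N - 1) + 1
    KG, KJ = ec_mul(k, G), ec_mul(k, J)
    e = int.from_bytes(
        hashlib.sha256(ser(KG) + ser(KJ) + ser(Pt) + ser(P2)).digest(),
        "big") % N
    s = (k + x * e) % N
    return hashlib.sha256(ser(P2)).digest(), (Pt, P2, s, e)

def podle_verify(commitment, Pt, P2, s, e):
    """Bob's checks: commitment matches, and the DLEQ signature verifies."""
    if hashlib.sha256(ser(P2)).digest() != commitment:
        return False
    KG = ec_add(ec_mul(s, G), ec_mul(N - e, Pt))  # K_G = sG - eP
    KJ = ec_add(ec_mul(s, J), ec_mul(N - e, P2))  # K_J = sJ - eP2
    return int.from_bytes(
        hashlib.sha256(ser(KG) + ser(KJ) + ser(Pt) + ser(P2)).digest(),
        "big") % N == e
```

Steps 4 and 5 of the verification work because \(sG - eP = (k + xe)G - e(xG) = kG = K_G\), and likewise for the \(J\)-side points.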
<p>Bob now knows that the utxo \(U\) has not been repeated (the simplest
policy), while Alice has not been exposed to a potential public leakage of
information about the utxo. (It should of course be noted that Bob knows
the utxo from now on, but that's for another discussion about Coinjoin
generally...)</p>
<h3>Why an alternate generator point \(J\)?</h3>
<p>Publishing \(H(P_2)\) gives no information about \(P\), the actual
Bitcoin pubkey that Alice wants to use; in that sense it's the same as
using a nonce in the commitment. But it also gives her no degree of
freedom, as a nonce does, to create different public values for the same
hidden pubkey. No one not possessing \(x\) can deduce \(P\) from
\(P_2\) (or vice versa, for that matter) - <strong>unless</strong> they have the
private key/discrete log of \(J\) with respect to \(G\). If anyone
had this number \(x^*\) such that \(J = x^{*}G\), then it would be
easy to make the shift from one to the other:</p>
<p>$$P_2 = xJ = x(x^{*}G) = x^{*}(xG) = x^{*}P$$</p>
<p>and apply a modular inverse if necessary.</p>
<p>This is why the concept of NUMS is critical. The construction of a NUMS
alternate generator is discussed in <a href="https://elementsproject.org/elements/confidential-transactions/">the same CT doc as
above</a>,
and also in <a href="https://github.com/AdamISZ/ConfidentialTransactionsDoc/blob/master/essayonCT.pdf">my CT
overview</a>,
at the end of section 2.2. Note I use \(J\) here in place of \(H\)
to avoid confusion with hash functions.</p>
<h3>Code and thoughts on implementation</h3>
<p>I did an abbreviated write up of the concept of this post in <a href="https://gist.github.com/AdamISZ/9cbba5e9408d23813ca8#defence-2-committing-to-a-utxo-in-publicplaintext-at-the-start-of-the-handshake">this
gist</a>,
as one of three possible ways of attacking the problem in Joinmarket:
<a href="https://github.com/JoinMarket-Org/joinmarket/issues/156">how can we prevent people initiating transactions over and over again
to collect information on
utxos</a>?
This algorithm is not intended as a <em>complete</em> solution to that issue,
but it's very interesting in its own right and may perhaps have a
variety of applications.</p>
<p>The algorithm was fairly simple to code, at least in a naive way, and I
did it some time ago using Bitcoin's
<a href="https://github.com/bitcoin-core/secp256k1">libsecp256k1</a>
with the <a href="https://github.com/ludbb/secp256k1-py">Python binding by
ludbb</a>.
An initial version of my Python "podle" module is
<a href="https://github.com/JoinMarket-Org/joinmarket/blob/90ec05329e06beed0fbc09528ef6fb3d2c5d03ba/lib/bitcoin/podle.py">here</a>.</p>
<p>There are lots of tricky things to think about in implementing this; I
think the most obvious issue is how publishing and maintaining a
public list would work. If we just want each utxo to be allowed only one use,
any kind of broadcast mechanism would be fine; other participants can
know as soon as any \(H(P_2)\) is used, or at least to a reasonable
approximation. Even in a multi-party protocol like Joinmarket, the utxo
would be broadcast as "used" only after its first usage by each party,
so it would from then on be on what is effectively a blacklist. But if
the policy were more like "only allow re-use 3 times" this doesn't seem
to work without some kind of unrealistic honesty assumption.</p>
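A sketch of the local bookkeeping for such a policy (names hypothetical; note that with <code>MAX_USES</code> greater than 1 the count is only correct if every counterparty honestly broadcasts each use, which is exactly the honesty assumption noted above):

```python
from collections import Counter

MAX_USES = 1      # the simplest policy: one use per commitment
seen = Counter()  # locally maintained view of the broadcast "used" list

def accept_commitment(commitment: bytes) -> bool:
    """Accept iff this H(P2) has been seen fewer than MAX_USES times;
    on acceptance, record it (and broadcast it to other participants)."""
    if seen[commitment] >= MAX_USES:
        return False
    seen[commitment] += 1
    return True
```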