I'm going to do this backwards - start with the end goal user experience, and then work backwards to the technical design. This way, those not wanting to get lost in technical details can still get the gist.
Pictured above: me misusing a meme as a symbol and deliberately not adding any text to it.
Alisa lives in Moscow; she is a tech-savvy Bitcoin user, uses Linux and the command line, and runs a fully verifying Bitcoin Core node. She doesn't have indexing enabled, but she (sometimes, or long-running) runs a tool called
snicker-scan on the blocks received by her node. It scans recent Bitcoin blocks looking for transactions with a particular pattern, and returns to her in a file a list of candidate transactions. She pipes this list into another tool which uses her own Bitcoin wallet and constructs proposals: new transactions involving her own utxos and utxos from these newly found transactions, which she signs herself. Then, for each one, she makes up a secret random number and sends (the proposed transactions + the secrets), encrypted to a certain public key, in each case, so no one but the owner can read it, to a Tor hidden service which accepts such submissions. For now, her job is done and she gets on with her day.
Bob lives in New York. He's a Bitcoin enthusiast who uses it a lot, and likes to test out new features, but has never written code and isn't tech-savvy like that. A few hours after Alisa went to bed he opens one of his mobile wallets and a message pops up:
New coinjoin proposals found. Check?. He heard about this, and heard that you can improve your privacy with this option, and even sometimes gain a few satoshis in the process. So he clicks
Yes. In the background his mobile wallet downloads a file of some 5-10MB (more on this later!). Bob did this once before and was curious about the file; when he opened it he saw it was text with lots of unintelligible encrypted stuff like this:
Now his mobile does some processing on this file; it takes a little while, some seconds perhaps, processing in the background. At the end it pops up a new message:
Coinjoin transaction found. Would you like to broadcast it? and underneath it shows the transaction spending 0.2433 BTC out of his wallet and returning 0.2434 BTC in one of the outputs. It shows that the other inputs and outputs are not his, although one of them is also for 0.2434 BTC. Does he want to accept? Sure! Free money even if it's only cents. Even with no free money, he knows that coinjoin makes his privacy better. So he clicks
Yes and it's broadcast. Done.
Non-interactivity is a hugely desirable property in protocols; this is particularly the case where privacy is a priority. Firstly, it avoids the need to synchronize (Alisa, and her computer, had gone to sleep when Bob performed his step). Second, to avoid malicious interruption of an interactive protocol, it can help to identify the participants, but that is very damaging to the whole point of a protocol whose goal is privacy. Non-interactivity cuts this particular Gordian knot; one side can send the message anonymously and the other participant simply uses the data, but this has the limitation of the sender finding the receiver, which means some weak identification of the latter. Even better is if the request can be sent encrypted to the receiver, then it can be broadcast anywhere for the receiver to notice. That latter model is the most powerful, and is used here, but it does have practicality drawbacks as we'll discuss.
So, note that in the above scenario Alisa and Bob do not meet, do not synchronize, and need never meet or find out who each other are in future either. Their "meeting" is entirely abstracted out to one side publishing an encrypted message and the other side receiving all such encrypted messages and only reading the one(s) encrypted to his pubkey. The all part helps preserve Bob's privacy, if he finds a way to broadcast the final transaction with a reasonable anonymity defence (see e.g. Dandelion; I'm of the opinion that that battle - making Bitcoin transaction broadcast anonymous - is something we will win, there is a massive asymmetry in favour of the privacy defender there).
Here's the obligatory link to the Coinjoin OP. You can skip this section if you know Coinjoin well.
Otherwise, I'll give you a quick intro here, one that naturally leads into the SNICKER concept:
Each input to a transaction requires (for the transaction to be valid) a signature by the owner of the private key (using singular deliberately, restricting consideration to p2pkh or segwit equivalent here) over a message which is ~ the transaction. Each of these signatures can be constructed separately, by separate parties if indeed the private key for each input are owned by separate parties. The "normal" coinjoining process thus involves the following steps (for now, not specifying who carries out each step):
There are different protocols one can choose to get all these steps done, ranging from simple to complex. A server can be the coordinating party; blinding can be used to prevent the server knowing input-output mapping. Coinshuffle can be used, creating a kind of onion-routing approach to prevent parties involved knowing the linkages (doesn't require a server to coordinate, but requires more complex interactivity). One of the parties in the join can be the "server", thus that party gains privacy that the others don't (Joinmarket). Etc.
The difficulties created by any interactivity are considerably ameliorated in a client-server model (see e.g. the old blockchain.info SharedCoin(link outdated) model), the serious tradeoff is the server knowing too much, and/or a coordination/waiting problem (which may be considered tolerable; see both SharedCoin and DarkWallet; with a sufficient liquidity pool the waiting may be acceptable).
There are a lot of details to discuss here, but there is always some interactivity (you can only sign once you know the full transaction, assuming no custom sighashing1), and a model with a server is basically always going to be more problematic, especially at scale.
So hence we try to construct a way of doing at least simple Coinjoins, in at least some scenarios, without any server requirement or coordination. Now I'll present the basic technical concept of how to do this in SNICKER, in 2 versions.
To make the Coinjoin non-interactive, we need it to be the case that Alisa can post a message for Bob, without explicitly requesting to create a private message channel with him. This requires encrypting a message that can then be broadcast (e.g. over a p2p network or on a bulletin board).
(In case it isn't clear that either encryption or a private message channel is required, consider that Alice must pass to Bob a secret which identifies Bob's output address (explained below), critically, and also her signature, which is on only her inputs; if these are seen in public, the input-output linkages are obvious to anyone watching, defeating the usual purpose of Coinjoin.)
To achieve this we need a public key to encrypt a message to Bob. This is the same kind of idea as is used in tools like PGP/gpg - only the owner of the public key's private key can read the message.
In this "First version" we will assume something naughty on Bob's part: that he has reused an address! Thus, a public key will exist on the blockchain which we assume (not guaranteed but likely; nothing dangerous if he doesn't) he still holds the private key for.
Given this admittedly unfortunate assumption, we can use a simple and established encryption protocol such as ECIES to encrypt a message to the holder of that public key.
Alisa, upon finding such a pubkey, call it
PB, and noting the corresponding utxo
UB, will need to send, ECIES encrypted to
PB, several items (mostly wrapped up in a transaction) to Bob to give him enough material to construct a valid coinjoin without any interaction with herself:
UBas inputs (the above 4 can be implied from this)
The last point in the above list is of course at first glance not possible, unless you made some ultra dubious assumptions about shared ownership, i.e. if Alisa somehow tried to deduce other addresses that Bob already owns (involving more address reuse). I don't dismiss this approach completely but it certainly looks like a bit of an ugly mess to build a system based on that. Instead, we can use a very well known construct in ECC; in English something like "you can tweak a counterparty's pubkey by adding a point that you know the private key for, but you still won't know the private key of the sum". Thus in this case, Alice, given Bob's existing pubkey
PB, which is the one she is using to encrypt the message, can construct a new pubkey:
PB2 = PB + k*G
for some 32 byte random value
Alice will include the value of
k in the encrypted message, so Bob can verify that the newly proposed destination is under his control (again we'll just assume a standard p2pkh address based on
PB2, or a segwit equivalent).
Assuming Bob somehow finds this message and successfully ECIES-decrypts it using the private key of
PB, he now has everything he needs to (if he chooses), sign and broadcast the coinjoin transaction.
PB, UB: Construct a standard formatted message; here is a simple suggestion although in no way definitive:
8(?) magic bytes and 2 version bytes for the message type k-value 32 bytes Partially signed transaction in standard Bitcoin serialization (optionally padding to some fixed length)
We defer discussing how in practice Bob will get access to the message later; but note that if he has done this, he already knows the value of
P_B and will thus know also
U_B. He ECIES-decrypts it, and recognizes it's for him through correct magic bytes (other messages encrypted to other pubkeys will come out random).
Then, this format has sufficient information for Bob to evaluate easily. First, he can verify that
U_B is in the inputs. Then he can verify that for 1 of the 2 outputs (simple model) has a scriptPubKey corresponding to
PB2 = PB + k*G. He can then verify the output amounts fit his requirements. Finally he can verify the ECDSA signature provided on
U_A (hence "partially signed transaction"). Given this he can, if he chooses, sign on
PB and broadcast. He must of course keep a permanent record of either
k itself or, more likely, the private key
k + x (assuming
P = x * G).
Before going further into details, and discussing the second (probably superior but not as obviously workable) version of SNICKER, I want to mention that I very quickly put together some proof of concept code in this github repo; it uses Joinmarket-clientserver as a dependency, implements ECIES in a compatible form to that used by Electrum, and allows testing on regtest or testnet, admittedly with a bunch of manual steps, using the python script
snicker-tool.py. The workflow for testing is in the README. To extend the testing to more wallets requires some way to do ECIES as well as some way to construct the destination addresses as per
PB2 = PB + kG above. I did note that, usefully, the partially signed transactions can be signed directly in Bitcoin Core using
signrawtransaction and then
sendrawtransaction for broadcast, but note that somehow you'll have to recover the destination address, as receiver, too. Note that there was no attempt at all to construct a scanning tool for any reused-key transactions here, and I don't intend to do that (at least, in that codebase).
In this section will be a set of small subsections describing various issues that will have to be addressed to make this work.
One reason this model is interesting is because it's much more plausible to integrate into an existing wallet than something like Joinmarket - which requires dealing with long term interactivity with other participants, communicating on a custom messaging channel, handling protocol negotiation failures etc. To do SNICKER as a receiver, a wallet needs the following elements:
P' = P + kGwhere
kis what is passed to you, and
Pis the pubkey of your existing key controlling the output to be spent. I would presume that you would have to treat
P=xG, as a newly imported private key. Note that we cannot use a deterministic scheme for this from
P, since that would be calculatable by an external observer; it must be based on a secret generated by "Alisa".This could be a bit annoying for a wallet, although of course it's easy in a naive sense.
If a wallet only wanted to implement the receiver side (what we called "Bob" above), that's it.
The only "consensus" part of the protocol is the format of the encrypted coinjoin proposals (and the ECIES algorithm used to encrypt them). We could deal with different transaction types being proposed (i.e. different templates, e.g. 3 outputs or 4, segwit or not), although obviously it'll be saner if there are a certain set of templates that everyone knows is acceptable to others.
There is no real need for each individual "Alisa" to scan, although she might wish to if she has a Bitcoin node with indexing enabled. This is a job that can be done by any public block explorer and anyone can retrieve the data, albeit there are privacy concerns just from you choosing to download this data. The data could be replicated on Tor hidden services for example for better privacy. So for now I'm assuming that scanning, itself, is not an issue.
A much bigger issue might be finding plausible candidates. Even in this version 1 model of looking only for reused keys, which are hopefully not a huge subset of the total utxo set, there are tons of potential candidates and, to start with, none of them at all are plausible. How to filter them?
If a set of users were using a particular wallet or service (preferably a large set), it might be possible to identify their transactions "Acme wallet transactions". Funnily enough, Joinmarket, because it uses a set and unusual coinjoin pattern, satisfies this property in a very obvious way; but there might be other cases too. See the notes in "second version", below, on how Joinmarket might work specifically in that case.
Better of course, is if we achieved that goal with a more user-friendly wallet with a much bigger user-base; I'd ask wallet developers to consider how this might be achieved.
Another aspect of bootstrapping is the Joinmarket concept - i.e. make a financial incentive to help bootstrap. If creators/proposers are sufficiently motivated they may offer a small financial incentive to "sweeten the pot", as was suggested in the scenario at the start of this post. This will help a lot if you want the user-set to grow reasonably large.
This is of course filed under "problems you really want to have", but it's nevertheless a very real problem, arguably the biggest one here.
Imagine 10,000 utxo candidates that are plausible and 1000 active proposers. Imagine they could all make proposals for a large-ish subset of the total candidates, we could easily imagine 1,000,000 candidates at a particular time. Each encrypted record takes 500-800 bytes of space, let's say. Just the data transfer starts to get huge - hundreds of megabytes? Perhaps this is not as bad as it looks, if the data is being received in small amounts over long periods.
And let's say we can find a way to get the data out to everybody - they still have to try to decrypt every proposal with every pubkey they have that is a valid candidate (in version 1, that's reused keys, let's say, or some subset of them). The computational requirement of that is huge, even if some cleverness could reduce it (decrypt only one AES block; use high performance C code e.g. based on libsecp256k1). Again, perhaps if this is happening slowly, streamed over time, or in chunks at regular integrals, it's not as bad. Still.
It's true that these problems don't arise at small scale, but then the real value of this would be if it scaled up to large anonymity sets.
Even if this is addressed, there is another problem arising out of the anonymous submission - any repository of proposals could be filled with junk, to waste everyone's time. Apart from a hashcash-like solution (not too implausible but may impose too much cost on the proposer), I'm not sure how one could address that while keeping submission anonymity.
At least we have the nice concept that this kind of protocol can improve privacy on Bitcoin's blockchain without blowing up bandwidth and computation for the Bitcoin network itself - it's "off-band", unlike things like Confidential Transactions (although, of course, the effect of that is much more powerful). I think ideas that take semantics and computation off chain are particularly interesting.
This is not really a problem: if Alisa proposes a coinjoin to Bob1 and Bob2, and Bob1 accepts, then when Bob2 checks, he will find one of the inputs for his proposed coinjoin is already spent, so it's not valid. Especially in cases where there is a financial incentive, it just incentives Bobs to be more proactive, or just be out of luck.
We have thus far talked only about 2 party coinjoins, which ceteris paribus are an inferior privacy model compared to any larger number (consider that in a 2 party coinjoin, the other party necessarily knows which output is yours). The SNICKER model is not easily extendable to N parties, although it's not impossible. But DarkWallet used 2 of 2 joins, and it's still in my opinion valuable. Costs are kept lower, and over time these joins heavily damage blockchain analysis. A larger number of joins, and larger anonymity set could greatly outweigh the negatives.
Structure: the model used in the aforementioned POC, although stupid simple, is still viable: 2 inputs, one from each party (easily extendable to 1+N), 3 outputs, with the receiver getting back exactly one output of ~ the same size as the one he started with. The proposer then has 1 output of exactly that size (so 2 equal outputs) and one change. Just as in Joinmarket, the concept is that fungibility is gained specifically in the equal outputs (the "coinjoin outputs"); the change output is of course trivially linked back to its originating input(s).
But there's no need for us to be limited to just one transaction structure; we could imagine many, perhaps some templates that various wallets could choose to support; and it'll always be up to the receiver to decide if he likes the structure or not. Even the stupid X->X, Y->Y "coinjoin" I mused about in my Milan presentation here(warning:youtube) might be fun to do (for some reason!). What a particularly good or "best" structure is, I'll leave open for others to discuss.
We've been discussing all kinds of weird and whacky "Non-Interactive Coinjoin" models on IRC for years; and perhaps there will still be other variants. But arubi was mentioning to me yesterday that he was looking for a way to achieve this goal without the nasty requirement of reused keys, and between us we figured out that it is a fairly trivial extension, if you can find a way to get confidence that a particular existing utxo is co-owned with an input (or any input). That's because if you have an input, you have not only a pubkey, but also a signature (both will either be stored in the scriptSig, or in the case of segwit, in the witness section of the transaction). An ECDSA signature is published on the blockchain as a pair:
(r, s), where
r is the x-coordinate of a point
R on the secp256k1 curve. Now, any elliptic curve point can be treated as a pubkey, assuming someone knows the private key for it; in the case of ECDSA, we call the private key for
k, that is:
R = kG.
k is called the nonce (="number used once"), and is usually today calculated using the algorithm RFC6979, which determines its value deterministically from the private key you're signing with, and the message. But what matters here is, the signer either already knows
k, or can calculate it trivially from the signing key and the transaction. This provides us with exactly the same scenario as in the first version; Bob knows the private key of
R, so Alisa can send a proposal encrypted to that public key, and can derive a new address for Bob's destination using the same formula:
PB2 = R + k'G
Here I used
k' to disambiguate from the signature nonce
k, but it's exactly the same as before. As before, Bob, in order to spend the output from the coinjoin, will need to store the new private key
k+k'. For a wallet it's a bit more work because you'll have to keep a record of past transaction
k values, or perhaps keep the transactions and retrieve
k as and when. Apart from that, the whole protocol is identical.
In version 2, we no longer need Bob to do something dubious (reusing addresses). But now the proposer (Alisa) has a different and arguably harder problem than before; she has to find transactions where she has some reasonable presumption that a specific output and a specific input are co-owned. You could argue that this is good, because now Alisa is proposing coinjoins where linkages are known, so she's improving privacy exactly where it's needed :) (only half true, but amusing). In a typical Bitcoin transaction there are two outputs - one to destination, one change; if you can unambiguously identify the change, even with say 90% likelihood not 100%, you could make proposals on this basis. This vastly expands the set of possible candidates, if not necessarily plausible ones (see above on bootstrapping).
Additionally paradoxical is the fact that Joinmarket transactions do have that property! The change outputs are unambiguously linkable to their corresponding inputs through subset-sum analysis, see e.g. here.
Thus, Adlai Chandrasekhar's cjhunt tool (appears down as of writing), code, identifies all very-likely-to-be Joinmarket transactions through blockchain scanning, and its output could be used to generate candidates (the proposed joins could be with those change outputs, using the `R` values from one of the identified-as-co-owned inputs). See also BlockSci. Then if Joinmarket had both proposer- and receiver- side code integrated, it would create a scenario where these type of coinjoins would most likely be quite plausible to achieve.
I think this idea might well be viable. It's simple enough that there aren't likely crypto vulnerabilities. The short version of the pros and cons:
Lastly, it should be fine with Schnorr (to investigate: aggregation in this model), in version 1 and version 2 forms.
1. Sighashing - attempting a non-interactive coinjoin with some interesting use of
SIGHASH_ANYONECANPAY seems at least plausible (see here), although it's not exactly heartening that no one ever uses
SIGHASH_SINGLE (and its rules are arcane and restrictive), not to even speak of watermarking. Hopefully the idea expressed here is better.
Adam Gibson (22)