This is a bit speculative. If you have expertise in this area, I’d love to hear your thoughts. The basic idea is that one can get something like intuitionist logic as a meta-logic derived by relaxing from a functional truth-valuation to a relational one. This logic, when extended to probability, might form a useful basis for modeling common-sense reasoning. A slight reinterpretation might form the basis for a novel perspective on quantum probabilities.
Intuitionism by ‘lifting’ classical logic
Classical logic considers truth-valuation to be a function from statements to a Boolean (denoted as 0,1 here). Relaxing the valuation from a function to a relation leads to a new valuation that is a function into the power-set of the Booleans: { {}, {1}, {0}, {0,1} }. Let’s say that the relation is “there exists a proof that.” We can have a proof-valuation that says “statement A is related to 0 and 1,” which is read as saying that there exists a proof that A is true and also a proof that A is false: we have derived a contradiction. If the proof-valuation is {}, then we say that there does not exist a proof about the truth value of A, meaning that A is undecidable. Relating a statement to {1} or {0} means that we can prove that A is true (and not false) or false (and not true), which is what is meant by constructivist T/F values. Denote the proof-valuation {0,1} as C for ‘contradiction’, and {} as N for ‘undecidable’. Let’s denote the proof-valuation function as V.
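The four proof-values and the membership reading can be sketched concretely. This is a minimal illustration; the statement names and their assigned valuations are invented for the example:

```python
# The four possible proof-values are the subsets of the Booleans {0, 1}.
N = frozenset()        # {}    : undecidable (no proof either way)
T = frozenset({1})     # {1}   : provably true (and not false)
F = frozenset({0})     # {0}   : provably false (and not true)
C = frozenset({0, 1})  # {0,1} : contradiction (both proved)

# A hypothetical proof-valuation V for some invented statements.
V = {"A": T, "B": N, "X": C}

# "1 in V(A) and 0 not in V(A)" is exactly V(A) == T:
assert 1 in V["A"] and 0 not in V["A"]
assert V["A"] == T
```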
We can also talk about membership of 0,1 in the return value of V: “1 in V(A) and not 0 in V(A)” is equivalent to V(A)={1}=T, which means that A is proven to be true and not false. This is stronger than in classical logic, where we might derive a contradiction without knowing it. That is, proving A true is not incompatible with proving that it is also false in classical logic; the axioms are simply inconsistent in that case. The proof-valuation thus seems a bit too strong: V(A)=T is equivalent to “A is proved true and this system is consistent.”
We might wish to make weaker statements that do not assert consistency of the system. In that case, we could state that “1 is in V(A)”, which means that it is possible to prove that A is true, but without making any claim about whether A can also be proved to be false. This corresponds to the classical-logic proof that a statement is true, at least in a system subject to the incompleteness theorem. Another way to state this: V(A) is in {{0,1},{1}}.
We can make other ‘vague’ statements like this as well. Let’s denote the set { {1}, {0} } as R for a ‘real’ statement. All statements in classical logic fall into this category, so that the law of the excluded middle holds. Denote { {}, {0,1} } as U for ‘unreal’, { {}, {0}, {0,1} } as D for ‘not true’, { {}, {1}, {0,1} } as G for ‘not false’, and { {1}, {0,1} } and { {0}, {0,1} } as M and L for ‘possibly true/false (but also possibly inconsistent)’. It’s important to note that saying V(A) is in D (‘not true’) is not inconsistent with saying that V(A) is in G (‘not false’): if V(A) is in their intersection, then V(A) is in { {}, {0,1} }, which means that either A is undecidable or the system is inconsistent.
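These justification-level values can be written down directly as sets of proof-values; a sketch, with the encodings following the definitions above:

```python
# Proof-values (subsets of the Booleans), as in the previous section.
N = frozenset()        # undecidable
T = frozenset({1})     # provably true
F = frozenset({0})     # provably false
C = frozenset({0, 1})  # contradiction

# Justification-level values: sets of proof-values.
R = {T, F}        # 'real': the law of the excluded middle holds
U = {N, C}        # 'unreal'
D = {N, F, C}     # 'not true'
G = {N, T, C}     # 'not false'
M = {T, C}        # 'possibly true (but possibly inconsistent)'
L = {F, C}        # 'possibly false (but possibly inconsistent)'

# 'Not true' and 'not false' are compatible: their intersection is
# exactly the 'unreal' set U = { {}, {0,1} }.
assert D & G == U

# The weaker claim "1 is in V(A)" from the previous section is
# membership in M = { {1}, {0,1} }:
assert M == {frozenset({1}), frozenset({0, 1})}
```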
These values live in the powerset of the powerset of the Booleans: we’ve now climbed up two meta-levels above classical logic. We might want a third valuation, distinct from the truth- and proof- valuations, to discuss this level. Let’s call this the justification-valuation, J. Thus, J(A) = R is read as “there is justification that the proof-value of A is R.” One can continue up this meta-ladder indefinitely, but for now this is as high as I want to go. At this level, we can already discuss many interesting topics.
For instance, we can see that classical logic, as noted, is tantamount to the assumption that all statements have J(A) = R: all are well-formed AND the system of axioms is consistent. Proof-by-contradiction is reified: the proof-value of A is exactly C = {0,1}. Arriving at a contradiction means that the system is not closed in R. We also have a more natural way to express statements that are undecidable, or might be undecidable. Providing a constructive proof takes the justification-value of a statement from a more-vague to a less-vague state. Framing intuitionist logic as simply dropping the law of the excluded middle prevents discussing the reasons why a proved truth-value may not be forthcoming (i.e., distinguishing undecidability from contradiction).
Intuitionism and probability
In this section, I’m going to slightly reinterpret the meaning of the valuation V: rather than proof, I’d like to talk about V as “having empirical evidence (+ reasoning) that demonstrates.”
Suppose Q(A) is a ‘metaprobability’ of A: the probability distribution over the possible values of V(A). The ‘regular’ probability P(A) for a statement to be true or false is then the conditional probability given that V(A) is in R: that is, assuming the question is ‘real,’ what are the relative probabilities of the truth values? This corresponds to the common notion of probability, but it is not the whole story, because intuitionist logic seems to be somewhat more natural for people. Constructing a probability theory for intuitionist logic might resolve some unintuitive effects of applying standard probability theory to everyday reasoning.
Let L be the indicator function that denotes whether the question implied by A is a real question or not, i.e., L(A) = 1 if V(A) is in R and L(A) = 0 if V(A) is in U. Q(L(A)=1) is then the probability of the statement being decidable at all, and the total probability factors as Q(A) = Q(A|L(A)) * Q(L(A)). The conditional metaprobability Q(A|L(A)) is all that exists in classical probability theory. This new theory relates to Dempster-Shafer theory, in which probabilities need not sum to 1: here, the metaprobability does sum to 1, but there are additional states beyond T/F into which probability can ‘leak.’
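A minimal numerical sketch of this decomposition, with invented metaprobability values:

```python
# Metaprobability Q over the four proof-values; the numbers are
# illustrative assumptions, not derived from anything.
Q = {"T": 0.30, "F": 0.20, "N": 0.45, "C": 0.05}
assert abs(sum(Q.values()) - 1.0) < 1e-12  # Q sums to 1

# Probability that the question is 'real' at all: Q(L(A) = 1).
q_real = Q["T"] + Q["F"]

# The classical probability is the conditional Q(A | L(A) = 1):
p_true = Q["T"] / q_real
p_false = Q["F"] / q_real

# The unconditional T/F masses (0.30 + 0.20) do not sum to 1 because
# probability 'leaks' into N and C, echoing Dempster-Shafer theory.
```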
If Q is high for V(A)=C, this indicates we have strong belief that a contradiction exists — we’ve assumed contradictory information. This results in revising our model, questioning assumptions. By operating at this meta-level, our reasoning system can cope with contradictions without breaking down.
If Q is high for V(A) = N, this means we don’t have enough information to trust our answer. This kind of situation drives us to gather more data (observation or experiments).
If Q = 0.5 for each of V(A)=T and V(A)=F, then we are very confident that our reasoning+evidence is correct, but we cannot answer the question better than chance. This lets us distinguish that case from the situation where we have no evidence, where Q is large for V(A) = N. In the no-evidence situation, we might still assign a small but finite metaprobability to each of V(A)=T and V(A)=F, according to Laplace’s principle of insufficient reason. But more importantly, now we can also encode that we have very little confidence in our determination that these two probabilities are equal! This is the key motivation for Dempster-Shafer theory. That theory gets hung up on how to interpret the missing probability, much like the difficulties with the use of ‘Null’ in programming & databases. Here we have two distinct possibilities (for under- and over-constrained truth values, respectively N and C) with very clear meanings.
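The distinction can be sketched as two agents whose conditional T/F odds agree exactly but whose metaprobabilities differ radically; the numbers are illustrative assumptions:

```python
# Two metaprobability distributions with identical conditional 50/50 odds.
confident = {"T": 0.49, "F": 0.49, "N": 0.02, "C": 0.00}  # strong evidence of a fair coin
ignorant  = {"T": 0.05, "F": 0.05, "N": 0.90, "C": 0.00}  # almost no evidence at all

def conditional_p_true(Q):
    """Classical probability: Q(T) conditioned on the question being 'real'."""
    return Q["T"] / (Q["T"] + Q["F"])

# Both give the same classical answer...
assert conditional_p_true(confident) == conditional_p_true(ignorant) == 0.5
# ...but only the metaprobability records how little the second agent knows:
assert ignorant["N"] > confident["N"]
```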
As far as betting goes, if an agent finds that the meta-probabilities for T and F are both small, the agent might prefer not to bet, even if the odds ratio between T and F is large. Alternatively, one could assign the remaining meta-probability equally between T and F for the purposes of taking the expectation. In that case, the effective log-odds would be roughly zero, even though the log-odds of the T/F split is quite large, and the risk/reward tradeoff looks very different. This needs to be fleshed out.
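The effect of splitting the leftover meta-probability can be sketched numerically; the distribution here is an invented example:

```python
import math

# Most mass is on N; the small classical residue favors T by 9:1.
Q = {"T": 0.09, "F": 0.01, "N": 0.90, "C": 0.00}

# Raw log-odds of the T/F split alone: log(9) ~ 2.2, which looks decisive.
raw_log_odds = math.log(Q["T"] / Q["F"])

# Split the non-classical mass equally between T and F before betting:
leftover = Q["N"] + Q["C"]
eff_t = Q["T"] + leftover / 2   # 0.54
eff_f = Q["F"] + leftover / 2   # 0.46
eff_log_odds = math.log(eff_t / eff_f)  # log(0.54/0.46) ~ 0.16, nearly even

assert eff_log_odds < raw_log_odds
```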
Intuitionistic extensions to Bayesian reasoning
Bayesian probability theory is an extremely powerful tool, but it has limitations. The following quote speaks to a key limitation, one I’ve experienced first-hand:
It is very important to understand the following point. Probability theory always gives us the estimates that are justified by the information that was actually used in the calculation. Generally, a person who has more relevant information will be able to do a different (more complicated) calculation, leading to better estimates. But of course, this presupposes that the extra information is actually true. If one puts false information into a probability calculation, then the probability theory will give optimal estimates based on false information: these could be very misleading. The onus is always on the user to tell the truth and nothing but the truth; probability theory has no safety device to detect falsehoods.
G. L. Bretthorst, “Bayesian Spectrum Analysis and Parameter Estimation,” Springer-Verlag series ‘Lecture Notes in Statistics’ #48, 1988, pp. 30–31. [Emphasis mine]
I should disclaim this by noting that the log of the evidence (that is, the Bayesian generalization of the ‘chi-squared’ value) does indicate when an event is extremely improbable according to a model. Nevertheless, there are pitfalls here, which intuitionist probability theory could avoid, by providing a way to explicitly represent contradictory and insufficient information in a direction orthogonal to the odds ratio between the probabilities for truth and falsity. It seems to me that probability is overloaded, but not because probability theory needs to be modified. Rather, common-sense reasoning is more akin to intuitionist logic, while probability theory is the natural extension of classical logic. Thus, creating a probability theory appropriate for intuitionism might prove valuable for modeling common-sense reasoning.
As a concrete example, consider the following scenario posed by N. Taleb: you are told that a coin is fair, but then you watch as the coin is flipped and results in 100 ‘heads’ in a row. You are offered a bet on the next toss. How would you bet? A naive application of probability theory would result in still believing the initial assertion that the coin was fair, while more sophisticated application would conclude that the coin toss had been rigged. We would be less surprised to have been lied to than to see an event with 30 orders of magnitude of improbability. However, if we had rounded our initial probability of being lied to down to zero (suppose we had reason to trust the person tossing the coin), then we would never be able to update our probability to a finite value – Bayesian updating is always multiplicative.
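The zero-prior trap can be made explicit with a toy Bayesian calculation; the ‘rigged’ hypothesis and its heads-probability are assumptions chosen for illustration:

```python
def posterior_rigged(prior_rigged, n_heads=100, p_heads_if_rigged=0.99):
    """Posterior probability the coin toss is rigged, after n_heads heads.

    Assumes two hypotheses: 'fair' (heads prob 0.5) and 'rigged'
    (heads prob p_heads_if_rigged, an illustrative choice)."""
    like_fair = 0.5 ** n_heads               # ~ 8e-31: a 30-orders-of-magnitude surprise
    like_rigged = p_heads_if_rigged ** n_heads
    num = prior_rigged * like_rigged
    return num / (num + (1 - prior_rigged) * like_fair)

# Even a tiny prior on 'rigged' is driven to near-certainty by 100 heads...
assert posterior_rigged(1e-9) > 0.999

# ...but a prior of exactly zero can never update, because Bayesian
# updating is multiplicative:
assert posterior_rigged(0.0) == 0.0
```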
We have some freedom to determine the updating rules for intuitionist probabilities. In particular, one might desire that probability is transferred away from ‘classical’ T/F values and toward C when we have contradictory evidence/assumptions. Also, by reserving probability for N when we have little/no information, we could set T and/or F to zero and still allow them to become finite later. I haven’t worked this all out yet.
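As one purely illustrative possibility (the actual rules are left open above; everything here, including the transfer-rate parameter, is an assumption), a toy update rule satisfying these two desiderata might look like:

```python
def observe_support_true(Q, rate=0.5):
    """Supporting evidence for A: transfer a fraction of N-mass to T."""
    moved = rate * Q["N"]
    return {**Q, "N": Q["N"] - moved, "T": Q["T"] + moved}

def observe_conflict(Q, rate=0.5):
    """Contradictory evidence: transfer classical T/F mass toward C."""
    mt, mf = rate * Q["T"], rate * Q["F"]
    return {**Q, "T": Q["T"] - mt, "F": Q["F"] - mf, "C": Q["C"] + mt + mf}

# Start with T = 0 and all mass on N (no information yet). Unlike a
# zeroed classical prior, T can still become finite later:
Q0 = {"T": 0.0, "F": 0.0, "N": 1.0, "C": 0.0}
Q1 = observe_support_true(Q0)
assert Q1["T"] == 0.5 and abs(sum(Q1.values()) - 1.0) < 1e-12
```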
Intuitionism and quantum probability?
I have a hunch that one could interpret the complex probability amplitudes in quantum mechanics by a similar process of ‘lifting’ the state space from classical truth values to some other meta-valuation, possibly the ‘observation’ valuation. That is, one would say “the probability that statement A is observed to be true is 50%”, rather than saying “the probability that the statement is true is 50%.” This is in the spirit of the Copenhagen interpretation of QM. Note that, before an event has occurred, the probability of its outcome having been observed to be anything is zero! That is, the N-value (not enough information) comes into play naturally. This also takes care of the counterfactual difficulties: since the measurement was not conducted, the probability of the outcome will always be 100% on the N-value.
On the other hand, we can rule out the possibility of contradictory observations: the C-value always has zero probability, because objective reality is single-valued. I suspect that this, together with the usual normalization condition, suffices to determine the Born rule. Together they reduce the degrees of freedom of the meta-probability from 4 to 2. This allows mapping the meta-probability to a complex number, which also has two degrees of freedom. I haven’t figured out how to do this mapping yet.
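The degrees-of-freedom count can be written out explicitly (a sketch only; the actual mapping to a complex amplitude remains open, as noted):

```latex
% Meta-probability over the four observation-values, with two constraints:
Q(T) + Q(F) + Q(N) + Q(C) = 1 \quad \text{(normalization)}, \qquad Q(C) = 0 .
% Two constraints on four values leave two free parameters, e.g.
\bigl(Q(T),\, Q(F)\bigr), \qquad Q(N) = 1 - Q(T) - Q(F),
% matching the two real degrees of freedom of a complex amplitude z = x + i y.
```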
Possible interesting consequences: wavefunction collapse might be viewed as the transition from an indeterminate state (N) to a definite state (T/F). The reason that the marginal probabilities are 50/50 for the outcome of one of the spin measurements in a Bell entanglement experiment (with perpendicular polarizers) could change over time: initially, it would be due to lack of evidence (i.e., all metaprobability is concentrated on the N-value), but once the measurement on the other side has been made and the result transmitted, the probability switches to being 50% true and 50% false, because now the outcome is a matter of classical probability. Strange quantum-information effects such as negative conditional entropy and superdense coding might be easier to comprehend in this interpretation.
This idea was partly inspired by the two-state-vector formalism, which holds that the wavefunction is not a complete determination of the quantum state. Rather, two wavefunctions are necessary, one from the past and one from the future. Representing the incompleteness of the state via constructive/intuitionistic logic seemed natural: the ‘truth’ of a theorem is contingent upon the actual historical progress of mathematics. Rather than being a Platonic ideal that is true or false for all time, a theorem only assumes a ‘truth’ value once a proof (or disproof) exists. This same contingency (or contextuality, if you will) naturally applies to statements deriving their truth from the observed outcomes of experiments.
[Edit: This was inspired in part by Ronnie Hermens’ thesis, “Quantum Mechanics: From Realism to Intuitionism – A mathematical and philosophical investigation”]