PROBABILITY: THE LOGIC OF THE LAW – A RESPONSE
(This is an author-produced
electronic version of an article published in Oxford Journal of Legal Studies following peer review. The definitive
publisher-authenticated version, (1995) 15 Oxford Journal of Legal Studies 51-68, is available online.)
In their article
‘Probability: the logic of the law’ (1993), Robertson and Vignaux
(henceforth RV) argue that the (quantitative) rules of probability are the
‘uniquely determined set of rules for conducting inference’; and RV make it
clear that they intend no qualification to this -- so that they really are
saying that rational inference always involves following these rules as closely
as possible. I wish to contend that,
while compliance with logical rules (including quantitative rules of
probability, where they are applicable) is necessary for satisfactory
inference-drawing or legal fact-finding, it is very far from sufficient; and
that the most important and interesting aspects of satisfactory fact-finding
lie elsewhere.
There is a considerable literature on
the application of rules of probability to legal inference; but in this article
I will take two new approaches. The
first is narrow and focused, giving a complete hypothetical case, outlining the
kind of reasoning one might expect to find in practice, and considering how
such a case could be decided by application of quantitative rules of
probability. The second is broad and
eclectic, relating the debate to wider philosophical issues concerning the formalization
of human rationality and the nature of the human mind. In between, I briefly outline my views on
the proper role for quantitative probability in legal decision-making, and I
question RV’s approach to combining probabilities.
1 A hypothetical case
Let me begin with
my hypothetical case. The facts are
fairly simple but realistically messy; and the problem is greatly simplified by
the fact that there is only one controversial witness (upon whose demeanour
nothing turns), no significant questions of accuracy of recollection or
inadvertent reconstruction of events, and no evidence of conversations the
precise terms of which could have legal significance. Yet I will argue that it would be patently absurd to try to
decide this case by applying quantitative rules of probability.
In 1982, a de facto couple,
William Smith and Jane Jones, buy a house in Carlton, Melbourne, as joint
tenants, for $80,000. They borrow
$50,000 from a bank, on the security of a mortgage over the property, which
they both sign as borrowers. The
balance comes from the sale of jointly owned property. They have no children, they are both
employed as high school teachers, they each own a car and have some individual
savings, but (so far as the evidence goes) they have no other major assets.
About two years later, William buys a
house in Collingwood, Melbourne, for $85,000, borrowing $60,000 from a building
society. The building society has a
stated policy of not lending to anyone with another house, and in the form
applying for the loan William states that he has no interest in any other house
or land, that he intends to reside in the Collingwood house, and (in giving the
sources of the $25,000 he has to contribute) that a Ms Jane Jones is to provide
$8,000 towards the acquisition.
About three months after settlement of
the purchase, William goes to live in the Collingwood house, ending the de
facto relationship.
At that time, there is still about
$50,000 owing on the Carlton house, and the title deed is held by the
bank. No lawyers are consulted, and
nothing is done to change the legal ownership of this property, and no document
evidencing any agreement affecting the ownership is produced to the court.
For the next nine years, Jane lives in
the Carlton house, paying all outgoings and mortgage interest, and paying off
all but $1,000 of the mortgage principal.
William lives in his Collingwood house, paying off his mortgage to the
building society. Neither of them forms
any other long-term relationship, and they remain on reasonably cordial terms.
In 1993, Jane dies suddenly. Her will appoints her sister (who has lived
in Western Australia since 1980) her executrix and sole beneficiary of her
estate. Three months later, William
goes to the bank, pays the $1,000 owing on the mortgage, obtains the title
deed, and claims outright ownership of the Carlton house as surviving joint
tenant.
Jane’s sister brings proceedings
seeking a declaration that the estate beneficially owns the property; or else
has a half interest in it, and/or the benefit of a charge arising out of the
payment of $49,000 of the mortgage principal.
The only controversial oral evidence
in the case (which lasts one day) is given by William. He says he purchased the Collingwood house
mainly as an investment but also as somewhere to live if the relationship with
Jane should break down; that although at one stage he intended to borrow $8,000
from Jane, he was in the event able to find the whole $25,000 himself, with the
help of a personal loan of $8,000 from another financier; that Jane gave him no
financial assistance with the purchase, and there was no agreement or
discussion between them concerning his interest in the Carlton property or his
liability under the joint mortgage to the bank. He says that, although he told lies to the building society to
obtain the loan, he regards giving evidence on oath in court as a very serious
matter, and he is telling the truth.
The only bank records still available
from the time of the purchase of the Collingwood property are some of William’s
bank statements: no records are
available concerning his alleged personal loan, or of Jane’s bank accounts at
the time. William gives evidence about
the sources of the other $17,000 of the purchase price (savings, earnings, sale
of car), but cross-examination shows he does not account for at least $4,000 of
this $17,000.
The judge gives the following reasons
for her decision in the case.
I first consider how much reliance
can be placed on William’s sworn evidence.
(1) He was vigorously cross-examined, but maintained his version of
events, and nothing in his demeanour assists me one way or the other. (2) If his evidence is true, he was prepared
deliberately to misrepresent important matters to the Building Society to gain
a financial advantage, so he may be willing now to give false evidence to gain
a financial advantage. Further, (3) it
seems unlikely that, in a reasonably cordial parting between these two people,
there should be no agreement or even discussion concerning the house they had
lived in or the joint mortgage liability in respect of it, particularly if
another house was being purchased as part of the separation process. (If
William had admitted to a discussion, he would have been cross-examined about
its terms, and his answers tested against a range of circumstances). It also seems unlikely (4) that the
Collingwood property was purchased primarily as an investment, and not as part
of the process of separation, having regard to the means of the parties and the
fact that they parted soon afterwards, or (5) that Jane made no contribution at
all to the price of the Collingwood property, having regard to the intention
stated to the Building Society and William’s failure to give any explanation of
the source of $4,000 of the price. (6)
While I could give William the benefit of the doubt (that his evidence was
false, and deliberately so) on one or two of these matters (3) to (5), the
three of them added to (2) lead me to decide that he is probably willing to lie
on oath in order to advance his case, and probably has done so.
I
now have to consider what, if any, positive inferences can be drawn from the
other facts and evidence, and in particular whether I can infer that there was
an agreement by William to give up his interest in the Carlton house in return
for some appropriate consideration. (7)
Such an agreement has some support from William’s statements to the Building
Society which, if there was such an agreement, could be considered true in
substance if not strictly true in form; although (8) I must allow for the fact
that they were made by a person I believe is probably willing to lie on
oath. (9) I think it is quite probable
that Jane did provide the $8,000 which William told the Building Society she
would provide, since I have no reason to doubt that her intention was as
William said it was in his application, and (10) I have no confidence in
William’s evidence about a personal loan from a financier; and (11) I think it
is also quite probable that, in one way or another, Jane provided the $4,000
which William did not account for.
(12) I would expect that the consideration, in any such agreement, for
William’s interest in the Carlton house would approximate the value of a half
interest, that is about $15,000 or perhaps a little more, and there is positive
support in the evidence for payments by Jane to William of no more than about
$12,000; but the evidence does not rule out other adjustments between
the parties. (13) The parties failed
over nine years to give proper legal effect to any such agreement, but this by
no means precludes the existence of such an agreement, because no lawyers were
involved in the separation, the parties remained on cordial terms, and Jane
would have got possession of the title deed when she had finished paying off
the mortgage. (14) Indeed the fact that
for as long as nine years William was living in and paying the mortgage on his
house, while Jane was living in and paying the mortgage on the Carlton house,
supports the existence of an agreement of that general kind. (15) I am more confident about drawing an
inference against William, because whereas Jane is unable to tell her story,
and there is no suggestion that there is any other person in a position to know
the facts, William has probably given deliberately false evidence, suggesting
that the truth is probably against his interests; and all in all, I am
satisfied, on the balance of probabilities, that such an agreement was made.
The judge goes on to decide that the lack of any evidence of documentation of the agreement is overcome by part performance, and decides that Jane’s estate is entitled to the Carlton house (subject to reimbursement of $1,000 to William). In case an appeal court might take a different view, she also considers whether or not a case was made out for severance of the joint tenancy, and/or for an equitable charge in favour of Jane’s estate; but I will not set out her reasons on these matters. The judge gives her decision the morning after the hearing, and then goes on to another case, no less resistant to decision by application of quantitative rules of probability.
The judge’s reasoning is not
conclusive, and some may disagree with her decision; but no one could deny that
the reasons are rational and quite persuasive.
I believe most people would prefer to have their cases decided on the
basis of reasons of that kind, rather than to have them decided wholly by a
mathematical computation of probabilities, which almost no one could
understand, so that a case like my example might be decided by a computed
probability of 0.51 in favour of one result or another - even if such a
computation were possible. More to
the point, I contend it is not generally possible (as well as being wildly
impractical).
2 The application of quantitative probabilities
How would RV’s
ideal judge tackle this case?
According to them (p462), it should be
decided by application of Bayes’ theorem, which is an important theorem of
probability theory, devised in the eighteenth century by the Revd Thomas
Bayes. It states:
P(H|E) = P(H) x P(E|H) / [P(H) x P(E|H) + P(not-H) x P(E|not-H)]
where:
P(H|E) is the probability of the
hypothesis, given the truth of the evidence (posterior probability of the
hypothesis);
P(H) is the probability of the hypothesis,
before considering the evidence (prior probability of the hypothesis);
P(E|H) is the probability of the evidence,
given the truth of the hypothesis;
P(not-H) is the (prior) probability of the
falsity of the hypothesis;
P(E|not-H) is the probability of the evidence,
given the falsity of the hypothesis.
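By way of illustration only, the theorem as stated above can be written as a short Python sketch. The function name, parameter names and the numbers used are hypothetical; they are not taken from RV's article or from the hypothetical case.

```python
def posterior(prior_h: float, p_e_given_h: float, p_e_given_not_h: float) -> float:
    """Return P(H|E) from P(H), P(E|H) and P(E|not-H), using Bayes' theorem."""
    for p in (prior_h, p_e_given_h, p_e_given_not_h):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie between 0 and 1")
    # P(H) x P(E|H)
    numerator = prior_h * p_e_given_h
    # P(H) x P(E|H) + P(not-H) x P(E|not-H), where P(not-H) = 1 - P(H)
    denominator = numerator + (1.0 - prior_h) * p_e_given_not_h
    if denominator == 0.0:
        raise ValueError("the evidence has zero probability under both hypotheses")
    return numerator / denominator


# Illustrative numbers only (assumed, not from the article): a hypothesis that
# starts at even odds, and evidence four times as likely if it is true.
example = posterior(prior_h=0.5, p_e_given_h=0.8, p_e_given_not_h=0.2)
print(round(example, 2))
```

Everything of substance lies in the three input numbers; the function merely keeps them consistent with one another.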
To apply this
theorem, real numbers should be used to represent degrees of belief, with 1
representing certainty of truth and 0 representing certainty of falsity. Intermediate degrees of (reasonable) belief
should be represented by appropriate intermediate numbers, with 0.5
representing equal likelihood of truth and falsity, and so on. RV recommend that inference proceed by
assigning such numbers to the probabilities of hypotheses, given certain
evidence, and vice versa. These
probabilities should then be manipulated in accordance with the uniquely
determined rules of probability (including Bayes’ theorem) to find the
probability for the ultimate question before the court.
In attempting to apply this method to
my hypothetical case, an initial problem is that every one of the judge’s
statements (2) to (15) is a judgement of probability entirely unsupported by
any application of rules of probability.
They are judgements or opinions, based on nothing more than common
sense, experience of the world, and beliefs as to how people behave (folk
psychology). If it is accepted, as I
believe it must be, that the case has to be decided through consideration of
statements of this general kind (not necessarily the same ones as the judge
used), then the postulated ideal judge would either have to accept them
unsupported by rules of probability, or somehow justify them by applying rules
of probability.
I would be intrigued to see an attempt
to justify a numerical probability for any of these statements by applying
rules of quantitative probability to certain or reliable information - for
example, the statement in (2) to the effect that a person who has lied in a
substantial way to a building society, to get a favourable loan, may be willing
to give false evidence in court many years later, to win a case. But I don’t believe I will see such an
attempt, because (I think) it would be an absurdity. And the same applies to all the other statements.
If this is accepted, then the
application of the rules of probability has to begin after statements of that
general kind are accepted. This is
already a huge inroad into the suggestion that the rules of (quantitative)
probability are the rules for conducting inference; because some very important
steps in conducting the inference in this case (perhaps the most important)
must first be taken without regard to those rules.
Overlooking this, RV’s ideal judge now
tries to reach his conclusion from statements of this kind by applying the
rules of probability to them. First, he
will need to represent the probability involved in each statement by a
number. From the previous discussion,
the number cannot be calculated. The
judge just has to use his common sense, experience of the world, and folk
psychology, to hit upon a number which he thinks fairly represents the
probability in question. If he is
anything like me, the best he could do would be to give a range - say, from
0.05 to 0.1 to things which seem really unlikely; 0.2 to 0.3 to things which
seem rather unlikely; 0.4 to 0.6, say, for things which seem moderately likely;
and 0.7 to 0.9, say, for things which seem really likely.
One problem with this is that
mechanical application of the rules of probability to such rubbery figures seems
hardly likely to give a result which will decide the case. A more precise (and associated) problem is
that in fact the processes of inference from such statements, which are likely
to occur in any realistic case, are not mere manipulations of numerical
probabilities, but do themselves involve common-sense judgements of likelihood
of the same type as those made in reaching the basic statements. For example, step (6) in the hypothetical
case, by which a conclusion was drawn from statements (2) to (5), involved
among other things a common-sense view as to the unlikelihood of an innocent
explanation for three, as opposed to two or one, probable falsehoods. And step (15) involved a common-sense view
as to the probability of the truth being unfavourable to William.
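The point about rubbery figures can be illustrated numerically. The following is a minimal sketch, assuming (purely for illustration) twelve Bayesian steps that may be chained as if each item of evidence were independent of the others given the hypothesis, and assuming that every input is known only to lie in the 'moderately likely' range of 0.4 to 0.6. The function names and the choice of ranges are hypothetical and do not reproduce any actual judgement.

```python
def bayes_update(prior, p_e_given_h, p_e_given_not_h):
    """One application of Bayes' theorem: returns P(H|E)."""
    numerator = prior * p_e_given_h
    return numerator / (numerator + (1.0 - prior) * p_e_given_not_h)


def posterior_bounds(prior_range, step_ranges):
    """Lowest and highest posterior consistent with the given input ranges.

    The posterior rises with the prior and with P(E|H), and falls with
    P(E|not-H), so the extremes are reached at the ends of each range.
    Assumes all range endpoints lie strictly between 0 and 1.
    """
    low, high = prior_range
    for (h_low, h_high), (not_h_low, not_h_high) in step_ranges:
        low = bayes_update(low, h_low, not_h_high)
        high = bayes_update(high, h_high, not_h_low)
    return low, high


# Assumption for illustration: every figure is only "moderately likely".
MODERATELY_LIKELY = (0.4, 0.6)
steps = [(MODERATELY_LIKELY, MODERATELY_LIKELY)] * 12

low, high = posterior_bounds(MODERATELY_LIKELY, steps)
print(f"posterior lies somewhere between {low:.3f} and {high:.3f}")
```

On these assumed inputs the bounds fall below 0.01 and above 0.99, so the computation by itself settles nothing; any narrowing has to come from judgement about the individual figures.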
As an exercise, I have written a
judgement for the hypothetical case, which applies Bayes’ theorem; and set it
out in a postscript (Section 5). It
required two assumptions of prior probabilities of hypotheses, and twelve
Bayesian steps, each involving two assumptions of numerical probabilities of
evidence, given the truth or falsity of hypotheses: twenty-six guesses in all.
In all twenty-six, I found I had virtually no confidence in the numbers
I initially selected (in some cases partly because of unsureness of exactly
what question I was asking, as well as because I just had to guess the answer);
and I felt I had to check the numbers against the plausibility of the results,
and then adjust (and re-adjust) the numbers, in order to arrive at numbers in
which I had very slightly more confidence.
That is, I had to cheat. Such
little confidence as I ended up with depended very heavily on my common-sense
assessment of the plausibility of the intermediate results and the
conclusion. The exercise strongly
suggests that in realistic situations Bayes’ theorem can fairly be regarded as
a procedure for checking the consistency of one’s intuitions as to probability
- and not as anything more than this.
I think my hypothetical case shows
that, for ordinary contested cases, it is fanciful to envisage a process by
which a court manipulates probabilities fixed upon for certain basic statements
(premisses) to arrive at a decision of the case (conclusion). In all steps from the premisses to the
conclusion, a judge will generally have in the forefront of her mind the actual
particular circumstances of the case, and will be making common-sense judgements
of (non-quantitative) probability in making these steps (as well as in
determining upon the premisses).
Indeed, the ultimate decision on the facts will generally itself be a
common-sense judgement of non-quantitative probability concerning the overall
situation, of very much the same kind as gave rise to the premisses - and very
often the judge will (rightly) be more confident of reaching a correct overall
conclusion ‘on the balance of probabilities’ than of assigning even approximate
numerical probabilities to the premisses.
One may call RV’s model of
fact-finding an actuarial model.
I would suggest a better model for most judicial fact-finding would be a
recognition model, assimilating it to a decision as to whether a person
now in one’s presence is a particular person with whom one was acquainted many
years ago, but has not seen since. In
arriving at this decision in a doubtful case, one will pay close attention to
component features and characteristics - eyes, nose, mouth, facial shape, hair,
height, manner, etc. - having regard to the changes likely to have resulted
from the passage of years. However, one
does not assess numerical probabilities concerning each of these
characteristics (e.g. that this is X’s nose, etc.) and compute from that a
probability of identity. Rather, one
will make an overall judgement, in which the contributions of the component
characteristics cannot be clearly separated out. This is a process of inference, to which quantitative rules make
no explicit contribution.
The superiority of the recognition
model will be even more pronounced in most court cases involving disputed
questions of fact than in my hypothetical case, since most such cases
depend even more heavily on common-sense reasoning of the type I have outlined,
because (inter alia):
(1) Generally there will be different versions
of the facts given by different witnesses.
(2) There will often be difficult questions, not
just as to honesty, but also as to the accuracy of recollection and the
possibility of reconstruction of events partially recalled.
(3) The vexed issue of demeanour is sometimes
important, in which case it needs to be considered in connection with the
nature of the evidence being given from time to time by the witness in
question.
(4) In property and contract cases, very often
the precise terms of imperfectly remembered conversations have important legal
consequences.
In their article, RV discuss various
objections to their thesis, including some a little like mine: that evidence must be interpreted, that
people actually compare hypotheses, that the complexity of evidence precludes
the application of quantitative rules of probability, and that decisions are
made in a holistic way rather than by dissection of elements. Their answers are abstract: I think the full impossibility of their
position is best demonstrated by a realistic example.
3 The role of probability
It is not my
position that quantitative rules of probability are unimportant in legal
fact-finding - although I do say they are less important than RV imagine. And I am certainly not suggesting disregard
of quantitative rules of probability.
What I am saying is that logical rules of all kinds, including
quantitative rules of probability, should be a guide, adjunct, and corrective
for plausible common-sense reasoning, but not a substitute for it.
And there is much in RV’s article I
agree with. I accept their
characterisation of (quantitative) probability as a numerical measure of the
strength of belief based on rational consideration of available evidence (p462)
(although, as I will note later, I do not agree with everything they say about
‘the mind projection fallacy’). I think
they make out a powerful case for the view that, for those inferences where a
quantitative analysis of probability is appropriate, their quantitative rules of
probability (p463-8) are the uniquely determined set of rules for conducting
inference; and I agree that the application of such rules is by no means
limited to frequentist probabilities (p469) - so I agree with their rejection
of any valid distinction between different kinds of mathematical
probability. I also agree that
application of quantitative rules of probability can help avoid fallacies
detected in untutored thinking (p460), and I will return to this.
(a) Non-quantitative probability
However, as argued
above, I say that much legal inference is necessarily non-quantitative. And I suggest that the probability which the
law requires for a finding for the plaintiff in a civil case, as well as for a
finding of guilt in a criminal case, is not primarily a quantitative
probability at all.
In criminal cases, juries are never
directed in terms which refer to a quantitative probability (such as 0.9 or
0.95); but rather are directed in terms requiring a finding ‘beyond reasonable
doubt’, with a minimum of elaboration on these hallowed words. Quantitative ideas are suggested by the
civil standard of proof of ‘the balance of probabilities’, and civil juries are
sometimes directed in terms that it is enough to find for plaintiffs that the
scales of justice be weighed down, by ever so little, in their favour; but even
in civil cases, I think the better view is that what is required is reasonable
satisfaction, for which a mere quantitative preponderance of probability may be
insufficient.
This view was strongly put by Dixon J
of the High Court of Australia in Briginshaw v Briginshaw (1938) 60 CLR
336 at 361-2:
... when the law requires the proof of any fact, the tribunal must feel an actual persuasion of its occurrence or existence before it can be found. It cannot be found as a result of a mere mechanical comparison of probabilities independently of any belief in its reality. No doubt an opinion that a state of facts exists may be held according to indefinite gradations of certainty ... [A]t common law ... it is enough that the affirmative of an allegation is made out to the reasonable satisfaction of the tribunal. But reasonable satisfaction is not a state of mind that is attained or established independently of the nature and consequences of the fact or facts to be proved. The seriousness of an allegation made, the inherent unlikelihood of an occurrence of a given description or the gravity of the consequences flowing from a particular finding are considerations which must affect the answer to the question whether the issue has been proved to the reasonable satisfaction of the tribunal.
I suggest there are
two main reasons why legal standards of proof are not primarily matters of
mathematical probability, reasons which themselves suggest guidelines for the
proper role of mathematical probability in legal fact-finding.
First, as argued above, in the general run of cases, decision-making is not a matter of mathematical computation, but is rather a matter of coming to a decision by the exercise of judgement, to the best of one’s ability and in accordance with common sense. Common sense, experience of the world, and folk psychology are generally far more important to fact-finding than quantitative rules of probability; and it is consistent with this that the ultimate question for the court should generally be defined in common-sense rather than purely quantitative terms.
Secondly, mathematical probabilities
can be based on the slightest and most general of information, whereas
reasonable fact-finding in particular cases requires, so far as reasonably
possible, that there be evidence directed to the particular facts.[1] Accordingly, courts should decline to give
effect to mere numerical probabilities, in the absence of evidence directed to
the particular facts, particularly where a reasonable endeavour by the party
with the onus of proof should have produced such evidence.
(b) The place of quantitative probabilities
The two reasons
given above suggest that the role for quantitative probabilities should be
greatest where their common-sense probative force in particular cases is
greatest, and/or where the party with the onus of proof cannot reasonably be
expected to produce evidence bearing more directly on the particular case.
Obvious examples of the former
category are DNA evidence in identification of criminals and paternity cases,
fingerprint evidence, and ballistic evidence; and I will say a little more
about this class of case in the next section.
As for the second category, consider
two much-discussed hypothetical examples: the gatecrasher case and the blue bus
case. The former concerns a rodeo
attended by 1,000 people, when only 400 tickets were sold, and the promoter
sues all 1,000 for the price of admission; and the latter concerns a case
brought by a person injured at night by a negligently-driven bus which the
plaintiff genuinely cannot identify, and the only evidence involving the
defendant is that 80 percent of buses operating in that area are operated by
the defendant’s Blue Bus Company. The
consensus seems to be that in neither case would the defendant(s) have to give
evidence in order to escape liability; but I think there is an important
difference between them.
In the gatecrasher case, I think it is
clear that a suggested 0.6 probability that each defendant did not pay for
admission could not justify a decision in favour of the promoter, even in the
absence of evidence from the defendants; because it would be unreasonable for
the promoter to seek to make out a case in this way, without evidence of the
ticketing system (or explaining its absence), of the possible means of access
without payment, of challenges to and explanations from particular spectators,
and so on.
However, in the blue bus case, it may
be that no other evidence is reasonably available to the plaintiff, in which
case I think the Blue Bus Company should have to give evidence if it is to
escape liability. Suppose that the
plaintiff has advertised without success for witnesses, and has subpoenaed
the records of the Blue Bus Company, and also of the Red Bus Company which runs
the other 20 percent of buses - and that these records leave the odds unchanged
when applied to the place and time range of the accident. Despite the absence
of evidence relating to the particular case, and of evidence excluding a
maverick outsider bus, it seems to me that if the Blue company does not call
its relevant drivers to deny that they were involved (or call some and explain
inability to call the rest), then the court could draw an inference against it,
on the balance of probabilities.[2] If the Blue company does call evidence, then
the case becomes one in which the quantities are just part of the overall
material which the court has to assess.
And despite my general sympathy with
Dixon J’s approach, I can envisage cases in which a bare mathematical
probability would justify a decision, even without assistance from a
defendant's failure to give evidence.
Suppose that a woman lives with two men and her child, and although it
is clear that one of the men is the child’s father, there is nothing in the
circumstances (or the birth certificate) to suggest which one. The three adults are killed in an accident,
and actual paternity is relevant to succession. If blood tests give respective probabilities of paternity of 0.55
and 0.45 for the two men, and there is just no other evidence reasonably
available to the child (or anyone else), I think a court should find paternity
on the basis of this bare mathematical probability, rather than make no finding.
(c) The role of quantitative rules
In those cases
where mathematical probabilities are relevant, application of the correct
quantitative rules can help avoid fallacies of inference, just as logical rules
can do.
In this general area, the work of Kahneman and Tversky (see Kahneman, Slovic, and Tversky 1982, and Cohen 1977), suggesting how common-sense reasoning can go astray through illogicality and unconscious biases, is relevant. One of their famous examples illustrates the ‘conjunction fallacy’: the failure to understand that the likelihood of a conjunction of events or states of affairs can never be greater than the likelihood of any element of that conjunction. This is the case of Linda.
Linda is 31 years old, single,
outspoken and very bright. She majored
in philosophy. As a student she was
deeply concerned with issues of discrimination and social justice, and
also participated in antinuclear
demonstrations.
Experiments were
conducted in which subjects were asked to rank, in order of probability, a
number of statements about Linda.
Perhaps the most telling result concerned 142 subjects, asked to rank in
order of probability the statements ‘Linda is a bank teller’ and ‘Linda is a
bank teller and is active in the feminist movement’. Of these subjects, 85% ranked the latter as more likely (Tversky and Kahneman 1983).
Of course, this is a logical error -
and yet even this example suggests a strength of common-sense reasoning. The example is highly artificial: no such question is likely to present itself
in actual decision-making. What is
likely to arise is a question of the most appropriate categorisation of this
young woman, with a view to assessing her character and conduct. On that question (admittedly not the
question asked), the common-sense response is not inappropriate.
In legal fact-finding, Bayes’ theorem
can alert tribunals to the necessity of taking account of ‘prior probabilities’
when dealing with statistical evidence.
For example, if DNA evidence shows that only one person in ten thousand
could have the DNA markers of the perpetrator of a crime and the accused has
those markers, that does not of itself mean that there is 0.9999 probability
that the accused committed the crime.
On the other hand, if there are (say) one million people who conceivably
could have committed that crime, and thus about 100 with the same DNA markers,
it is also wrong to argue that the DNA evidence is irrelevant.
What has to be taken into account is
the effect of the other evidence in the case.
For example, if there is no other evidence which picks the accused out
from (say) 5,000 other persons who could have committed the crime, then Bayes’
theorem shows that the DNA statistic will only give an overall probability of
about 0.67,[3] far too low for
proof beyond reasonable doubt. If other
evidence indicates that the accused is one of only 500 who could have committed
the crime, then, even in the absence of any further evidence isolating the
accused, Bayes’ theorem shows that the DNA evidence will increase the
probability that he committed the crime to about 0.95,[4] and it may be that
not much more is needed for proof beyond reasonable doubt.
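The arithmetic behind these figures can be sketched in a few lines of Python. This is an illustration only: the function name `posterior` and the constant `MATCH_RATE` are my own labels, and the sketch assumes, as the example above does, that the perpetrator is certain to have the markers and that, absent other evidence, every person in the pool is equally likely to be the perpetrator.

```python
def posterior(prior, p_e_given_h, p_e_given_not_h):
    """Bayes' theorem for one hypothesis H and one piece of evidence E."""
    numerator = prior * p_e_given_h
    return numerator / (numerator + (1 - prior) * p_e_given_not_h)


# Assumed from the example: one person in ten thousand has the markers.
MATCH_RATE = 1 / 10_000

# Pool sizes taken from the text: one million, 5,000, and 500 possible perpetrators.
for pool in (1_000_000, 5_000, 500):
    p = posterior(prior=1 / pool, p_e_given_h=1.0, p_e_given_not_h=MATCH_RATE)
    print(f"pool of {pool:>9,}: P(guilt | DNA match) = {p:.4f}")
```

With a pool of 5,000 the result is about 0.67, and with a pool of 500 about 0.95, as stated above; with a pool of one million it is roughly 0.01, which is small but far from irrelevant.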
This is just an
example: there are many ways in which
the quantitative rules of probability can help ensure that evidence of
probabilities is not misapplied.[5]
(d) Combining probabilities
A very significant
difference between the non-quantitative approach of Dixon J, and the
quantitative approach recommended by RV, concerns the combination of
probabilities.
RV make the sweeping assertions
(pp472-3) that ‘the task of the court is to determine the odds that the
defendant is liable’; and that if this liability can arise through any of two
or three different events, then it will be sufficient to prove the requisite
degree of probability that one or more of these events occurred, without
proving which one. Similarly, they assert (p464n) ‘that in civil cases where
liability must be proved on the balance of probabilities the essential elements
of liability, e.g., duty of care, breach of duty and loss or damage must each
be proved to a higher standard’.
So, they would say, if a plaintiff was
entitled to succeed on any of three entirely independent grounds, and each is
shown to have an independent probability of 0.25, then the probability of the
defendant being liable is 0.25 + (0.25 x 0.75) + (0.25 x 0.5625), or
about 0.58; and the plaintiff wins. But
if the plaintiff needs to show duty, breach, and damage, and shows a
probability of each (independent of the others) of 0.65, then the probability
of the defendant being liable is only 0.65 x 0.65 x 0.65, or about 0.275, and
the plaintiff loses.
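The two calculations attributed to RV here are the standard rules for independent events, and can be checked with a short Python sketch. The function names are hypothetical labels of mine, and the sketch assumes full independence, as the example does (`math.prod` requires Python 3.8 or later).

```python
from math import prod


def prob_at_least_one(probs):
    """P(at least one of several independent events occurs)."""
    return 1 - prod(1 - p for p in probs)


def prob_all(probs):
    """P(all of several independent events occur)."""
    return prod(probs)


alternative_grounds = [0.25, 0.25, 0.25]  # any one ground suffices
required_elements = [0.65, 0.65, 0.65]    # duty, breach, and damage all needed

print(round(prob_at_least_one(alternative_grounds), 3))
print(round(prob_all(required_elements), 3))
```

The first figure, 1 - 0.75^3, is the same quantity as the sum 0.25 + (0.25 x 0.75) + (0.25 x 0.5625) given above.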
Looking first at the latter case, in
relation to damages it may be necessary to distinguish between those events
which allegedly did in fact occur, and those events advantageous to the
plaintiff which allegedly would have occurred but for the breach of duty. In so far as probability concerning damages
relates to the latter class of events, any damages awarded to the plaintiff
will be reduced proportionately to the extent to which this probability is less
than 1: where the probability is small,
this is called the loss of a chance.
There could thus be no justification for taking this probability into
account against the plaintiff winning the case.[6] But quite apart from this, contrary to RV’s
contention, the law seems to be that normally the tribunal finds each element,
and indeed each fact, on the balance of probabilities, and then treats it as
certain (169 CLR at 642-3); so that there is generally no question of a reduced
overall probability being given by multiplying component probabilities.
I contend that this is generally
justified, because on the Dixon approach the tribunal is not assessing
numerical probabilities at all; and strict application of RV’s view would
require the tribunal to bring into account in the plaintiff’s favour all
possible permutations and combinations of all more or less probable facts which
could establish liability, making the decision process impossibly complex. Furthermore, I think RV’s view would have
questionable results in certain cases, such as cases of alternative
defendants. If a plaintiff proved a
0.65 probability that someone was liable, plus a 0.65 probability that it was
one defendant and a 0.35 probability that it was the other, then arguably she
should succeed against the former; but RV would say that she should fail
completely, because her best case was only a 0.42 probability against the first
defendant. Consider also the case of
alternative plaintiffs: if in two cases
heard together, the court finds (1) a 0.65 probability that one plaintiff owned
certain property, (2) a 0.35 probability that the other (competing) plaintiff
owned it, and (3) a 0.65 probability that the defendant negligently destroyed it,
RV’s view would mean that the defendant would win both cases.
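The contrast in these two examples can be made concrete with a rough Python sketch. It is a caricature under stated assumptions: the threshold `BALANCE = 0.5` is a numerical stand-in for the balance of probabilities, and the element-by-element function is only a crude proxy for an approach which, as I have said, does not assess numerical probabilities at all. All names are hypothetical.

```python
BALANCE = 0.5  # assumed numerical stand-in for "the balance of probabilities"


def succeeds_by_product(element_probs):
    """Multiply independent element probabilities, then compare with the standard."""
    total = 1.0
    for p in element_probs:
        total *= p
    return total > BALANCE


def succeeds_element_by_element(element_probs):
    """Find each element on the balance of probabilities, then treat it as certain."""
    return all(p > BALANCE for p in element_probs)


cases = {
    "plaintiff v first defendant": [0.65, 0.65],
    "plaintiff v second defendant": [0.65, 0.35],
    "first owner v defendant": [0.65, 0.65],
    "competing owner v defendant": [0.35, 0.65],
}
for name, probs in cases.items():
    print(f"{name}: product rule {succeeds_by_product(probs)}, "
          f"element by element {succeeds_element_by_element(probs)}")
```

On the product rule every claim fails, since 0.65 x 0.65 is only about 0.42; taken element by element, the plaintiff succeeds against the first defendant and the first owner succeeds against the defendant.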
However, I accept that courts should
be well aware that if it is less than certain that each of two independent
events occurred, then it is even less certain that both occurred; and courts should
also be aware of the underlying mathematical rules. And I accept that it will sometimes be appropriate to bracket
aspects of a plaintiff's case, and look for satisfaction, on the balance of
probabilities, about the combination of these aspects. But I think it will rarely be appropriate to
express this quantitatively, in terms (say) of a 0.65 probability for each of
two independent elements giving a 0.42 probability for their combination; and I
think that whether or not this approach should be taken will depend, not so
much on rules of probability, as on considerations of reasonableness or rules
of law.
As for the other case, that of adding
probabilities of alternative grounds of liability, again I think RV’s general
proposition is incorrect as a matter of law.
For example, suppose that a purchaser of some property seeks to set
aside the purchase on three independent grounds: undue influence presumed from a solicitor-client relationship,
misrepresentation that the property has feature A (which it certainly doesn't
have), and fundamental breach in that feature B (which was certainly promised)
is totally defective. The court decides
that the probability of each alternative (i.e. the existence of the solicitor-client
relationship and consequent presumed undue influence, the making of the
misrepresentation and reliance on it, and the fundamental defectiveness of
feature B) is 0.25. I think it is
clear that each ground would be decided against the plaintiff, and the
defendant would win the case even if the probabilities are independent.
However, where a liability-creating
event (say, a cause of loss under an insurance policy: RV 1993, pp472-3) can occur in more than one
way, a court may be justified in bracketing these ways together, if the evidence
gives appropriate satisfaction that the event occurred in one or other of these
ways. But again, I think that whether
or not this bracketing is justified will depend less on rules of probability
than on considerations of reasonableness or rules of law.
4 Wider questions
(a) The philosophical debate
It should be
recognised that the debate concerning quantitative and non-quantitative
probability in legal fact-finding is related to a wider philosophical debate as
to whether human rationality can be formalized - and indeed as to the nature of
the human mind, in that if human rationality could be formalized, it would seem
reasonable to believe in the adequacy of the computational view of the human
brain and mind.
One important thread in the wider
debate can be traced through Hume’s argument about the circularity of induction
(Hume 1739), Popper’s elaboration of this argument (Popper 1959), Hempel’s
paradox of confirmation (Hempel 1965), Goodman’s strictures on analogical
argument and his new riddle of induction (Goodman 1965, 1970), and Putnam’s
critiques of Bayesian analysis and the Carnap theory of confirmation based on
it (Putnam 1979, 1981, 1983).
I find convincing Putnam’s arguments
to the conclusion that human rationality cannot be formalized, at least without
formalizing complete human psychology (and possibly not even then); and that
the various puzzles, paradoxes, and riddles of induction are related to the
need for prior probabilities for the application of Bayes’ theorem (TMM
pp114-26).
Bayes’ theorem can never itself give
us the probabilities that it needs to get started, in particular the prior
probability of the hypothesis being considered, and the prior probability of
each piece of evidence. Since common-sense
reasoning is generally required to produce these ‘priors’, there seems little
justification for attempting to exclude it entirely, in favour of purely
quantitative rules, in later stages of the reasoning process.
(b) An evolutionary slant
Our brains are
capable of performing marvellous algorithmic procedures - for instance, in the
pre-conscious processes associated with seeing. These capabilities, which have yet to be approached by today’s
computers, make possible our access to a vast amount of visual information,
continually ‘updated’ in ‘real time’. As examples of the computational
complexity required for these processes, one may consider the computations that
must be involved in achieving continuous stereoscopic vision, by analysis of
similarities and differences between the signals from two eyes; and those
involved in achieving the apparent stability of a viewed scene despite
movements of one’s eyes, head, and body.
The computational virtuosity and
reliability of these processes contrast starkly with our lack of virtuosity
and reliability in consciously performing even simple mental arithmetic. Some gifted (or partially gifted) persons
have a facility for tapping into some of the brain's algorithmic capacities -
think, for example, of the twins described by Oliver Sacks (1986, pp185-203),
who enjoyed exchanging prime numbers of six digits. But most of us perform mental arithmetic in the same clumsy and
fallible way as we conduct other conscious common-sense reasoning, at a
mathematical level incomparably lower than that of the computations our brains
are constantly carrying out in order to enable us to see, hear, balance, catch
balls, etc.
Natural selection in evolution has
equipped us with this prodigious computing capacity. If satisfactory decisions on matters important for our survival
and reproduction could be made by algorithms, one might have expected that
evolution would have ensured that they be made by using this capacity, with no
interference from our very fallible conscious processes. And yet, we are in fact so constituted that,
whenever in life we are faced by a novel situation in which an important
decision or action is required, our conscious minds are brought to bear - and
we have no alternative but to use our sloppy common-sense reasoning, with all
its fallacies and biases.
To me, this suggests that our
conscious common-sense reasoning must have some advantages over even the most
marvellous of algorithmic procedures, so that efforts to replace common-sense
plausible reasoning completely by algorithms of the type discussed by Bayes, or
Carnap, or RV, are misguided. A better
course is to recognize a more modest role for algorithms, a role of assistance
and correction, and to pay careful attention to the actuality of common-sense reasoning
as it works in real life.
(c) ‘The mind projection fallacy’
Finally, I note
that RV refer (1993, p460) to ‘the fallacy of regarding probability as a
property of objects and processes in the real world rather than a measure of
our own uncertainty’. If all they mean
to say is that probability, as a measure of our uncertainty, does not imply
uncertainty in the real world, this would be unexceptionable. But they also seem to be saying that it is a
fallacy to think there is in fact any uncertainty or indeterminism at all in
the world - and that is a very dubious position. In a footnote, they add: ‘Readers may have encountered references to
quantum theory as a non-deterministic foundation for physics ... They should be
aware that this theory is by no means uncontroversial’; and they refer to a
forthcoming article.
I am astounded by this. Quantum theory is the best-established
physical theory we have. And while its
interpretation is very controversial, there is nevertheless a substantial
consensus among physicists that, at the atomic level, there is irreducible
indeterminacy and indeterminism in the world.
This is apparent from superior popular expositions such as those by
Polkinghorne (1984) and Davies (Davies and Brown 1986), textbooks such as those
by Feynman (Feynman et al 1963) and Dirac (1958), collections of learned
articles such as that of Wheeler and Zurek (1983), and deep philosophical
considerations such as that by d’Espagnat (1989). What RV have done is to refer to one of the few dissenters from
this consensus, so as to suggest a controversy which barely exists - and then
completely ignore the consensus itself, by asserting, contrary to this
consensus, that it is a fallacy to attribute probability to processes in the
world!
What has this to do with legal
fact-finding? Not much directly; but
such distortions should not go unchallenged - and acceptance of a deterministic
view of the world (which I say is unjustified) may lead one more readily to
slip into thinking, without question, that reasoning must be algorithmic. But to pursue this would require another
article, or a book.
First, I consider
the hypothesis ‘William has lied on oath when he believed it would help his
case’. I take the prior probability of
this to be 0.025.
The first piece of evidence on this is
that William told the building society that he had no house. The probability of this, given the
hypothesis, I put at 0.5 (remembering that William could have tried other
financiers for whom owning another property was no disqualification, and also
that, if William has lied, there is some chance that he did not in fact
beneficially own any other property); and its probability, if the hypothesis is
false, I put at 0.05. So the first step
is:
P(H|E) = (0.025 x 0.5) / [(0.025 x 0.5) + (0.975 x 0.05)]
= 0.2
The second piece of evidence on this
is that William said on oath that he had no discussion with Jane about the
house and/or the mortgage. The
probability of him saying this, given the hypothesis, I put at 0.3; and its
probability, if the hypothesis is false, I put at 0.15. So the second step is:
P(H|E) = (0.2 x 0.3) / [(0.2 x 0.3) + (0.8 x 0.15)]
= 0.33
The third piece of evidence on this is
that William said on oath that he purchased the house as an investment. The probability of him saying this, given
the hypothesis, I put at 0.3; and its probability, if the hypothesis is false,
I put at 0.15. So the third step is:
P(H|E) = (0.33 x 0.3) / [(0.33 x 0.3) + (0.67 x 0.15)]
= 0.5
The fourth piece of evidence on this
is that William said on oath that Jane made no contribution towards his
purchase. The probability of him saying
this, given the hypothesis, I put at 0.7; and its probability, if the
hypothesis is false, I put at 0.3. So
the fourth step is:
P(H|E) = (0.5 x 0.7) / [(0.5 x 0.7) + (0.5 x 0.3)]
= 0.7
The fifth piece of evidence on this is
that William has said three things on oath with respective probabilities of
0.15, 0.15, and 0.3 (if the hypothesis is false), and 0.3, 0.3, and 0.7 (if the
hypothesis is true). I think there is
an additional (im)probability in this, if the hypothesis is false, of 0.4; as
against a probability of 0.6 if the hypothesis is true. So the fifth step is:
P(H|E) = (0.7 x 0.6) / [(0.7 x 0.6) + (0.3 x 0.4)]
= 0.78
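The five steps above can be reproduced by applying Bayes' theorem repeatedly, each posterior serving as the prior for the next step. The following Python sketch is illustrative only; the names `update` and `evidence` are mine, and the likelihoods are simply the figures assigned in the text.

```python
def update(prior, p_e_given_h, p_e_given_not_h):
    """One application of Bayes' theorem."""
    numerator = prior * p_e_given_h
    return numerator / (numerator + (1 - prior) * p_e_given_not_h)


# (P(E|H), P(E|not H)) for the five pieces of evidence, as assessed above.
evidence = [(0.5, 0.05), (0.3, 0.15), (0.3, 0.15), (0.7, 0.3), (0.6, 0.4)]

p = 0.025  # prior probability that William has lied on oath
for step, (if_true, if_false) in enumerate(evidence, start=1):
    p = update(p, if_true, if_false)
    print(f"after step {step}: P(H|E) = {p:.3f}")
```

Because the sketch carries full precision from step to step, its intermediate values differ slightly from the rounded figures in the text (for example about 0.339 rather than 0.33 at the second step), but the final value still rounds to 0.78.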
Next, I consider the hypothesis ‘There
was an agreement whereby William gave up his interest in the Carlton house for
an appropriate consideration’. I take
the prior probability of this, in the circumstances of a cordial separation in
which William went to a house of his own, to be 0.3.
The first piece of evidence on this is
that William told the building society that he had no house. Given that there is a 0.78 probability that
William has lied on oath before me, I put the probability of this, given the
hypothesis, at 0.6; and its probability, if the hypothesis is false, I put at
0.4. So the first step is:
P(H|E) = (0.3 x 0.6) / [(0.3 x 0.6) + (0.7 x 0.4)]
= 0.39
The second piece of evidence on this
is that William told the building society that Jane would provide $8,000. The probability of this, given the
hypothesis, I put at 0.4; and its probability, if the hypothesis is false, I
also put at 0.4 (since this evidence seems neutral as between the hypothesis,
on the one hand, and an intention to lend and/or make some other agreement, on
the other). So the second step leaves
the probability of the hypothesis unchanged.
The third piece of evidence on this is
that William cannot account for $4,000 of the price. The probability of this, given the hypothesis (and given that
there is a 0.78 probability that William lied on oath), I put at 0.6; and its
probability, if the hypothesis is false, I put at 0.3. So the third step is:
P(H|E) = (0.39 x 0.6) / [(0.39 x 0.6) + (0.61 x 0.3)]
= 0.56
The fourth piece of evidence on this
is that the evidence as to consideration concerns no more than $12,000, whereas
a half interest was worth about $15,000 or a little more. Having regard to the non-availability of
witnesses for Jane's estate, I put the probability of this, given the
hypothesis, at 0.5; and its probability, if the hypothesis is false (and trying
to avoid double counting of the two previous pieces of evidence), I also put at
0.5. So the fourth step leaves the
probability of the hypothesis unchanged.
The fifth piece of evidence on this is
that no lawyers were consulted and no documentation of the agreement has been
produced. The probability of this,
given the hypothesis, I put at 0.3 (remembering there could have been some
documentation not now available to Jane’s estate); and its probability, if the
hypothesis is false, I put at 0.8 (not more, since the falsity of the
hypothesis includes the making of a lesser agreement). So the fifth step is:
P(H|E) = (0.56 x 0.3) / [(0.56 x 0.3) + (0.44 x 0.8)]
= 0.32
The sixth piece of evidence on this is
that William and Jane have, for nine years, lived in separate houses, paying off
separate mortgages, and William has had no association with the Carlton
house. The probability of this, given
the hypothesis, I put at 0.9; and its probability, if the hypothesis is false,
I put at 0.3. So the sixth step is:
P(H|E) = (0.32 x 0.9) / [(0.32 x 0.9) + (0.68 x 0.3)]
= 0.59
The final piece of evidence on this is
that William has with probability 0.78 lied on oath. The probability of this, given the hypothesis (and trying to
avoid double counting), I put at 0.05; and its probability, if the hypothesis
is false, I put at 0.025. So the final
step is:
P(H|E) = (0.59 x 0.05) / [(0.59 x 0.05) + (0.41 x 0.025)]
= 0.74
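The same chain can be collapsed into a single calculation using the odds form of Bayes' theorem, in which the posterior odds equal the prior odds multiplied by the likelihood ratio of each piece of evidence. The Python sketch below is offered only as a check on the arithmetic; the function and variable names are hypothetical, and it treats the pieces of evidence as conditionally independent, which is what the step-by-step calculation implicitly assumes (`math.prod` requires Python 3.8 or later).

```python
from math import prod


def posterior_from_odds(prior, likelihoods):
    """Posterior odds = prior odds x product of likelihood ratios."""
    prior_odds = prior / (1 - prior)
    ratios = [if_true / if_false for if_true, if_false in likelihoods]
    odds = prior_odds * prod(ratios)
    return odds / (1 + odds)


# (P(E|H), P(E|not H)) for the hypothesis that there was an agreement.
agreement_evidence = [
    (0.6, 0.4),     # told the building society he had no house
    (0.4, 0.4),     # said Jane would provide $8,000 (neutral)
    (0.6, 0.3),     # cannot account for $4,000 of the price
    (0.5, 0.5),     # evidenced consideration falls short (neutral)
    (0.3, 0.8),     # no lawyers consulted, no documentation
    (0.9, 0.3),     # nine years of separate houses and mortgages
    (0.05, 0.025),  # William has probably lied on oath
]

print(round(posterior_from_odds(0.3, agreement_evidence), 2))
```

The neutral items have a likelihood ratio of 1 and drop out, and the order in which the evidence is taken makes no difference to the result, which agrees with the 0.74 reached step by step.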
REFERENCES
Cohen, L. J. (1977), The Probable and the Provable (Oxford: Oxford University Press).
Davies, P. and Brown, J. R. (1986), The Ghost in the Atom (Cambridge: Cambridge University Press).
d’Espagnat, B. (1989), Reality and the Physicist (Cambridge: Cambridge University Press).
Dirac, P. A. M. (1958), The Principles of Quantum Mechanics, 4th ed. (Oxford: Oxford University Press).
Feynman, R., Leighton, R., and Sands, M. (1963), The Feynman Lectures on Physics (Reading MA: Addison-Wesley).
Goodman, N. (1965), Fact, Fiction and Forecast (New York: Bobbs-Merrill).
Goodman, N. (1970), ‘Seven strictures on similarity’, in Foster and
Swanson (1970).
Hempel, C. G. (1965), Aspects of Scientific Explanation (London: Macmillan).
Hodgson, D. (1991), The Mind Matters (Oxford: Oxford University Press).
Hume, D. (1739), A Treatise of Human Nature.
Kahneman, D., Slovic, P., and Tversky, A. (eds) (1982), Judgment under Uncertainty: Heuristics and Biases (Cambridge: Cambridge University Press).
Polkinghorne, J. (1984), The Quantum World (London: Longman).
Popper, K. R. (1959), The Logic of Scientific Discovery (London: Hutchinson).
Putnam, H. (1979), Mathematics, Matter and Method (Cambridge: Cambridge University Press).
Putnam, H. (1981), Reason, Truth and History (Cambridge: Cambridge University Press).
Putnam, H. (1983), Realism and Reason (Cambridge: Cambridge University Press).
Robertson, B. and Vignaux, G. A. (1993), ‘Probability: the logic of the law’, Oxford Journal of Legal Studies 13, 457-78.
Sacks, O. (1986), The Man who Mistook his Wife for a Hat (London: Picador).
Tversky, A. and Kahneman, D. (1983), ‘Extensional versus intuitive reasoning: the conjunction fallacy in probability judgment’, Psychological Review 90.
Wheeler, J. A. and Zurek, W. H. (eds) (1983), Quantum Theory and Measurement (Princeton: Princeton University Press).
[1] I think the fallacy of Murphy J’s view in TNT Management v Brooks (1979) 23 ALR 345 is that the probabilities he uses are based on material inadequate to support an inference in a particular case.
[2] It follows that I disagree with the reasoning of the majority in SGIC v Laube (1984) 37 SASR 31; cf. Eggleston (1987) and see Rose v Abbey Orchard Property Investments [1987] Aust Torts Reports 80-121.
[3] P(H|E) = 0.0002 x 1 / [(0.0002 x 1) + (0.9998 x 0.0001)] = 0.6667.
[4] P(H|E) = 0.002 x 1 / [(0.002 x 1) + (0.998 x 0.0001)] = 0.9524716.
[5] The fallacy which, on one view, was suggested by the decision in Chamberlain v The Queen (1984) 153 CLR 521 may be regarded as being a misapplication of quantitative rules of probability: cf. Shepherd v The Queen (1990) 170 CLR 573.
[6] Malec v J. C. Hutton Pty Ltd (1990) 169 CLR 638. However, where damages are an element of the cause of action, the plaintiff does have to prove, on the balance of probabilities, the loss of a chance or commercial opportunity which itself has some value: Sellars v Adelaide Petroleum NL (1992-4) 179 CLR 332.